Visual Information Retrieval:
Advances, Challenges and Opportunities
Oge Marques, PhD
Professor
College of Engineering and Computer Science
Florida Atlantic University – Boca Raton, FL (USA)
The Distinguished Speakers Program
is made possible by
For additional information, please visit http://dsp.acm.org/
About ACM
ACM, the Association for Computing Machinery, is the world’s largest
educational and scientific computing society, uniting educators, researchers and
professionals to inspire dialogue, share resources and address the field’s
challenges.
ACM strengthens the computing profession’s collective voice through strong
leadership, promotion of the highest standards, and recognition of technical
excellence.
ACM supports the professional growth of its members by providing
opportunities for life-long learning, career development, and professional
networking. 

With over 100,000 members from over 100 countries, ACM works to advance
computing as a science and a profession. www.acm.org
Image and video everywhere!
•  1.4-1.7 billion people have a smartphone with
a camera
•  350 million photos uploaded to Facebook every
day
•  Instagram (launched Oct. 2010) has
300 million users
•  Snapchat, Instagram, Facebook
and WhatsApp users
(combined) share 1.8 billion
photos each day.
The big mismatch
Example: your digital shoebox
•  Does this
screenshot look
familiar to you?
Example: the Web
•  I wanted a picture (e.g., a photo or clipart) to
illustrate my presentation, report or handout and
can't find it!
Source: Flickr (https://www.flickr.com/photos/83633410@N07/7658225516)
Working definition
Visual Information Retrieval (VIR) techniques aim at solving the problem of
finding relevant (documents containing) images and videos
based on incomplete input/query, which can be visual, text, or both.
Possible solutions
1. Text-based (also known as tag-based) search
•  Examples:
– Google image search (https://images.google.com/)
– Bing image search (https://www.bing.com/?scope=images)
– Yahoo image search (https://images.search.yahoo.com/)
– Creative Commons portal (http://search.creativecommons.org/)
– Many others (Flickr, Shutterstock, PhotoBucket, etc.)
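To make the idea of tag-based search concrete, here is a minimal sketch of an inverted index over image tags. It is only an illustration under stated assumptions: the image ids, the tags, and the function names (`build_tag_index`, `search_tags`) are hypothetical, and none of the search engines listed above is claimed to work this way.

```python
from collections import defaultdict

def build_tag_index(images):
    """Map each lowercase tag to the set of image ids that carry it."""
    index = defaultdict(set)
    for image_id, tags in images.items():
        for tag in tags:
            index[tag.lower()].add(image_id)
    return index

def search_tags(index, query):
    """Return the ids of images tagged with every word of the query (AND semantics)."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical toy collection: the same tag covers unrelated concepts.
images = {
    "img1": ["thunderbird", "car", "blue"],
    "img2": ["thunderbird", "email", "logo"],
    "img3": ["thunderbird", "bird", "totem"],
}
index = build_tag_index(images)

print(sorted(search_tags(index, "thunderbird")))
print(sorted(search_tags(index, "blue thunderbird")))
```

A one-word query matches every image that shares the tag, whatever it depicts; adding a second word narrows the result only if someone supplied that tag.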
Example: text-based web search
Google image search results for “thunderbird”
Source: Google Image Search (http://images.google.com/)
Example: text-based web search
Google image search results for “blue thunderbird”
Source: Google Image Search (http://images.google.com/)
Example: text-based web search
Bing image search results for “thunderbird”
Source: Bing Image Search (http://bing.com/)
Example: text-based web search
Bing image search results for “blue thunderbird”
Source: Bing Image Search (http://bing.com/)
Example: text-based web search
Yahoo image search results for “thunderbird”
Source: Yahoo Image Search (http://images.search.yahoo.com)
Example: text-based web search
Yahoo image search results for “blue thunderbird”
Source: Yahoo Image Search (http://images.search.yahoo.com)
Possible solutions
2. Content-based search (also known as 
reverse image search or search by visual similarity)
•  Examples:
– Google image search (https://images.google.com/)
– TinEye (https://www.tineye.com/)
– ImageRaider (https://www.imageraider.com)
– Others
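As a rough illustration of search by visual similarity, the sketch below ranks images by comparing global color histograms with histogram intersection. This is one classic low-level feature, chosen here for brevity; it is an assumption for illustration and not a description of how the services listed above work. Images are represented as plain lists of RGB tuples, and all names and pixel values are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Quantize RGB pixels (0-255 per channel) into a normalized joint histogram."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[idx] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def rank_by_similarity(query_pixels, database):
    """Return database image names, most similar to the query first."""
    query_hist = color_histogram(query_pixels)
    scored = [
        (histogram_intersection(query_hist, color_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical toy "images": four pixels each.
database = {
    "red_wall": [(250, 10, 10)] * 4,
    "blue_sky": [(10, 10, 250)] * 4,
    "flag": [(250, 10, 10)] * 2 + [(10, 10, 250)] * 2,
}
query_pixels = [(240, 20, 20)] * 4

print(rank_by_similarity(query_pixels, database))
```

The ranking reflects color content only: a red wall and a red sports car would score as near-identical, which is exactly the kind of result the later slides on similarity and the semantic gap call into question.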
Example: content-based web search
Source: Google Image Search (http://images.google.com/)
Query image
Google image
search results
Example: content-based web search
TinEye search results
Source: TinEye (http://www.tineye.com/)
Example: content-based web search
ImageRaider search results
Source: ImageRaider (http://www.imageraider.com)
Possible solutions
3. Mixed search (text + visual aspects)
•  Examples:
– Google image search (https://images.google.com/)
– Bing image search (https://www.bing.com/?scope=images)
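One simple way to combine text and visual evidence is late fusion: score each candidate separately per modality, then merge the scores. The sketch below uses a weighted sum. The weight `alpha`, the file names, and the score values are all hypothetical, and this is an assumption about one possible design, not a description of how Google or Bing combine the two signals.

```python
def fuse_scores(text_scores, visual_scores, alpha=0.5):
    """Rank ids by alpha * text + (1 - alpha) * visual; a missing score counts as 0."""
    ids = set(text_scores) | set(visual_scores)
    fused = {
        i: alpha * text_scores.get(i, 0.0) + (1 - alpha) * visual_scores.get(i, 0.0)
        for i in ids
    }
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(fused, key=lambda i: (-fused[i], i))

# Hypothetical scores for the text query "Ferrari" plus a query image of a red car.
text_scores = {"ferrari_red.jpg": 1.0, "ferrari_yellow.jpg": 1.0, "red_apple.jpg": 0.0}
visual_scores = {"ferrari_red.jpg": 0.9, "ferrari_yellow.jpg": 0.3, "red_apple.jpg": 0.7}

print(fuse_scores(text_scores, visual_scores, alpha=0.5))
print(fuse_scores(text_scores, visual_scores, alpha=0.0))
```

With `alpha=0.0` the ranking is purely visual and the red apple outranks the yellow Ferrari; giving the text some weight pushes it back down.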
Example: mixed (text + image) search
Source: Google Image Search (http://images.google.com/)
Query image
Google image
search results
Query text:
Ferrari
Example: mixed (text + image) search
Source: Bing Image Search (http://bing.com/)
Bing image search results
Question
Are you happy with the results so far?
Outline
•  Part I – Fundamental concepts
•  Part II – Challenges
•  Part III – Latest advances
•  Part IV – Research opportunities
Part I
Fundamental concepts 
(a very brief overview)
Basic framework
Source: Lei Zhang and Yong Rui. 2013. Image search: from thousands to billions in 20 years.
ACM Trans. Multimedia Comput. Commun. Appl. 9, 1s, Article 36.
Build your own VIR solution
•  LIRE: Lucene Image Retrieval
–  Java library that provides a
simple way to retrieve images
and photos based on their
color and texture
characteristics.
–  LIRE creates a Lucene index
of image features for content
based image retrieval (CBIR).
–  Provides easy-to-use methods
for searching the index and
browsing results.
–  Open source (available under
the GNU GPL license).
LIRE
•  Web demo: http://demo-itec.uni-klu.ac.at/liredemo/
– 1 M index images (the MIRFLICKR-1M data set)
http://press.liacs.nl/mirflickr/
– 6 descriptors: CL, CEDD, EH, JCD, PH, SC
– Several other features (check back often)
•  For additional information:
http://www.lire-project.net/
Part II
Challenges
Four selected challenges
•  Some challenges faced by VIR designers and
researchers include:
1.  the need to capture and measure similarity among
images;
2.  the semantic gap (along with other gaps);
3.  the need to take users’ intentions into account
when designing VIR systems; and
4.  the inherent difficulty in making VIR solutions work
effectively in broad domains.
The elusive notion of similarity
•  Are these two images similar?
visually similar
Source: Lux, Mathias, and Oge Marques. Visual information retrieval using Java and LIRE. Morgan & Claypool Synthesis Lectures
on Information Concepts, Retrieval, and Services 5.1 (2013): 1-112.
The elusive notion of similarity
•  Are these two images similar?
semantically related
Source: Lux, Mathias, and Oge Marques. Visual information retrieval using Java and LIRE. Morgan & Claypool Synthesis Lectures
on Information Concepts, Retrieval, and Services 5.1 (2013): 1-112.
The semantic gap
•  The semantic gap is the lack of coincidence
between the information that one can extract
from the visual data and the interpretation that
the same data have for a user in a given situation.
•  “The pivotal point in content-based retrieval is that the user
seeks semantic similarity, but the database can only provide
similarity by data processing. This is what we called the
semantic gap.” [Smeulders et al., 2000]
Other gaps
Ontology of gaps [1]: 
14 types of gaps, in 4 categories:
•  Content-related
•  Feature-related
•  Performance-related
•  Usability-related
________
[1] Deserno, T. M., Antani, S., & Long, R. (2009). Ontology of Gaps in Content-Based
Image Retrieval. Journal of Digital Imaging: The Official Journal of the Society for
Computer Applications in Radiology, 22(2), 202–215. doi:10.1007/s10278-007-9092-x
The utility gap
•  Defined as the gap between what people expect
from the MIR systems (results that would help them
in their further actions) and what they actually get
offered as the system output. [2]
•  Major steps towards bridging the utility gap [2]:
–  to revisit the notion of relevance in order to address the full
information need of the user,
–  to look beyond the relevance when designing and assessing
MIR solutions,
–  to steer the retrieval process towards the result that is
maximally helpful for the user.
________
[2] Hanjalic, A. 2013. Multimedia retrieval that matters. ACM Trans. Multimedia Comput.
Commun. Appl. 9, 1s, Article 44 (October 2013)
Users’ needs and intentions
•  The intention gap
– Users and developers have quite different views
– Cultural and contextual information should be taken
into account
– User intentions are hard to infer
•  Privacy issues
•  Users themselves don’t always know what they want
•  Who misses the MS Office paper clip?
Broad domains
Source: Smeulders et al., “Content-based image retrieval at the end of
the early years”, IEEE Transactions on PAMI, Vol. 22, Issue 12, Dec 2000
Part III
Latest advances
Medical image retrieval
Source: http://medgift.hevs.ch/silverstripe/
Medical image retrieval
•  What has been achieved
– Moderately successful results in 2D (e.g., chest x-rays)
and 3D (e.g., brain MR slices) image retrieval
– Integration of visual and text-based search and
retrieval techniques
– Experiments with multiple query images (from
different modalities)
– Well-established thesaurus (MeSH) and ontologies
(SNOMED CT)
– Ongoing yearly challenge (ImageCLEF) addressing
many open topics, e.g., image classification
Medical image retrieval
•  Examples of successful solutions to specific
problems
•  IRMA (Image Retrieval in Medical Applications) -
Aachen University (Germany) 
(http://ganymed.imib.rwth-aachen.de/irma/index_en.php)
•  MedGIFT (GNU Image Finding Tool) - Geneva
University (Switzerland) 
(http://medgift.hevs.ch/silverstripe/)
•  WebMIRS - NIH / NLM (USA) 
(https://ceb.nlm.nih.gov/proj/webmirs/index.php)
Medical image retrieval
•  Ongoing challenges
– Development of effective user interfaces and
visualization techniques
– Feature selection algorithms that take multiple
modalities into account
– Direct use of multidimensional images
– More publicly available standardized datasets for
evaluation
•  Example: VISCERAL
(http://www.visceral.eu/benchmarks/retrieval2-benchmark/)
– Clinical adoption
Medical image retrieval
•  Suggested reading:
– A. Kumar et al. (2013). Content-Based Medical Image
Retrieval: A Survey of Applications to Multidimensional
and Multimodality Data. Journal of Digital Imaging, 26(6),
1025–1039.
– Hwang, K. H., Lee, H., & Choi, D. (2012). Medical Image
Retrieval: Past and Present. Healthcare Informatics
Research, 18(1), 3–9.
Mobile visual search (MVS)
Source: Girod et al. IEEE Multimedia 2011
[Page excerpt: IEEE Signal Processing Magazine, July 2011, p. 62, section “Robust Mobile Image Recognition”]
“Today, the most successful algorithms for content-based image retrieval use an approach that is referred to as bag of features (BoFs) or bag of words (BoWs). The BoW idea is borrowed from text retrieval. To find a particular text document, such as a Web page, it is sufficient to use a few well-chosen words. [...]”
[FIG1] A snapshot of an outdoor mobile visual search system being used. The system augments the viewfinder with information about the objects it recognizes in the image taken with a camera phone.
[FIG2] A pipeline for image retrieval: query image, feature extraction, feature matching, geometric verification (GV).
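The bag-of-words idea quoted from Girod et al. can be sketched in a few lines: local descriptors are quantized to their nearest "visual word", images are indexed in an inverted file, and a query votes for database images that share words with it. The following is a minimal sketch under simplifying assumptions: the vocabulary is given rather than learned by clustering, descriptors are 2-D toy vectors, scoring is an unnormalized TF-IDF dot product with a smoothed IDF, and geometric verification is omitted. All names and data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor (squared Euclidean distance)."""
    return min(
        range(len(vocabulary)),
        key=lambda w: sum((d - v) ** 2 for d, v in zip(descriptor, vocabulary[w])),
    )

def to_bag_of_words(descriptors, vocabulary):
    """Count how often each visual word occurs among an image's descriptors."""
    return Counter(nearest_word(d, vocabulary) for d in descriptors)

def build_inverted_file(database, vocabulary):
    """Map each visual word to {image_id: occurrence count}."""
    inverted = defaultdict(dict)
    for image_id, descriptors in database.items():
        for word, count in to_bag_of_words(descriptors, vocabulary).items():
            inverted[word][image_id] = count
    return inverted

def search_bow(inverted, n_images, descriptors, vocabulary):
    """Rank database images that share at least one visual word with the query."""
    scores = defaultdict(float)
    for word, q_count in to_bag_of_words(descriptors, vocabulary).items():
        postings = inverted.get(word, {})
        if not postings:
            continue
        idf = math.log(1 + n_images / len(postings))  # smoothed IDF (an assumption)
        for image_id, count in postings.items():
            scores[image_id] += (q_count * idf) * (count * idf)
    return sorted(scores, key=lambda i: (-scores[i], i))

# Hypothetical toy data: three visual words, three database images.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
database = {
    "A": [(0.1, 0.0), (0.9, 1.1)],
    "B": [(5.1, 4.9), (4.8, 5.2)],
    "C": [(0.0, 0.2), (5.0, 5.0)],
}
inverted = build_inverted_file(database, vocabulary)

print(search_bow(inverted, len(database), [(1.0, 0.9), (0.1, 0.1)], vocabulary))
```

Only images that appear in the postings of the query's words are touched, which is what makes the inverted file attractive for large databases; a real system would re-rank the shortlist with geometric verification.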
Mobile visual search (MVS)
•  What has been achieved
– Substantial research on compact image descriptors
– MPEG-7 Part 13: Compact descriptors for visual
search (CDVS) (2013)
– Standardized datasets, for example:
•  Stanford Mobile Visual Search Data Set 
(http://web.cs.wpi.edu/~claypool/mmsys-dataset/2011/stanford/)
– Many apps and APIs, such as:
•  CamFind (http://camfindapp.com/)
•  Moodstocks (https://moodstocks.com/)
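To illustrate why compact descriptors matter on mobile devices, the sketch below matches short binary codes with Hamming distance, which needs only an XOR and a bit count per comparison. This is a generic illustration of binary descriptor matching; it is not the MPEG CDVS descriptor, and the 8-bit codes and object names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two integer-encoded binary codes differ."""
    return bin(a ^ b).count("1")

def best_match(query_code, database_codes):
    """Name of the database entry whose code is closest to the query code."""
    return min(
        database_codes,
        key=lambda name: (hamming_distance(query_code, database_codes[name]), name),
    )

# Hypothetical 8-bit codes; real binary descriptors are typically much longer.
database_codes = {"poster": 0b10110010, "book_cover": 0b01001101}
query_code = 0b10110110

print(best_match(query_code, database_codes))
print(hamming_distance(query_code, database_codes["poster"]))
```

Sending a handful of such codes instead of a full image is one way to reduce the bandwidth and latency costs that mobile visual search has to contend with.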
Mobile visual search (MVS)
•  Ongoing challenges
– Ensure low latency (and interactive queries) under
constraints such as:
•  Network bandwidth
•  Computational power
•  Battery consumption
– Achieve robust visual recognition in spite of low-
resolution cameras, varying lighting conditions, etc.
– Handle broad and narrow domains
– Explore the full potential of using MVS to bridge the
virtual (digital) world and real world
Mobile visual search (MVS)
•  Suggested reading:
– Girod, B.; Chandrasekhar, V.; Chen, D.M.; Ngai-Man
Cheung; Grzeszczuk, R.; Reznik, Y.; Takacs, G.; Tsai, S.S.;
Vedantham, R., Mobile Visual Search, Signal Processing
Magazine, IEEE, vol. 28, no. 4, pp. 61-76, July 2011
– Ling-Yu Duan; Jie Lin; Jie Chen; Tiejun Huang; Wen Gao,
Compact Descriptors for Visual Search, MultiMedia,
IEEE, vol. 21, no. 3, pp. 30-40, July-Sept. 2014
Part IV
Research Opportunities
Advice for [young] researchers
•  In this last part, I’ve compiled bits of advice that I
believe might help researchers who are entering
the field.
•  They are based on almost 20 years of work in VIR
and related topics.
•  It is my sincere hope that they will be useful and
lead to successful research experiences.
Advice for [young] researchers
•  Advice # 1
– Pick a specific problem and domain
•  Example (from our ongoing work)
– Veterinary radiology image retrieval
Advice for [young] researchers
•  Advice # 2
– Find (or build) the appropriate dataset
•  Examples:
– ImageNet (http://www.image-net.org/)
– MIRFLICKR (http://press.liacs.nl/mirflickr/)
– INRIA Holidays dataset (http://lear.inrialpes.fr/~jegou/data.php )
– UCID: Uncompressed Color Image Database 
(http://homepages.lboro.ac.uk/~cogs/datasets/ucid/ucid.html)
– Many others (associated with challenges – see next
slide)
Advice for [young] researchers
•  Advice # 3
–  Participate in challenges related to the problem you are
working on
•  Examples:
–  ImageCLEF (http://www.imageclef.org/2015)
–  LifeCLEF (http://www.imageclef.org/lifeclef/2015)
–  ImageNet Large Scale Visual Recognition Challenge
(ILSVRC2015) (http://image-net.org/challenges/LSVRC/2015/index)
–  MSR-Bing Image Retrieval Challenge (IRC) 
(http://research.microsoft.com/en-us/projects/irc/)
–  MediaEval Benchmarking Initiative for Multimedia
Evaluation (http://www.multimediaeval.org/mediaeval2015)
Advice for [young] researchers
•  Advice # 4
– Be mindful of related areas where you can make a
contribution
•  Example (from our own work):
– While working on the broader topic of Medical Case
Retrieval (MCR), Mario Taschwer and I participated in
the ImageCLEF 2015 compound figure separation task 
(http://www.imageclef.org/2015/medical).
– Our current results are the best reported in the
literature.
Advice for [young] researchers
•  Advice # 5
– Beware of the trap of building a solution in search of a
problem
•  Example:
– For many years, a tough selling point for CBIR was the
reliance on a query-by-example (QBE) paradigm.
•  What if I don't have a good example image to begin with?
– With the popularization of mobile visual search (MVS)
solutions, a use case for this model emerged naturally,
since the example is right in front of the user!
Advice for [young] researchers
•  Advice # 6
– Think about creative ways to leverage the power of
human computation
•  Examples:
– Crowdsourcing campaigns
– Mining social media visual data and associated
metadata (text, URLs, hashtags, etc.)
– Games with a purpose (GWAPs)
Advice for [young] researchers
•  Advice # 7
– Put yourself in the shoes of the user
•  More specifically:
– Take into account:
•  the context of the search
•  the usefulness of results to the user
– Understand (and model) user’s intentions, preferences
and needs
– Create better interfaces
– Provide a better user experience
Final remarks
•  “Visual information retrieval” is an active and vibrant research
area, with many open research challenges and market
opportunities.
•  There is a great need for good solutions to specific problems.
•  Pick one of the many open problems, challenges, and
opportunities and build a successful solution!
Contact information: 
omarques@fau.edu

Weitere ähnliche Inhalte

Was ist angesagt?

Elegant Resume
Elegant ResumeElegant Resume
Elegant Resume
butest
 
Recent Advances in Computer Vision
Recent Advances in Computer VisionRecent Advances in Computer Vision
Recent Advances in Computer Vision
antiw
 
Technology & robotics dem
Technology & robotics demTechnology & robotics dem
Technology & robotics dem
lrobtison
 
Lecture 5: Personalization on the Social Web (2013)
Lecture 5: Personalization on the Social Web (2013)Lecture 5: Personalization on the Social Web (2013)
Lecture 5: Personalization on the Social Web (2013)
Lora Aroyo
 
SSII2021 [SS2] Deepfake Generation and Detection – An Overview (ディープフェイクの生成と検出)
SSII2021 [SS2] Deepfake Generation and Detection – An Overview (ディープフェイクの生成と検出)SSII2021 [SS2] Deepfake Generation and Detection – An Overview (ディープフェイクの生成と検出)
SSII2021 [SS2] Deepfake Generation and Detection – An Overview (ディープフェイクの生成と検出)
SSII
 

Was ist angesagt? (13)

Who are the users of a video search system?
Who are the users of a video search system?Who are the users of a video search system?
Who are the users of a video search system?
 
Information Architecture Course Part 2 - Spring 2013 - Class 1
Information Architecture Course Part 2 - Spring 2013 - Class 1Information Architecture Course Part 2 - Spring 2013 - Class 1
Information Architecture Course Part 2 - Spring 2013 - Class 1
 
Elegant Resume
Elegant ResumeElegant Resume
Elegant Resume
 
Information Architecture - Part 1 - Spring 2013 - Class 1
Information Architecture - Part 1 - Spring 2013 - Class 1Information Architecture - Part 1 - Spring 2013 - Class 1
Information Architecture - Part 1 - Spring 2013 - Class 1
 
Recent Advances in Computer Vision
Recent Advances in Computer VisionRecent Advances in Computer Vision
Recent Advances in Computer Vision
 
Lecture 4: Social Web Personalization (2012)
Lecture 4: Social Web Personalization (2012)Lecture 4: Social Web Personalization (2012)
Lecture 4: Social Web Personalization (2012)
 
Technology & robotics dem
Technology & robotics demTechnology & robotics dem
Technology & robotics dem
 
Computer Vision Crash Course
Computer Vision Crash CourseComputer Vision Crash Course
Computer Vision Crash Course
 
Information Architecture - Part 2 - Spring 2013 - Class 3
Information Architecture - Part 2 - Spring 2013 - Class 3Information Architecture - Part 2 - Spring 2013 - Class 3
Information Architecture - Part 2 - Spring 2013 - Class 3
 
Rapid Innovative Design Notes
Rapid Innovative Design NotesRapid Innovative Design Notes
Rapid Innovative Design Notes
 
Lecture 1 computer vision introduction
Lecture 1 computer vision introductionLecture 1 computer vision introduction
Lecture 1 computer vision introduction
 
Lecture 5: Personalization on the Social Web (2013)
Lecture 5: Personalization on the Social Web (2013)Lecture 5: Personalization on the Social Web (2013)
Lecture 5: Personalization on the Social Web (2013)
 
SSII2021 [SS2] Deepfake Generation and Detection – An Overview (ディープフェイクの生成と検出)
SSII2021 [SS2] Deepfake Generation and Detection – An Overview (ディープフェイクの生成と検出)SSII2021 [SS2] Deepfake Generation and Detection – An Overview (ディープフェイクの生成と検出)
SSII2021 [SS2] Deepfake Generation and Detection – An Overview (ディープフェイクの生成と検出)
 

Andere mochten auch

Image retrieval: challenges and opportunities
Image retrieval: challenges and opportunitiesImage retrieval: challenges and opportunities
Image retrieval: challenges and opportunities
Oge Marques
 
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Universitat Politècnica de Catalunya
 
Shuffle and learn: Unsupervised Learning using Temporal Order Verification (U...
Shuffle and learn: Unsupervised Learning using Temporal Order Verification (U...Shuffle and learn: Unsupervised Learning using Temporal Order Verification (U...
Shuffle and learn: Unsupervised Learning using Temporal Order Verification (U...
Universitat Politècnica de Catalunya
 
The impact of visual saliency prediction in image classification
The impact of visual saliency prediction in image classificationThe impact of visual saliency prediction in image classification
The impact of visual saliency prediction in image classification
Universitat Politècnica de Catalunya
 

Andere mochten auch (20)

Using games to improve computer vision solutions
Using games to improve computer vision solutionsUsing games to improve computer vision solutions
Using games to improve computer vision solutions
 
Image retrieval: challenges and opportunities
Image retrieval: challenges and opportunitiesImage retrieval: challenges and opportunities
Image retrieval: challenges and opportunities
 
Faces in Places: Compound Query Retrieval
Faces in Places: Compound Query RetrievalFaces in Places: Compound Query Retrieval
Faces in Places: Compound Query Retrieval
 
Recurrent Instance Segmentation (UPC Reading Group)
Recurrent Instance Segmentation (UPC Reading Group)Recurrent Instance Segmentation (UPC Reading Group)
Recurrent Instance Segmentation (UPC Reading Group)
 
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
 
Shuffle and learn: Unsupervised Learning using Temporal Order Verification (U...
Shuffle and learn: Unsupervised Learning using Temporal Order Verification (U...Shuffle and learn: Unsupervised Learning using Temporal Order Verification (U...
Shuffle and learn: Unsupervised Learning using Temporal Order Verification (U...
 
Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)
Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)
Word Embeddings (D2L4 Deep Learning for Speech and Language UPC 2017)
 
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
 
The impact of visual saliency prediction in image classification
The impact of visual saliency prediction in image classificationThe impact of visual saliency prediction in image classification
The impact of visual saliency prediction in image classification
 
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
 
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
 
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
 
Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)
Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)
Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)
 
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...
 
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
 
Digital image processing using matlab: filters (detail)
Digital image processing using matlab: filters (detail)Digital image processing using matlab: filters (detail)
Digital image processing using matlab: filters (detail)
 
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
 
Image segmentation hj_cho
Image segmentation hj_choImage segmentation hj_cho
Image segmentation hj_cho
 
Learning Financial Market Data with Recurrent Autoencoders and TensorFlow
Learning Financial Market Data with Recurrent Autoencoders and TensorFlowLearning Financial Market Data with Recurrent Autoencoders and TensorFlow
Learning Financial Market Data with Recurrent Autoencoders and TensorFlow
 
High level-api in tensorflow
High level-api in tensorflowHigh level-api in tensorflow
High level-api in tensorflow
 

Ähnlich wie Visual Information Retrieval: Advances, Challenges and Opportunities

Image retrieval from the world wide web issues, techniques, and systems
Image retrieval from the world wide web issues, techniques, and systemsImage retrieval from the world wide web issues, techniques, and systems
Image retrieval from the world wide web issues, techniques, and systems
unyil96
 
Image retrieval from the world wide web
Image retrieval from the world wide webImage retrieval from the world wide web
Image retrieval from the world wide web
unyil96
 
Top 10 Ways to Make Your Digital Content Accessible
Top 10 Ways to Make Your Digital Content AccessibleTop 10 Ways to Make Your Digital Content Accessible
Top 10 Ways to Make Your Digital Content Accessible
D2L Barry
 
FACE EXPRESSION IDENTIFICATION USING IMAGE FEATURE CLUSTRING AND QUERY SCHEME...
FACE EXPRESSION IDENTIFICATION USING IMAGE FEATURE CLUSTRING AND QUERY SCHEME...FACE EXPRESSION IDENTIFICATION USING IMAGE FEATURE CLUSTRING AND QUERY SCHEME...
FACE EXPRESSION IDENTIFICATION USING IMAGE FEATURE CLUSTRING AND QUERY SCHEME...
Editor IJMTER
 

Ähnlich wie Visual Information Retrieval: Advances, Challenges and Opportunities (20)

Image retrieval from the world wide web issues, techniques, and systems
Image retrieval from the world wide web issues, techniques, and systemsImage retrieval from the world wide web issues, techniques, and systems
Image retrieval from the world wide web issues, techniques, and systems
 
Image retrieval from the world wide web
Image retrieval from the world wide webImage retrieval from the world wide web
Image retrieval from the world wide web
 
Leveraging social media for training object detectors
Leveraging social media for training object detectorsLeveraging social media for training object detectors
Leveraging social media for training object detectors
 
BUILDING A SCALABLE MULTIMEDIA WEB OBSERVATORY
BUILDING A SCALABLE MULTIMEDIA WEB OBSERVATORYBUILDING A SCALABLE MULTIMEDIA WEB OBSERVATORY
BUILDING A SCALABLE MULTIMEDIA WEB OBSERVATORY
 
Invited Talk OAGM Workshop Salzburg, May 2015
Invited Talk OAGM Workshop Salzburg, May 2015Invited Talk OAGM Workshop Salzburg, May 2015
Invited Talk OAGM Workshop Salzburg, May 2015
 
Top 10 Ways to Make Your Digital Content Accessible
Top 10 Ways to Make Your Digital Content AccessibleTop 10 Ways to Make Your Digital Content Accessible
Top 10 Ways to Make Your Digital Content Accessible
 
IUI 2010: An Informal Summary of the International Conference on Intelligent ...
IUI 2010: An Informal Summary of the International Conference on Intelligent ...IUI 2010: An Informal Summary of the International Conference on Intelligent ...
IUI 2010: An Informal Summary of the International Conference on Intelligent ...
 
Connecting Through Technology Part 1
Connecting Through Technology Part 1Connecting Through Technology Part 1
Connecting Through Technology Part 1
 
CAEPIA 2011
CAEPIA 2011CAEPIA 2011
CAEPIA 2011
 
Interaction design: desiging user interfaces for digital products
Interaction design: desiging user interfaces for digital productsInteraction design: desiging user interfaces for digital products
Interaction design: desiging user interfaces for digital products
 
Who needs a repository when you’ve got Google? Information and Digital Litera...
Who needs a repository when you’ve got Google? Information and Digital Litera...Who needs a repository when you’ve got Google? Information and Digital Litera...
Who needs a repository when you’ve got Google? Information and Digital Litera...
 
Discovery Systems Used in Academic Libraries Projects & Case Study
Discovery Systems Used in Academic Libraries Projects & Case StudyDiscovery Systems Used in Academic Libraries Projects & Case Study
Discovery Systems Used in Academic Libraries Projects & Case Study
 
Week 3
Week 3Week 3
Week 3
 
Putting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationPutting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education Organisation
 
Multimedia Database
Multimedia DatabaseMultimedia Database
Multimedia Database
 
App4 gauldd
App4 gaulddApp4 gauldd
App4 gauldd
 
FACE EXPRESSION IDENTIFICATION USING IMAGE FEATURE CLUSTRING AND QUERY SCHEME...
FACE EXPRESSION IDENTIFICATION USING IMAGE FEATURE CLUSTRING AND QUERY SCHEME...FACE EXPRESSION IDENTIFICATION USING IMAGE FEATURE CLUSTRING AND QUERY SCHEME...
FACE EXPRESSION IDENTIFICATION USING IMAGE FEATURE CLUSTRING AND QUERY SCHEME...
 
Participatory Media Literacy Diverse2008
Participatory Media Literacy Diverse2008Participatory Media Literacy Diverse2008
Participatory Media Literacy Diverse2008
 
The Social Semantic Server: A Flexible Framework to Support Informal Learning...
The Social Semantic Server: A Flexible Framework to Support Informal Learning...The Social Semantic Server: A Flexible Framework to Support Informal Learning...
The Social Semantic Server: A Flexible Framework to Support Informal Learning...
 
The Social Semantic Server - A Flexible Framework to Support Informal Learnin...
The Social Semantic Server - A Flexible Framework to Support Informal Learnin...The Social Semantic Server - A Flexible Framework to Support Informal Learnin...
The Social Semantic Server - A Flexible Framework to Support Informal Learnin...
 

Visual Information Retrieval: Advances, Challenges and Opportunities

  • 1. Visual Information Retrieval: Advances, Challenges and Opportunities. Oge Marques, PhD, Professor, College of Engineering and Computer Science, Florida Atlantic University – Boca Raton, FL (USA)
  • 2. The Distinguished Speakers Program is made possible by For additional information, please visit http://dsp.acm.org/
  • 3. About ACM ACM, the Association for Computing Machinery is the world’s largest educational and scientific computing society, uniting educators, researchers and professionals to inspire dialogue, share resources and address the field’s challenges. ACM strengthens the computing profession’s collective voice through strong leadership, promotion of the highest standards, and recognition of technical excellence. ACM supports the professional growth of its members by providing opportunities for life-long learning, career development, and professional networking. With over 100,000 members from over 100 countries, ACM works to advance computing as a science and a profession. www.acm.org
  • 4. Image and video everywhere! • 1.4-1.7 billion people have a smartphone with camera • 350 million photos uploaded to Facebook every day • Instagram (launched October 2010) has 300 million users • Snapchat, Instagram, Facebook and WhatsApp users (combined) share 1.8 billion photos each day.
  • 5. The big mismatch
  • 6. Example: your digital shoebox • Does this screenshot look familiar to you?
  • 7. Example: the Web • I wanted a picture (e.g., a photo or clipart) to illustrate my presentation, report or handout and couldn't find it! Source: Flickr (https://www.flickr.com/photos/83633410@N07/7658225516)
  • 8. Working definition: Visual Information Retrieval (VIR) techniques aim at solving the problem of finding relevant (documents containing) images and videos based on incomplete input/query, which can be visual, text, or both.
  • 9. Possible solutions 1. Text-based (also known as tag-based) search • Examples: – Google image search (https://images.google.com/) – Bing image search (https://www.bing.com/?scope=images) – Yahoo image search (https://images.search.yahoo.com/) – Creative Commons portal (http://search.creativecommons.org/) – Many others (Flickr, Shutterstock, PhotoBucket, etc.)
  • 10. Example: text-based web search. Google image search results for “thunderbird”. Source: Google Image Search (http://images.google.com/)
  • 11. Example: text-based web search. Google image search results for “blue thunderbird”. Source: Google Image Search (http://images.google.com/)
  • 12. Example: text-based web search. Bing image search results for “thunderbird”. Source: Bing Image Search (http://bing.com/)
  • 13. Example: text-based web search. Bing image search results for “blue thunderbird”. Source: Bing Image Search (http://bing.com/)
  • 14. Example: text-based web search. Yahoo image search results for “thunderbird”. Source: Yahoo Image Search (http://images.search.yahoo.com)
  • 15. Example: text-based web search. Yahoo image search results for “blue thunderbird”. Source: Yahoo Image Search (http://images.search.yahoo.com)
  • 16. Possible solutions 2. Content-based search (also known as reverse image search or search by visual similarity) • Examples: – Google image search (https://images.google.com/) – TinEye (https://www.tineye.com/) – ImageRaider (https://www.imageraider.com) – Others
  • 17. Example: content-based web search. Query image and Google image search results. Source: Google Image Search (http://images.google.com/)
  • 18. Example: content-based web search. TinEye search results. Source: TinEye (http://www.tineye.com/)
  • 19. Example: content-based web search. ImageRaider search results. Source: ImageRaider (http://www.imageraider.com)
  • 20. Possible solutions 3. Mixed search (text + visual aspects) • Examples: – Google image search (https://images.google.com/) – Bing image search (https://www.bing.com/?scope=images)
  • 21. Example: mixed (text + image) search. Query image, query text “Ferrari”, and Google image search results. Source: Google Image Search (http://images.google.com/)
  • 22. Example: mixed (text + image) search. Bing image search results. Source: Bing Image Search (http://bing.com/)
  • 23. Question: Are you happy with the results so far?
  • 24. Outline • Part I – Fundamental concepts • Part II – Challenges • Part III – Latest advances • Part IV – Research opportunities
  • 25. Part I Fundamental concepts (a very brief overview)
  • 26. Basic framework. Source: Lei Zhang and Yong Rui. 2013. Image search—from thousands to billions in 20 years. ACM Trans. Multimedia Comput. Commun. Appl. 9, 1s, Article 36.
  • 27. Build your own VIR solution • LIRE: Lucene Image Retrieval – Java library that provides a simple way to retrieve images and photos based on their color and texture characteristics. – LIRE creates a Lucene index of image features for content-based image retrieval (CBIR). – Provides easy-to-use methods for searching the index and browsing results. – Open source (available under the GNU GPL license).
  • 28. LIRE • Web demo: http://demo-itec.uni-klu.ac.at/liredemo/ – 1 M indexed images (the MIRFLICKR-1M data set) http://press.liacs.nl/mirflickr/ – 6 descriptors: CL, CEDD, EH, JCD, PH, SC – Several other features (check back often) • For additional information: http://www.lire-project.net/
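The idea behind slides 27 and 28 (extract a feature vector per image, store the vectors in an index, rank by distance to the query's vector) can be illustrated with a small sketch. This is not LIRE's API and does not use Lucene; it is a plain-Python illustration that assumes a coarse RGB color histogram as the only feature and L1 distance as the similarity measure. The function names and the toy "images" (lists of RGB tuples) are made up for the example.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantize each RGB pixel into a coarse bin and return a normalized histogram."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}


def l1_distance(h1, h2):
    """Sum of absolute differences between two sparse histograms."""
    keys = set(h1) | set(h2)
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in keys)


def build_index(images):
    """Map each image name to its precomputed feature (the 'index')."""
    return {name: color_histogram(pixels) for name, pixels in images.items()}


def search(index, query_pixels, top_k=3):
    """Rank indexed images by increasing distance to the query's histogram."""
    q = color_histogram(query_pixels)
    ranked = sorted(index.items(), key=lambda item: l1_distance(q, item[1]))
    return [(name, l1_distance(q, hist)) for name, hist in ranked[:top_k]]


# Hypothetical toy collection: each image is just a list of (R, G, B) pixels.
images = {
    "sunset": [(250, 120, 30)] * 80 + [(40, 20, 90)] * 20,
    "ocean": [(20, 60, 200)] * 90 + [(240, 240, 240)] * 10,
    "forest": [(30, 140, 40)] * 100,
}
index = build_index(images)

# Query by example: an orange-dominated image should rank "sunset" first.
query = [(245, 110, 20)] * 70 + [(30, 30, 100)] * 30
results = search(index, query, top_k=2)
print(results)
```

A real system would replace the histogram with richer descriptors (such as the color and texture descriptors listed on slide 28) and the dictionary with a persistent index, but the extract, index, rank structure stays the same.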
  • 30. Four selected challenges • Some challenges faced by VIR designers and researchers include: 1. the need to capture and measure similarity among images; 2. the semantic gap (along with other gaps); 3. the need to take users’ intentions into account when designing VIR systems; and 4. the inherent difficulty in making VIR solutions work effectively in broad domains.
  • 31. The elusive notion of similarity • Are these two images similar? (visually similar) Source: Lux, Mathias, and Oge Marques. Visual information retrieval using Java and LIRE. Morgan Claypool Synthesis Lectures on Information Concepts, Retrieval, and Services 5.1 (2013): 1-112.
  • 32. The elusive notion of similarity • Are these two images similar? (semantically related) Source: Lux, Mathias, and Oge Marques. Visual information retrieval using Java and LIRE. Morgan Claypool Synthesis Lectures on Information Concepts, Retrieval, and Services 5.1 (2013): 1-112.
  • 33. The semantic gap • The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation. • “The pivotal point in content-based retrieval is that the user seeks semantic similarity, but the database can only provide similarity by data processing. This is what we called the semantic gap.” [Smeulders et al., 2000]
  • 34. Other gaps. Ontology of gaps [1]: 14 types of gaps, in 4 categories: • Content-related • Feature-related • Performance-related • Usability-related. [1] Deserno, T. M., Antani, S., Long, R. (2009). Ontology of Gaps in Content-Based Image Retrieval. Journal of Digital Imaging: The Official Journal of the Society for Computer Applications in Radiology, 22(2), 202–215. doi:10.1007/s10278-007-9092-x
  • 35. The utility gap • Defined as the gap between what people expect from the MIR systems (results that would help them in their further actions) and what they actually get offered as the system output. [2] • Major steps towards bridging the utility gap [2]: – to revisit the notion of relevance in order to address the full information need of the user, – to look beyond the relevance when designing and assessing MIR solutions, – to steer the retrieval process towards the result that is maximally helpful for the user. [2] Hanjalic, A. 2013. Multimedia retrieval that matters. ACM Trans. Multimedia Comput. Commun. Appl. 9, 1s, Article 44 (October 2013)
  • 36. Users’ needs and intentions • The intention gap – Users and developers have quite different views – Cultural and contextual information should be taken into account – User intentions are hard to infer • Privacy issues • Users themselves don’t always know what they want • Who misses the MS Office paper clip?
  • 37. Broad domains. Source: Smeulders et al., “Content-based image retrieval at the end of the early years”, IEEE Transactions on PAMI, Vol. 22, Issue 12, Dec 2000
  • 39. Medical image retrieval. Source: http://medgift.hevs.ch/silverstripe/
  • 40. Medical image retrieval • What has been achieved – Moderately successful results in 2D (e.g., chest x-rays) and 3D (e.g., brain MR slices) image retrieval – Integration of visual and text-based search and retrieval techniques – Experiments with multiple query images (from different modalities) – Well-established thesaurus (MeSH) and ontologies (SNOMED CT) – Ongoing yearly challenge (ImageCLEF) addressing many open topics, e.g., image classification
  • 41. Medical image retrieval • Examples of successful solutions to specific problems – IRMA (Image Retrieval in Medical Applications) - Aachen University (Germany) (http://ganymed.imib.rwth-aachen.de/irma/index_en.php) – MedGIFT (GNU Image Finding Tool) - Geneva University (Switzerland) (http://medgift.hevs.ch/silverstripe/) – WebMIRS - NIH / NLM (USA) (https://ceb.nlm.nih.gov/proj/webmirs/index.php)
  • 42. Medical image retrieval • Ongoing challenges – Development of effective user interfaces and visualization techniques – Feature selection algorithms that take multiple modalities into account – Direct use of multidimensional images – More publicly available standardized datasets for evaluation • Example: VISCERAL (http://www.visceral.eu/benchmarks/retrieval2-benchmark/) – Clinical adoption
  • 43. Medical image retrieval • Suggested reading: – A. Kumar et al. (2013). Content-Based Medical Image Retrieval: A Survey of Applications to Multidimensional and Multimodality Data. Journal of Digital Imaging, 26(6), 1025–1039. – Hwang, K. H., Lee, H., Choi, D. (2012). Medical Image Retrieval: Past and Present. Healthcare Informatics Research, 18(1), 3–9.
  • 44. Mobile visual search (MVS). [Figure: page excerpt from the section “Robust mobile image recognition”. The legible text states that the most successful algorithms for content-based image retrieval use an approach referred to as bag of features (BoF) or bag of words (BoW), an idea borrowed from text retrieval: to find a particular text document, such as a Web page, it is sufficient to use a few well-chosen words. Fig. 1: a snapshot of an outdoor mobile visual search system being used; the system augments the viewfinder with information about the objects it recognizes in the image taken with a camera phone. Fig. 2: a pipeline for image retrieval, starting from the query image and feature extraction, followed by feature matching against the database and a geometric verification (GV) step; the rest of the caption is cut off.] Source: Girod et al., IEEE Signal Processing Magazine, July 2011, p. 62.
  • 45. Mobile visual search (MVS) • What has been achieved – Substantial research on compact image descriptors – MPEG-7 Part 13: Compact descriptors for visual search (CDVS) (2013) – Standardized datasets, for example: • Stanford Mobile Visual Search Data Set (http://web.cs.wpi.edu/~claypool/mmsys-dataset/2011/stanford/) – Many apps and APIs, such as: • CamFind (http://camfindapp.com/) • Moodstocks (https://moodstocks.com/)
  • 46. Mobile visual search (MVS) • Ongoing challenges – Ensure low latency (and interactive queries) under constraints such as: • Network bandwidth • Computational power • Battery consumption – Achieve robust visual recognition in spite of low-resolution cameras, varying lighting conditions, etc. – Handle broad and narrow domains – Explore the full potential of using MVS to bridge the virtual (digital) world and real world
  • 47. Mobile visual search (MVS) • Suggested reading: – Girod, B.; Chandrasekhar, V.; Chen, D.M.; Ngai-Man Cheung; Grzeszczuk, R.; Reznik, Y.; Takacs, G.; Tsai, S.S.; Vedantham, R., Mobile Visual Search, Signal Processing Magazine, IEEE, vol. 28, no. 4, pp. 61-76, July 2011 – Ling-Yu Duan; Jie Lin; Jie Chen; Tiejun Huang; Wen Gao, Compact Descriptors for Visual Search, MultiMedia, IEEE, vol. 21, no. 3, pp. 30-40, July-Sept. 2014
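The bag-of-words approach mentioned on slide 44 (local features quantized into "visual words", then looked up through an inverted file, as in text retrieval) can be sketched as follows. This is a toy illustration under several assumptions, not the pipeline from Girod et al.: the descriptors are 2-D points rather than real local features, the codebook is fixed by hand instead of being learned (e.g., by k-means), the scoring is an IDF-weighted histogram intersection chosen for simplicity, and the geometric verification step is omitted. All names and data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def nearest_word(descriptor, codebook):
    """Return the index of the codebook centroid closest to the descriptor."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[i])),
    )


def to_bag_of_words(descriptors, codebook):
    """Count how often each visual word occurs among an image's descriptors."""
    return Counter(nearest_word(d, codebook) for d in descriptors)


def build_inverted_file(database, codebook):
    """Map each visual word to the images containing it (with counts)."""
    inverted = defaultdict(dict)
    for image_id, descriptors in database.items():
        for word, count in to_bag_of_words(descriptors, codebook).items():
            inverted[word][image_id] = count
    return inverted


def query(inverted, descriptors, codebook, num_images):
    """Score only images sharing at least one visual word with the query."""
    scores = defaultdict(float)
    for word, q_count in to_bag_of_words(descriptors, codebook).items():
        postings = inverted.get(word, {})
        if not postings:
            continue
        # Smoothed IDF: words that occur in fewer images count for more.
        idf = math.log((1 + num_images) / (1 + len(postings))) + 1.0
        for image_id, db_count in postings.items():
            scores[image_id] += idf * min(q_count, db_count)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hand-picked codebook of four visual words (an assumption for the example).
codebook = [(0, 0), (10, 0), (0, 10), (10, 10)]

# Hypothetical database: image id -> list of 2-D "descriptors".
database = {
    "poster": [(0.5, 0.2), (9.8, 0.1), (9.5, 0.4)],
    "cd_cover": [(0.1, 9.7), (9.9, 9.8), (10.2, 10.1)],
    "book": [(0.3, 0.1), (0.2, 9.9), (0.4, 0.0)],
}
inverted = build_inverted_file(database, codebook)

# A query photo whose features resemble those of "poster".
ranked = query(inverted, [(0.4, 0.3), (9.7, 0.2), (9.9, 0.3)], codebook, len(database))
print(ranked)
```

Because the inverted file is only consulted for words present in the query, images that share no visual word with it ("cd_cover" here) are never scored, which is what makes this structure attractive for large databases.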
  • 49. Advice for [young] researchers • In this last part, I’ve compiled bits of advice that I believe might help researchers who are entering the field. • They are based on almost 20 years of work in VIR and related topics. • It is my sincere hope that they will be useful and lead to successful research experiences.
  • 50. Advice for [young] researchers • Advice #1 – Pick a specific problem and domain • Example (from our ongoing work) – Veterinary radiology image retrieval
  • 51. Advice for [young] researchers • Advice #2 – Find (or build) the appropriate dataset • Examples: – ImageNet (http://www.image-net.org/) – MIRFLICKR (http://press.liacs.nl/mirflickr/) – INRIA Holidays dataset (http://lear.inrialpes.fr/~jegou/data.php) – UCID: Uncompressed Color Image Database (http://homepages.lboro.ac.uk/~cogs/datasets/ucid/ucid.html) – Many others (associated with challenges – see next slide)
  • 52. Advice for [young] researchers • Advice #3 – Participate in challenges related to the problem you are working on • Examples: – ImageCLEF (http://www.imageclef.org/2015) – LifeCLEF (http://www.imageclef.org/lifeclef/2015) – ImageNet Large Scale Visual Recognition Challenge (ILSVRC2015) (http://image-net.org/challenges/LSVRC/2015/index) – MSR-Bing Image Retrieval Challenge (IRC) (http://research.microsoft.com/en-us/projects/irc/) – MediaEval Benchmarking Initiative for Multimedia Evaluation (http://www.multimediaeval.org/mediaeval2015)
  • 53. Advice for [young] researchers • Advice #4 – Be mindful of related areas where you can make a contribution • Example (from our own work): – While working on the broader topic of Medical Case Retrieval (MCR), Mario Taschwer and I participated in the ImageCLEF 2015 compound figure separation task (http://www.imageclef.org/2015/medical). – Our current results are the best reported in the literature.
  • 54. Advice for [young] researchers • Advice #5 – Beware of the trap of building a solution in search of a problem • Example: – For many years, a tough selling point for CBIR was the reliance on a query-by-example (QBE) paradigm. • What if I don't have a good example image to begin with? – With the popularization of mobile visual search (MVS) solutions, a use case for this model emerged naturally, since the example is right in front of the user!
  • 55. Advice for [young] researchers • Advice #6 – Think about creative ways to leverage the power of human computation • Examples: – Crowdsourcing campaigns – Mining social media visual data and associated metadata (text, URLs, hashtags, etc.) – Games with a purpose (GWAPs)
  • 56. Advice for [young] researchers • Advice #7 – Put yourself in the shoes of the user • More specifically: – Take into account: • the context of the search • the usefulness of results to the user – Understand (and model) users’ intentions, preferences and needs – Create better interfaces – Provide a better user experience
  • 57. Final remarks • “Visual information retrieval” is an active and vibrant research area, with many open research challenges and market opportunities. • There is a great need for good solutions to specific problems. • Pick one of the many open problems, challenges, and opportunities and build a successful solution! Contact information: omarques@fau.edu