Image Search: Then and Now
1. Image Search: Then and Now
Integrated Knowledge Solutions
iksinc@yahoo.com
sikrishan@gmail.com
iksinc.wordpress.com
2. Outline
• Introduction
• Image = Content + Context
• Content Based Image Retrieval (CBIR)
• Bridging the Semantic Gap
• Using Social Interactions for Retrieval
• Where do we go from here
3. What is Image Search?
• Image search means retrieving images from an
image database that satisfy the user’s need.
• The user need may be expressed in the following
ways:
– Keywords or text describing the image content
– An exemplar image
• Other names for image search
– Image retrieval
– Image similarity search
– Content based image retrieval (CBIR)
5. Nalanda University was one of the first universities
in the world, founded in the 5th Century BC, and
reported to have been visited by the Buddha during
his lifetime. At its peak, in the 7th century AD,
Nalanda held some 10,000 students when it was
visited by the Chinese scholar Xuanzang.
6. The Royal Library of Alexandria, in Egypt, seems to have been
the largest and most significant great library of the ancient
world. It functioned as a major center of scholarship from its
construction in the third century B.C. until the Roman
conquest of Egypt in 48 B.C.
8. But Nowadays
No Distinction Between Document Producers
and Consumers
10. Some Relevant Numbers
Flickr has over 6 billion pictures as of August 2011,
and 3.5 million images are uploaded daily.
Photobucket has more than 10 billion images, and
over 4 million images are uploaded every day.
Facebook has over 60 billion photos, and more than
350 million photos are uploaded every day.
Instagram has over 20 billion photos. About 60
million photos are uploaded every day.
11. An image nowadays is not just a
picture; it is a picture with
a thousand words
12. Image = Content + Context
Content: the picture itself
Context: its tags (Cherry blossom, Japantown, San Francisco, Peace Pagoda)
13. So, image retrieval should benefit
from the contextual component, if
present.
How?
But, first let us look at image
retrieval from the content
perspective only
15. A Typical QBIC Type Image Retrieval
System
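[Diagram: Media Collection → Feature Extraction → Features → Indexing & Matching → Retrieved Results. The Query passes through its own Feature Extraction into Indexing & Matching, with Relevance Feedback on the Retrieved Results.]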
Such systems/approaches are often referred to as Content
Based Image Retrieval (CBIR)
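The pipeline on this slide can be sketched in a few lines. This is a minimal illustration, not the QBIC implementation: it assumes each image is a flat list of grayscale values (0 to 255), uses a coarse intensity histogram as the only feature, and matches with histogram intersection. The function names and the toy collection are hypothetical, and the relevance feedback step is left out.

```python
from collections import Counter

def extract_features(pixels, bins=4):
    """Feature extraction: normalized intensity histogram with `bins` buckets."""
    counts = Counter(min(p * bins // 256, bins - 1) for p in pixels)
    total = len(pixels)
    return [counts.get(b, 0) / total for b in range(bins)]

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def build_index(collection):
    """Indexing: store one feature vector per image in the media collection."""
    return {name: extract_features(pixels) for name, pixels in collection.items()}

def search(index, query_pixels, top_k=2):
    """Matching: rank indexed images by similarity to the query image's features."""
    q = extract_features(query_pixels)
    ranked = sorted(index, key=lambda name: similarity(index[name], q), reverse=True)
    return ranked[:top_k]

# Toy media collection (hypothetical data).
collection = {
    "dark": [10, 20, 30, 40],
    "bright": [200, 220, 240, 250],
    "mixed": [10, 20, 220, 240],
}
index = build_index(collection)
results = search(index, [15, 25, 35, 45])
print(results)
```

Because the features are purely signal-level, two images with similar histograms rank as similar whether or not they show the same thing.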
18. Semantic Gap
Early systems produced results
wherein the retrieved
documents were visually similar
(signal level similar) but not
necessarily similar in showing
the same semantic concept.
Arnold Smeulders, Marcel Worring, Simone Santini, Amarnath Gupta, and
Ramesh Jain, “Content-Based Image Retrieval at the End of the Early Years,”
IEEE Transactions on Pattern Analysis and Machine Intelligence,
December 2000
http://www.searchenginejournal.com/7-similarity-based-image-search-engines/8265/
19. Semantic Gap
Users also like to query using descriptive
words rather than query images or other
multimedia objects. This requires retrieval
systems to correlate low-level features
with high level concepts.
Visually dissimilar
images representing
the same concept.
21. How to Bridge the Semantic Gap?
Manual annotation
Use machine learning to:
• Build image category classifiers to
perform semantic filtering of the
results
• Build specific detectors for objects to
associate concepts with images
• Build object models using low level
features
Exploit context:
• Text surrounding images
• Associated sound track and
closed captions in videos
• Query history
27. Example of Image Search using Keywords
Search result in 2014
Again, the results are better organized into
sub-categories
28. Exploiting Context: An Example
Kulesh, Petrushin and Sethi, “The PERSEUS Project: Creating Personalized Multimedia News Portal,”
Proceedings Second Int’l Workshop on Multimedia Data Mining, 2001
29. Machine Learning of Image Concepts
• Challenging problem
• Presence of multiple concepts/multiple instances
• Disproportionate number of negative examples
• Manpower needed for labeling training examples
30. Feature Extraction Issues
Whole image based features.
Easy to use but not very
effective
Region based features. Both
regular region structure and
segmented regions are popular
Salient objects based features.
Connected regions
corresponding to dominant
visual properties of objects in an
image
31. Scale Invariant Feature Transform
(SIFT) Descriptors
SIFT descriptors or their variants are
currently the most popular features
in use. Each image generates
thousands of features (keypoint
descriptors), with each feature
typically consisting of 128 values
http://www.vlfeat.org/
D. G. Lowe, “Distinctive image
features from scale-invariant
keypoints,” IJCV, 2004.
32. Learning Image Concepts
• Both supervised and unsupervised
learning methods (SVM, DT, AdaBoost,
VQ etc.) have been used
• Early work was limited to a few tens of
categories; however, some current
systems can work with thousands of
categories/concepts
33. VQ Based Learning Classifier
[Diagram: features from the Test Image are matched against per-class codebooks (Water Codebook, Sky Codebook, Fire Codebook); the Best Codebook determines the Label.]
Mustafa & Sethi (2004)
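A rough sketch of the idea in the diagram, under stated assumptions: each class has its own codebook, the test image is a set of feature vectors, and the label comes from the codebook with the lowest average quantization distortion. The codebooks here are hand-set (R, G, B) values for illustration only. In practice they would be learned from training images, and this is not necessarily the exact procedure of Mustafa & Sethi.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def distortion(features, codebook):
    """Average squared distance from each feature vector to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in features) / len(features)

def classify(features, codebooks):
    """Return the label of the codebook that quantizes the features best."""
    return min(codebooks, key=lambda label: distortion(features, codebooks[label]))

# Hypothetical two-codeword codebooks over (R, G, B) block colors.
codebooks = {
    "water": [(20, 60, 120), (40, 90, 150)],
    "sky": [(135, 206, 235), (200, 225, 250)],
    "fire": [(230, 80, 20), (250, 160, 30)],
}

# Hypothetical block features extracted from a test image.
test_features = [(240, 100, 25), (225, 85, 15), (245, 150, 40)]
label = classify(test_features, codebooks)
print(label)
```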
36. Co-occurrence of Bag of Words
[Diagram: Image Collection → Edge Analysis → Images → Collection of Binary Image Blocks → Clustering → Local Feature Descriptors (Codewords) → Codeword Representation of Images → Co-occurrence Matrices of Local Features → Compute Distances → Image Distance Matrix → Pathfinder Network]
Mukhopadhyay, Ma, and Sethi, “Pathfinder Networks for Content Based Image
Retrieval Based on Automated Shape Feature Discovery,” ISMSE 2004
37. Co-occurrence of BoW
Original image
Representation by
feature indices
(cluster membership)
Co-occurrence matrix
Hausdorff metric:
$$H(A, B) = \max\{h(A, B),\, h(B, A)\}$$
$$h(A, B) = \max_{a \in A} \min_{b \in B} \|a - b\|$$
Manhattan distance
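The two ingredients named on this slide can be sketched as follows. The code assumes an image is already represented as a grid of codeword indices (cluster memberships), counts co-occurrences between each cell and its right and lower neighbors, and computes the Hausdorff metric with the Manhattan distance as the point-to-point distance. The neighborhood definition and the way the two distances are combined in the cited work are not given on the slide, so both are assumptions here.

```python
def cooccurrence(index_grid, num_codewords):
    """Count how often codeword i has codeword j as its right or lower neighbor."""
    m = [[0] * num_codewords for _ in range(num_codewords)]
    rows, cols = len(index_grid), len(index_grid[0])
    for r in range(rows):
        for c in range(cols):
            i = index_grid[r][c]
            if c + 1 < cols:
                m[i][index_grid[r][c + 1]] += 1
            if r + 1 < rows:
                m[i][index_grid[r + 1][c]] += 1
    return m

def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def directed_hausdorff(A, B, d=manhattan):
    """h(A, B): the largest distance from a point in A to its nearest point in B."""
    return max(min(d(a, b) for b in B) for a in A)

def hausdorff(A, B, d=manhattan):
    """H(A, B) = max{h(A, B), h(B, A)}."""
    return max(directed_hausdorff(A, B, d), directed_hausdorff(B, A, d))

# A 2x2 image represented by feature indices (hypothetical data).
grid = [[0, 0],
        [1, 1]]
matrix = cooccurrence(grid, num_codewords=2)

# Two small point sets showing that h is not symmetric.
A = [(0, 0), (1, 0)]
B = [(0, 0), (4, 0)]
print(matrix, directed_hausdorff(A, B), directed_hausdorff(B, A), hausdorff(A, B))
```

The directed distance is asymmetric, which is why the metric takes the maximum of both directions.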
43. Image Category Classifiers Examples
IMARS provides a large number of built-in classifiers for visual categories that cover places, people, objects, settings,
activities and events. It is easy to add new ones. IMARS can work on a PC or laptop (a trial version is available at IBM
alphaWorks). IMARS can also work at large scale for high-volume batch processing of millions of images and videos
per day. Several demos of IMARS are available (see IMARS demos).
44. Image Classification via Probabilistic Modeling
Semantic labeling. (a) An MPE semantic retrieval system groups images by semantic
concept and learns a probabilistic model for each concept. (b) The system represents
each image by a vector of posterior concept probabilities.
Nuno Vasconcelos, “From Pixels to Semantic Spaces: Advances in Content-Based Image
Retrieval,” IEEE Computer, July 2007
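As a small illustration of representing an image by a vector of posterior concept probabilities, the sketch below applies Bayes' rule to per-concept log-likelihoods. It assumes the per-concept probabilistic models already exist and have scored the image. The likelihood values and the uniform priors are made up for the example, and reading MPE as "minimum probability of error" (pick the concept with the highest posterior) is an assumption about the cited article's terminology.

```python
import math

def posterior_vector(log_likelihoods, priors):
    """Bayes' rule: P(concept | image) is proportional to P(image | concept) * P(concept)."""
    log_joint = {c: ll + math.log(priors[c]) for c, ll in log_likelihoods.items()}
    peak = max(log_joint.values())  # subtract the peak for numerical stability
    unnorm = {c: math.exp(v - peak) for c, v in log_joint.items()}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}

# Hypothetical log-likelihoods of one image under three concept models.
log_likelihoods = {
    "sky": math.log(0.6),
    "water": math.log(0.3),
    "fire": math.log(0.1),
}
priors = {"sky": 1 / 3, "water": 1 / 3, "fire": 1 / 3}

posteriors = posterior_vector(log_likelihoods, priors)
best_concept = max(posteriors, key=posteriors.get)
print(posteriors, best_concept)
```

Two images can then be compared in this semantic space by comparing their posterior vectors rather than their low-level features.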
45. Image = Content + Context
Content: the picture itself
Context: its tags (Cherry blossom, Japantown, San Francisco, Peace Pagoda)
47. About Tags
• User centered
• Imprecise and often overly personalized
• Tag distribution follows power law
• Most users use very few distinct tags while a small group of users work
with an extremely large set of tags
• Also known as Folksonomy, social tagging, and social classification
48. Why Not Use Social Tags for Retrieval?
Problem: The relevant tag is
often not at the top of the list.
Fewer than 10% of the
images have their most relevant
tag at the top of the list.
Solution: Improve tagging by
suggesting potential tags to a
user, tag ranking, tag
completion, etc.
Dong Liu, Xian-Sheng Hua, Linjun Yang, Meng Wang, and
Hong-Jiang Zhang, “Tag Ranking,” WWW 2009, Madrid, Spain
49. Tag Recommendation using Tags
Co-occurrences
Given a target image and initial tags, use co-occurrence of tags to
recommend tags for the target image. This approach does not take
visual feature co-occurrences into account.
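A minimal sketch of co-occurrence-based recommendation, assuming a small collection of already tagged images: count how often each pair of tags appears on the same image, then suggest the tags that co-occur most often with the initial tags. The scoring (raw counts, ties broken alphabetically) and the sample data are assumptions. Published methods typically normalize the counts.

```python
from collections import Counter
from itertools import permutations

def build_cooccurrence(tagged_images):
    """Count, for every ordered tag pair (a, b), the images carrying both tags."""
    co = Counter()
    for tags in tagged_images:
        for a, b in permutations(set(tags), 2):
            co[(a, b)] += 1
    return co

def recommend(initial_tags, co, top_n=2):
    """Score candidate tags by their total co-occurrence with the initial tags."""
    scores = Counter()
    for (a, b), n in co.items():
        if a in initial_tags and b not in initial_tags:
            scores[b] += n
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]

# Hypothetical tagged collection.
tagged_images = [
    ["cherry blossom", "japantown", "san francisco"],
    ["cherry blossom", "spring", "japantown"],
    ["peace pagoda", "japantown", "san francisco"],
    ["golden gate", "san francisco"],
]
co = build_cooccurrence(tagged_images)
suggested = recommend({"cherry blossom"}, co)
print(suggested)
```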
50. Tag Recommendation using Tags
Co-occurrences and Visual Similarity
Kucuktunc, Sevil, Tosun, Zitouni, Duygulu, and Can (SAMT 08)
Given a target image and initial tags, use the existing tagged images to
suggest tags for the target image.
55. Tag Recommendation After Tag
Ranking
• Given an untagged image, find its visually similar “k” images
• Pool the top two ranked tags from k images and select the unique tags as
recommended tags
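The two steps above can be sketched as follows, assuming every image in the collection already has a feature vector and a ranked tag list. The Manhattan distance as the visual similarity measure, the two-dimensional features, and the sample data are illustrative assumptions.

```python
def manhattan(u, v):
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def recommend_tags(query_features, tagged_images, k=2, tags_per_image=2):
    """Pool the top-ranked tags of the k visually nearest images, keeping unique tags."""
    neighbors = sorted(
        tagged_images,
        key=lambda img: manhattan(img["features"], query_features),
    )[:k]
    recommended = []
    for img in neighbors:
        for tag in img["ranked_tags"][:tags_per_image]:
            if tag not in recommended:
                recommended.append(tag)
    return recommended

# Hypothetical collection: tags are listed from most to least relevant.
tagged_images = [
    {"features": (0.9, 0.1), "ranked_tags": ["cherry blossom", "spring", "tree"]},
    {"features": (0.8, 0.2), "ranked_tags": ["cherry blossom", "japantown", "street"]},
    {"features": (0.1, 0.9), "ranked_tags": ["pagoda", "sky"]},
]
tags = recommend_tags((0.88, 0.12), tagged_images)
print(tags)
```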
56. Tag Completion
The complete tag matrix is
generated by imposing
constraints based on visual
similarity, tag to tag similarity,
and similarity with the initial
tag matrix. The matrix
completion is done by an
optimization procedure.
Wu and Jain, IEEE-PAMI, JANUARY
2011
57. What about Taggers & Commenters?
Question: How can we incorporate taggers/commenters
characteristics for improved tag recommendations?
Answer: Use three sets of features, derived from the image to
be tagged, the user’s tag history, and the user’s social interactions
58. Tag History & Social Interaction
Features
Tag history features are based
on the tags the user has used
in the past
Social interaction features are
derived from tags/comments
posted by the user’s
friends/favorite posters
X. Chen & H. Shin, ICDM 2010
59. Current Status of Image Search
• Extensive interest as evident from conferences, journals, and
special issues
• Overall, solid progress is being made
• Efforts towards performance evaluation with benchmarked
collections are gaining more traction
• Integration of content and context through tags and
comments is receiving increasing attention to help improve
retrieval
• Killer applications are beginning to emerge as visual search
gains prominence
• Need for more applications outside entertainment
60. Performance Evaluation Efforts
ImageCLEF2013
- Annotation Task:
- 250,000 training images
- 95 (development) and 116 (test) concepts to be identified
- A lot of label noise in the training set, due to automatic label
extraction from websites
61. Performance Evaluation Efforts
TRECVID workshops, an offshoot of TREC, have been held as yearly evaluation meetings since
2003. The goal of the workshops is to encourage research in content-based
video retrieval and analysis by providing large test collections, realistic system
tasks, uniform scoring procedures, and a forum for organizations interested in
comparing their results.
63. CBIR for Whole Slide Imageries
• The availability of digital whole slide data sets
represents an enormous opportunity to carry out
new forms of numerical and data-driven query,
in modes not based on textual, ontological or
lexical matching.
– Search image repositories with whole images or
image regions of interest
– Carry out search in real time via use of scalable
computational architectures
[Diagram: extraction from image repositories based upon spatial information → analysis of data in the digital domain (…001011010111010111..) → resultant surface map or gallery of matching images]
Slide courtesy of Ulysses J. Balis, M.D.
Director, Division of Pathology Informatics
Department of Pathology
University of Michigan Health System
64. Medical Image Retrieval
Text query:
“Find all the cases in which a tumor decreased in size
for less than three months post treatment, then
resumed a growth pattern after that period”
Text + medical image query:
“Find images with large-sized frontal lobe brain tumors for
patients approximately 35 years old”
+ medical image
[Diagram: the query medical image goes through visual analysis against a general and specialized image-based ontology, producing the query image-based concepts (an image-specific signature such as VB-Spec CUIp, VB-Gen CUI1, VB-Spec CUIk). The text query goes through concept extraction against a medical ontology, producing the query text-based concepts (indexes CUI1, CUI2, …, CUIn).]
71. Take Home Message
• Image/video retrieval is moving into the
commercial domain. A lot more activity is expected
in the near future
• Multimodal/cross-modal retrieval is gaining
importance
• Approaches combining social search and visual
search techniques are expected to gain
prominence
• Crowdsourcing is a cheap and effective way of
tagging media
72. Acknowledgement
• This presentation is based on the work of
numerous researchers from the MIR/ML/CVPR
community. I have tried to give
credit/references wherever possible. Any
omission is unintentional and I apologize for
that.
• I also want to thank my present and past
students and collaborators.