Lei Wang
School of Computing and Information Technology
University of Wollongong, Australia
15-Oct-2016
CBIR in the Era of Deep Learning
-- A Perspective from Feature Representation
• Introduction of CBIR
• Evolution of CBIR
– Early days (before 2000)
– Days of BoF model (2000 ~ 2012)
– Era of Deep learning (after 2012)
• Conclusion
Outline
Images courtesy of related papers and authors
Introduction
• Retrieval
– Getting back information that has been stored in a
database
• Image Retrieval
Introduction
• Text-based image retrieval (TBIR, since the late 1970s)
– Manually associate images with text annotations
– Interpret images with high-level semantics
– Retrieval by matching the associated text annotations
Retrieval result of Google Images for “Airplane”
Introduction
• Issues with text-based image retrieval
– Annotation is time consuming and labour intensive
– Annotations only partially describe the visual content
– Human perception is subjective
– Does not support query by example
Drouin Post Office, front desks Iron Ore Fashion
Introduction
• Content-based image retrieval
– Human annotators are replaced by computers
– Text annotations are replaced by visual features
– Retrieval by comparing the associated visual features
Drouin Post Office, front desks Iron Ore Fashion
Introduction
• National Science Foundation (NSF) organised a special
workshop on the topic of visual information
management (Feb 1992, San Jose, CA)
• "It would be impossible to cope with this explosion of image
information, unless the images were organized for retrieval.
The fundamental problem is that images, video, and other
similar data differ from numeric data and text data format,
and hence they require a totally different technique of
organization, indexing and query processing."
Introduction
• CBIR categorisation
– No query: Randomly browse similar images
– Query by text (by typing “airplane” or description)
– Query by example
• by using an image, sketch, or graphic of airplane
Introduction
• CBIR categorisation
– Find images of similar colour, texture or shape
– Find images of similar object, scene, place, event, etc.
Introduction
• CBIR categorisation
– Narrow domain
– Broad domain
Introduction
CBIR
Image matching
Image Recognition
Image Segmentation
Object detection
Image annotation
More tasks …
Introduction
CBIR: Computer Vision, Information Retrieval, Database, Machine Learning
Introduction
• Applications of CBIR
– Archival photo collection management
– Personal album management
– Crime investigation
– Fashion and design
– Education and entertainment
– Localisation and navigation
– Medical Image analysis
– ….
Introduction
• CBIR systems
– QBIC, Virage, Photobook, VisualSEEk, MARS, etc.
Source: http://vismod.media.mit.edu/vismod/demos/photobook/
Source: http://www.cse.unsw.edu.au/~jas/talks/curveix/notes.html
Early days
A new research problem received great interest
CBIR: Application, Semantic gap, Domain knowledge, User model, Query mode, Visual features, Similarity measure, Interaction, Learning from data, System, Evaluation
• Hand-crafted features
– Color, texture, shape, structure, etc.
– Goal: “Invariant and discriminative”
• Similarity or distance measure
– Euclidean distance, Manhattan distance, etc.
– Specific measures designed for specific features
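As a minimal sketch of this early-days recipe, the snippet below builds a coarse joint RGB colour histogram as the hand-crafted feature and ranks database images by Euclidean (or Manhattan) distance to the query. The pixel lists, the bin count, and the names `color_histogram` and `rank` are illustrative assumptions, not taken from the talk or from any particular CBIR system.

```python
from math import sqrt

def color_histogram(pixels, bins_per_channel=2):
    """Normalised joint RGB histogram of a list of (r, g, b) pixels in 0..255."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1.0
    total = float(len(pixels))
    return [h / total for h in hist]

def euclidean(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def rank(query_hist, database, distance=euclidean):
    """Return database keys sorted from most to least similar to the query."""
    return sorted(database, key=lambda name: distance(query_hist, database[name]))

# Toy "images" given as flat pixel lists (hypothetical data).
red_image = [(250, 10, 10)] * 4
reddish_image = [(200, 30, 20)] * 3 + [(10, 10, 240)]
blue_image = [(5, 5, 250)] * 4

database = {
    "reddish": color_histogram(reddish_image),
    "blue": color_histogram(blue_image),
}
query = color_histogram(red_image)
ranking = rank(query, database)
print(ranking)
```

Passing `distance=manhattan` to `rank` swaps the similarity measure without touching the feature, which mirrors the separation between feature and distance on the slide.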
Early days
• Relevance feedback
– Bring user into the loop of CBIR to handle “Semantic Gap”
– A key point of “machine learning” research in CBIR
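One classic way to bring the user into the loop is a Rocchio-style query update from information retrieval: the query feature is moved towards images the user marked relevant and away from those marked non-relevant. The talk does not name a specific method, so this is a swapped-in illustration; the weights `alpha`, `beta`, `gamma` and the toy vectors are assumptions.

```python
def mean_vector(vectors, dim):
    """Component-wise mean; a zero vector if no feedback was given."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def refine_query(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update: q' = alpha*q + beta*mean(relevant) - gamma*mean(non_relevant)."""
    dim = len(query)
    pos = mean_vector(relevant, dim)
    neg = mean_vector(non_relevant, dim)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]

# Hypothetical 2-D features of the query and of the images the user labelled.
query = [1.0, 0.0]
relevant = [[0.8, 0.2], [0.6, 0.4]]
non_relevant = [[0.0, 1.0]]
new_query = refine_query(query, relevant, non_relevant)
print(new_query)
```

The refined query is then used for the next retrieval round, so each round of feedback narrows the semantic gap a little for this particular user and query.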
Early days
• Relevance feedback
– Learning from small samples
– Semi-supervised learning
– Transductive learning
– Feature selection, dimensionality reduction
– Kernel based learning
– Manifold learning
– Relation learning
– …
Early days
• Achievements
– Researched CBIR from various perspectives
– Identified the key issues and obstacles
– Many initial but insightful observations and attempts
– Machine learning started playing an important role
• To be improved
– Basic, hand-crafted features, limited invariance
– Considerable dependence on domain theory
– Small-sized databases for evaluation
• SIFT, HOG, SURF, CENTRIST, filter-based, …
– Invariant to view angle, rotation, scale, illumination, ...
Days of the BoF model
Local Invariant Features
http://www.robots.ox.ac.uk/~vgg/software/
Image courtesy of David Lowe, IJCV04
SIFT (Scale Invariant Feature Transform)
Days of the BoF model
Local Invariant Features
http://www.robots.ox.ac.uk/~vgg/research/affine/#software/
Image A Image B
Days of the BoF model
Local Invariant Features
Source: http://ivt.sourceforge.net/examples.html
Image A Image B
Days of the BoF model
Local Invariant Features
Source: http://www.robots.ox.ac.uk/~vgg/share/SearchPractical2012.html
Image A Image B
Days of the BoF model
Local Invariant Features
Days of the BoF model
Bag-of-features (BoF) model is borrowed from text analysis
Days of the BoF model
Interest point detection
or
Dense sampling
The cropped detected regions
Bag-of-features (BoF) model is borrowed from text analysis
Days of the BoF model
A close-up view
Days of the BoF model
A close-up view
Days of the BoF model
Extract features from all training/test images
$x \in \mathbb{R}^d$
Days of the BoF model
Cluster all features to generate “Visual Words”
$\mathbb{R}^d$
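The clustering step can be sketched with plain k-means: every local descriptor is a point in $\mathbb{R}^d$, and the k cluster centres become the visual words. The 2-D toy descriptors and the evenly spaced initialisation below are assumptions chosen so the example is reproducible; real codebooks are built from high-dimensional descriptors such as SIFT with far larger k.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest_center(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: squared_distance(point, centers[j]))

def kmeans(features, k, iterations=10):
    """Plain k-means; centers are initialised from evenly spaced samples."""
    n = len(features)
    centers = [list(features[i * n // k]) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for f in features:
            groups[nearest_center(f, centers)].append(f)
        for j, group in enumerate(groups):
            if group:  # keep the previous center if a cluster became empty
                centers[j] = [sum(column) / len(group) for column in zip(*group)]
    return centers

# Hypothetical 2-D descriptors forming two obvious groups.
descriptors = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [5.0, 5.1], [5.1, 4.9], [4.9, 5.0],
]
codebook = kmeans(descriptors, k=2)
print(codebook)
```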
Days of the BoF model
Generated “Visual Words”
…
…
…
…
Word 1:
Word 2:
Word 3:
Word 4:
Word k: … … … … … … … … … … … … … … … … … … … … … … … … …
…
Days of the BoF model
From an image to a histogram
$[n_1, n_2, \ldots, n_k] \in \mathbb{R}^k$, where $n_1$ is the number of occurrences of the 1st “word” in this image
$[0, 1, 0, \ldots, 0] \in \mathbb{R}^k$
$[1, 0, 0, \ldots, 0] \in \mathbb{R}^k$
$[0, 0, 1, \ldots, 0] \in \mathbb{R}^k$
…
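A short sketch of this step, assuming hard assignment: each local descriptor is mapped to its nearest visual word (a one-hot vector in $\mathbb{R}^k$), and summing these one-hot vectors gives the histogram $[n_1, \ldots, n_k]$. The codebook and descriptors are made-up 2-D values for illustration only.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def bof_histogram(descriptors, codebook):
    """Count how many descriptors fall on each visual word (hard assignment)."""
    hist = [0] * len(codebook)
    for d in descriptors:
        word = min(range(len(codebook)), key=lambda j: squared_distance(d, codebook[j]))
        hist[word] += 1
    return hist

# Hypothetical codebook with k = 3 visual words, and one image's descriptors.
codebook = [[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]]
image_descriptors = [[0.1, 0.2], [4.8, 5.1], [5.2, 4.9], [0.3, 4.7]]
histogram = bof_histogram(image_descriptors, codebook)
print(histogram)  # [1, 2, 1]
```

Soft-assignment and sparse coding, listed later in the talk, replace the one-hot vector with a weighted code but keep the same pooling idea.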
Days of the BoF model
Classifying, clustering or retrieving images
$\mathbb{R}^k$: $y = w^\top x + b$
Days of the BoF model
A Bag-of-Features Image Analysis System
Image database → Feature extraction → Codebook generation → Feature coding → Feature pooling → Classification, Clustering or Retrieval
Days of the BoF model
1999: Local Invariant Features, such as SIFT (Lowe, ICCV99)
2003-2004: Video Google (Sivic, ICCV03); Bag-of-keypoints (Csurka, SLCV@ECCV04)
2005: Pyramid Match Kernel (Grauman, ICCV05); Dense sampling (Jurie, ICCV05); Compact Codebook (Winn, ICCV05)
2006: Vocabulary tree (Nister, CVPR06); Randomized Clustering Forests (Moosmann, NIPS06); Spatial Pyramid Matching (Lazebnik, CVPR06)
2007: Comparative Study (Zhang, IJCV07); Coding with Fisher Kernels (Perronnin, CVPR07)
2008: Kernel Codebook (van Gemert, ECCV08); In Defense of Nearest Neighbor Classifier (Boiman, CVPR08)
2009: Sparse coding for BoF (Yang, CVPR09); Local Coordinate Coding (Yu, NIPS09)
2010: Locality-constrained Linear Coding for BoF (Wang, CVPR10); Coding & pooling scheme comparison (Boureau, CVPR10)
2011: Local Soft-assignment Coding & Mix-order pooling (Liu, ICCV11); Comparative Study on BoF model (Chatfield, BMVC11)
Days of the BoF model
Key issues of CBIR with the BoF model
Source: Nister and Stewenius, CVPR06
• How to quickly create a large visual codebook
– hierarchical k-means clustering
– Approximate k-means clustering
Days of the BoF model
Key issues of CBIR with the BoF model
• How to incorporate spatial information
– The BoF model ignores the spatial information of
SIFT features
Spatial Pyramid Matching
Re-ranking with spatial verification
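The spatial pyramid idea can be sketched as follows: the image is divided into 1×1, 2×2, ... grids, a visual-word histogram is computed per cell, and all histograms are concatenated, so coarse layout is retained. This sketch assumes keypoints come as `(x, y, word)` with coordinates normalised to [0, 1), and it omits the per-level weights used in the original Spatial Pyramid Matching kernel.

```python
def spatial_pyramid(points, num_words, levels=2):
    """Concatenate per-cell visual-word histograms over a pyramid of grids.

    points: list of (x, y, word) with x, y normalised to [0, 1).
    Level l uses a 2**l x 2**l grid. Level weights are omitted here.
    """
    feature = []
    for level in range(levels):
        cells = 2 ** level
        hists = [[0] * num_words for _ in range(cells * cells)]
        for x, y, word in points:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hists[row * cells + col][word] += 1
        for h in hists:
            feature.extend(h)
    return feature

# Three hypothetical keypoints, already quantised to 2 visual words.
points = [(0.1, 0.1, 0), (0.9, 0.1, 1), (0.9, 0.9, 0)]
feature = spatial_pyramid(points, num_words=2, levels=2)
print(feature)  # [2, 1, 1, 0, 0, 1, 0, 0, 1, 0]
```

The first two entries are the ordinary orderless BoF histogram; the remaining eight record which quadrant each word occurred in.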
Days of the BoF model
Key issues of CBIR with the BoF model
Retrieval result before spatial verification
Query:
Days of the BoF model
25 points matched under a consistent spatial relationship
Only 4 points matched under a consistent spatial relationship
• Re-ranking with spatial verification
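A heavily simplified sketch of spatial verification: given tentative feature matches between the query and a database image, count how many agree with a single geometric transformation, then re-rank the shortlist by that count. Practical systems typically fit an affine transformation or homography with RANSAC; here the model is assumed to be a pure translation and every match is tried as the hypothesis, which keeps the example short and deterministic. The coordinates and tolerance are made up.

```python
def count_inliers(matches, tolerance=2.0):
    """Largest number of matches consistent with one translation.

    matches: list of ((xq, yq), (xd, yd)) pairs of query / database keypoints.
    """
    best = 0
    for (xq, yq), (xd, yd) in matches:
        dx, dy = xd - xq, yd - yq  # translation hypothesised from this match
        inliers = 0
        for (uq, vq), (ud, vd) in matches:
            if abs((ud - uq) - dx) <= tolerance and abs((vd - vq) - dy) <= tolerance:
                inliers += 1
        best = max(best, inliers)
    return best

# Hypothetical matches: three share a shift of about (20, 10), one is an outlier.
matches = [
    ((10, 10), (30, 20)),
    ((50, 40), (71, 49)),
    ((20, 80), (40, 91)),
    ((5, 5), (200, 3)),
]
score = count_inliers(matches)
print(score)  # 3
```

Database images in the initial shortlist would then be sorted by `score`, so an image with many consistent matches outranks one with only a few.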
Key issues of CBIR with the BoF model
Days of the BoF model
Retrieval result after spatial verification
Query:
Key issues of CBIR with the BoF model
Days of the BoF model
• Large-scale image retrieval
– Memory, time, precision
– Approximate nearest-neighbor search
$(x_1, x_2, \ldots, x_d)$ → 0100101100… How?
Key issues of CBIR with the BoF model
Days of the BoF model
• Locality-sensitive hashing (LSH)
– Random projection, data independent, unsupervised
• Learning compact binary codes
– Preserving sample similarities, data dependent
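A minimal sketch of the random-projection flavour of LSH: each bit is the sign of the feature's projection onto a random direction, so similar vectors tend to share bits, and codes are compared with Hamming distance. The directions are drawn without looking at the data, which is what makes the method data independent. The bit count, seed, and example vectors are assumptions.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian directions (independent of the data)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_code(x, hyperplanes):
    """One bit per hyperplane: 1 if x lies on its non-negative side, else 0."""
    return [1 if sum(a * b for a, b in zip(x, h)) >= 0 else 0 for h in hyperplanes]

def hamming(c1, c2):
    return sum(b1 != b2 for b1, b2 in zip(c1, c2))

planes = make_hyperplanes(dim=3, n_bits=16)
x = [1.0, 2.0, 3.0]
near = [1.1, 2.0, 2.9]        # small perturbation of x
opposite = [-1.0, -2.0, -3.0]  # points in the opposite direction

code_x = lsh_code(x, planes)
print(hamming(code_x, lsh_code(near, planes)))      # expected to be small
print(hamming(code_x, lsh_code(opposite, planes)))  # every bit flips
```

Learned binary codes replace the random directions with projections fitted to the data so that sample similarities are preserved with fewer bits.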
[Figure: LSH assigning bits 1 and 0 to samples]
Key issues of CBIR with the BoF model
Days of the BoF model
Retrieval examples from the “Oxford5K” data set
Source: Philbin et al., Object retrieval with large vocabularies and fast spatial matching, CVPR07
Days of the BoF model (Summary)
• Achievements
– Local invariant features play a fundamental role
– Visual codebook creation, feature coding, and feature
pooling are extensively studied
– Multiple benchmark data sets are established
– Large-scale image retrieval is also researched
• To be improved
– Feature representation and recognition are handled separately
– Focused more on object-level retrieval but less
on semantic-level retrieval
Era of Deep Learning
• Visual
– Images
– Videos
• Audio
– Speech
– Music
• Text
– Natural Language
• Planning
• …
Era of Deep Learning
• Image Recognition
– Faces, objects, poses, scenes, …
• Video content analysis
– Action, activities, events, summarization, …
• Visual information management
– Search, retrieval, indexing, browsing, …
• Potential Outcome: AI
– Computers can see and understand visual
information
– Robotics, self-driving cars, surveillance
– ….
Era of Deep Learning
Object detection (Source: Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR 2014)
Face Recognition (Source: DeepFace: Closing the Gap to Human-Level Performance in Face Verification, CVPR 2014)
Era of Deep Learning
Pose estimation (Source: DeepPose: Human Pose Estimation via Deep Neural Networks, CVPR 2014)
Image Segmentation (Source: SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, IEEE TPAMI 2016)
Era of Deep Learning
• Fine-grained image recognition
• Human attribute classification
[Ning Zhang et al. CVPR 2014]
[Branson et al. arXiv 2014 ]
Era of Deep Learning
• Action Recognition
• Large-scale Video Classification
[Karpathy et al. CVPR 2014]
[Simonyan et al. arXiv 2014]
Era of Deep Learning
• Invariant and discriminative features
Feature Representation
Feature Extraction → Classification → “Panda”?
Prior Knowledge, Experience
Pose, Occlusion, Multiple objects, Inter-class similarity
Image courtesy of M. Ranzato
Era of Deep Learning
• From hand-crafted features to automatically learned ones
$\mathbb{R}^d$, $\mathbb{R}^k$, $y = w^\top x + b$
Era of Deep Learning
• Directly learn feature representations from data.
• Jointly learn feature representation and classifier.
Low-level Features → Mid-level Features → High-level Features → Classifier → “Panda”?
More abstract representation
Deep Learning: train layers of features so that classifier works well.
Image courtesy of M. Ranzato
Era of Deep Learning
• Deep Learning
– Inspired by the way the human brain processes information
– Many layers of non-linear information processing stages
Era of Deep Learning
Have we been here before?
Yes.
• Basic ideas common to past neural networks research
• Standard machine learning strategies still relevant.
No.
• Computational Power + Large-scale Data + New Algorithms → Deep Learning
Era of Deep Learning
Convolutional Neural Networks (CNNs)
• A special multi-stage architecture inspired by visual system
Era of Deep Learning
Source: slide by Girshick
Fukushima 1980
Neocognitron
LeCun et al. 1989-1998
Hand-written digit reading
Rumelhart, Hinton, Williams 1986
“T” versus “C” problem
...
Krizhevsky, Sutskever, Hinton 2012
ImageNet classification breakthrough
“SuperVision” CNN
Convolutional Neural Networks (CNNs)
Era of Deep Learning
CNNs: ImageNet Breakthrough
● Krizhevsky et al. win 2012 ImageNet classification with a much bigger ConvNet
○ deeper: 7 stages vs 3 before
○ larger: 60 million parameters vs 1 million before
○ 16.4% error (top-5) vs next best 26.2% error
● This was made possible by:
○ fast hardware: GPU-optimized code
○ big dataset: 1.2 million images vs thousands before
○ better regularization: dropout, etc.
[Krizhevsky et al. NIPS 2012]
Image courtesy of Deng et al.
Era of Deep Learning
Learned Features of CNNs
[Matthew D. Zeiler et al. ECCV 2014]
Era of Deep Learning
CBIR: From SIFT to CNNs
• Three main approaches
– Directly use pre-trained CNN models
• to extract feature representations
– Fine-tune pre-trained CNN models
• with side information (pairwise or triplet similarity)
– Bag-of-features model on CNN features
• “Deep SIFT”
Era of Deep Learning
1. Directly use pre-trained CNNs
• How to use the feature representations?
– Which layer?
– How to pool the features in a convolutional layer?
– How to select the features in a convolutional layer?
Era of Deep Learning
1. Directly use pre-trained CNNs
• How to use the feature representations?
– Which layer?
Fully connected layer
Convolutional layer
Era of Deep Learning
1. Directly use pre-trained CNNs
• How to use the feature representations?
– How to pool the features in a convolutional layer?
Convolutional feature map (Height × Width × Depth) → vector $(x_1, x_2, \ldots, x_n)$. How?
• Sum-pooling
• Max-pooling
• Grid-based max-pooling
• Region-based pooling
• Mixed sum & max pooling
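The two simplest options can be sketched directly: a convolutional layer outputs a Height × Width × Depth tensor, and sum-pooling or max-pooling over the spatial positions yields one Depth-dimensional descriptor per image. The tiny 2 × 2 × 3 feature map is made up; in practice the tensor would come from a pre-trained CNN and the pooled vector would usually be normalised before comparison.

```python
def sum_pool(feature_map):
    """Sum activations over all spatial positions, per channel."""
    depth = len(feature_map[0][0])
    return [sum(cell[d] for row in feature_map for cell in row) for d in range(depth)]

def max_pool(feature_map):
    """Keep the strongest activation over all spatial positions, per channel."""
    depth = len(feature_map[0][0])
    return [max(cell[d] for row in feature_map for cell in row) for d in range(depth)]

# Hypothetical feature map indexed as feature_map[row][col][channel].
feature_map = [
    [[1, 0, 2], [0, 3, 1]],
    [[2, 1, 0], [1, 0, 4]],
]
sum_descriptor = sum_pool(feature_map)
max_descriptor = max_pool(feature_map)
print(sum_descriptor)  # [4, 4, 7]
print(max_descriptor)  # [2, 3, 4]
```

Grid-based and region-based variants apply the same functions to sub-windows of the feature map and then combine the per-window descriptors.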
Era of Deep Learning
1. Directly use pre-trained CNNs
• How to use the feature representations?
– How to select the features in a convolutional layer?
• Weighting
• Activation
magnitude
• Region
detection
Source: Cao et al., Where to Focus: Query Adaptive Matching for Instance Retrieval Using Convolutional Feature Maps
Era of Deep Learning
2. Fine-tune pre-trained CNNs
• To incorporate extra information from a new
image data set
– Side information (pairwise or triplet similarity)
– Distance metric learning
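Triplet side information is commonly turned into a training signal with a hinge-style triplet loss: for an anchor, a similar (positive) image and a dissimilar (negative) image, the loss is zero once the negative is farther from the anchor than the positive by at least a margin. The sketch below only evaluates the loss on fixed toy embeddings; the margin value and the use of squared Euclidean distance are assumptions, and fine-tuning would back-propagate this loss through the CNN that produces the embeddings.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss: max(0, margin + d(anchor, positive) - d(anchor, negative))."""
    return max(0.0, margin
               + squared_distance(anchor, positive)
               - squared_distance(anchor, negative))

# Hypothetical 2-D embeddings.
anchor = [0.0, 0.0]
positive = [1.0, 0.0]
far_negative = [3.0, 0.0]    # already well separated, so no penalty
near_negative = [0.5, 0.0]   # closer than the positive, so it is penalised

print(triplet_loss(anchor, positive, far_negative))   # 0.0
print(triplet_loss(anchor, positive, near_negative))  # 1.75
```

Pairwise side information leads to the analogous contrastive formulation, where similar pairs are pulled together and dissimilar pairs are pushed apart.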
√
X
Era of Deep Learning
2. Fine-tune pre-trained CNNs
Source: MatchNet, CVPR2015
Source: Learning Fine-Grained Image Similarity with Deep Ranking, CVPR 2014
Era of Deep Learning
3. Bag-of-features model on “Deep SIFT”
SIFT (Scale Invariant Feature Transform)
Source: Multi-scale Orderless Pooling of Deep Convolutional Activation Features, ECCV2014
Era of Deep Learning
3. Bag-of-features model on “Deep SIFT”
SIFT (Scale Invariant Feature Transform)
“Deep SIFT”
Source: Cao et al., Where to Focus: Query Adaptive Matching for Instance Retrieval Using Convolutional Feature Maps
Era of Deep Learning
3. Bag-of-features model on “Deep SIFT”
Codebook generation → Feature coding → Feature pooling → Classification, Clustering or Retrieval
Or
Era of Deep Learning
2012: Image Classification with DCNN (Krizhevsky, NIPS12)
2014: CNN Features off-the-shelf (Razavian, CVPRW14); Neural codes (Babenko, ECCV14); Deep ranking (Wang, CVPR14); Multi-scale orderless pooling (Gong, ECCV14); Encoding High Dimensional Local Features (Liu, NIPS14); Survey: Deep learning for CBIR (Wan, ACMMM14)
2015: Deep filter banks (Cimpoi, CVPR15); Exploiting Local Features from DNN (Ng, CVPRW15); SPoC (Babenko, ICCV15); MatchNet (Han, CVPR15)
2016: R-MAC (Tolias, ICLR16); CNN IR Learns from BoW (Radenovic, ECCV16); CroW (Kalantidis, ECCVW16); Where to focus (Cao, 2016)
Some papers appeared on arXiv
Summary
• A very limited (and biased) account of CBIR
• CBIR has made significant progress during the
past two decades
• The development of feature representation plays
a key role
• Issues to be resolved
– How to transfer the benefit of Deep Learning?
– How to deal with the unsupervised learning case?
– How to better handle the semantic gap?
– …
Color histogram, Gabor feature, Euclidean distance, User model, Query model, …
SIFT, Bag-of-features, Hashing, Fine-grained recognition, …
Deep features, Deep retrieval, Deep ranking, Deep hashing, …
Images Courtesy of Google Image
…

Weitere ähnliche Inhalte

Was ist angesagt?

Searching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonSearching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonJonathon Hare
 
Searching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonSearching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonJonathon Hare
 
Saliency-based Models of Image Content and their Application to Auto-Annotati...
Saliency-based Models of Image Content and their Application to Auto-Annotati...Saliency-based Models of Image Content and their Application to Auto-Annotati...
Saliency-based Models of Image Content and their Application to Auto-Annotati...Jonathon Hare
 
OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia A...
OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia A...OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia A...
OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia A...Jonathon Hare
 
Lecture 21 - Image Categorization - Computer Vision Spring2015
Lecture 21 - Image Categorization -  Computer Vision Spring2015Lecture 21 - Image Categorization -  Computer Vision Spring2015
Lecture 21 - Image Categorization - Computer Vision Spring2015Jia-Bin Huang
 
Scale Saliency: Applications in Visual Matching,Tracking and View-Based Objec...
Scale Saliency: Applications in Visual Matching,Tracking and View-Based Objec...Scale Saliency: Applications in Visual Matching,Tracking and View-Based Objec...
Scale Saliency: Applications in Visual Matching,Tracking and View-Based Objec...Jonathon Hare
 
Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlat...
Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlat...Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlat...
Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlat...Jonathon Hare
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detectionWenjing Chen
 
Mind the Gap: Another look at the problem of the semantic gap in image retrieval
Mind the Gap: Another look at the problem of the semantic gap in image retrievalMind the Gap: Another look at the problem of the semantic gap in image retrieval
Mind the Gap: Another look at the problem of the semantic gap in image retrievalJonathon Hare
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementSean Moran
 
Chapter 1 and 2 gonzalez and woods
Chapter 1 and 2 gonzalez and woodsChapter 1 and 2 gonzalez and woods
Chapter 1 and 2 gonzalez and woodsasodariyabhavesh
 
Spot the Dog: An overview of semantic retrieval of unannotated images in the ...
Spot the Dog: An overview of semantic retrieval of unannotated images in the ...Spot the Dog: An overview of semantic retrieval of unannotated images in the ...
Spot the Dog: An overview of semantic retrieval of unannotated images in the ...Jonathon Hare
 
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-ResolutionTaegyun Jeon
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementSean Moran
 

Was ist angesagt? (20)

Searching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonSearching Images: Recent research at Southampton
Searching Images: Recent research at Southampton
 
Searching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonSearching Images: Recent research at Southampton
Searching Images: Recent research at Southampton
 
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
 
Saliency-based Models of Image Content and their Application to Auto-Annotati...
Saliency-based Models of Image Content and their Application to Auto-Annotati...Saliency-based Models of Image Content and their Application to Auto-Annotati...
Saliency-based Models of Image Content and their Application to Auto-Annotati...
 
OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia A...
OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia A...OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia A...
OpenIMAJ and ImageTerrier: Java Libraries and Tools for Scalable Multimedia A...
 
Lecture 21 - Image Categorization - Computer Vision Spring2015
Lecture 21 - Image Categorization -  Computer Vision Spring2015Lecture 21 - Image Categorization -  Computer Vision Spring2015
Lecture 21 - Image Categorization - Computer Vision Spring2015
 
Scale Saliency: Applications in Visual Matching,Tracking and View-Based Objec...
Scale Saliency: Applications in Visual Matching,Tracking and View-Based Objec...Scale Saliency: Applications in Visual Matching,Tracking and View-Based Objec...
Scale Saliency: Applications in Visual Matching,Tracking and View-Based Objec...
 
Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlat...
Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlat...Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlat...
Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlat...
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detection
 
Mind the Gap: Another look at the problem of the semantic gap in image retrieval
Mind the Gap: Another look at the problem of the semantic gap in image retrievalMind the Gap: Another look at the problem of the semantic gap in image retrieval
Mind the Gap: Another look at the problem of the semantic gap in image retrieval
 
Depth estimation using deep learning
Depth estimation using deep learningDepth estimation using deep learning
Depth estimation using deep learning
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image Enhancement
 
Chapter 1 and 2 gonzalez and woods
Chapter 1 and 2 gonzalez and woodsChapter 1 and 2 gonzalez and woods
Chapter 1 and 2 gonzalez and woods
 
Spot the Dog: An overview of semantic retrieval of unannotated images in the ...
Spot the Dog: An overview of semantic retrieval of unannotated images in the ...Spot the Dog: An overview of semantic retrieval of unannotated images in the ...
Spot the Dog: An overview of semantic retrieval of unannotated images in the ...
 
ei2106-submit-opt-415
ei2106-submit-opt-415ei2106-submit-opt-415
ei2106-submit-opt-415
 
Super resolution from a single image
Super resolution from a single imageSuper resolution from a single image
Super resolution from a single image
 
Object detection
Object detectionObject detection
Object detection
 
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
 
Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)
Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)
Deep and Young Vision Learning at UPC BarcelonaTech (NIPS 2016)
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image Enhancement
 

Ähnlich wie CBIR in the Era of Deep Learning

Content based image retrieval with LIRe
Content based image retrieval with LIReContent based image retrieval with LIRe
Content based image retrieval with LIRedermotte
 
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...Symeon Papadopoulos
 
Content Based Image Retrieval
Content Based Image Retrieval Content Based Image Retrieval
Content Based Image Retrieval Swati Chauhan
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesUnited States Air Force Academy
 
Evolving a Medical Image Similarity Search
Evolving a Medical Image Similarity SearchEvolving a Medical Image Similarity Search
Evolving a Medical Image Similarity SearchSujit Pal
 
Content based image retrieval Projects.pdf
Content based image retrieval Projects.pdfContent based image retrieval Projects.pdf
Content based image retrieval Projects.pdfrupaymts
 
SIFT Based Feature Extraction and Matching for Archaeological Artifacts
SIFT Based Feature Extraction and Matching for Archaeological ArtifactsSIFT Based Feature Extraction and Matching for Archaeological Artifacts
SIFT Based Feature Extraction and Matching for Archaeological ArtifactsBIPUL MOHANTO [LION]
 
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...SWAMI06
 
The International Image Interoperability Framework: why it's a game-changer f...
The International Image Interoperability Framework: why it's a game-changer f...The International Image Interoperability Framework: why it's a game-changer f...
The International Image Interoperability Framework: why it's a game-changer f...UCD Library
 
The International Image Interoperability Framework Why It’s a Game-Changer fo...
The International Image Interoperability Framework Why It’s a Game-Changer fo...The International Image Interoperability Framework Why It’s a Game-Changer fo...
The International Image Interoperability Framework Why It’s a Game-Changer fo...CONUL Conference
 
Instance Search (INS) task at TRECVID
Instance Search (INS) task at TRECVIDInstance Search (INS) task at TRECVID
Instance Search (INS) task at TRECVIDGeorge Awad
 
Image Processing and Computer Vision in iOS
Image Processing and Computer Vision in iOSImage Processing and Computer Vision in iOS
Image Processing and Computer Vision in iOSOge Marques
 
Image Search: Then and Now
Image Search: Then and NowImage Search: Then and Now
Image Search: Then and NowSi Krishan
 
A Survey about Object Retrieval
A Survey about Object RetrievalA Survey about Object Retrieval
A Survey about Object RetrievalNguyen Tuan
 
Introduction talk to Computer Vision
Introduction talk to Computer Vision Introduction talk to Computer Vision
Introduction talk to Computer Vision Chen Sagiv
 
Mayank Raj - 4th Year Project on CBIR (Content Based Image Retrieval)
Mayank Raj - 4th Year Project on CBIR (Content Based Image Retrieval)Mayank Raj - 4th Year Project on CBIR (Content Based Image Retrieval)
Mayank Raj - 4th Year Project on CBIR (Content Based Image Retrieval)mayankraj86
 
Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3
Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3
Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3Dr. Aparna Varde
 
Compact Descriptors for Visual Search
Compact Descriptors for Visual SearchCompact Descriptors for Visual Search
Compact Descriptors for Visual SearchAntonio Capone
 
image processing image processing image processing
image processing  image processing  image processingimage processing  image processing  image processing
image processing image processing image processingSportsAcademy1
 

Ähnlich wie CBIR in the Era of Deep Learning (20)

Content based image retrieval with LIRe
Content based image retrieval with LIReContent based image retrieval with LIRe
Content based image retrieval with LIRe
 
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
 
Content Based Image Retrieval
Content Based Image Retrieval Content Based Image Retrieval
Content Based Image Retrieval
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
CBIR in the Era of Deep Learning

  • 1. Lei Wang School of Computing and Information Technology University of Wollongong, Australia 15-Oct-2016 CBIR in the Era of Deep Learning -- A Perspective from Feature Representation
  • 2. • Introduction of CBIR • Evolution of CBIR – Early days (before 2000) – Days of BoF model (2000 ~ 2012) – Era of Deep learning (after 2012) • Conclusion Outline Images courtesy of related papers and authors
  • 3. Introduction • Retrieval – Getting back information that has been stored in a database • Image Retrieval
  • 4. Introduction • Text-based image retrieval (TBIR, since late 1970’s) – Manually associate images with text annotations – Interpret images with high-level semantics – Retrieval by matching the associated text annotations Retrieval result of Google Images for “Airplane”
  • 5. Introduction • Issues with text-based image retrieval – Annotation is time consuming and labour intensive – Annotations only partially describe the visual content – Human perception is subjective – Query by example is not supported Drouin Post Office, front desks Iron Ore Fashion
  • 6. Introduction • Content-based image retrieval – Human annotators are replaced by computers – Text annotations are replaced by visual features – Retrieval by comparing the associated visual features Drouin Post Office, front desks Iron Ore Fashion
  • 7. Introduction • National Science Foundation (NSF) organised a special workshop on the topic of visual information management (Feb 1992, San Jose, CA) • "It would be impossible to cope with this explosion of image information, unless the images were organized for retrieval. The fundamental problem is that images, video, and other similar data differ from numeric data and text data format, and hence they require a totally different technique of organization, indexing and query processing."
  • 8. Introduction • CBIR categorisation – No query: Randomly browse similar images – Query by text (by typing “airplane” or description) – Query by example • by using an image, sketch, or graphic of airplane
  • 9. Introduction • CBIR categorisation – Find images of similar colour, texture or shape – Find images of similar object, scene, place, event, etc.
  • 10. Introduction • CBIR categorisation – Narrow domain – Broad domain
  • 11. Introduction CBIR Image matching Image Recognition Image Segmentation Object detection Image annotation More tasks …
  • 13. Introduction • Applications of CBIR – Archival photo collection management – Personal album management – Crime investigation – Fashion and design – Education and entertainment – Localisation and navigation – Medical Image analysis – ….
  • 14. Introduction • CBIR systems – QBIC, Virage, Photobook, VisualSEEk, MARS, etc. Source: http://vismod.media.mit.edu/vismod/demos/photobook/ Source: http://www.cse.unsw.edu.au/~jas/talks/curveix/notes.html
  • 15. Introduction • CBIR systems – QBIC, Virage, Photobook, VisualSEEk, MARS, etc.
  • 16. • Introduction of CBIR • Evolution of CBIR – Early days (before 2000) – Days of BoF model (2000 ~ 2012) – Era of Deep learning (after 2012) • Conclusion Outline Images courtesy of related papers and authors
  • 17. Early days A new research problem received great interest CBIR Application Semantic gap Domain knowledge User model Query mode Visual features Similarity measure Interaction Learning from data System Evaluation
  • 18. • Hand-crafted features – Color, texture, shape, structure, etc. – Goal: “Invariant and discriminative” • Similarity or distance measure – Euclidean distance, Manhattan distance, etc. – Specific measures designed for specific features Early days
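
The pairing of a hand-crafted feature with a generic distance measure described on slide 18 can be sketched as follows. This is a minimal illustration, not the method of any particular early CBIR system: the joint RGB histogram, the choice of 4 bins per channel, and the synthetic test images are all assumptions made for the example.

```python
import numpy as np

def color_histogram(image, bins=4):
    """Joint RGB histogram of an H x W x 3 uint8 image, normalised to sum to 1."""
    pixels = image.reshape(-1, 3)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins), range=[(0, 256)] * 3)
    hist = hist.ravel()
    return hist / hist.sum()

def euclidean(a, b):
    """L2 distance between two feature vectors."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

def manhattan(a, b):
    """L1 distance between two feature vectors."""
    return float(np.sum(np.abs(a - b)))

# Synthetic images (assumed data): pure red, mostly red, and pure blue.
red = np.zeros((8, 8, 3), dtype=np.uint8)
red[..., 0] = 200
reddish = red.copy()
reddish[:2, :, 1] = 200          # a quarter of the pixels become yellow
blue = np.zeros((8, 8, 3), dtype=np.uint8)
blue[..., 2] = 200

h_red, h_reddish, h_blue = (color_histogram(im) for im in (red, reddish, blue))

# The mostly-red image should be closer to the red query than the blue one.
print(manhattan(h_red, h_reddish), manhattan(h_red, h_blue))
print(euclidean(h_red, h_reddish), euclidean(h_red, h_blue))
```

A global colour histogram like this ignores where colours occur in the image, which is one reason such features offer only limited invariance and discrimination.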
  • 19. • Relevance feedback – Bring the user into the loop of CBIR to handle the “semantic gap” – A key point of machine learning research in CBIR Early days
  • 20. • Relevance feedback – Learning from small sample – Semi-supervised learning – Transductive learning – Feature selection, dimensionality reduction – Kernel based learning – Manifold learning – Relation learning – … Early days
  • 21. • Achievements – Researched CBIR from various perspectives – Identified the key issues and obstacles – Many initial but insightful observations and attempts – Machine learning started playing an important role • To be improved – Basic, hand-crafted features, limited invariance – Considerably depend on domain theory – Small-sized databases for evaluation
  • 22. • Introduction of CBIR • Evolution of CBIR – Early days (before 2000) – Days of the BoF model (2000 ~ 2012) – Era of Deep learning (after 2012) • Conclusion Outline Images courtesy of related papers and authors
  • 23. • SIFT, HOG, SURF, CENTRIST, filter-based, … – Invariant to view angle, rotation, scale, illumination, ... Days of the BoF model Local Invariant Features http://www.robots.ox.ac.uk/~vgg/software/ Image courtesy of David Lowe, IJCV04 SIFT (Scale Invariant Feature Transform)
  • 24. Days of the BoF model Local Invariant Features http://www.robots.ox.ac.uk/~vgg/research/affine/#software/ Image A Image B
  • 25. Days of the BoF model Local Invariant Features Source: http://ivt.sourceforge.net/examples.html Image A Image B
  • 26. Days of the BoF model Local Invariant Features Source: http://www.robots.ox.ac.uk/~vgg/share/SearchPractical2012.html Image A Image B
  • 27. Days of the BoF model Local Invariant Features
  • 29. Days of the BoF model Bag-of-features (BoF) model is borrowed from text analysis
  • 30. Days of the BoF model Interest point detection or dense sampling The cropped detected regions Bag-of-features (BoF) model is borrowed from text analysis
  • 31. Days of the BoF model A close-up view
  • 32. Days of the BoF model A close-up view
  • 33. Days of the BoF model Extract features from all training/test images: $x \in \mathbb{R}^d$
  • 34. Days of the BoF model Cluster all features in $\mathbb{R}^d$ to generate “Visual Words”
  • 35. Days of the BoF model Generated “Visual Words”: Word 1, Word 2, Word 3, Word 4, …, Word k
  • 36. Days of the BoF model From an image to a histogram $[n_1, n_2, \ldots, n_k] \in \mathbb{R}^k$, where $n_1$ is the number of occurrences of the 1st “word” in this image. Word assignments of individual features: $[0, 1, 0, \ldots, 0] \in \mathbb{R}^k$, $[1, 0, 0, \ldots, 0] \in \mathbb{R}^k$, $[0, 0, 1, \ldots, 0] \in \mathbb{R}^k$, …
  • 37. Days of the BoF model Classifying, clustering or retrieving images in $\mathbb{R}^k$: $y = w^\top x + b$
  • 38. Days of the BoF model A Bag-of-Features Image Analysis System Image database Feature extraction Codebook generation Feature coding Feature pooling Classification Clustering or Retrieval
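
The pipeline on slides 33 to 38 (extract local features, cluster them into a codebook, code and pool each image into a histogram, then retrieve) can be sketched as below. This is only an illustrative sketch under stated assumptions: random 128-dimensional vectors stand in for real SIFT descriptors, the plain Lloyd's k-means and the helper names (`kmeans`, `assign`, `bof_histogram`) are hypothetical, and hard assignment with count pooling is just one of the coding and pooling schemes the slides allude to.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest centre (visual word) for every row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a k x d codebook of visual words."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        labels = assign(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:          # keep the old centre if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def bof_histogram(descriptors, codebook):
    """Hard-assign each descriptor to a word and pool the counts into a histogram."""
    words = assign(descriptors, codebook)
    counts = np.bincount(words, minlength=len(codebook)).astype(float)
    return counts / counts.sum()

# Assumed data: 5 "images", each with 50 random 128-d descriptors in place of SIFT.
rng = np.random.default_rng(0)
images = [rng.normal(size=(50, 128)) for _ in range(5)]

codebook = kmeans(np.vstack(images), k=8)                 # codebook generation
hists = np.array([bof_histogram(d, codebook) for d in images])  # coding + pooling

# Retrieval: rank database images by distance to the query histogram.
query = hists[0]
distances = np.linalg.norm(hists - query, axis=1)
ranking = np.argsort(distances, kind="stable")
print(ranking)
```

Real systems use codebooks with thousands to millions of words, which is why slide 40 turns to hierarchical and approximate k-means.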
  • 39. Days of the BoF model Local Invariant Features, such as SIFT (Lowe, ICCV99); Video Google (Sivic, ICCV03); Bag-of-keypoints (Csurka, SLCV@ECCV04); Vocabulary tree (Nister, CVPR06); Randomized Clustering Forests (Moosmann, NIPS06); Spatial Pyramid Matching (Lazebnik, CVPR06); Pyramid Match Kernel (Grauman, ICCV05); Dense sampling (Jurie, ICCV05); Compact Codebook (Winn, ICCV05); Comparative Study (Zhang, IJCV07); Coding with Fisher Kernels (Perronnin, CVPR07); Local Soft-assignment Coding & Mix-order pooling (Liu, ICCV11); Comparative Study on BoF model (Chatfield, BMVC11); Locality-constrained Linear Coding for BoF (Wang, CVPR10); Coding & pooling scheme comparison (Boureau, CVPR10); Sparse coding for BoF (Yang, CVPR09); Local Coordinate Coding (Yu, NIPS09); Kernel Codebook (van Gemert, ECCV08); In Defense of Nearest Neighbor Classifier (Boiman, CVPR08)
  • 40. Days of the BoF model Key issues of CBIR with the BoF model Source: Nister and Stewenius, CVPR06 • How to quickly create a large visual codebook – hierarchical k-means clustering – Approximate k-means clustering
  • 41. Days of the BoF model Key issues of CBIR with the BoF model • How to incorporate spatial information – The BoF model ignores the spatial information of SIFT features Spatial Pyramid Matching Re-ranking with Spatial verification
  • 42. Days of the BoF model Key issues of CBIR with the BoF model Retrieval result before spatial verification Query:
  • 43. Days of the BoF model 25 points matched under a consistent spatial relationship Only 4 points matched under a consistent spatial relationship • Re-ranking with spatial verification Key issues of CBIR with the BoF model
  • 44. Days of the BoF model Retrieval result after spatial verification Query: Key issues of CBIR with the BoF model
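
Re-ranking with spatial verification (slides 41 to 44) scores a candidate image by how many feature matches agree with a single geometric transformation. The sketch below is a RANSAC-style illustration under assumptions of my own: the slides do not say which transformation model is used, so an affine model is assumed here, and the function names, tolerance and synthetic match coordinates are hypothetical.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine map (3 x 2 matrix) taking src points to dst points."""
    A = np.hstack([src, np.ones((len(src), 1))])
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M

def count_consistent_matches(src, dst, iters=200, tol=3.0, seed=0):
    """Largest number of matches explained by one affine map (RANSAC-style)."""
    rng = np.random.default_rng(seed)
    src_h = np.hstack([src, np.ones((len(src), 1))])
    best = 0
    for _ in range(iters):
        sample = rng.choice(len(src), size=3, replace=False)   # minimal sample
        M = fit_affine(src[sample], dst[sample])
        errors = np.linalg.norm(src_h @ M - dst, axis=1)       # reprojection error
        best = max(best, int(np.sum(errors < tol)))
    return best

# Assumed data: 40 tentative matches between a query and two candidate images.
rng = np.random.default_rng(1)
src = rng.uniform(0, 100, size=(40, 2))
true_M = np.array([[0.9, 0.2], [-0.2, 0.9], [5.0, -3.0]])

dst_good = rng.uniform(0, 100, size=(40, 2))
dst_good[:25] = np.hstack([src[:25], np.ones((25, 1))]) @ true_M  # 25 consistent matches
dst_bad = rng.uniform(0, 100, size=(40, 2))                       # no common geometry

good_score = count_consistent_matches(src, dst_good)
bad_score = count_consistent_matches(src, dst_bad)
print(good_score, bad_score)
```

Candidates from the initial BoF ranking would then be re-ordered by this inlier count, so an image with many spatially consistent matches moves ahead of one whose matches are scattered.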
  • 45. Days of the BoF model • Large-scale image retrieval – Memory, time, precision – Approximate nearest-neighbor search: how to map a feature vector $(x_1, x_2, \ldots, x_d)$ to a compact binary code such as 0100101100…? Key issues of CBIR with the BoF model
  • 46. Days of the BoF model • Locality-sensitive hashing (LSH) – Random projection, data independent, unsupervised • Learning compact binary codes – Preserving sample similarities, data dependent Key issues of CBIR with the BoF model
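
The random-projection form of LSH mentioned on slide 46 can be sketched in a few lines. This is a minimal illustration, assuming sign-of-random-hyperplane hashing with 32 bits and a synthetic 64-dimensional database; the helper names `make_hasher` and `hamming` are hypothetical, and practical systems typically add multiple hash tables or learned (data-dependent) codes.

```python
import numpy as np

def make_hasher(dim, n_bits, seed=0):
    """Return a function mapping real vectors to n_bits-long binary codes."""
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(n_bits, dim))      # random hyperplanes, data independent
    def hash_fn(X):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return (np.atleast_2d(X) @ planes.T > 0).astype(np.uint8)
    return hash_fn

def hamming(code, codes):
    """Hamming distance from one binary code to every row of codes."""
    return np.count_nonzero(codes != code, axis=1)

# Assumed data: 1000 database vectors and a slightly perturbed copy of item 42.
rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 64))
query = database[42] + 0.01 * rng.normal(size=64)

hash_fn = make_hasher(dim=64, n_bits=32)
codes = hash_fn(database)          # 1000 x 32 bits instead of 1000 x 64 floats
q_code = hash_fn(query)[0]

dists = hamming(q_code, codes)
print(int(np.argmin(dists)), int(dists[42]))
```

Vectors separated by a small angle agree on most bits, so comparing short binary codes with the Hamming distance approximates nearest-neighbour search at a fraction of the memory and time.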
  • 47. Days of the BoF model Retrieval examples from the “Oxford5K” data set Source: Philbin et al., Object retrieval with large vocabularies and fast spatial matching, CVPR07
  • 48. Days of the BoF model (Summary) • Achievements – Local invariant features play a fundamental role – Visual codebook creation, feature coding, and feature pooling are extensively studied – Multiple benchmark data sets are established – Large-scale image retrieval is also researched • To be improved – Feature representation and recognition are separate – Focused more on object-level retrieval and less on semantic-level retrieval
  • 49. • Introduction of CBIR • Evolution of CBIR – Early days (before 2000) – Days of the BoF model (2000 ~ 2012) – Era of Deep learning (after 2012) • Conclusion Outline Images courtesy of related papers and authors
  • 50. Era of Deep Learning • Visual – Images – Videos • Audio – Speech – Music • Text – Natural Language • Planning • …
  • 51. Era of Deep Learning • Image Recognition – Faces, objects, poses, scenes, … • Video content analysis – Action, activities, events, summarization, … • Visual information management – Search, retrieval, indexing, browsing, … • Potential Outcome: AI – Computers can see and understand visual information – Robotics, self-driving cars, surveillance – ….
  • 52. Era of Deep Learning Object detection (Source: Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR 2014) Face Recognition (Source: DeepFace: Closing the Gap to Human-Level Performance in Face Verification, CVPR 2014)
  • 53. Era of Deep Learning Pose estimation (DeepPose: Human Pose Estimation via Deep Neural Networks, CVPR2014) Image Segmentation (Source: SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, IEEE TPAMI 2016)
  • 54. Era of Deep Learning • Fine-grained image recognition • Human attribute classification [Ning Zhang et al. CVPR 2014] [Branson et al. arXiv 2014 ]
  • 55. Era of Deep Learning • Action Recognition • Large-scale Video Classification [Karpathy et al. CVPR 2014] [Simonyan et al. arXiv 2014]
  • 56. Era of Deep Learning • Invariant and discriminative features Feature Representation Feature Extraction Classification “Panda”? Prior Knowledge, Experience Pose Occlusion Multiple objects Inter-class similarity Image courtesy of M. Ranzato
  • 57. Era of Deep Learning • From hand-crafted features to automatically learned ones: $\mathbb{R}^d$, $\mathbb{R}^k$, $y = w^\top x + b$
  • 58. Era of Deep Learning • Directly learn feature representations from data. • Jointly learn the feature representation and the classifier. Low-level Features Mid-level Features High-level Features Classifier Deep Learning: train layers of features so that classifier works well. More abstract representation “Panda”? Image courtesy of M. Ranzato
  • 59. Era of Deep Learning • Deep Learning – Inspired by the way human brain processes information – Many layers of non-linear information processing stages
  • 60. Era of Deep Learning Yes. • Basic ideas common to past neural networks research • Standard machine learning strategies still relevant. No. Have we been here before? Computational Power Large-scale Data New Algorithms Deep Learning
  • 61. Era of Deep Learning Convolutional Neural Networks (CNNs) • A special multi-stage architecture inspired by visual system
  • 62. Era of Deep Learning Convolutional Neural Networks (CNNs) Fukushima 1980 Neocognitron Rumelhart, Hinton, Williams 1986 “T” versus “C” problem LeCun et al. 1989-1998 Hand-written digit reading ... Krizhevsky, Sutskever, Hinton 2012 ImageNet classification breakthrough “SuperVision” CNN Slide source: Girshick
  • 63. Era of Deep Learning CNNs: ImageNet Breakthrough ● Krizhevsky et al. win 2012 ImageNet classification with a much bigger ConvNet ○ deeper: 7 stages vs 3 before ○ larger: 60 million parameters vs 1 million before ○ 16.4% error (top-5) vs Next best 26.2% error ● This was made possible by: ○ fast hardware: GPU-optimized code ○ big dataset: 1.2 million images vs thousands before ○ better regularization: dropout, etc. [Krizhevsky et al. NIPS 2012] Image courtesy of Deng et al.
  • 64. Era of Deep Learning Learned Features of CNNs [Matthew D. Zeiler et al. ECCV 2014]
  • 65. Era of Deep Learning CBIR: From SIFT to CNNs • Three main approaches – Directly use pre-trained CNN models • to extract feature representations – Fine-tune pre-trained CNN models • with side information (pairwise or triplet similarity) – Bag-of-features model on CNN features • “Deep SIFT”
  • 66. Era of Deep Learning 1. Directly use pre-trained CNNs • How to use the feature representations? – Which layer? – How to pool the features in a convolutional layer? – How to select the features in a convolutional layer?
  • 67. Era of Deep Learning 1. Directly use pre-trained CNNs • How to use the feature representations? – Which layer? Fully connected layer Convolutional layer
  • 68. Era of Deep Learning 1. Directly use pre-trained CNNs • How to use the feature representations? – How to pool the features in a convolutional layer, i.e. the local features $x_1, x_2, \ldots, x_n$ of a Height × Width × Depth activation volume?
  • 69. Era of Deep Learning 1. Directly use pre-trained CNNs • How to use the feature representations? – How to pool the features in a convolutional layer, i.e. the local features $x_1, x_2, \ldots, x_n$ of a Height × Width × Depth activation volume? • Sum-pooling • Max-pooling • Grid-based max-pooling • Region-based pooling • Mixed sum & max pooling
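
Several of the pooling options listed on slide 69 reduce a Height × Width × Depth activation volume to a single global descriptor. The sketch below shows sum-pooling, max-pooling, grid-based max-pooling and a mixed variant. It is an illustration under assumptions: the random non-negative array stands in for real convolutional activations, the 2 × 2 grid, the L2 normalisation and the concatenation used for mixing are my own choices, and the function names are hypothetical rather than taken from any cited paper.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length so descriptors are comparable."""
    return v / (np.linalg.norm(v) + eps)

def sum_pool(fmap):
    """Sum the H*W local features channel by channel -> Depth-dim descriptor."""
    return l2_normalize(fmap.sum(axis=(0, 1)))

def max_pool(fmap):
    """Keep the strongest response per channel -> Depth-dim descriptor."""
    return l2_normalize(fmap.max(axis=(0, 1)))

def grid_max_pool(fmap, grid=2):
    """Max-pool inside each cell of a grid x grid layout and concatenate."""
    H, W, _ = fmap.shape
    rows = np.array_split(np.arange(H), grid)
    cols = np.array_split(np.arange(W), grid)
    cells = [fmap[r[0]:r[-1] + 1, c[0]:c[-1] + 1].max(axis=(0, 1))
             for r in rows for c in cols]
    return l2_normalize(np.concatenate(cells))

def mixed_pool(fmap):
    """Concatenate the sum- and max-pooled descriptors (one possible mix)."""
    return l2_normalize(np.concatenate([sum_pool(fmap), max_pool(fmap)]))

# Assumed data: a 7 x 7 x 512 volume of non-negative (ReLU-like) activations.
rng = np.random.default_rng(0)
fmap = np.abs(rng.normal(size=(7, 7, 512)))

descriptors = {
    "sum": sum_pool(fmap),
    "max": max_pool(fmap),
    "grid_max": grid_max_pool(fmap),
    "mixed": mixed_pool(fmap),
}
for name, d in descriptors.items():
    print(name, d.shape)
```

Global sum- and max-pooling discard all spatial layout, while the grid variant keeps a coarse layout at the cost of a descriptor that is grid² times longer.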
  • 70. Era of Deep Learning 1. Directly use pre-trained CNNs • How to use the feature representations? – How to select the features in a convolutional layer? • Weighting • Activation magnitude • Region detection Source: Cao et al., Where to Focus: Query Adaptive Matching for Instance Retrieval Using Convolutional Feature Maps
  • 71. Era of Deep Learning 2. Fine-tune pre-trained CNNs • To incorporate extra information from a new image data set – Side information (pairwise or triplet similarity) – Distance metric learning
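
Triplet side information of the kind mentioned on slide 71 is commonly turned into a training signal with a hinge-style triplet loss: an anchor image should be closer to a similar image than to a dissimilar one by some margin. The sketch below computes one such loss on fixed embeddings. It is an assumption-laden illustration, not the exact objective of any paper cited in the slides: the squared Euclidean distance, the margin value and the toy 2-D embeddings are all chosen for the example, and in practice the embeddings would come from the CNN being fine-tuned.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean hinge loss over a batch of (anchor, positive, negative) embeddings."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)   # distance to the similar image
    d_neg = np.sum((anchor - negative) ** 2, axis=1)   # distance to the dissimilar image
    return float(np.maximum(0.0, d_pos - d_neg + margin).mean())

# Toy embeddings (assumed): the positive is near the anchor, the negative is far.
anchor = np.array([[0.0, 0.0]])
positive = np.array([[0.0, 1.0]])
negative = np.array([[3.0, 0.0]])

satisfied = triplet_loss(anchor, positive, negative)   # ordering already correct
violated = triplet_loss(anchor, negative, positive)    # roles swapped on purpose
print(satisfied, violated)
```

The loss is zero once the ranking constraint holds with the required margin, so only violated triplets contribute gradients during fine-tuning.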
  • 72. Era of Deep Learning 2. Fine-tune pre-trained CNNs Source: MatchNet, CVPR2015 Source: Learning Fine-Grained Image Similarity with Deep Ranking. CVPR 2014
  • 73. Era of Deep Learning 3. Bag-of-features model on “Deep SIFT” SIFT (Scale Invariant Feature Transform) Source: Multi-scale Orderless Pooling of Deep Convolutional Activation Features, ECCV2014
  • 74. Era of Deep Learning 3. Bag-of-features model on “Deep SIFT” SIFT (Scale Invariant Feature Transform) “Deep SIFT” Source: Cao et al., Where to Focus: Query Adaptive Matching for Instance Retrieval Using Convolutional Feature Maps
  • 75. Era of Deep Learning 3. Bag-of-features model on “Deep SIFT” Codebook generation Feature coding Feature pooling Classification, Clustering or Retrieval
  • 76. Era of Deep Learning Image Classification with DCNN (Krizhevsky, NIPS12); CNN Features off-the-shelf (Razavian, CVPRW14); Neural codes (Babenko, ECCV14); Deep ranking (Wang, CVPR14); Multi-scale orderless pooling (Gong, ECCV14); Encoding High Dimensional Local Features (Liu, NIPS14); Survey: Deep learning for CBIR (Wan, ACMMM14); Deep filter banks (Cimpoi, CVPR15); Exploiting Local Features from DNN (Ng, CVPRW15); SPoC (Babenko, ICCV15); MatchNet (Han, CVPR15); R-MAC (Tolias, ICLR16); CNN IR Learns from BoW (Radenovic, ECCV16); CroW (Kalantidis, ECCVW16); Where to focus (Cao, 2016). Some papers appeared on arXiv
  • 77. Summary • A very limited (and biased) account of CBIR • CBIR has made significant progress during the past two decades • The development of feature representation plays a key role • Issues to be resolved – How to transfer the benefit of Deep Learning? – How to deal with the unsupervised learning case? – How to better handle the semantic gap? – …
  • 78. Color histogram Gabor feature Euclidean distance User model Query model … SIFT Bag-of-features Hashing Fine-grained recognition … Deep features Deep retrieval Deep ranking Deep hashing … Images Courtesy of Google Image …