A location-aware embedding technique for
accurate landmark recognition
Federico Magliani, Navid Mahmoudian Bidgoli, Andrea Prati
ICDSC 2017 – Stanford, USA – 5-7 September 2017
Agenda
➢ Motivations
➢ Summary of contribution
➢ Related works
➢ Introduction to VLAD
➢ Proposed approach (locVLAD)
➢ Experimental results
➢ Conclusions and Future Works
Motivations
Landmark recognition problem
➢ understand what is in front of the user
➢ rely on client-server communication
➢ use geolocalization (GPS) as an aid
Motivations
➢ Challenges
○ high retrieval accuracy (precision)
○ fast search (short response time to a query)
○ low memory footprint (mobile friendly)
○ scale to large datasets (>100k items)
➢ Possible applications
○ augmented reality (tourism)
➢ Why mobile based?
○ everyone owns a mobile phone
○ mobile phones have hardware powerful enough to run such applications
Motivations
“Changes in the image resolution, illumination conditions, viewpoint and the presence
of distractors such as trees or traffic signs (just to mention some) make the task of
matching features between a query image and the database rather difficult.”
➢ In order to mitigate these problems, the existing approaches rely on feature
description with a certain degree of invariance to scale, orientation and
illumination changes.
Summary of contribution
➢ A location-aware version of VLAD, called locVLAD, that outperforms the state of the art in the intra-dataset problem. It addresses a weakness of VLAD by reducing the noise from features near the image borders
➢ The time for vocabulary creation is significantly reduced by using only a random ⅕ of the detected features
➢ A new balanced version of the public dataset ZuBuD, called ZuBuD+, is proposed and made available to the scientific community
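The vocabulary-creation shortcut can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names `subsample_descriptors` and `kmeans`, the plain Lloyd's k-means, the iteration count, and the synthetic descriptors are all assumptions made for the example. Only the idea of clustering a random ⅕ of the detected features comes from the slides.

```python
import numpy as np

def subsample_descriptors(descriptors, fraction=0.2, seed=0):
    """Keep a random `fraction` of the rows (one row per local descriptor)."""
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(round(fraction * len(descriptors))))
    idx = rng.choice(len(descriptors), size=n_keep, replace=False)
    return descriptors[idx]

def kmeans(data, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (assumed here); returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance between every point and every centroid.
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for i in range(k):
            members = data[labels == i]
            if len(members) > 0:  # an empty cluster keeps its previous centroid
                centroids[i] = members.mean(axis=0)
    return centroids

# Synthetic stand-in for 128-d SIFT descriptors pooled from the training images.
all_descriptors = np.random.default_rng(1).random((5000, 128))
sample = subsample_descriptors(all_descriptors, fraction=0.2)
codebook = kmeans(sample, k=16)
```

Clustering 1000 descriptors instead of 5000 cuts the cost of each k-means iteration by the same factor, which is where the time saving would come from under these assumptions.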
Related work
➢ Bag of Words (BoW): the first method proposed for the problem (with several variants: vocabulary tree, …)
➢ Fisher vector: embedding based on the Fisher kernel
➢ VLAD and its variants: a simplified version of the Fisher vector
➢ Hamming embedding: embedding based on binarized descriptors
➢ CNN based: deep neural networks that end with classification layers
Proposed Pipeline
VLAD (Vector of Locally Aggregated Descriptors)
$C = \{c_1, \dots, c_k\}$: codebook of $k$ visual words (K-means clustering)
1. Every local descriptor $x_j$ extracted from the image is assigned to the closest cluster center of the codebook: $c_i = NN(x_j)$
2. $v_i = \sum_{x_j : NN(x_j) = c_i} (x_j - c_i)$ (residuals)
3. The VLAD vector is the concatenation of the $d$-dimensional vectors $v_i$ ($i = 1, \dots, k$)
4. VLAD normalization to counteract the burstiness problem
Example: 16 centroids, features described with 128-d SIFT → $D = 128 \times 16 = 2048$
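Steps 1 to 3 can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the function name `vlad_encode` is hypothetical, random arrays stand in for real SIFT descriptors and a trained codebook, and normalization (step 4) is left out.

```python
import numpy as np

def vlad_encode(descriptors, codebook):
    """Unnormalized VLAD: per-centroid sums of residuals, concatenated."""
    k, d = codebook.shape
    # Step 1: index of the nearest centroid for every descriptor.
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    # Step 2: accumulate the residuals x_j - c_i for each centroid.
    v = np.zeros((k, d))
    for i in range(k):
        assigned = descriptors[nearest == i]
        if len(assigned) > 0:
            v[i] = (assigned - codebook[i]).sum(axis=0)
    # Step 3: concatenate the k blocks into one vector of length k * d.
    return v.reshape(-1)

rng = np.random.default_rng(0)
codebook = rng.random((16, 128))      # 16 centroids (random placeholder values)
descriptors = rng.random((300, 128))  # 300 stand-in 128-d descriptors
vlad = vlad_encode(descriptors, codebook)
print(vlad.shape)  # (2048,)
```

The output length matches the slide's example: 16 centroids times 128 dimensions gives 2048.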
VLAD normalization
➢ Signed Square Rooting normalization: $\mathrm{sign}(x_i)\sqrt{|x_i|}$ followed by L2 norm
➢ Residual normalization: independent residual L2 norm followed by L2 norm
➢ Z-Score normalization: residual normalization followed by subtraction of the mean from every vector and division by the standard deviation
➢ Power normalization: $\mathrm{sign}(x_i)|x_i|^{\alpha}$ (usually $\alpha = 0.2$) followed by L2 norm
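Two of the normalizations listed above are fully specified by their formulas and can be sketched directly. The function names and the small `eps` guard against division by zero are assumptions for this example. Residual and Z-Score normalization are omitted because the slides do not give enough detail to reproduce them faithfully.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale v to unit Euclidean length (eps avoids dividing by zero)."""
    return v / (np.linalg.norm(v) + eps)

def ssr_normalize(v):
    """Signed square rooting, sign(x) * sqrt(|x|), followed by L2 norm."""
    return l2_normalize(np.sign(v) * np.sqrt(np.abs(v)))

def power_normalize(v, alpha=0.2):
    """Power law, sign(x) * |x|^alpha, followed by L2 norm."""
    return l2_normalize(np.sign(v) * np.abs(v) ** alpha)

v = np.array([4.0, -9.0, 0.0])
print(ssr_normalize(v))
print(power_normalize(v, alpha=0.2))
```

Signed square rooting is the special case of power normalization with α = 0.5. Both compress large components, which is how they dampen bursty visual words.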
Proposed approach: locVLAD
➢ This method improves the performance of VLAD vectors in the recognition problem.
➢ It does so by reducing the influence of features found at the borders of the image.
How does it work?
It consists of a new global descriptor: the mean of the VLAD descriptor of the original query image ($\dot{v}$) and the VLAD descriptor computed on a cropped query image ($\dot{v}_{cropped}$).
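The combination step described above amounts to an element-wise mean of two vectors of equal length. The sketch below assumes both VLAD descriptors were already computed with the same codebook; the function name `loc_vlad` is hypothetical. Whether the result is re-normalized afterwards is not stated in the slides, so no normalization is applied here.

```python
import numpy as np

def loc_vlad(v_full, v_cropped):
    """Element-wise mean of the full-image and cropped-image VLAD descriptors."""
    v_full = np.asarray(v_full, dtype=float)
    v_cropped = np.asarray(v_cropped, dtype=float)
    if v_full.shape != v_cropped.shape:
        raise ValueError("both descriptors must come from the same codebook")
    return (v_full + v_cropped) / 2.0

# Toy 3-d vectors standing in for real VLAD descriptors.
v = loc_vlad([2.0, 0.0, -4.0], [0.0, 2.0, -2.0])
```

Features in the central region contribute to both descriptors, while border features contribute only to the full-image one, so averaging halves the weight of the border features relative to the central ones.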
Proposed approach: locVLAD
The size of the cropped image is a parameter that depends on the dataset used:
➢ ZuBuD → 90% of the original query images
➢ Holidays → 70% of the original query images
[Figure: example query image with 424 features detected; cropped version with 367 features detected]
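A crop of this kind can be sketched as below. Two points are assumptions, since the slides do not spell them out: that the percentage applies to each side of the image, and that the crop is centered. The function name `center_crop` is hypothetical.

```python
import numpy as np

def center_crop(image, keep=0.9):
    """Return the central region whose sides are `keep` times the original (assumed)."""
    h, w = image.shape[:2]
    new_h, new_w = int(round(h * keep)), int(round(w * keep))
    top, left = (h - new_h) // 2, (w - new_w) // 2
    return image[top:top + new_h, left:left + new_w]

query = np.zeros((240, 320, 3), dtype=np.uint8)  # a 320x240 query, as in ZuBuD
cropped = center_crop(query, keep=0.9)
print(cropped.shape)  # (216, 288, 3)
```

Local features would then be detected again on `cropped` and encoded with the same codebook as the full image.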
Why does it increase the performance?
Because the features that matter for recognition are usually located in the center of the image, while the features close to the border tend to be noisy.
Why not apply VLAD encoding directly to the cropped image?
Because useful information might be lost: there is no guarantee that the features at the borders are only noise.
Why not create a cropped vocabulary?
Experiments were conducted, but the results were poor.
Datasets
➢ INRIA Holidays (1491 images at 2448x3264: 500 classes, 500 queries)
➢ ZuBuD (1005 images at 640x480: 201 classes, 115 queries at 320x240)
➢ ZuBuD+ (1005 images at 640x480: 201 classes, 1005 queries at 320x240)
[Figure slide: Holidays]
[Figure slide: ZuBuD]
ZuBuD+
It is the balanced version of ZuBuD
➢ 1005 queries at 320x240 instead of 115.
➢ The new query images are randomly chosen database images, different from the other query images, modified by:
○ rotation (±90°) and resize
○ resize only
Download: http://implab.ce.unipr.it/?page_id=194
Evaluation Metrics
Different evaluation metrics are used to compare with state-of-the-art approaches:
➢ Top1 → retrieval accuracy, evaluating only the first position of the ranking
➢ 5 x Recall in Top5 → average number of times a correct image appears in the top 5 results of the ranking
➢ mAP (mean Average Precision) → mean over all queries of the Average Precision score, which depends on the positions of the correct results in the ranking
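The three metrics can be sketched as plain Python functions. The function names and the toy rankings are invented for illustration. Reading "5 x Recall in Top5" as the average count of correct images among the first five results is an interpretation of the slide, not a definition taken from the authors.

```python
def top1_accuracy(rankings, relevant):
    """Fraction of queries whose first result is a correct image."""
    hits = sum(1 for r, rel in zip(rankings, relevant) if r[0] in rel)
    return hits / len(rankings)

def five_x_recall_at5(rankings, relevant):
    """Average number of correct images among the first five results (assumed reading)."""
    counts = [sum(1 for item in r[:5] if item in rel)
              for r, rel in zip(rankings, relevant)]
    return sum(counts) / len(counts)

def average_precision(ranking, rel):
    """Mean of the precision values at each rank where a correct image occurs."""
    hits, precisions = 0, []
    for pos, item in enumerate(ranking, start=1):
        if item in rel:
            hits += 1
            precisions.append(hits / pos)
    return sum(precisions) / len(rel) if rel else 0.0

def mean_average_precision(rankings, relevant):
    """Mean of the per-query Average Precision scores."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevant)]
    return sum(aps) / len(aps)

# Two toy queries: ranked result lists and the sets of correct images.
rankings = [["a1", "b1", "a2", "a3", "c1", "a4"], ["b2", "b1", "c2"]]
relevant = [{"a1", "a2", "a3", "a4"}, {"b1"}]
print(top1_accuracy(rankings, relevant))      # 0.5
print(five_x_recall_at5(rankings, relevant))  # 2.0
print(mean_average_precision(rankings, relevant))
```

In the toy data the first query has its top result correct and three correct images in the top five, while the second has one correct image at rank 2, which gives the Top1 and 5 x Recall values shown.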
Results on ZuBuD (and ZuBuD+)
Method                        Descriptor size  Top1      5 x Recall in Top5
Tree histogram (ZuBuD) [7]    10M              98.00 %   -
Decision tree (ZuBuD) [9]     n/a              91.00 %   -
Sparse coding (ZuBuD) [22]    8k*64+1k*36      -         4.538
VLAD (ZuBuD) [12]             4281*128         99.00 %   4.416
VLAD (ZuBuD+) [12]            4281*128         99.00 %   4.526
locVLAD (ZuBuD)               4281*128         100.00 %  4.469
locVLAD (ZuBuD+)              4281*128         100.00 %  4.543
It is worth noting that on ZuBuD the method based on sparse coding slightly outperforms the proposed one in 5 x Recall in Top5 (4.538 vs. 4.469).
This is due to the unbalanced query set and, probably, to its use of color information.
Results on Holidays
Method              Descriptor size  mAP
Sparse coding [22]  8k*64+1k*36      76.51 %
VLAD [12]           4281*128         74.43 %
locVLAD             4281*128         77.20 %
Sparse coding [4]   20k*128          79.00 %
VLAD [12]           20k*128          78.78 %
locVLAD             20k*128          80.89 %
Vocabulary creation
Conclusions
➢ The proposed locVLAD technique includes, to a certain degree, information on the location of the features, mitigating the negative effects of distractors found at the image borders.
➢ Experiments performed on two public datasets, ZuBuD and Holidays, demonstrate superior recognition accuracy w.r.t. the state of the art.
Future works
➢ Compression: reduce the dimension of the descriptors while keeping the same retrieval accuracy (mobile friendly).
➢ Indexing: build a system for evaluation in a large-scale domain (adding up to 1M distractors), moving from the Nearest Neighbor problem to the Approximate Nearest Neighbor problem. We are working with kd-trees and permutation-based methods.
➢ Sparse coding: new methods for creating the vocabulary and assigning the features to the VLAD vector.
Thank you for your attention!
questions?
http://implab.ce.unipr.it
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 

Location-aware embedding technique improves landmark recognition accuracy

  • 1. A location-aware embedding technique for accurate landmark recognition. Federico Magliani, Navid Mahmoudian Bidgoli, Andrea Prati. ICDSC 2017 – Stanford, USA – 5-7 September 2017
  • 2. Agenda ➢ Motivations ➢ Summary of contribution ➢ Related works ➢ Introduction to VLAD ➢ Proposed approach (locVLAD) ➢ Experimental results ➢ Conclusions and Future Works
  • 3. Motivations: the landmark recognition problem ➢ try to understand what is in front of you ➢ using client-server communication ➢ helping with geolocalization (GPS)
  • 4. Motivations ➢ Challenges ○ high retrieval accuracy (precision) ○ fast search (short response time per query) ○ low memory footprint (mobile friendly) ○ scaling to large datasets (>100k items) ➢ Possible applications ○ augmented reality (tourism) ➢ Why mobile based? ○ everyone owns a mobile phone ○ mobile phones have hardware powerful enough to run such applications
  • 5. Motivations “Changes in the image resolution, illumination conditions, viewpoint and the presence of distractors such as trees or traffic signs (just to mention some) make the task of matching features between a query image and the database rather difficult.” ➢ To mitigate these problems, existing approaches rely on feature descriptions with a certain degree of invariance to scale, orientation and illumination changes.
  • 7. Summary of contribution ➢ A location-aware version of VLAD, called locVLAD, that outperforms the state of the art in the intra-dataset problem. It addresses a weakness of VLAD by reducing the noise contributed by features at the borders of the images ➢ The time for vocabulary creation is significantly reduced by using only a random ⅕ of the detected features ➢ A new balanced version of the public dataset ZuBuD is proposed and made available to the scientific community (ZuBuD+)
  • 9. Related work ➢ Bag of Words (BoW): first method for solving the problem (different techniques: vocabulary tree, …) ➢ Fisher vector: embedding based on the Fisher kernel ➢ VLAD and its variants: simplified version of the Fisher vector ➢ Hamming embedding: embedding based on binarized descriptors ➢ CNN-based: deep neural networks that end with classification layers
  • 12. VLAD (Vector of Locally Aggregated Descriptors)
    C = {c_1, ..., c_k}: codebook of k visual words (K-means clustering)
    1. Every local descriptor x_j extracted from the image is assigned to the closest cluster center of the codebook: c_i = NN(x_j)
    2. v_i = ∑_{x_j : NN(x_j) = c_i} (x_j − c_i) (sum of residuals)
    3. The VLAD vector is the concatenation of the d-dimensional vectors v_i (i = 1, …, k)
    4. VLAD normalization counters the burstiness problem
    Example: 16 centroids, features described with 128-d SIFT → D = 128 × 16 = 2048
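The four VLAD steps on this slide can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `vlad_encode` is hypothetical, random arrays stand in for SIFT descriptors and K-means centroids, and normalization (step 4) is left out.

```python
import numpy as np

def vlad_encode(descriptors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Aggregate local descriptors (n, d) against a codebook (k, d) into a k*d vector."""
    k, d = codebook.shape
    # Step 1: assign each descriptor to its nearest centroid (squared Euclidean distance).
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    # Step 2: sum the residuals x_j - c_i for every centroid c_i.
    v = np.zeros((k, d), dtype=np.float64)
    for i in range(k):
        assigned = descriptors[nearest == i]
        if len(assigned):
            v[i] = (assigned - codebook[i]).sum(axis=0)
    # Step 3: concatenate the k residual sums into one vector.
    return v.ravel()

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 128))      # stand-in for 16 K-means centroids
descriptors = rng.normal(size=(424, 128))  # stand-in for 128-d SIFT descriptors
vlad = vlad_encode(descriptors, codebook)
print(vlad.shape)  # (2048,)
```

With 16 centroids and 128-dimensional descriptors the result has 16 × 128 = 2048 entries, matching the dimensionality stated on the slide.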
  • 13. VLAD normalization
    ➢ Signed Square Rooting normalization: sign(x_i) · sqrt(|x_i|), followed by L2 norm
    ➢ Residual normalization: independent L2 norm of each residual, followed by L2 norm
    ➢ Z-Score normalization: residual normalization followed by subtraction of the mean from every vector and division by the standard deviation
    ➢ Power normalization: sign(x_i) · |x_i|^α (usually α = 0.2), followed by L2 norm
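Three of these normalizations can be sketched in a few lines of NumPy. The function names are hypothetical, and the small `eps` guard against division by zero is an added assumption. Z-Score normalization is left out because the slide does not say over which axis the mean and standard deviation are computed.

```python
import numpy as np

def l2_normalize(v: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale v to unit L2 norm (eps avoids division by zero)."""
    return v / max(np.linalg.norm(v), eps)

def power_normalize(v: np.ndarray, alpha: float = 0.2) -> np.ndarray:
    """sign(x_i) * |x_i|^alpha on every component, followed by L2 norm."""
    return l2_normalize(np.sign(v) * np.abs(v) ** alpha)

def ssr_normalize(v: np.ndarray) -> np.ndarray:
    """Signed square rooting: the power normalization with alpha = 0.5."""
    return power_normalize(v, alpha=0.5)

def residual_normalize(v: np.ndarray, k: int, eps: float = 1e-12) -> np.ndarray:
    """L2-normalize each of the k per-centroid blocks, then the whole vector."""
    blocks = v.reshape(k, -1)
    norms = np.linalg.norm(blocks, axis=1, keepdims=True)
    blocks = blocks / np.maximum(norms, eps)
    return l2_normalize(blocks.ravel())

v = np.array([4.0, -9.0, 0.0, 16.0])  # toy VLAD vector with k = 2, d = 2
print(ssr_normalize(v))
print(residual_normalize(v, k=2))
```

All three variants dampen components that dominate the vector because the same visual pattern occurs many times, which is the burstiness problem mentioned on the previous slide.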
  • 15. Proposed approach: locVLAD ➢ This method improves the performance of VLAD vectors in the recognition problem. ➢ It does so by reducing the influence of features found at the borders of the image. How does it work? It builds a new global descriptor: the mean of the VLAD descriptor of the original query image (v̇) and the VLAD descriptor computed on a cropped query image (v̇_cropped).
  • 16. Proposed approach: locVLAD The size of the cropped image is a parameter that depends on the dataset: ➢ ZuBuD → 90% of the original query image ➢ Holidays → 70% of the original query image. [Figure labels: 424 features detected; 367 features detected]
  • 17. Proposed approach: locVLAD
    ➢ Why does it increase the performance? Because the features that matter for recognition are usually located in the center of the image, while the features close to the border are noisy.
    ➢ Why not apply VLAD encoding directly to the cropped image? Because useful information might be lost: there is no guarantee that the features at the borders are only noise.
    ➢ Why not create a cropped vocabulary? Experiments were conducted, but the results were poor.
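The locVLAD construction described above (mean of the full-image VLAD and the cropped-image VLAD) can be sketched as follows. Several points are assumptions rather than details from the slides: the crop is taken as a centered window whose sides are `ratio` times the original sides, the crop is simulated by keeping only the keypoints that fall inside that window instead of re-running the detector on a cropped image, no normalization is applied, and all function names are hypothetical.

```python
import numpy as np

def vlad_encode(descriptors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Plain VLAD: per-centroid sum of residuals, concatenated."""
    k, d = codebook.shape
    v = np.zeros((k, d))
    if len(descriptors):
        dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        # Unbuffered accumulation of each residual into its centroid's row.
        np.add.at(v, nearest, descriptors - codebook[nearest])
    return v.ravel()

def central_crop_mask(keypoints_xy, width, height, ratio):
    """True for keypoints inside a centered window covering `ratio` of each side."""
    mx = width * (1.0 - ratio) / 2.0   # horizontal margin (assumed centered crop)
    my = height * (1.0 - ratio) / 2.0  # vertical margin
    x, y = keypoints_xy[:, 0], keypoints_xy[:, 1]
    return (x >= mx) & (x <= width - mx) & (y >= my) & (y <= height - my)

def loc_vlad(keypoints_xy, descriptors, codebook, width, height, ratio):
    """Mean of the VLAD of all features and the VLAD of the central features."""
    v_full = vlad_encode(descriptors, codebook)
    inside = central_crop_mask(keypoints_xy, width, height, ratio)
    v_cropped = vlad_encode(descriptors[inside], codebook)
    return (v_full + v_cropped) / 2.0

codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
keypoints = np.array([[160.0, 120.0], [5.0, 5.0]])  # image center, image corner
descriptors = np.array([[1.0, 0.0], [9.0, 10.0]])
print(loc_vlad(keypoints, descriptors, codebook, width=320, height=240, ratio=0.9))
```

In this unnormalized sketch a feature inside the crop contributes its full residual, while a feature in the border band contributes half of it. Border features are therefore down-weighted rather than discarded, which is consistent with the slide's point that they may still carry useful information.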
  • 19. Datasets ➢ INRIA Holidays (1491 images at 2448x3264: 500 classes, 500 queries) ➢ ZuBuD (1005 images at 640x480: 201 classes, 115 queries at 320x240) ➢ ZuBuD+ (1005 images at 640x480: 201 classes, 1005 queries at 320x240)
  • 22. ZuBuD+ It is the balanced version of ZuBuD ➢ 1005 queries at 320x240 instead of 115 queries. ➢ The new query images are randomly chosen database images, distinct from the other query images, with one of two transformations: ○ rotation (±90°) and resize ○ resize only Download: http://implab.ce.unipr.it/?page_id=194
  • 23. Evaluation Metrics Different evaluation metrics are used to compare with the state-of-the-art approaches: ➢ Top1 → retrieval accuracy, evaluating only the first position of the ranking ➢ 5 x Recall in Top5 → average number of correct images among the top 5 results of the ranking ➢ mAP (mean Average Precision) → mean over all queries of the Average Precision score, which depends on the positions of the correct results in the ranking
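The three metrics can be illustrated on toy rankings. This is a sketch under stated assumptions: each query's ranking is given as a list of 0/1 relevance flags in rank order, every relevant database image appears somewhere in that list, and the function names are hypothetical. Benchmark-specific evaluation rules are not reproduced here.

```python
import numpy as np

def top1(ranked_relevance) -> float:
    """Fraction of queries whose first retrieved image is correct."""
    return float(np.mean([rel[0] for rel in ranked_relevance]))

def five_x_recall_top5(ranked_relevance) -> float:
    """Average number of correct images among the first five results."""
    return float(np.mean([sum(rel[:5]) for rel in ranked_relevance]))

def average_precision(rel) -> float:
    """Mean of precision@rank taken at every rank holding a correct result."""
    hits, precisions = 0, []
    for rank, is_correct in enumerate(rel, start=1):
        if is_correct:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(ranked_relevance) -> float:
    """Mean of the per-query Average Precision scores."""
    return float(np.mean([average_precision(rel) for rel in ranked_relevance]))

# Two toy queries: 1 marks a correct database image at that rank.
rankings = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 1, 1],
]
print(top1(rankings))                # 0.5
print(five_x_recall_top5(rankings))  # 3.5
print(mean_average_precision(rankings))
```

When every query has exactly five relevant database images, the count of correct results in the top 5 equals five times the recall at 5, which would explain why the metric ranges from 0 to 5.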
  • 24. Results on ZuBuD (and ZuBuD+)
  • 25. Results on ZuBuD (and ZuBuD+)
    Method                       Descriptor size   Top1       5 x Recall in Top5
    Tree histogram (ZuBuD) [7]   10M               98.00 %    -
    Decision tree (ZuBuD) [9]    n/a               91.00 %    -
    Sparse coding (ZuBuD) [22]   8k*64+1k*36       -          4.538
    VLAD (ZuBuD) [12]            4281*128          99.00 %    4.416
    VLAD (ZuBuD+) [12]           4281*128          99.00 %    4.526
    locVLAD (ZuBuD)              4281*128          100.00 %   4.469
    locVLAD (ZuBuD+)             4281*128          100.00 %   4.543
    It is worth noting that on ZuBuD the method based on sparse coding slightly outperforms the proposed one in 5 x Recall in Top5 (4.538 vs. 4.469). This is due to an unbalanced query set and, probably, to the use of color information.
  • 27. Results on Holidays
    Method               Descriptor size   mAP
    Sparse coding [22]   8k*64+1k*36       76.51 %
    VLAD [12]            4281*128          74.43 %
    locVLAD              4281*128          77.20 %
    Sparse coding [4]    20k*128           79.00 %
    VLAD [12]            20k*128           78.78 %
    locVLAD              20k*128           80.89 %
  • 30. Conclusions ➢ The proposed locVLAD technique includes, to a certain degree, information on the location of the features, mitigating the negative effects of distractors found at the image borders. ➢ Experiments performed on two public datasets, ZuBuD and Holidays, demonstrate superior recognition accuracy w.r.t. the state of the art.
  • 31. Future works ➢ Compression: reduce the dimension of the descriptors while keeping the same retrieval accuracy (mobile friendly). ➢ Indexing: create a system for evaluation in a large-scale domain (adding up to 1M distractors), moving from the Nearest Neighbor problem to the Approximate Nearest Neighbor problem. We are working with k-d trees and permutation-based methods. ➢ Sparse coding: new methods for the creation of the vocabulary and the assignment of the features to the VLAD vector.
  • 32. Thank you for your attention! Questions? http://implab.ce.unipr.it