Recent advances in deep learning have improved performance in image retrieval tasks. The many convolutional layers of a Convolutional Neural Network (CNN) yield a hierarchy of features from the evaluated image. At every step, the extracted patches are smaller and more representative than those of the previous levels. Following this idea, this paper introduces a new detector applied to the feature maps extracted from a pre-trained CNN. Specifically, this approach increases the number of features in order to improve the performance of aggregation algorithms such as the widely used VLAD embedding. The proposed approach is tested on four public datasets: Holidays, Oxford5k, Paris6k and UKB.
1. A Dense-Depth Representation for VLAD descriptors in Content-Based Image Retrieval
Federico Magliani, Tomaso Fontanini, and Andrea Prati
IMP lab - University of Parma
25/10/2018 IMP Lab - University of Parma 1
2. Agenda
• Motivations and objectives
• Related works
• Proposed approach (DDR)
• Experimental results
• Conclusions
3. Motivations and objectives
• Recent advances in deep learning have improved performance in image retrieval tasks;
• Convolutional Neural Networks (CNNs) make it possible to obtain a hierarchy of features from the evaluated image;
• We introduce a new detector applied to the feature maps;
• The objective is to increase the number of features in order to boost the performance of aggregation algorithms such as the VLAD embedding.
5. Image retrieval pipeline
• Extraction of local features
• Aggregation of the extracted features
• Retrieval of the most similar image
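The last stage of the pipeline above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes each image is already represented by one aggregated vector, and the function name `rank_database` and the use of cosine similarity are assumptions made for the example.

```python
import numpy as np

def rank_database(query_vec, database_vecs):
    """Return database indices sorted from most to least similar to the query.

    Assumption: similarity is the cosine between L2-normalized global descriptors.
    """
    q = query_vec / np.linalg.norm(query_vec)
    db = database_vecs / np.linalg.norm(database_vecs, axis=1, keepdims=True)
    similarities = db @ q              # one cosine similarity per database image
    return np.argsort(-similarities)   # descending similarity

# Toy data: three database images described by 3-D vectors (hypothetical values).
database = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.7, 0.7, 0.0]])
query = np.array([0.9, 0.1, 0.0])
ranking = rank_database(query, database)
print(ranking)
```

The first index in `ranking` is the database image returned as the most similar one.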
6. CNN codes
• CNN codes are feature vectors extracted from a pre-trained network (InceptionV3 in our case);
• They are extracted from the 8th inception pooling layer (mixed8).
8. Dense-Depth representation (DDR)
• VLAD-based embedding works better with dense representations;
• The features are regrouped into a larger set of features of lower dimensionality;
• A feature map of 𝑊 × 𝐻 × 𝐷 is split along 𝐷, which maintains the geometrical information;
• Number of descriptors: from 𝑊 × 𝐻 to 𝑊 × 𝐻 × (𝐷 / 𝑠𝑝𝑙𝑖𝑡𝑓𝑎𝑐𝑡𝑜𝑟).
9. Example
Feature maps of 8 × 8 × 1280 with 𝑠𝑝𝑙𝑖𝑡𝑓𝑎𝑐𝑡𝑜𝑟 = 128 yield 8 × 8 × 10 = 640 descriptors of 128 dimensions each.
[Figure: a 𝑊 × 𝐻 × 𝐷 feature map split along the depth axis into sub-volumes D’, D’’, D’’’]
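The depth split in this example can be sketched with NumPy. This is a minimal illustration under stated assumptions rather than the authors' code: the function name `dense_depth_split` is hypothetical, the feature map is random data standing in for a real mixed8 output, and the depth is assumed to be divisible by the split factor.

```python
import numpy as np

def dense_depth_split(feature_map, split_factor):
    """Split a (W, H, D) feature map along D into descriptors of length split_factor.

    Each spatial position yields D // split_factor descriptors, so the spatial
    location of every descriptor is still known from its row index.
    """
    w, h, d = feature_map.shape
    if d % split_factor != 0:
        raise ValueError("depth must be divisible by split_factor")
    # Row index = (x * H + y) * (D // split_factor) + chunk index.
    return feature_map.reshape(w * h * (d // split_factor), split_factor)

# Random stand-in for an 8 x 8 x 1280 feature map.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 8, 1280)).astype(np.float32)
descriptors = dense_depth_split(fmap, 128)
print(descriptors.shape)  # (640, 128)
```

Without the split, the same feature map would give only 8 × 8 = 64 descriptors of 1280 dimensions.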
10. locVLAD [1]
• Evolution of VLAD;
• Calculated as the mean of two different VLAD descriptors:
• One computed on the whole image;
• One computed on the central portion of the image.
• The most important and useful features in an image are often in the central region;
• Applied only to the query images.
[1] Magliani, Federico, Navid Mahmoudian Bidgoli, and Andrea Prati. "A location-aware embedding technique for accurate landmark recognition." 11th International Conference on Distributed Smart Cameras. ACM, 2017.
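The averaging described on this slide can be sketched as follows. This is a simplified illustration, not the published locVLAD code: the function names, the `keep` parameter (fraction of the image treated as the central portion), the normalized `positions` array, and the plain L2 normalization are all assumptions made for the example.

```python
import numpy as np

def vlad(descriptors, centroids):
    """Basic VLAD: sum of residuals to the nearest centroid, then L2 normalization."""
    k, dim = centroids.shape
    dists = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    v = np.zeros((k, dim))
    for i in range(k):
        members = descriptors[nearest == i]
        if len(members):
            v[i] = (members - centroids[i]).sum(axis=0)
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def loc_vlad(descriptors, positions, centroids, keep=0.8):
    """Mean of the whole-image VLAD and the VLAD of the central region.

    positions holds (x, y) coordinates in [0, 1]; keep is a hypothetical
    parameter giving the side fraction of the central region.
    """
    margin = (1.0 - keep) / 2.0
    central = np.all((positions >= margin) & (positions <= 1.0 - margin), axis=1)
    whole = vlad(descriptors, centroids)
    center = vlad(descriptors[central], centroids)
    return (whole + center) / 2.0

# Random stand-ins for query descriptors, their positions, and a 16-word vocabulary.
rng = np.random.default_rng(0)
desc = rng.standard_normal((640, 128))
pos = rng.random((640, 2))
vocab = rng.standard_normal((16, 128))
query_embedding = loc_vlad(desc, pos, vocab)
print(query_embedding.shape)  # (2048,)
```

Following the slide, this embedding would be computed only for query images, while database images would use the plain `vlad` function.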
12. Datasets
• Holidays: 1491 images divided into 500 classes, with 991 database images and 500 query images (one per class);
• Oxford5k: 5062 images, with 11 classes and 55 queries (5 per class);
• Paris6k: 6412 images, with 11 classes and 55 queries (5 per class);
• UKB: 10200 images divided into 2550 classes. All the images are used as database images and only one per class is used as a query image.
14. Comparison with the state of the art
Method        Dimension  Oxford5k  Paris6k  Holidays  UKB
VLAD          4096       37.80%    38.60%   55.60%    3.18
CEVLAD        128        53.00%    -        68.10%    3.093
FVLAD         128        -         -        62.20%    3.43
HVLAD         128        -         -        64.00%    3.40
gVLAD         128        60.00%    -        77.90%    -
Ng et al.     128        55.80%    58.30%   83.60%    -
DDR locVLAD   128        57.52%    64.70%   87.38%    3.70
NetVLAD       512        59.00%    70.20%   82.90%    -
DDR locVLAD   512        61.46%    71.88%   90.46%    3.76
16. Conclusion and future development
• DDR and locVLAD outperform the state of the art among VLAD-based descriptors;
• Small descriptor dimension;
• Future work will explore a different embedding, such as R-MAC;
• Fine-tuning the network could also help to improve the final accuracy.
17. Thanks for your attention!
• Questions?
• Contacts: tomaso.fontanini@studenti.unipr.it
• Website: implab.ce.unipr.it/?page_id=122