Presentation of the joint participation between CERTH and CEA LIST in the MediaEval 2015 edition of the Retrieving Diverse Social Images Task in Wurzen, Germany on 14-15 September, 2015.
Symeon Papadopoulos, Researcher at CERTH-ITI, Co-founder at infalia
1. MediaEval 2015 Workshop, Retrieving Diverse Social Images Task
14-15 September 2015, Wurzen, Germany
USEMP: Finding Diverse Images at MediaEval 2015
Eleftherios Spyromitros-Xioufis1, Adrian Popescu2,
Symeon Papadopoulos1, Yiannis Kompatsiaris1
1 CERTH-ITI, Thermi-Thessaloniki, Greece, {espyromi,papadop,ikom}@iti.gr
2 CEA, LIST, 91190 Gif-sur-Yvette, France, adrian.popescu@cea.fr
2. Summary of our participation
• supervised Maximal Marginal Relevance (sMMR) [1]:
– A supervised diversification method that jointly optimizes
relevance and diversity
• The runs
– Fully automated, no external data*
– Each run corresponds to a different instantiation of sMMR
Run id   Run type         Relevance features   Diversity features
1        visual-only      CNN* [1]             VLAD+CSURF [2]
2        text-only        BOW                  BOW
3 & 5    visual+textual   CNN, BOW, META       VLAD+CSURF
[1] E. Spyromitros-Xioufis et al., “Improving diversity in image search via supervised relevance scoring”, ICMR 2015
[2] E. Spyromitros-Xioufis et al., “A comprehensive study over VLAD and Product Quantization in large-scale image retrieval”, IEEE Transactions on Multimedia, 2014
3. Overview of our approach
• sMMR incrementally builds a refined set 𝑆 ⊂ 𝐼, |𝑆| = 𝐾
• At each step 𝐽 = 1, … , 𝐾 it selects the image 𝑖𝑚∗ that scores highest on the following criterion:
$$U(im^* \mid q) = w \cdot R(im^* \mid q) + (1 - w) \cdot \min_{im_j \in S_{J-1}} d(im^*, im_j)$$
• Relevance to the query (the 𝑅 term): output of a task- and query-specific classifier
• Diversity in 𝑆 (the min 𝑑 term): distance to the most similar image already selected
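The greedy selection described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `smmr_select`, the precomputed relevance list and distance matrix, and the choice to pick the most relevant image while 𝑆 is still empty are all assumptions made for the example.

```python
def smmr_select(relevance, distance, k, w=0.5):
    """Greedily pick k image indices, trading off relevance and diversity.

    relevance[i]   -- relevance score R(im_i | q) of image i (assumed precomputed)
    distance[i][j] -- distance d(im_i, im_j) between images i and j
    w              -- weight of relevance; (1 - w) is the weight of diversity
    """
    selected = []
    remaining = set(range(len(relevance)))

    def utility(i):
        if not selected:
            # Assumption: with S empty the diversity term is undefined,
            # so the first pick is simply the most relevant image.
            return relevance[i]
        # Diversity: distance to the most similar image already in S.
        diversity = min(distance[i][j] for j in selected)
        return w * relevance[i] + (1.0 - w) * diversity

    while remaining and len(selected) < k:
        # sorted() makes ties resolve to the lowest index.
        best = max(sorted(remaining), key=utility)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: images 0 and 1 are near-duplicates, image 2 is different.
toy_relevance = [0.9, 0.85, 0.3]
toy_distance = [
    [0.0, 0.05, 0.9],
    [0.05, 0.0, 0.8],
    [0.9, 0.8, 0.0],
]
print(smmr_select(toy_relevance, toy_distance, k=2, w=0.5))
print(smmr_select(toy_relevance, toy_distance, k=2, w=1.0))
```

With 𝑤 = 0.5 the second pick is the less relevant but more distant image 2, while 𝑤 = 1.0 reduces to plain relevance ranking and returns the two near-duplicates.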
6. Learning relevance from ground truth
[Figure: devset queries (q1, q2, q3) and a test query, e.g. “Eiffel Tower”, each shown with Wikipedia images and Flickr images; “?” marks appear on Flickr images]
training set for ℎeiffel
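As a rough sketch of how such a per-query training set could be assembled and used: labelled devset images are pooled across queries, and the test query's Wikipedia images are added as extra positives. The slides only say "classifier", so the small logistic regression below is a stand-in chosen for illustration; the function names, the 2-D toy features and the hyperparameters are all hypothetical.

```python
import math


def build_training_set(devset_examples, wiki_features):
    """Pool devset ground truth and add the query's Wikipedia images as positives.

    devset_examples -- list of (feature_vector, is_relevant) pairs from devset queries
    wiki_features   -- feature vectors of the test query's Wikipedia images
    """
    X = [x for x, _ in devset_examples] + list(wiki_features)
    y = [1 if rel else 0 for _, rel in devset_examples] + [1] * len(wiki_features)
    return X, y


def predict(weights, bias, x):
    """Relevance score in (0, 1), computed with a numerically stable sigmoid."""
    z = bias + sum(wd * xd for wd, xd in zip(weights, x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Plain batch gradient descent on the logistic loss (illustrative stand-in)."""
    dim = len(X[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, label in zip(X, y):
            err = predict(weights, bias, x) - label
            for d in range(dim):
                grad_w[d] += err * x[d]
            grad_b += err
        weights = [wd - lr * g / len(X) for wd, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


# Hypothetical 2-D features: relevant images cluster near (1, 1).
devset = [
    ([1.0, 0.9], True), ([0.9, 1.0], True),
    ([0.1, 0.0], False), ([0.0, 0.2], False),
]
wiki = [[1.0, 1.0]]  # Wikipedia images of the test query, used as positives

X_train, y_train = build_training_set(devset, wiki)
h_weights, h_bias = train_logistic(X_train, y_train)

score_relevant = predict(h_weights, h_bias, [1.0, 0.95])
score_irrelevant = predict(h_weights, h_bias, [0.05, 0.1])
print(score_relevant, score_irrelevant)
```

The scores returned by `predict` would play the role of the relevance term 𝑅 for the unlabelled Flickr images of the test query.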
12.
This work was supported by the USEMP FP7 project
More details at the poster session!
Editor's notes
Here is an overview of our approach. sMMR builds a refined set of images S with K elements from a larger set of images I, incrementally in K steps. At each step, the method greedily adds to S the image (among the unselected ones) that maximizes the following criterion, which jointly considers relevance and diversity.
The criterion is a weighted combination of a relevance score and a diversity score.
The relevance score is the output of a classifier trained to distinguish relevant from irrelevant images. It is task-specific because it uses the relevance ground truth given for this task, and query-specific because it includes the Wikipedia images/page given for each location in the set of positive (relevant) examples.
For the diversity part, we define the diversity score of an image at step J as the distance of this image to the most similar image among those already included in S.