Semi-supervised concept detection by learning the structure of similarity graphs

Symeon Papadopoulos (1), Christos Sagonas (1), Ioannis Kompatsiaris (1), Athena Vakali (2)
(1) Centre for Research and Technology Hellas, Information Technologies Institute
(2) Aristotle University of Thessaloniki, Informatics Department

19th International Conference on Multimedia Modeling
Huangshan, China, Jan 7-9, 2013

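Examples of images, tags and concepts (SOURCE: MIR-Flickr)
• Image 1
– TAGS: chocolate, cake, chocolateganachebuttercream, shamsd
– CONCEPTS: food
• Image 2
– TAGS: N/A
– CONCEPTS: female, indoor, people, portrait
• Image 3
– TAGS: landscape, water, reflection, mirror, flickrelite, abigfave
– CONCEPTS: nature, clouds, lake, sky, water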
mklab.iti.gr #2

Overview

• Problem formulation
• Related work
• Graph Structure Features Approach
• Evaluation
– Synthetic datasets
– MIR-Flickr
• Conclusions


Concept detection

ML perspective
• Given an image, produce a set of relevant concepts

IR perspective
• Given an image collection and a concept of interest,
rank all images in order of relevance.


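Semi-supervised learning

• Transductive learning setting
– $Y = \{Y_1, \dots, Y_K\}$: target concepts
– $\mathcal{L} = \{(x_i, y_i)\}_{i=1}^{l}$: annotated set
– $x_i \in \mathbb{R}^D$: D-dimensional feature vector from image i
– $y_i \in \{0, 1\}^K$: concept indicator vector (labels) for image i
– $\mathcal{U} = \{x_j\}_{j=l+1}^{l+u}$: set of unknown items

Predict concepts associated with items of $\mathcal{U}$ by processing $\mathcal{L}$ and $\mathcal{U}$ together.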


Related work
• Neighborhood similarity (Wang et al., 2009)
– Uses image similarity graphs in combination with graph-based SSL (Zhu, 2005; Zhou et al., 2004)
– Limitation: not incremental
• Sparse similarity graph by convex optimization (Tang et al., 2009)
– Applicable to online settings
– Limitation: computationally intensive training step
• Hashing-based graph construction (Chen et al., 2010)
– Uses KL divergence multi-label propagation, but relies on an iterative computational scheme
– Limitation: difficult to apply in incremental settings
• Social dimensions (Tang & Liu, 2011)
– Uses Laplacian Eigenmaps (LEs) for networked classification problems (i.e. when the network between nodes is explicit)
– Limitation: not incremental, not applied to multimedia


Graph Structure Features (GSF)


Graph construction

$G = (V, E)$: image similarity graph
$V$: set of nodes-images
$n = |V|$: cardinality of node set

Construction options
• full weighted graph
• kNN graph (connect each image to its k most similar images)
• εNN graph (connect images whose distance is below a threshold ε)
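The three construction options can be illustrated with a minimal sketch. The Gaussian similarity, its bandwidth `sigma`, the symmetrisation of the kNN graph and all function names are assumptions made for this example; the slides do not state which similarity function or symmetrisation rule was used.

```python
import math

def gaussian_similarity(x, y, sigma=1.0):
    """Similarity in (0, 1]; sigma is a hypothetical bandwidth parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def full_graph(points, sigma=1.0):
    """Full weighted graph: every pair of distinct images is connected."""
    n = len(points)
    return [[0.0 if i == j else gaussian_similarity(points[i], points[j], sigma)
             for j in range(n)] for i in range(n)]

def knn_graph(points, k, sigma=1.0):
    """kNN graph: keep, for each image, the edges to its k most similar images."""
    w = full_graph(points, sigma)
    n = len(points)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: w[i][j], reverse=True)[:k]
        for j in neighbours:
            a[i][j] = a[j][i] = w[i][j]  # assumption: keep the graph symmetric
    return a

def eps_graph(points, eps, sigma=1.0):
    """εNN graph: keep only edges whose Euclidean distance is below eps."""
    n = len(points)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < eps:
                a[i][j] = a[j][i] = gaussian_similarity(points[i], points[j], sigma)
    return a

# Toy data: two well-separated groups of 2D points.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
A_knn = knn_graph(points, k=1)
A_eps = eps_graph(points, eps=0.5)
```

In this toy example neither sparse graph contains an edge between the two groups, whereas the full weighted graph connects every pair with a (possibly tiny) weight.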

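Eigenvector/value computation

Normalized graph Laplacian: $\tilde{L} = I - D^{-1/2} A D^{-1/2}$
– $D$: degree matrix (diagonal)
– $A$: adjacency matrix
(typical form of graph Laplacian: $L = D - A$)

Graph structure features*: the $C_D$ eigenvectors of $\tilde{L}$ that correspond to its smallest non-zero eigenvalues, obtained by solving $\tilde{L} u = \lambda u$

*aka Laplacian Eigenmaps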
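A minimal sketch of this computation with NumPy, assuming a symmetric adjacency matrix with no isolated nodes. The function name, the tolerance used to decide that an eigenvalue is zero, and the toy graph are assumptions of the example, not details from the slides.

```python
import numpy as np

def graph_structure_features(A, n_features):
    """Return eigenvectors (one column per feature) of the normalized Laplacian
    for the smallest non-zero eigenvalues, together with those eigenvalues."""
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    # Assumes every node has at least one edge, so no degree is zero.
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    L_norm = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_norm)   # eigenvalues in ascending order
    nonzero = np.flatnonzero(eigvals > 1e-8)    # skip the (near-)zero eigenvalues
    keep = nonzero[:n_features]
    return eigvecs[:, keep], eigvals[keep]

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

features, eigenvalues = graph_structure_features(A, n_features=2)
first = features[:, 0]  # first graph structure feature of each node
```

Row i of `features` is the vector that represents image i. On the toy graph the first feature takes one sign on nodes 0-2 and the opposite sign on nodes 3-5, so it already separates the two groups.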

Graph structure feature learning

• Each media item is represented by a vector

• At this point, any supervised learning method could be used.
[note that the whole framework is still SSL since unlabeled items are
used during graph construction]

• SVM is selected
– good performance in several problems
– good implementations available (LibSVM, LIBLINEAR)
– real-valued output (IR perspective → rank images by concept)

Intuition

[Figure: image similarity graph whose nodes are images labelled "coast" or "coast, person"; each node is annotated with its entry in the 2nd eigenvector of the graph Laplacian (values shown: 0.2415, -0.4552, 0.3077, -0.0893, -0.4552, 0.2748, 0.3144, -0.4663, 0.2415).]

Incremental learning setting (1)

• Transductive learning setting often impractical. For
each new set of unlabeled items:
1. recompute image similarity matrix
2. recompute graph structure features (LEs)
3. use SVM to obtain prediction scores
• Step 2 is computationally expensive.
• Devise two incremental schemes:
– Linear Projection (LP): estimate the features of a new item from the set of its k most similar images
– Submanifold Analysis (SA) [cf. next slide]
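The exact LP formula did not survive in this transcript. As a rough illustration only, the sketch below assumes that the new item's features are a similarity-weighted average of the graph structure features of its k most similar annotated images; the weighting scheme, the function name and the toy numbers are all assumptions.

```python
import numpy as np

def linear_projection(sim_to_train, train_gsf, k):
    """Approximate the GSF of a new image from its k most similar training images.

    sim_to_train: similarities between the new image and each training image.
    train_gsf:    (n_train, C_D) graph structure features of the training images.
    """
    sim_to_train = np.asarray(sim_to_train, dtype=float)
    train_gsf = np.asarray(train_gsf, dtype=float)
    nearest = np.argsort(sim_to_train)[::-1][:k]   # indices of the k largest similarities
    weights = sim_to_train[nearest] / sim_to_train[nearest].sum()
    return weights @ train_gsf[nearest]            # weighted average of neighbour features

# Hypothetical values: three training images with C_D = 2 features each.
train_gsf = [[0.30, 0.10], [0.28, 0.12], [-0.40, 0.00]]
sims = [0.9, 0.8, 0.05]
new_gsf = linear_projection(sims, train_gsf, k=2)
```

No eigenvector computation is needed for the new item, which is what makes this kind of scheme cheap compared with recomputing step 2.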

Incremental learning setting (2)

• Submanifold Analysis [Jia et al., 2009]
– Construct the (k+1)x(k+1) similarity matrix WS between the new item and its k most similar images from the annotated set
– Construct the sub-diagonal and sub-Laplacian matrices
– Compute the eigenvalues and the d eigenvectors corresponding to non-zero eigenvalues [computation is lightweight since k << n]
– Minimize the reconstruction error
– Reconstruct the approximate eigenvectors
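The formulas for the last two steps are missing from this transcript, so the following is only one possible reading of the procedure, not the method of Jia et al. or of the authors. It assumes a normalized sub-Laplacian, an LLE-style reconstruction of the new item from its neighbours in the local eigenvector space with weights constrained to sum to one, and a small regularization term `reg`; all names are hypothetical.

```python
import numpy as np

def submanifold_embed(W_s, nbr_gsf, d, reg=1e-3):
    """W_s: (k+1)x(k+1) similarity matrix; index 0 is the new item, 1..k its neighbours.
    nbr_gsf: (k, C_D) already known graph structure features of the neighbours."""
    W_s = np.asarray(W_s, dtype=float)
    nbr_gsf = np.asarray(nbr_gsf, dtype=float)
    k = len(W_s) - 1
    # Sub-diagonal (degree) matrix and normalized sub-Laplacian of the small graph.
    deg = W_s.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_s = np.eye(k + 1) - d_inv_sqrt[:, None] * W_s * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_s)               # cheap: only (k+1)x(k+1)
    keep = np.flatnonzero(vals > 1e-8)[:d]         # d eigenvectors, non-zero eigenvalues
    local = vecs[:, keep]                          # local coordinates, shape (k+1, d)
    # Minimize ||local[0] - sum_j c_j * local[j]||^2 subject to sum_j c_j = 1.
    diffs = local[1:] - local[0]
    gram = diffs @ diffs.T
    gram += reg * (np.trace(gram) + 1e-12) * np.eye(k)  # gram is rank-deficient when d < k
    c = np.linalg.solve(gram, np.ones(k))
    c /= c.sum()
    # Reconstruct the new item's approximate features from the neighbours' features.
    return c @ nbr_gsf, c

# Hypothetical inputs: one new item plus k = 3 neighbours, C_D = 2.
W_s = [[0.0, 0.9, 0.8, 0.1],
       [0.9, 0.0, 0.7, 0.1],
       [0.8, 0.7, 0.0, 0.2],
       [0.1, 0.1, 0.2, 0.0]]
nbr_gsf = [[0.30, 0.10], [0.28, 0.12], [-0.40, 0.00]]
new_gsf, weights = submanifold_embed(W_s, nbr_gsf, d=2)
```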

Fusion of multiple features
• Feature fusion (F-FEAT)
• Similarity graph fusion (F-SIM)
• Graph structure feature fusion (F-GSF)
• Result fusion (F-RES)
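The diagrams for the four fusion strategies are not recoverable here. Judging by the names alone, they differ in the stage at which the information from several features is combined. The sketch below assumes the simplest combination rule at each stage (concatenation for vectors, averaging for similarity matrices and prediction scores); these rules and the function names are assumptions, not the authors' definitions.

```python
import numpy as np

def fuse_features(feature_sets):
    """F-FEAT (assumed): concatenate raw feature vectors before building one graph."""
    return np.hstack(feature_sets)

def fuse_similarities(sim_matrices):
    """F-SIM (assumed): average the per-feature similarity matrices into one graph."""
    return np.mean(sim_matrices, axis=0)

def fuse_gsf(gsf_sets):
    """F-GSF (assumed): concatenate the graph structure features of each graph."""
    return np.hstack(gsf_sets)

def fuse_results(score_vectors):
    """F-RES (assumed): average the prediction scores of per-feature classifiers."""
    return np.mean(score_vectors, axis=0)

# Hypothetical shapes: 4 images, a 3-D visual feature and a 2-D tag feature.
visual, tags = np.ones((4, 3)), np.zeros((4, 2))
fused_features = fuse_features([visual, tags])                    # shape (4, 5)
fused_graph = fuse_similarities([np.eye(4), np.ones((4, 4))])     # shape (4, 4)
fused_scores = fuse_results([np.array([0.9, 0.1, 0.4, 0.6]),
                             np.array([0.7, 0.3, 0.2, 0.8])])     # shape (4,)
```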


Synthetic data - experiments
• Use of four 2D distributions with limited number of samples (thousands) to test many settings: TWO MOONS, LINES, CIRCLES, GAUSSIANS
• Performance aspects
– Parameters of approach: number of features ($C_D$), graph construction technique (kNN, εNN) and parameters (k, ε)
– Learning setting (training size, data noise, number of classes)
– Inductive learning (LP vs SA)
– Fusion method
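To make the synthetic setting concrete, the sketch below generates a two-moons dataset with Gaussian noise of standard deviation `sigma` and marks a fraction `alpha` of the points as labelled. The exact geometry of the moons, the sample counts and the sampling of the labelled subset are assumptions; the slides give none of these details.

```python
import numpy as np

def two_moons(n_per_class, sigma, seed=0):
    """Two interleaved half-circles in 2D, perturbed by Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, np.pi, size=n_per_class)
    upper = np.column_stack([np.cos(t), np.sin(t)])
    lower = np.column_stack([1.0 - np.cos(t), 0.5 - np.sin(t)])
    X = np.vstack([upper, lower]) + rng.normal(0.0, sigma, size=(2 * n_per_class, 2))
    y = np.repeat([0, 1], n_per_class)
    return X, y

def labelled_mask(n, alpha, seed=0):
    """Boolean mask selecting a random fraction alpha of the n items as labelled."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=max(1, int(round(alpha * n))), replace=False)] = True
    return mask

X, y = two_moons(n_per_class=200, sigma=0.1)
mask = labelled_mask(len(X), alpha=0.05)   # 5% of the items carry labels
```

The unlabelled items (`~mask`) still take part in graph construction, which is what makes the setting semi-supervised.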

Role of number of GSF ($C_D$)
[Figure: panels TWO MOONS, LINES, CIRCLES, GAUSSIANS, with one curve per noise level]
• higher $C_D$ → better mAP
• higher noise → higher $C_D$ needed

Role of graph construction technique

[Figure: panels kNN, εNN]

kNN better and less sensitive than εNN.

Role of noise (σ)
[Figure: comparison with competing methods; panels TWO MOONS, LINES, CIRCLES, GAUSSIANS]

In most cases GSF is equal to or better than the expensive SVM-RBF.

Role of training samples (α%)
[Figure: panels TWO MOONS, LINES, CIRCLES, GAUSSIANS]

In most cases few training samples (2-5%) are sufficient for high accuracy.

Number of classes (K)

[Figure: panels LINES, CIRCLES]

Sufficiently good accuracy with respect to number of classes (much better than linear SVM, a bit worse than SVM-RBF).

Scalability with respect to number of features

[Figure annotations]
• Linearly increasing cost with respect to dimensionality
• Constant cost with respect to dimensionality

Comparison between fusion methods

[Figure: panels LINES, CIRCLES]

Even when one feature degrades, result fusion and GSF fusion still do better than the best single feature.

Incremental schemes

[Figure: panels TWO MOONS, LINES, CIRCLES, GAUSSIANS]

SA much better and less sensitive than LP.


Experimental setting

• MIR-Flickr
– 25,000 images + tags
– 38 concepts (24 + 14 with two interpretations [strict/rel])

• Benchmark methods
– Semantic Spaces (SESPA) [Hare & Lewis, 2010]
– Multiple Kernel Learning (MKL) [Guillaumin et al., 2010]


GSF vs SESPA

GSF-F1, F2, F3: Single feature GSF
GSF-C: Graph structure feature fusion
GSF-D1, D2: Result fusion using LIBLINEAR (1) and RBF (2)


GSF vs MKL

• VISUAL
– MKL better in: baby, bird, river, sea.
• TAG
– GSF better in: baby, bird, car, dog, river, sea.
– Possible thanks to scalable behavior with respect to number of features.

Example results


Evaluation: adding unlabeled samples

• GIST: ~6% relative increase in mAP
• DenseSiftV3H1: ~12% relative increase in mAP
• TagRaw50: ~4% relative increase in mAP
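For reference, mAP and a relative increase can be computed as in the sketch below. It uses the common non-interpolated definition of average precision, which is an assumption, since the slides do not say which variant was used. The scores and the 0.50 → 0.53 values are made-up numbers for illustration, not results from the evaluation.

```python
def average_precision(scores, relevant):
    """AP of one concept: mean of precision values at the rank of each relevant image."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if relevant[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0

def mean_average_precision(score_lists, relevance_lists):
    """mAP: average of the per-concept AP values."""
    aps = [average_precision(s, r) for s, r in zip(score_lists, relevance_lists)]
    return sum(aps) / len(aps)

def relative_increase(baseline, improved):
    """Relative change, e.g. 0.06 means a 6% relative increase."""
    return (improved - baseline) / baseline

# Hypothetical ranking for one concept: relevant images end up at ranks 1 and 3.
ap = average_precision([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])   # (1/1 + 2/3) / 2
gain = relative_increase(0.50, 0.53)                          # hypothetical mAP values
```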


Conclusions
• Concept detection approach based on the structure of image similarity graphs
– Transductive learning setting
– Two variants for online learning
• Thorough experimental analysis
– Behavior under a variety of settings/parameters
– Equivalent or better behavior compared to state-of-the-art (SoA) approaches
• Fast:
– SA with k=5 takes 38.4 ms per image (not including feature extraction)
• Future work: further analysis of computational characteristics + application to larger scale datasets (NUS-Wide, ImageNet)

Thank you

Further contact: papadop@iti.gr
www.socialsensor.eu


References
• Graph-based semi-supervised learning
Zhu, X.: Semi-supervised learning with graphs. PhD Thesis, Carnegie Mellon University, ISBN 0-542-19059-1 (2005)
Zhou, D., Bousquet, O., Lal, T.N., Weston, J., Schoelkopf, B.: Learning with Local and Global Consistency. Advances in NIPS 16, MIT Press (2004), 321-328
• Related approaches
Wang, M., Hua, X.-S., Tang, J., Hong, R.: Beyond distance measurement: constructing neighborhood similarity for video annotation. TMM 11 (3) (2009), 465-476
Tang, J. et al.: Inferring semantic concepts from community contributed images and noisy tags. ACM Multimedia (2009), 223-232
Chen, X. et al.: Efficient large scale image annotation by probabilistic collaborative multi-label propagation. ACM Multimedia (2010), 35-44
Tang, L., Liu, H.: Leveraging social media networks for classification. Data Mining and Knowledge Discovery 23 (3) (2011), 447-478
• Relational classification
Macskassy, S.A., Provost, F.: Classification in Networked Data: A Toolkit and a Univariate Case Study. Journal of Machine Learning Research 8 (2007), 935-983
• Laplacian Eigenmaps
Belkin, M., Niyogi, P.: Laplacian Eigenmaps for dimensionality reduction and data representation. Neural Computation 15 (6), MIT Press (2003), 1373-1396
Jia, P., Yin, J., Huang, X., Hu, D.: Incremental Laplacian eigenmaps by preserving adjacent information between data points. PR Letters 30 (16) (2009), 1457–1463
• Tools
Leyffer, S., Mahajan, A.: Nonlinear Constrained Optimization: Methods and Software. Preprint ANL/MCS-P1729-0310 (2010)
Fan, R., Chang, K., Hsieh, C., Wang, X., Lin, C.: LIBLINEAR: A Library for Large Linear Classification. Journal of ML Research 9 (2008), 1871-1874
Chang, C.-C., Lin, C.-J.: LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2 (3) (2011), 27:1–27:27
• Dataset
Huiskes, M.J., Lew, M.S.: The MIR Flickr Retrieval Evaluation. Proceedings of ACM Intern. Conf. on Multimedia Information Retrieval (2008)
• Competing methods
Hare, J.S., Lewis, P.H.: Automatically annotating the MIR Flickr dataset. ACM ICMR (2010), 547-556
Guillaumin, M., Verbeek, J., Schmid, C.: Multimodal semi supervised learning for image classification. Proceedings of IEEE CVPR Conference (2010), 902-909
