This document compares semantic similarity measures for detecting near-duplicate video clips (NDVCs) using semantic features. It finds that semantic NDVC detection is, in general, most effective when similarity is measured using tag occurrence and co-occurrence statistics derived from Flickr, rather than WordNet-based measures, which are limited to the concepts in the English-language version of WordNet. Experiments show a lower NDCR (more effective detection) for tag statistics than for the similarity measures based on WordNet concepts and hierarchies.
Comparison of Semantic Similarity Measures for NDVC Detection Using Semantic Features
Hyun-seok Min, Jae Young Choi, Wesley De Neve, and Yong Man Ro
Image and Video Systems Lab
Korea Advanced Institute of Science and Technology (KAIST)
Daejeon, South Korea
e-mail: ymro@ee.kaist.ac.kr
website: http://ivylab.kaist.ac.kr
I. INTRODUCTION
- Observations
  - an increasing number of near-duplicate video clips (NDVCs) can be found on websites for video sharing
  - content transformations tend to preserve semantic information
- Novel idea
  - NDVC detection by means of semantic features and adaptive semantic distance measurement
- Objective
  - to answer the question: ‘which semantic similarity measure is most effective in the context of NDVC detection using semantic features?’

II. SEMANTIC NDVC DETECTION
[Fig. 1 flowchart, in order: Input: query video clip; Video shot segmentation (Shot 1 ... Shot i ... Shot N); Semantic concept detection (with an image folksonomy and tag relevance learning using neighbor voting); Creation of a semantic video signature; Matching of semantic video signatures (with a reference video database); Output: NDVC identification]
Fig. 1. NDVC detection by means of semantic video signatures.

$$A_i = \{(t_{i,j}, w_{i,j})\}, \quad j = 1, \ldots, |A_i|$$
$w_{i,j}$ is a weight value for tag $t_{i,j}$

$$D_{shot}(S^q, S^r) = \mathrm{SQFD}(A^q, A^r) = \sqrt{(w^q \mid -w^r)\, G\, (w^q \mid -w^r)^T}$$
SQFD: Signature Quadratic Form Distance
$w$: vector of weight values for the tags $t$ under consideration
$G$: matrix of ground distances (computed using tag statistics)

III. SEMANTIC SIMILARITY MEASURES
1. Similarity measurement using the WordNet knowledge base
1.1. Leacock–Chodorow: relies on the length of the shortest path between two concepts
$$\mathrm{sim}_{LC}(t_i, t_j) = -\log \frac{\mathrm{len}(t_i, t_j)}{2E}$$
$\mathrm{len}(t_i, t_j)$: the length of the shortest path between two concepts $(t_i, t_j)$
$E$: the overall depth of the taxonomy used
1.2. Resnik: measures the information content of the most specific common ancestor of two concepts
$$\mathrm{sim}_{R}(t_i, t_j) = -\log p(\mathrm{lso}(t_i, t_j))$$
$\mathrm{lso}(t_i, t_j)$: the lowest super-ordinate of $t_i$ and $t_j$
$p(t)$: the probability of encountering an instance of a concept $t$ in a certain corpus
1.3. Jiang–Conrath: based on the conditional probability of encountering an instance of a child concept in a certain corpus
$$\mathrm{sim}_{JC}(t_i, t_j) = \frac{1}{2 \log p(\mathrm{lso}(t_i, t_j)) - (\log p(t_i) + \log p(t_j))}$$
1.4. Lin: follows from his theory of similarity between arbitrary objects
$$\mathrm{sim}_{L}(t_i, t_j) = \frac{2 \log p(\mathrm{lso}(t_i, t_j))}{\log p(t_i) + \log p(t_j)}$$
2. Similarity measurement using Flickr tag occurrence and co-occurrence statistics
$$\mathrm{sim}_{TC}(t_i, t_j) = \frac{|I_{t_i \cap t_j}|}{|I_{t_i}|}$$
$I_{t_i \cap t_j}$: the set of images annotated with both $t_i$ and $t_j$
$I_{t_i}$: the set of images annotated with tag $t_i$

IV. EXPERIMENTS
1. Experimental setup
- Use of TRECVID 2009 for creating NDVCs and reference video clips
- Use of MIRFLICKR-25000 as a source of collective knowledge
- Use of Toolbox and the Natural Language Toolkit (NLTK) for WordNet-based semantic similarity measurement
2. Experimental results
- Semantic NDVC detection is, in general, most effective when similarity measurement makes use of tag statistics derived from Flickr
  - similarity measurement using Flickr-based tag statistics is able to exploit an unrestricted concept vocabulary, whereas the WordNet-based similarity measures are only able to make use of semantic concepts that are part of the English-language version of WordNet
[Fig. 2 bar chart: NDCR (y-axis, 0 to 0.8) per transformation (x-axis: blur, crop, pattern insertion, change in brightness, mirroring, resize, shift, average) for Tag statistics, Leacock–Chodorow, Jiang–Conrath, Lin, and Resnik]
Fig. 2. Influence of semantic similarity measurement on the effectiveness of semantic NDVC detection. The lower the NDCR, the more effective NDVC detection.

V. CONCLUSIONS
- We presented a novel technique for NDVC detection
  - takes advantage of the collective knowledge in an image folksonomy, thus allowing for the use of an unrestricted concept vocabulary
- We quantified the influence of several semantic similarity measures on the effectiveness of NDVC detection using semantic features
  - semantic NDVC detection is most effective when semantic similarity measurement takes advantage of tag occurrence and co-occurrence statistics derived from Flickr (an unstructured source of knowledge), outperforming semantic similarity measurement that takes advantage of WordNet (a knowledge base with a hierarchical structure)
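
The WordNet-based measures of Section III can be illustrated on a small hand-made taxonomy. The sketch below is not the authors' implementation: the taxonomy `PARENT`, the probabilities `P`, the depth `E`, and all function names are hypothetical, and the measures are written in their commonly used form with information content taken as -log p(t) and path length counted in nodes.

```python
import math

# Hypothetical toy taxonomy (child -> parent) and concept probabilities;
# neither comes from the poster nor from WordNet.
PARENT = {"dog": "mammal", "cat": "mammal", "mammal": "animal", "bird": "animal"}
P = {"animal": 1.0, "mammal": 0.5, "bird": 0.25, "dog": 0.25, "cat": 0.125}
E = 3  # overall depth of the toy taxonomy, counted in nodes (assumption)


def ancestors(t):
    """Return t followed by its ancestors up to the root."""
    chain = [t]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def lso(ti, tj):
    """Lowest super-ordinate: the most specific common ancestor."""
    anc_j = set(ancestors(tj))
    return next(a for a in ancestors(ti) if a in anc_j)


def path_len(ti, tj):
    """Number of nodes on the shortest path from ti to tj via lso."""
    c = lso(ti, tj)
    return ancestors(ti).index(c) + ancestors(tj).index(c) + 1


def sim_lc(ti, tj):
    """Leacock-Chodorow: negative log of the scaled path length."""
    return -math.log(path_len(ti, tj) / (2 * E))


def sim_resnik(ti, tj):
    """Resnik: information content of the lowest super-ordinate."""
    return -math.log(P[lso(ti, tj)])


def sim_jc(ti, tj):
    """Jiang-Conrath: inverse of the information-content distance."""
    dist = 2 * math.log(P[lso(ti, tj)]) - (math.log(P[ti]) + math.log(P[tj]))
    return math.inf if dist == 0 else 1.0 / dist


def sim_lin(ti, tj):
    """Lin: shared information relative to the total information."""
    denom = math.log(P[ti]) + math.log(P[tj])
    # Assumption: two root concepts (p = 1) are treated as fully similar.
    return 1.0 if denom == 0 else 2 * math.log(P[lso(ti, tj)]) / denom
```

For example, `lso("dog", "cat")` is `"mammal"`, so all four measures for that pair depend only on the probabilities of `dog`, `cat`, and `mammal` and on the three-node path between them. Any concept missing from `PARENT` and `P` cannot be scored at all, which mirrors the restricted-vocabulary limitation noted in the experimental results.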
The International Conference on Multimedia Information Technology and Applications (MITA), July 2012, Beijing (China)
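
The tag co-occurrence similarity of Section III and the SQFD of Section II can be combined in a few lines. The following is a minimal sketch under stated assumptions, not the authors' implementation: the image collection is a hypothetical list of tag sets, the function names are invented for illustration, and the entries of G are filled directly with sim_TC values, since the poster does not specify how G is derived from the tag statistics.

```python
import math


def tag_cooccurrence_similarity(images, ti, tj):
    """sim_TC(ti, tj) = |I_{ti and tj}| / |I_{ti}| over a list of tag sets."""
    with_ti = [tags for tags in images if ti in tags]
    if not with_ti:
        return 0.0  # assumption: an unseen tag has zero similarity
    with_both = [tags for tags in with_ti if tj in tags]
    return len(with_both) / len(with_ti)


def sqfd(sig_q, sig_r, ground):
    """Signature Quadratic Form Distance between two tag signatures.

    sig_q, sig_r: lists of (tag, weight) pairs.
    ground: function (tag_a, tag_b) -> entry of the matrix G.
    """
    # Concatenate the tags and build the vector (w_q | -w_r).
    tags = [t for t, _ in sig_q] + [t for t, _ in sig_r]
    w = [wt for _, wt in sig_q] + [-wt for _, wt in sig_r]
    total = 0.0
    for a, tag_a in enumerate(tags):
        for b, tag_b in enumerate(tags):
            total += w[a] * ground(tag_a, tag_b) * w[b]
    # Assumption: clamp small negative values caused by rounding or by a
    # matrix G that is not positive semi-definite.
    return math.sqrt(max(total, 0.0))


# Hypothetical folksonomy: each image is represented by its set of tags.
images = [{"beach", "sea"}, {"beach", "sea", "sun"}, {"beach"}, {"city"}]


def ground(a, b):
    return tag_cooccurrence_similarity(images, a, b)


query = [("beach", 0.6), ("sea", 0.4)]
reference = [("city", 1.0)]
distance = sqfd(query, reference, ground)
```

Note that sim_TC is not symmetric: in this toy collection every image tagged `sea` is also tagged `beach`, but not the other way round. Two identical signatures yield a distance of zero, while signatures whose tags never co-occur in the collection yield a larger distance.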