Multimedia Semantics: Metadata, Analysis and Interaction. Keynote Talk at the Latin-American Conference on Networked Electronic Media (LACNEM), August 2009, Bogota, Colombia
2. Some BIG numbers
User Generated Content (Jul'09)
3.7+ billion photos
10+ billion photos
110+ million videos
20 hours uploaded / min ≈ 75,000 full-length movies / week
Archived TV content
1.5 million hours ≈ 120 km of shelves
300,000 hours | 1 petabyte / year
News content
Content difficult to search and reuse
Almost invisible to search engines
04/08/2009 - Multimedia Semantics: Metadata, Analysis and Interaction - LACNEM 2009 -2
3. Image/Video indexing
Techniques used by mainstream search engines
search term occurs in the filename or in the caption or in user tags
no semantics
Image indexing: main problem
an image is not alphabetic: there are no countable discrete units that, in
combination, provide the meaning of the image
image descriptors are not given with the image: one needs to extract or
interpret them
Video indexing: additional problem
a video additionally has a temporal dimension to take into account
a video has a priori no discrete units either (i.e. frames, shots, sequences
cannot be absolutely defined)
4. Why is it so difficult to find appropriate multimedia content, to reuse and repurpose content previously published and to present this content in interfaces that vary with user needs?
5. Sounds Familiar?
[Arnold Smeulders, PAMI, 2000]
The semantic gap is the lack of coincidence between the information that one can extract from the sensory data and the interpretation that the same data has for a user in a given situation
6. A little drop of semantics goes a long way
Jim Hendler [1997]
7. Agenda
1. Semantics in multimedia analysis
• Detecting concepts for video indexing
• Evaluating interactive search tasks
2. Semantics in metadata
• Multimedia metadata interoperability
• Expose your data following 4 basic principles
• Re-use a growing amount of publicly open datasets
3. Semantics in user interfaces
• Provide meaningful presentation of underlying data
• Explore large knowledge bases powered by linked data
8. The science of labeling
Automatically detecting the presence of a
concept in a video stream
airplane
Naming visual information
9. The Computer Vision Approach
Building detectors one at a time
a face detector for frontal faces
3 years later
a face detector for non-frontal faces
One (or more) PhD for every new concept
10. So how about these?
11. A Simple Concept Detector
16. NIST TRECVID Evaluation
Until 2001, everybody defined their own concepts
Using specific and small data sets
Hard to compare methodologies
Since 2001, worldwide evaluation by NIST
Promote progress in video retrieval
Provide common datasets (shots, ASR, key frames)
Use open, metrics-based evaluation
Large-Scale Concept Ontology for Multimedia
17. Success and Criticism
More and more concept detectors available:
TRECVID 2005: 101 concept lexicon
TRECVID 2006: 491 concept lexicon
MediaMill Challenge 2007: 572 concept lexicon
... but the focus is on the final result
relative merit of indexing methods: intermediate steps are ignored
while systems become more complex (several
features and learning methods)
... but the concept detectors developed do not match
user information needs
18. TRECVID Interactive Video Search Task
Query selection:
by keyword,
by concept,
by example
Topics unknown
Test set
English (2004)
Chinese (2005-6)
Dutch (2007-8-9)
19. VideOlympics
Benchmark performance cannot be the sole criterion
Experience of the searcher counts
Usability of systems matters
VideOlympics: live interactive search task
Simultaneous exposure of video retrieval systems
Showcase that goes beyond a regular demo session
Fun to do (participants) and fun to watch (audience)
20. VideOlympics Setup
One display
TRECVID-like queries
Results pushed by searchers
21. Agenda
1. Semantics in multimedia analysis
• Detecting concepts for video indexing
• Evaluating interactive search tasks
2. Semantics in metadata
• Multimedia metadata interoperability
• Expose your data following 4 basic principles
• Re-use a growing amount of publicly open datasets
3. Semantics in user interfaces
• Provide meaningful presentation of underlying data
• Explore large knowledge bases powered by linked data
23. MPEG-7: a multimedia description language?
ISO standard since December 2001
Main components:
Descriptors (Ds) and Description Schemes (DSs)
DDL (XML Schema + extensions)
Concern all types of media
[Figure: Part 5 – MDS, Multimedia Description Schemes]
Content organization: Collections, Models
User interaction: User Preferences, User History
Navigation & Access: Summaries, Views, Variations
Content management: Creation & Production, Media, Usage
Content description: Structural aspects, Semantic aspects
Basic elements: Schema Tools, Basic datatypes, Links & media localization, Basic Tools
24. MPEG-7 and the Semantic Web
MDS Upper Layer represented in RDFS
2001: Hunter
Later on: link to the ABC upper ontology
MDS fully represented in OWL-DL
2004: Tsinaraki et al., DS-MIRF model
MPEG-7 fully represented in OWL-DL
2005: Garcia and Celma, Rhizomik model
Fully automatic translation of the whole standard
MDS and Visual parts represented in OWL-DL
2007: Arndt et al., COMM model
Re-engineering MPEG-7 using DOLCE design patterns
29. Image Annotation with Linked Data
Reg1: The "Big Three" at the Yalta Conference (Wikipedia)
Localize a region (bounding box)
Annotate the content (interpretation)
Tag: Winston Churchill, UK Prime Minister, Allied Forces, WWII
Link to knowledge on the Web
:Reg1 foaf:depicts dbpedia:Winston_Churchill
----------------------------------------------
dbpedia:Winston_Churchill dbpedia:spouse dbpedia:Clementine_Churchill
dbpedia:Winston_Churchill owl:sameAs fbase:Winston_Churchill
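A minimal sketch of how the annotation above could be held and queried in memory. The three triples are copied from the slide; the tuple representation and the helper names `objects` and `describe_region` are assumptions made for illustration, not part of any RDF library.

```python
# Triples copied from the slide; prefixed names are kept as plain strings.
TRIPLES = [
    (":Reg1", "foaf:depicts", "dbpedia:Winston_Churchill"),
    ("dbpedia:Winston_Churchill", "dbpedia:spouse", "dbpedia:Clementine_Churchill"),
    ("dbpedia:Winston_Churchill", "owl:sameAs", "fbase:Winston_Churchill"),
]


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is a triple."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def describe_region(triples, region):
    """Hypothetical helper: for each entity the region depicts,
    list the (predicate, object) pairs known about that entity."""
    facts = {}
    for entity in objects(triples, region, "foaf:depicts"):
        facts[entity] = [(p, o) for s, p, o in triples if s == entity]
    return facts


facts = describe_region(TRIPLES, ":Reg1")
for entity, pairs in facts.items():
    for predicate, obj in pairs:
        print(entity, predicate, obj)
```

Starting from the image region alone, the lookup reaches the spouse and the equivalent Freebase identifier, which is the point of linking the annotation to knowledge on the Web.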
31. What is linked data?
URIs, possibly identifying media fragments
+ annotations (tags)
+ links among fragments & annotations
[Figure labels: wp:2006_FIFA_World_Cup#Final, events:id, geonames:2950159, nc:15054000, dbpedia:Zidane; properties: nar:subject, nar:location, foaf:depicts]
32. Linked Data Principles
Tim Berners-Lee [2006] (Design Issues)
1. Use URIs to identify things (anything, not just documents);
2. Use HTTP URIs – globally unique names, distributed ownership – so that people can look up those names;
3. Provide useful information in RDF when someone looks up a URI;
4. Include RDF links to other URIs to enable discovery of related information
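Principles 2 and 3 can be sketched as an HTTP lookup that asks for RDF instead of HTML. The example below only builds the request and does not send it. The URI and the `application/rdf+xml` media type are assumptions chosen for illustration, and `rdf_lookup_request` is a hypothetical helper name.

```python
from urllib.request import Request


def rdf_lookup_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) an HTTP GET for `uri` whose Accept header
    asks the server for an RDF representation rather than an HTML page."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


# Assumed example URI for the entity written dbpedia:Winston_Churchill earlier.
req = rdf_lookup_request("http://dbpedia.org/resource/Winston_Churchill")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup; what comes back depends on the server.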
33. An Example: DBpedia
DBpedia is a community effort to:
extract structured "infobox" information from Wikipedia
interlink DBpedia with other datasets on the Web
37. Bogotá on Freebase
38. Bogotá on Geonames
39. How Much Linked Data is there?
40. Linked Data Cloud – August 2007
41. Linked Data Cloud – March 2008
42. Linked Data Cloud – September 2008
43. Linked Data Cloud – March 2009
44. The Web of Data
Expose open datasets in RDF
Set RDF links among the data items of different datasets
Over 4.5 billion triples, 5 million links (March 2009)
... and still counting
45. Who are the users?
Why would they use the cloud?
What tasks can be supported?
How will the semantics help?
46. Agenda
1. Semantics in multimedia analysis
• Detecting concepts for video indexing
• Evaluating interactive search tasks
2. Semantics in metadata
• Multimedia metadata interoperability
• Expose your data following 4 basic principles
• Re-use a growing amount of publicly open datasets
3. Semantics in user interfaces
• Provide meaningful presentation of underlying data
• Explore large knowledge bases powered by linked data
47. Provide meaningful presentation of data
48. ... and behind the scenes
49. ... link an artist to more data
53. Going through the Walled Gardens
David Simonds: Everywhere and nowhere. 19 May 2008, The Economist.
54. How can semantics help?
Query construction
disambiguate input (auto-completion)
selection of available terms (grouping and ranking algorithms)
(Semantic) search algorithm
graph traversal
query expansion
RDFS/OWL reasoning
Presentation of search results
grouping by property
visualization on timeline, map, etc.
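Two of the steps listed above, query expansion by graph traversal and grouping of results by property, can be sketched as follows. The toy thesaurus, the news items, and the function names `expand_query` and `group_by` are all hypothetical; a real system would traverse an RDF graph rather than a Python dictionary.

```python
from collections import deque

# Hypothetical toy thesaurus: each term points to its narrower terms.
NARROWER = {
    "sport": ["football", "tennis"],
    "football": ["world cup"],
}

# Hypothetical annotated items.
ITEMS = [
    {"title": "Final in Berlin", "subject": "world cup", "location": "Berlin"},
    {"title": "Roland Garros", "subject": "tennis", "location": "Paris"},
    {"title": "Election night", "subject": "politics", "location": "Paris"},
]


def expand_query(term, graph, max_depth=2):
    """Breadth-first traversal: the term itself plus every narrower term
    reachable in at most `max_depth` hops, in visiting order."""
    seen = [term]
    queue = deque([(term, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for child in graph.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append((child, depth + 1))
    return seen


def group_by(items, prop):
    """Group item titles by the value of one property (e.g. location)."""
    groups = {}
    for item in items:
        groups.setdefault(item[prop], []).append(item["title"])
    return groups


terms = expand_query("sport", NARROWER)
hits = [item for item in ITEMS if item["subject"] in terms]
by_location = group_by(hits, "location")
print(terms)
print(by_location)
```

A query for "sport" matches no item literally, yet the expanded query finds the two sports items. The depth limit keeps the expansion from drifting too far from the original term.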
55. News Workflow Interoperability
No integration of media (stories, photo, animation, video)
Little (or no) context in the news presentation
Lack of interoperability in the current workflow
[Figure labels: NAR Schema, Broadcaster Schema, NewsCodes, Controlled Vocabularies, User Vocabulary]
56. Exploratory Search
(Ultimate) Goal:
Provide an environment for searching and browsing
contextualized multimedia news information
Required integration:
Data: various media, different forms, various sources
Metadata: schema integration, semantic models
Influence and implications of UI:
How to represent semantic multimedia metadata
to facilitate presenting information?
in other words ... What constraints do end-user
interfaces put on the modeling of the metadata?
59. Enriching the News Metadata
Concepts/Entities that
are subject of news
Thematic categories
People
Organizations
Geopolitical Areas
Points of Interest
Events
Products or artefacts
60. Enriching the News Metadata
Named Entity Recognition
Domain Ontologies
NAR Ontology
NewsCodes Thesaurus
62. Presenting News Information
Dimensions used for searching news items
Question | Dimension | Metadata
When | time | 10/07/2006
Where | location | Paris
What | is depicted | J. Chirac, Z. Zidane
Why | event | WC 2006
Who | photographer | Bertrand Guay, AFP
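The five dimensions can be read as the fields of a metadata record that a search interface filters on. The sketch below uses the values from the slide; the `NewsItem` class and the `matches` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewsItem:
    when: str    # time
    where: str   # location
    what: tuple  # who or what is depicted
    why: str     # event
    who: str     # photographer


# Values taken from the slide.
ITEM = NewsItem(
    when="10/07/2006",
    where="Paris",
    what=("J. Chirac", "Z. Zidane"),
    why="WC 2006",
    who="Bertrand Guay, AFP",
)


def matches(item, **criteria):
    """True if the item satisfies every criterion. For multi-valued
    dimensions (tuples), the wanted value must be one of the values."""
    for name, wanted in criteria.items():
        value = getattr(item, name)
        if isinstance(value, tuple):
            if wanted not in value:
                return False
        elif value != wanted:
            return False
    return True


print(matches(ITEM, where="Paris", what="Z. Zidane"))
print(matches(ITEM, why="Euro 2008"))
```

Combining criteria across dimensions is what lets a user narrow a search, for instance to photos taken in Paris that depict a given person.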
63. Semantic Search of Multimedia News
Description | Number of RDF Triples
General Ontologies: NAR, DC, FOAF | 7,336
Domain Specific Ontologies: football | 104,358
Thesauri: newscodes | 34,903
DBpedia, Geonames | 53,468
AFP News Feed (June/July 2006) | 804,446
AFP Photos (June/July 2006) | 61,311
INA Broadcast Video (June/July 2006) | 1,932
Total | 1,067,754
Powered by ClioPatria 1.0 alpha 3
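As a quick check of the figures on this slide, the per-source counts can be summed and compared with the stated total. The dictionary below simply restates the slide's numbers; the variable names are illustrative.

```python
# Triple counts as given on the slide.
TRIPLE_COUNTS = {
    "General Ontologies: NAR, DC, FOAF": 7_336,
    "Domain Specific Ontologies: football": 104_358,
    "Thesauri: newscodes": 34_903,
    "DBpedia, Geonames": 53_468,
    "AFP News Feed (June/July 2006)": 804_446,
    "AFP Photos (June/July 2006)": 61_311,
    "INA Broadcast Video (June/July 2006)": 1_932,
}

total = sum(TRIPLE_COUNTS.values())
largest = max(TRIPLE_COUNTS, key=TRIPLE_COUNTS.get)
print(total)
print(largest)
```

The sum agrees with the total on the slide, and the AFP news feed accounts for most of the triples.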
67. Provide New Dimensions for Exploring
68. Take Home Message
Concept detection challenges: machine learning and IR
Features can be extracted and used to describe multimedia content
Show generality of approach, dynamic nature of video (event)
Show that an ontology can help
Semantic metadata representation challenges: KR
Media and metadata can be passed around and among systems
Reuse what is there
Expose what you make
Interaction challenges: CHI
Users can be given much richer
and more flexible access to (semantically annotated) content
... but we are still figuring out how to do this!
69. Credits
Many people
Cees Snoek, Alex Hauptmann, Alan Smeaton,
Ivan Herman, Krishna Chandramouli, David Simonds,
Laurent Le Meur
Colleagues from the Interactive Information Access
Group, CWI Amsterdam
Datasets
http://www.slideshare.net/troncy
Editor's Notes
Diagram is messy. Try to show largest part of MPEG-7 in one slide. From Alia and Michiel: MPEG-7 so far???
Who? experts, lay persons
Why? information searching, annotation tasks
How? entering query, finding items of interest, displaying results
What? fact finding, information gathering, sensemaking, location-based mobile search (Pub Canary Wharf)