Tell me why! Ain't nothin' but a mistake?
Describing Media Item Differences with Media
Fragments URI and Speech Synthesis
Thomas Steiner (tomac@google.com, @tomayac)
Raphaël Troncy (raphael.troncy@eurecom.fr, @rtroncy)
http://www.ourprg.com/wp-content/uploads/2013/03/wallpapers ru corvuscorax 2560x1440 chelyabinskiy meteor.jpg
Introduction
Context of this work:
● Event summarization based on multimedia data shared publicly
on social networks.
● Developed an application that auto-generates media galleries.
Media gallery creation steps
1) Extract media items from multiple social networks
[Rizzo2012] G. Rizzo, T. Steiner, R. Troncy, R. Verborgh, J.-L. Redondo García, R. Van de Walle. What
fresh media are you looking for?: retrieving media items from multiple social networks. In Proceedings of the
2012 international workshop on Socially-aware multimedia, pp. 15–20, 2012
Media gallery creation steps (cont.)
2) Deduplicate visually similar media items
[Steiner2013_1] Thomas Steiner, Ruben Verborgh, Joaquim Gabarró Vallés, and Rik Van de Walle.
Near-duplicate Photo Deduplication in Event Media Shared on Social Networks. In Proceedings of the
International Conference on Advanced IT, Engineering and Management, 2013
Media gallery creation steps (cont.)
3) Rank media item clusters
[Steiner2013_2] Thomas Steiner. A Meteoroid on Steroids: Ranking Media Items Stemming from Multiple
Social Networks. In Companion Publication of the IW3C2 WWW 2013 Conference, May 13–17, 2013, Rio de
Janeiro, Brazil.
Media gallery creation steps (cont.)
4) Compile media galleries
[Steiner2012_1] T. Steiner, R. Verborgh, J. Gabarro, R. Van de Walle. Defining
aesthetic principles for automatic media gallery layout for visual and audial event
summarization based on social networks. In 2012 Fourth International Workshop on
Quality of Multimedia Experience (QoMEX), 2012
[Steiner2013_3] Thomas Steiner and Christopher Chedeau. To Crop, Or Not to
Crop: Compiling Online Media Galleries. In Companion Publication of the IW3C2
WWW 2013 Conference, May 13–17, 2013, Rio de Janeiro, Brazil
Research Question
"Given a complex algorithm like a media item clustering algorithm, can
we use Media Fragments URIs together with speech synthesis to
explain the rationale behind the algorithm's results?"
● The human raters who evaluate algorithm results are non-experts.
● Explanations can help algorithm developers improve their algorithms.
● The proof-of-concept has potential for generalization.
Media Fragments URIs
A media item tile is a spatial media fragment
xywh.js—Polyfill for spatial media fragments
<img src="kitten.jpg#xywh=100,100,50,50"/>
<img src="kitten.jpg#xywh=pixel:100,100,50,50"/>
<img src="kitten.jpg#xywh=percent:25,25,50,50"/>
Available as open source on GitHub:
https://github.com/tomayac/xywh.js
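To illustrate how the spatial fragment in such a URI can be read, here is a minimal JavaScript sketch. It is not the xywh.js implementation; the function name parseXywh and the shape of the returned object are assumptions made for illustration. It covers the three forms shown above: no unit prefix (treated as pixels), pixel:, and percent:.

```javascript
// Hypothetical helper (not part of xywh.js): parses the spatial dimension
// of a Media Fragments URI such as "kitten.jpg#xywh=percent:25,25,50,50".
function parseXywh(uri) {
  const hashIndex = uri.indexOf('#');
  if (hashIndex === -1) return null;
  // A fragment may combine dimensions with "&", e.g. "t=,10&xywh=0,0,30,40".
  const parts = uri.slice(hashIndex + 1).split('&');
  for (const part of parts) {
    const match = /^xywh=(?:(pixel|percent):)?(\d+),(\d+),(\d+),(\d+)$/.exec(part);
    if (match) {
      const [x, y, w, h] = match.slice(2).map(Number);
      // The unit defaults to pixels when no prefix is given.
      return { unit: match[1] || 'pixel', x, y, w, h };
    }
  }
  return null;
}

console.log(parseXywh('kitten.jpg#xywh=percent:25,25,50,50'));
```

A URI without a spatial fragment yields null, so callers can fall back to the whole media item.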
Media Fragments URIs (cont.)
We use a tile-wise, average-histogram-based media item deduplication
algorithm with face detection.
The algorithm uses Media Fragments URIs [Troncy2012] to make semantic
statements about fragments of media items:
@base <http://example.org/> .
@prefix ma: <http://www.w3.org/ns/ma-ont#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix db: <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix col: <http://purl.org/colors/rgb/> .
<video> a ma:MediaResource .
<video#t=,10&xywh=0,0,30,40> a ma:MediaFragment ;
foaf:depicts db:Face .
<video#t=,10&xywh=0,0,10,10> a ma:MediaFragment ;
dbo:colour col:f00 .
[Troncy2012] R. Troncy, E. Mannens, S. Pfeiffer, D. Van Deursen, M. Hausenblas, P. Jagenstedt, J. Jansen, Y.
Lafon, C. Parker, and T. Steiner, “Media Fragments URI 1.0 (basic),” Recommendation, W3C, 2012
Deduplicating media items
Each tile of a media item has its own unique URI:
● http://example.org/image.png#xywh=0,0,10,10
We can leverage this fact to make semantic statements about media
item similarity, for example, to debug the deduplication algorithm.
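As a rough illustration of how per-tile URIs could be produced, the following JavaScript sketch splits an image into a regular grid and emits one Media Fragments URI per tile. The function name tileUris and the grid parameters are hypothetical; the slides do not specify how the application generates its tiles.

```javascript
// Hypothetical helper: splits a width x height image into rows x cols tiles
// and returns one Media Fragments URI per tile (pixel units).
function tileUris(baseUri, width, height, rows, cols) {
  // Flooring means leftover pixels at the right and bottom edges are ignored.
  const tileW = Math.floor(width / cols);
  const tileH = Math.floor(height / rows);
  const uris = [];
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      const x = col * tileW;
      const y = row * tileH;
      uris.push(`${baseUri}#xywh=${x},${y},${tileW},${tileH}`);
    }
  }
  return uris;
}

const uris = tileUris('http://example.org/image.png', 100, 100, 10, 10);
console.log(uris[0]); // http://example.org/image.png#xywh=0,0,10,10
console.log(uris.length); // 100
```

Because each URI identifies exactly one tile, it can serve as the subject of a statement about that tile's average color or about a detected face.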
Deduplicating media items (cont.)
Algorithm Matching Conditions
Cond. 1: Out of m tiles of a media item with n tiles (m ≤ n), the
average color of at most tiles_threshold tiles may differ by no more
than similarity_threshold from their counterpart tiles.
Cond. 2: The numbers f1 and f2 of detected faces in both media items
have to be the same. We note that the algorithm does not recognize
faces, but only detects them.
Cond. 3: If the average colors of a tile and its counterpart tile are within
the black-and-white tolerance bw_tolerance, these tiles are not
considered and tiles_threshold is decreased accordingly.
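The three conditions can be sketched in JavaScript as follows. This is not the authors' implementation, and several details are assumptions: tiles are given as average RGB colors, the difference between two tiles is the largest per-channel difference, a tile pair is skipped when both tiles are close to pure black or pure white, and two items match when the number of similar tiles reaches the effective tiles threshold. The last assumption is one reading of Cond. 1, chosen because it agrees with the way the low-level debug output in these slides reduces the tiles threshold by the number of tiles not considered.

```javascript
// Assumed tile representation: average color { r, g, b }, channels in 0..255.
function isNearBlackOrWhite(color, bwTolerance) {
  const channels = [color.r, color.g, color.b];
  const nearBlack = channels.every((c) => c <= bwTolerance);
  const nearWhite = channels.every((c) => c >= 255 - bwTolerance);
  return nearBlack || nearWhite;
}

// Assumed distance: largest absolute per-channel difference.
function colorDifference(a, b) {
  return Math.max(Math.abs(a.r - b.r), Math.abs(a.g - b.g), Math.abs(a.b - b.b));
}

function compareMediaItems(left, right, params) {
  const { similarityThreshold, tilesThreshold, bwTolerance } = params;
  let similarTiles = 0;
  let notConsideredTiles = 0;
  left.tiles.forEach((tile, i) => {
    const counterpart = right.tiles[i];
    if (isNearBlackOrWhite(tile, bwTolerance) && isNearBlackOrWhite(counterpart, bwTolerance)) {
      notConsideredTiles++; // Cond. 3: tile pair is not considered
    } else if (colorDifference(tile, counterpart) <= similarityThreshold) {
      similarTiles++; // Cond. 1: tile pair counts as similar
    }
  });
  // Cond. 3: the threshold shrinks by the number of skipped tile pairs.
  const effectiveTilesThreshold = tilesThreshold - notConsideredTiles;
  const sameFaceCount = left.faces === right.faces; // Cond. 2
  return {
    similarTiles,
    notConsideredTiles,
    effectiveTilesThreshold,
    sameFaceCount,
    // Assumed decision rule, see the note above.
    match: sameFaceCount && similarTiles >= effectiveTilesThreshold,
  };
}

// Made-up sample data with four tiles per media item.
const left = { faces: 0, tiles: [
  { r: 0, g: 0, b: 0 }, { r: 200, g: 10, b: 10 },
  { r: 10, g: 200, b: 10 }, { r: 10, g: 10, b: 200 },
] };
const right = { faces: 0, tiles: [
  { r: 1, g: 1, b: 1 }, { r: 205, g: 12, b: 8 },
  { r: 12, g: 190, b: 15 }, { r: 200, g: 200, b: 10 },
] };
const result = compareMediaItems(left, right,
  { similarityThreshold: 15, tilesThreshold: 3, bwTolerance: 1 });
console.log(result);
```

In this sample, one tile pair is skipped as near-black, two pairs are similar, and the effective threshold drops from 3 to 2, so the items match under the assumed rule.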
Deduplicating media items (cont.)
We use language generation and a speech synthesizer to turn RDF
statements about the visual similarity of media item tiles into spoken
statements.
Based on Speak.js (https://github.com/kripken/speak.js)
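A minimal JavaScript sketch of this idea follows. The statement shape ({ tileUri, property, value }), the sentence template, and both function names are hypothetical. The speech function is passed in as a parameter so that the sketch runs anywhere; in a page that has loaded Speak.js, one would pass its speech function instead (assumed here to be exposed as a global named speak).

```javascript
// Hypothetical statement shape: { tileUri, property, value }.
function statementToSentence(statement) {
  const match = /#xywh=(\d+),(\d+),(\d+),(\d+)$/.exec(statement.tileUri);
  const where = match
    ? `The tile at position ${match[1]}, ${match[2]}`
    : 'The media item';
  return `${where} has ${statement.property} ${statement.value}.`;
}

// speakFn is injected so the sketch also runs outside the browser.
function sayStatements(statements, speakFn) {
  const sentences = statements.map(statementToSentence);
  sentences.forEach((sentence) => speakFn(sentence));
  return sentences;
}

// Here the sentence is logged instead of spoken.
sayStatements(
  [{
    tileUri: 'http://example.org/image.png#xywh=0,0,10,10',
    property: 'average color',
    value: 'red',
  }],
  (sentence) => console.log(sentence)
);
```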
Deduplicating media items (cont.)
Human Rater Decisions
Clustering Consent: Two or more media items are clustered by the
algorithm and the human rater agrees. The human rater wants to
understand why they were clustered.
Clustering Dissent: Two or more media items are clustered by the
algorithm, but the human rater thinks that they should not have been
clustered. The human rater wants to understand why they were
incorrectly clustered.
Non-Clustering Dissent: Two or more media items are not clustered
by the algorithm, but the human rater thinks that they should have
been clustered. The human rater wants to understand why they
were not clustered.
Deduplicating media items (cont.)
Low-level debug output
- Similarity threshold: 15 (Cond. 1)
- Tiles threshold: 67 (Cond. 1)
- Similar tiles: 52 (Cond. 1)
- Faces left: 0. Faces right: 0 (Cond. 2)
- BW tolerance: 1 (Cond. 3)
- Not considered tiles: 22 (Cond. 3)
- Effective tiles threshold: 45 (Cond. 3)
This output needs to be translated into plain language so that
non-experts can understand it.
Natural Speech Generation
Reiter and Dale [Reiter2000] differentiate three phases of natural
language generation:
Document planning determines the content and structure of a
document.
Microplanning decides which words, syntactic structures, etc. are used
to communicate the chosen content and structure.
Realization maps the abstract representations used by microplanning
into text.
[Reiter2000] E. Reiter and R. Dale, Building Natural Language Generation Systems,
Studies in Natural Language Processing. Cambridge University Press, 2000.
Natural Speech Generation (cont.)
Document Planning: We need to convey the currently selected
tiles_threshold and similarity_threshold, the number of detected faces f1
and f2 in each media item, and the number of tiles not considered given
the bw_tolerance parameter.
Microplanning: We need to decide which matching condition of the
algorithm to highlight first. Afterwards, we need to
elaborate on secondary matching conditions such as detected faces and
black-and-white tolerance. The grammatical number (plural or singular)
needs to be taken into account. The microplanner needs to decide when
exactness (e.g., “99% of all tiles”) and when approximation of calculated
values (e.g., “roughly 50%”) better suits the human evaluators’ needs.
Realization: We need to map the abstract representations used by the
microplanning step into text.
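The three phases can be sketched in JavaScript, using the values from the low-level debug output shown earlier as input. All function names and sentence templates are illustrative assumptions and not the wording produced by the actual application; only the grammatical-number handling from the microplanning step is modeled.

```javascript
// Document planning: select which facts to convey, and in which order.
function planDocument(debug) {
  return [
    { type: 'tiles', similar: debug.similarTiles, threshold: debug.effectiveTilesThreshold },
    { type: 'faces', left: debug.facesLeft, right: debug.facesRight },
    { type: 'notConsidered', count: debug.notConsideredTiles },
  ];
}

// Microplanning: pick singular or plural wording for a count.
function pluralize(count, singular, plural) {
  return count === 1 ? `${count} ${singular}` : `${count} ${plural}`;
}

// Realization: map each planned message to text.
function realize(message) {
  switch (message.type) {
    case 'tiles':
      return `${pluralize(message.similar, 'tile is', 'tiles are')} similar, ` +
        `compared with an effective threshold of ${message.threshold}.`;
    case 'faces':
      return `The left media item contains ${pluralize(message.left, 'face', 'faces')} ` +
        `and the right one contains ${pluralize(message.right, 'face', 'faces')}.`;
    case 'notConsidered': {
      const pronoun = message.count === 1 ? 'it is' : 'they are';
      return `However, ${pluralize(message.count, 'tile was', 'tiles were')} ` +
        `not considered, as ${pronoun} either too bright or too dark.`;
    }
    default:
      throw new Error(`Unknown message type: ${message.type}`);
  }
}

function describe(debug) {
  return planDocument(debug).map(realize).join(' ');
}

console.log(describe({
  similarTiles: 52,
  effectiveTilesThreshold: 45,
  facesLeft: 0,
  facesRight: 0,
  notConsideredTiles: 22,
}));
```

Keeping the plan as plain data separates the choice of what to say from how it is worded, so the templates can be changed without touching the selection logic.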
Natural Speech Generation (cont.)
“However, 22 tiles were not considered, as they are either too bright or
too dark, which is a common source of clustering issues.”
Live Demo
Slides:
http://bit.ly/icme2013
Demo:
http://social-media-illustrator.herokuapp.com
This Paper:
http://www.lsi.upc.edu/~tsteiner/papers/2013/tell-me-why-aint-nothin-but-a-mistake-describing-media-item-differences-icme2013.pdf
Other Papers:
http://www2013.org/companion/p31.pdf
http://www2013.org/companion/p201.pdf
Questions here, or tomac@google.com
@tomayac

[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 


  • 1. Tell me why! Ain't nothin' but a mistake? Describing Media Item Differences with Media Fragments URI and Speech Synthesis Thomas Steiner (tomac@google.com, @tomayac) Raphaël Troncy (raphael.troncy@eurecom.fr, @rtroncy) http://www.ourprg.com/wp-content/uploads/2013/03/wallpapers ru corvuscorax 2560x1440 chelyabinskiy meteor.jpg
  • 2. Introduction Context of this work: ● Event summarization based on multimedia data shared publicly on social networks. ● Developed an application that auto-generates media galleries.
  • 3. Media gallery creation steps 1) Extract media items from multiple social networks [Rizzo2012] G. Rizzo, T. Steiner, R. Troncy, R. Verborgh, J.-L. Redondo García, R. Van de Walle. What fresh media are you looking for?: retrieving media items from multiple social networks. In Proceedings of the 2012 international workshop on Socially-aware multimedia, pp. 15–20, 2012
  • 4. Media gallery creation steps (cont.) 2) Deduplicate visually similar media items [Steiner2013_1] Thomas Steiner, Ruben Verborgh, Joaquim Gabarró Vallés, and Rik Van de Walle. Near- duplicate Photo Deduplication in Event Media Shared on Social Networks. In Proceedings of the International Conference on Advanced IT, Engineering and Management, 2013
  • 5. Media gallery creation steps (cont.) 3) Rank media item clusters [Steiner2013_2] Thomas Steiner. A Meteoroid on Steroids: Ranking Media Items Stemming from Multiple Social Networks. In Companion Publication of the IW3C2 WWW 2013 Conference, May 13–17, 2013, Rio de Janeiro, Brazil.
  • 6. Media gallery creation steps (cont.) 4) Compile media galleries [Steiner2012_1] T Steiner, R Verborgh, J Gabarro, R Van de Walle. Defining aesthetic principles for automatic media gallery layout for visual and audial event summarization based on social networks. In Quality of Multimedia Experience (QoMEX), 2012 Fourth International Workshop on, 2012 [Steiner2013_3] Thomas Steiner and Christopher Chedeau. To Crop, Or Not to Crop: Compiling Online Media Galleries. In Companion Publication of the IW3C2 WWW 2013 Conference, May 13–17, 2013, Rio de Janeiro, Brazil
  • 7. Research Question
    "Given a complex algorithm like a media item clustering algorithm, can we use Media Fragments URIs together with speech synthesis to describe the rationale behind the algorithm's results?"
    ● The human raters who evaluate the algorithm's results are non-experts.
    ● Such descriptions can help algorithm developers improve their algorithms.
    ● The proof of concept has potential to generalize.
  • 8. Media Fragments URIs
    A media item tile is a spatial media fragment.
    xywh.js: polyfill for spatial media fragments
        <img src="kitten.jpg#xywh=100,100,50,50"/>
        <img src="kitten.jpg#xywh=pixel:100,100,50,50"/>
        <img src="kitten.jpg#xywh=percent:25,25,50,50"/>
    Available as open source on GitHub: https://github.com/tomayac/xywh.js
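The three `xywh` forms on this slide can be read with a few lines of code. The following is a minimal sketch in Python, not the xywh.js implementation: the names `SpatialFragment` and `parse_xywh` are hypothetical, and the parser only covers the integer `pixel:` and `percent:` forms shown above, treating a missing unit as pixels.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class SpatialFragment(NamedTuple):
    """Hypothetical container for one parsed xywh fragment."""
    unit: str  # "pixel" or "percent"
    x: int
    y: int
    w: int
    h: int


# Optional unit prefix, then four comma-separated non-negative integers.
_XYWH = re.compile(r"^(?:(pixel|percent):)?(\d+),(\d+),(\d+),(\d+)$")


def parse_xywh(uri: str) -> Optional[SpatialFragment]:
    """Return the spatial fragment of a URI, or None if absent or malformed."""
    fragment = urlparse(uri).fragment
    # A fragment may combine dimensions, e.g. "t=,10&xywh=0,0,30,40".
    for pair in fragment.split("&"):
        name, _, value = pair.partition("=")
        if name != "xywh":
            continue
        match = _XYWH.match(value)
        if match is None:
            return None
        unit = match.group(1) or "pixel"  # pixels are the default unit
        x, y, w, h = (int(group) for group in match.groups()[1:])
        return SpatialFragment(unit, x, y, w, h)
    return None
```

For example, `parse_xywh("kitten.jpg#xywh=percent:25,25,50,50")` yields a fragment with unit `percent`, while a URI without any fragment yields `None`.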
  • 9. Media Fragments URIs (cont.)
    Using a tile-wise average-histogram-based media item deduplication algorithm with face detection.
    Makes use of Media Fragments URIs [Troncy2012] to make semantic statements about fragments of media items:
        @base <http://example.org/> .
        @prefix ma: <http://www.w3.org/ns/ma-ont#> .
        @prefix foaf: <http://xmlns.com/foaf/0.1/> .
        @prefix db: <http://dbpedia.org/resource/> .
        @prefix dbo: <http://dbpedia.org/ontology/> .
        @prefix col: <http://purl.org/colors/rgb/> .
        <video> a ma:MediaResource .
        <video#t=,10&xywh=0,0,30,40> a ma:MediaFragment ;
            foaf:depicts db:Face .
        <video#t=,10&xywh=0,0,10,10> a ma:MediaFragment ;
            dbo:colour col:f00 .
    [Troncy2012] R. Troncy, E. Mannens, S. Pfeiffer, D. Van Deursen, M. Hausenblas, P. Jagenstedt, J. Jansen, Y. Lafon, C. Parker, and T. Steiner, “Media Fragments URI 1.0 (basic),” Recommendation, W3C, 2012
  • 10. Deduplicating media items Each tile of a media item has its unique URI: ● http://example.org/image.png#xywh=0,0,10,10 We can leverage this fact to make semantic statements about media item similarity, for example, to debug the deduplication algorithm.
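To illustrate how every tile gets its own URI, here is a small sketch that enumerates the tile URIs of an image cut into a regular grid. The function name `tile_uris` and the grid parameters are assumptions made for illustration; the slides do not say how the tile grid is chosen.

```python
from typing import List


def tile_uris(base_uri: str, width: int, height: int,
              cols: int, rows: int) -> List[str]:
    """List one spatial Media Fragments URI per tile, row by row.

    Assumes a regular grid; pixels left over when the image size is not
    divisible by the grid size are ignored in this sketch.
    """
    tile_w = width // cols
    tile_h = height // rows
    uris = []
    for row in range(rows):
        for col in range(cols):
            x = col * tile_w
            y = row * tile_h
            uris.append(f"{base_uri}#xywh={x},{y},{tile_w},{tile_h}")
    return uris
```

With a 100x100 pixel image and a 10x10 grid, the first URI is `http://example.org/image.png#xywh=0,0,10,10`, which matches the example on the slide.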
  • 11. Deduplicating media items (cont.)
    Algorithm Matching Conditions
    Cond. 1: Out of m tiles of a media item with n tiles (m <= n), the average color of at most tiles_threshold tiles may differ by no more than similarity_threshold from their counterpart tiles.
    Cond. 2: The numbers f1 and f2 of detected faces in both media items have to be the same. We note that the algorithm does not recognize faces, but only detects them.
    Cond. 3: If the average colors of a tile and its counterpart tile are within the black-and-white tolerance bw_tolerance, these tiles are not considered and tiles_threshold is decreased accordingly.
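The three conditions can be sketched as a single comparison function. This is an illustration under several assumptions that the slides do not confirm: tiles are given as average RGB colors, color difference is the largest per-channel difference, a tile counts as black or white when all channels lie within `bw_tolerance` of 0 or 255, and `tiles_threshold` is read as the minimum number of similar tiles required for a match. The helper names are hypothetical.

```python
from typing import Dict, List, Tuple

Color = Tuple[int, int, int]  # average RGB color of one tile


def color_distance(a: Color, b: Color) -> int:
    """Assumed difference measure: largest per-channel difference."""
    return max(abs(p - q) for p, q in zip(a, b))


def is_black_or_white(color: Color, bw_tolerance: int) -> bool:
    """Assumed rule: all channels within bw_tolerance of 0 or of 255."""
    return (all(ch <= bw_tolerance for ch in color)
            or all(ch >= 255 - bw_tolerance for ch in color))


def compare(tiles_a: List[Color], tiles_b: List[Color],
            faces_a: int, faces_b: int, tiles_threshold: int,
            similarity_threshold: int, bw_tolerance: int) -> Dict[str, object]:
    """Evaluate Cond. 1 to 3 for two media items with counterpart tiles."""
    similar = 0
    not_considered = 0
    for a, b in zip(tiles_a, tiles_b):
        # Cond. 3: skip tile pairs that are both near black or near white.
        if is_black_or_white(a, bw_tolerance) and is_black_or_white(b, bw_tolerance):
            not_considered += 1
        # Cond. 1: count tile pairs whose average colors are close enough.
        elif color_distance(a, b) <= similarity_threshold:
            similar += 1
    # Cond. 3: the threshold shrinks by the number of skipped tiles.
    effective_threshold = max(tiles_threshold - not_considered, 0)
    return {
        "similar_tiles": similar,
        "not_considered_tiles": not_considered,
        "effective_tiles_threshold": effective_threshold,
        # Cond. 2: the numbers of detected faces must be equal.
        "match": similar >= effective_threshold and faces_a == faces_b,
    }
```

Returning the intermediate counts alongside the decision keeps the values that a later explanation step would need to report.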
  • 12. Deduplicating media items (cont.) Using a speech synthesizer and speech generation to make spoken statements based on RDF statements about visual similarity of media item tiles. Based on Speak.js (https://github.com/kripken/speak.js)
  • 13. Deduplicating media items (cont.)
    Human Rater Decisions
    Clustering Consent: Two or more media items are clustered by the algorithm and the human rater agrees. The human rater wants to understand why they were clustered.
    Clustering Dissent: Two or more media items are clustered by the algorithm, but the human rater thinks that they should not have been clustered. The human rater wants to understand why they were incorrectly clustered.
    Non-Clustering Dissent: Two or more media items are not clustered by the algorithm, but the human rater thinks that they should have been clustered. The human rater wants to understand why they were not clustered.
  • 14. Deduplicating media items (cont.)
    Low-level debug output
    - Similarity threshold: 15 (Cond. 1)
    - Tiles threshold: 67 (Cond. 1)
    - Similar tiles: 52 (Cond. 1)
    - Faces left: 0. Faces right: 0 (Cond. 2)
    - BW tolerance: 1 (Cond. 3)
    - Not considered tiles: 22 (Cond. 3)
    - Effective tiles threshold: 45 (Cond. 3)
    Needs to be lifted to normal human language in order to be understandable by non-domain experts.
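The debug values on this slide are internally consistent: the effective tiles threshold is the tiles threshold minus the tiles not considered, 67 - 22 = 45. As a sketch of the first step towards lifting this output to human language, the snippet below turns the raw lines into named values and checks that arithmetic. The function name `parse_debug` and the key naming scheme are assumptions.

```python
import re
from typing import Dict

# Debug output copied from the slide.
DEBUG_OUTPUT = """\
- Similarity threshold: 15 (Cond. 1)
- Tiles threshold: 67 (Cond. 1)
- Similar tiles: 52 (Cond. 1)
- Faces left: 0. Faces right: 0 (Cond. 2)
- BW tolerance: 1 (Cond. 3)
- Not considered tiles: 22 (Cond. 3)
- Effective tiles threshold: 45 (Cond. 3)
"""


def parse_debug(text: str) -> Dict[str, int]:
    """Map each 'Label: number' pair to a snake_case key and an integer."""
    pairs = re.findall(r"([A-Za-z ]+): (\d+)", text)
    return {label.strip().lower().replace(" ", "_"): int(value)
            for label, value in pairs}


values = parse_debug(DEBUG_OUTPUT)
# Cond. 3: the threshold is reduced by the tiles that were not considered.
assert (values["tiles_threshold"] - values["not_considered_tiles"]
        == values["effective_tiles_threshold"])
```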
  • 15. Natural Speech Generation
    Reiter and Dale [Reiter2000] differentiate three phases of speech generation:
    Document planning determines the content and structure of a document.
    Microplanning decides which words, syntactic structures, etc. are used to communicate the chosen content and structure.
    Realization maps the abstract representations used by microplanning into text.
    [Reiter2000] E. Reiter and R. Dale, Building Natural Language Generation Systems, Studies in Natural Language Processing. Cambridge University Press, 2000.
  • 16. Natural Speech Generation (cont.)
    Document Planning: We need to convey the currently selected tiles_threshold and similarity_threshold, the number of detected faces f1 and f2 in each media item, and the number of tiles not considered given the bw_tolerance parameter.
    Microplanning: We need to decide on a matching condition aspect of the algorithm that will be first highlighted. Afterwards, we need to elaborate on secondary matching conditions such as detected faces and black-and-white tolerance. The grammatical number (plural or singular) needs to be taken into account. The microplanner needs to decide when exactness (e.g., “99% of all tiles”) and when approximation of calculated values (e.g., “roughly 50%”) better suits the human evaluators’ needs.
    Realization: We need to map the abstract representations used by the microplanning step into text.
  • 17. Natural Speech Generation (cont.) “However, 22 tiles were not considered, as they are either too bright or too dark, which is a common source of clustering issues.”
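The planning and realization steps can be sketched with simple templates. This is a hypothetical illustration, not the authors' generator: only the "However, ..." sentence is taken from the slides, while the other templates, the function names, the `total_tiles` parameter, and the rule for choosing exact or approximate percentages are assumptions.

```python
from typing import List


def count_phrase(count: int, singular: str, plural: str) -> str:
    """Microplanning: respect grammatical number, with 'no' for zero."""
    if count == 0:
        return f"no {plural}"
    return f"{count} {singular if count == 1 else plural}"


def quantify(part: int, whole: int) -> str:
    """Microplanning: exact near the extremes, approximate otherwise.

    The 5%/95% cut-offs are an assumed rule for this sketch.
    """
    percent = 100 * part / whole
    if percent >= 95 or percent <= 5:
        return f"{round(percent)}%"
    return f"roughly {round(percent / 10) * 10}%"


def realize(similar_tiles: int, total_tiles: int, faces_left: int,
            faces_right: int, not_considered: int) -> List[str]:
    """Realization: turn the planned content into sentences."""
    share = quantify(similar_tiles, total_tiles)
    sentences = [f"{share[0].upper()}{share[1:]} of all tiles are similar."]
    if faces_left == faces_right:
        faces = count_phrase(faces_left, "face", "faces")
        sentences.append(f"Both media items contain {faces}.")
    else:
        left = count_phrase(faces_left, "face", "faces")
        right = count_phrase(faces_right, "face", "faces")
        sentences.append(f"The left media item contains {left}, "
                         f"but the right one contains {right}.")
    if not_considered > 0:
        tiles = count_phrase(not_considered, "tile", "tiles")
        verb, pronoun = (("was", "it is") if not_considered == 1
                         else ("were", "they are"))
        sentences.append(
            f"However, {tiles} {verb} not considered, as {pronoun} either "
            "too bright or too dark, which is a common source of "
            "clustering issues.")
    return sentences
```

Calling `realize(52, 100, 0, 0, 22)`, where the total of 100 tiles is an assumed value, ends with the sentence quoted on this slide. The resulting strings could then be handed to a speech synthesizer such as Speak.js.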