Remixing Media on the Semantic Web (ISWC 2014 Tutorial) Pt 1 Media Fragment Specification and Semantics

Re-Using Media on the (Semantic) Web
Raphaël Troncy <raphael.troncy@eurecom.fr> with contributions from:
Giuseppe Rizzo, José Luis Redondo Garcia, Mariella Sabatino, Pasquale Lisena
@rtroncy

Agenda

Session 1: Media fragment specification and semantics

Summary: Introduce the W3C Media Fragment URI specification and the Open Annotation model. Highlight how media fragments can be annotated using NER tools.

Session 2: Linked Media principles

Summary: Introduce the Linked Media principles, how to publish linked media in RDF, and how to retrieve media enrichments. Illustration with Linked Media applications.

Session 3: User experience driven design of Linked Media applications

Summary: Present the Web and TV convergence. Describe LinkedTV experience via two innovative applications.
20/10/2014 - Reusing Media on the (Semantic) Web - Tutorial @ ISWC 2014 - 2

Once upon a time …

… leading to sharing Media Fragments

Publishing status message containing a Media Fragment URI

Use a ‘#’ !

Highlight a video sequence

Highlight a region to pay attention to

W3C Video on the Web Workshop - 2007

Key topics

Addressing: having global identifiers for identifying spatial and temporal clips (for deep linking, bookmarking, caching and indexing)

Metadata: searching and discovering video is difficult with the volume of online video

Video codec: recommending a baseline (open) video codec for the World Wide Web

Content protection: managing digital rights associated with the media is key: W3C should look into metadata for digital rights

Making video a "first class citizen"

Flickr Notes
http://www.flickr.com/photos/mhausenblas/2883727293/

YouTube Temporal Addressing (Sept 2008)

Twitter bot monitoring usage of video sharing

Loose media fragments parser: https://github.com/yunjiali/Media-Fragments-URI-Loose

50 hours monitoring of the Twitter stream (22 December 2013 – 24 December 2013)

5.8 million tweets containing a video URL analyzed

32,754 tweets contain a valid media fragment URI (0.6%)

99% from YouTube, 0.3% from Dailymotion, 0.1% from Vimeo

What are Media Fragments?
[Figure: timeline of a media resource (t = 0, 20, 35) illustrating the four fragment types]
temporal media fragment
spatial media fragment
track media fragment
named media fragment (“Scared Scene”)

Media Fragments (temporal)
Fragment beginning
Fragment end
Playback progress
Original resource length

Media Fragments (spatial)
semi-opaque overlay
highlighted fragment
http://ninsuna.elis.ugent.be/MFPlayer/html5

Media Fragments URIs

Bookmark / Share parts (fragments) of audio/video content

Annotate media fragments

Search for media fragments

Develop Mash-ups/Collage

Conserve bandwidth
http://www.w3.org/TR/media-frags-reqs/
http://www.w3.org/TR/media-frags/

URI Scheme

Using URI query part: http://www.example.org/video.ogv?t=60,100

Using URI fragment part: http://www.example.org/video.ogv#t=60,100

Mixing both: http://www.example.org/video.ogv?t=60,100#t=10,15
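
The three URI forms above can be told apart with a few lines of standard-library Python. This is a minimal sketch, not the mediafragment.js parser: the function names parse_media_fragment and temporal_bounds are illustrative assumptions, and only plain-seconds NPT values (such as 60,100) are handled.

```python
from urllib.parse import urlsplit, parse_qsl

# Dimension names defined by Media Fragments URI 1.0.
DIMENSIONS = {"t", "xywh", "track", "id"}


def parse_media_fragment(uri):
    """Collect media-fragment name/value pairs from the query and hash parts.

    Returns {"query": {...}, "hash": {...}}, each mapping a dimension
    name to the list of raw values found in that part of the URI.
    """
    parts = urlsplit(uri)
    result = {}
    for key, raw in (("query", parts.query), ("hash", parts.fragment)):
        found = {}
        for name, value in parse_qsl(raw, keep_blank_values=True):
            if name in DIMENSIONS:
                found.setdefault(name, []).append(value)
        result[key] = found
    return result


def temporal_bounds(value):
    """Return (start, end) in seconds for a plain-seconds NPT value.

    A missing start defaults to 0.0; a missing end is returned as None.
    Clock formats such as 1:00:00 are NOT handled in this sketch.
    """
    if value.startswith("npt:"):
        value = value[len("npt:"):]
    start, _, end = value.partition(",")
    return (float(start) if start else 0.0, float(end) if end else None)


parsed = parse_media_fragment("http://www.example.org/video.ogv?t=60,100#t=10,15")
print(parsed["query"], parsed["hash"])
```

Keeping the query and hash parts separate mirrors the distinction made on the slide: the query part yields a new resource on the server, while the hash part is resolved by the user agent.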

Media Fragments Resolution

For the URI query part:

The media file is only processed on server side

The UA receives a new video file

For the URI fragment part:

Smart UA will strip out the fragment definition and encode it into custom HTTP headers (Range header)

(Media) Servers will handle the request, slice the media content and serve just the fragment (corresponding byte ranges) … while old ones will serve the whole resource

Media Fragments Resolution

2-way handshake

4-way handshake

Influence of Media Formats

Fragment extraction needs to be expressible in terms of byte ranges

Requirements for the different axes

temporal: presence of intra-coded frames (i.e., random access points)

spatial: presence of independently coded spatial regions

track: need to be identifiable by a name

Conclusion: temporal and track axes are realistic, spatial fragments can hardly be expressed in terms of byte ranges

State of the art
[Table: media fragment support in clients and video sharing platforms. Rows: TEMPORAL (NPT hh:mm:ss, SMPTE, Clock) and SPATIAL. Notes: only the start time is supported; the syntax is not standard.]

State of the art
MEDIAFRAGMENT.JS
MediaFragments.parse("http://www.example.com/video.ogv?t=1:00:00#t=npt:10,20&xywh=percent:25,25,50,50");
{
"query":{
"t":[
{
"value":"1:00:00",
"unit":"npt",
"start":"1:00:00",
"end":"",
"startNormalized":3600,
"endNormalized":""
}
]
},
"hash":{
"t":[
{
"value":"npt:10,20",
"unit":"npt",
…
Alignment to specification
Controls for percent spatial frags
Node.JS module
https://github.com/tomayac/Media-Fragments-URI/

State of the art
CLIENT IMPLEMENTATIONS
SYNOTE MEDIA FRAGMENT PLAYER
• Cross-browser (Flash fallback)
• HTML5, YouTube, Dailymotion, Vimeo support
• HTML5-like interface
https://github.com/pasqLisena/Media-Fragment-Player

State of the art
CLIENT IMPLEMENTATIONS
NINSUNA MEDIA FRAGMENT PLAYER
http://ninsuna.elis.ugent.be/MediaFragmentsPlayer

State of the art
SERVER IMPLEMENTATIONS
NINSUNA MEDIA FRAGMENT SERVER
RAFAEL
• Preliminary processing of media resources
• Structural metadata stored in an RDF triplestore
• Annotation system
• Media adaptation and binarization
• Support for time range requests
• Fragment extraction on the fly
• Java lib mp4parser
• Fragments stored on the filesystem
• Support only for query fragments
http://ninsuna.elis.ugent.be/MediaFragmentsServer
https://github.com/Noterik/Rafael

MAFFIN: node-js Media Fragment Server

Query fragment
• Time (npt)
• Track (video/audio)
• Xywh (?)
Hash fragment
• Range request (npt)

MAFFIN Architecture

Fragment Extraction

FRAGMENT QUERY | FFMPEG OPTION | NOTE
t=10 | -ss 10 |
t=,20 | -to 20 |
t=10,20 | -ss 10 -to 20 |
track=video | -an | no audio
track=audio | -vn | no video
xywh=10,10,50,60 | -filter:v "crop=50:60:10:10" | requires transcoding
xywh=percent:10,10,50,60 | -filter:v "crop=in_w*50/100:in_h*60/100:in_w*10/100:in_h*10/100" | requires transcoding

Example: ffmpeg -i C:/video/video.mp4 -ss 10 -to 20 C:/video/out/video_10-20_.mp4
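
The mapping from fragment queries to ffmpeg options can be sketched as a small function. This is an illustration under assumptions, not MAFFIN's actual code: the names ffmpeg_options and ffmpeg_command are hypothetical, and the input is assumed to be an already-parsed dictionary of raw dimension values. Only the options listed on the slide are produced.

```python
def ffmpeg_options(params):
    """Translate media-fragment parameters into a list of ffmpeg arguments.

    `params` maps dimension names ("t", "track", "xywh") to raw values,
    for example {"t": "10,20"}.
    """
    args = []
    if "t" in params:
        start, _, end = params["t"].partition(",")
        if start:
            args += ["-ss", start]
        if end:
            args += ["-to", end]
    track = params.get("track")
    if track == "video":
        args.append("-an")  # keep video only: no audio
    elif track == "audio":
        args.append("-vn")  # keep audio only: no video
    if "xywh" in params:
        value = params["xywh"]
        unit = "pixel"
        if ":" in value:
            unit, value = value.split(":", 1)
        x, y, w, h = value.split(",")
        if unit == "percent":
            # Percentages are resolved against the input width/height.
            crop = f"crop=in_w*{w}/100:in_h*{h}/100:in_w*{x}/100:in_h*{y}/100"
        else:
            # ffmpeg's crop filter expects width:height:x:y.
            crop = f"crop={w}:{h}:{x}:{y}"
        args += ["-filter:v", crop]
    return args


def ffmpeg_command(source, target, params):
    """Build the full argument list; run it with subprocess.run if desired."""
    return ["ffmpeg", "-i", source, *ffmpeg_options(params), target]


print(" ".join(ffmpeg_command("video.mp4", "video_10-20_.mp4", {"t": "10,20"})))
```

Returning an argument list rather than a shell string avoids quoting issues when the crop expression is passed to a subprocess.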

Issuing HTTP (Time) Range Requests

A Chrome extension uses mediafragment.js to turn the fragment #t=10,20 into the request header:
Range: t:npt=10-20;include-setup

Issuing HTTP (Time) Range Requests
REQUEST
GET /video.ogv HTTP/1.1
Host: www.example.com
Accept: video/*
Range: t:npt=10-20;include-setup

RESPONSE
HTTP/1.1 206 Partial Content
Accept-Ranges: bytes, t, id
Content-Length: 3795
Content-Type: video/ogg
Content-Range-Mapping: { t:npt 9.85-21.16/0.0-653.79;include-setup } = { bytes 0-52,19147-22880/35614993 }
Content-type: multipart/byteranges; boundary=BOUNDARY
Etag: "b7a60-21f7111-46f3219476580"

--BOUNDARY
Content-type: video/ogg
Content-Range: bytes 0-52/35614993

{binary data}
--BOUNDARY
Content-type: video/ogg
Content-Range: bytes 19147-22880/35614993

{binary data}
--BOUNDARY--

METADATA (bytes 0-52): bytes until first frame
DATA (bytes 19147-22880): byte range built with ffprobe
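
The client side of this exchange, turning a temporal hash fragment into a time Range header, can be sketched as follows. This is a minimal illustration, not the Chrome extension's code: the function name time_range_request is an assumption, and only the t dimension in NPT seconds is handled.

```python
from urllib.parse import urlsplit


def time_range_request(uri, include_setup=True):
    """Return (url_without_fragment, headers) for a temporal fragment URI.

    The fragment is stripped from the URL (it is never sent to the server)
    and re-expressed as a "Range: t:npt=start-end" request header.
    """
    parts = urlsplit(uri)
    headers = {}
    for pair in parts.fragment.split("&"):
        name, _, value = pair.partition("=")
        if name != "t":
            continue
        if value.startswith("npt:"):
            value = value[len("npt:"):]
        start, _, end = value.partition(",")
        header = f"t:npt={start}-{end}"
        if include_setup:
            # Ask the server to also send the bytes needed to set up decoding.
            header += ";include-setup"
        headers["Range"] = header
    return parts._replace(fragment="").geturl(), headers


url, headers = time_range_request("http://www.example.com/video.ogv#t=10,20")
print(url, headers)
```

A server that does not understand the t range unit would ignore the header and serve the whole resource, which matches the fallback behaviour described earlier for older servers.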

Media Fragment Semantic Annotation

Media Fragment creation: localize a region (person)

Media Fragment annotation (tagging) = interpretation: Winston Churchill, UK Prime Minister, Allied Forces, WWII

Media Fragment semantic annotation:
:Reg1 foaf:depicts dbpedia:Winston_Churchill .
dbpedia:Winston_Churchill rdfs:label "Winston Churchill" ;
    rdf:type foaf:Person ;
    dbprop:order dbpedia:Prime_Minister_(UK) .

[Figure: The "Big Three" at the Yalta Conference (Wikipedia), with region Reg1 marked]


Media Fragment creation: localize a temporal sequence

Media Fragment annotation (tagging) = interpretation: G8 Summit, EU Summit, Heiligendamm, 2007, Gothenburg, 2001

Media Fragment semantic annotation:
:Seq1 foaf:depicts dbpedia:33rd_G8_Summit .
:Seq4 foaf:depicts dbpedia:EU_Summit .
dbpedia:33rd_G8_Summit rdfs:label "33rd G8 summit"@en ;
    grs:point "54.143055555555556 11.841666666666667" .

[Figure: A history of G8 violence (video) (© Reuters), with sequences Seq1 and Seq4 marked]

RDF 1.1 Primer (http://www.w3.org/TR/rdf11-primer/)


Things, not strings! http://googleblog.blogspot.fr/2012/05/introducing-knowledge-graph-things-not.html

Use knowledge bases (LOD)

Use common vocabularies (LOV)

Follow the 4 Linked Data principles

Refine the 4 Linked Media principles

Open Annotation Data Model

Specification developed in the W3C Open Annotation Community Group (now a W3C Working Group): http://www.openannotation.org/spec/core/

Core model

OWL vocabulary for representing and sharing annotations of digital resources (and their fragments) … in RDF

A body is related to a target

Nature of the annotation changes according to intention (motivation)
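
The body/target/motivation structure of the core model can be sketched by writing the triples out by hand. This is a minimal illustration using only the standard library; the function name semantic_tag_annotation and the annotation URI http://www.example.org/anno/1 are assumptions made for the example, while the oa: terms come from the Open Annotation namespace.

```python
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def semantic_tag_annotation(annotation_uri, target_uri, body_uri):
    """Serialize one tagging annotation as N-Triples.

    The target is the annotated resource (here a media fragment URI),
    the body is the resource that says something about it.
    """
    triples = [
        (annotation_uri, RDF_TYPE, OA + "Annotation"),
        (annotation_uri, OA + "hasTarget", target_uri),
        (annotation_uri, OA + "hasBody", body_uri),
        (annotation_uri, OA + "motivatedBy", OA + "tagging"),
    ]
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


print(semantic_tag_annotation(
    "http://www.example.org/anno/1",             # hypothetical annotation URI
    "http://www.example.org/video.ogv#t=60,100",  # media fragment as target
    "http://dbpedia.org/resource/Paris",          # LOD resource as body
))
```

Because the target is a Media Fragment URI, the annotation applies to seconds 60 to 100 of the video rather than to the whole resource.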

How to annotate this image?

Semantic Annotation of an Image
http://www.w3.org/community/openannotation/wiki/SE_Semantically_Tagging_an_Image

Open Video: Annotation Project
http://openvideoannotation.org/

YouTube Annotations

Annotations are clickable text overlays on YouTube videos

Annotations are used to boost engagement, give more information, and aid in navigation

YouTube Annotations: How To

LinkedTV: automatic annotations ...

... and enrichment for hypervideos
[Figure: hypervideo player. Labels: concept in player; facets / properties of concept; content enrichment; Cubism, Expressionism, Fauvism]

LinkedTV Core Ontology
http://data.linkedtv.eu/ontologies/core

Media Fragments and Annotations
nerd:Location Cafe Rick
nerd:Person H. Bogart
nerd:Person I. Bergman
nerd:Location Casablanca

Media Fragment URI 1.0

Chapters

Scenes

Shots

etc…
http://data.linkedtv.eu/media/e2899e7f#t=14,15

Enrichment and Hypervideos
nerd:Location Cafe Rick
nerd:Person H. Bogart
nerd:Person I. Bergman
nerd:Location Casablanca
nerd:Person E. Tierney
nerd:Location China

What is a Named Entity recognition task?

A task that aims to locate and classify the name of a person or an organization, a location, a brand, a product, a numeric expression including time, date, money and percent in a textual document

Example

“ I want to book a room in an hotel located in the heart of Paris, just a stone’s throw from the Eiffel Tower ”
Eric Charton, “Named Entity Detection and Entity Linking in the Context of Semantic Web: Exploring the ambiguity question”

Part of Speech
Token | POS
I | PRP
want | VBP
to | TO
book | VB
a | DT
room | NN
in | IN
… | …
Paris | NNP

NER: What is Paris?
NEL: Which Paris are we talking about?
Giuseppe Rizzo, “Learning with the Web: Structuring data to ease machine understanding”

What is Paris? Type Ambiguity
dbpedia-owl:Asteroid
schema:City
schema:Movie dbpedia-owl:Film

Named Entity Recognition (NER)
Token | POS | NER
I | PRP | O
want | VBP | O
to | TO | O
book | VB | O
a | DT | O
room | NN | O
in | IN | O
… | … | …
Paris | NNP | LOC

What is Paris? Name Ambiguity
Paris, Kentucky
Paris, Maine
Paris, Tennessee
Paris, France
Paris, Idaho
Paris, Ontario

Named Entity Linking (NEL)
Token | POS | NER | NEL
I | PRP | O | O
want | VBP | O | O
to | TO | O | O
book | VB | O | O
a | DT | O | O
room | NN | O | O
in | IN | O | O
… | … | … | …
Paris | NNP | LOC | http://dbpedia.org/resource/Paris
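
The difference between the two steps can be made concrete with a toy example. This sketch is not how GATE, Stanford CoreNLP, or NERD work internally: it assumes a hypothetical hand-written gazetteer (GAZETTEER) with cue words per candidate, and resolves name ambiguity by counting cue words that also occur in the sentence.

```python
# Hypothetical gazetteer: surface form -> candidates (type, URI, cue words).
GAZETTEER = {
    "Paris": [
        ("LOC", "http://dbpedia.org/resource/Paris", {"eiffel", "france"}),
        ("LOC", "http://dbpedia.org/resource/Paris,_Kentucky", {"kentucky"}),
    ],
}


def tag_and_link(tokens):
    """Return one (token, NER label, NEL link) row per token.

    NER answers "what is it?" (the type); NEL answers "which one?" by
    picking the candidate whose cue words overlap most with the sentence.
    Tokens unknown to the gazetteer get "O" in both columns.
    """
    context = {t.lower() for t in tokens}
    rows = []
    for token in tokens:
        candidates = GAZETTEER.get(token)
        if not candidates:
            rows.append((token, "O", "O"))
            continue
        best = max(candidates, key=lambda c: len(c[2] & context))
        rows.append((token, best[0], best[1]))
    return rows


sentence = ("I want to book a room in an hotel located in the heart of Paris "
            "just a stone's throw from the Eiffel Tower").split()
for row in tag_and_link(sentence):
    print(*row)
```

Real systems replace the gazetteer lookup with statistical models and the cue-word overlap with richer context features, but the output has the same shape as the table above.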

NER Tools and Web APIs

Standalone software

GATE

Stanford CoreNLP

Temis

Web APIs
http://nerd.eurecom.fr/

NERD User Interface

Problem: Generating Hypervideos

1) Semantic Graph of MediaResources

2) Main citizen: MediaFragments

MF Annotation

3) Anchors: Named Entities
LOD Cloud

Links to LOD

Hyperlink to other Media Content

Levels of Granularity

Problem: User Perspective

Edward Snowden asks for asylum in Russia (04/07/2013)

In which Russian airport is he exactly?

Problem: Technological Perspective

LSCOM:Face

LSCOM:Building
?

Approach
[Figure: a) Entities from Video; (1) Named Entity; b) Expanded Entities; (2) Filtering and Ranking; b) Re-ranked Entities; List of Relevant Named Entities]

Named Entity Expansion

ontology: http://nerd.eurecom.fr/ontology
REST API: http://nerd.eurecom.fr/api/application.wadl
UI: http://nerd.eurecom.fr
Named Entity Expansion: step 1


Five W’s (*) reduced to four W’s

Who: nerd:Person, nerd:Organization

What: nerd:Event, nerd:Function, nerd:Product

Where: nerd:Location

When: news program metadata

Entity Ranking and Selection:

Ranking according to the extractor’s confidence

Relative confidence falls in the upper quarter interval

Final Query:

Concatenate Labels of the selected entities in Who, What, Where, for a time t
(*) J. Li and L. Fei-Fei. What, where and who? classifying events by scene and object recognition
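
Step 1 can be sketched as a short function that selects entities and builds the query string. Several points are assumptions made for illustration: the function name build_query, the reading of "upper quarter interval" as a confidence of at least 75% of the highest confidence observed, and the sample entities and scores, which are invented. The When dimension is left out because it comes from programme metadata rather than from extracted entities.

```python
# The three entity-based W categories and their NERD types, as on the slide.
W_CATEGORIES = {
    "Who": {"nerd:Person", "nerd:Organization"},
    "What": {"nerd:Event", "nerd:Function", "nerd:Product"},
    "Where": {"nerd:Location"},
}


def build_query(entities, threshold=0.75):
    """Build a search query from (label, nerd_type, confidence) tuples.

    An entity is kept when its confidence relative to the best confidence
    is at least `threshold` (assumed reading of "upper quarter interval").
    Labels are concatenated in Who, What, Where order.
    """
    if not entities:
        return ""
    top = max(conf for _, _, conf in entities)
    selected = []
    for types in W_CATEGORIES.values():
        ranked = sorted(
            (e for e in entities if e[1] in types),
            key=lambda e: e[2],
            reverse=True,
        )
        for label, _, conf in ranked:
            if top > 0 and conf / top >= threshold and label not in selected:
                selected.append(label)
    return " ".join(selected)


# Invented sample data, for illustration only.
sample = [
    ("Edward Snowden", "nerd:Person", 0.9),
    ("Russia", "nerd:Location", 0.8),
    ("asylum", "nerd:Function", 0.3),
    ("Moscow", "nerd:Location", 0.7),
]
print(build_query(sample))
```

The resulting string would then be sent to a document search service to collect the additional documents used in the next step.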


Entity clustering:

Centroid-based approach

Distance metric:

Strict string similarity over the URLs

Jaro-Winkler string distance over labels

Entity re-ranking according to:

Relative frequency in the transcripts

Relative frequency over the additional documents

Average confidence score from the extractors

Output:

Frequent entities are promoted

Entities not disambiguated can be identified with a URL by transitivity

The same happens with erroneous labels

Relevant but non-spotted entities arise (example: N)
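
The clustering and re-ranking step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the first member of each cluster stands in for its centroid, the similarity threshold of 0.9 is arbitrary, the three ranking signals are simply summed, and the function names and sample mentions are invented.

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if not len1 or not len2:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    match1, match2 = [False] * len1, [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not match2[j] and s2[j] == ch:
                match1[i] = match2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    k = half_transpositions = 0
    for i in range(len1):
        if match1[i]:
            while not match2[k]:
                k += 1
            if s1[i] != s2[k]:
                half_transpositions += 1
            k += 1
    t = half_transpositions // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, p=0.1):
    """Jaro-Winkler similarity: Jaro boosted by a common prefix (max 4 chars)."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def cluster_entities(mentions, threshold=0.9):
    """Group mentions by identical URL or by similar label (assumed threshold)."""
    clusters = []
    for m in mentions:
        for c in clusters:
            centroid = c[0]  # simplification: first member acts as centroid
            same_url = m["url"] is not None and m["url"] == centroid["url"]
            similar = jaro_winkler(m["label"].lower(), centroid["label"].lower()) >= threshold
            if same_url or similar:
                c.append(m)
                break
        else:
            clusters.append([m])
    return clusters


def cluster_url(cluster):
    """URL inherited by every member of the cluster (transitivity), if any."""
    return next((m["url"] for m in cluster if m["url"]), None)


def rerank(clusters):
    """Sort clusters by transcript frequency + document frequency + mean confidence."""
    everything = [m for c in clusters for m in c]
    n_t = sum(m["source"] == "transcript" for m in everything) or 1
    n_d = sum(m["source"] == "documents" for m in everything) or 1

    def score(c):
        f_t = sum(m["source"] == "transcript" for m in c) / n_t
        f_d = sum(m["source"] == "documents" for m in c) / n_d
        return f_t + f_d + sum(m["confidence"] for m in c) / len(c)

    return sorted(clusters, key=score, reverse=True)
```

A short usage example with invented mentions: a misspelled, non-disambiguated label joins the cluster of the correctly linked entity and inherits its URL.

```python
mentions = [
    {"label": "Edward Snowden", "url": "http://dbpedia.org/resource/Edward_Snowden",
     "confidence": 0.9, "source": "transcript"},
    {"label": "Edward Snowdon", "url": None, "confidence": 0.5, "source": "documents"},
    {"label": "Moscow", "url": None, "confidence": 0.6, "source": "documents"},
]
ranked = rerank(cluster_entities(mentions))
print([(c[0]["label"], cluster_url(c)) for c in ranked])
```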

Named Entity Expansion: Results


Refining via EiCE

For each pair of results: iteratively generate DBpedia paths using the EiCE engine [1]
Example pair: :Barack_Obama, :Vladimir_Putin
Steps: NE Expansion, DBpedia Connectivity, (2) Ranking
[1] http://github.com/mmlab/eice

Refining via EiCE: Results

Gathering Related Content for Enrichment

Knowledge Graphs (information cards)

Reverse engineering the GKG https://github.com/ahmadassaf/kbe

Web documents

https://www.google.com/cse/

Social Media

Media Collector https://github.com/vuknje/media-server
03/09/2014 - France - Taiwan Multimedia Workshop @ EURECOM - 68

Take Away Summary

Video is a first class citizen on the Web

Annotations: Ontology and API for Media Resources, Open Annotation Data Model

Access: Media Fragments URI

NERD platform for extracting key information from textual resources including video subtitles and microposts

Embrace the Linked Media vision

Publish, re-use, re-purpose and remix media descriptions

Develop links between (part of) media items via their descriptions
