Towards Collaborative Annotation for Video Accessibility
Pierre-Antoine Champin, Benoît Encelle, Magali O. Beldame, Yannick Prié,
Nick Evans and Raphaël Troncy <raphael.troncy@eurecom.fr>
The consortium
 Dailymotion (Paris, FR) : video sharing website
    Promotes HTML 5 using the video tag, http://openvideo.dailymotion.com/
 LIRIS (Lyon, FR) : CS research group
    Silex Team: expertise in semantic web, annotation models, video annotation
     and HCI for disabled people
 EURECOM (Sophia Antipolis, FR) : research center in
  communications systems
    Multimedia team: expertise in multimedia analysis (speaker
     diarization/recognition, speech recognition) and semantic web
 INS HEA + school (Lyon, FR)
    Experiences in physical disabilities: blindness, visual impairment, deafness
     and hearing loss
    Blind and deaf high-school students



     26/04/2010 -   Towards Collaborative Annotations for Video Accessibility - W4A 2010, Raleigh, USA   -2
Goals and Motivations
 What is required to make video accessible on the Web?
 How to increase the number of accessible videos?
 Technologies:
    Annotating: automatic (speech transcription) and manual (social
     collaborative annotation tool)
    Addressing: pointing to, retrieving, transmitting only parts of media
    Rendering: video visualization for the impaired, Braille output

 Expected benefits for:
    disabled people, getting better access to video
    video provider, reaching a wider audience
    the Web in general, using semantic annotations



Accessibility Features for Visually
Impaired and Blind People

Annotation timeline (figure), one track per annotation type:
    Man's actions: Put on his shoes; Walk in the street
    Son's actions: Look at his mother
    Characters: The mother, her son; The son, the man; The man and his friend
    Scenery: In the shop; In the street

The multimodal presentation of annotations depends on the video context and user preferences.

Output channels: audio track, auditory icons, audio description, Braille

Accessibility Features for Deaf People


Annotation timeline (figure), one track per annotation type:
    Mother's dialogues: How are you?
    Son's dialogues: Hi mom; Fine and you?
    Sound: Car horn

The presentation of annotations depends on the video context and user preferences.

Output channels: video track, subtitles, surtitles
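
As an illustration of how dialogue and sound annotations like those on this slide could be rendered as subtitles, here is a minimal Python sketch that serializes timed annotations to WebVTT. The slides do not name a subtitle format, so WebVTT is an assumption, and the `Annotation` class, the `to_webvtt` function, and all timings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical timed annotation; the field names are assumptions."""
    track: str    # e.g. "Mother", "Son", "Sound"
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, hh:mm:ss.mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def to_webvtt(annotations) -> str:
    """Serialize annotations as WebVTT cues, ordered by start time."""
    lines = ["WEBVTT", ""]
    for a in sorted(annotations, key=lambda a: a.start):
        lines.append(f"{timestamp(a.start)} --> {timestamp(a.end)}")
        if a.track == "Sound":
            # Non-speech sounds are shown in brackets (a common captioning habit).
            lines.append(f"[{a.text}]")
        else:
            # WebVTT voice span naming the speaker.
            lines.append(f"<v {a.track}>{a.text}")
        lines.append("")
    return "\n".join(lines)
```

A user-preference layer could then decide whether a given track is shown as subtitles, surtitles, or not at all; that selection logic is not sketched here.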


Producing Video Annotations

 Automatic annotations
    Speaker diarization: who spoke and when?
    Speech recognition: transcription
 Social annotations
    Annotation corrections, enhancement
    Audio description (for visually impaired)

Annotations after the automatic step (figure):
    Mother: How are you?
    Son: Ho mom; Fine

Annotations after the social step (figure):
    Mother: How are you?
    Son: Hi mom; Fine and you?
    Sound: Car horn
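
The slide appears to contrast an imperfect automatic transcript ("Ho mom", "Fine") with its socially corrected and enriched version ("Hi mom", "Fine and you ?", plus a "Car horn" sound). A minimal Python sketch of that merge step follows. The data model, the `apply_corrections` function, the identifiers, and the timings are all hypothetical and do not describe the ACAV implementation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record; the field names are assumptions."""
    id: str
    track: str
    start: float               # seconds
    end: float                 # seconds
    text: str
    source: str = "automatic"  # "automatic" or "social"


def apply_corrections(automatic, corrected_text, additions=()):
    """Overlay user corrections on automatic annotations.

    corrected_text maps an annotation id to its corrected text;
    additions are new annotations contributed by users.
    Returns a new list ordered by start time.
    """
    merged = [
        replace(a, text=corrected_text[a.id], source="social")
        if a.id in corrected_text else a
        for a in automatic
    ]
    merged.extend(additions)
    return sorted(merged, key=lambda a: a.start)
```

Keeping the `source` field makes it possible to tell machine output from human contributions later, for instance when deciding which annotations still need review.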




The Advene prototype

Screenshot (figure) with labelled parts: Braille emulation; rendering views; enriched media player; timeline with typed annotations




Preliminary study (1/2)
 Semi-structured interviews with blind users (n=2)
    Participant’s habits when watching programs with audio description
    Audio description process
    Multimodal presentations of descriptions

 Requirements:
    R1: generate additional descriptions and provide unobtrusive access
     to descriptions (tactile access for blind Braille readers)
    R2: descriptions at various levels of granularity and verbosity
    R3: use system’s multimodal output to provide two or more
     descriptions (e.g. speech synthesis and Braille display)




Preliminary study (2/2)
 Goal: see whether we can use auditory icons to convey
  the rhythm of the editing of a movie to blind users
    e.g.: sound of a locomotive arriving from the right to convey the
     concept of a tracking shot (travelling) from right to left

 Experiment and questionnaires (n=16+9)
    Viewing with headsets of 5 min of Ratatouille,
     http://www.imdb.com/title/tt0382932/

 Results:
    Rhythm and movie dynamics are better perceived
    Auditory icons are useful, but they must be few (5 max) and very
     different from the main soundtrack of the movie
          Editing cues: change of scenes, camera movement, flashback (e.g. NCIS)
          Audio zoom (e.g. Survivor)


ACAV Architecture

Benchmarking: Sphinx, HTK, Julius




Media Fragments URI

Provide URI-based mechanisms for uniquely identifying fragments for media objects on the Web, such as video, audio, and images.


Photo credit: Robert Freund




Media Fragments Processing

http://www.example.com/video.ogv#t=10,20
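
To make the temporal fragment in this URI concrete, the following Python sketch extracts the start and end times from the `t` dimension of a media fragment. It is a simplified illustration that handles only plain seconds and colon-separated clock values, optionally prefixed with `npt:`, and not the full Media Fragments URI syntax. The function names are hypothetical, and taking the last `t` value when several are present is a choice made for this sketch.

```python
from urllib.parse import parse_qs, urlsplit


def parse_npt(value: str) -> float:
    """Convert 'ss', 'mm:ss' or 'hh:mm:ss' (fractions allowed) to seconds."""
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def temporal_fragment(uri: str):
    """Return (start, end) in seconds for a '#t=start,end' fragment.

    start defaults to 0.0 and end to None (play to the end of the media).
    Returns None when the URI has no 't' dimension.
    """
    params = parse_qs(urlsplit(uri).fragment)
    if "t" not in params:
        return None
    value = params["t"][-1]        # this sketch keeps the last 't' value
    if value.startswith("npt:"):
        value = value[len("npt:"):]
    start, _, end = value.partition(",")
    return (parse_npt(start) if start else 0.0,
            parse_npt(end) if end else None)


print(temporal_fragment("http://www.example.com/video.ogv#t=10,20"))
# (10.0, 20.0)
```

A client could use the resulting interval to seek locally, or to request only the matching part of the media from a server that supports fragment retrieval.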




Conclusion

 ACAV will bring:
   Dedicated annotation schemas for video accessibility
   Social network model for video annotations
   Web integration of state of the art speech technologies
   GUI models for authoring and rendering video
    annotations
   Media Fragments reference implementation
   Open source Braille plugin for most used Web browsers


                                                 http://www.acavideo.fr/


Towards Collaborative Annotation for Video Accessibility

  • 1. Towards Collaborative Annotation for Video Accessibility. Pierre-Antoine Champin, Benoît Encelle, Magali O. Beldame, Yannick Prié, Nick Evans and Raphaël Troncy <raphael.troncy@eurecom.fr>
  • 2. The consortium
      • Dailymotion (Paris, FR): video sharing website
          • Promotes HTML 5 using the video tag, http://openvideo.dailymotion.com/
      • LIRIS (Lyon, FR): CS research group
          • Silex Team: expertise in semantic web, annotation models, video annotation and HCI for disabled people
      • EURECOM (Sophia Antipolis, FR): research center in communications systems
          • Multimedia team: expertise in multimedia analysis (speaker diarization/recognition, speech recognition) and semantic web
      • INS HEA + school (Lyon, FR)
          • Experience with physical disabilities: blindness, visual impairment, deafness and hearing loss
          • Blind and deaf high-school students
      • 26/04/2010 - Towards Collaborative Annotations for Video Accessibility - W4A 2010, Raleigh, USA
  • 3. Goals and Motivations
      • What is required to make video accessible on the Web?
      • How to increase the number of accessible videos?
      • Technologies:
          • Annotating: automatic (speech transcription) and manual (social collaborative annotation tool)
          • Addressing: pointing to, retrieving, transmitting only parts of media
          • Rendering: video visualization for the impaired, Braille output
      • Expected benefits for:
          • disabled people, getting better access to video
          • video providers, reaching a wider audience
          • the Web in general, using semantic annotations
  • 4. Accessibility Features for Visually Impaired and Blind People
      • Annotation tracks on the video timeline:
          • Man's actions: puts on his shoes; walks in the street
          • Son's actions: looks at his mother
          • Characters: the mother, her son; the son, the man; the man and his friend
          • Scenery: in the shop; in the street
      • Multimodal presentation of annotations: depends on video context and user preferences
          • Output channels: audio track, auditory icons, audio description, Braille
  • 5. Accessibility Features for Deaf People
      • Annotation tracks on the video timeline:
          • Mother's dialogues: "How are you?"
          • Son's dialogues: "Hi mom"; "Fine and you?"
          • Sound: car horn
      • Presentation of annotations: depends on video context and user preferences
          • Output channels: video track, subtitles, surtitles
  • 6. Producing Video Annotations
      • Automatic annotations
          • Speaker diarization: who spoke and when?
          • Speech recognition: transcription
          • Resulting annotations: Mother: "How are you?"; Son: "Ho mom", "Fine"
      • Social annotations
          • Annotation corrections, enhancement
          • Audio description (for visually impaired)
          • Resulting annotations: Mother: "How are you?"; Son: "Hi mom", "Fine and you?"; Sound: car horn
  • 7. The Advene prototype
      • Braille rendering emulation views
      • Enriched media player
      • Timeline with typed annotations
  • 8. Preliminary study (1/2)
      • Semi-structured interviews with blind users (n=2)
          • Participants' habits when watching programs with audio description
          • Audio description process
          • Multimodal presentations of descriptions
      • Requirements:
          • R1: generate additional descriptions and provide unobtrusive access to descriptions (tactile access for blind Braille readers)
          • R2: descriptions at various levels of granularity and verbosity
          • R3: use the system's multimodal output to provide two or more descriptions (e.g. speech synthesis and Braille display)
  • 9. Preliminary study (2/2)
      • Goal: see whether auditory icons can convey the rhythm of a movie's editing to blind users
          • e.g. the sound of a locomotive arriving from the right to convey a tracking shot from right to left
      • Experiment and questionnaires (n=16+9)
          • Viewing 5 min of Ratatouille with headsets, http://www.imdb.com/title/tt0382932/
      • Results:
          • Rhythm and movie dynamics better perceived
          • Auditory icons are useful, but must be limited in number (5 max) and very different from the main soundtrack of the movie
          • Editing cues: change of scenes, camera movement, flashback (e.g. NCIS)
          • Audio zoom (e.g. Survivor)
  • 10. ACAV Architecture (diagram). Benchmarking: Sphinx, HTK, Julius
  • 11. Media Fragments URI: provide URI-based mechanisms for uniquely identifying fragments of media objects on the Web, such as video, audio, and images. Photo credit: Robert Freund
  • 12. Media Fragments Processing: http://www.example.com/video.ogv#t=10,20
  • 13. Conclusion
      • ACAV will bring:
          • Dedicated annotation schemas for video accessibility
          • Social network model for video annotations
          • Web integration of state-of-the-art speech technologies
          • GUI models for authoring and rendering video annotations
          • Media Fragments reference implementation
          • Open source Braille plugin for the most used Web browsers
      • http://www.acavideo.fr/
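
The example URI on slide 12, http://www.example.com/video.ogv#t=10,20, addresses the part of the video between seconds 10 and 20. As a minimal sketch of how a client might read such a temporal fragment, the Python below extracts the start and end times. It is not the ACAV reference implementation: the function name `parse_temporal_fragment` is hypothetical, and it only handles times given as plain seconds (optionally prefixed with `npt:`), not the clock formats such as hh:mm:ss.

```python
from urllib.parse import urlsplit, parse_qs


def parse_temporal_fragment(uri):
    """Return (start, end) in seconds for a '#t=start,end' fragment.

    Hypothetical helper for illustration only. Returns None when the URI
    has no usable temporal fragment. 'end' is None when it is omitted.
    """
    # The fragment is the part after '#', e.g. 't=10,20'.
    fragment = urlsplit(uri).fragment
    # Fragments use name=value pairs separated by '&', like query strings.
    values = parse_qs(fragment).get("t")
    if not values:
        return None
    value = values[-1]
    # Assumption: only plain seconds are supported, with an optional
    # 'npt:' (normal play time) prefix.
    if value.startswith("npt:"):
        value = value[len("npt:"):]
    start_text, _, end_text = value.partition(",")
    try:
        start = float(start_text) if start_text else 0.0
        end = float(end_text) if end_text else None
    except ValueError:
        return None
    # An interval that ends at or before its start is treated as unusable.
    if end is not None and end <= start:
        return None
    return (start, end)


if __name__ == "__main__":
    print(parse_temporal_fragment("http://www.example.com/video.ogv#t=10,20"))
```

A player or server could then use the returned pair to seek to the start time and stop at the end time, or to retrieve only the corresponding part of the media.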