Open Text: Speech recognition in Opencast Matterhorn
Stephen Marquard
Centre for Educational Technology
University of Cape Town
June 2011
Project goals
  • Integrate CMU Sphinx speech recognition engine into Opencast Matterhorn
  • Provide easy mechanism for speaker training
  • Generate automatic transcripts of recorded lectures
  • Allow users to correct and improve the transcripts
  • Use feedback to improve recognition accuracy (of the same, similar or subsequent recordings)
Why is it important?
  • Video and audio are more useful if you can:
      • Navigate them easily
      • Locate relevant recordings from a large set
  • Use by students:
      • Catch up on missed lectures (continuous play or read the transcript)
      • Revision: jump to a particular point or find the lectures which cover topic X
  • On the public web:
      • Discoverability (search indexing)
  • Similar advantages to OCR of slides (but harder)
Why is it difficult?
  • Audio quality can dramatically affect speech recognition accuracy:
      • Echo and reverberation
      • Background noise
      • Microphone location
  • Speaker-independent large-vocabulary continuous speech recognition is the hardest type of ASR
  • Best case: good acoustics, single speaker (limited dialogue), accent match with the acoustic model, limited vocabulary
Prior work in ASR for lectures
  • MIT: Lecture Browser (SUMMIT recognizer)
  • U. Toronto / ePresence: PhD prototype by Cosmin Munteanu (SONIC recognizer)
  • ETH Zurich: Integration of CMU Sphinx with REPLAY by Samir Atitallah
Speech recognition software ecosystem
  [Figure: ecosystem diagram. Recoverable labels: Licensing and patents; Closed; Proprietary; FOSS; Open]
Accounting for context: Language model adaptation
  • Adapt a language model to more closely resemble the target speech
  • Using related text for:
      • Topic modelling (vocabulary, concepts)
      • Style-of-speech modelling
  “ok and um it's quite useful to have a very good diagnostic test of of acute hepatitis um you know to prevent kind of unnecessary um surgery um so hepatitis is really one um example of a cause of acute abdominal pain that doesn't need surgery”
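The slide does not say how the adaptation is carried out. One common technique, shown here only as an illustrative sketch and not as the project's method, is linear interpolation of a general "background" model with a small topic model built from related text. The function names, the toy texts and the weight of 0.3 are all assumptions, and unigram models stand in for the n-gram models a recognizer would really use.

```python
from collections import Counter


def unigram_model(text):
    """Estimate unigram probabilities from whitespace-separated text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(background, topic, weight):
    """Mix two unigram models: (1 - weight) * background + weight * topic."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    vocabulary = set(background) | set(topic)
    return {
        word: (1.0 - weight) * background.get(word, 0.0)
        + weight * topic.get(word, 0.0)
        for word in vocabulary
    }


# Toy texts (assumptions for illustration only).
background = unigram_model("the news today is about the economy and the election")
topic = unigram_model("acute hepatitis is a cause of acute abdominal pain")

# Hypothetical mixing weight; in practice it would be tuned on held-out text.
adapted = interpolate(background, topic, weight=0.3)
```

After interpolation, topic words such as "hepatitis" receive non-zero probability while the result is still a valid probability distribution.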
Using Wikipedia for LM adaptation
  • Goal is to adapt a “standard” LM to be specific to the topic of the audio
  • Start somewhere: title, keywords, text from slides
  • Select a set of documents, adapt the LM
  • Using Wikipedia, select by similarity: identify the set of documents most closely related to the starting point or keywords
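The "select by similarity" step can be sketched as ranking candidate articles by TF-IDF cosine similarity to the seed text. This is a minimal illustration under stated assumptions: the slides do not name the similarity measure used, and the article texts, the seed string and all function names below are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(seed_text, docs, top_n=2):
    """Return the names of the top_n documents most similar to seed_text."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    n_docs = len(counts)
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        name: {t: tf * idf[t] for t, tf in term_counts.items()}
        for name, term_counts in counts.items()
    }
    # Seed terms that occur in no document carry no weight.
    seed_vec = {t: tf * idf[t] for t, tf in Counter(tokenize(seed_text)).items() if t in idf}
    scored = sorted(
        ((cosine(seed_vec, vec), name) for name, vec in vectors.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [name for _, name in scored[:top_n]]


# Toy stand-ins for Wikipedia articles (assumptions for illustration only).
TOY_ARTICLES = {
    "Hepatitis": "Hepatitis is inflammation of the liver. Acute hepatitis can cause abdominal pain.",
    "Appendicitis": "Appendicitis causes acute abdominal pain and usually needs surgery.",
    "French Revolution": "The French Revolution was a period of political change in France.",
}
SEED = "acute hepatitis diagnostic test abdominal pain surgery"

selected = rank_by_similarity(SEED, TOY_ARTICLES, top_n=2)
```

The text of the selected articles would then serve as the adaptation corpus for the language model. A real pipeline would need an index over a Wikipedia dump instead of an in-memory dictionary.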
Baseline performance with Sphinx4 (HUB4 acoustic and language models)
  • Lecture audio and transcripts from Open Yale Courses: http://oyc.yale.edu/
  • Used under CC-BY-NC-SA license
Best-case comparison (30% WER)
  • Transcript: Before launching into Pynchon today, I thought I would just take a few moments to look back over the books that we've read and talk about the visions of language that they have offered us, and also just to reflect for a moment on the relationship imagined between those visions of language and what is happening outside of fiction in what we might call the real world. We started this course talking about Black Boy and the way that a whole world of pressure -- political pressure, racial tension -- pushed on the borders of that work and actually changed its very material form.
  • HUB4 LM: before launching into not pynchon today route just take a few moments to look back cover the books that we've brad and talk about the visions of language that they have offered up and also just to reflect for mounted on the relationship imagine between those visions of language and what is happening outside of fiction in in what we might call the real world we started this course talking about black boy and a weighing bat a whole world of pressure political pressure racial tension pushed on the borders and that work and actually changed its very nature eel for
  • Wikipedia Similarity LM: before launching into not mentioned today really does take a few moments to look back over the books that we've read and talk about the visions of language that they have offered up and also just to reflect for movement on the relationship imagine between those visions of language and what is happening outside of fiction in in what we might call the reel well we started this course talking about black boy and a weighing of that a whole world of pressure political pressure of racial tension pushed on the borders of bad work and actually changed its very nature eel for
Worst-case comparison (61% WER)
  • Transcript: I'm going to talk about the French Revolution. It's hard to do. I'll leave myself about forty-five minutes after I screw around at the beginning. I want to do two things. I want to see the Revolution through the eyes of Maximilien de Robespierre, a member of the Committee of Public Safety -- arguably, with Saint-Just, its most important member. In a way, Jacobin -- he incarnated the French Revolution.
  • HUB4 LM: i'd talk with the french revolution this party do in all the myself will forty-five minutes after throughout beginning i'm in seoul on on i wanted it to do two things unless the revolution through the eyes of maps that ulmus piano member of a treaty of public safety arguably without fascists i'd solicit were not member ah is jacobo out into an away he incarnated death jacobin chapel back he imparted the french revolution
  • Wikipedia Similarity LM: i've talked with the french are loose in this part to do in all the myself low forty five minutes after score of beginning i'm in seoul on bob and i wanted to do two things i want the revolution through the eyes of maps that elvis piano a member of the treaty of public safety are giveaway with that fascists i thought it were not member ah gee i go back into a a way he imparted that chappel been the chapel back he imparted the first revolution
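For reference, word error rate (WER) is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript, using a word-level edit-distance alignment. The sketch below is a minimal illustration of that calculation, not the scoring tool used for the figures above; the function names and the normalisation (lowercasing, stripping punctuation) are assumptions. The example pair is a short fragment from the best-case slide, so its rate differs from the whole-passage figure.

```python
import re


def normalize(text):
    """Lowercase the text and keep only word tokens (drops punctuation)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Dynamic-programming edit distance, keeping one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Fragment of the best-case transcript and the corresponding HUB4 LM output.
reference = "We started this course talking about Black Boy and the way that a whole world of pressure"
hypothesis = "we started this course talking about black boy and a weighing bat a whole world of pressure"

fragment_wer = word_error_rate(reference, hypothesis)
```

Because insertions count as errors, WER can exceed 100% when the recognizer outputs many more words than were spoken.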
Work in progress
  • Identify requirements for recording recognition-quality audio (equipment, acoustics)
  • Implement dynamic language model adaptation
  • Integrate into Opencast Matterhorn workflow
  • Show transcript to users in UI, enable search
  • Allow users to edit / improve transcript
  • Use edits to improve recognition
Other integration possibilities
  • External transcription services (automate the workflow, choice between manual or automatic transcript)
  • External speech recognition services (e.g. nexiwave.com)
Find out more
  • Email me: stephen.marquard@uct.ac.za
  • Follow me on Twitter: http://twitter.com/stephenmarquard
  • Read my blog on open source language modelling and speech recognition: http://trulymadlywordly.blogspot.com
  • CMU Sphinx: http://cmusphinx.sourceforge.net/

Open Text: Speech recognition in Opencast Matterhorn

  • 1. Open Text:Speech recognition in Opencast MatterhornStephen MarquardCentre for Educational TechnologyUniversity of Cape TownJune 2011
  • 2. Project goals Integrate CMU Sphinx speech recognition engine into Opencast Matterhorn Provide easy mechanism for speaker training Generate automatic transcripts of recorded lectures Allow users to correct and improve the transcripts Use feedback to improve recognition accuracy (of the same, similar or subsequent recordings)
  • 3. Why is it important? Video and audio is more useful if you can: Navigate it easily Locate relevant recordings from a large set Use by students: Catch up on missed lectures (continuous play or read the transcript) Revision: jump to a particular point or find the lectures which cover topic X On the public web: Discoverability (search indexing) Similar advantages to OCR recognition of slides (but harder)
  • 4. Why is it difficult? Audio quality can dramatically affect speech recognition accuracy Echo and reverberation Background noise Microphone location Speaker-independent large-vocabulary continuous speech recognition is the hardest type of ASR Best case: good acoustics, single speaker (limited dialogue), accent match with the acoustic model, limited vocabulary.
  • 5. Prior work in ASR for lectures MIT Lecture Browser (SUMMIT recognizer) U. Toronto / ePresence PhD prototype by CosminMunteanu(SONIC recognizer) ETH Zurich Integration of CMU Sphinx with REPLAY by SamirAtitallah
  • 6. Speech recognition software ecosystem Licensing and patents Closed Proprietary FOSS Open
  • 7.
  • 8. Accounting for context:Language model adaptation Adapt a language model to more closely resemble the target speech Using related text for Topic modelling (vocabulary, concepts) Style-of-speech modelling “ok and um it's quite useful to have a very good diagnostic test of of acute hepatitis um you know to prevent kind of unnecessary um surgery um so hepatitis is really one um example of a cause of acute abdominal pain that doesn't need surgery”
  • 9. Using Wikipedia for LM adaptation. Goal: adapt a “standard” LM to be specific to the topic of the audio. Start somewhere: title, keywords, text from slides. Select a set of documents and adapt the LM. Using Wikipedia, select by similarity: identify the set of documents most closely related to the starting point or keywords.
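The "select by similarity" step can be sketched as ranking candidate articles by TF-IDF cosine similarity to the seed text (title, keywords, slide text). The slides do not state which similarity measure is used, so TF-IDF cosine is an assumption here, and the function names and toy articles are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_related(seed_text, articles, top_n=2):
    """Return up to top_n article names most similar to seed_text.

    articles maps an article name to its text. Articles with zero
    similarity to the seed are never returned.
    """
    counts = {name: Counter(tokenize(text)) for name, text in articles.items()}
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    n_docs = len(articles)
    # Smoothed inverse document frequency, always positive.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

    vectors = {
        name: {t: c * idf[t] for t, c in term_counts.items()}
        for name, term_counts in counts.items()
    }
    # Seed terms that occur in no article carry no weight.
    seed = {t: c * idf[t] for t, c in Counter(tokenize(seed_text)).items() if t in idf}

    scored = [(cosine(seed, vec), name) for name, vec in vectors.items()]
    ranked = sorted((s, n) for s, n in scored if s > 0)
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in ranked[:top_n]]


# Toy corpus, invented for illustration only.
articles = {
    "Hepatitis": "hepatitis is an inflammation of the liver often caused by a virus",
    "Surgery": "surgery is a medical specialty that uses operative techniques",
    "Jazz": "jazz is a music genre that originated in new orleans",
}
related = select_related("diagnostic test of acute hepatitis liver", articles)
```

The selected articles would then serve as the related text from which the adapted language model is built.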
  • 10. Baseline performance with Sphinx4 (HUB4 acoustic and language models). Lecture audio and transcripts from Open Yale Courses (http://oyc.yale.edu/), used under a CC-BY-NC-SA license.
  • 11. Best-case comparison (30% WER): Transcript, HUB4 LM, Wikipedia Similarity LM
    Transcript: Before launching into Pynchon today, I thought I would just take a few moments to look back over the books that we've read and talk about the visions of language that they have offered us, and also just to reflect for a moment on the relationship imagined between those visions of language and what is happening outside of fiction in what we might call the real world. We started this course talking about Black Boy and the way that a whole world of pressure -- political pressure, racial tension -- pushed on the borders of that work and actually changed its very material form.
    Recognizer output: before launching into not pynchon today route just take a few moments to look back cover the books that we've brad and talk about the visions of language that they have offered up and also just to reflect for mounted on the relationship imagine between those visions of language and what is happening outside of fiction in in what we might call the real world we started this course talking about black boy and a weighing bat a whole world of pressure political pressure racial tension pushed on the borders and that work and actually changed its very nature eel for
    Recognizer output: before launching into not mentioned today really does take a few moments to look back over the books that we've read and talk about the visions of language that they have offered up and also just to reflect for movement on the relationship imagine between those visions of language and what is happening outside of fiction in in what we might call the reel well we started this course talking about black boy and a weighing of that a whole world of pressure political pressure of racial tension pushed on the borders of bad work and actually changed its very nature eel for
  • 12. Worst-case comparison (61% WER): Transcript, HUB4 LM, Wikipedia Similarity LM
    Recognizer output: i'd talk with the french revolution this party do in all the myself will forty-five minutes after throughout beginning i'm in seoul on on i wanted it to do two things unless the revolution through the eyes of maps that ulmus piano member of a treaty of public safety arguably without fascists i'd solicit were not member ah is jacobo out into an away he incarnated death jacobin chapel back he imparted the french revolution
    Recognizer output: i've talked with the french are loose in this part to do in all the myself low forty five minutes after score of beginning i'm in seoul on bob and i wanted to do two things i want the revolution through the eyes of maps that elvis piano a member of the treaty of public safety are giveaway with that fascists i thought it were not member ah gee i go back into a a way he imparted that chappel been the chapel back he imparted the first revolution
    Transcript: I'm going to talk about the French Revolution. It's hard to do. I'll leave myself about forty-five minutes after I screw around at the beginning. I want to do two things. I want to see the Revolution through the eyes of Maximilien de Robespierre, a member of the Committee of Public Safety -- arguably, with Saint-Just, its most important member. In a way, Jacobin -- he incarnated the French Revolution.
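The WER (word error rate) figures quoted on these slides are conventionally computed as the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below illustrates that definition. The normalization (lowercasing and whitespace splitting only) is a simplifying assumption, and it is not necessarily the scoring procedure used to produce the slide figures.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# "the real world" recognized as "the reel well":
# 2 substitutions over 3 reference words.
wer = word_error_rate("the real world", "the reel well")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.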
  • 13. Work in progress: identify requirements for recording recognition-quality audio (equipment, acoustics); implement dynamic language model adaptation; integrate into the Opencast Matterhorn workflow; show the transcript to users in the UI and enable search; allow users to edit / improve the transcript; use edits to improve recognition.
  • 14. Other integration possibilities: external transcription services (automate the workflow, choice between manual or automatic transcript); external speech recognition services (e.g. nexiwave.com).
  • 15. Find out more. Email me: stephen.marquard@uct.ac.za; follow me on Twitter: http://twitter.com/stephenmarquard; read my blog on open source language modelling and speech recognition: http://trulymadlywordly.blogspot.com; CMU Sphinx: http://cmusphinx.sourceforge.net/