Transcript Alignment Service
Webinar
March 12, 2013
Moderator: Josh Miller
Speakers: Roger Zimmerman
David Zylber
Agenda
• Automatic Alignment vs. Transcription
and Captioning
• Alignment Service Overview
• Best Practices
• Submitting Transcripts & Media Files
• Formatting your Transcripts
• Q&A
Transcript Alignment Service vs.
Transcription and Captioning
• Use the Alignment service when you already
have a transcript
• Both services ultimately give you access to the
same 3Play Media account features and tools.
• Alignment is 100% automated, whereas the
standard service involves human cleanup.
• Turnaround Service Levels
Automatic Alignment Process

1) Re-encode text as ASCII
• MS-Word exports still contain non-ASCII characters
• Direct upload users can see the results
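
The re-encoding step can be approximated before upload, so that the text you submit is already ASCII. The sketch below is only an illustration: the replacement table and the function name `to_ascii` are assumptions, and the slides do not say which mapping the service itself applies.

```python
import unicodedata

# Hypothetical replacement table for punctuation that word processors
# commonly leave behind (curly quotes, dashes, ellipsis, non-breaking space).
PUNCT_MAP = {
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
    "\u2013": "-", "\u2014": "-",
    "\u2026": "...",
    "\u00a0": " ",
}


def to_ascii(text: str) -> str:
    """Return an ASCII-only approximation of `text`."""
    for src, dst in PUNCT_MAP.items():
        text = text.replace(src, dst)
    # Split accented letters into base letter + combining mark,
    # then drop everything that still is not ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("\u201cWe\u2019ll be listed\u2026\u201d"))
```

Characters with no ASCII counterpart are silently dropped here, so it is worth comparing the result against the original before submitting.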
DEMO
FTP Overview
• Create a folder named for_alignment

• Add the media file first to the for_alignment folder
- e.g. Casablanco.mp4

• Then add the plain .TXT transcript to the for_alignment folder
- e.g. Casablanco.txt

• The .TXT file MUST HAVE THE SAME NAME as the media
file

• Batch uploads: first submit all media files and then the
corresponding transcripts.
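
The naming and ordering rules above can be checked locally before starting an FTP session. This is a minimal sketch: the function name `plan_upload` and the list of media extensions are assumptions for illustration, not part of the service.

```python
from pathlib import PurePosixPath

# Assumed set of media extensions; adjust to the formats you actually upload.
MEDIA_EXTS = {".mp4", ".mp3", ".mov", ".wav"}


def plan_upload(filenames):
    """Return (upload_order, unmatched_base_names).

    upload_order lists all matched media files first, then their
    transcripts, mirroring the batch-upload advice on the slide.
    """
    media, transcripts = {}, {}
    for name in filenames:
        path = PurePosixPath(name)
        ext = path.suffix.lower()
        if ext == ".txt":
            transcripts[path.stem] = name
        elif ext in MEDIA_EXTS:
            media[path.stem] = name
    matched = sorted(media.keys() & transcripts.keys())
    unmatched = sorted(media.keys() ^ transcripts.keys())
    order = [media[s] for s in matched] + [transcripts[s] for s in matched]
    return order, unmatched


order, unmatched = plan_upload(["Casablanco.txt", "Casablanco.mp4", "Notes.txt"])
print(order)      # media file first, then its transcript
print(unmatched)  # base names that lack a partner file
```

Base names are compared case-sensitively in this sketch, which is the conservative reading of "MUST HAVE THE SAME NAME".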
Alignment Best Practices
• THE KEY: Text corresponds to audio!
• Common Problems:
-Non-conforming speaker labels (not all caps, hyphens instead of colons)
-Wrapped text becomes paragraphs
-Including instructions, screen directions, scene settings/headers
-Interpretation
-Overlapping speakers
-Audio quality
• Duration: No more than 2 hours per file
• Drag and Drop your transcripts when you can
• Transcripts should be unformatted plain-text files (.TXT)
• Short duration reduces the likelihood of misalignment
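
Some of the common problems listed above can be caught with a quick check of the transcript text. The sketch below is a rough heuristic, not the service's validator: the regular expressions and the function name `lint_transcript` are assumptions, and it only looks for mixed-case speaker labels, hyphenated speaker labels, and lines that consist entirely of a bracketed direction.

```python
import re

# Heuristic patterns (assumptions): a conforming label is ALL CAPS plus a colon.
MIXED_CASE_LABEL = re.compile(r"^[A-Z][a-z]+(?: [A-Z][a-z]+)?:\s")
HYPHEN_LABEL = re.compile(r"^[A-Z]+(?: [A-Z]+)? - ")
DIRECTION = re.compile(r"^\s*[\[(].*[\])]\s*$")


def lint_transcript(text):
    """Return a list of (line_number, message) for suspicious lines."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if DIRECTION.match(line):
            problems.append((number, "stage direction or instruction"))
        elif MIXED_CASE_LABEL.match(line):
            problems.append((number, "speaker label not in all caps"))
        elif HYPHEN_LABEL.match(line):
            problems.append((number, "speaker label uses hyphen instead of colon"))
    return problems


sample = (
    "[INT. CAFE - NIGHT]\n"
    "Rick: Here's looking at you.\n"
    "ILSA - Play it, Sam.\n"
    "SAM: Yes, ma'am.\n"
)
for number, message in lint_transcript(sample):
    print(number, message)
```

The patterns will produce false positives on some ordinary sentences, so treat the output as a list of lines to review rather than a verdict.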
DEMO
Automatic Alignment Process
continued…
2) Infer verbalization from text
• Speaker labels used for adaptation (and replaced with optional pause)
• Punctuation removed (sentences replaced with pause)
• Numerics expanded:
- 10/10/2013 => “ten ten thirteen” OR “October tenth” …
- 107 => “one hundred and seven” OR “one oh seven” …
- 5’3” => “five foot three” OR “five three” …
• Acronyms/abbreviations expanded: “St.”, “ABC”, “NASDAQ”
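
To make the numeric expansion concrete, here is a small sketch that produces alternative spoken forms for integers from 0 to 999, reproducing the "107" example from the slide. The function names and the choice of alternatives are assumptions; the real verbalization rules are certainly broader (dates, heights, ordinals, and so on).

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def two_digits(n):
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])


def number_options(token):
    """Return candidate verbalizations for a 1-3 digit integer string."""
    n = int(token)
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n < 100:
        return [two_digits(n)]
    hundreds, rest = divmod(n, 100)
    if rest == 0:
        return [ONES[hundreds] + " hundred"]
    full = f"{ONES[hundreds]} hundred and {two_digits(rest)}"
    if rest < 10:
        short = f"{ONES[hundreds]} oh {ONES[rest]}"   # "one oh seven"
    else:
        short = f"{ONES[hundreds]} {two_digits(rest)}"  # "one twenty"
    return [full, short]


print(number_options("107"))
```

Each alternative becomes one option for the recognizer to choose from, which is what the "OR" on the slide expresses.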
Automatic Alignment Process
3) Build a “biased” language model (with options):
CEO: “On 10/10/2013, we will be listed on NASDAQ as ABC”
<SPEAKER> on { NULL / this } { ten ten / october tenth } <COMMA>
{ NULL / twenty thirteen / thirteen } { we will / we’ll } be listed
on the nasdaq as a b c <SENTENCE> …
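
One way to read the braces in this example is as slots with alternatives, where NULL means "say nothing". The sketch below enumerates every word sequence the slots allow. It is only an illustration of the option structure: the `SLOTS` list is transcribed from the slide with the pause markers left out, and a real biased language model would be a weighted grammar or n-gram model rather than an explicit list of sentences.

```python
from itertools import product

NULL = ""  # an option that contributes no words

# Slots transcribed from the slide example; <SPEAKER>, <COMMA> and
# <SENTENCE> pause markers are omitted in this sketch.
SLOTS = [
    ["on"],
    [NULL, "this"],
    ["ten ten", "october tenth"],
    [NULL, "twenty thirteen", "thirteen"],
    ["we will", "we'll"],
    ["be listed on the nasdaq as a b c"],
]


def expand(slots):
    """Yield every word sequence permitted by the slots."""
    for choice in product(*slots):
        yield " ".join(part for part in choice if part)


paths = list(expand(SLOTS))
print(len(paths))  # 2 * 2 * 3 * 2 = 24 candidate verbalizations
print(paths[0])
```

The recognizer is biased toward these sequences, so it only has to decide which of the permitted variants was actually spoken.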
Automatic Alignment Process
4) Run ASR with biased LM:
Word      Start  End
ON        1.02   1.05
OCTOBER   1.05   1.32
TENTH     1.32   1.51
WE’LL     1.63   1.76
BE        1.76   1.82
Automatic Alignment Process
5) Re-Align with original text:
ASR output      Original text  Start  End
(none)          CEO:           0.0    1.02
ON              On             1.02   1.05
OCTOBER TENTH   10/10/2013,    1.05   1.51
WE’LL           we             1.63   1.695
WE’LL           will           1.695  1.76
BE              be             1.76   1.82
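
The timings in this example follow a simple pattern: when several recognized words map to one original token, the token spans from the first word's start to the last word's end, and when one recognized word maps to several original tokens, its interval is split evenly. The sketch below reproduces the slide's numbers under that reading. The `ALIGNMENT` table is written by hand here as an assumption; in practice it would come from a sequence alignment between the ASR output and the original text. The speaker label is left out.

```python
# ASR output from step 4: (word, start, end)
ASR = [
    ("ON", 1.02, 1.05),
    ("OCTOBER", 1.05, 1.32),
    ("TENTH", 1.32, 1.51),
    ("WE'LL", 1.63, 1.76),
    ("BE", 1.76, 1.82),
]

# Hypothetical hand-written alignment:
# (original tokens, indices of the ASR words they correspond to)
ALIGNMENT = [
    (["On"], [0]),
    (["10/10/2013,"], [1, 2]),
    (["we", "will"], [3]),
    (["be"], [4]),
]


def transfer_times(asr, alignment):
    """Copy ASR timings onto the original tokens."""
    timed = []
    for tokens, indices in alignment:
        start = asr[indices[0]][1]   # start of the first matched ASR word
        end = asr[indices[-1]][2]    # end of the last matched ASR word
        step = (end - start) / len(tokens)
        for k, token in enumerate(tokens):
            timed.append((
                token,
                round(start + k * step, 3),
                round(start + (k + 1) * step, 3),
            ))
    return timed


for row in transfer_times(ASR, ALIGNMENT):
    print(row)
```

Splitting evenly is the simplest choice that matches the 1.695 boundary between "we" and "will" in the example; other splitting rules would fit a single data point equally well.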
Automatic Alignment Process

6) Fill in gaps in ASR output with all of the original transcript
text in that region.
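
A minimal sketch of gap filling, assuming each original word either has a time from the previous step or does not: every run of untimed words is emitted as one block that spans from the end of the last timed word to the start of the next one. The function name `fill_gaps`, the input shapes, and the `duration` parameter are assumptions for illustration.

```python
def fill_gaps(tokens, times, duration):
    """Return (text, start, end) regions covering every original token.

    tokens:   list of original transcript words
    times:    dict mapping token index -> (start, end) for aligned words
    duration: media length, used if the transcript ends inside a gap
    """
    regions = []
    i = 0
    while i < len(tokens):
        if i in times:
            start, end = times[i]
            regions.append((tokens[i], start, end))
            i += 1
            continue
        # Find the end of this run of unaligned tokens.
        j = i
        while j < len(tokens) and j not in times:
            j += 1
        # The token before a gap is always aligned (or the gap starts at 0).
        start = times[i - 1][1] if i > 0 else 0.0
        end = times[j][0] if j < len(tokens) else duration
        regions.append((" ".join(tokens[i:j]), start, end))
        i = j
    return regions


tokens = ["we", "are", "pleased", "to", "be"]
times = {0: (1.0, 1.2), 4: (2.5, 2.7)}
for region in fill_gaps(tokens, times, duration=3.0):
    print(region)
```

The words inside a gap keep their order and are never dropped; they simply share one coarse time span instead of individual timings.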
Automatic Alignment Process
7) Compute confidence from ASR process plus
number/length of gaps.
- “Audio Quality” bars
Automatic Alignment Process

8) Create all output assets from the aligned transcript, as if it
had been edited.
NEED HELP?
RESOURCES

Knowledge Base

support.3playmedia.com/forums
Contact 3Play Media Support

support@3playmedia.com

Best Practices for Automatic Transcript Alignment