Best Practices for Automatic Transcript Alignment


If you already have transcripts for your video or audio files, automatic transcript alignment is the fastest and least expensive way to create captions and use interactive video plugins. This webinar covers tips and best practices for using our automatic transcript alignment service.

Watch this webinar to learn about:
- How the technology works
- Uploading video/audio files
- Preparing a transcript for upload
- Transcript synchronization accuracy
- Using captions & interactive transcripts
- Editing transcripts & captions post processing
- Working with lecture capture and video platforms

Presenters:
Josh Miller (Moderator), Co-Founder | 3Play Media
David Zylber, Manager of Customer Happiness | 3Play Media
Roger Zimmerman, VP of Research & Development | 3Play Media


Transcript Alignment Service
Webinar
March 12, 2013
Moderator: Josh Miller
Speakers: Roger Zimmerman
David Zylber

Agenda
• Automatic Alignment vs. Transcription
and Captioning
• Alignment Service Overview
• Best Practices
• Submitting Transcripts & Media Files
• Formatting your Transcripts
• Q&A

Transcript Alignment Service vs.
Transcription and Captioning
• Use the Alignment service when you already
have a transcript
• Both services ultimately give you access to the
same 3Play Media account features and tools.
• Alignment is 100% automated, whereas the
standard service involves human cleanup.
• Turnaround Service Levels

Automatic Alignment Process

1) Re-encode text as ASCII
• MS-Word exports still contain non-ASCII characters
• Direct upload users can see the results
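
The re-encoding step can be approximated before upload. The following is a minimal sketch, assuming that curly quotes, dashes and similar characters left by a word processor should be mapped to plain equivalents. The replacement table and the function name `to_ascii` are illustrative assumptions, not the service's actual implementation.

```python
import unicodedata

# Hypothetical replacement table for characters commonly left by word processors.
REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2026": "...",                # ellipsis
    "\u00a0": " ",                  # non-breaking space
}

def to_ascii(text: str) -> str:
    """Map common typographic characters to ASCII, then drop whatever remains."""
    for src, dst in REPLACEMENTS.items():
        text = text.replace(src, dst)
    # Decompose accented letters into base letter plus combining mark,
    # then discard every character that is still outside ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

Running a transcript through a function like this and comparing the result with the original is a quick way to see which characters would change.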

FTP Overview
• Create a folder named for_alignment

• Add the media file first to the for_alignment folder
- e.g. Casablanco.mp4

• Then add the plain .TXT transcript to the for_alignment folder
- e.g. Casablanco.txt

• The .TXT file MUST HAVE THE SAME NAME as the media
file

• Batch uploads: first submit all media files and then the
corresponding transcripts.
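
The naming and ordering rules above can be checked locally before anything is sent. This is a sketch only: the function name `plan_uploads` is hypothetical, any file that does not end in .txt is assumed to be a media file, and the actual FTP transfer is left out.

```python
from pathlib import PurePosixPath

def plan_uploads(filenames):
    """Return filenames in upload order: all media files first, then transcripts.

    Raises ValueError if a media file and a transcript do not pair up by base name.
    """
    # Assumption: anything that is not a .txt file is treated as media.
    transcripts = {PurePosixPath(f).stem: f for f in filenames if f.lower().endswith(".txt")}
    media = {PurePosixPath(f).stem: f for f in filenames if not f.lower().endswith(".txt")}
    # Base names present on only one side have no partner.
    unmatched = sorted(set(transcripts) ^ set(media))
    if unmatched:
        raise ValueError(f"no matching pair for: {', '.join(unmatched)}")
    names = sorted(media)
    return [media[n] for n in names] + [transcripts[n] for n in names]
```

For example, `plan_uploads(["Casablanco.txt", "Casablanco.mp4"])` puts the .mp4 ahead of the .txt, matching the order described on the slide.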

Alignment Best Practices
• THE KEY: Text corresponds to audio!
• Common Problems:
-Non-conforming speaker labels (not all caps, hyphens instead of colons)
-Wrapped text becomes paragraphs
-Including instructions, screen directions, scene settings/headers
-Interpretation
-Overlapping speakers
-Audio quality
• Duration: No more than 2 hours per file
• Drag and Drop your transcripts when you can
• Transcripts should be unformatted plain text files (.TXT)
• Short duration reduces the likelihood of misalignment
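
Some of the common problems listed above can be spotted with a simple pre-upload check. The sketch below is a rough heuristic, not a rule set published by 3Play Media: it assumes a conforming speaker label is an all-caps name followed by a colon, and it treats bracketed or parenthesized text as a possible screen direction. The function name `lint_transcript` and both patterns are illustrative.

```python
import re

# Heuristic: a short name at the start of a line followed by ":" or "-" and more text.
LABEL = re.compile(r"^([A-Za-z][A-Za-z .']{0,30}?)\s*(:|-)\s+\S")
# Heuristic: anything in [brackets] or (parentheses) may be a direction, not speech.
DIRECTION = re.compile(r"\[[^\]]*\]|\([^)]*\)")

def lint_transcript(text):
    """Return (line_number, message) pairs for lines that look problematic."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = LABEL.match(line)
        if match:
            name, separator = match.groups()
            if separator == "-":
                problems.append((number, "speaker label uses a hyphen, not a colon"))
            elif name != name.upper():
                problems.append((number, "speaker label is not all caps"))
        if DIRECTION.search(line):
            problems.append((number, "possible screen direction or instruction"))
    return problems
```

A check like this will produce false positives (for instance on a sentence that starts with "Note:"), so its output is a list of lines to review by hand rather than a verdict.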

Automatic Alignment Process
continued…
2) Infer verbalization from text
• Speaker labels used for adaptation (and replaced with optional pause)
• Punctuation removed (sentences replaced with pause)
• Numerics expanded:
  - 10/10/2013 => “ten ten thirteen” OR “October tenth” …
  - 107 => “one hundred and seven” OR “one oh seven” …
  - 5’3” => “five foot three” OR “five three” …
• Acronyms/abbreviations expanded: “St.”, “ABC”, “NASDAQ”
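
To illustrate the idea of numeric expansion, here is a small sketch that produces two candidate spoken forms for a three-digit number, mirroring the 107 example. The function names and the choice of exactly two alternatives are assumptions for illustration; the slides do not describe the actual expansion rules.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def two_digits(n):
    """Spell out an integer from 0 to 99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"

def verbalizations(n):
    """Return candidate spoken forms for an integer from 100 to 999."""
    if not 100 <= n <= 999:
        raise ValueError("this sketch only handles three-digit numbers")
    hundreds, rest = divmod(n, 100)
    if rest == 0:
        return [f"{ONES[hundreds]} hundred"]
    formal = f"{ONES[hundreds]} hundred and {two_digits(rest)}"
    # Casual reading: "one oh seven" for 107, "one twenty" for 120.
    if rest < 10:
        casual = f"{ONES[hundreds]} oh {ONES[rest]}"
    else:
        casual = f"{ONES[hundreds]} {two_digits(rest)}"
    return [formal, casual]
```

Both forms are kept as options because the text alone does not say which one the speaker used; the audio decides in a later step.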

Automatic Alignment Process
3) Build a “biased” language model (with options):
CEO: “On 10/10/2013, we will be listed on NASDAQ as ABC”
<SPEAKER> on { NULL / this } { ten ten / october tenth } <COMMA>
{ NULL / twenty thirteen / thirteen } { we will / we’ll } be listed
on the nasdaq as a b c <SENTENCE> …
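
The braces in the example can be read as slots with alternatives. As a sketch of what those options amount to, the code below enumerates every word sequence the slots allow. The `SLOTS` structure and the `expand` function are illustrative; the <SPEAKER>, <COMMA> and <SENTENCE> markers are left out for brevity, and a real recognizer would more likely compile the options into a grammar or language model than list them out.

```python
from itertools import product

# Each slot lists the alternatives from the slide; None stands for NULL (nothing spoken).
SLOTS = [
    ["on"],
    [None, "this"],
    ["ten ten", "october tenth"],
    [None, "twenty thirteen", "thirteen"],
    ["we will", "we'll"],
    ["be listed on the nasdaq as a b c"],
]

def expand(slots):
    """Yield every word sequence the slots allow, skipping NULL choices."""
    for choice in product(*slots):
        yield " ".join(part for part in choice if part is not None)

paths = list(expand(SLOTS))
```

Even this one sentence yields 2 x 2 x 3 x 2 = 24 possible readings, which is why the model is "biased" toward the transcript rather than fixed to a single word sequence.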

Automatic Alignment Process
4) Run ASR with biased LM:

Word      Start   End
ON        1.02    1.05
OCTOBER   1.05    1.32
TENTH     1.32    1.51
WE’LL     1.63    1.76
BE        1.76    1.82

Automatic Alignment Process
5) Re-Align with original text:

ASR output      Original text   Start   End
(none)          CEO:            0.0     1.02
ON              On              1.02    1.05
OCTOBER TENTH   10/10/2013,     1.05    1.51
WE’LL           we              1.63    1.695
WE’LL           will            1.695   1.76
BE              be              1.76    1.82
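
The timing arithmetic in this step can be sketched as follows. The grouping of original tokens with ASR words is taken as given input here (finding that grouping is the hard part and is not shown), and the even split of one ASR word across several original tokens is an assumption that happens to reproduce the 1.695 boundary between "we" and "will". The function name `token_times` is hypothetical, and the speaker label is omitted.

```python
def token_times(groups):
    """Assign start and end times to original tokens.

    groups is a list of (original_tokens, asr_spans) pairs, where asr_spans
    is a list of (start, end) times for the ASR words covering those tokens.
    """
    timed = []
    for tokens, spans in groups:
        # Several ASR words for one token: take the first start and the last end.
        start, end = spans[0][0], spans[-1][1]
        # One ASR word for several tokens: divide the span evenly (assumption).
        step = (end - start) / len(tokens)
        for i, token in enumerate(tokens):
            timed.append((token,
                          round(start + i * step, 3),
                          round(start + (i + 1) * step, 3)))
    return timed

groups = [
    (["On"], [(1.02, 1.05)]),                       # ON
    (["10/10/2013,"], [(1.05, 1.32), (1.32, 1.51)]),  # OCTOBER TENTH
    (["we", "will"], [(1.63, 1.76)]),               # WE'LL
    (["be"], [(1.76, 1.82)]),                       # BE
]
timed = token_times(groups)
```

With the numbers from the slides, "10/10/2013," inherits the combined span of OCTOBER and TENTH, while "we" and "will" each get half of the span recognized as WE'LL.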

Automatic Alignment Process

6) Fill in gaps in ASR output with all of the original transcript
text in that region.
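
One way to picture gap filling is shown below. This is a sketch under stated assumptions: the slides say only that the original text is placed in the gap, so the choice to merge each unmatched run into a single block spanning the time between its matched neighbors is a guess, and the function name `fill_gaps` and the sample times after 1.82 are made up for illustration.

```python
def fill_gaps(tokens, total_duration):
    """Merge runs of unmatched tokens into blocks that span the surrounding gap.

    tokens is a list of (text, start, end); start and end are None where
    the ASR output had no match for that token.
    """
    blocks = []
    i = 0
    while i < len(tokens):
        text, start, end = tokens[i]
        if start is not None:
            blocks.append((text, start, end))
            i += 1
            continue
        # Find the end of this run of unmatched tokens.
        j = i
        while j < len(tokens) and tokens[j][1] is None:
            j += 1
        # The gap runs from the previous block's end to the next matched start.
        gap_start = blocks[-1][2] if blocks else 0.0
        gap_end = tokens[j][1] if j < len(tokens) else total_duration
        blocks.append((" ".join(t[0] for t in tokens[i:j]), gap_start, gap_end))
        i = j
    return blocks
```

The longer and more frequent these filled blocks are, the less precise the word timings in that region, which ties into the confidence measure in the next step.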

Automatic Alignment Process
7) Compute confidence from ASR process plus
number/length of gaps.
• “Audio Quality” bars

Automatic Alignment Process

8) Create all output assets from the aligned transcript, as if it
had been edited.

NEED HELP?
RESOURCES

Knowledge Base

support.3playmedia.com/forums
Contact 3Play Media Support

support@3playmedia.com

