Speech Recognition,
Text-To-Speech,
and Voice Interfaces
By:
Taryne Cahalin
Stephanie Sirico
Christiana Vasquez
Adelphi University - Mobile Learning, Fall 2013
What is Speech
Recognition?
Instead of listening to an automated voice recording and pressing
buttons, a person can speak specific words into a device and issue
commands with the help of a speech recognition program.
The Uses
Individuals With Disabilities – Assists those who have visual
impairment, hand immobility, dyslexia, etc.
Medical Transcription – Reduces delays in writing out
medical transcriptions.
Dictation – Converts spoken words to text in emails or other
documents (also helpful for English Language Learners).
Access Menu Commands – Opens files using voice commands.
Using Dragon Mobile
How does it work?
Speech recognition functions as a pipeline:
The pipeline converts PCM (pulse code modulation) digital audio
from a sound card into recognized speech.
Transforming PCM Digital Audio

The audio arrives as 16,000 PCM values per second, a “wavy line” that repeats while the user speaks.

This information is converted into a form the program can recognize more easily.

The fast Fourier transform identifies the frequency components of a specific sound.

The program can then approximate how our ears distinguish the sound.
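As a rough illustration of this step, the sketch below cuts 16,000-values-per-second audio into the 1/100th-of-a-second frames the deck describes and finds the strongest frequency in one frame. It is a minimal sketch, not the method of any particular product: the names `fft`, `frame_spectrum`, and `dominant_frequency`, the zero-padding to 256 points, and the synthetic 1000 Hz test tone are all assumptions made for the example.

```python
import cmath
import math

SAMPLE_RATE = 16_000            # PCM values per second, as stated on the slide
FRAME_LEN = SAMPLE_RATE // 100  # 160 values = 1/100th of a second
FFT_SIZE = 256                  # assumption: zero-pad each frame to a power of two


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def frame_spectrum(frame):
    """Return the amplitude of each frequency bin for one frame."""
    padded = list(frame) + [0.0] * (FFT_SIZE - len(frame))
    # Only the first half of the bins is kept; for real input the rest mirrors it.
    return [abs(c) for c in fft(padded)[: FFT_SIZE // 2]]


def dominant_frequency(frame):
    """Return the frequency in Hz of the strongest bin in the frame."""
    spectrum = frame_spectrum(frame)
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * SAMPLE_RATE / FFT_SIZE


# One frame of a synthetic 1000 Hz tone standing in for microphone input.
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(FRAME_LEN)]
print(dominant_frequency(tone))
```

The list returned by `frame_spectrum` plays the role of the amplitude graph for that 1/100th of a second: one amplitude per frequency band.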
Transforming PCM Digital Audio
Using the Fast Fourier Transform
The fast Fourier transform analyzes each 1/100th of a second of audio
and converts it into its frequency components.

Each 1/100th of a second produces a graph of amplitude against frequency.
Reference graphs are stored in a database called a “codebook”.
Each sound is matched to the most similar entry in the codebook.
The sound is then given a number that describes it, called the “feature
number”.
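The codebook lookup described above can be sketched as a nearest-neighbor search. This is a minimal sketch under stated assumptions: the four-entry `CODEBOOK`, the four-band spectra, the function name `feature_number`, and the choice of Euclidean distance as the similarity measure are all hypothetical and chosen only to keep the example small.

```python
import math

# Hypothetical codebook: each entry is an amplitude-per-frequency-band profile.
# A real codebook would hold far more entries and far more bands.
CODEBOOK = [
    [0.9, 0.1, 0.0, 0.0],  # entry 0: energy concentrated in the lowest band
    [0.1, 0.8, 0.1, 0.0],  # entry 1: energy in the low-mid band
    [0.0, 0.1, 0.8, 0.1],  # entry 2: energy in the high-mid band
    [0.0, 0.0, 0.2, 0.8],  # entry 3: energy in the highest band
]


def feature_number(spectrum, codebook=CODEBOOK):
    """Return the index of the codebook entry most similar to the spectrum."""
    # Assumption: "most similar" means smallest Euclidean distance.
    return min(range(len(codebook)),
               key=lambda i: math.dist(spectrum, codebook[i]))


# Three made-up frames, each already reduced to four band amplitudes.
frames = [
    [0.85, 0.15, 0.05, 0.0],
    [0.05, 0.2, 0.7, 0.1],
    [0.0, 0.05, 0.25, 0.75],
]
features = [feature_number(frame) for frame in frames]
print(features)
```

Each frame is reduced to a single integer, so a stretch of speech becomes a short sequence of feature numbers that later stages can work with.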
Two Categories

Small Vocabulary/many-users:
• Leaves room for speech disparity (e.g., accents)
• Supports only a limited, preset number of commands

Large Vocabulary/limited-users:
• Best for business settings
• The system is trained to work with a small number of users
• Accuracy increases as the system learns its users
Discrete vs. Continuous Speech
Discrete
• Easier for the program to understand
• Requires a noticeable pause after each word
Continuous
• Allows speaking at conversational speed
• Used in most modern systems
Programs can now recognize accents and pronunciations better. In
earlier programs, accents, pronunciations, speed, and background noise
were all variables that made speech difficult for programs to understand.
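One way to see why discrete speech is easier for a program is that the pauses themselves mark where each word starts and ends. The sketch below is a simplified illustration, not how any specific product works: the function name `split_on_pauses`, the loudness threshold, and the 20-frame minimum pause are assumptions, and each input value stands for the loudness of one 1/100th-of-a-second frame.

```python
def split_on_pauses(frame_energies, threshold=0.1, min_pause_frames=20):
    """Group frames into words separated by long quiet stretches.

    frame_energies holds one loudness value per 1/100th-of-a-second frame.
    threshold and min_pause_frames are illustrative assumptions.
    Returns a list of (start_frame, end_frame_exclusive) pairs.
    """
    words = []
    start = None   # first frame of the word in progress, if any
    quiet_run = 0  # consecutive quiet frames seen inside the current word
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_pause_frames:
                # The pause is long enough: close the word before the quiet run.
                words.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        words.append((start, len(frame_energies) - quiet_run))
    return words


# Discrete speech: two loud bursts separated by a 25-frame pause.
discrete = [0.5] * 30 + [0.0] * 25 + [0.5] * 30
print(split_on_pauses(discrete))

# Continuous speech: the 5-frame gap is too short to count as a pause.
continuous = [0.5] * 30 + [0.0] * 5 + [0.5] * 30
print(split_on_pauses(continuous))
```

With the discrete input the pause yields two separate segments, while the continuous input comes back as one long segment whose word boundaries would have to be found some other way.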
Using Talk – Text to Voice

This app lets you type text and then have the device read it aloud.
In this case, instead of pronouncing Taryne as “Ta-rin”, the device
pronounced it as “Ta-reen”. This is an example of how text-to-speech
programs still need some work, here because of the emphasis placed on
a syllable. The program presumably had no stored pronunciation for
Taryne, so it could not pronounce her name correctly.
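A text-to-speech program commonly looks a word up in a pronunciation dictionary and falls back on general spelling rules when the word is missing, which is one plausible explanation for the result above. The sketch below only illustrates that idea: the `LEXICON` entries, the `ENDING_RULES`, the respellings, and the function name `pronounce` are all hypothetical and are not taken from the app shown in the deck.

```python
# Hypothetical pronunciation dictionary; the respellings are illustrative only.
LEXICON = {
    "speech": "speech",
    "marine": "muh-REEN",
}

# Hypothetical fallback rules mapping a written ending to a spoken ending.
ENDING_RULES = [
    ("yne", "een"),
    ("ine", "een"),
]


def pronounce(word, lexicon=LEXICON):
    """Return (respelling, source), preferring the dictionary over the rules."""
    key = word.lower()
    if key in lexicon:
        return lexicon[key], "lexicon"
    for ending, spoken in ENDING_RULES:
        if key.endswith(ending):
            # Guess the pronunciation from the spelling of the ending.
            return key[: -len(ending)] + "-" + spoken, "rule"
    return key, "spelled as written"


# The name is not in the dictionary, so a general rule produces the guess.
print(pronounce("Taryne"))

# Adding the name to the dictionary overrides the rule-based guess.
custom = dict(LEXICON, taryne="TA-rin")
print(pronounce("Taryne", custom))
```

Under these assumptions, the fix for a mispronounced name is a new dictionary entry rather than a change to the general rules.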
The Future of Assistive Technology
in Schools
Students who need assistance with their writing skills because
their oral skills are stronger.
Students who were absent for a class, have poor memory, or
need assistance hearing the lesson.
Students who need assistance during Guided Reading.

Students who are English Language Learners.

Students with visual or hearing impairments, or with learning
disabilities that affect reading, spelling, or writing.
