Automatic Speech Recognition
OUTLINE
Multilayer structure of speech production:
Layers, from top to bottom: Pragmatic Layer, Semantic Layer, Syntactic Layer, Prosodic/Phonetic Layer, Acoustic Layer
Example utterance: [I] [would] [like] [to] [book] [a] [flight] [from] [Rome] [to] [London] [tomorrow] [morning]
Example word: [book], with phonetic transcription [b/uh/k]
What is Speech Recognition?
Capabilities of ASR:
Uses and Applications
A Timeline & History of Voice Recognition Software
1995: Dragon released discrete-word, dictation-level speech recognition software. It was the first time dictation speech and voice recognition technology was available to consumers.
1984: SpeechWorks, the leading provider of over-the-telephone automated speech recognition (ASR) solutions, was founded.
1982: Dragon Systems was founded.
1971: DARPA established the Speech Understanding Research (SUR) program, funded with $3 million per year of government money for 5 years. It was the largest speech recognition project ever.
Early 1970s: The HMM approach to speech and voice recognition was invented by Lenny Baum of Princeton University.
1936: AT&T's Bell Labs produced the first electronic speech synthesizer, called the Voder.
Timeline (continued)
2003: ScanSoft, Inc. is presently the world leader in speech recognition technology in the commercial market. ScanSoft ships Dragon NaturallySpeaking 7 Medical, which lowers healthcare costs through highly accurate speech recognition.
2000: Lernout & Hauspie acquired Dragon Systems for approximately $460 million.
1998: Microsoft invested $45 million to be able to use speech and voice recognition technology in its systems.
1997: Dragon introduced "Naturally Speaking", the first "continuous speech" dictation software available.
The Structure of an ASR System:
[Figure: Functional scheme of an ASR system. Blocks: Signal Interface, Feature Extraction, Recognition, Training (HMM), Databases. Signal labels: speech samples, X, Y, S, W*.]
Speech Database:
Transcription of speech:
[Figure: Segmentation and labeling example]
Many databases are distributed by the Linguistic Data Consortium (www.ldc.upenn.edu).
Speech Signal Analysis
Feature Extraction for ASR: the aim is to extract voice features that distinguish the different phonemes of a language.
MFCC extraction:
[Figure: processing chain. Speech signal $x(n)$ -> Pre-emphasis -> $x'(n)$ -> Window -> $x_t(n)$ -> DFT -> $X_t(k)$ -> Mel filter banks -> $Y_t(m)$ -> $\log(|\cdot|^2)$ -> IDFT -> $y_t^{(m)}(k)$ (MFCC)]
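The processing chain on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the presenter's implementation: the function names (`hz_to_mel`, `mel_filterbank`, `mfcc`) and every parameter value (16 kHz sampling, 25 ms frames, 26 filters, 13 coefficients, pre-emphasis factor 0.97) are assumptions chosen because they are common defaults. It also follows the widespread variant that squares the spectrum before the mel filter bank and uses a DCT as the real-valued form of the final IDFT step.

```python
import numpy as np

def hz_to_mel(f):
    # A common Hz -> mel conversion formula (assumed, not from the slides)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with centers spaced evenly on the mel scale
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):          # rising edge
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling edge
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13, alpha=0.97):
    # 1. Pre-emphasis: x'(n) = x(n) - alpha * x(n-1)
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # 2. Framing and Hamming window: x_t(n)
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # 3. DFT and power spectrum: |X_t(k)|^2
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # 4. Mel filter bank energies: Y_t(m)
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    # 5. Log compression (small constant avoids log(0))
    log_energies = np.log(energies + 1e-10)
    # 6. DCT-II over the filter index, keeping the first n_ceps coefficients
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    return log_energies @ basis.T
```

Calling `mfcc(x)` on a one-second, 16 kHz signal `x` returns one 13-dimensional feature vector per 10 ms hop; these vectors play the role of the feature sequence Y passed on to training and recognition.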
Spectral Analysis:
Speech waveform of a phoneme "e"
[Figure panels: After pre-emphasis and Hamming windowing; Power spectrum; MFCC]
Training and Recognition:
Deterministic vs. Stochastic framework:
Implementing HMM for Speech Modeling: Training and Recognition
[Figure: Blocks: Speech Samples, Feature Extraction, Training (HMM), Recognition. Signal labels: S, Y, W*.]
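The bullet text of this slide did not survive, so the following is only the standard textbook formulation of HMM-based recognition, written with the symbols that remain in the figure. It assumes that $Y$ denotes the sequence of feature vectors produced by feature extraction and that $W^*$ denotes the recognized word sequence; the factorization into an acoustic model and a language model is the usual convention, not something stated in the source.

```latex
W^{*} \;=\; \arg\max_{W} P(W \mid Y)
      \;=\; \arg\max_{W} \frac{P(Y \mid W)\, P(W)}{P(Y)}
      \;=\; \arg\max_{W} P(Y \mid W)\, P(W)
```

Under this reading, training estimates the HMM parameters behind $P(Y \mid W)$ from the speech database, and recognition searches for the $W$ that maximizes the product. The term $P(Y)$ can be dropped because it does not depend on $W$.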
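Implementation of HMM:
[Figure: HMM for the two-word vocabulary YES / NO. State $s^{(0)}$ is silence (start). States $s^{(1)}$ to $s^{(6)}$ model the phonemes 'YE' and 'S' of w = YES; states $s^{(7)}$ to $s^{(12)}$ model the phonemes 'N' and 'O' of w = NO.]
Word transition probabilities shown in the figure:
$P(w_t = \text{yes} \mid w_{t-1} = \text{sil}) = 0.2$
$P(w_t = \text{no} \mid w_{t-1} = \text{sil}) = 0.2$
$P(w_t = \text{sil} \mid w_{t-1} = \text{yes}) = 1$
$P(w_t = \text{sil} \mid w_{t-1} = \text{no}) = 1$
Other labels in the figure: $P(s_t \mid s_{t-1})$, $P(Y \mid s_t = s^{(9)})$, $Y$, 0.6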
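The word-level transitions on this slide can be collected into a small transition matrix. The sketch below is illustrative only. It reads the silence label as `sil`, and it assumes a silence self-loop of 0.6, inferred solely from the requirement that the probabilities leaving silence sum to one (0.2 for YES, 0.2 for NO). The names `WORDS`, `A`, and `sequence_probability` are hypothetical.

```python
import numpy as np

WORDS = ["sil", "yes", "no"]

# A[i, j] = P(w_t = WORDS[j] | w_{t-1} = WORDS[i])
A = np.array([
    [0.6, 0.2, 0.2],   # from sil: 0.2 to yes, 0.2 to no (slide); 0.6 self-loop is an assumption
    [1.0, 0.0, 0.0],   # from yes: always back to sil (slide)
    [1.0, 0.0, 0.0],   # from no:  always back to sil (slide)
])

def sequence_probability(words, start="sil"):
    """Probability of a word sequence under the word-level Markov chain."""
    index = {w: i for i, w in enumerate(WORDS)}
    prob = 1.0
    prev = index[start]
    for w in words:
        cur = index[w]
        prob *= A[prev, cur]
        prev = cur
    return prob

# Every row must be a valid probability distribution
assert np.allclose(A.sum(axis=1), 1.0)
```

For example, `sequence_probability(["yes", "sil", "no"])` multiplies 0.2, 1 and 0.2, while `sequence_probability(["yes", "no"])` is zero because the model forces a return to silence between words. In a full recognizer each word would be expanded into its phoneme states, with an observation probability attached to every state.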
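The search Algorithm:
[Figure: search trellis over three time steps. Time=1: $s^{(0)}$. Time=2: $s^{(7)}$, $s^{(0)}$, $s^{(1)}$. Time=3: $s^{(8)}$, $s^{(7)}$, $s^{(0)}$, $s^{(1)}$, $s^{(2)}$. Scores shown: 0.1, 0.4, 0.1, 0.025, 0.021, 0.051, 0.041, 0.045, 0.036, 0.032.]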
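The trellis on this slide is the kind of structure explored by Viterbi search, the usual best-path algorithm for HMM decoding. The slide text does not name the algorithm, so treating it as Viterbi is an assumption, and the function below is a generic sketch rather than the presenter's code. The name `viterbi` and its argument layout are illustrative; it works in the log domain to avoid numerical underflow on long utterances.

```python
import numpy as np

def viterbi(init, trans, emit_probs):
    """Most likely state path through an HMM.

    init:       (N,)   initial state probabilities
    trans:      (N, N) trans[i, j] = P(s_t = j | s_{t-1} = i)
    emit_probs: (T, N) emit_probs[t, j] = P(y_t | s_t = j)
    Returns (best_path, probability_of_best_path).
    """
    T, N = emit_probs.shape
    with np.errstate(divide="ignore"):          # log(0) = -inf is intended
        log_init = np.log(init)
        log_trans = np.log(trans)
        log_emit = np.log(emit_probs)

    score = np.full((T, N), -np.inf)            # best log score ending in each state
    back = np.zeros((T, N), dtype=int)          # best predecessor of each state
    score[0] = log_init + log_emit[0]

    for t in range(1, T):
        # cand[i, j]: score of reaching state j at time t from state i
        cand = score[t - 1][:, None] + log_trans
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand[back[t], np.arange(N)] + log_emit[t]

    # Trace back from the best final state
    path = [int(np.argmax(score[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(np.exp(score[-1].max()))
```

At each time step only the best-scoring path into every state is kept, which is why the trellis grows by at most one column of states per frame instead of enumerating all possible state sequences.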
Conclusions:
