SlideShare a Scribd company logo
1 of 20
Digital Speech Processing

 Dept. of Computer Science & Engineering
 Shahjalal University of Science & Technology
Course Description
– Review of digital signal processing

– Fundamentals of speech production and perception

– Basic techniques for digital speech processing:
   • short - time energy, magnitude, autocorrelation
   • short - time Fourier analysis
   • homomorphic methods
   • linear predictive methods
Course Description…
– Speech estimation methods
    • speech/non-speech detection
    • voiced/unvoiced/non-speech segmentation/classification
    • pitch detection
    • formant estimation
– Applications of speech signal processing
    • Speech coding
    • Speech synthesis
    • Speech recognition/natural language processing
Book Information
Textbook:
L. R. Rabiner and R. W. Schafer,
-Theory and Applications of Digital Speech
Processing, Prentice-Hall Inc., 2011

Recommended Supplementary Textbook:
• T. F. Quatieri, Principles of Discrete - Time Speech
Processing, Prentice Hall Inc, 2002
Laboratory works: Using Matlab
Speech Processing
Speech is one of the most intriguing signals that humans work
  with every day.

• Purpose of speech processing:
– To understand speech as a means of communication;
– To represent speech for transmission and reproduction;
– To analyze speech for automatic recognition and extraction of
   information
– To discover some physiological characteristics of the talker.
The Speech Stack

• Speech Applications —
  coding, synthesis, recognition, understanding, ve
  rification, language translation, speed-up/slow-
  down
• Speech Algorithms —speech-silence
  (background), voiced-unvoiced decision, pitch
  detection, formant estimation
• Speech Representations —
  temporal, spectral, homomorphic, LPC
• Fundamentals —
  acoustics, linguistics, pragmatics, speech
  perception
Speech Coding
Speech Coding

Speech Coding is the process of transforming a speech
  signal into a representation for efficient transmission
  and storage of speech
   – narrowband and broadband wired telephony
   – cellular communications
   – Voice over IP (VoIP) to utilize the Internet as a real-time
      communications medium
   – secure voice for privacy and encryption for national
      security applications
   – extremely narrowband communications
      channels, e.g., battlefield applications using HF radio
   – storage of speech for telephone answering
      machines, prerecorded messages
Speech Synthesis
Synthesis of Speech is the process of generating a speech
signal using computational means for effective human-
machine interactions.
Speech Synthesis…
– machine reading of text or email messages
– talking agents for automatic transactions
– automatic agent in customer care call center
– handheld devices such as foreign language
  phrasebooks, dictionaries, crossword puzzle
  helpers
– announcement machines that provide information
  such as stock quotes, airlines schedules, weather
  reports, etc.
Speech Synthesis Examples




    Natural     Synthetic
Speech Recognition and understanding

Recognition and Understanding of Speech is the process of
  extracting usable linguistic information from a speech signal
  in support of human-machine communication by voice
   – command and control (C&C) applications, e.g., simple commands for
      spreadsheets, presentation graphics, Appliances
   – voice dictation to create letters, memos, and other documents
   – natural language voice dialogues with machines to enable Help
      desks, Call Centers
   – voice dialing for cellphones and from PDA’s and other small devices
   – agent services such as calendar entry and update, address list
      modification and entry, etc.
Speech Recognition Demos
Pattern Matching Problems
Pattern Matching Problems
• Speech recognition
• Speaker recognition
• Speaker verification
• Word spotting
• Automatic indexing of speech recordings
Other Speech Applications
• Speaker Verification for secure access to
   premises, information, virtual spaces
• Speaker Recognition for legal and forensic purposes—national
   security; also for personalized services
• Speech Enhancement for use in noisy environments, to eliminate
   echo, to align voices with video segments, to change voice
   qualities, to speed-up or slow-down prerecorded speech
   (e.g., talking books, rapid review of material, careful scrutinizing of
   spoken material, etc) => potentially to improve intelligibility and
   naturalness of speech
• Language Translation to convert spoken words in one language to
   another to facilitate natural language dialogues between people
   speaking different languages, i.e., tourists, business people
Speech/DSP Enabled Devices
Digital Speech Processing
• DSP:
– obtaining discrete representations of speech signal
– theory, design and implementation of numerical procedures
(algorithms) for processing the discrete representation in order to
achieve a goal (recognizing the signal, modifying the time scale
of the signal, removing background noise from the signal, etc.)
•Why DSP
– reliability
– flexibility
– accuracy
– real-time implementations on inexpensive dsp chips
– ability to integrate with multimedia and data
– encryptability/security of the data and the data representations
via suitable techniques
What We Will Be Learning
• Review some basic DSP concepts
• Speech production model—acoustics, articulatory concepts,
   speech production models
• Speech perception model—ear models, auditory signal
   processing, equivalent acoustic processing models
• Time domain processing concepts—speech properties, pitch,
   voiced-unvoiced, energy, autocorrelation, zero-crossing rates
• Short time Fourier analysis methods—digital filter banks,
   spectrograms, analysis-synthesis systems, vocoders
• Homomorphic speech processing—cepstrum, pitch detection,
   formant estimation, homomorphic vocoder
What We Will Be Learning…
• Linear predictive coding methods—autocorrelation
   method, covariance method, lattice methods, relation to vocal
   tract models
• Speech waveform coding and source models—delta
   modulation, PCM, mu-law, ADPCM, vector
   quantization, multipulse coding, CELP coding
• Methods for speech synthesis and text-to-speech systems—
   physical models, formant models, articulatory
   models, concatenative models
• Methods for speech recognition—the Hidden Markov Model
   (HMM)

More Related Content

What's hot

Digital modeling of speech signal
Digital modeling of speech signalDigital modeling of speech signal
Digital modeling of speech signalVinodhini
 
Introduction to Digital Signal Processing
Introduction to Digital Signal ProcessingIntroduction to Digital Signal Processing
Introduction to Digital Signal Processingop205
 
Speech Recognition in Artificail Inteligence
Speech Recognition in Artificail InteligenceSpeech Recognition in Artificail Inteligence
Speech Recognition in Artificail InteligenceIlhaan Marwat
 
Homomorphic speech processing
Homomorphic speech processingHomomorphic speech processing
Homomorphic speech processingsivakumar m
 
DIGITAL SIGNAL PROCESSING
DIGITAL SIGNAL PROCESSINGDIGITAL SIGNAL PROCESSING
DIGITAL SIGNAL PROCESSINGSnehal Hedau
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognitionfathitarek
 
Speech recognition final presentation
Speech recognition final presentationSpeech recognition final presentation
Speech recognition final presentationhimanshubhatti
 
Speech recognition An overview
Speech recognition An overviewSpeech recognition An overview
Speech recognition An overviewsajanazoya
 
Speaker recognition using MFCC
Speaker recognition using MFCCSpeaker recognition using MFCC
Speaker recognition using MFCCHira Shaukat
 
Adaptive filter
Adaptive filterAdaptive filter
Adaptive filterA. Shamel
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognitionManthan Gandhi
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech RecognitionHugo Moreno
 
DSP_2018_FOEHU - Lec 1 - Introduction to Digital Signal Processing
DSP_2018_FOEHU - Lec 1 - Introduction to Digital Signal ProcessingDSP_2018_FOEHU - Lec 1 - Introduction to Digital Signal Processing
DSP_2018_FOEHU - Lec 1 - Introduction to Digital Signal ProcessingAmr E. Mohamed
 

What's hot (20)

Digital modeling of speech signal
Digital modeling of speech signalDigital modeling of speech signal
Digital modeling of speech signal
 
Introduction to Digital Signal Processing
Introduction to Digital Signal ProcessingIntroduction to Digital Signal Processing
Introduction to Digital Signal Processing
 
Speech Recognition in Artificail Inteligence
Speech Recognition in Artificail InteligenceSpeech Recognition in Artificail Inteligence
Speech Recognition in Artificail Inteligence
 
Homomorphic speech processing
Homomorphic speech processingHomomorphic speech processing
Homomorphic speech processing
 
Multirate DSP
Multirate DSPMultirate DSP
Multirate DSP
 
DIGITAL SIGNAL PROCESSING
DIGITAL SIGNAL PROCESSINGDIGITAL SIGNAL PROCESSING
DIGITAL SIGNAL PROCESSING
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognition
 
Speech recognition final presentation
Speech recognition final presentationSpeech recognition final presentation
Speech recognition final presentation
 
Speech recognition An overview
Speech recognition An overviewSpeech recognition An overview
Speech recognition An overview
 
FILTER BANKS
FILTER BANKSFILTER BANKS
FILTER BANKS
 
Speaker recognition using MFCC
Speaker recognition using MFCCSpeaker recognition using MFCC
Speaker recognition using MFCC
 
Automatic Speech Recognition
Automatic Speech RecognitionAutomatic Speech Recognition
Automatic Speech Recognition
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognition
 
Introduction to dsp
Introduction to dspIntroduction to dsp
Introduction to dsp
 
Speech Signal Processing
Speech Signal ProcessingSpeech Signal Processing
Speech Signal Processing
 
Sampling
SamplingSampling
Sampling
 
Adaptive filter
Adaptive filterAdaptive filter
Adaptive filter
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognition
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognition
 
DSP_2018_FOEHU - Lec 1 - Introduction to Digital Signal Processing
DSP_2018_FOEHU - Lec 1 - Introduction to Digital Signal ProcessingDSP_2018_FOEHU - Lec 1 - Introduction to Digital Signal Processing
DSP_2018_FOEHU - Lec 1 - Introduction to Digital Signal Processing
 

Viewers also liked

presentation on digital signal processing
presentation on digital signal processingpresentation on digital signal processing
presentation on digital signal processingsandhya jois
 
Essential linguistics Chap 3 part 1 Graphic Organizer
Essential linguistics Chap 3 part 1 Graphic OrganizerEssential linguistics Chap 3 part 1 Graphic Organizer
Essential linguistics Chap 3 part 1 Graphic Organizersheilacook
 
Ppt on speech processing by ranbeer
Ppt on speech processing by ranbeerPpt on speech processing by ranbeer
Ppt on speech processing by ranbeerRanbeer Tyagi
 
Physiology of speech
Physiology of speechPhysiology of speech
Physiology of speechRaghu Veer
 
Speech signal processing lizy
Speech signal processing lizySpeech signal processing lizy
Speech signal processing lizyLizy Abraham
 
Radio Communication
Radio CommunicationRadio Communication
Radio CommunicationJohn Grace
 
Radio communication presentation
Radio communication presentationRadio communication presentation
Radio communication presentationrandan88
 
Gsm.....ppt
Gsm.....pptGsm.....ppt
Gsm.....pptbalu008
 

Viewers also liked (11)

presentation on digital signal processing
presentation on digital signal processingpresentation on digital signal processing
presentation on digital signal processing
 
Dif fft
Dif fftDif fft
Dif fft
 
Essential linguistics Chap 3 part 1 Graphic Organizer
Essential linguistics Chap 3 part 1 Graphic OrganizerEssential linguistics Chap 3 part 1 Graphic Organizer
Essential linguistics Chap 3 part 1 Graphic Organizer
 
Ppt on speech processing by ranbeer
Ppt on speech processing by ranbeerPpt on speech processing by ranbeer
Ppt on speech processing by ranbeer
 
Physiology of speech
Physiology of speechPhysiology of speech
Physiology of speech
 
Speech signal processing lizy
Speech signal processing lizySpeech signal processing lizy
Speech signal processing lizy
 
Radio Presentation
Radio PresentationRadio Presentation
Radio Presentation
 
Radio Communication
Radio CommunicationRadio Communication
Radio Communication
 
Radio communication presentation
Radio communication presentationRadio communication presentation
Radio communication presentation
 
Dsp ppt
Dsp pptDsp ppt
Dsp ppt
 
Gsm.....ppt
Gsm.....pptGsm.....ppt
Gsm.....ppt
 

Similar to Digital speech processing lecture1

Speechrecognition 100423091251-phpapp01
Speechrecognition 100423091251-phpapp01Speechrecognition 100423091251-phpapp01
Speechrecognition 100423091251-phpapp01girishjoshi1234
 
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech RecognitionTeaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech RecognitionZachary S. Brown
 
Voice recognition system
Voice recognition systemVoice recognition system
Voice recognition systemavinash raibole
 
Speech Recognition Technology
Speech Recognition TechnologySpeech Recognition Technology
Speech Recognition TechnologySrijanKumar18
 
Speech recognition
Speech recognitionSpeech recognition
Speech recognitionCharu Joshi
 
Speech Recognition
Speech Recognition Speech Recognition
Speech Recognition Goa App
 
Speech recognizers & generators
Speech recognizers & generatorsSpeech recognizers & generators
Speech recognizers & generatorsPaul Kahoro
 
Artificial Intelligence- An Introduction
Artificial Intelligence- An IntroductionArtificial Intelligence- An Introduction
Artificial Intelligence- An Introductionacemindia
 
Artificial Intelligence - An Introduction
Artificial Intelligence - An Introduction Artificial Intelligence - An Introduction
Artificial Intelligence - An Introduction acemindia
 
How speech reorganization works
How speech reorganization worksHow speech reorganization works
How speech reorganization worksMuhammad Taqi
 
Artificial intelligence - research areas
Artificial intelligence - research areasArtificial intelligence - research areas
Artificial intelligence - research areasLearnbay Datascience
 

Similar to Digital speech processing lecture1 (20)

Speechrecognition 100423091251-phpapp01
Speechrecognition 100423091251-phpapp01Speechrecognition 100423091251-phpapp01
Speechrecognition 100423091251-phpapp01
 
Speech recognition (dr. m. sabarimalai manikandan)
Speech recognition (dr. m. sabarimalai manikandan)Speech recognition (dr. m. sabarimalai manikandan)
Speech recognition (dr. m. sabarimalai manikandan)
 
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech RecognitionTeaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
 
Voice recognition system
Voice recognition systemVoice recognition system
Voice recognition system
 
Speech Recognition Technology
Speech Recognition TechnologySpeech Recognition Technology
Speech Recognition Technology
 
Speech recognition
Speech recognitionSpeech recognition
Speech recognition
 
Speech Recognition
Speech Recognition Speech Recognition
Speech Recognition
 
Iitdmj 1
Iitdmj 1Iitdmj 1
Iitdmj 1
 
Speech recognizers & generators
Speech recognizers & generatorsSpeech recognizers & generators
Speech recognizers & generators
 
Artificial Intelligence- An Introduction
Artificial Intelligence- An IntroductionArtificial Intelligence- An Introduction
Artificial Intelligence- An Introduction
 
Artificial Intelligence - An Introduction
Artificial Intelligence - An Introduction Artificial Intelligence - An Introduction
Artificial Intelligence - An Introduction
 
Speech-Recognition.pptx
Speech-Recognition.pptxSpeech-Recognition.pptx
Speech-Recognition.pptx
 
Web AI.pptx
Web AI.pptxWeb AI.pptx
Web AI.pptx
 
[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P
[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P
[IJET-V1I6P21] Authors : Easwari.N , Ponmuthuramalingam.P
 
How speech reorganization works
How speech reorganization worksHow speech reorganization works
How speech reorganization works
 
Universal design HCI
Universal design HCIUniversal design HCI
Universal design HCI
 
WomenTech_Event
WomenTech_EventWomenTech_Event
WomenTech_Event
 
Artificial intelligence - research areas
Artificial intelligence - research areasArtificial intelligence - research areas
Artificial intelligence - research areas
 
Assign
AssignAssign
Assign
 
Amadou
AmadouAmadou
Amadou
 

Recently uploaded

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 

Recently uploaded (20)

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

Digital speech processing lecture1

  • 1. Digital Speech Processing Dept. of Computer Science & Engineering Shahjalal University of Science & Technology
  • 2. Course Description – Review of digital signal processing – Fundamentals of speech production and perception – Basic techniques for digital speech processing: • short - time energy, magnitude, autocorrelation • short - time Fourier analysis • homomorphic methods • linear predictive methods
  • 3. Course Description… – Speech estimation methods • speech/non-speech detection • voiced/unvoiced/non-speech segmentation/classification • pitch detection • formant estimation – Applications of speech signal processing • Speech coding • Speech synthesis • Speech recognition/natural language processing
  • 4. Book Information Textbook: L. R. Rabiner and R. W. Schafer, -Theory and Applications of Digital Speech Processing, Prentice-Hall Inc., 2011 Recommended Supplementary Textbook: • T. F. Quatieri, Principles of Discrete - Time Speech Processing, Prentice Hall Inc, 2002 Laboratory works: Using Matlab
  • 5. Speech Processing Speech is one of the most intriguing signals that humans work with every day. • Purpose of speech processing: – To understand speech as a means of communication; – To represent speech for transmission and reproduction; – To analyze speech for automatic recognition and extraction of information – To discover some physiological characteristics of the talker.
  • 6. The Speech Stack • Speech Applications — coding, synthesis, recognition, understanding, ve rification, language translation, speed-up/slow- down • Speech Algorithms —speech-silence (background), voiced-unvoiced decision, pitch detection, formant estimation • Speech Representations — temporal, spectral, homomorphic, LPC • Fundamentals — acoustics, linguistics, pragmatics, speech perception
  • 8. Speech Coding Speech Coding is the process of transforming a speech signal into a representation for efficient transmission and storage of speech – narrowband and broadband wired telephony – cellular communications – Voice over IP (VoIP) to utilize the Internet as a real-time communications medium – secure voice for privacy and encryption for national security applications – extremely narrowband communications channels, e.g., battlefield applications using HF radio – storage of speech for telephone answering machines, prerecorded messages
  • 9. Speech Synthesis Synthesis of Speech is the process of generating a speech signal using computational means for effective human- machine interactions.
  • 10. Speech Synthesis… – machine reading of text or email messages – talking agents for automatic transactions – automatic agent in customer care call center – handheld devices such as foreign language phrasebooks, dictionaries, crossword puzzle helpers – announcement machines that provide information such as stock quotes, airlines schedules, weather reports, etc.
  • 11. Speech Synthesis Examples Natural Synthetic
  • 12. Speech Recognition and understanding Recognition and Understanding of Speech is the process of extracting usable linguistic information from a speech signal in support of human-machine communication by voice – command and control (C&C) applications, e.g., simple commands for spreadsheets, presentation graphics, Appliances – voice dictation to create letters, memos, and other documents – natural language voice dialogues with machines to enable Help desks, Call Centers – voice dialing for cellphones and from PDA’s and other small devices – agent services such as calendar entry and update, address list modification and entry, etc.
  • 15. Pattern Matching Problems • Speech recognition • Speaker recognition • Speaker verification • Word spotting • Automatic indexing of speech recordings
  • 16. Other Speech Applications • Speaker Verification for secure access to premises, information, virtual spaces • Speaker Recognition for legal and forensic purposes—national security; also for personalized services • Speech Enhancement for use in noisy environments, to eliminate echo, to align voices with video segments, to change voice qualities, to speed-up or slow-down prerecorded speech (e.g., talking books, rapid review of material, careful scrutinizing of spoken material, etc) => potentially to improve intelligibility and naturalness of speech • Language Translation to convert spoken words in one language to another to facilitate natural language dialogues between people speaking different languages, i.e., tourists, business people
  • 18. Digital Speech Processing • DSP: – obtaining discrete representations of speech signal – theory, design and implementation of numerical procedures (algorithms) for processing the discrete representation in order to achieve a goal (recognizing the signal, modifying the time scale of the signal, removing background noise from the signal, etc.) •Why DSP – reliability – flexibility – accuracy – real-time implementations on inexpensive dsp chips – ability to integrate with multimedia and data – encryptability/security of the data and the data representations via suitable techniques
  • 19. What We Will Be Learning • Review some basic DSP concepts • Speech production model—acoustics, articulatory concepts, speech production models • Speech perception model—ear models, auditory signal processing, equivalent acoustic processing models • Time domain processing concepts—speech properties, pitch, voiced-unvoiced, energy, autocorrelation, zero-crossing rates • Short time Fourier analysis methods—digital filter banks, spectrograms, analysis-synthesis systems, vocoders • Homomorphic speech processing—cepstrum, pitch detection, formant estimation, homomorphic vocoder
  • 20. What We Will Be Learning… • Linear predictive coding methods—autocorrelation method, covariance method, lattice methods, relation to vocal tract models • Speech waveform coding and source models—delta modulation, PCM, mu-law, ADPCM, vector quantization, multipulse coding, CELP coding • Methods for speech synthesis and text-to-speech systems— physical models, formant models, articulatory models, concatenative models • Methods for speech recognition—the Hidden Markov Model (HMM)