Institut de Technologie du Cambodge (ITC)
 Génie Informatique et Communication (GIC)


          TTS (Text-To-Speech)

            Seangmeng LONG
          [seangmeng@itc.edu.kh]


                  BarCamp
What is TTS?
   TTS stands for Text-To-Speech
   It is a system (module) that takes Khmer Unicode text as input
    and produces Khmer speech

   Input: Electronic documents → TTS system → Output: Khmer speech
Our Method

   Concatenation-Based Synthesis using Diphone
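
A minimal sketch of the diphone idea, for illustration only. A diphone spans the transition from one phone to the next, so a phone sequence is first rewritten as overlapping pairs, and the stored recording for each pair is then joined in order. The function names, the `_` silence symbol, and the toy database are assumptions, not part of the system presented here.

```python
def to_diphones(phones):
    """Rewrite a phone sequence as diphone names, padded with silence ('_')."""
    padded = ["_"] + list(phones) + ["_"]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def concatenate(diphones, database):
    """Join the stored samples of each diphone, in order, into one waveform."""
    samples = []
    for name in diphones:
        samples.extend(database[name])
    return samples


# Toy database: each diphone maps to a (fake) list of audio samples.
toy_db = {"_-s": [0, 1], "s-a:": [2, 3], "a:-_": [4]}
units = to_diphones(["s", "a:"])
wave = concatenate(units, toy_db)
```

A real system would also smooth pitch and amplitude at each join; this sketch only shows the lookup and concatenation.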




Our Method (steps)
   Word Segmentation: [Khmer text] → [Khmer text split into words]
   Text Normalization: [non-standard text] → [Khmer words]
   Text To Sound Conversion: [Khmer word] → kakthen
   Syllabification: sa:la: → sa: . la:
   Stress Assignment: sa: . la: → sə . la:
   Sound Change: [Khmer word] cak → caʔ
   Intonation
   Diphone Database
   Integration
   Applications Development (mail reader, doc reader, ...)
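
The three phonological steps above can be sketched as small rewrite functions. This is a toy illustration built only from the examples on the slide: the vowel inventory, the syllable pattern, and the generalisation of each rule (reduce the vowel of every non-final syllable to ə; turn a syllable-final k into ʔ) are assumptions, not the rules of the actual system.

```python
import re

# Toy syllable pattern: optional onset, one vowel, optional length mark ':',
# and a coda only at the very end of the word. Real Khmer needs more.
SYLLABLE = re.compile(r"[^aeiouə:]*[aeiouə]:?(?:[^aeiouə:]+$)?")


def syllabify(word):
    """sa:la: -> ['sa:', 'la:']"""
    return SYLLABLE.findall(word)


def assign_stress(syllables):
    """Reduce the vowel of every non-final syllable to ə (assumed rule)."""
    reduced = [re.sub(r"[aeiou]:?", "ə", s, count=1) for s in syllables[:-1]]
    return reduced + syllables[-1:]


def sound_change(syllables):
    """Replace a syllable-final k with a glottal stop (assumed rule)."""
    return [s[:-1] + "ʔ" if s.endswith("k") else s for s in syllables]


def to_phonetic(word):
    """Run the three steps in the order listed on the slide."""
    return " . ".join(sound_change(assign_stress(syllabify(word))))
```

With these rules, `to_phonetic("sa:la:")` gives `sə . la:` and `to_phonetic("cak")` gives `caʔ`, matching the two examples on the slide.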
New Statistical System
   Speech corpus
      ~450 sentences (~30 minutes)
      Automatic labeling
          EHMM labeler
          Sphinx
   Statistical parameter synthesis
      More natural, but buzzy
   Unit selection
      Units of variable size (the smallest unit is a phone)
      More natural, but with audible discontinuities at join points
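
To illustrate why join points matter in unit selection, here is a hedged sketch of the usual search: pick one recorded unit per target position so that the sum of target costs and join costs is smallest. The unit layout `(phone, start_pitch, end_pitch)`, the pitch-mismatch join cost, and the zero target cost are hypothetical choices for this example, not details of the system described here.

```python
def select_units(candidates, target_cost, join_cost):
    """Dynamic-programming search for the cheapest unit sequence.

    candidates[i] is the list of recorded units available for position i.
    """
    # best[i][j] = (cost of the best path ending at candidates[i][j], back-pointer)
    best = [[(target_cost(0, u), None) for u in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for u in candidates[i]:
            cost, prev = min(
                (best[i - 1][k][0] + join_cost(p, u), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(i, u), prev))
        best.append(row)

    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return path[::-1]


def pitch_join_cost(prev, cur):
    """Penalise a pitch jump between the end of one unit and the start of the next."""
    return abs(prev[2] - cur[1])


def no_target_cost(position, unit):
    """Placeholder: every candidate fits its target equally well."""
    return 0


candidates = [
    [("s", 100, 120), ("s", 100, 150)],
    [("a:", 118, 110), ("a:", 180, 170)],
]
chosen = select_units(candidates, no_target_cost, pitch_join_cost)
```

Here the search picks the pair whose boundary pitches are closest (120 and 118), which is the kind of mismatch the join cost is meant to keep small.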
Thanks for your attention.




