A TOOL TO CONVERT TEXT TO
SPEECH WITH EMOTIONS
EmoSpeak
Submitted to: Ms. Shikha Jain
Submitted by:
Akriti Saini (10503902)
Stuti Shukla (10503870)
What is NLP?
 Natural language processing (NLP) is a field
of computer science, artificial intelligence,
and linguistics concerned with the interactions
between computers and human (natural) languages. As
such, NLP is related to the area of human–computer
interaction. Many challenges in NLP involve natural
language understanding, that is, enabling computers to
derive meaning from human or natural language input,
and others involve natural language generation.
 The area of NLP we are concerned with is:
Text-to-Speech with emotions.
What is Text-to-Speech?
 A text-to-speech (TTS) system converts normal
language text into speech.
 The quality of a speech synthesizer is judged by its
similarity to the human voice and by its ability to
be understood clearly. An intelligible text-to-
speech program allows people with visual
impairments or reading disabilities to listen to
written works on a home computer.
Our Tool: EmoSpeak
 EmoSpeak converts text to speech so that the
emotions found in the text are carried into the
spoken output.
 The tool first identifies the emotions in the raw
text, then modifies certain characteristics of the
voice to modulate it so that the speech expresses
those emotions.
 The tool is composed of two parts: a front-end and
a back-end. The front-end is responsible for text
normalization (also called pre-processing or
tokenization) and produces a symbolic linguistic
representation of the text. The back-end, often
referred to as the synthesizer, then converts the
symbolic linguistic representation into sound.
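The front-end step described above can be sketched in a few lines of Python. This is only a minimal illustration, not EmoSpeak's actual code: the abbreviation table, the digit-by-digit spelling rule, and the function names `normalize`, `tokenize`, and `front_end` are all assumptions made for the example.

```python
import re

# Hypothetical abbreviation table; a real front-end would need a much larger one.
ABBREVIATIONS = {"dr.": "doctor", "mr.": "mister", "etc.": "et cetera"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def normalize(text):
    """Lowercase the text, expand known abbreviations, spell out digits."""
    words = []
    for raw in text.lower().split():
        if raw in ABBREVIATIONS:
            words.append(ABBREVIATIONS[raw])
        else:
            # Simplification: each digit is spelled separately ("42" -> "four two").
            words.append(re.sub(r"[0-9]", lambda m: " " + DIGITS[int(m.group())] + " ", raw))
    # Collapse the extra spaces introduced while spelling out digits.
    return " ".join(" ".join(words).split())


def tokenize(text):
    """Split normalized text into word tokens and punctuation tokens."""
    return re.findall(r"[a-z']+|[.,!?;]", text)


def front_end(text):
    """Normalize, then tokenize: the token list is what the back-end would consume."""
    return tokenize(normalize(text))


print(front_end("Dr. Smith has 2 cats!"))
```

Punctuation is kept as separate tokens because a synthesizer can use it to decide where to pause or change intonation.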
Voice Modulation
 One of the goals of text-to-speech (TTS) systems is to
produce natural-sounding synthesized speech.
Towards this end, various natural language
processing (NLP) tasks are performed to model the
prosodic aspects of the TTS.
 One of the fundamental NLP tasks used is the
part-of-speech (POS) tagging of the words in the
text.
 Voice modulation in the project means changing
certain characteristics of the voice based on a
particular emotion. The characteristics that could
be changed include f0 frequency, f0 contour, f0
range, jitter, nasal duration, etc.
 These characteristics are changed according to the
emotion, which is set by the user.
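One way to picture "characteristics changed according to the emotion" is a lookup table from emotion to prosody settings, rendered as SSML markup for a speech engine. The sketch below is an assumption, not the project's implementation: the profile values are invented for illustration, the names `Prosody`, `PROFILES`, and `to_ssml` are hypothetical, and whether a given engine honours each SSML attribute has to be checked per engine. SSML covers pitch, rate, volume, and pauses; parameters such as jitter have no direct SSML equivalent.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class Prosody:
    pitch: str     # f0 level: an SSML label or a relative change such as "+15%"
    rate: str      # speaking-rate label
    volume: str    # intensity label
    pause_ms: int  # pause inserted after the utterance


# Illustrative values only; they are not measurements from the project.
PROFILES = {
    "neutral": Prosody("medium", "medium", "medium", 300),
    "happy": Prosody("+15%", "fast", "loud", 200),
    "sad": Prosody("-10%", "slow", "soft", 500),
    "angry": Prosody("+5%", "fast", "x-loud", 150),
}


def to_ssml(text, emotion):
    """Wrap text in SSML prosody markup for the emotion chosen by the user."""
    p = PROFILES.get(emotion, PROFILES["neutral"])  # unknown emotions fall back to neutral
    return (
        "<speak>"
        f'<prosody pitch="{p.pitch}" rate="{p.rate}" volume="{p.volume}">'
        f"{escape(text)}</prosody>"
        f'<break time="{p.pause_ms}ms"/>'
        "</speak>"
    )


print(to_ssml("I lost my keys again.", "sad"))
```

Keeping the profiles in a table separates the tuning of each emotion from the code that produces the markup, so values can be adjusted by ear without touching the logic.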
Implementation
 For implementation, the first task is to take a PDF file
as input and convert it to the corresponding text file.
 The text is then tokenized and classified as either
emotional or neutral. If the text is emotional, the next
step is to identify the emotional subcategory to which it
belongs, for example ‘happy’.
 This classification can be done using WordNet and
WordNet-Affect. Depending on the emotion, the voice is
then modulated by varying the intensity, the length of the
pauses between words, and the pitch of the voice.
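The two-stage decision described above (emotional or neutral, then the subcategory) can be sketched with a word-to-emotion lookup. The lexicon below is a toy stand-in for WordNet-Affect with invented entries, and `classify` is a hypothetical name; the real resource is far larger and organised around WordNet synsets rather than a flat dictionary.

```python
import re
from collections import Counter

# Toy stand-in for WordNet-Affect: a few words mapped to an emotion label.
# The entries are invented for illustration, not taken from the real resource.
AFFECT_LEXICON = {
    "happy": "happy", "joy": "happy", "delighted": "happy", "smile": "happy",
    "sad": "sad", "cry": "sad", "lonely": "sad",
    "angry": "angry", "furious": "angry",
    "afraid": "fear", "scared": "fear",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def classify(text):
    """Return 'neutral' if no affect word is found, else the dominant emotion."""
    counts = Counter(
        AFFECT_LEXICON[token] for token in tokenize(text) if token in AFFECT_LEXICON
    )
    if not counts:
        return "neutral"
    # Most frequent label wins; ties are broken alphabetically so the result is stable.
    return min(counts, key=lambda label: (-counts[label], label))


print(classify("She was so happy, full of joy, though a little scared."))
print(classify("The meeting is at noon."))
```

Counting matches per label is the simplest possible scoring rule; it ignores negation ("not happy") and word sense, which is where a resource like WordNet helps.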
Diagram for Emotion Extraction from the Text
Integrated Literature Survey
 From the research papers we explored, we infer that
several approaches are available for implementing our
application. The first task is to decide whether the
text falls in the emotional or the non-emotional
(neutral) class.
 The key finding was that using WordNet and
WordNet-Affect was the best way to identify the
emotions in a given text, because it had the highest
precision among the procedures compared, such as LSA.
 From the literature survey we also conclude that
various text-to-speech engines are available, and our
foremost task would be to choose an appropriate
engine according to the requirements. We came across
research in which emotional text-to-speech engines
have been implemented for the Italian and Arabic
languages.
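The survey above ranks emotion-detection methods by precision. For reference, precision for one emotion label is the share of texts predicted to carry that label that really do carry it. The small sketch below shows the computation; the function name `precision` and the sample labels are invented for illustration and are not results from the surveyed papers.

```python
def precision(predicted, gold, label):
    """Fraction of items predicted as `label` whose true label is also `label`."""
    gold_for_predicted = [g for p, g in zip(predicted, gold) if p == label]
    if not gold_for_predicted:
        return 0.0  # the label was never predicted
    return sum(g == label for g in gold_for_predicted) / len(gold_for_predicted)


# Invented example: five texts, the classifier's output and the true labels.
predicted = ["happy", "happy", "sad", "neutral", "happy"]
gold = ["happy", "sad", "sad", "neutral", "happy"]

print(precision(predicted, gold, "happy"))
```

Here three texts were predicted as "happy" and two of them are truly "happy", so the precision for that label is two thirds.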
Application and Significance of the Project
 It can be used to cultivate the habit of reading books in
children. The reasoning, drawn from human psychology, is
that a task performed often enough develops a liking for
that task, so by listening to various types of books
children will develop a habit of reading books.
 It can also be used to supplement children’s reading
classes. Children learn easily, especially when things are
pointed out to them. They can listen to a voice reading the
contents of the book as they follow with their eyes. It can
be used as a tutor, replacing the need for a teacher to
guide the children.
 With expressive, child-directed storytelling
implemented in a text-to-speech application, it can be
useful in the therapeutic education of children with
communication disorders, by helping them learn how
to express their feelings and try to communicate.
 It can help visually impaired people, or people with
certain reading disabilities, to get the feel of reading a
book.