Athens University of Economics
Communicating with the PC
• Traditional ways
  • Keyboard
  • Mouse
  • Printer
• Modern ways
  • Touch
  • Speech
  • Movement
Speech
• Speech Synthesis
• Speech Recognition
Speech Synthesis
• Input: Text
• Output: Audio stream
Speech Recognition
• Input: Audio stream
• Output: Text
Used in
• Movies
• Automatic translation
• Learning foreign languages
• Mobile phones
• Robotics
• Games
  • Nintendo Wii
  • Project Natal (Kinect)
What options do we have today?
• Acapela
• Java Speech API
• Dictaphones
• etc.
• Still a long way to go…
What we will see here
• Windows Speech API (SAPI) with .NET 4.0!
  • System.Speech
Why SAPI?
• Free
• Quite accurate
• Easily programmable
History of SAPI
• 1995: SAPI 1.0
  • Windows 95 / Windows NT
• 1998: SAPI 4.0
  • C++ wrapper classes
  • ActiveX for Visual Basic
• 2006: SAPI 5.3
  • Windows Vista
• 2009: SAPI 5.4
  • Windows 7
Changes in Windows Vista & 7
• Upgraded Speech Recognition engine
• Separate application with its own GUI
• Controls the operation of the UI
• Supports more languages:
  • English (US & UK), Chinese (Traditional & Simplified), Japanese, German, French, Spanish
• Managed-code speech API (.NET 3.0)
What we use
Technologies
• .NET Framework 4.0
• C# programming language
• Windows Presentation Foundation
Tools
• Windows 7
• Visual Studio 2010
• FREE @ MSDNAA
Windows Speech Synthesis
• Converts text into speech
• Settings such as:
  • Volume
  • Pronunciation (voice)
  • Inserting WAV files
• By default, uses Microsoft Anna
DEMO 1
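The demo code itself is not part of this transcript. As a rough idea of what speech synthesis with System.Speech can look like, here is a minimal sketch, assuming a console project that references the System.Speech assembly on Windows. The class name SpeakDemo and the spoken sentence are made up for illustration; the Volume and Rate values are arbitrary.

```csharp
using System;
using System.Speech.Synthesis;

// Hypothetical console demo; not the code shown in the talk.
class SpeakDemo
{
    static void Main()
    {
        using (var synth = new SpeechSynthesizer())
        {
            // Send the audio stream to the default speakers.
            synth.SetOutputToDefaultAudioDevice();

            synth.Volume = 80; // valid range: 0 to 100
            synth.Rate = 0;    // valid range: -10 (slow) to 10 (fast)

            // Print the voices installed on this machine
            // (on Windows Vista / 7 this typically includes Microsoft Anna).
            foreach (InstalledVoice voice in synth.GetInstalledVoices())
            {
                Console.WriteLine(voice.VoiceInfo.Name);
            }

            // Input: text. Output: audio stream.
            synth.Speak("Hello from the Windows Speech API.");
        }
    }
}
```

Speak blocks until the sentence has been played; SpeakAsync is the non-blocking variant, which is usually the better fit for a WPF window.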
Windows Speech Recognition
• Uses machine learning algorithms
• Continuously trained
• Trains on the user's voice
• Can be used for remote control of the PC
DEMO 2
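The demo code itself is not part of this transcript. The following is a minimal sketch of free dictation with System.Speech, assuming a console project that references the System.Speech assembly, an installed Windows speech recognizer, and a working microphone. The class name DictationDemo is made up for illustration.

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical console demo; not the code shown in the talk.
class DictationDemo
{
    static void Main()
    {
        // Uses the default system recognizer; the constructor throws
        // if no recognizer is installed for the system language.
        using (var recognizer = new SpeechRecognitionEngine())
        {
            // Free dictation; a custom Grammar could be loaded here instead.
            recognizer.LoadGrammar(new DictationGrammar());
            recognizer.SetInputToDefaultAudioDevice();

            // Input: audio stream. Output: text.
            recognizer.SpeechRecognized += (sender, e) =>
                Console.WriteLine("{0} (confidence {1:F2})",
                                  e.Result.Text, e.Result.Confidence);

            // Keep recognizing phrases until explicitly stopped.
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Speak now; press Enter to stop.");
            Console.ReadLine();
            recognizer.RecognizeAsyncStop();
        }
    }
}
```

For remote control of the PC, a small command grammar (built from a Choices list and a GrammarBuilder) tends to be more reliable than free dictation, because the recognizer only has to choose among a few known phrases.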
Links
• Venus
• StudentGuru
• Exploring Speech Recognition & Synthesis
• Speech Recognition with C# - Dictation and custom grammar
Thank you
Vangos Pterneas
www.vangos.eu
www.vangos.eu/blog

Text to-speech & voice recognition