SlideShare ist ein Scribd-Unternehmen logo
1 von 40
JavaScript
Speech
Recognition
Applicationsand maybe some other HTML5 goodness
Who is this guy?
core contributor
nutty about
IBM'er
@macdonst
macdonst on Github
simonmacdonald.com
PhoneGap
speech recognition
Why do I care about speech rec?
Here's a conversation between
two Cape Bretoners
P1: jeet?
P2: naw, jew?
P1: naw, t'rly t'eet bye.
And here's the translation
P1: jeet?
P1: Did you eat?
P2: naw, jew?
P2: No, did you?
P1: naw, t'rly t'eet bye.
P1: No, it's too early to eat buddy.
What is
speech
recognition?
Speech
recognition is
the process of
translating
the spoken
word into text.
The process of speech
rec includes
1. Record and digitize the audio data
2. Split data into phonemes
3. Apply the phonemes to the recognition model
4. Analyze the results against the grammar
5. Return a confidence weighted result
Basically...
So how do we
add speech
rec to our
app?
You may look at the W3C
Speech API Specification
but only Chrome on the
desktop has
implemented that spec
But that's okay!
The spec looks like this:
interface SpeechRecognition : EventTarget {
// recognition parameters
attribute SpeechGrammarList grammars;
attribute DOMString lang;
attribute boolean continuous;
attribute boolean interimResults;
attribute unsigned long maxAlternatives;
attribute DOMString serviceURI;
// methods to drive the speech interaction
void start();
void stop();
void abort();
};
With additional event
methods to control
behaviour:
attribute EventHandler onaudiostart;
attribute EventHandler onsoundstart;
attribute EventHandler onspeechstart;
attribute EventHandler onspeechend;
attribute EventHandler onsoundend;
attribute EventHandler onaudioend;
attribute EventHandler onresult;
attribute EventHandler onnomatch;
attribute EventHandler onerror;
attribute EventHandler onstart;
attribute EventHandler onend;
Let's recognize some
speech
Click to Speak
hello world
var recognition = new SpeechRecognition();
recognition.onresult = function(event) {
if (event.results.length > 0) {
var test1 = document.getElementById("test1");
test1.innerHTML = event.results[0][0].transcript;
}
};
recognition.start();
So that's
pretty cool...
...if taking
dictation gets
you going
But I want to
do something
more exciting
with the
result
Let's do something a little
less trivial
Click to Speak
recognition.onresult = function(event) {
var result = event.results[0][0].transcript;
var music = document.getElementById("music");
switch(result) {
case "jazz":
music.src="jazz.mp3";
music.play();
break;
case "rock":
music.src="rock.mp3";
music.play();
break;
case "stop":
default:
music.pause();
}
};
Which seems
much cooler
to me
Let's ask the web a
question
Click to Speak
Q: what day is it today
A: Friday July 19th, 2013
Works pretty
good...
...but ugly!
Let's style our
button with
some CSS
+
=
<a class="speechinput">
<img src="images/mic.png">
</a>
#speechinput input {
cursor:pointer;
margin:auto;
margin:15px;
color:transparent;
background-color:transparent;
border:5px;
width:15px;
-webkit-transform: scale(3.0, 3.0);
}
And we'll add some color
using
by Nicholas Gallagher
Speech
Bubbles
Pure-CSS-Speech-Bubbles
Then pull it all
together!
what is steve jobs middle name
Steven Paul Jobs
But wait, why
am I using my
eyes like a
sucker
We'll output the answer
using SpeechSynthesis
The SpeechSynthesis
spec looks like this:
interface SpeechSynthesis {
readonly attribute boolean pending;
readonly attribute boolean speaking;
readonly attribute boolean paused;
void speak(SpeechSynthesisUtterance utterance);
void cancel();
void pause();
void resume();
SpeechSynthesisVoiceList getVoices();
};
The
SpeechSynthesisUtterance
spec looks like this:
interface SpeechSynthesisUtterance : EventTarget {
attribute DOMString text;
attribute DOMString lang;
attribute DOMString voiceURI;
attribute float volume;
attribute float rate;
attribute float pitch;
};
With additional event
methods to control
behaviour:
attribute EventHandler onstart;
attribute EventHandler onend;
attribute EventHandler onerror;
attribute EventHandler onpause;
attribute EventHandler onresume;
attribute EventHandler onmark;
attribute EventHandler onboundary;
who won the stanley cup this year
Chicago Blackhawks
Plugin repo's
SpeechRecognitionPlugin -
https://github.com/macdonst/SpeechRecognitionPlugin
SpeechSynthesisPlugin -
https://github.com/macdonst/SpeechSynthesisPlugin
THE END

Weitere ähnliche Inhalte

Ähnlich wie PhoneGap Day US 2013 - Simon MacDonald: Speech Recognition

Voicecon - Mashups with Tropo.com
Voicecon - Mashups with Tropo.comVoicecon - Mashups with Tropo.com
Voicecon - Mashups with Tropo.com
Voxeo Corp
 
Device Emulation with OSGi and Flash
Device Emulation with OSGi and FlashDevice Emulation with OSGi and Flash
Device Emulation with OSGi and Flash
georgemesesan
 
Prophet - Beijing Perl Workshop
Prophet - Beijing Perl WorkshopProphet - Beijing Perl Workshop
Prophet - Beijing Perl Workshop
Jesse Vincent
 
Passing The Joel Test In The PHP World
Passing The Joel Test In The PHP WorldPassing The Joel Test In The PHP World
Passing The Joel Test In The PHP World
Lorna Mitchell
 
Programming For Designers V3
Programming For Designers V3Programming For Designers V3
Programming For Designers V3
sqoo
 
Hey man, can I get a clue?
Hey man, can I get a clue?Hey man, can I get a clue?
Hey man, can I get a clue?
Voxeo Corp
 

Ähnlich wie PhoneGap Day US 2013 - Simon MacDonald: Speech Recognition (20)

JavaScript Speech Recognition
JavaScript Speech RecognitionJavaScript Speech Recognition
JavaScript Speech Recognition
 
Voicecon - Mashups with Tropo.com
Voicecon - Mashups with Tropo.comVoicecon - Mashups with Tropo.com
Voicecon - Mashups with Tropo.com
 
Device Emulation with OSGi and Flash
Device Emulation with OSGi and FlashDevice Emulation with OSGi and Flash
Device Emulation with OSGi and Flash
 
Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...
Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...
Destruction, Decapods and Doughnuts: Continuous Delivery for Audio & Video Fa...
 
NDC 2011 - The FLUID Principles
NDC 2011 - The FLUID PrinciplesNDC 2011 - The FLUID Principles
NDC 2011 - The FLUID Principles
 
通往測試最高殿堂的旅程 - GTAC 2016
通往測試最高殿堂的旅程 - GTAC 2016通往測試最高殿堂的旅程 - GTAC 2016
通往測試最高殿堂的旅程 - GTAC 2016
 
What lies beneath
What lies beneathWhat lies beneath
What lies beneath
 
Prophet - Beijing Perl Workshop
Prophet - Beijing Perl WorkshopProphet - Beijing Perl Workshop
Prophet - Beijing Perl Workshop
 
"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller
"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller
"Probably, Maybe, No: The State of HTML5 Audio" - Scott Schiller
 
Passing The Joel Test In The PHP World
Passing The Joel Test In The PHP WorldPassing The Joel Test In The PHP World
Passing The Joel Test In The PHP World
 
mri-bp2015
mri-bp2015mri-bp2015
mri-bp2015
 
Spring, CDI, Jakarta EE good parts
Spring, CDI, Jakarta EE good partsSpring, CDI, Jakarta EE good parts
Spring, CDI, Jakarta EE good parts
 
Design and Evolution of cyber-dojo
Design and Evolution of cyber-dojoDesign and Evolution of cyber-dojo
Design and Evolution of cyber-dojo
 
London Web Performance Meetup: Performance for mortal companies
London Web Performance Meetup: Performance for mortal companiesLondon Web Performance Meetup: Performance for mortal companies
London Web Performance Meetup: Performance for mortal companies
 
Programming For Designers V3
Programming For Designers V3Programming For Designers V3
Programming For Designers V3
 
Otto AI
Otto AIOtto AI
Otto AI
 
DevDay.lk - Bare Knuckle Web Development
DevDay.lk - Bare Knuckle Web DevelopmentDevDay.lk - Bare Knuckle Web Development
DevDay.lk - Bare Knuckle Web Development
 
Hey man, can I get a clue?
Hey man, can I get a clue?Hey man, can I get a clue?
Hey man, can I get a clue?
 
C 2
C 2C 2
C 2
 
BSides LA/PDX
BSides LA/PDXBSides LA/PDX
BSides LA/PDX
 

Kürzlich hochgeladen

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

PhoneGap Day US 2013 - Simon MacDonald: Speech Recognition