Oliver Scheer
Senior Technical Evangelist
Microsoft Deutschland
http://the-oliver.com
Using Speech
Topics
• Speech on Windows Phone 8
• Speech synthesis
• Controlling applications using speech
• Voice command definition files
• Building conversations
• Selecting application entry points
• Simple speech input
• Speech input and grammars
• Using Grammar Lists
Speech on Windows Phone 8
Windows Phone Speech Support
• Windows Phone 7.x had voice support built into the operating system
• Programs and phone features could be started by voice commands, e.g. “Start MyApp”
• Incoming SMS messages could be read to the user
• The user could compose and send SMS messages
• Windows Phone 8 builds on this to allow applications to make use of speech
• Applications can speak messages using the Speech Synthesis feature
• Applications can be started and given commands
• Applications can accept commands using voice input
• Speech recognition requires an internet connection, but Speech Synthesis does not
Speech Synthesis
Enabling Speech Synthesis
• If an application wishes to use speech output, the ID_CAP_SPEECH_RECOGNITION capability must be enabled in WMAppManifest.xml
• The application can also reference the Synthesis namespace
3/19/2014
using Windows.Phone.Speech.Synthesis;
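As a rough sketch of the manifest change described above, the capability entry in WMAppManifest.xml might look like the fragment below. Only the ID_CAP_SPEECH_RECOGNITION name comes from the slide; the rest of a real manifest (other capabilities, app details) is omitted here.

```xml
<!-- Fragment of WMAppManifest.xml; all other manifest content is omitted. -->
<Capabilities>
  <!-- The capability named on the slide as required for speech output. -->
  <Capability Name="ID_CAP_SPEECH_RECOGNITION" />
</Capabilities>
```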
Simple Speech
• The SpeechSynthesizer class provides a simple way to produce speech
• The SpeakTextAsync method speaks the content of the string using the default voice
• Note that the method is asynchronous, so the calling method must use the async modifier
• Speech output does not require a network connection
async void CheeseLiker()
{
    SpeechSynthesizer synth = new SpeechSynthesizer();
    await synth.SpeakTextAsync("I like cheese.");
}
Selecting a language
• The default speaking voice is selected automatically from the locale set for the phone
• The InstalledVoices class provides a list of all the voices available on the phone
• The code below selects a French voice
// Query for a voice that speaks French (the query syntax needs System.Linq).
var frenchVoices = from voice in InstalledVoices.All
                   where voice.Language == "fr-FR"
                   select voice;
// Set the voice as identified by the query.
synth.SetVoice(frenchVoices.ElementAt(0));
Demo 1: Voice Selection
Speech Synthesis Markup Language
• You can use Speech Synthesis Markup Language (SSML) to control the spoken output
• Change the voice, pitch, rate, volume, pronunciation and other characteristics
• Also allows the inclusion of audio files into the spoken output
• You can also use the Speech synthesizer to speak the contents of a file
<?xml version="1.0" encoding="ISO-8859-1"?>
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <p> Your <say-as interpret-as="ordinal">1st</say-as> request was for
    <say-as interpret-as="cardinal">1</say-as> room on
    <say-as interpret-as="date" format="mdy">10/19/2010</say-as> ,
    arriving at <say-as interpret-as="time" format="hms12">12:35pm</say-as>.
  </p>
</speak>
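A minimal sketch of how SSML like the example above might be handed to the synthesizer. It assumes the SpeakSsmlAsync and SpeakSsmlFromUriAsync methods of SpeechSynthesizer; the class name, method name and the Booking.ssml file name are hypothetical and only serve the illustration.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Phone.Speech.Synthesis;

public static class SsmlDemo // hypothetical helper class, not part of the deck
{
    public static async Task SpeakBookingAsync()
    {
        // A shortened variant of the SSML document shown above,
        // held in a verbatim string (quotes are doubled).
        string ssml =
            @"<speak version=""1.0""
                     xmlns=""http://www.w3.org/2001/10/synthesis"" xml:lang=""en-US"">
                Your <say-as interpret-as=""ordinal"">1st</say-as> request was for
                <say-as interpret-as=""cardinal"">1</say-as> room.
              </speak>";

        var synth = new SpeechSynthesizer();

        // Speak SSML held in a string.
        await synth.SpeakSsmlAsync(ssml);

        // Speak SSML stored in a file that ships with the app
        // (Booking.ssml is an assumed file name, added to the project as Content).
        await synth.SpeakSsmlFromUriAsync(
            new Uri("ms-appx:///Booking.ssml", UriKind.RelativeOrAbsolute));
    }
}
```

Keeping the markup in a separate file is usually easier to maintain than an escaped string once the SSML grows beyond a few lines.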
Controlling Applications using Voice Commands
Application Launching using Voice command
• The Voice Command feature of Windows Phone 7 allowed users to start applications
• In Windows Phone 8 the feature has been expanded to allow the user to request data from
the application in the start command
• The data will allow a particular application page to be selected when the program starts
and can also pass request information to that page
• To start using Voice Commands you must create a Voice Command Definition (VCD) file that defines all the spoken commands
• The application then calls a method to register the words and phrases the first time it is run
The Fortune Teller Program
• The Fortune Teller program will tell your future
• You can ask it questions and it will display replies
• It could also speak them
• Some of the spoken commands activate different pages of the application and others are processed by the application when it starts running
The Voice Command Definition (VCD) file
• This is the “money” question: “Fortune Teller Will I find money”
<CommandPrefix> Fortune Teller </CommandPrefix>
<Example> Will I find money </Example>
<Command Name="showMoney">
  <Example> Will I find money </Example>
  <ListenFor> [Will I find] {futureMoney} </ListenFor>
  <Feedback> Showing {futureMoney} </Feedback>
  <Navigate Target="/money.xaml"/>
</Command>
<PhraseList Label="futureMoney">
  <Item> money </Item>
  <Item> riches </Item>
  <Item> gold </Item>
</PhraseList>
• <CommandPrefix> is the phrase the user says to trigger the command. All of the Fortune Teller commands start with this phrase
• The first <Example> is example text that will be displayed by the help for this app as an example of the commands the app supports
• The Name attribute of <Command> is the command name. This can be obtained from the URL by the application when it starts
• The <Example> inside <Command> is the example for this specific command
• <ListenFor> is the trigger phrase for this command. It can be a sequence of words. The user must prefix this sequence with the words “Fortune Teller”
• {futureMoney} refers to the phraselist for the command. The user can say any of the words in the phraselist to match this command. The application can determine the phrase used. The phraselist can be changed by the application dynamically
• <Feedback> is the spoken feedback from the command. The feedback will insert the phrase item used to activate the command
• The Target attribute of <Navigate> is the URL for the page to be activated by the command. Commands can go to different pages, or all go to MainPage.xaml if required
• The <PhraseList> items are the phrases that can be used at the end of the command. The application can modify the phrase list of a command dynamically. It could give movie times for films by name
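The fragment shown on the slides leaves out the elements that wrap it in a complete VCD file. A minimal sketch of the whole file, assuming the standard VoiceCommands root element and a single CommandSet, could look like this. The Name value "en-US" is an assumption, chosen to match the key used later with InstalledCommandSets; everything inside the CommandSet is taken from the slides.

```xml
<?xml version="1.0" encoding="utf-8"?>
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.0">
  <!-- One CommandSet per language; the Name value is assumed for this sketch. -->
  <CommandSet xml:lang="en-US" Name="en-US">
    <CommandPrefix> Fortune Teller </CommandPrefix>
    <Example> Will I find money </Example>

    <Command Name="showMoney">
      <Example> Will I find money </Example>
      <ListenFor> [Will I find] {futureMoney} </ListenFor>
      <Feedback> Showing {futureMoney} </Feedback>
      <Navigate Target="/money.xaml"/>
    </Command>

    <PhraseList Label="futureMoney">
      <Item> money </Item>
      <Item> riches </Item>
      <Item> gold </Item>
    </PhraseList>
  </CommandSet>
</VoiceCommands>
```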
Installing a Voice Command Definition (VCD) file
• The VCD file can be loaded from the application or from any URI
• In this case it is just a file that has been added to the project and marked as Content
• The VCD can also be changed by the application when it is running
• The voice commands for an application are loaded into the voice command service when the application runs
• The application must run at least once to configure the voice commands
async void setupVoiceCommands()
{
    await VoiceCommandService.InstallCommandSetsFromFileAsync(
        new Uri("ms-appx:///VCDCommands.xml", UriKind.RelativeOrAbsolute));
}
Launching Your App With a Voice Command
• If the user now presses and holds the Windows button and says “Fortune Teller, Will I find gold?”, the phone displays “Showing gold”
• It then launches your app and navigates to the page associated with this command, which is /money.xaml
• The query string passed to the page looks like this:
"/?voiceCommandName=showMoney&futureMoney=gold&reco=Fortune%20Teller%20Will%20I%20find%20gold"
• voiceCommandName holds the command name
• futureMoney is the phrase list name, and its value is the recognized phrase
• reco holds the whole phrase as it was recognized
Handling Voice Commands
• This code runs in the OnNavigatedTo method of a target page
• Can also check for the voice command phrase that was used
if (e.NavigationMode == System.Windows.Navigation.NavigationMode.New) {
    // Only act when the page was reached through a voice command.
    if (NavigationContext.QueryString.ContainsKey("voiceCommandName")) {
        string command = NavigationContext.QueryString["voiceCommandName"];
        switch (command) {
            case "tellJoke":
                messageTextBlock.Text = "Insert really funny joke here";
                break;
            // Add cases for other commands.
            default:
                messageTextBlock.Text = "Sorry, what you said makes no sense.";
                break;
        }
    }
}
Identifying phrases
• The navigation context can be queried to determine the phrase used to trigger the navigation
• In this case the program reads which phrase from the futureMoney list (for example “riches”) was used in the question
<PhraseList Label="futureMoney">
<Item> money </Item>
<Item> riches </Item>
<Item> gold </Item>
</PhraseList>
string moneyPhrase = NavigationContext.QueryString["futureMoney"];
Demo 2: Fortune Teller
Modifying the phrase list
• An application can modify a phrase list when it is running
• It cannot add new commands, however
• This would allow a program to implement behaviours such as:
“Movie Planner tell me showings for Batman”
// Look up the installed command set by its key ("en-US" here), then replace its phrase list.
VoiceCommandSet fortuneVcs = VoiceCommandService.InstalledCommandSets["en-US"];
await fortuneVcs.UpdatePhraseListAsync("futureMoney",
    new string[] { "money", "cash", "wonga", "spondoolicks" });
Simple Speech Input
Recognizing Free Speech
• A Windows Phone application can recognise words and phrases and pass them to your program
• From my experiments it seems quite reliable
• Note that a network connection is required for this feature
• Your application can just use the speech string directly
• The standard “Listening” interface is displayed over your application
Simple Speech Recognition
• The method below checks for a successful response
• By default the system uses the language settings on the Phone
SpeechRecognizerUI recoWithUI;
async private void ListenButton_Click(object sender, RoutedEventArgs e)
{
    this.recoWithUI = new SpeechRecognizerUI();
    SpeechRecognitionUIResult recoResult =
        await recoWithUI.RecognizeWithUIAsync();
    // Only use the text if recognition completed successfully.
    if (recoResult.ResultStatus == SpeechRecognitionUIStatus.Succeeded)
    {
        MessageBox.Show(string.Format("You said {0}.",
            recoResult.RecognitionResult.Text));
    }
}
Customizing Speech Recognition
• InitialSilenceTimeout: the time that the speech recognizer will wait until it hears speech. The default setting is 5 seconds.
• BabbleTimeout: the time that the speech recognizer will listen while it hears background noise. The default setting is 0 seconds (the feature is not activated).
• EndSilenceTimeout: the time interval during which the speech recognizer will wait before finalizing the recognition operation. The default setting is 150 milliseconds.
Customizing Speech Recognition
• A program can also select whether or not the speech recognition echoes back the user input and displays it in a message box
• The code below also sets timeout values
recoWithUI.Settings.ReadoutEnabled = false; // don't read the saying back
recoWithUI.Settings.ShowConfirmation = false; // don't show the confirmation
recoWithUI.Recognizer.Settings.InitialSilenceTimeout = TimeSpan.FromSeconds(6.0);
recoWithUI.Recognizer.Settings.BabbleTimeout = TimeSpan.FromSeconds(4.0);
recoWithUI.Recognizer.Settings.EndSilenceTimeout = TimeSpan.FromSeconds(1.2);
Handling Errors
• An application can bind to events which indicate problems with the audio input
• There is also an event fired when the state of the capture changes
recoWithUI.Recognizer.AudioProblemOccurred += Recognizer_AudioProblemOccurred;
recoWithUI.Recognizer.AudioCaptureStateChanged +=
    Recognizer_AudioCaptureStateChanged;
...
void Recognizer_AudioProblemOccurred(SpeechRecognizer sender,
    SpeechAudioProblemOccurredEventArgs args)
{
    MessageBox.Show("Please speak more clearly");
}
Using Grammars
Grammars and Speech input
• The simple speech recognition we have seen so far uses the “Short Dictation” grammar which just captures the text and returns it to the application
• You can add your own grammars that will structure the conversation between the user and the application
• Grammars can be created using the Speech Recognition Grammar Specification (SRGS) Version 1.0 and stored as XML files loaded when the application runs
• This is a little complex, but worth the effort if you want to create applications with rich language interaction with the user
• If the application just needs to identify particular commands you can use a grammar list to achieve this
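To give a feel for what an SRGS grammar file looks like, here is a small sketch. The rule name cheeseOrder and the phrases are made up for illustration; only the SRGS 1.0 element names and namespace are standard. It accepts an optional “I would like”, then one strength word, then “cheese”.

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="cheeseOrder"
         xmlns="http://www.w3.org/2001/06/grammar">
  <!-- Hypothetical rule: "[I would like] weak|mild|medium|strong cheese" -->
  <rule id="cheeseOrder" scope="public">
    <item repeat="0-1">I would like</item>
    <one-of>
      <item>weak</item>
      <item>mild</item>
      <item>medium</item>
      <item>strong</item>
    </one-of>
    <item>cheese</item>
  </rule>
</grammar>
```

A file like this, added to the project as Content, could then be loaded with the AddGrammarFromUri method of the recognizer's Grammars collection.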
Using Grammar Lists
• To create a Grammar List an application defines an array of strings that form the words in the list
• The Grammar can then be added to the recognizer and given a name
• Multiple grammar lists can be added to a recognizer
• The recognizer will now resolve any of the words in the lists that have been supplied
string[] strengthNames = { "weak", "mild", "medium", "strong", "english" };
recoWithUI.Recognizer.Grammars.AddGrammarFromList("cheeseStrength",
    strengthNames);
Enabling and Disabling Grammar Lists
• An application can enable or disable particular grammars before a recognition action
• It is also possible to set relative weightings of grammar lists
• The text displayed as part of the listen operation can also be set, as shown below
recoWithUI.Settings.ListenText = "How strong do you like your cheese?";
recoWithUI.Recognizer.Grammars["cheeseStrength"].Enabled = true;
SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync();
Determining the confidence in the result
• An application can determine the confidence that the speech system has in the result that was obtained
• Result values are High, Medium, Low, Rejected
SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync();
if (recoResult.RecognitionResult.TextConfidence ==
    SpeechRecognitionConfidence.High)
{
    // select cheese based on strength value
}
Matching Multiple Grammars
• If the spoken input matches multiple grammars a program can obtain a list of the alternative results using recoResult.RecognitionResult.GetAlternates
• The list is supplied in order of confidence
• The application can then determine the best fit from the context of the voice request
• This list is also provided if the request used a more complex grammar
var alternatives = recoResult.RecognitionResult.GetAlternates(3);
Profanity
• Words that are recognised as profanities are not displayed in the response from a recognizer command
• The speech system will also not repeat them
• They are enclosed in <Profanity> </Profanity> when supplied to the program that receives the speech data
Review
• Applications in Windows Phone 8 can use speech generation and recognition to interact with users
• Applications can produce speech output from text files which can be marked up with Speech Synthesis Markup Language (SSML) to include sound files
• Applications can be started and provided with initial commands by registering a Voice Command Definition File with the Windows Phone
• The commands can be picked up when a page is loaded, or the commands specify a particular page to load
• An application can modify the phrase part of a command to change the activation commands
• Applications can recognise speech using complex grammars or simple word lists
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
© 2012 Microsoft Corporation.
All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 

Kürzlich hochgeladen (20)

Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 
Français Patch Tuesday - Avril
Français Patch Tuesday - AvrilFrançais Patch Tuesday - Avril
Français Patch Tuesday - Avril
 
QMMS Lesson 2 - Using MS Excel Formula.pdf
QMMS Lesson 2 - Using MS Excel Formula.pdfQMMS Lesson 2 - Using MS Excel Formula.pdf
QMMS Lesson 2 - Using MS Excel Formula.pdf
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 

Windows Phone 8 - 14 Using Speech

  • 1. Oliver Scheer Senior Technical Evangelist Microsoft Deutschland http://the-oliver.com Using Speech
  • 2. Topics
      • Speech on Windows Phone 8
      • Speech synthesis
      • Controlling applications using speech
      • Voice command definition files
      • Building conversations
      • Selecting application entry points
      • Simple speech input
      • Speech input and grammars
      • Using Grammar Lists
  • 4. Windows Phone Speech Support
      • Windows Phone 7.x had voice support built into the operating system
          • Programs and phone features could be started by voice commands, e.g. “Start MyApp”
          • Incoming SMS messages could be read to the user
          • The user could compose and send SMS messages
      • Windows Phone 8 builds on this to allow applications to make use of speech
          • Applications can speak messages using the Speech Synthesis feature
          • Applications can be started and given commands
          • Applications can accept commands using voice input
      • Speech recognition requires an internet connection, but Speech Synthesis does not
  • 6. Enabling Speech Synthesis
      • If an application wishes to use speech output, the ID_CAP_SPEECH_RECOGNITION capability must be enabled in WMAppManifest.xml
      • The application can also reference the Synthesis namespace

        using Windows.Phone.Speech.Synthesis;

  • 7. Simple Speech
      • The SpeechSynthesizer class provides a simple way to produce speech
      • The SpeakTextAsync method speaks the content of the string using the default voice
      • Note that the method is an asynchronous one, so the calling method must use the async modifier
      • Speech output does not require a network connection

        async void CheeseLiker()
        {
            SpeechSynthesizer synth = new SpeechSynthesizer();
            await synth.SpeakTextAsync("I like cheese.");
        }
  • 8. Selecting a language
      • The default speaking voice is selected automatically from the locale set for the phone
      • The InstalledVoices class provides a list of all the voices available on the phone
      • The code below selects a French voice

        // Query for a voice that speaks French.
        var frenchVoices = from voice in InstalledVoices.All
                           where voice.Language == "fr-FR"
                           select voice;

        // Set the voice as identified by the query.
        synth.SetVoice(frenchVoices.ElementAt(0));
  • 9. Demo 1: Voice Selection
  • 10. Speech Synthesis Markup Language
      • You can use Speech Synthesis Markup Language (SSML) to control the spoken output
          • Change the voice, pitch, rate, volume, pronunciation and other characteristics
          • Also allows the inclusion of audio files into the spoken output
      • You can also use the speech synthesizer to speak the contents of a file

        <?xml version="1.0" encoding="ISO-8859-1"?>
        <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
          <p>
            Your <say-as interpret-as="ordinal">1st</say-as> request was for
            <say-as interpret-as="cardinal">1</say-as> room on
            <say-as interpret-as="date" format="mdy">10/19/2010</say-as>,
            arriving at <say-as interpret-as="time" format="hms12">12:35pm</say-as>.
          </p>
        </speak>
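SSML like the sample on slide 10 can also be assembled in code rather than typed by hand. The following is a minimal sketch, assuming plain .NET with System.Xml.Linq; the class name SsmlSketch and the method BuildBookingSsml are hypothetical names used only for illustration. On the phone, the resulting string would presumably be handed to the synthesizer's SSML speaking method (SpeakSsmlAsync in the Windows Phone 8 SDK, which the slide itself does not show), so treat that step as an assumption.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper, not part of the Windows Phone SDK.
static class SsmlSketch
{
    // SSML 1.0 namespace, as used in the slide's sample document.
    static readonly XNamespace Ssml = "http://www.w3.org/2001/10/synthesis";

    // Builds a one-paragraph SSML document that reads out a room count and a date.
    public static string BuildBookingSsml(int rooms, string date)
    {
        var speak = new XElement(Ssml + "speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", "en-US"),
            new XElement(Ssml + "p",
                "Your request was for ",
                new XElement(Ssml + "say-as",
                    new XAttribute("interpret-as", "cardinal"), rooms),
                " room on ",
                new XElement(Ssml + "say-as",
                    new XAttribute("interpret-as", "date"),
                    new XAttribute("format", "mdy"), date),
                "."));

        // Serialize on a single line so the text is not altered by indentation.
        return speak.ToString(SaveOptions.DisableFormatting);
    }

    static void Main()
    {
        Console.WriteLine(BuildBookingSsml(1, "10/19/2010"));
    }
}
```

Building the markup with XElement rather than string concatenation means attribute values and text are escaped automatically, which matters once the spoken text comes from user data.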
  • 12. Application Launching using Voice command
      • The Voice Command feature of Windows Phone 7 allowed users to start applications
      • In Windows Phone 8 the feature has been expanded to allow the user to request data from the application in the start command
      • The data will allow a particular application page to be selected when the program starts and can also pass request information to that page
      • To start using Voice Commands you must create a Voice Command Definition (VCD) file that defines all the spoken commands
      • The application then calls a method to register the words and phrases the first time it is run
  • 13. The Fortune Teller Program
      • The Fortune Teller program will tell your future
      • You can ask it questions and it will display replies
      • It could also speak them
      • Some of the spoken commands activate different pages of the application and others are processed by the application when it starts running
  • 14. The Voice Command Definition (VCD) file
      • This is the “money” question: “Fortune Teller Will I find money”

        <CommandPrefix> Fortune Teller </CommandPrefix>
        <Example> Will I find money </Example>
        <Command Name="showMoney">
          <Example> Will I find money </Example>
          <ListenFor> [Will I find] {futureMoney} </ListenFor>
          <Feedback> Showing {futureMoney} </Feedback>
          <Navigate Target="/money.xaml"/>
        </Command>
        <PhraseList Label="futureMoney">
          <Item> money </Item>
          <Item> riches </Item>
          <Item> gold </Item>
        </PhraseList>

  • 15. The VCD file: the CommandPrefix element (slides 15 to 23 repeat the listing from slide 14)
      • This is the phrase the user says to trigger the command
      • All of the Fortune Teller commands start with this phrase
  • 16. The VCD file: the top-level Example element
      • This is example text that will be displayed by the help for this app as an example of the commands the app supports
  • 17. The VCD file: the Name attribute of the Command element
      • This is the command name
      • This can be obtained from the URL by the application when it starts
  • 18. The VCD file: the Example element inside the Command
      • This is the example for this specific command
  • 19. The VCD file: the ListenFor element
      • This is the trigger phrase for this command
      • It can be a sequence of words
      • The user must prefix this sequence with the words “Fortune Teller”
  • 20. The VCD file: the {futureMoney} reference in ListenFor
      • This is the phraselist for the command
      • The user can say any of the words in the phraselist to match this command
      • The application can determine the phrase used
      • The phraselist can be changed by the application dynamically
  • 21. The VCD file: the Feedback element
      • This is the spoken feedback from the command
      • The feedback will insert the phrase item used to activate the command
  • 22. The VCD file: the Navigate element
      • This is the URL for the page to be activated by the command
      • Commands can go to different pages, or all go to MainPage.xaml if required
  • 23. The VCD file: the PhraseList element
      • These are the phrases that can be used at the end of the command
      • The application can modify the phrase list of a command dynamically
      • It could give movie times for films by name
  • 24. Installing a Voice Command Definition (VCD) file
      • The VCD file can be loaded from the application or from any URI
      • In this case it is just a file that has been added to the project and marked as Content
      • The VCD can also be changed by the application when it is running
      • The voice commands for an application are loaded into the voice command service when the application runs
      • The application must run at least once to configure the voice commands

        async void setupVoiceCommands()
        {
            await VoiceCommandService.InstallCommandSetsFromFileAsync(
                new Uri("ms-appx:///VCDCommands.xml", UriKind.RelativeOrAbsolute));
        }
  • 25. Launching Your App With a Voice Command
      • If the user now presses and holds the Windows button and says “Fortune Teller, Will I find gold?”, the phone displays “Showing gold”
      • It then launches your app and navigates to the page associated with this command, which is /money.xaml
      • The query string passed to the page looks like this:

        "/?voiceCommandName=showMoney&futureMoney=gold&reco=Fortune%20Teller%20Will%20I%20find%20gold"

      • voiceCommandName carries the command name, futureMoney is the phrase list name, gold is the recognized phrase, and reco holds the whole phrase as it was recognized
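To make the structure of that launch query string concrete, here is a small sketch that splits a string of the shape shown on slide 25 into its decoded parts. It assumes plain .NET only; the class VoiceQuerySketch and its Parse method are hypothetical and stand in for what NavigationContext.QueryString already provides inside a phone page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for what NavigationContext.QueryString exposes on the phone.
static class VoiceQuerySketch
{
    // Splits "/?a=1&b=2" into percent-decoded key/value pairs.
    public static Dictionary<string, string> Parse(string uri)
    {
        var result = new Dictionary<string, string>();
        int q = uri.IndexOf('?');
        if (q < 0) return result; // no query part at all

        foreach (string pair in uri.Substring(q + 1).Split('&'))
        {
            if (pair.Length == 0) continue;
            int eq = pair.IndexOf('=');
            string key = eq < 0 ? pair : pair.Substring(0, eq);
            string value = eq < 0 ? "" : pair.Substring(eq + 1);
            result[Uri.UnescapeDataString(key)] = Uri.UnescapeDataString(value);
        }
        return result;
    }

    static void Main()
    {
        var query = Parse("/?voiceCommandName=showMoney&futureMoney=gold" +
                          "&reco=Fortune%20Teller%20Will%20I%20find%20gold");

        Console.WriteLine(query["voiceCommandName"]); // prints "showMoney"
        Console.WriteLine(query["futureMoney"]);      // prints "gold"
        Console.WriteLine(query["reco"]);             // prints "Fortune Teller Will I find gold"
    }
}
```

In a real page there is no need for this parsing, since the navigation context exposes the same keys directly; the sketch only shows which value ends up under which key.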
  • 26. Handling Voice Commands
      • This code runs in the OnNavigatedTo method of a target page
      • Can also check for the voice command phrase that was used

        // Only act on a fresh navigation, not when returning to the page.
        if (e.NavigationMode == System.Windows.Navigation.NavigationMode.New)
        {
            if (NavigationContext.QueryString.ContainsKey("voiceCommandName"))
            {
                string command = NavigationContext.QueryString["voiceCommandName"];
                switch (command)
                {
                    case "tellJoke":
                        messageTextBlock.Text = "Insert really funny joke here";
                        break;
                    // Add cases for other commands.
                    default:
                        messageTextBlock.Text = "Sorry, what you said makes no sense.";
                        break;
                }
            }
        }
  • 27. Identifying phrases
      • The navigation context can be queried to determine the phrase used to trigger the navigation
      • In this case the program finds out which item of the futureMoney phrase list was spoken, for example “riches”

        <PhraseList Label="futureMoney">
          <Item> money </Item>
          <Item> riches </Item>
          <Item> gold </Item>
        </PhraseList>

        string moneyPhrase = NavigationContext.QueryString["futureMoney"];
  • 29. Modifying the phrase list
      • An application can modify a phrase list when it is running
          • It cannot add new commands however
      • This would allow a program to implement behaviours such as: “Movie Planner tell me showings for Batman”

        VoiceCommandSet fortuneVcs = VoiceCommandService.InstalledCommandSets["en-US"];
        await fortuneVcs.UpdatePhraseListAsync("futureMoney",
            new string[] { "money", "cash", "wonga", "spondoolicks" });

  • 31. Recognizing Free Speech
      • A Windows Phone application can recognise words and phrases and pass them to your program
      • From my experiments it seems quite reliable
      • Note that a network connection is required for this feature
      • Your application can just use the speech string directly
      • The standard “Listening” interface is displayed over your application
  • 32. Simple Speech Recognition
      • The method below checks for a successful response
      • By default the system uses the language settings on the phone

        SpeechRecognizerUI recoWithUI;

        async private void ListenButton_Click(object sender, RoutedEventArgs e)
        {
            this.recoWithUI = new SpeechRecognizerUI();
            SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync();
            if (recoResult.ResultStatus == SpeechRecognitionUIStatus.Succeeded)
                MessageBox.Show(string.Format("You said {0}.", recoResult.RecognitionResult.Text));
        }
  • 33. Customizing Speech Recognition
      • InitialSilenceTimeout
          • The time that the speech recognizer will wait until it hears speech
          • The default setting is 5 seconds
      • BabbleTimeout
          • The time that the speech recognizer will listen while it hears background noise
          • The default setting is 0 seconds (the feature is not activated)
      • EndSilenceTimeout
          • The time interval during which the speech recognizer will wait before finalizing the recognition operation
          • The default setting is 150 milliseconds
  • 34. Customizing Speech Recognition
      • A program can also select whether or not the speech recognition echoes back the user input and displays it in a message box
      • The code below also sets timeout values

        recoWithUI.Settings.ReadoutEnabled = false;   // don't read the saying back
        recoWithUI.Settings.ShowConfirmation = false; // don't show the confirmation
        recoWithUI.Recognizer.Settings.InitialSilenceTimeout = TimeSpan.FromSeconds(6.0);
        recoWithUI.Recognizer.Settings.BabbleTimeout = TimeSpan.FromSeconds(4.0);
        recoWithUI.Recognizer.Settings.EndSilenceTimeout = TimeSpan.FromSeconds(1.2);
  • 35. Handling Errors
      • An application can bind to events which indicate problems with the audio input
      • There is also an event fired when the state of the capture changes

        recoWithUI.Recognizer.AudioProblemOccurred += Recognizer_AudioProblemOccurred;
        recoWithUI.Recognizer.AudioCaptureStateChanged += Recognizer_AudioCaptureStateChanged;
        ...
        void Recognizer_AudioProblemOccurred(SpeechRecognizer sender,
            SpeechAudioProblemOccurredEventArgs args)
        {
            MessageBox.Show("Please speak more clearly");
        }
  • 37. Grammars and Speech input
      • The simple speech recognition we have seen so far uses the “Short Dictation” grammar which just captures the text and returns it to the application
      • You can add your own grammars that will structure the conversation between the user and the application
      • Grammars can be created using the Speech Recognition Grammar Specification (SRGS) Version 1.0 and stored as XML files loaded when the application runs
          • This is a little complex, but worth the effort if you want to create applications with rich language interaction with the user
      • If the application just needs to identify particular commands you can use a grammar list to achieve this
  • 38. Using Grammar Lists
      • To create a Grammar List an application defines an array of strings that form the words in the list
      • The grammar can then be added to the recognizer and given a name
      • Multiple grammar lists can be added to a recognizer
      • The recognizer will now resolve any of the words in the lists that have been supplied

        string[] strengthNames = { "weak", "mild", "medium", "strong", "english" };
        recoWithUI.Recognizer.Grammars.AddGrammarFromList("cheeseStrength", strengthNames);
  • 39. Enabling and Disabling Grammar Lists
      • An application can enable or disable particular grammars before a recognition action
      • It is also possible to set relative weightings of grammar lists
      • The text displayed as part of the listen operation can also be set, as in the first line of the code below

        recoWithUI.Settings.ListenText = "How strong do you like your cheese?";
        recoWithUI.Recognizer.Grammars["cheeseStrength"].Enabled = true;
        SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync();
  • 40. Determining the confidence in the result
      • An application can determine the confidence that the speech system has in the result that was obtained
      • Result values are High, Medium, Low, Rejected

        SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync();
        if (recoResult.RecognitionResult.TextConfidence == SpeechRecognitionConfidence.High)
        {
            // select cheese based on strength value
        }
  • 41. Matching Multiple Grammars
      • If the spoken input matches multiple grammars a program can obtain a list of the alternative results using recoResult.RecognitionResult.GetAlternates
      • The list is supplied in order of confidence
      • The application can then determine the best fit from the context of the voice request
      • This list is also provided if the request used a more complex grammar

        var alternatives = recoResult.RecognitionResult.GetAlternates(3);
  • 42. Profanity
      • Words that are recognised as profanities are not displayed in the response from a recognizer command
      • The speech system will also not repeat them
      • They are enclosed in <Profanity> </Profanity> when supplied to the program that receives the speech data
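Because the recognized text reaches the program with profanities wrapped in tags, an application that shows or stores the text may want to mask those spans first. The following is a minimal sketch, assuming the tags arrive literally in the result string as slide 42 describes; ProfanitySketch and Mask are hypothetical names, and matching the tag case-insensitively is a precaution of this sketch rather than documented behaviour.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the Windows Phone SDK.
static class ProfanitySketch
{
    // Matches one <Profanity>...</Profanity> span, non-greedily, ignoring tag case.
    static readonly Regex Tagged = new Regex(
        @"<Profanity>.*?</Profanity>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Replaces every tagged span, tags included, with a fixed mask.
    public static string Mask(string recognizedText)
    {
        return Tagged.Replace(recognizedText, "****");
    }

    static void Main()
    {
        Console.WriteLine(Mask("That cheese is <Profanity>darn</Profanity> strong"));
        // prints "That cheese is **** strong"
    }
}
```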
  • 43. Review
      • Applications in Windows Phone 8 can use speech generation and recognition to interact with users
      • Applications can produce speech output from text files which can be marked up with Speech Synthesis Markup Language (SSML) to include sound files
      • Applications can be started and provided with initial commands by registering a Voice Command Definition File with the Windows Phone
      • The commands can be picked up when a page is loaded, or the commands specify a particular page to load
      • An application can modify the phrase part of a command to change the activation commands
      • Applications can recognise speech using complex grammars or simple word lists
  • 44. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. © 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Editor's notes

  1. In Windows Phone 7.x, users can already launch into your app by pressing and holding down the Windows key and then saying "Start" or "Open" followed by the name of your app. In Windows Phone 8, you can extend and customize additional phrases that a user can say. This allows users to: launch to a specific page in an app; launch an app and initiate an action. Windows Phone 8 also provides new speech recognition and speech synthesis APIs that allow users to interact with your apps by speaking and listening.
  2. Using the new speech APIs in Windows Phone 8, you can create new experiences for users by enabling your application to both recognize users' speech and to generate synthesized speech (also known as text-to-speech or TTS). Speech is a natural, efficient, and accurate way to interact with Windows Phone 8 Developer Preview, and one way you can make your application more attractive to users. To use TTS in your application, you must add the ID_CAP_SPEECH_RECOGNITION capability.
  3. The quickest and easiest way to generate speech output to a user of your app is to provide a plain text string to the SpeechSynthesizer.SpeakTextAsync() method. The code example shows how to do this. Note that this is one of the APIs that uses the new Task-based async programming pattern.
  4. Windows Phone 8 includes speaking voices for a variety of languages. Each voice generates synthesized speech in a single language, as spoken in a specific country/region. After you create a SpeechSynthesizer object, you can specify the language of a voice to load. A SpeechSynthesizer instance can load any voice that is installed on the phone and use it to generate speech. If no language is specified, the API will load a voice that matches the language that the user selected in Settings/Speech on the phone. The code sample shows how to select a French voice.
  5. Follow the instructions in the Word document.
  6. So far we have seen the simplest usage of the speech synthesizer where it speaks plain text. The speech synthesizer can also speak text that contains markup that conforms to the Speech Synthesis Markup Language (SSML) Version 1.0. You can either insert SSML markup inline in your code, or reference a standalone SSML document from your code. You can select a speaking voice of a particular language using SSML, and change the voice, pitch, rate, volume, pronunciation and other characteristics. You can specify pauses, breaks or emphasis, or use the say-as element to specify the type of text contained in an element (such as acronym, cardinal, number, date and time). It gives fine control over how the speech should be generated. A detailed look at SSML is outside the scope of this session. Refer to the SSML reference documentation for more details.
  7. In Windows Phone 7.x, users can already launch into your app by pressing and holding down the Windows key and then saying "Start" or "Open" followed by the name of your app. In Windows Phone 8, you can extend and customize additional phrases that a user can say, passing in parameters to your app, and giving users rapid access to your app’s functionality. This allows users to launch to a specific page in an app. For example, a user can press the Start button and speak "Start MyApp, navigate to MyPage" or simply "Start MyApp go to MyPage." It also allows users to launch an app and initiate an action. For example, a user can press the Start button and speak "Start MyApp Show MyItem" (where MyItem could be app-specific data such as an item in a favorites list). A user using the Contoso Widgets app could press the Start button and say "Contoso Widgets, show best sellers" to both launch the Contoso Widgets app and navigate to a 'best sellers' page, or some other action that the developer specifies. To use voice commands, you must: create a Voice Command Definition (VCD) file, an XML document that defines all the spoken commands that users can say to initiate actions when launching your app; add code to initialize the VCD file with the phone's speech feature; add code to handle navigation and to execute commands.
  8. The example we will look at is a Fortune Teller program. You can ask it questions and it will tell your future by both displaying replies and also speaking them. We will look at different ways you can process voice commands and how you use them to navigate within the pages of your app. The Voice Commands feature on Windows Phone has good support for discoverability built in. When you extend and customize voice commands, end users can find out through system help and "did you know?" screens what phrases your app is listening for.
  9. Each voice command contains:
  • An example phrase of how a user should typically invoke the command
  • The words or phrases that your app will recognize to initiate the command
  • The text that your app will display and speak to the user when the command is recognized
  • The page that your app will navigate to when the command is recognized
  10. In a Voice Command Definition, you use the CommandPrefix element to specify the name that users say at the start of a voice command to address your app, in place of the app's full name. This is useful when the app name is long or hard to pronounce.
  11. Discoverability is also a key aspect of voice commands. When you extend and customize voice commands, end users can find out through system help and "What Can I Say" screens what phrases your app is listening for. Here we use the Example element to specify an example phrase of how a user can use voice commands with the application.
  12. The Name attribute on the Command element is passed into the app when it is launched to handle a voice command. Your app logic can retrieve this value from the query string parameters on the launch URI.
  13. You can also give an example for a specific voice command.
  14. Each Command element must contain at least one ListenFor element. Each ListenFor element contains the word or words that will initiate the action specified by the Command element. ListenFor elements cannot be programmatically modified.
  15. The part of the phrase in square brackets is optional, while '{futureMoney}' refers to a phrase list: a set of different words or phrases, any of which the user can say at that point to trigger this particular command.
  16. Use the Feedback element to specify the text that your app will display and speak to the user when the command is recognized. Notice that it can speak back an item from a phrase list.
  17. The Navigate element is used to specify the page that your app will navigate to when the command is recognized.
  18. The PhraseList element defines variable words or phrases that may be used by the user at any particular point in a command.
  19. The VCD file you create defines the voice commands your app supports. When you've created the VCD file, you package it with your app and use a single method call to initialize it during your app's first run. Initializing the VCD registers the commands to listen for with the speech system on the user's phone. The code example shows how you can initialize a VCD file.
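A minimal sketch of that initialization call is given here, assuming the VCD file is packaged at the project root. The class name and the file name VoiceCommandDefinition.xml are placeholders, not names from the demo.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Phone.Speech.VoiceCommands;

// Hypothetical helper class; call InstallAsync once, for example on first run.
static class VoiceCommandSetup
{
    public static async Task InstallAsync()
    {
        // Assumed file name; ms-appx:/// points at the app's install folder.
        Uri vcdUri = new Uri("ms-appx:///VoiceCommandDefinition.xml");

        // Registers every command set in the VCD file with the speech system.
        await VoiceCommandService.InstallCommandSetsFromFileAsync(vcdUri);
    }
}
```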
  20. Using the Fortune Teller commands, this means that if a user says "Fortune Teller, Will I find gold?", the app is launched at the page /Money.xaml with a query string that contains the string shown. Notice that variable elements, such as the phrase from a phrase list, are listed separately, as well as the whole recognized phrase.
  21. In the OnNavigatedTo method of the launched page, you need to work out whether your app was launched by a voice command, and then determine the name and parameters of that command. You do this by looking at the QueryString property of the NavigationContext class. Once you've determined which voice command was used, you can take appropriate action in your app.
  22. Variable parts of a voice command that have been captured using a PhraseList are specified separately in the Query String, as we saw in the example a couple of slides back. This is how you can extract the exact phrase in the PhraseList the user used.
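The two notes above could be sketched roughly as follows. The page class name is an assumption based on the /Money.xaml page mentioned earlier. "voiceCommandName" is the query string key under which the platform passes the Command Name, and "futureMoney" is the phrase list label from the Fortune Teller example.

```csharp
using System.Diagnostics;
using System.Windows.Navigation;
using Microsoft.Phone.Controls;

// Hypothetical code-behind for the /Money.xaml page.
public partial class Money : PhoneApplicationPage
{
    protected override void OnNavigatedTo(NavigationEventArgs e)
    {
        base.OnNavigatedTo(e);

        // Only react to a fresh launch, not to back navigation.
        if (e.NavigationMode != NavigationMode.New) return;

        // The key is absent when the page was not opened by a voice command.
        string commandName;
        if (!NavigationContext.QueryString.TryGetValue("voiceCommandName", out commandName))
            return;

        // The phrase list label is the key for the phrase the user actually said.
        string spokenPhrase;
        if (NavigationContext.QueryString.TryGetValue("futureMoney", out spokenPhrase))
        {
            Debug.WriteLine("Command '" + commandName + "' with phrase '" + spokenPhrase + "'");
        }
    }
}
```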
  23. Follow the instructions in the Word document.
  24. PhraseList elements that are associated with ListenFor elements can be programmatically modified. For example, let's say you have a movie viewer app and want to allow users to launch a movie simply by saying the application name followed by "Play MovieName". It would be an overwhelming task to create a separate ListenFor element for each possible movie. Instead, you could dynamically populate the PhraseList at runtime with all possible movie options. You could define the ListenFor element as <ListenFor>Play {movies}</ListenFor>, where 'movies' is the label of the PhraseList. At a minimum, you need to create an empty PhraseList element labeled 'movies' in the VCD file to be able to modify it programmatically at runtime.
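One way this runtime update might look is sketched below. The command set name "englishCommands" and the movie titles are invented for illustration. The name must match the Name attribute of the CommandSet element in your VCD file, and the VCD must already be installed.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Phone.Speech.VoiceCommands;

// Hypothetical helper class for the movie viewer example.
static class MoviePhraseListSketch
{
    public static async Task UpdateMoviesAsync()
    {
        // Look up the installed command set by its (assumed) Name attribute.
        VoiceCommandSet commandSet =
            VoiceCommandService.InstalledCommandSets["englishCommands"];

        // Replace the contents of the 'movies' phrase list with the current titles.
        await commandSet.UpdatePhraseListAsync(
            "movies",
            new string[] { "Casablanca", "Metropolis", "Vertigo" });
    }
}
```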
  25. Speech recognition can be a natural, efficient, and accurate way for users to interact with applications in Windows Phone 8. To support speech recognition, Windows Phone 8 includes a speech runtime, recognition APIs for programming the runtime, ready-to-use grammars for dictation and web search, and a graphical user interface (GUI) that helps users discover and use speech recognition features.
  26. The quickest and easiest way to enable your application for speech recognition is to use the built-in dictation grammar included with Windows Phone 8. The dictation grammar recognizes most words and short phrases in a language, and is activated by default when a speech recognizer object is instantiated. Using the built-in dictation grammar, you can enable speech recognition with just a few lines of code, as in this example.
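A minimal sketch of dictation with the built-in GUI is shown here. The class and method names are placeholders. No grammar is added, so the default dictation grammar is used.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Phone.Speech.Recognition;

// Hypothetical helper class, not part of the original demo.
static class DictationSketch
{
    // Returns the recognized text, or null if recognition did not succeed.
    public static async Task<string> ListenAsync()
    {
        SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

        // Shows the built-in "Listening" screen and waits for the user to speak.
        SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync();

        return result.ResultStatus == SpeechRecognitionUIStatus.Succeeded
            ? result.RecognitionResult.Text
            : null;
    }
}
```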
  27. You can adjust settings of the speech recognizer that control how it responds to the absence of input, or to the absence of recognizable input, when speech is expected. These settings affect how recognition behaves while waiting for expected speech input at the start or the end of a speech recognition operation.
  • InitialSilenceTimeout: how long your application waits for a user to begin speaking after a recognition operation has started.
  • BabbleTimeout: the time interval during which the speech recognizer continues to listen while it receives only non-speech input such as background noise. Background noise includes any input that does not match any active rule in the grammars currently loaded and activated by the speech recognizer.
  • EndSilenceTimeout: the time interval during which the speech recognizer waits before finalizing the recognition operation, after speech has finished and only silence is being received as input.
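The three timeouts can be set along these lines. The values are arbitrary examples chosen for illustration, not recommendations from the session.

```csharp
using System;
using Windows.Phone.Speech.Recognition;

// Hypothetical factory for a recognizer with adjusted timeouts.
static class TimeoutSketch
{
    public static SpeechRecognizerUI CreateRecognizer()
    {
        SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();
        SpeechRecognizerSettings settings = recoWithUI.Recognizer.Settings;

        // Wait up to 6 seconds for the user to start speaking.
        settings.InitialSilenceTimeout = TimeSpan.FromSeconds(6.0);
        // Give up after 4 seconds of noise that matches no grammar.
        settings.BabbleTimeout = TimeSpan.FromSeconds(4.0);
        // Finalize recognition after 1.2 seconds of silence at the end.
        settings.EndSilenceTimeout = TimeSpan.FromSeconds(1.2);

        return recoWithUI;
    }
}
```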
  28. Helping users know what to say to your application, and showing them what was recognized, improves their recognition experience. Windows Phone provides built-in graphical user interface (GUI) screens for speech recognition, which you can use to help users provide the input your application expects and to confirm a user's input.
  • ReadoutEnabled controls whether the phone speaks successfully recognized text back to the user from the Confirmation screen, and whether it speaks options from the Disambiguation screen.
  • ShowConfirmation controls whether the phone displays a Confirmation screen that shows, and optionally speaks, the recognized phrase when speech recognition is successful (default is true).
  • ListenText sets the text on the "Listening" screen to let users know what kind of information your application is expecting, for example "Favorite color?".
  • ExampleText sets the text that gives one or more examples of what your application is listening for, for example "'blue', 'orange'".
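A short sketch that sets these four properties follows, using the "Favorite color?" example from the note. The class and method names are placeholders.

```csharp
using Windows.Phone.Speech.Recognition;

// Hypothetical factory for a recognizer with a customized GUI.
static class RecognizerUiSketch
{
    public static SpeechRecognizerUI CreateColorPrompt()
    {
        SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

        // Text shown on the "Listening" screen.
        recoWithUI.Settings.ListenText = "Favorite color?";
        // Examples of what the app is listening for.
        recoWithUI.Settings.ExampleText = "'blue', 'orange'";
        // Do not read the recognized text back to the user.
        recoWithUI.Settings.ReadoutEnabled = false;
        // Still show the Confirmation screen (this is the default).
        recoWithUI.Settings.ShowConfirmation = true;

        return recoWithUI;
    }
}
```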
  29. The speech recognizer raises the AudioProblemOccurred event when it encounters conditions that interfere with accurate recognition of speech input. If you use the SpeechRecognizerUI class, the event is also raised when speech recognition occurs on the Disambiguation screen. You can use a handler for the event to retrieve a description of the audio problem, such as too quiet, too loud, too noisy, or too fast. You may be able to use the description to take steps to improve the conditions for recognition, for example by prompting the user to speak louder.
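A possible handler is sketched below. It only logs a hint, because the event may not arrive on the UI thread. The hint texts and class name are made up for this example.

```csharp
using System.Diagnostics;
using Windows.Phone.Speech.Recognition;

// Hypothetical helper that attaches an audio problem handler.
static class AudioProblemSketch
{
    public static void Attach(SpeechRecognizerUI recoWithUI)
    {
        recoWithUI.Recognizer.AudioProblemOccurred += OnAudioProblem;
    }

    private static void OnAudioProblem(
        SpeechRecognizer sender, SpeechAudioProblemOccurredEventArgs args)
    {
        // Map the reported problem to a hint the app could show the user.
        string hint;
        switch (args.Problem)
        {
            case SpeechRecognitionAudioProblem.TooQuiet: hint = "Please speak louder."; break;
            case SpeechRecognitionAudioProblem.TooLoud:  hint = "Please speak more quietly."; break;
            case SpeechRecognitionAudioProblem.TooNoisy: hint = "Try a quieter place."; break;
            case SpeechRecognitionAudioProblem.TooFast:  hint = "Please speak more slowly."; break;
            default:                                     hint = "Audio problem: " + args.Problem; break;
        }
        Debug.WriteLine(hint);
    }
}
```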
  30. A grammar defines the words and phrases that an application will recognize in speech input. Grammars are at the core of speech recognition and are perhaps the most important factor under your control that influences the accuracy of speech recognition. You can use three different types of grammars to enable your application to perform speech recognition in Windows Phone 8:
  • Built-in grammars. Use the online dictation and web search grammars provided by Windows Phone 8.
  • List grammars. Create lightweight, custom grammars programmatically in the form of simple lists.
  • XML grammars. Create custom grammars for your application in the XML format defined by the Speech Recognition Grammar Specification (SRGS) Version 1.0.
  Which grammar type you use may depend on the complexity of the recognition experience you want to create and your level of expertise in creating grammars. Any one approach may be the best choice for a specific recognition task, and you may find uses for all three types of grammars in your application.
  31. A list grammar provides a lightweight approach to creating a simple grammar as a list of phrases. A list grammar consists of an array of strings that represents the speech input your application will accept for a recognition operation. You can create list grammars inside your application by passing an array of strings to the AddGrammarFromList() method. Recognition is successful when the speech recognizer recognizes any one of the strings in the array. List grammars may provide faster performance and greater accuracy than the built-in dictation grammar. However, list grammars are best suited to simple recognition scenarios.
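As a sketch of a list grammar, the helper below restricts recognition to three color names. The grammar key "colors" and the phrases are arbitrary choices for this example.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Phone.Speech.Recognition;

// Hypothetical helper class, not part of the original demo.
static class ListGrammarSketch
{
    // Returns the recognized color, or null if recognition did not succeed.
    public static async Task<string> AskForColorAsync()
    {
        SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

        // Only these phrases can be recognized once the list grammar is added.
        string[] colors = { "red", "green", "blue" };
        recoWithUI.Recognizer.Grammars.AddGrammarFromList("colors", colors);

        SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync();
        return result.ResultStatus == SpeechRecognitionUIStatus.Succeeded
            ? result.RecognitionResult.Text
            : null;
    }
}
```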
  32. After a grammar set is loaded for recognition, your application can manage which grammars are active for recognition operations by setting the SpeechGrammar.Enabled property to true or false. The default setting is true. It is typically more efficient to load grammars once and to activate and deactivate them than to load and unload grammars for each recognition operation. Setting the SpeechGrammar.Enabled property takes less time and fewer processor resources than loading and unloading a grammar. If you restrict the active grammars to the phrases your application expects in the context of the current recognition operation, you can improve both the performance and the accuracy of speech recognition.
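The load-once, toggle-later pattern might look like this. The two grammars and their keys are invented for illustration.

```csharp
using Windows.Phone.Speech.Recognition;

// Hypothetical helper that loads two list grammars and activates one of them.
static class GrammarToggleSketch
{
    public static SpeechRecognizerUI CreateWithSizesActive()
    {
        SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

        // Load both grammars once; each is enabled by default.
        SpeechGrammar colors = recoWithUI.Recognizer.Grammars.AddGrammarFromList(
            "colors", new string[] { "red", "green", "blue" });
        SpeechGrammar sizes = recoWithUI.Recognizer.Grammars.AddGrammarFromList(
            "sizes", new string[] { "small", "medium", "large" });

        // The next question is about size, so only that grammar stays active.
        colors.Enabled = false;
        sizes.Enabled = true;

        return recoWithUI;
    }
}
```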
  33. At the end of a speech recognition operation, the speech recognizer returns a result that contains information about the outcome of recognition. The confidence rating is the speech recognizer's assessment of how accurately it matched a user's speech to a phrase in an active grammar. A speech recognizer may assign a low confidence score to spoken input for various reasons, including background interference, inarticulate speech, or unanticipated words or word sequences.
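One way to act on the confidence rating is sketched here. Treating High and Medium as acceptable is an assumption for this example, not a rule from the session.

```csharp
using Windows.Phone.Speech.Recognition;

// Hypothetical helper for checking a recognition result.
static class ConfidenceSketch
{
    // Returns true when the app should act on the recognized text.
    public static bool IsReliable(SpeechRecognitionResult result)
    {
        switch (result.TextConfidence)
        {
            case SpeechRecognitionConfidence.High:
            case SpeechRecognitionConfidence.Medium:
                return true;
            default:
                // Low or Rejected: better to ask the user to repeat.
                return false;
        }
    }
}
```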
  34. The speech recognizer may return one or more possible recognized phrases (called alternates) in the result for a recognition operation. The alternates are phrases from the grammar. If there are multiple recognition alternates for speech input, and one alternate has a substantially higher confidence score than the others, the recognizer matches the higher-scoring alternate to the speech input. If there are multiple recognition alternates and their confidence scores are similar, the standard graphical user interface (GUI) for speech on Windows Phone 8 displays a disambiguation screen. This 'Did you mean?' screen displays, and optionally speaks, up to four phrases from the grammar that most likely match what the user said.
  35. If a user speaks profanity, the recognized profanity phrases are returned enclosed in <profanity> tags in the speech recognition result's Text property. If the SpeechRecognizerUI class was used, the recognized profanity phrase is censored on the Confirmation screen.
  36. Lots of stuff here. Tell the audience that all the demonstrations are available and could form the basis of a number of application ideas.