SlideShare ist ein Scribd-Unternehmen logo
1 von 23
Downloaden Sie, um offline zu lesen
Speech for
Windows Phone 8


       Marco Massarelli
    http://ceoloide.com
Speech for Windows Phone 8


1. Voice commands
2. Speech recognition
3. Text-to-speech (TTS)
4. Q&A
11   Voice commands
1   Voice commands

                                YOUR APP

                           SPEECH RECOGNITION

          VOICE COMMANDS
                           TEXT-TO-SPEECH (TTS)




• Application entry point
• Can act as deep links to your application
1    Voice commands
• Set up your project capabilities:
   – D_CAP_SPEECH_RECOGNITION,
   – ID_CAP_MICROPHONE,
   – ID_CAP_NETWORKING

• Create a new Voice Command Definition
1   Voice commands

        <?xml version="1.0" encoding="utf-8"?>
        <VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.0">

           <CommandSet xml:lang="en-us">
                <CommandPrefix> Contoso Widgets </CommandPrefix>
                <Example> Show today's specials </Example>
                <Command Name="showWidgets">
                      <Example> Show today's specials </Example>
                      <ListenFor> [Show] {widgetViews} </ListenFor>
                      <ListenFor> {*} [Show] {widgetViews} </ListenFor>
                      <Feedback> Showing {widgetViews} </Feedback>
                      <Navigate Target="/favorites.xaml"/>
                </Command>
                <PhraseList Label="widgetViews">
                      <Item> today's specials </Item>
                      <Item> best sellers </Item>
                </PhraseList>
           </CommandSet>

           <!-- Other CommandSets for other languages -->

        </VoiceCommands>
1          Voice commands

• Install the Voice Command Definition (VCD) file
  await VoiceCommandService.InstallCommandSetsFromFileAsync( new Uri("ms-appx:///ContosoWidgets.xml") );




• VCD files need to be installed again when a
  backup is restored on a device.
1           Voice commands

• Voice commands parameters are included in the
  QueryString property of the NavigationContext
  "/favorites.xaml?voiceCommandName=showWidgets&widgetViews=best%20sellers&reco=Contoso%20Widgets%Show%20best%20sellers"




• Asterisks in ListenFor phrases are passed as “…”
  – In other words, it is not possible to receive the actual
    text that matched the asterisk.
2
1   Speech recognition
2   Speech recognition

                                YOUR APP

                           SPEECH RECOGNITION

          VOICE COMMANDS
                           TEXT-TO-SPEECH (TTS)




• Natural interaction with your application
• Grammar-based
• Requires internet connection
2    Speech recognition

• Default dictation grammar for free-text
  and web-search are included in WP8
• Custom grammar can be defined in two
  ways:
  – Programmatic list grammar (array of strings)
  – XML grammar leveraging on Speech
    Recognition Grammar Specification (SRGS) 1.0
2         Speech recognition

• Default dictation grammar

  private async void ButtonWeatherSearch_Click(object sender, RoutedEventArgs e)
  {
         // Add the pre-defined web search grammar to the grammar set.
         SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

          recoWithUI.Recognizer.Grammars.AddGrammarFromPredefinedType ("weatherSearch",
          SpeechPredefinedGrammar.WebSearch);

          // Display text to prompt the user's input.
          recoWithUI.Settings.ListenText = "Say what you want to search for";

          // Display an example of ideal expected input.
          recoWithUI.Settings.ExampleText = @"Ex. 'weather for London'";

          // Load the grammar set and start recognition.
          SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync();
  }
2         Speech recognition

• Programmatic list grammar
  private async void ButtonSR_Click(object sender, RoutedEventArgs e)
  {
         SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

          // You can create this string dynamically, for example from a movie queue.
          string[] movies = { "Play The Cleveland Story", "Play The Office", "Play Psych", "Play Breaking
          Bad", "Play Valley of the Sad", "Play Shaking Mad" };

          // Create a grammar from the string array and add it to the grammar set.
          recoWithUI.Recognizer.Grammars.AddGrammarFromList("myMovieList", movies);

          // Display an example of ideal expected input.
          recoWithUI.Settings.ExampleText = @"ex. 'Play New Mocumentaries'";

          // Load the grammar set and start recognition.
          SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync();

          // Play movie given in result.Text
  }
2         Speech recognition

• XML grammar
 private async void ButtonSR_Click(object sender, EventArgs e)
 {
        // Initialize objects ahead of time to avoid delays when starting recognition.
        SpeeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

         // Initialize a URI with a path to the SRGS-compliant XML file.
         Uri orderPizza = new Uri("ms-appx:///OrderPizza.grxml", UriKind.Absolute);

         // Add an SRGS-compliant XML grammar to the grammar set.
         recoWithUI.Recognizer.Grammars.AddGrammarFromUri("PizzaGrammar", orderPizza);

         // Preload the grammar set.
         await recoWithUI.Recognizer.PreloadGrammarsAsync();

         // Display text to prompt the user's input.
         recoWithUI.Settings.ListenText = "What kind of pizza do you want?";

         // Display an example of ideal expected input.
         recoWithUI.Settings.ExampleText = "Large combination with Italian sausage";

         SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync();
 }
2   Speech recognition
3
2   Text-to-speech
         (TTS)
3    Text-to-speech (TTS)

                                YOUR APP

                           SPEECH RECOGNITION

          VOICE COMMANDS
                           TEXT-TO-SPEECH (TTS)




• Output synthetized speech
• Provide the user with spoken instructions
3   Text-to-speech (TTS)

• TTS requires only the following capability:
  – ID_CAP_SPEECH_RECOGNITION
• TTS can output the following text types:
  – Unformatted text strings
  – Speech Synthesis Markup Language (SSML)
    1.0 strings or XML files
3       Text-to-speech (TTS)
• Outputting unformatted strings is very easy and
  it is also possible to select a voice language:
       // Declare the SpeechSynthesizer object at the class level.
       SpeechSynthesizer synth;

       private async void ButtonSimpleTTS_Click(object sender, RoutedEventArgs e)
       {
              SpeechSynthesizer synth = new SpeechSynthesizer();
              await synth.SpeakTextAsync("You have a meeting with Peter in 15 minutes.");
       }

       private async void SpeakFrench_Click_1(object sender, RoutedEventArgs e)
       {

             synth = new SpeechSynthesizer(); // Query for a voice that speaks French.

             IEnumerable<VoiceInformation> frenchVoices = from voice in InstalledVoices.All
             where voice.Language == "fr-FR" select voice;

             // Set the voice as identified by the query.
             synth.SetVoice(frenchVoices.ElementAt(0));

             // Count in French.
             await synth.SpeakTextAsync("un, deux, trois, quatre");
       }
3         Text-to-speech (TTS)

• SSML 1.0 text can be outputted from string
  or XML files
   // Speaks a string of text with SSML markup.
   private async void SpeakSsml_Click(object sender, RoutedEventArgs e) {
          SpeechSynthesizer synth = new SpeechSynthesizer(); // Build an SSML prompt in a string.
          string ssmlPrompt = "<speak version="1.0" ";
          ssmlPrompt += "xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">";
          ssmlPrompt += "This voice speaks English. </speak>"; // Speak the SSML prompt.
          await synth.SpeakSsmlAsync(ssmlPrompt);
   }




   // Speaks the content of a standalone SSML file.
   private async void SpeakSsmlFromFile_Click(object sender, RoutedEventArgs e) {
          // Set the path to the SSML-compliant XML file.
          SpeechSynthesizer synth = new SpeechSynthesizer();

         string path = Package.Current.InstalledLocation.Path + "ChangeVoice.ssml";
         Uri changeVoice = new Uri(path, UriKind.Absolute); // Speak the SSML prompt.
         await synth.SpeakSsmlFromUriAsync(changeVoice);
   }
4
3   Q&A
4    Questions & Answers
• Speech for Windows Phone 8 API
  references:
  – http://msdn.microsoft.com/en-
    us/library/windowsphone/develop/jj206958(v
    =vs.105).aspx
Thank you!

Weitere ähnliche Inhalte

Was ist angesagt?

Shell & Shell Script
Shell & Shell Script Shell & Shell Script
Shell & Shell Script Amit Ghosh
 
Shell programming 1.ppt
Shell programming  1.pptShell programming  1.ppt
Shell programming 1.pptKalkey
 
Penetration testing using python
Penetration testing using pythonPenetration testing using python
Penetration testing using pythonPurna Chander K
 
Quick start bash script
Quick start   bash scriptQuick start   bash script
Quick start bash scriptSimon Su
 
The Php Life Cycle
The Php Life CycleThe Php Life Cycle
The Php Life CycleXinchen Hui
 
Using Puppet on Linux, Windows, and Mac OSX
Using Puppet on Linux, Windows, and Mac OSXUsing Puppet on Linux, Windows, and Mac OSX
Using Puppet on Linux, Windows, and Mac OSXPuppet
 
How PHP Works ?
How PHP Works ?How PHP Works ?
How PHP Works ?Ravi Raj
 
Understanding PHP memory
Understanding PHP memoryUnderstanding PHP memory
Understanding PHP memoryjulien pauli
 

Was ist angesagt? (20)

Shellscripting
ShellscriptingShellscripting
Shellscripting
 
Shell & Shell Script
Shell & Shell Script Shell & Shell Script
Shell & Shell Script
 
Linux shell scripting
Linux shell scriptingLinux shell scripting
Linux shell scripting
 
Shell programming 1.ppt
Shell programming  1.pptShell programming  1.ppt
Shell programming 1.ppt
 
Powershell notes
Powershell notesPowershell notes
Powershell notes
 
Erlang and Elixir
Erlang and ElixirErlang and Elixir
Erlang and Elixir
 
Shell scripting
Shell scriptingShell scripting
Shell scripting
 
Penetration testing using python
Penetration testing using pythonPenetration testing using python
Penetration testing using python
 
Shell scripting
Shell scriptingShell scripting
Shell scripting
 
Unix shell scripts
Unix shell scriptsUnix shell scripts
Unix shell scripts
 
Basics of shell programming
Basics of shell programmingBasics of shell programming
Basics of shell programming
 
Quick start bash script
Quick start   bash scriptQuick start   bash script
Quick start bash script
 
SHELL PROGRAMMING
SHELL PROGRAMMINGSHELL PROGRAMMING
SHELL PROGRAMMING
 
The Php Life Cycle
The Php Life CycleThe Php Life Cycle
The Php Life Cycle
 
cq_cxf_integration
cq_cxf_integrationcq_cxf_integration
cq_cxf_integration
 
Unix - Shell Scripts
Unix - Shell ScriptsUnix - Shell Scripts
Unix - Shell Scripts
 
Using Puppet on Linux, Windows, and Mac OSX
Using Puppet on Linux, Windows, and Mac OSXUsing Puppet on Linux, Windows, and Mac OSX
Using Puppet on Linux, Windows, and Mac OSX
 
PHP 7 new engine
PHP 7 new enginePHP 7 new engine
PHP 7 new engine
 
How PHP Works ?
How PHP Works ?How PHP Works ?
How PHP Works ?
 
Understanding PHP memory
Understanding PHP memoryUnderstanding PHP memory
Understanding PHP memory
 

Ähnlich wie Speech for Windows Phone 8

Introduction to PowerShell
Introduction to PowerShellIntroduction to PowerShell
Introduction to PowerShellBoulos Dib
 
Learn Powershell Scripting Tutorial Full Course 1dollarcart.com.pdf
Learn Powershell Scripting Tutorial Full Course 1dollarcart.com.pdfLearn Powershell Scripting Tutorial Full Course 1dollarcart.com.pdf
Learn Powershell Scripting Tutorial Full Course 1dollarcart.com.pdfClapperboardCinemaPV
 
Monitoring OSGi Applications with the Web Console
Monitoring OSGi Applications with the Web ConsoleMonitoring OSGi Applications with the Web Console
Monitoring OSGi Applications with the Web ConsoleCarsten Ziegeler
 
Monitoring OSGi Applications with the Web Console - Carsten Ziegeler
Monitoring OSGi Applications with the Web Console - Carsten ZiegelerMonitoring OSGi Applications with the Web Console - Carsten Ziegeler
Monitoring OSGi Applications with the Web Console - Carsten Ziegelermfrancis
 
Monitoring OSGi Applications with the Web Console
Monitoring OSGi Applications with the Web ConsoleMonitoring OSGi Applications with the Web Console
Monitoring OSGi Applications with the Web ConsoleAdobe
 
Networked APIs with swift
Networked APIs with swiftNetworked APIs with swift
Networked APIs with swiftTim Burks
 
FMS Administration Seminar
FMS Administration SeminarFMS Administration Seminar
FMS Administration SeminarYoss Cohen
 
Calabash-android
Calabash-androidCalabash-android
Calabash-androidAdnan8990
 
Php interview-questions and answers
Php interview-questions and answersPhp interview-questions and answers
Php interview-questions and answerssheibansari
 
Build 2017 - B8092 - Using Microsoft Cognitive Services to bring the power of...
Build 2017 - B8092 - Using Microsoft Cognitive Services to bring the power of...Build 2017 - B8092 - Using Microsoft Cognitive Services to bring the power of...
Build 2017 - B8092 - Using Microsoft Cognitive Services to bring the power of...Windows Developer
 
Fonctions vocales sous Windows Phone : intégrez votre application à Cortana !
Fonctions vocales sous Windows Phone : intégrez votre application à Cortana !Fonctions vocales sous Windows Phone : intégrez votre application à Cortana !
Fonctions vocales sous Windows Phone : intégrez votre application à Cortana !Microsoft
 
Posh Devcon2009
Posh Devcon2009Posh Devcon2009
Posh Devcon2009db82407
 
Text to speech with Google Cloud
Text to speech with Google CloudText to speech with Google Cloud
Text to speech with Google CloudRajarshi Ghosh
 
Flex3中文教程
Flex3中文教程Flex3中文教程
Flex3中文教程yiditushe
 
Lets have some fun with twilio open tok
Lets have some fun with   twilio open tokLets have some fun with   twilio open tok
Lets have some fun with twilio open tokmirahman
 
TDC 2014 - Cortana
TDC 2014 - CortanaTDC 2014 - Cortana
TDC 2014 - Cortanatmonaco
 
Functional Testing Swing Applications with Frankenstein
Functional Testing Swing Applications with FrankensteinFunctional Testing Swing Applications with Frankenstein
Functional Testing Swing Applications with Frankensteinvivek_prahlad
 

Ähnlich wie Speech for Windows Phone 8 (20)

Introduction to PowerShell
Introduction to PowerShellIntroduction to PowerShell
Introduction to PowerShell
 
Learn Powershell Scripting Tutorial Full Course 1dollarcart.com.pdf
Learn Powershell Scripting Tutorial Full Course 1dollarcart.com.pdfLearn Powershell Scripting Tutorial Full Course 1dollarcart.com.pdf
Learn Powershell Scripting Tutorial Full Course 1dollarcart.com.pdf
 
Monitoring OSGi Applications with the Web Console
Monitoring OSGi Applications with the Web ConsoleMonitoring OSGi Applications with the Web Console
Monitoring OSGi Applications with the Web Console
 
Monitoring OSGi Applications with the Web Console - Carsten Ziegeler
Monitoring OSGi Applications with the Web Console - Carsten ZiegelerMonitoring OSGi Applications with the Web Console - Carsten Ziegeler
Monitoring OSGi Applications with the Web Console - Carsten Ziegeler
 
Monitoring OSGi Applications with the Web Console
Monitoring OSGi Applications with the Web ConsoleMonitoring OSGi Applications with the Web Console
Monitoring OSGi Applications with the Web Console
 
Networked APIs with swift
Networked APIs with swiftNetworked APIs with swift
Networked APIs with swift
 
FMS Administration Seminar
FMS Administration SeminarFMS Administration Seminar
FMS Administration Seminar
 
Calabash-android
Calabash-androidCalabash-android
Calabash-android
 
Php interview-questions and answers
Php interview-questions and answersPhp interview-questions and answers
Php interview-questions and answers
 
Build 2017 - B8092 - Using Microsoft Cognitive Services to bring the power of...
Build 2017 - B8092 - Using Microsoft Cognitive Services to bring the power of...Build 2017 - B8092 - Using Microsoft Cognitive Services to bring the power of...
Build 2017 - B8092 - Using Microsoft Cognitive Services to bring the power of...
 
Fonctions vocales sous Windows Phone : intégrez votre application à Cortana !
Fonctions vocales sous Windows Phone : intégrez votre application à Cortana !Fonctions vocales sous Windows Phone : intégrez votre application à Cortana !
Fonctions vocales sous Windows Phone : intégrez votre application à Cortana !
 
Posh Devcon2009
Posh Devcon2009Posh Devcon2009
Posh Devcon2009
 
Text to speech with Google Cloud
Text to speech with Google CloudText to speech with Google Cloud
Text to speech with Google Cloud
 
Flex3中文教程
Flex3中文教程Flex3中文教程
Flex3中文教程
 
Lets have some fun with twilio open tok
Lets have some fun with   twilio open tokLets have some fun with   twilio open tok
Lets have some fun with twilio open tok
 
Tml for Objective C
Tml for Objective CTml for Objective C
Tml for Objective C
 
Tml for Laravel
Tml for LaravelTml for Laravel
Tml for Laravel
 
Xelerator software
Xelerator softwareXelerator software
Xelerator software
 
TDC 2014 - Cortana
TDC 2014 - CortanaTDC 2014 - Cortana
TDC 2014 - Cortana
 
Functional Testing Swing Applications with Frankenstein
Functional Testing Swing Applications with FrankensteinFunctional Testing Swing Applications with Frankenstein
Functional Testing Swing Applications with Frankenstein
 

Kürzlich hochgeladen

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 

Kürzlich hochgeladen (20)

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 

Speech for Windows Phone 8

  • 1. Speech for Windows Phone 8 Marco Massarelli http://ceoloide.com
  • 2. Speech for Windows Phone 8 1. Voice commands 2. Speech recognition 3. Text-to-speech (TTS) 4. Q&A
  • 3. 11 Voice commands
  • 4. 1 Voice commands YOUR APP SPEECH RECOGNITION VOICE COMMANDS TEXT-TO-SPEECH (TTS) • Application entry point • Can act as deep links to your application
  • 5. 1 Voice commands • Set up your project capabilities: – D_CAP_SPEECH_RECOGNITION, – ID_CAP_MICROPHONE, – ID_CAP_NETWORKING • Create a new Voice Command Definition
  • 6. 1 Voice commands <?xml version="1.0" encoding="utf-8"?> <VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.0"> <CommandSet xml:lang="en-us"> <CommandPrefix> Contoso Widgets </CommandPrefix> <Example> Show today's specials </Example> <Command Name="showWidgets"> <Example> Show today's specials </Example> <ListenFor> [Show] {widgetViews} </ListenFor> <ListenFor> {*} [Show] {widgetViews} </ListenFor> <Feedback> Showing {widgetViews} </Feedback> <Navigate Target="/favorites.xaml"/> </Command> <PhraseList Label="widgetViews"> <Item> today's specials </Item> <Item> best sellers </Item> </PhraseList> </CommandSet> <!-- Other CommandSets for other languages --> </VoiceCommands>
  • 7. 1 Voice commands • Install the Voice Command Definition (VCD) file await VoiceCommandService.InstallCommandSetsFromFileAsync( new Uri("ms-appx:///ContosoWidgets.xml") ); • VCD files need to be installed again when a backup is restored on a device.
  • 8. 1 Voice commands • Voice commands parameters are included in the QueryString property of the NavigationContext "/favorites.xaml?voiceCommandName=showWidgets&widgetViews=best%20sellers&reco=Contoso%20Widgets%Show%20best%20sellers" • Asterisks in ListenFor phrases are passed as “…” – In other words, it is not possible to receive the actual text that matched the asterisk.
  • 9. 2 1 Speech recognition
  • 10. 2 Speech recognition YOUR APP SPEECH RECOGNITION VOICE COMMANDS TEXT-TO-SPEECH (TTS) • Natural interaction with your application • Grammar-based • Requires internet connection
  • 11. 2 Speech recognition • Default dictation grammar for free-text and web-search are included in WP8 • Custom grammar can be defined in two ways: – Programmatic list grammar (array of strings) – XML grammar leveraging on Speech Recognition Grammar Specification (SRGS) 1.0
  • 12. 2 Speech recognition • Default dictation grammar private async void ButtonWeatherSearch_Click(object sender, RoutedEventArgs e) { // Add the pre-defined web search grammar to the grammar set. SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI(); recoWithUI.Recognizer.Grammars.AddGrammarFromPredefinedType ("weatherSearch", SpeechPredefinedGrammar.WebSearch); // Display text to prompt the user's input. recoWithUI.Settings.ListenText = "Say what you want to search for"; // Display an example of ideal expected input. recoWithUI.Settings.ExampleText = @"Ex. 'weather for London'"; // Load the grammar set and start recognition. SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync(); }
  • 13. 2 Speech recognition • Programmatic list grammar private async void ButtonSR_Click(object sender, RoutedEventArgs e) { SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI(); // You can create this string dynamically, for example from a movie queue. string[] movies = { "Play The Cleveland Story", "Play The Office", "Play Psych", "Play Breaking Bad", "Play Valley of the Sad", "Play Shaking Mad" }; // Create a grammar from the string array and add it to the grammar set. recoWithUI.Recognizer.Grammars.AddGrammarFromList("myMovieList", movies); // Display an example of ideal expected input. recoWithUI.Settings.ExampleText = @"ex. 'Play New Mocumentaries'"; // Load the grammar set and start recognition. SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync(); // Play movie given in result.Text }
  • 14. 2 Speech recognition • XML grammar private async void ButtonSR_Click(object sender, EventArgs e) { // Initialize objects ahead of time to avoid delays when starting recognition. SpeeechRecognizerUI recoWithUI = new SpeechRecognizerUI(); // Initialize a URI with a path to the SRGS-compliant XML file. Uri orderPizza = new Uri("ms-appx:///OrderPizza.grxml", UriKind.Absolute); // Add an SRGS-compliant XML grammar to the grammar set. recoWithUI.Recognizer.Grammars.AddGrammarFromUri("PizzaGrammar", orderPizza); // Preload the grammar set. await recoWithUI.Recognizer.PreloadGrammarsAsync(); // Display text to prompt the user's input. recoWithUI.Settings.ListenText = "What kind of pizza do you want?"; // Display an example of ideal expected input. recoWithUI.Settings.ExampleText = "Large combination with Italian sausage"; SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync(); }
  • 15. 2 Speech recognition
  • 16. 3 2 Text-to-speech (TTS)
  • 17. 3 Text-to-speech (TTS) YOUR APP SPEECH RECOGNITION VOICE COMMANDS TEXT-TO-SPEECH (TTS) • Output synthetized speech • Provide the user with spoken instructions
  • 18. 3 Text-to-speech (TTS) • TTS requires only the following capability: – ID_CAP_SPEECH_RECOGNITION • TTS can output the following text types: – Unformatted text strings – Speech Synthesis Markup Language (SSML) 1.0 strings or XML files
  • 19. 3 Text-to-speech (TTS) • Outputting unformatted strings is very easy and it is also possible to select a voice language: // Declare the SpeechSynthesizer object at the class level. SpeechSynthesizer synth; private async void ButtonSimpleTTS_Click(object sender, RoutedEventArgs e) { SpeechSynthesizer synth = new SpeechSynthesizer(); await synth.SpeakTextAsync("You have a meeting with Peter in 15 minutes."); } private async void SpeakFrench_Click_1(object sender, RoutedEventArgs e) { synth = new SpeechSynthesizer(); // Query for a voice that speaks French. IEnumerable<VoiceInformation> frenchVoices = from voice in InstalledVoices.All where voice.Language == "fr-FR" select voice; // Set the voice as identified by the query. synth.SetVoice(frenchVoices.ElementAt(0)); // Count in French. await synth.SpeakTextAsync("un, deux, trois, quatre"); }
  • 20. 3 Text-to-speech (TTS) • SSML 1.0 text can be outputted from string or XML files // Speaks a string of text with SSML markup. private async void SpeakSsml_Click(object sender, RoutedEventArgs e) { SpeechSynthesizer synth = new SpeechSynthesizer(); // Build an SSML prompt in a string. string ssmlPrompt = "<speak version="1.0" "; ssmlPrompt += "xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">"; ssmlPrompt += "This voice speaks English. </speak>"; // Speak the SSML prompt. await synth.SpeakSsmlAsync(ssmlPrompt); } // Speaks the content of a standalone SSML file. private async void SpeakSsmlFromFile_Click(object sender, RoutedEventArgs e) { // Set the path to the SSML-compliant XML file. SpeechSynthesizer synth = new SpeechSynthesizer(); string path = Package.Current.InstalledLocation.Path + "ChangeVoice.ssml"; Uri changeVoice = new Uri(path, UriKind.Absolute); // Speak the SSML prompt. await synth.SpeakSsmlFromUriAsync(changeVoice); }
  • 21. 4 3 Q&A
  • 22. 4 Questions & Answers • Speech for Windows Phone 8 API references: – http://msdn.microsoft.com/en- us/library/windowsphone/develop/jj206958(v =vs.105).aspx