No, not Majel Barrett, but being able to simply speak commands into the air and have a computer retrieve data or execute functions has always been a lofty goal. With tech like Siri, Cortana and Google Now, it seemed like we were always on the cusp, but those systems were locked down, with very limited ability to extend them. Now there’s Alexa from Amazon, and a connection to your enterprise systems is at your command. But what does it mean to have a voice UI? What are the pitfalls that come with the benefits? What are the design and testing considerations when developing conversations with your data? We’ll explore these topics, discuss what it takes to build Alexa Skills, and look at how effective they are for business purposes.
3. INTRO
➤ William “Bill” Klos
➤ Senior Architect, Centric Consulting
➤ Columbus, OH
➤ Specialties
➤ Cloud
➤ Mobility
➤ Alternative Technologies
➤ @williamklos
➤ bill.klos@centricconsulting.com
From Google Car,
Brome, QC, CA
August 2015
4. THE PROMISE
➤ Star Trek
➤ 2001: A Space Odyssey
➤ Interstellar
➤ Mother (Alien)
➤ The Beam (Canvas)
5. But what I’m talking about is not Artificial Intelligence or Machine Learning.
6. I’M TALKING ABOUT
➤ Quick Status Updates
➤ What happened with the overnight jobs?
➤ What’s the story with the dog?
➤ Did Mom take her medicine today?
➤ Executing Actions
➤ Route me home but don’t bug me unless there’s traffic.
➤ Put me down for 8 hours today on the Acme project.
➤ Blow up the ship.
7. WHAT’S SO GOOD ABOUT A VOICE UI?
➤ When It’s Good
➤ Less Friction, More Natural
➤ Pervasive/Ubiquitous
➤ Can be Conversational
➤ Truly Keeps Hands Free
➤ Requires Less Focus/It’s Freeing
➤ Only the Needed Information
➤ When It’s Bad
➤ Requires Focused Verification of Results
➤ Adds to the Environmental Chaos
➤ Global Thermonuclear War
10. THE PLUSES
➤ Cheap
➤ There’s an API & SDK for it.
➤ Easy to develop for.
➤ Can Host Anywhere HTTPS is Available
➤ Voice recognition is good.
➤ Can Own a Room
➤ Being Extended all the Time
➤ Good Support & Community
11. THE MINUSES
➤ It’s Voice Recognition, but not Necessarily YOUR Voice
➤ Requires an Internet Connection
➤ Testing Can be Wonky
➤ Will drive your family mad & leave you hoarse
➤ Can’t Take it With You
➤ Not as Feature-rich as Your Phone’s Capabilities Yet
➤ Have to buy a complete device everywhere you want to use it
➤ Cannot Initiate an Interaction
➤ Using your services is a little less natural than using native services
13. GENERAL USAGE - WHAT IS A VOICE UI (MANAGEMENT)
[Diagram: DueForward API shown with three user interfaces: Mobile UI, Web UI, Voice UI]
14. GENERAL USAGE - WHAT IS A VOICE UI (YOU GUYS)
[Architecture diagram, labeled “for CodeMash 2016 presentation”. The layout was lost in extraction; the recoverable labels are listed below.]
➤ Services: Spicoli [devops-slack-hook-push], Telemetri [telemetri-api], DueForward [dueforward-api], tbd-email [aws-ses-manager], tbd-push [aws-sns-manager]
➤ Sources and channels: Slack, RSS feeds [112], Atom feeds [20], web pages [~3500/mo], AWS S3, logs, notifications
➤ Front ends: Voice UI [alexa-voice-api], Web UI [angular], Dashboards [bi-bigdata]
➤ General taxonomies: Companies [154], Concepts [85], Cities [35], Clouds [14], Databases [29], Hardware [21], Software [26], Industries [26], Materials [3], Platforms [41], Languages [36], Synonyms [279]
➤ Client/industry taxonomies: Healthcare, Insurance, Financial, Microsoft, Alliance Data, Procter & Gamble, Client, Potential Client, Industry, Service Offering
➤ Access levels: Subscriber Access, Centric Access, Public Access (Demo Dashboard; Blog Posts, Opinions, Reputation)
➤ Operational: AWS, Other Centric Apps, BI, SO
➤ Roadmap markers: Phase 2, Phase 3
15. GENERAL USAGE - INTERACTING WITH ALEXA
➤ Native Skill Interaction
➤ “Alexa” → “Do something” → “Here are your awesome results”
➤ Alexa, turn on the lights downstairs.
➤ The lights are now on.
➤ Custom Skill Interaction
➤ “Alexa” → “Ask my app to…” → “Do something” → “Here are your awesome results”
➤ Alexa, tell DevOps to spin up another order processor.
➤ There are now 6 order processors running.
16. GENERAL USAGE - ADDRESSING ALEXA & YOUR SKILL
➤ Address the device with: “Alexa” or “Amazon”. Your pick.
➤ Address your Skill with:
➤ Ask, Tell (preferred, most natural)
➤ Talk to, Open, Launch, Start, Use, Resume, Run, Load, Begin
➤ There is no functional difference between the phrases. It just comes down to which one most easily conveys the user’s meaning to your application.
Ask DueForward what is the current STATUS of the DATABASE
Ask DueForward what VERSION it is RUNNING
Ask DueForward how many DOCUMENTS need to be DETERMINED
Tell DueForward to KICK OFF a DETERMINATION job
Ask DueForward how many DOCUMENTS need to be DETERMINED
18. HOW TO DEVELOP FOR IT - WHAT YOU’LL NEED
➤ An AWS Account w/Alexa Development Option
➤ http://developer.amazon.com
➤ An Amazon Echo (though you can do some stuff w/o it)
➤ An Intents File
➤ A kind of template file for filtering your Utterances through
➤ An Utterances File
➤ example phrases
➤ Your Language of Choice (Google’s Go for me)
➤ A Server with HTTPS Capabilities
19. HOW TO DEVELOP FOR IT - APPLICATION SPECIFICS
➤ If you work with Java, Node.js, or Python, you can use AWS Lambda to host and execute your source in response to Alexa events.
➤ More languages coming soon.
➤ Or you are completely free to use whatever you want as long
as you have an HTTPS endpoint to point Alexa to.
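The "bring your own HTTPS endpoint" option can be sketched in Go roughly as follows. This is a minimal illustration, not the presenter's code: it models only a small subset of the Alexa request and response JSON, the intent name `GetDatabaseStatus` is hypothetical, and `main` exercises the handler in memory instead of starting a server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// alexaRequest models only the fields this sketch reads.
type alexaRequest struct {
	Request struct {
		Type   string `json:"type"`
		Intent struct {
			Name string `json:"name"`
		} `json:"intent"`
	} `json:"request"`
}

type outputSpeech struct {
	Type string `json:"type"`
	Text string `json:"text"`
}

type alexaResponse struct {
	Version  string `json:"version"`
	Response struct {
		OutputSpeech     outputSpeech `json:"outputSpeech"`
		ShouldEndSession bool         `json:"shouldEndSession"`
	} `json:"response"`
}

// replyFor picks the spoken text. "GetDatabaseStatus" is a hypothetical intent name.
func replyFor(req alexaRequest) string {
	if req.Request.Type == "IntentRequest" && req.Request.Intent.Name == "GetDatabaseStatus" {
		return "All database servers are currently operating normally."
	}
	return "Sorry, I did not understand that."
}

// handleAlexa decodes the posted JSON and writes a plain-text speech response.
func handleAlexa(w http.ResponseWriter, r *http.Request) {
	var req alexaRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	var resp alexaResponse
	resp.Version = "1.0"
	resp.Response.OutputSpeech = outputSpeech{Type: "PlainText", Text: replyFor(req)}
	resp.Response.ShouldEndSession = true
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(resp)
}

func main() {
	// Exercise the handler in memory with a hand-written request body.
	body := `{"request":{"type":"IntentRequest","intent":{"name":"GetDatabaseStatus"}}}`
	rec := httptest.NewRecorder()
	handleAlexa(rec, httptest.NewRequest("POST", "/alexa", strings.NewReader(body)))
	fmt.Print(rec.Body.String())
}
```

In a real deployment the handler would be registered with `http.HandleFunc` and served with `http.ListenAndServeTLS`, since Alexa only talks to HTTPS endpoints.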
26. ALEXA SKILL KIT - UTTERANCES
Alexa, ask DueForward what were the top CONCEPTS for LAST MONTH
Alexa, ask DueForward what were the top DATABASES for LAST MONTH
Alexa, ask DueForward what are the top LANGUAGES for THIS MONTH
Alexa, ask DueForward what was the top SOFTWARE for LAST MONTH
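Since the four phrases above differ only in the verb and in the slot values, the lines of an Utterances file can be generated instead of typed by hand. The sketch below assumes the "IntentName sample phrase with {Slot}" line layout. The intent name `GetTopItems` and the slot names `Category` and `Period` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sampleUtterances pairs each opening phrase with the shared tail and
// prefixes the intent name, one line per variant.
func sampleUtterances(intent string, openers []string, tail string) []string {
	lines := make([]string, 0, len(openers))
	for _, opener := range openers {
		lines = append(lines, intent+" "+opener+" "+tail)
	}
	return lines
}

func main() {
	lines := sampleUtterances(
		"GetTopItems", // hypothetical intent name
		[]string{"what were", "what are", "what was"},
		"the top {Category} for {Period}", // hypothetical slot names
	)
	fmt.Println(strings.Join(lines, "\n"))
}
```

Keeping the phrasing variants in one list makes it easier to answer the "how many ways to say it?" question without the file drifting out of sync.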
30. CONSIDERATIONS
➤ Designing Your Input Options
➤ How casual? How formal? How many ways to say it?
➤ Can you ask it easily or is it a complex request?
➤ Designing Your Response Options
➤ How casual? How formal? How many ways to say it?
➤ How much data can you retain when hearing vs. seeing?
➤ Do you want read-only (safe) or read-write (powerful)?
➤ Transaction size (short, to the point on both sides)
➤ You have about 10 seconds to put it all together.
➤ Testing
32. SETTING UP YOUR APP
➤ For Amazon Certification & Publishing
➤ Verify that the Request was Sent by Alexa
➤ Check the Signature of the Request
➤ Check the Timestamp of the Request
➤ Don’t Need These for Testing
➤ Verify the Application Id Matches the One Assigned
38. CODING THE APP - GENERAL ARCHITECTURE
[Architecture diagram; layout lost in extraction. Columns: ECHO, APP, DATA. Labels: HTTPS; Shell API; Passthru [optional]; DueForward Application; four Future Applications; six DBs]
46. CONVERSATION - DISAMBIGUATION
➤ Alexa, tell CampIO to check in Bill.
➤ Did you mean Bill Klos or Bill Chamberlain?
➤ Bill Chamberlain | The second one.
➤ OK. Bill Chamberlain is now checked in.
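The exchange above can be sketched as two small steps: find every candidate that matches what was heard, then resolve the follow-up answer, which may be a full name or an ordinal such as "the second one". The roster and the function names below are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// candidates returns every roster entry whose first name matches.
func candidates(roster []string, first string) []string {
	var out []string
	for _, full := range roster {
		parts := strings.Fields(full)
		if len(parts) > 0 && strings.EqualFold(parts[0], first) {
			out = append(out, full)
		}
	}
	return out
}

// resolve picks one option from a follow-up answer: either an ordinal
// ("the second one") or the full name ("Bill Chamberlain").
func resolve(options []string, answer string) (string, bool) {
	ordinals := []string{"first", "second", "third"}
	lower := strings.ToLower(answer)
	for i, word := range ordinals {
		if i < len(options) && strings.Contains(lower, word) {
			return options[i], true
		}
	}
	for _, option := range options {
		if strings.EqualFold(option, strings.TrimSpace(answer)) {
			return option, true
		}
	}
	return "", false
}

func main() {
	// Hypothetical roster; a real skill would query its own data store.
	roster := []string{"Bill Klos", "Bill Chamberlain", "Ann Smith"}
	options := candidates(roster, "Bill")
	if len(options) > 1 {
		fmt.Printf("Did you mean %s?\n", strings.Join(options, " or "))
	}
	if name, ok := resolve(options, "The second one."); ok {
		fmt.Printf("OK. %s is now checked in.\n", name)
	}
}
```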
48. CONVERSATION - PROMPTING THE USER
➤ Alexa, ask DueForward… | Alexa, ask DueForward for help.
➤ You can check system status, get metrics, or run a job. Which would you like to do?
➤ Run a job, please.
➤ OK. I can re-determine documents or destroy the ship. Which would you like me to do?
49. CONVERSATION - TELL ME MORE
➤ Alexa, ask DueForward what is the current status of the database?
➤ All database servers are currently operating normally.
➤ What about memory usage?
➤ Memory usage is at 11%.
➤ And how many documents need to be re-determined?
➤ Currently, there are 177 documents that need to be re-determined. Would you like me to go ahead and clear them out?
➤ Please.
➤ Done.
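For a bare "Please." to make sense, the skill has to keep the session open and remember what it just offered. A minimal Go sketch of such a response follows, using the response fields `sessionAttributes`, `reprompt`, and `shouldEndSession`. The attribute key `pendingAction`, its value, and the reprompt wording are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type speech struct {
	Type string `json:"type"`
	Text string `json:"text"`
}

type reprompt struct {
	OutputSpeech speech `json:"outputSpeech"`
}

type body struct {
	OutputSpeech     speech    `json:"outputSpeech"`
	Reprompt         *reprompt `json:"reprompt,omitempty"`
	ShouldEndSession bool      `json:"shouldEndSession"`
}

type envelope struct {
	Version           string            `json:"version"`
	SessionAttributes map[string]string `json:"sessionAttributes,omitempty"`
	Response          body              `json:"response"`
}

// ask builds a response that speaks text, keeps the session open, and stores
// the pending action so a later "Please." can be tied back to the offer.
func ask(text, repromptText, pending string) envelope {
	return envelope{
		Version:           "1.0",
		SessionAttributes: map[string]string{"pendingAction": pending}, // hypothetical key
		Response: body{
			OutputSpeech:     speech{Type: "PlainText", Text: text},
			Reprompt:         &reprompt{OutputSpeech: speech{Type: "PlainText", Text: repromptText}},
			ShouldEndSession: false,
		},
	}
}

func main() {
	r := ask(
		"Currently, there are 177 documents that need to be re-determined. Would you like me to go ahead and clear them out?",
		"Should I clear them out?",
		"redetermine",
	)
	out, err := json.MarshalIndent(r, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

On the next request the same attributes come back in the session, so the handler can check `pendingAction` before acting on a yes or no.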
52. “Computer, initiate self-destruct sequence 1, code 1-1 A, set for five minutes and I want it to be silent except for a ticking clock sound to mysteriously play ship-wide over the speakers. Thanks.”
- The Captain
60. CONVERSATIONS - WHAT ABOUT SECURITY?
[Flowchart; arrows lost in extraction. Start: Request Self-Destruct Sequence. Decisions: CAPT?, JEDI? Outcomes: Tell Me Your Authorization Code; Confirm Your Request One Last Time; Go Find The Captain; Geez, Use “The Force”]
63. WHAT DID I SAY? - METAPHONES, SOUNDEX, & NYSIIS
➤ Metaphone
➤ William Klos = WLM KLS
➤ Incidentally, “Galluzzo” also = KLS
➤ SoundEx
➤ William Klos = W450 K420
➤ NYSIIS
➤ New York State Identification & Intelligence System
➤ 8-25 Step Process
➤ William Klos = WALAN CL
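Of the three, Soundex is the shortest to write out. The Go sketch below follows the common American Soundex rules (keep the first letter, code the remaining consonants, collapse adjacent duplicates, pad to four characters). It is an illustration, not the code used in the talk.

```go
package main

import (
	"fmt"
	"strings"
)

// soundexCode maps a letter to its Soundex digit; 0 means the letter is
// not coded (vowels, H, W, Y).
func soundexCode(r rune) byte {
	switch r {
	case 'B', 'F', 'P', 'V':
		return '1'
	case 'C', 'G', 'J', 'K', 'Q', 'S', 'X', 'Z':
		return '2'
	case 'D', 'T':
		return '3'
	case 'L':
		return '4'
	case 'M', 'N':
		return '5'
	case 'R':
		return '6'
	}
	return 0
}

// soundex returns the four-character Soundex code for a single name.
func soundex(name string) string {
	var out []byte
	var last byte
	for _, r := range strings.ToUpper(name) {
		if r < 'A' || r > 'Z' {
			continue // skip anything that is not a basic Latin letter
		}
		code := soundexCode(r)
		if len(out) == 0 {
			out = append(out, byte(r)) // the first letter is kept as-is
			last = code
			continue
		}
		if code != 0 && code != last {
			out = append(out, code)
		}
		// H and W do not separate letters with the same code; vowels do.
		if r != 'H' && r != 'W' {
			last = code
		}
		if len(out) == 4 {
			break
		}
	}
	for len(out) > 0 && len(out) < 4 {
		out = append(out, '0')
	}
	return string(out)
}

func main() {
	fmt.Println(soundex("William"), soundex("Klos")) // prints "W450 K420"
}
```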
69. WHAT DID I SAY? - EXACTNESS
➤ Be mindful that Alexa tries to be forgiving.
➤ If what you say differs from the matched Utterance by a wrong tense, transposed words, or missing or substituted words, it will try to give you the benefit of the doubt.
➤ If you need exactness, you’ll have to use Slots and bounce the
associated data against a database when matching instead of
simply matching on Intents.
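Bouncing a Slot value against known data might look like the Go sketch below. The in-memory map stands in for the database lookup, and its entries are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

// known maps accepted spoken forms to canonical values. It is a stand-in
// for a database table; the entries are illustrative.
var known = map[string]string{
	"databases": "DATABASES",
	"database":  "DATABASES",
	"languages": "LANGUAGES",
	"concepts":  "CONCEPTS",
}

// canonical normalizes a slot value and reports whether it is a known term.
func canonical(slotValue string) (string, bool) {
	key := strings.ToLower(strings.TrimSpace(slotValue))
	value, ok := known[key]
	return value, ok
}

func main() {
	for _, heard := range []string{"Databases", " database ", "data bases"} {
		if value, ok := canonical(heard); ok {
			fmt.Printf("%q -> %s\n", heard, value)
		} else {
			fmt.Printf("%q -> no match, ask the user again\n", heard)
		}
	}
}
```

Anything that fails the lookup is rejected outright, which is the point: the forgiving match happens at the Intent level, while the data that drives an action is checked exactly.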
72. WHAT DID YOU SAY? - SSML
➤ audio (recorded voice files)
➤ break (adding pauses)
➤ p (paragraph)
➤ phoneme (pronunciation based on defined alphabets)
➤ s (sentence; equivalent to ending a sentence with a period)
➤ say-as (spell-out, digits, fraction)
➤ speak (root element)
➤ w (word role: verb, noun, past participle, alternate pronunciations, e.g. bass vs. bass)
73. WHAT DID YOU SAY? - SSML
<speak>
five<break time="1s"/>four<break time="1s"/>three.
Abort sequence canceled.
</speak>
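A countdown like the one above can be assembled in code and returned as SSML output speech. In this Go sketch the helper name `countdown` is made up; the `type` and `ssml` fields follow the Alexa response format for SSML speech.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// countdown joins the words with a fixed pause between them and wraps the
// result in the <speak> root element.
func countdown(words []string, pause string) string {
	sep := fmt.Sprintf(`<break time="%s"/>`, pause)
	return "<speak>" + strings.Join(words, sep) + ".</speak>"
}

func main() {
	ssml := countdown([]string{"five", "four", "three"}, "1s")
	speech := map[string]string{"type": "SSML", "ssml": ssml}

	// Turn off HTML escaping so the angle brackets stay readable in the JSON.
	enc := json.NewEncoder(os.Stdout)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(speech); err != nil {
		panic(err)
	}
}
```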
77. THINGS TO REMEMBER
➤ Enunciation helps, but it is not critical to success.
➤ But ambient noise can wreak havoc.
➤ You shouldn’t have to have thousands of samples, but you need more than a few.
➤ Don’t strive for perfection, but don’t blow up your ship on a misunderstanding.
➤ Treat phonetics/pronunciation codes like you would multilingual setups.
➤ 2 Words: DevOps
➤ Be polite. Give bonus points for niceties.
➤ Can be a cheap way to get cool into the enterprise and sneak in some alternate technologies.