Dekang Lin at AI Frontiers: Adding Conversation to GUIs

Adding Conversation to GUIs
Dekang Lin
Naturali

A Tale of Two Uber Rides
uber ride to crowne plaza sfo

Naturali
A Beijing-based startup company
Upgrade apps with a speech interface
Naturali Sesami
✦ Translate speech inputs to action sequences in apps and execute them on users’ behalf.
✦ Chinese version launched on LeTV phones as a system app on April 12, 2017
✦ Available as a third-party app on all Android phones since Aug. 2017

Advantages of Speech
Speed:
✦ voice input is three times as fast as typing
Hands-free:
✦ send messages, play music, order food
✦ turn on hotspot: 5 clicks
Mind-free:
✦ where is my luggage?

Voice Assistants
Chat window
Fulfillment by backend API calls

Chat + API: the down sides
Chat assistants displace apps, but chat is not the best mode of interaction for everything:
✦ editing
✦ browsing
✦ viewing
Nonetheless, there are plenty of needs for voice interaction.
who has access to this?

Re-invention of user experience inside the chat window:
✦ usually not as good as specialized apps,
✦ requires a great deal of repeated development effort

Economic interests of the assistant and the backend services may not be aligned.

Naturali Sesami
A thin, transparent translation layer over apps.
✦ voice ➜ front end UI actions
Seamless integration of speech and graphics
✦ Existing GUI interactions are still available
✦ Voice interaction is made available on any app page

Use Yelp to find greek food near Santa Clara Convention Center

Voice to Actions in Three Steps
Speech Recognition: sound → text
✦ data
Semantic Interpretation: text → intent
✦ knowledge
Plan Generation: intent → actions
✦ grounding
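
The three steps above can be read as a chain of functions. The following is only a minimal sketch of that chaining; the type aliases and the `voice_to_actions` name are assumptions made for illustration, and each stage is passed in by the caller rather than implemented here.

```python
from typing import Callable, Dict, List

# Hypothetical stage signatures, named after the three steps on the slide.
Recognizer = Callable[[bytes], str]              # sound -> text
Interpreter = Callable[[str], Dict[str, str]]    # text -> intent
Planner = Callable[[Dict[str, str]], List[str]]  # intent -> actions


def voice_to_actions(
    audio: bytes,
    recognize: Recognizer,
    interpret: Interpreter,
    plan: Planner,
) -> List[str]:
    """Run the three stages in order and return the planned actions."""
    text = recognize(audio)
    intent = interpret(text)
    return plan(intent)
```

Keeping the stages as separate callables mirrors the slide's split: speech recognition can be swapped (third party, open source, or in-house) without touching interpretation or planning.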

Speech Recognition: sound → text
Third party services
Open source tools

Naturali Speech
End-to-end DNN: CNN+LSTM+Attention+CTC
✦ built from scratch with TensorFlow
✦ trained with thousands of hours of transcribed speech
Personalized and contextualized language model:
✦ contact names
✦ app specific vocabulary

Semantic Interpretation: text → intent
An intent identifies a task and the necessary information (parameters) for the task
Example:
✦ task: FlightSearch
✦ parameters: (to, from, date, airline, class)
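
One way to picture an intent is as a task name plus a table of parameters, some of which may still be unknown. The sketch below assumes that representation; the `Intent` class, the `missing` method, and the `flight_search` helper are hypothetical names, not part of Sesami.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Parameter names taken from the FlightSearch example on the slide.
FLIGHT_SEARCH_SLOTS = ("to", "from", "date", "airline", "class")


@dataclass
class Intent:
    task: str
    # A value of None means the user has not supplied that parameter yet.
    parameters: Dict[str, Optional[str]] = field(default_factory=dict)

    def missing(self) -> List[str]:
        """Return the names of parameters that are still unfilled."""
        return [name for name, value in self.parameters.items() if value is None]


def flight_search(known: Dict[str, str]) -> Intent:
    """Build a FlightSearch intent from whatever parameters are known."""
    params = {slot: known.get(slot) for slot in FLIGHT_SEARCH_SLOTS}
    return Intent(task="FlightSearch", parameters=params)
```

The parameters are passed as a dictionary because `from` and `class` are Python keywords and cannot be used as keyword arguments.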

Entities and Types
Persons: singers/directors/contacts
Locations: cities/POIs/addresses
Apps and Games
Media: songs/shows/movies/books
Time and Date
Food
Sports teams
……

Recognizing Thousands of Types
It is not an option to use manually labeled training examples.
An alternative is to use naturally annotated data:
✦ Hearst patterns: NP_type such as NP_inst
✦ Other examples: navigate to NP_loc
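
As a rough illustration of how such patterns yield (instance, type) pairs without manual labels, the sketch below applies two regular expressions to raw text. It is a simplification under stated assumptions: the type is taken to be the single word before "such as" rather than a full noun phrase, and the function names are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# "<type> such as <inst>, <inst> and <inst>" up to the end of the clause.
SUCH_AS = re.compile(r"(\w+) such as ([^.;]+)")
# Separators between listed instances: commas, "and", "or".
SEPARATOR = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")
NAVIGATE = re.compile(r"navigate to (.+)", re.IGNORECASE)


def hearst_pairs(sentence: str) -> List[Tuple[str, str]]:
    """Extract (instance, type) pairs from 'TYPE such as INST' patterns."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        type_word = match.group(1).lower()
        for instance in SEPARATOR.split(match.group(2)):
            instance = instance.strip()
            if instance:
                pairs.append((instance, type_word))
    return pairs


def navigation_target(query: str) -> Optional[Tuple[str, str]]:
    """Treat whatever follows 'navigate to' as a location instance."""
    match = NAVIGATE.match(query.strip())
    return (match.group(1).strip(), "location") if match else None
```

For example, `hearst_pairs("She likes singers such as Adele, Beyonce and Sia.")` pairs each of the three names with the type `singers`.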

Multi-round Conversation
Complex intents may not be articulated in one shot
✦ FlightSearch(to, from, date, airline, class)
A multi-round conversation incrementally collects information from user and guides the user in the process.
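
A multi-round conversation of this kind is often implemented as slot filling: ask for each missing parameter in turn until the intent is complete. The sketch below assumes that approach; `collect_slots`, the prompt wording, and the `ask` callback are hypothetical, and a real assistant would also parse and validate each answer.

```python
from typing import Callable, Dict, Optional

# Parameter names from the FlightSearch example on the slide.
SLOTS = ("to", "from", "date", "airline", "class")


def collect_slots(
    known: Dict[str, str],
    ask: Callable[[str], str],
) -> Dict[str, Optional[str]]:
    """Ask one question per missing slot and return the completed slots."""
    filled: Dict[str, Optional[str]] = {slot: known.get(slot) for slot in SLOTS}
    for slot in SLOTS:
        if filled[slot] is None:
            # One conversational round: prompt the user, store the reply.
            filled[slot] = ask(f"What is the '{slot}' for your flight search?")
    return filled
```

Passing `ask` as a callback keeps the loop independent of the channel: it can be backed by `input`, a speech round trip, or a scripted list of answers in tests.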

Composite Intents
Messenger chat with Alex and say let’s meet on saturday
✦ OpenMessenger
✦ ChatWithPerson
✦ SendMessage
get a uber black ride to SFO
✦ UberRide
✦ SetDest
✦ SelectUberBlack
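
The two examples above can be written down as ordered lists of sub-intents. This is only an illustrative sketch: the sub-intent names come from the slide, while the parameter names (`person`, `text`, `destination`) and the helper functions are assumptions.

```python
from typing import Dict, List, Tuple

# A step is a sub-intent name plus its (possibly empty) parameters.
Step = Tuple[str, Dict[str, str]]


def messenger_chat(person: str, text: str) -> List[Step]:
    """'Messenger chat with Alex and say ...' as three ordered sub-intents."""
    return [
        ("OpenMessenger", {}),
        ("ChatWithPerson", {"person": person}),
        ("SendMessage", {"text": text}),
    ]


def uber_black_ride(destination: str) -> List[Step]:
    """'get a uber black ride to SFO' as three ordered sub-intents."""
    return [
        ("UberRide", {}),
        ("SetDest", {"destination": destination}),
        ("SelectUberBlack", {}),
    ]
```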

Plan Generation: intent → actions
Grounding: establishes the connection between the inside (the assistant) and the outside (apps and devices).
Example:
✦ intent:
{"task": "FlightStatus", "number": "UA888", "date": "2017-11-04"}
✦ action:
select * from flight_db where airline = 'United Airlines' and flight_num = '888' and year = 2017 and month = 11 and day = 4
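
To make the grounding step in this example concrete, the sketch below turns the FlightStatus intent into a parameterized query against a `flight_db` table. The table layout follows the slide's query; the `AIRLINES` lookup, the function names, and the assumption that a flight number is a two-letter code followed by digits are illustrative only.

```python
import re
import sqlite3
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical lookup from airline code to the name stored in flight_db.
AIRLINES = {"UA": "United Airlines"}

QUERY = (
    "SELECT * FROM flight_db "
    "WHERE airline = ? AND flight_num = ? AND year = ? AND month = ? AND day = ?"
)


def ground_flight_status(intent: Dict[str, str]) -> Tuple[str, tuple]:
    """Map a FlightStatus intent to a SQL string and its bound parameters."""
    match = re.fullmatch(r"([A-Z]{2})(\d+)", intent["number"])
    if match is None:
        raise ValueError(f"unrecognized flight number: {intent['number']}")
    code, flight_num = match.groups()
    day = date.fromisoformat(intent["date"])
    return QUERY, (AIRLINES[code], flight_num, day.year, day.month, day.day)


def flight_status(conn: sqlite3.Connection, intent: Dict[str, str]) -> List[tuple]:
    """Execute the grounded query and return the matching rows."""
    sql, params = ground_flight_status(intent)
    return conn.execute(sql, params).fetchall()
```

Binding values with `?` placeholders rather than formatting them into the SQL string keeps user-supplied text out of the query itself.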

Grounding by Crowd Sourcing
Skills = (context, expression, actions)
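
Reading a skill as a combination of context, expression, and actions, a crowd-sourced skill store could be sketched as below. This is an assumption-laden illustration: the `Skill` class, exact-match lookup, and the sample context and action strings are hypothetical, and a real system would match paraphrases rather than exact text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Skill:
    context: str              # where the skill applies, e.g. an app or page
    expression: str           # what the user says, stored normalized
    actions: Tuple[str, ...]  # UI actions to replay, in order


def normalize(utterance: str) -> str:
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(utterance.lower().split())


def find_skill(skills: Iterable[Skill], context: str, utterance: str) -> Optional[Skill]:
    """Return the first skill whose context and expression both match."""
    wanted = normalize(utterance)
    for skill in skills:
        if skill.context == context and skill.expression == wanted:
            return skill
    return None
```

A contributor would record the action sequence once in a given context, attach the expression they would say, and the stored triple can then be replayed whenever the same expression is heard in that context.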

Crowd Sourced Skills
Skills are immediately usable by the creator.
✦ The user may share the skills with others, e.g., tech support
for parents
Vetted skills can be made available to the public

Summary
Voice interaction is inevitable
Naturali Sesami translates user requests into sequences of actions in apps.
Sesami grows by crowd sourcing skills.
Join us!
✦ jobs@naturali.ai
