Anjuli Kannan, Software Engineer, Google at MLconf SF 2016

Smart Reply: Learning a Model of Conversation from Data. Smart Reply is a text assistance feature that was recently introduced to Inbox by Gmail. Given an incoming email message, the Smart Reply system analyzes its contents and suggests complete responses that the recipient can send with just one tap. This talk covers how we built Smart Reply using a combination of deep learning and semantic clustering, what we learned along the way, and why we think it shows promise for the future of dialogue models.

Confidential + Proprietary
Smart Reply: Learning a Model of Conversation from Data
Anjuli Kannan
Software Engineer, Google Brain

Can you do Tuesday or Wednesday?
Phil Sharp
Suggested replies: Tuesday | Wednesday

Smart Reply feature
● Provide text assistance for email reply composition
● Targeted at mobile
● Responses can be sent on their own or extended

Smart Reply feature predicts email responses
Incoming email → Smart Reply → Response email

Why is this task hard?
● Extracting meaning from the previous message
● Generating language
● Grammatical transformations between call and response
● Matching style and tone

Why is this solution interesting?
● Model is learned fully from data

Neural network
[Diagram: a neural network whose outputs are labeled "Is a 4", "Is a 5", ...; one unit is labeled "Neuron". Image: Wikipedia]

Basic building block is the neuron
Greg Corrado

Learn a function from one space to another
$f(\cdot)$ maps $x \in \mathbb{R}^n$ to $y \in \mathbb{R}^m$
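
To make the "function from one space to another" idea concrete, here is a minimal sketch of a tiny feed-forward network in plain Python. The layer sizes, the weights, the `tanh` and softmax choices, and the names `dense_layer`, `softmax`, `f`, and `params` are all illustrative assumptions; none of them come from the talk.

```python
import math

def dense_layer(x, weights, biases):
    """One layer of neurons: each takes a weighted sum of its inputs plus a bias, then applies tanh."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]

def softmax(scores):
    """Turn arbitrary scores into positive numbers that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def f(x, params):
    """Map an input x in R^n to an output y in R^m (here n = 3, m = 2)."""
    hidden = dense_layer(x, params["W1"], params["b1"])
    scores = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(params["W2"], params["b2"])
    ]
    return softmax(scores)

# Hypothetical hand-picked parameters; in practice they are learned from data.
params = {
    "W1": [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]],
    "b1": [0.0, 0.1],
    "W2": [[1.0, -1.0], [-1.0, 1.0]],
    "b2": [0.0, 0.0],
}

y = f([1.0, 0.0, -1.0], params)
print(y)  # two class probabilities that sum to 1
```

Training would adjust the numbers in `params` so that `f` maps each training input to its desired output.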

Recurrent neural networks handle sequences of input
Diagram by Felix Gers

Reading a word into a feed-forward neural network
[Diagram: the word "cat" is fed into the network, which produces an output]

Reading a sequence of words into an RNN
[Diagram, built up one word per step: "That" → "is" → "good" → "!", followed by the output]
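
A rough sketch of what "reading a sequence of words" means computationally: the network keeps a state vector and updates it once per word. This assumes a plain `tanh` recurrent cell and made-up two-dimensional word vectors; the talk does not say which cell type or embeddings were used, and `rnn_step`, `read_sequence`, and every number below are hypothetical.

```python
import math

def rnn_step(state, x, W_h, W_x, b):
    """Combine the previous state and the current word vector into a new state."""
    return [
        math.tanh(
            sum(wh * s for wh, s in zip(W_h[i], state))
            + sum(wx * xi for wx, xi in zip(W_x[i], x))
            + b[i]
        )
        for i in range(len(state))
    ]

def read_sequence(tokens, embeddings, W_h, W_x, b):
    """Feed the tokens in one at a time and return the final state."""
    state = [0.0] * len(b)
    for token in tokens:
        state = rnn_step(state, embeddings[token], W_h, W_x, b)
    return state

# Hypothetical word vectors and weights (state size 3, word vector size 2).
embeddings = {"That": [1.0, 0.0], "is": [0.0, 1.0], "good": [1.0, 1.0], "!": [-1.0, 0.5]}
W_h = [[0.5, -0.3, 0.0], [0.1, 0.4, -0.2], [0.0, 0.2, 0.6]]
W_x = [[0.7, -0.1], [0.2, 0.5], [-0.4, 0.3]]
b = [0.0, 0.1, -0.1]

short = read_sequence(["That", "is"], embeddings, W_h, W_x, b)
full = read_sequence(["That", "is", "good", "!"], embeddings, W_h, W_x, b)
print(len(short), len(full))  # the state has the same size for both inputs
```

The state keeps the same length however many words have been read, while its values depend on the whole sequence seen so far.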

Sequence-to-sequence model
Sutskever et al., NIPS 2014
● Encoder: ingests incoming message
● Decoder: generates reply message

Encoder ingests the incoming message
[Diagram, built up one word per step: "How" → "are" → "you" → "?"]
Internal state is a fixed-length encoding of the message

Decoder is initialized with final state of encoder
How are you ? __

Decoder predicts next word
How are you ? __ → I

Smart Reply model
Input (message, then reply so far)    Output (reply)
How are you ? __                      I
How are you ? __ I                    I am
How are you ? __ I am                 I am great
How are you ? __ I am great           I am great !
Vinyals & Le, ICML DL 2015
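
The word-by-word build-up above can be sketched as a loop that feeds each predicted word back in as input. This is a toy: `TOY_NEXT_WORD` is a hypothetical lookup table standing in for the trained decoder, `<start>` and `<end>` are assumed marker tokens, and picking the single most likely word at each step (greedy decoding) is a simplification chosen for the illustration.

```python
END = "<end>"

# Hypothetical stand-in for the decoder: P(next word | message, previous reply word).
TOY_NEXT_WORD = {
    ("How are you ?", "<start>"): {"I": 0.7, "Great": 0.3},
    ("How are you ?", "I"): {"am": 0.9, "was": 0.1},
    ("How are you ?", "am"): {"great": 0.6, "fine": 0.4},
    ("How are you ?", "great"): {"!": 0.8, END: 0.2},
    ("How are you ?", "!"): {END: 1.0},
}

def next_word_distribution(message, reply_so_far):
    """Return the toy distribution over the next reply word."""
    previous = reply_so_far[-1] if reply_so_far else "<start>"
    return TOY_NEXT_WORD.get((message, previous), {END: 1.0})

def greedy_decode(message, max_len=10):
    """Repeatedly pick the most likely next word and append it to the reply."""
    reply = []
    while len(reply) < max_len:
        dist = next_word_distribution(message, reply)
        word = max(dist, key=dist.get)
        if word == END:
            break
        reply.append(word)
    return reply

print(" ".join(greedy_decode("How are you ?")))  # I am great !
```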

Inference
● Resulting model is fully generative
● Output distribution can be used to determine the most likely responses using a beam search
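
A minimal sketch of beam search over a toy next-word distribution. `TOY_MODEL`, the marker tokens, and the beam width are assumptions made for the illustration; a real system would query the trained decoder instead of a lookup table.

```python
import math

END = "<end>"

# Hypothetical toy model: P(next word | previous word), for one fixed incoming message.
TOY_MODEL = {
    "<start>": {"I": 0.6, "Sounds": 0.4},
    "I": {"am": 0.6, "will": 0.4},
    "am": {"great": 0.6, "fine": 0.4},
    "will": {END: 1.0},
    "great": {END: 1.0},
    "fine": {END: 1.0},
    "Sounds": {"good": 1.0},
    "good": {END: 1.0},
}

def beam_search(model, beam_width=2, max_len=5):
    """Keep the beam_width most probable partial replies at each step."""
    beams = [(0.0, ["<start>"])]  # (log probability, tokens so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for logp, tokens in beams:
            for word, p in model[tokens[-1]].items():
                candidate = (logp + math.log(p), tokens + [word])
                if word == END:
                    finished.append(candidate)
                else:
                    candidates.append(candidate)
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    finished.sort(key=lambda c: c[0], reverse=True)
    # Strip the start and end markers and convert log probabilities back.
    return [(" ".join(tokens[1:-1]), math.exp(logp)) for logp, tokens in finished]

results = beam_search(TOY_MODEL)
for text, prob in results:
    print(f"{prob:.3f}  {text}")
```

In this toy, always taking the single best next word would start with "I" and end at "I am great", while the beam also keeps "Sounds" alive and finds that "Sounds good" has the higher overall probability.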

Training
● Training data is a corpus of email-reply pairs
● Both encoder and decoder are trained together (end-to-end)
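
The slides do not state the training objective. A common choice for sequence-to-sequence models, given here as an assumption, is to maximize the log-likelihood of each observed reply given its incoming email. The symbols are introduced only for this sketch: $\mathcal{D}$ is the corpus of email-reply pairs $(o, r)$, $r_t$ is the $t$-th word of the reply, and $\theta$ collects the encoder and decoder parameters, which are updated jointly.

```latex
\theta^{*} = \arg\max_{\theta} \sum_{(o, r) \in \mathcal{D}} \log P(r \mid o; \theta),
\qquad
\log P(r \mid o; \theta) = \sum_{t=1}^{|r|} \log P(r_t \mid o, r_1, \ldots, r_{t-1}; \theta)
```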

Key points about model
● Everything is learned from data, even features
● Neural network smooths across language variation

Deployment & coverage
● Deployed in Inbox by Gmail
● Used to assist with more than 10% of all mobile replies

Quality
● How do we ensure that the response options are always high quality in content and language?
○ Avoid incorrect grammar, mechanics, and misspellings, e.g., "your the best"
○ Avoid inappropriate or offensive responses, e.g., "Leave me alone."
○ Deal with wide variability and informal language, e.g., "got it thx"
● Restricting model vocabulary is not sufficient!
Solution: Restrict to a fixed set of valid responses, derived automatically from data.
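
One way to picture the restriction: instead of letting the decoder emit arbitrary text, score only the responses in the allowed set and suggest the best ones. The sketch below is an assumption-laden toy. `ALLOWED` is a hand-written list (the slides say the real set is derived automatically from data), and `toy_score` is a word-overlap stand-in for the model's score of a response given the message.

```python
def rank_allowed_responses(score, message, allowed_responses, top_k=3):
    """Score every allowed response for this message and return the top_k."""
    scored = [(score(message, response), response) for response in allowed_responses]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [response for _, response in scored[:top_k]]

def toy_score(message, response):
    """Hypothetical stand-in for the model's score: number of shared words."""
    message_words = set(message.lower().replace("?", "").split())
    response_words = set(response.lower().replace(".", "").split())
    return len(message_words & response_words)

# Hypothetical response set, written by hand for the example.
ALLOWED = [
    "Tuesday works for me.",
    "Wednesday works for me.",
    "Sorry, I can't make it.",
    "Thanks!",
]

suggestions = rank_allowed_responses(
    toy_score, "Can you do Tuesday or Wednesday?", ALLOWED, top_k=2
)
print(suggestions)
```

Because every suggestion comes from the allowed set, misspelled or inappropriate text never reaches the user, whatever the model would otherwise generate.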

What the model can do

What the model can't do
● Match every user's tone and style
● Ensure diverse options
● Access and update any kind of state or knowledge base

Conclusions
● Sequence-to-sequence produces plausible email replies in many common scenarios, when trained on an email corpus
● Smart Reply is deployed in Inbox by Gmail and generates more than 10% of mobile replies
● A conversation model learned entirely from data is very powerful
● A data-driven approach can be complementary to hand-crafted rules and scenarios

Collaborators
- Greg Corrado, Oriol Vinyals (Google Brain)
- Balint Miklos, Tobias Kaufman, Laszlo Lukacs, and Karol Kurach (Gmail)
- Sujith Ravi (Google Research)

Section-title and image-only slides (slide number, title):
2. Problem
9. Model
28. Inference
47. Training
51. Smart Reply in Production
53. Examples
55. Most frequently used clusters
64. Thank you!
65. Extra slides
66. Example
67. Unique cluster and suggestion usage
68. Ranking experiments