A Crowd-Powered Conversational Assistant That Automates Itself Over Time

Live Note/QA: http://tinyurl.com/KenDefense
1 / 85
[ Question / Feedback: http://tinyurl.com/KenDefense ]
Ting-Hao (Kenneth) Huang, Carnegie Mellon University
A Crowd-Powered Conversational Assistant That
Automates Itself Over Time

2 / 85
A Crowd-Powered
Conversation
Assistant
• CHI’18, CHI LBW’16
• UIST’17, UIST Poster’17
• HCOMP’17, ’16, ’15, HCOMP DC’16, HCOMP WIP’14
• CI’17
• CSCW Workshop’17
Chorus

3 / 85

4 / 85

5 / 85

6 / 85

7 / 85

8 / 85

9 / 85
What just
happened?
• Open Conversation
• Multi-turn interaction
• Multiple domains
• Personalized
• Coherent dialog
• Mix of task-oriented
and social conversation

10 / 85
Today’s Conversational Assistants…
“What’s new
with Alexa?”
“Talking to Siri”

11 / 85
Open Conversation
Personal
Assistants
Automated

12 / 85
Existing Approaches to
Open Conversation
• Combining multiple automated dialog systems
• DialPort (Zhao, et al., 2016)
• End-to-end framework for dialogue systems
• Serban, et al. 2016; Li, et al. 2017
• Adapting a model to many other domains
• Walker, et al., 2007; Sun, et al., 2016
• Chit-chat systems (social bot)
• Hold social conversations (Banchs, et al., 2012)
• Still a very hard problem…

13 / 85
Existing Approaches to
Open Conversation
• Combining multiple task-oriented dialog systems
• DialPort (Zhao, et al., 2016)
• End-to-end framework for dialogue systems
• Serban, et al. 2016; Li, et al. 2017
• Adapting a model to many other domains
• Walker, et al., 2007; Sun, et al., 2016
• Chit-chat systems (social bot)
• Hold social conversations (Banchs, et al., 2012)
• Still a very hard problem…
MIT Technology Review
Feb 27, 2018

14 / 85
Open Conversation
Personal
Assistants
AI-Powered
Dialog Systems
Automated

16 / 85
Open Conversation
Personal
Assistants
AI-Powered
Dialog Systems
Automated
Crowd-Powered
Dialog Systems

19 / 85
Thesis Statement
By allowing new chatbots to be easily integrated, reusing prior
crowd answers, and gradually reducing the crowd's role in
choosing high-quality responses,
a deployed crowd-powered dialog system can be automated
over time to support real-world open conversations.

20 / 85
Thesis Statement
Chorus Deployment
[ HCOMP’16, HCOMP’17 ]

21 / 85
Thesis Statement
Chorus Deployment Evorus
[ HCOMP’16, HCOMP’17 ] [ CHI’18 , UIST Poster’17 ]

22 / 85
Thesis Statement
Guardian
[ HCOMP’15, CI’17 ]

23 / 85
Open Conversation
Personal
Assistants
AI-Powered
Dialog Systems
Automated
Crowd-Powered
Dialog Systems

24 / 85
Chorus: A Crowd-Powered
Conversation Assistant

25 / 85
Chorus: A Crowd-Powered Conversational Assistant
Lasecki, et al. UIST’13.
• Crowd workers collectively hold a conversation by:
1. Proposing responses
2. Voting on responses
3. Taking notes
• Reward points for each action
• Agreement bonus
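The propose-and-vote loop above can be sketched in a few lines. This is only an illustration under stated assumptions, not the deployed Chorus logic: the class names, the point values, and the vote threshold are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field

# Hypothetical values chosen for illustration only; not from the talk.
POINTS = {"propose": 10, "vote": 5}
AGREEMENT_BONUS = 20
ACCEPT_THRESHOLD = 2  # votes needed before a response is sent to the user


@dataclass
class Candidate:
    text: str
    proposer: str
    voters: set = field(default_factory=set)


class Session:
    def __init__(self):
        self.points = {}  # worker id -> accumulated reward points
        self.sent = []    # responses forwarded to the user

    def _reward(self, worker, amount):
        self.points[worker] = self.points.get(worker, 0) + amount

    def propose(self, worker, text):
        self._reward(worker, POINTS["propose"])
        return Candidate(text, worker)

    def vote(self, worker, cand):
        if worker == cand.proposer or worker in cand.voters:
            return  # ignore self-votes and duplicate votes
        cand.voters.add(worker)
        self._reward(worker, POINTS["vote"])
        if len(cand.voters) == ACCEPT_THRESHOLD:
            self.sent.append(cand.text)
            # Agreement bonus for everyone who backed the accepted response.
            for w in {cand.proposer, *cand.voters}:
                self._reward(w, AGREEMENT_BONUS)


session = Session()
cand = session.propose("w1", "Sure! What types of things does your friend like?")
session.vote("w2", cand)
session.vote("w3", cand)
print(session.sent)
print(session.points)
```

Paying the bonus only when workers agree is one simple way to reward consensus; a real deployment would also need to handle note-taking and malicious workers, which this sketch leaves out.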

26 / 85
User Interface

27 / 85
User & Worker Interface

28 / 85

29 / 85
We Deployed Chorus
• Launched on May 20th, 2016
• On Google Hangouts
• 2200+ conversations, 420+ users
• TalkingToTheCrowd.org

30 / 85
Gift Suggestion
User: Can you suggest some birthday present for one of my friend?
Chorus: Sure! What types of things does your friend like?
User: female, computer science PhD student in Texas
User: we're going to visit her this weekend from Pittsburgh
User: She's in Austin
Chorus: Does she have any favorite TV shows, movies, or video games?

33 / 85
Travel Planning
User: How many suitcases can I take on a flight from the US to Israel?
Chorus: Can I ask you from where are you planning to board the flight?
Chorus: and which air services are you using?
User: Pittsburgh
Chorus: with which company are you flying?
User: Let me check
Full transcript: Huang, et al. HCOMP 2016.

34 / 85
What Did We Learn?
• Challenges Identified
• Malicious workers & users
• Identifying the end of a conversation
• When workers’ consensus is not enough…
• Basic Statistics
• Avg. session duration = 10.63 min (SD = 8.38)
• Avg. #messages per session = 25.87 (SD = 27.27)
Foundation for future automation!

35 / 85
Open Conversation
Personal
Assistants
AI-Powered
Dialog Systems
Automated
Crowd-Powered
Dialog Systems
Chorus Deployment

37 / 85
Evorus: A Crowd-Powered Conversational Assistant
Built to Automate Itself Over Time
[ UIST Poster’17, CHI’18 ]

38 / 85
Automating Chorus Over Time

39 / 85

40 / 85

41 / 85

42 / 85
Empower Chorus with Multiple Chatbots

43 / 85
Chatbots
How to select
chatbots
automatically?

44 / 85
Ranking Chatbots: Performance & Topic
Posterior of a Chatbot ≈ Chatbot’s Performance × Topic Similarity

45 / 85
Posterior of a Chatbot ≈ Chatbot’s Performance × Topic Similarity
Chatbot’s Performance ≈ Overall Message Acceptance Rate

48 / 85
Topic Similarity ≈ similarity between the User Message and the Domain of the Chatbot
• User Message: “Hey what should I eat in Montreal?”
• Domain of the Chatbot, given by Example Triggering Messages: “Find me some good restaurants!”, “Where can I get Chinese food?”

50 / 85
Posterior of a Chatbot ≈ Chatbot’s Performance × Topic Similarity
• Add more chatbots over time!
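The ranking idea on these slides can be sketched as a product of two scores. This is a rough illustration, not the Evorus implementation: bag-of-words cosine similarity is swapped in as a stand-in for whatever similarity measure the system uses, and the bot names, acceptance rates, and the weather example are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for a real text representation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def topic_similarity(user_message, triggering_messages):
    """Best match between the user message and a bot's example triggering messages."""
    msg = bag_of_words(user_message)
    return max((cosine(msg, bag_of_words(t)) for t in triggering_messages), default=0.0)


def rank_chatbots(user_message, chatbots):
    """Score = performance (acceptance rate) x topic similarity, highest first."""
    scored = [
        (bot["acceptance_rate"] * topic_similarity(user_message, bot["examples"]), bot["name"])
        for bot in chatbots
    ]
    return sorted(scored, reverse=True)


# Hypothetical bots; only the restaurant example messages come from the slides.
chatbots = [
    {"name": "restaurant_bot", "acceptance_rate": 0.3,
     "examples": ["Find me some good restaurants !", "Where can I get Chinese food?"]},
    {"name": "weather_bot", "acceptance_rate": 0.5,
     "examples": ["Will it rain tomorrow?"]},
]
ranking = rank_chatbots("Hey what should I eat in Montreal?", chatbots)
print(ranking)
```

Because each bot carries its own acceptance rate and example messages, adding a new chatbot only requires appending one more entry to the list.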

51 / 85

52 / 85
Automatic Upvote
How to estimate
the impact of an
automation?

53 / 85
Find the Best Confidence Threshold
• High Threshold
• Only vote when pretty sure
• High precision, but little benefit
• Low Threshold
• Nearly always vote
• Grant agreement bonus by mistake
• Damage conversation quality
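The trade-off above can be explored by sweeping the threshold over past data. The sketch below is a minimal illustration, assuming each past candidate response has a model confidence and a label saying whether it turned out to be a good response; the history values and function name are hypothetical.

```python
def sweep_thresholds(scored, thresholds):
    """scored: list of (confidence, was_good) pairs for past candidate responses."""
    rows = []
    for t in thresholds:
        # The bot upvotes every candidate whose confidence reaches the threshold.
        voted = [good for conf, good in scored if conf >= t]
        auto_votes = len(voted)
        wrong = voted.count(False)
        precision = (auto_votes - wrong) / auto_votes if auto_votes else None
        rows.append({"threshold": t, "auto_votes": auto_votes,
                     "wrong_votes": wrong, "precision": precision})
    return rows


# Hypothetical history, for illustration only.
history = [(0.95, True), (0.90, True), (0.80, True), (0.70, False),
           (0.60, True), (0.40, False), (0.30, False), (0.20, True)]
rows = sweep_thresholds(history, [0.9, 0.5, 0.1])
for row in rows:
    print(row)
```

On this toy data, the high threshold votes rarely but is always right, while the low threshold votes on everything and makes more mistakes, which mirrors the two bullet groups on the slide.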

54 / 85

55 / 85
Automating Open Conversation
• Setup
• A 5-month-long deployment, 80 Users
• 4 chatbots + 1 voting bot
• Result
• Automated responses were chosen 12.44% of the time.
• Human upvotes were reduced by 13.81%.
• The cost of each message is reduced by 32.76%.
• Conversation quality and user satisfaction were maintained.
• Conversation quality measures: Satisfaction, Clarity, Responsiveness, Comfort (Liu, et al., 2010)

56 / 85
Open Conversation
Personal
Assistants
AI-Powered
Dialog Systems
Automated
Crowd-Powered
Dialog Systems
Chorus Deployment
Evorus
[ CHI’18 , UIST Poster’17 ]

58 / 85
Empower Chorus with Multiple Chatbots

59 / 85
How to build a set of
chatbots quickly?

61 / 85
Use Web APIs to Empower Chorus
19,758+ APIs
How to convert a Web API into a chatbot?

62 / 85
Guardian: A Crowd-Powered Spoken Dialog
System for Web APIs
[ HCOMP WIP’14, HCOMP’15, CI’17 ]

63 / 85
Guardian: A Crowd-Powered Dialog System for Web APIs
User: Hi, I’m in San Diego. Any Chinese restaurants here?
1. Language Understanding: term = Chinese, location = San Diego
2. Dialog Management: Yelp Search API 2.0 returns JSON: { ... "name": "Mandarin Wok Restaurant", ... "address": ["4227 Balboa Ave", ...], … }
3. Response Generation: Mandarin Wok Restaurant is good! It’s on 4227 Balboa Ave.
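The three stages on this slide can be sketched as three small functions. This is a toy illustration, not Guardian itself: in Guardian the crowd performs these steps, whereas here keyword rules stand in for the workers, and `call_api` is a hypothetical stub that returns canned JSON based on the slide instead of contacting any real service.

```python
import json


def understand(utterance):
    """Step 1, Language Understanding: toy keyword rules standing in for crowd workers."""
    text = utterance.lower()
    params = {}
    if "chinese" in text:
        params["term"] = "Chinese"
    if "san diego" in text:
        params["location"] = "San Diego"
    return params


def call_api(params):
    """Step 2, Dialog Management: hypothetical stub, no network access."""
    if params.get("term") == "Chinese" and params.get("location") == "San Diego":
        return json.dumps({"name": "Mandarin Wok Restaurant",
                           "address": ["4227 Balboa Ave"]})
    return json.dumps({})


def respond(api_json):
    """Step 3, Response Generation: turn the JSON result into a sentence."""
    result = json.loads(api_json)
    if not result:
        return "Sorry, I could not find anything."
    return f"{result['name']} is good! It's on {result['address'][0]}."


params = understand("Hi, I'm in San Diego. Any Chinese restaurants here?")
reply = respond(call_api(params))
print(params)
print(reply)
```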

64 / 85
Parameter Extraction
User: Hi, I’m in San Diego. Any Chinese restaurants here?
Yelp Search API parameters: offset, term, location, sw_latitude, sw_longitude, category_filter, accuracy, deals_filter, radius_filter, ...

65 / 85
Parameter Extraction
User: Any Chinese restaurants here?
Yelp Search API parameters: offset, term, location, sw_latitude, sw_longitude, category_filter, accuracy, deals_filter, radius_filter, ...
1. How to extract parameters?
2. Which parameters to use?

66 / 85
How to Extract Parameters?

67 / 85
Crowd-Powered Parameter Extraction
• Recruited players answer under a time constraint (10 – 20 sec)
• Answers are aggregated, e.g., Location = San Diego
Real-time On-Demand Crowd-powered Entity Extraction. Huang, et al. Collective Intelligence 2017.
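One simple way to aggregate answers collected under a time limit is a majority vote over the answers that arrive in time. The sketch below assumes that scheme for illustration; the function name, the agreement threshold, and the sample answers are hypothetical, and the published system may aggregate differently.

```python
from collections import Counter


def aggregate_answers(answers, time_limit_sec=20.0, min_agreement=2):
    """answers: list of (seconds_since_start, worker_answer) pairs.

    Returns the most common in-time answer if enough workers agree, else None.
    """
    in_time = [a.strip().lower() for t, a in answers if t <= time_limit_sec]
    if not in_time:
        return None
    value, count = Counter(in_time).most_common(1)[0]
    return value if count >= min_agreement else None


# Hypothetical worker answers for the "location" parameter.
answers = [(4.2, "San Diego"), (7.9, "san diego"), (11.0, "California"), (25.3, "San Diego ")]
location = aggregate_answers(answers)
print(location)
```

The last answer arrives after the limit and is ignored, so the result depends only on what workers submitted within the time constraint.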

68 / 85
Which Parameters to Use?

69 / 85
Parameter Rating Problem
Candidate parameters: offset, term, location, sw_latitude, sw_longitude, category_filter, accuracy, deals_filter, radius_filter, ...
Pick good parameters for the dialog system.

70 / 85
How about just doing a survey?
Task
Parameter Name / Description

71 / 85
Match Questions with Parameters
Q: What do you want to eat? A: I like Chinese food.
Q: Which city are you in? A: I’m in Pittsburgh.
Q: Is it dinner or lunch? A: Dinner.
...
Yelp API
Question Collection

72 / 85
[Figure: collected question-answer pairs (e.g., “Dinner.”) shown next to Yelp API parameters: offset, term, location, sw_latitude, sw_longitude, category_filter]
Yelp API
Question Collection
Parameter Filtering

73 / 85
[Figure: collected question-answer pairs (“? !”) grouped under Yelp API parameters such as location, term, and category_filter; label: “Better Parameter”]
Yelp API
Question Collection
Parameter Filtering
Question-Parameter Matching

74 / 85
Evaluation on Parameter Ranking
[Bar chart: MAP and MRR (axis 0 to 1) for three methods: Question Matching, Ask Siri, Ask a Friend]
• Average results of 8 Web APIs’ parameters
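For readers unfamiliar with the two metrics, the sketch below computes MRR and MAP using their standard definitions. The example rankings and relevance labels are made up for illustration and are not the data behind this evaluation.

```python
def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none appears."""
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / i
    return 0.0


def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    hits, total = 0, 0.0
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical parameter rankings for two APIs, with hypothetical relevance labels.
runs = [
    (["term", "offset", "location", "radius_filter"], {"term", "location"}),
    (["offset", "location", "term"], {"term", "location"}),
]
mrr = mean(reciprocal_rank(r, rel) for r, rel in runs)
map_score = mean(average_precision(r, rel) for r, rel in runs)
print(mrr, map_score)
```

MRR rewards putting one good parameter at the top, while MAP rewards ranking all good parameters early, so reporting both gives a fuller picture of a ranking method.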

76 / 85
Evaluation: Task Completion Rate
Task                                          API Result     Final Response
Find Chinese restaurants in Pittsburgh.       9 out of 10    10 out of 10
Check current weather by using a zip code.    9 out of 10    9 out of 10
Find information of “Titanic”.                6 out of 10    10 out of 10
• Crowd Recover Errors

77 / 85
Open Conversation
Personal
Assistants
AI-Powered
Dialog Systems
Automated
Crowd-Powered
Dialog Systems
Chorus Deployment
Evorus
Guardian
[ HCOMP’15, CI’17 ]

78 / 85
Thesis Statement

79 / 85
Thesis Statement
Guardian
[ HCOMP’15, CI’17 ]

80 / 85
Some More Projects…
• Ignition: HCOMP’17
• WearMail: Swaminathan et al. UIST’17
• InstructableCrowd: CHI LBW’16, TOCHI (Under Review)
• Visual Storytelling (VIST): NAACL’16, Ferraro et al. EMNLP’15
• EmotionLines: Chen et al., LREC’18

81 / 85
Crowd Research is Critical
For Building Future Computer Systems.
• Collect data to guide AI models
• Accomplish tasks that are not yet fully automated
• Pave the way for future AI systems

82 / 85
Future Work
• Deployed Chorus as an Open Research Platform
  • Chorus API
  • 1000+ chatbots
• Chorus on Smart Devices
  • Echo, Google Home…
• Future Crowd-AI Systems!
  • Object Recognition
  • Speech Recognition
  • Programming Tools
  • … And More!

84 / 85
Acknowledgment
• Family, Yan-Zhu (Lavender) Chen
• Jeffrey P. Bigham
• Walter S. Lasecki, Chris Callison-Burch, Alex Rudnicky, Margaret
Mitchell, Lun-Wei Ku, Hsin-Hsi Chen, Saiph Savage, Jane Hsu…
• Shoou-I Yu, Joseph Chee Chang, Chih-Yi (Jessica) Lin, Shihyun Lo,
Chu-Cheng Lin, Yun-Nung (Vivian) Chen, Lingpeng Kong, Luan Yi,
William Wang, Zi Yang, Yen-Chia Hsu, Kuen-Bang Hou (Favonia),
Kerry Shih-Ping Chang, Janet Huang, Yi-Chia Wang, Kai-min Kevin
Chang…
• Anhong Guo, Sai Ganesh, Kotaro Hara, Yashesh Gaur, Gierad Laput,
Robert Xiao, Yang Zhang, Patrick Carrington, Luz Rello, Cole Gleason,
Kristin Williams, Alex Chen, Susumu Saito…
• Amos Azaria, Oscar Romero Lopez…
• Stacey Young

85 / 85
@windx0303
KennethHuang.cc
Ting-Hao (Kenneth) Huang
Carnegie Mellon University
tinghaoh@cs.cmu.edu
Thank you!

86 / 85
Backup Slides

87 / 85

88 / 85
Automatic Voting

89 / 85
Find the Best Confidence Threshold
Expected Reward Points Saved
