Live Note/QA: http://tinyurl.com/KenDefense
[ Question / Feedback: http://tinyurl.com/KenDefense ]
Ting-Hao (Kenneth) Huang, Carnegie Mellon University
A Crowd-Powered Conversational Assistant That
Automates Itself Over Time
Chorus: A Crowd-Powered Conversation Assistant
• CHI’18, CHI LBW’16
• UIST’17, UIST Poster’17
• HCOMP’17, ’16, ’15, HCOMP DC’16, HCOMP WIP’14
• CI’17
• CSCW Workshop’17
What just happened?
• Open Conversation
• Multi-turn interaction
• Multiple domains
• Personalized
• Coherent dialog
• Mix of task-oriented and social conversation
Today’s Conversational Assistants…
“What’s new with Alexa?”
“Talking to Siri”
[Diagram: axes “Open Conversation” and “Automated”; labeled: Personal Assistants]
Existing Approaches to Open Conversation
• Combining multiple automated dialog systems
  • DialPort (Zhao et al., 2016)
• End-to-end framework for dialogue systems
  • Serban et al., 2016; Li et al., 2017
• Adapting a model to many other domains
  • Walker et al., 2007; Sun et al., 2016
• Chit-chat systems (social bots)
  • Hold social conversations (Banchs et al., 2012)
• Still a very hard problem…
MIT Technology Review, Feb 27, 2018
[Diagram: axes “Open Conversation” and “Automated”; labeled: Personal Assistants, AI-Powered Dialog Systems, Crowd-Powered Dialog Systems]
Thesis Statement
By allowing new chatbots to be easily integrated, reusing prior
crowd answers, and gradually reducing the crowd's role in
choosing high-quality responses,
a deployed crowd-powered dialog system can be automated
over time to support real-world open conversations.
• Chorus Deployment [ HCOMP’16, HCOMP’17 ]
• Evorus [ CHI’18, UIST Poster’17 ]
• Guardian [ HCOMP’15, CI’17 ]
Chorus: A Crowd-Powered
Conversation Assistant
[ HCOMP’16, HCOMP’17 ]
Chorus: A Crowd-Powered Conversational Assistant
Lasecki et al., UIST’13.
• Crowd workers collectively hold a conversation by:
  1. Proposing responses
  2. Voting on responses
  3. Taking notes
• Reward points for each action
• Agreement bonus
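The propose / vote / take-notes loop with reward points and an agreement bonus can be sketched roughly as follows. This is a minimal illustration, not the Chorus implementation: the point values, the vote threshold, and the `ChorusRound` class are all hypothetical names and numbers chosen for the example.

```python
from collections import defaultdict

# All values below are hypothetical; the slides do not state them.
POINTS = {"propose": 5, "vote": 1, "note": 2}
AGREEMENT_BONUS = 10
VOTE_THRESHOLD = 2  # votes needed before a response is sent to the user


class ChorusRound:
    """Tracks proposals, votes, notes, and reward points for one turn."""

    def __init__(self):
        self.proposer = {}              # response text -> proposing worker
        self.votes = defaultdict(set)   # response text -> set of voters
        self.points = defaultdict(int)  # worker -> reward points
        self.notes = []

    def propose(self, worker, text):
        self.proposer.setdefault(text, worker)
        self.points[worker] += POINTS["propose"]

    def vote(self, worker, text):
        """Returns the response text once it reaches the threshold, else None."""
        if text not in self.proposer or worker in self.votes[text]:
            return None  # unknown response or duplicate vote
        self.votes[text].add(worker)
        self.points[worker] += POINTS["vote"]
        if len(self.votes[text]) == VOTE_THRESHOLD:
            # Workers who agreed on the accepted response share a bonus.
            for w in {self.proposer[text], *self.votes[text]}:
                self.points[w] += AGREEMENT_BONUS
            return text
        return None

    def take_note(self, worker, note):
        self.notes.append(note)
        self.points[worker] += POINTS["note"]


round_ = ChorusRound()
round_.propose("w1", "Sure! What types of things does your friend like?")
first = round_.vote("w2", "Sure! What types of things does your friend like?")
second = round_.vote("w3", "Sure! What types of things does your friend like?")
```

Here the second vote crosses the (assumed) threshold, so the response is released and all three workers receive the bonus on top of their per-action points.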
User Interface
User & Worker Interface
We Deployed Chorus
• Launched on May 20th, 2016
• On Google Hangouts
• 2200+ conversations, 420+ users
• TalkingToTheCrowd.org
Example: Gift Suggestion
User: Can you suggest some birthday present for one of my friend?
Chorus: Sure! What types of things does your friend like?
User: female, computer science PhD student in Texas
User: we're going to visit her this weekend from Pittsburgh
User: She's in Austin
Chorus: Does she have any favorite TV shows, movies, or video games?
Example: Travel Planning
User: How many suitcases can I take on a flight from the US to Israel?
Chorus: Let me check
Chorus: Can I ask you from where are you planning to board the flight?
Chorus: and which air services are you using?
User: Pittsburgh
Chorus: with which company are you flying?
Full transcript: Huang, et al. HCOMP 2016.
What Did We Learn?
• Challenges identified
  • Malicious workers and users
  • Identifying the end of a conversation
  • When workers’ consensus is not enough…
• Basic statistics
  • Avg. session duration = 10.63 min (SD = 8.38)
  • Avg. number of messages per session = 25.87 (SD = 27.27)
Foundation for future automation!
[Diagram: axes “Open Conversation” and “Automated”; labeled: Personal Assistants, AI-Powered Dialog Systems, Crowd-Powered Dialog Systems, Chorus Deployment [ HCOMP’16, HCOMP’17 ]]
Evorus: A Crowd-Powered Conversational Assistant
Built to Automate Itself Over Time
[ UIST Poster’17, CHI’18 ]
Automating Chorus Over Time
Empower Chorus with Multiple Chatbots
Chatbots
How to select
chatbots
automatically?
Ranking Chatbots: Performance & Topic
Posterior of a Chatbot ≈ Chatbot’s Performance × Topic Similarity
• Chatbot’s Performance ≈ overall message acceptance rate
• Topic Similarity: between the user message and the domain of the chatbot
  • User message: “Hey what should I eat in Montreal?”
  • Example triggering messages: “Find me some good restaurants!”, “Where can I get Chinese food?”
• Add more chatbots over time!
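A rough sketch of this ranking, assuming the two factors named on the slide (performance and topic similarity) are multiplied to score each chatbot. The similarity measure here is plain bag-of-words cosine similarity against each chatbot's example triggering messages, which is a stand-in chosen for brevity and not necessarily what Evorus uses; the function names, chatbot names, and acceptance rates are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a deliberately simple text representation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def topic_similarity(message, triggering_examples):
    """Best match between the message and any example triggering message."""
    msg = bag_of_words(message)
    return max(
        (cosine(msg, bag_of_words(t)) for t in triggering_examples), default=0.0
    )


def rank_chatbots(message, chatbots):
    """Score = acceptance rate (performance) x topic similarity, best first."""
    scored = [
        (bot["acceptance_rate"] * topic_similarity(message, bot["examples"]),
         bot["name"])
        for bot in chatbots
    ]
    return sorted(scored, reverse=True)


# Hypothetical chatbots and acceptance rates, for illustration only.
chatbots = [
    {"name": "restaurant_bot", "acceptance_rate": 0.3,
     "examples": ["Find me some good restaurants!",
                  "Where can I get Chinese food?"]},
    {"name": "weather_bot", "acceptance_rate": 0.5,
     "examples": ["What's the weather in Pittsburgh?"]},
]
ranked = rank_chatbots("Any good Chinese food nearby?", chatbots)
```

Word overlap only works when the user message shares vocabulary with the examples. A paraphrase such as “Hey what should I eat in Montreal?” shares almost no words with the restaurant examples, so a deployed system would need a richer similarity measure than this sketch.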
Automatic Upvote
How to estimate
the impact of an
automation?
Find the Best Confidence Threshold
• High Threshold
  • Only vote when pretty sure
  • High precision, but little benefit
• Low Threshold
  • Nearly always vote
  • Grant agreement bonus by mistake
  • Damage conversation quality
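The trade-off above can be explored by replaying logged data at several thresholds and counting what an automatic vote would have saved or cost. The sketch below assumes a log of (classifier confidence, whether the crowd eventually accepted the response) pairs; the point values, the cost of a mistaken bonus, and the sample history are hypothetical.

```python
def net_points_saved(history, threshold, vote_points=1.0, mistaken_bonus_cost=3.0):
    """Net reward points saved if the bot upvotes every response whose
    confidence is at or above the threshold.

    A correct automatic vote saves the points a human upvote would have cost;
    a wrong one is charged the (assumed) cost of a mistaken agreement bonus.
    """
    saved = 0.0
    for confidence, accepted_by_crowd in history:
        if confidence >= threshold:
            saved += vote_points if accepted_by_crowd else -mistaken_bonus_cost
    return saved


def best_threshold(history, candidates):
    """Candidate threshold with the highest net savings on the log."""
    return max(candidates, key=lambda t: net_points_saved(history, t))


# Hypothetical log: (confidence, response was accepted by the crowd)
history = [
    (0.95, True), (0.90, True), (0.80, True), (0.70, False),
    (0.60, True), (0.40, False), (0.20, False),
]
candidates = [0.10, 0.50, 0.75, 0.99]
chosen = best_threshold(history, candidates)
```

With this toy log, the lowest threshold loses points to mistaken votes and the highest one never votes at all, which mirrors the two failure modes listed on the slide.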
Automating Open Conversation
• Setup
  • A 5-month-long deployment, 80 users
  • 4 chatbots + 1 voting bot
• Results
  • Automated responses were chosen 12.44% of the time.
  • Human upvotes were reduced by 13.81%.
  • The cost of each message was reduced by 32.76%.
  • Conversation quality and user satisfaction were maintained.
    • Conversation quality: Satisfaction, Clarity, Responsiveness, Comfort (Liu et al., 2010)
[Diagram: axes “Open Conversation” and “Automated”; labeled: Personal Assistants, AI-Powered Dialog Systems, Crowd-Powered Dialog Systems, Chorus Deployment [ HCOMP’16, HCOMP’17 ], Evorus [ CHI’18, UIST Poster’17 ]]
Empower Chorus with Multiple Chatbots
How to build a set of
chatbots quickly?
Use Web APIs to Empower Chorus
19,758+ APIs
How to convert a Web API into a chatbot?
Guardian: A Crowd-Powered Spoken Dialog
System for Web APIs
[ HCOMP WIP’14, HCOMP’15, CI’17 ]
Guardian: A Crowd-Powered Dialog System for Web APIs
User: “Hi, I’m in San Diego. Any Chinese restaurants here?”
1. Language Understanding: term = Chinese, location = San Diego
2. Dialog Management
3. Response Generation: “Mandarin Wok Restaurant is good! It’s on 4227 Balboa Ave.”
Yelp Search API 2.0 returns JSON: { ... "name": "Mandarin Wok Restaurant", ... "address": ["4227 Balboa Ave", ...], ... }
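The flow on this slide (extracted parameters go to a web API, and the JSON result is turned into a reply) can be sketched as below. No real API is called: `search_api` is a hypothetical stub that returns only the two fields visible on the slide, and in Guardian the understanding and response steps are performed by crowd workers, not by fixed code like this.

```python
import json


def search_api(term, location):
    """Hypothetical stub standing in for a web API call.

    Returns JSON text containing only the fields shown on the slide.
    """
    return json.dumps(
        {"name": "Mandarin Wok Restaurant", "address": ["4227 Balboa Ave"]}
    )


def generate_response(api_json):
    """Turns the API's JSON result into a natural-language reply."""
    result = json.loads(api_json)
    return f"{result['name']} is good! It's on {result['address'][0]}."


def handle_turn(parameters):
    """Parameters (from language understanding) -> API call -> response."""
    api_json = search_api(**parameters)
    return generate_response(api_json)


reply = handle_turn({"term": "Chinese", "location": "San Diego"})
```

Keeping the API call behind a single function makes it easy to swap in a different web API, which is the point of wrapping APIs as chatbots.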
Parameter Extraction
User: “Hi, I’m in San Diego. Any Chinese restaurants here?”
Yelp Search API parameters: offset, term, location, sw_latitude, sw_longitude, category_filter, accuracy, deals_filter, radius_filter, ...
1. How to extract parameters?
2. Which parameters to use?
How to Extract Parameters?
Crowd-Powered Parameter Extraction
Real-time On-Demand Crowd-powered Entity Extraction. Huang, et al. Collective Intelligence 2017.
• Input: “Hi, I’m in San Diego.”
• Recruited players answer under a time constraint (10–20 sec)
• Answers are aggregated: Location = San Diego
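One simple way to aggregate answers collected under a time constraint is a majority vote over the answers that arrive in time. The sketch below assumes that strategy for illustration; the aggregation used in the cited paper may differ, and the function name, the 20-second default, and the sample answers are hypothetical.

```python
from collections import Counter


def aggregate_answers(answers, time_limit_sec=20.0):
    """Majority vote over answers submitted within the time limit.

    answers: list of (seconds_elapsed, extracted_value) pairs, one per worker.
    Returns the most common normalized value, or None if nothing arrived in time.
    """
    on_time = [
        value.strip().lower()
        for elapsed, value in answers
        if elapsed <= time_limit_sec and value.strip()
    ]
    if not on_time:
        return None
    value, _count = Counter(on_time).most_common(1)[0]
    return value


# Hypothetical worker answers for the "location" parameter.
answers = [
    (6.2, "San Diego"),
    (11.0, "san diego "),
    (14.5, "Diego"),
    (25.0, "Pittsburgh"),  # arrives after the limit and is ignored
]
location = aggregate_answers(answers)
```

Normalizing case and whitespace before counting lets trivially different spellings of the same answer agree with each other.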
Which Parameters to Use?
Parameter Rating Problem
Candidate parameters: offset, term, location, sw_latitude, sw_longitude, category_filter, accuracy, deals_filter, radius_filter, ...
Pick good parameters for the dialog system.
How about just doing a survey?
Task
Parameter Name / Desc
Match Questions with Parameters (Yelp API)
• Question Collection: question-answer pairs, e.g.
  • Q: “What do you want to eat?” A: “I like Chinese food.”
  • Q: “Which city are you in?” A: “I’m in Pittsburgh.”
  • Q: “Is it dinner or lunch?” A: “Dinner.”
• Parameter Filtering: offset, term, location, sw_latitude, sw_longitude, category_filter
• Question-Parameter Matching: collected question-answer pairs are matched to parameters (e.g. location, term, category_filter)
[Diagram: parameters ordered along a “Better Parameter” axis: term, location, sw_latitude, sw_longitude, category_filter]
Evaluation on Parameter Ranking
[Bar chart: MAP and MRR (y-axis 0 to 1) for Question Matching, Ask Siri, and Ask a Friend]
• Average results of 8 Web APIs’ parameters
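For reference, MAP (mean average precision) and MRR (mean reciprocal rank) over several ranked parameter lists can be computed as in the sketch below. The two example rankings and their "relevant" parameter sets are made up for illustration and are not results from the evaluation.

```python
def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is found."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical rankings for two APIs: (ranked parameters, relevant parameters)
runs = [
    (["term", "offset", "location"], {"term", "location"}),
    (["offset", "location", "term"], {"location"}),
]
map_score = mean(average_precision(r, rel) for r, rel in runs)
mrr_score = mean(reciprocal_rank(r, rel) for r, rel in runs)
```

MRR rewards only the position of the first useful parameter, while MAP also accounts for how early the remaining useful parameters appear.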
Evaluation: Task Completion Rate
Task                                          API Result     Final Response
Find Chinese restaurants in Pittsburgh.       9 out of 10    10 out of 10
Check current weather by using a zip code.    9 out of 10    9 out of 10
Find information of “Titanic”.                6 out of 10    10 out of 10
Crowd recovers errors.
[Diagram: axes “Open Conversation” and “Automated”; labeled: Personal Assistants, AI-Powered Dialog Systems, Crowd-Powered Dialog Systems, Chorus Deployment [ HCOMP’16, HCOMP’17 ], Evorus [ CHI’18, UIST Poster’17 ], Guardian [ HCOMP’15, CI’17 ]]
Thesis Statement
By allowing new chatbots to be easily integrated, reusing prior
crowd answers, and gradually reducing the crowd's role in
choosing high-quality responses,
a deployed crowd-powered dialog system can be automated
over time to support real-world open conversations.
• Chorus Deployment [ HCOMP’16, HCOMP’17 ]
• Evorus [ CHI’18, UIST Poster’17 ]
• Guardian [ HCOMP’15, CI’17 ]
Some More Projects…
• Ignition: HCOMP’17
• WearMail: Swaminathan et al., UIST’17
• InstructableCrowd: CHI LBW’16, TOCHI (Under Review)
• Visual Storytelling (VIST): NAACL’16; Ferraro et al., EMNLP’15
• EmotionLines: Chen et al., LREC’18
Crowd Research is Critical
For Building Future Computer Systems.
• Collect data to guide AI models
• Accomplish tasks that are not yet fully automated
• Pave the way for future AI systems
Future Work
• Deployed Chorus as an Open Research Platform
  • Chorus API
  • 1000+ chatbots
• Chorus on Smart Devices
  • Echo, Google Home…
• Future Crowd-AI Systems!
  • Object Recognition
  • Speech Recognition
  • Programming Tools
  • … And More!
Acknowledgment
• Family, Yan-Zhu (Lavender) Chen
• Jeffrey P. Bigham
• Walter S. Lasecki, Chris Callison-Burch, Alex Rudnicky, Margaret
Mitchell, Lun-Wei Ku, Hsin-Hsi Chen, Saiph Savage, Jane Hsu…
• Shoou-I Yu, Joseph Chee Chang, Chih-Yi (Jessica) Lin, Shihyun Lo,
Chu-Cheng Lin, Yun-Nung (Vivian) Chen, Lingpeng Kong, Luan Yi,
William Wang, Zi Yang, Yen-Chia Hsu, Kuen-Bang Hou (Favonia),
Kerry Shih-Ping Chang, Janet Huang, Yi-Chia Wang, Kai-min Kevin
Chang…
• Anhong Guo, Sai Ganesh, Kotaro Hara, Yashesh Gaur, Gierad Laput,
Robert Xiao, Yang Zhang, Patrick Carrington, Luz Rello, Cole Gleason,
Kristin Williams, Alex Chen, Susumu Saito…
• Amos Azaria, Oscar Romero Lopez…
• Stacey Young
@windx0303
KennethHuang.cc
Ting-Hao (Kenneth) Huang
Carnegie Mellon University
tinghaoh@cs.cmu.edu
Thank you!
Backup Slides
Automatic Voting
Find the Best Confidence Threshold
Expected Reward Points Saved

 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

A Crowd-Powered Conversational Assistant That Automates Itself Over Time

Editor's Notes

  1. Use this for setup
  2. Use this for setup
  3. Move to front
  4. We introduce the new approach to open conversation
  5. We introduce the new approach to open conversation
  6. We introduce the new approach to open conversation
  7. We introduce the new approach to open conversation
  8. We introduce the new approach to open conversation
  9. Mention some challenges of crowdsourcing systems: keeping context; malicious or lazy workers.
  10. Dino-shaped clear container; living tiny organisms; glow blue in the dark
  11. Dino-shaped clear container; living tiny organisms; glow blue in the dark
  12. Dino-shaped clear container; living tiny organisms; glow blue in the dark
  13. “Feasible” is weird. Maybe something else?
  14. Telling a story
  15. The key point of this part is that each chatbot doesn’t need to be perfect
  16. If you think this is too abstract, we have a more concrete visualization:
  17. Let’s first take a look at the overview of the automation. The way we are going to automate Chorus is to have Chorus incorporate a large set of external dialog systems and gradually learn when to call them to obtain responses. For instance, (Yelp example)
  18. Let’s first take a look at the overview of the automation. The way we are going to automate Chorus is to have Chorus incorporate a large set of external dialog systems and gradually learn when to call them to obtain responses. For instance, (Yelp example)
  19. Working system from day 1. The comparison is shown in Figure 4(B). Moreover, an accepted non-user message sent by Evorus cost $0.142 on average in the Phase-1 deployment, while it cost $0.211 during the Control Phase. That is, with automated chatbots and the vote bot, the cost of each message is reduced by 32.76%.
  20. Let’s first take a look at the overview of the automation. The way we are going to automate Chorus is to have Chorus incorporate a large set of external dialog systems and gradually learn when to call them to obtain responses. For instance, (Yelp example)
  21. So the first question is: How to build a big set of external dialog systems quickly?
  22. We think of Web APIs. This page shows ProgrammableWeb, a website that collects Web APIs. Nowadays, it lists 16 thousand Web APIs. There are a lot of them, they are well-defined, and many of them are even free.
  23. We think of Web APIs. This page shows ProgrammableWeb, a website that collects Web APIs. Nowadays, it lists 16 thousand Web APIs. There are a lot of them, they are well-defined, and many of them are even free.
  24. Guardian’s framework contains three main steps. First, the workers have a conversation with the user and extract the parameter values with a Dialog ESP Game. Second, behind the scenes, the system uses these values to call the Yelp API and run the query. Finally, the Yelp API returns the result as a JSON file, and we also use the crowd to interpret the response. We visualize the JSON file in a user-friendly interface, where the workers can click through the data and explore the information inside the JSON. By using Guardian, we can have a running dialog system without using any training data or even prior knowledge of the task.
  25. How to choose parameters? We think of this problem as a Parameter Rating Problem. Imagine you have a list of all parameters of the Yelp API. The task is to rate how good each parameter is for dialog systems. The output is a rating score attached to each parameter, which gives you a ranked list of all parameters.
  26. How to choose parameters? We think of this problem as a Parameter Rating Problem. Imagine you have a list of all parameters of the Yelp API. The task is to rate how good each parameter is for dialog systems. The output is a rating score attached to each parameter, which gives you a ranked list of all parameters.
  27. How to choose parameters? We think of this problem as a Parameter Rating Problem. Imagine you have a list of all parameters of the Yelp API. The task is to rate how good each parameter is for dialog systems. The output is a rating score attached to each parameter, which gives you a ranked list of all parameters.
  28. We propose a multi-player Dialog ESP Game to extract parameter values from a running conversation. The ESP Game was originally proposed for image labeling; now we adapt the idea to dialog. In the interface, we show the dialog and the description of the parameter, and ask the workers to type what the other workers might type. If two answers match each other, we take the match as the extracted parameter value. This method works well. Now we can extract parameters without having any training data. Therefore, based on all the work we’ve done, we propose a system called “Guardian”. There are two ways to aggregate the answers.
  29. How to choose parameters? We think of this problem as a Parameter Rating Problem. Imagine you have a list of all parameters of the Yelp API. The task is to rate how good each parameter is for dialog systems. The output is a rating score attached to each parameter, which gives you a ranked list of all parameters.
  30. How to choose parameters? We think of this problem as a Parameter Rating Problem. Imagine you have a list of all parameters of the Yelp API. The task is to rate how good each parameter is for dialog systems. The output is a rating score attached to each parameter, which gives you a ranked list of all parameters.
  31. As a crowdsourcing person, I get asked: why don’t you just tell the crowd what you want and run a survey on each parameter? So we did. This is our interface. The survey was conducted on CrowdFlower. For each parameter, we show the parameter name, the parameter’s description, and the task of the API. Then we ask the worker to imagine a scenario and rate how likely they would be, as a user, to provide the information for this parameter. To be more careful, we ran the experiment on three different scenarios. First, ask Siri: imagine you’re talking to Siri, how likely are you to provide this information? Second, ask a friend: imagine you cannot use the Internet right now and call a friend for help, how likely are you to provide this information? Third, we also ask the workers to rate how weird the parameter is, and use “Not Weird” as the rating. How does this work?
  32. Like this! The idea we propose here is to collect questions related to this task, and then ask the workers to use questions to vote for parameters. Take the Yelp API for example: we first collect all possible questions from the crowd, like “What do you want to eat?”, “Where are you?”, “What’s your budget?” and so on. Then we ask workers to associate questions with parameters. So essentially, the workers are using questions to vote for parameters. We assume the parameters that are associated with more questions are better for dialog systems. How does this work?
  33. Like this! The idea we propose here is to collect questions related to this task, and then ask the workers to use questions to vote for parameters. Take the Yelp API for example: we first collect all possible questions from the crowd, like “What do you want to eat?”, “Where are you?”, “What’s your budget?” and so on. Then we ask workers to associate questions with parameters. So essentially, the workers are using questions to vote for parameters. We assume the parameters that are associated with more questions are better for dialog systems. How does this work?
  34. ?/! -> Q/A Like this! The idea we propose here is to collect questions related to this task, and then ask the workers to use questions to vote for parameters. Take the Yelp API for example: we first collect all possible questions from the crowd, like “What do you want to eat?”, “Where are you?”, “What’s your budget?” and so on. Then we ask workers to associate questions with parameters. So essentially, the workers are using questions to vote for parameters. We assume the parameters that are associated with more questions are better for dialog systems. How does this work?
  35. What does it mean to be better?! Retrieve parameters better than a friend. Other than the question-matching approach. It turned out that our workflow outperforms all three baselines. When you take a look at the results, you will see that the quality is much better and close to practical use.
  36. Guardian’s framework contains three main steps. First, the workers have a conversation with the user and extract the parameter values with a Dialog ESP Game. Second, behind the scenes, the system uses these values to call the Yelp API and run the query. Finally, the Yelp API returns the result as a JSON file, and we also use the crowd to interpret the response. We visualize the JSON file in a user-friendly interface, where the workers can click through the data and explore the information inside the JSON. By using Guardian, we can have a running dialog system without using any training data or even prior knowledge of the task.
  37. We implemented the system on three different Web APIs: the Yelp API for restaurant search, the Weather Underground API for weather queries, and the RottenTomatoes API for movie queries. We designed three small tasks for each API and ran 10 trials on each system. Here we only talk about the task completion rate. By task completion, we mean that the system provides a valid response that contains the information the user requires. You can see the task completion rate is almost perfect. This is because, first, the tasks here are relatively simple, and second, even when the results returned from the API are incorrect, most of the time crowd workers are able to figure it out and recover the correct answers. We also compare our results with the task completion rates reported in the literature. The numbers are not directly comparable, but you can still see that our system reaches the same level of task completion rate as automated systems.
  38. We introduce the new approach to open conversation
  39. We introduce the new approach to open conversation
  40. (1) Leverage crowd wisdom to empower users to solve tasks that cannot be solved by existing technology. (2) Evorus demonstrates the potential of using crowdsourced data as scaffolding for training future AI systems. (3) Pave the way for future AI systems to solve these problems.
  41. (1) Leverage crowd wisdom to empower users to solve tasks that cannot be solved by existing technology. (2) Evorus demonstrates the potential of using crowdsourced data as scaffolding for training future AI systems. (3) Pave the way for future AI systems to solve these problems.
  42. How to automate….? Learning + voting
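
The two crowd aggregation ideas described in the notes above (matching answers in the Dialog ESP Game, and using questions to vote for parameters) can be illustrated with a minimal sketch. This is not the Guardian implementation: the function names, the text normalization, the agreement threshold of two workers, and the parameter names "term" and "location" are all assumptions made for illustration only.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match.
    (The normalization rule is an assumption, not taken from the talk.)"""
    return " ".join(answer.lower().split())


def esp_match(answers, min_agree=2):
    """Dialog ESP Game style aggregation (hypothetical helper).

    Returns the most frequent normalized answer if at least `min_agree`
    workers typed it, otherwise None (no agreement yet).
    """
    counts = Counter(normalize(a) for a in answers if a.strip())
    if not counts:
        return None
    value, n = counts.most_common(1)[0]
    return value if n >= min_agree else None


def rank_parameters(associations):
    """Question-voting style parameter rating (hypothetical helper).

    `associations` is an iterable of (question, parameter) pairs chosen by
    workers. Each distinct question counts as one vote for a parameter;
    parameters are returned with their vote counts, most votes first.
    """
    distinct_pairs = set(associations)
    votes = Counter(param for _question, param in distinct_pairs)
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Two of three workers agree once case and spacing are normalized.
    print(esp_match(["Pittsburgh", " pittsburgh ", "PGH"]))
    # Illustrative question-to-parameter associations for a restaurant search.
    print(rank_parameters([
        ("What do you want to eat?", "term"),
        ("Where are you?", "location"),
        ("Which city?", "location"),
    ]))
```

In this sketch a parameter value is accepted only once two workers independently type the same thing, and a parameter ranks higher when more distinct questions are associated with it, which mirrors the assumption stated in the notes that parameters linked to more questions are better suited to dialog systems.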