When Crowd Meets Persona: Creating a Large-Scale Open-Domain Persona Dialogue Corpus
Nov. 2022. @HCOMP (WiP)
Won Ik Cho¹*, Yoon Kyung Lee¹*, Seoyeon Bae¹, Jihwan Kim¹,
Sangah Park², Moosung Kim³, Sowon Hahn¹ and Nam Soo Kim¹
Seoul National University¹, DeepNatural AI², Smilegate AI³
Motivation
• Creating dialogue dataset
  ▪ Multiple participants
  ▪ High degree of freedom
• Difficulties of crowdsourcing
  ▪ Researchers, moderators, and crowdworkers
  ▪ Considerate scheduling and conflict resolution required
• Persona dialogue
  ▪ Challenging and time-consuming project
  ▪ What should the task managers keep in mind?
Our study
• Setting
  ▪ Persona participants (actors) talk with user participants (workers)
  ▪ Actors are hired, while workers are crowdsourced
  ▪ The user initiates the conversation, but the persona actor leads it
• Collection
  ▪ Recruiting workers from the crowdsourcing platform
  ▪ Chat interface developed by the platform
Our study
• Project flow
Discussion
• Overview
  ▪ RQ1: What should be considered in accommodating the construction of a successful dialogue dataset?
    • The organizer should acknowledge that the task differs greatly from ordinary conversation, and that handling unexpected and unwanted situations is crucial
  ▪ RQ2: What is the role of the moderator in large-scale dialogue dataset construction?
    • Resolve conflicts after building rapport with participants
    • Be aware of the points where participants feel uncomfortable, empathizing with and understanding their struggles
    • Handle recruitment and financial support, which affect the atmosphere
  ▪ RQ3: Will such considerations help reach the intended goal of construction?
    • Shown indirectly using survey results, textual analysis, and generative model-based experiments (to be further investigated)
Conclusion
• Dataset
  ▪ https://github.com/smilegate-ai/OPELA
• Acknowledgement
  ▪ Smilegate AI (funding and discussions)
  ▪ DeepNatural AI (crowdsourcing and moderation)
  ▪ Kudos to all our crowdworkers ☺
• Full paper and analyses
  ▪ To be disclosed
Thank you

Editor's notes

  1. Hi, we are a joint team from Seoul National University, DeepNatural AI, and Smilegate AI, in South Korea. Today we are going to present our work-in-progress project on persona dialogue creation with hired persona actors and crowdsourced users.
  2. Our work first considers an innate difficulty of building a dialogue corpus: two or more participants are necessarily involved in the construction process, and the process has such a high degree of freedom that quality control of the output may not be feasible. Also, much corpus creation work these days is done in cooperation with crowdsourcing companies and their moderators, who recruit the workers and manage their overall workload and compensation. That is, the roles of researchers, moderators, and crowdworkers all differ slightly with respect to goal and scale, which requires considerate scheduling and conflict resolution. In this light, we arrived at the question of how persona dialogue corpus creation should be managed in practice.
  3. In our study, we let persona participants, namely the actors, talk with user participants, the workers. Actors are hired, while workers are crowdsourced. In every dialogue, the user initiates the conversation, but the persona actor takes the lead while they talk. The collection proceeds by recruiting workers from the crowdsourcing platform's community, using the chat interface developed by the platform to check and manage the progress of each conversation. Freedom of conversation was guaranteed as much as possible, but users who made actors uneasy or gave them an eerie feeling were reported and set aside from the project. After the collection was finished, we analyzed the surveys and interviews conducted with the participants and the moderator, and further analyzed the constructed data.
  4. We demonstrate the overall project flow. First, guidelines for the conversation are created by the researchers, and the platform and moderator recruit actors and workers based on the guidelines. Here, each actor plays the persona they initially decided on, and the user initiates the conversation with the persona based on the profile they are shown, but only if they pass the test prepared for user participants. Once the conversation starts, it lasts over 15 turns, and it is terminated by the actor or the worker if they feel fatigued or bored. They complete a survey after each conversation, and the reward is given afterward according to the amount of dialogue.
  5. After the whole collection phase, we briefly answered our research questions. First, in accommodating the construction of a successful persona dialogue dataset, the organizer should acknowledge that the task differs greatly from ordinary conversation and that it is crucial to handle unexpected and unwanted situations, which can be moderated by an expert moderator. Looking more closely, the moderator should resolve conflicts after building rapport with participants so that they can report whatever makes them uncomfortable, while empathizing with and understanding their struggles. Recruiting participants and managing finances is also a crucial role, in that these conditions can dampen or boost the atmosphere of the project. We also found that the whole process led to a high-quality persona dialogue dataset, which we recently disclosed online, but our work needs further investigation with more thorough experimental criteria and will be presented as a more mature work afterwards.
  6. Our work is currently available on the GitHub of our funding agency, Smilegate AI. We also thank DeepNatural AI for building the chat interface, recruiting participants from the worker pool, and moderating the whole process. Finally, we thank all our crowdworkers, including actors and users, who created all the dialogues and took part in the surveys and interviews. Since our work is in progress, we will soon disclose the full analysis results with our full paper.
  7. Thank you for listening ☺
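
The project flow described in note 4 (screening test, user-initiated conversation, a minimum length, a survey, then a reward based on the amount of dialogue) could be sketched roughly as follows. This is only an illustrative sketch, not the platform's actual implementation: the `Session` class, its field names, and the `rate_per_turn` parameter are hypothetical, one utterance is counted as one turn, and "over 15 turns" is read as a minimum of 15 turns.

```python
from dataclasses import dataclass, field

MIN_TURNS = 15  # assumption: "over 15 turns" read as at least 15 turns


@dataclass
class Session:
    """One actor-worker conversation (hypothetical structure)."""
    worker_passed_test: bool
    turns: list = field(default_factory=list)  # (speaker, text) pairs
    survey_done: bool = False

    def add_turn(self, speaker, text):
        # Only workers who passed the screening test may take part,
        # and the user (worker) must open the conversation.
        if not self.worker_passed_test:
            raise PermissionError("worker has not passed the screening test")
        if not self.turns and speaker != "worker":
            raise ValueError("the user (worker) initiates the conversation")
        self.turns.append((speaker, text))

    def can_terminate(self):
        # Either side may end the conversation once the minimum length is reached.
        return len(self.turns) >= MIN_TURNS

    def reward(self, rate_per_turn):
        # The reward is paid after the survey, in proportion to the amount of dialogue.
        if not (self.can_terminate() and self.survey_done):
            return 0.0
        return rate_per_turn * len(self.turns)


# Usage: a 15-turn conversation opened by the worker, followed by the survey.
session = Session(worker_passed_test=True)
for i in range(MIN_TURNS):
    speaker = "worker" if i % 2 == 0 else "actor"
    session.add_turn(speaker, f"utterance {i}")
session.survey_done = True
print(session.reward(rate_per_turn=1.0))  # hypothetical rate
```

Keeping the screening check, the minimum length, and the survey as separate conditions mirrors the separate steps in the notes; how the actual reward was computed is not stated in the slides.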