CrowdED: Guideline for
Optimal Crowdsourcing
Amrapali Zaveri, Pedro Hernandez Serrano, Manisha
Desai, Michel Dumontier
HumL@WWW2018 @AmrapaliZ 24 April, 2018
Crowdsourcing Tasks
❖ Tasks based on human skills not yet replicable by machines
❖ Highly parallelizable tasks
❖ Every human (worker) must be provided with a monetary reward for an answer
❖ Consolidated answers solve scientific problems
Crowdsourcing Design
❖ Gold standard questions
❖ Master Workers
❖ Majority voting
❖ Overall accuracy
Crowdsourcing Use Case
Biomedical Metadata Quality Assessment*

*MetaCrowd: Crowdsourcing Biomedical Metadata Quality Assessment. Amrapali Zaveri and Michel Dumontier. Bio-Ontologies 2017.
BUT
How CrowdED is too crowded?
Research Question
Can we estimate, a priori, the optimal assignment of workers to tasks that yields maximum accuracy on all tasks?
CrowdED
A two-staged statistical Crowdsourcing Experimental Design
Related Studies
• Adaptive Model
• Active Learning
• KB Test Questions
• Self Assessment
• Cost-Time & Cost-Quality Optimization

CrowdED
CrowdED offers a two-staged statistical model to estimate, a priori, the worker and task assignment that achieves maximum accuracy.
2 Stages

Stage 1:
• Train all workers on a proportion of the tasks
• Identify the best workers and the hard tasks

Stage 2:
• Assign the best workers to the hard tasks and the remaining tasks
• Calculate Overall Accuracy
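The split between the two stages can be sketched as follows. This is a minimal illustration, not the CrowdED implementation: the function name `split_tasks`, the `seed` argument, and the use of a random shuffle to pick the training tasks are all assumptions.

```python
import random

def split_tasks(task_ids, train_proportion, seed=0):
    """Split task ids into a Stage 1 training subset and the Stage 2 remainder.

    Hypothetical helper: choosing training tasks by random shuffle is an assumption.
    """
    rng = random.Random(seed)              # seeded so the split is reproducible
    shuffled = list(task_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_proportion)
    return shuffled[:n_train], shuffled[n_train:]

# 100 tasks, 40% of them used to train (Stage 1)
stage1_tasks, stage2_tasks = split_tasks(range(100), train_proportion=0.4)
print(len(stage1_tasks), len(stage2_tasks))
```

Every task ends up in exactly one of the two stages, so the answers can later be merged without overlap.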
Stage 1
• Workers: Good / Poor
• Tasks: Easy / Hard
Assign Tasks to Workers
Stage 1
• Workers: Good / Poor
• Tasks: Easy / Hard
• Simulate: an odd no. of workers per task, on the proportion of tasks to train

Workerview
Worker  Task  Task Label  Truth
1       1     hard_task   age
1       2     hard_task   age
1       3     hard_task   age
1       4     easy_task   age
1       5     easy_task   age

Taskview
Worker  Task  Worker Label  Truth
1       1     good_worker   age
2       1     poor_worker   age
3       1     good_worker   age
4       1     good_worker   age
5       1     poor_worker   age
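A minimal sketch of this simulation step, under stated assumptions: the helper names `label_population` and `assign_workers` are hypothetical, labels are assigned by a fixed share, and each task receives a random panel of distinct workers. The actual CrowdED code may do this differently.

```python
import random

def label_population(n, positive_share, positive, negative, rng):
    """Return n labels, a fixed share of which are `positive`, in random order."""
    n_pos = round(n * positive_share)
    labels = [positive] * n_pos + [negative] * (n - n_pos)
    rng.shuffle(labels)
    return labels

def assign_workers(task_ids, worker_ids, workers_per_task, rng):
    """Give every task a random panel of distinct workers (odd size, as on the slide)."""
    if workers_per_task % 2 == 0:
        raise ValueError("workers_per_task is expected to be odd")
    return {t: rng.sample(worker_ids, workers_per_task) for t in task_ids}

rng = random.Random(42)
worker_labels = label_population(20, 0.7, "good_worker", "poor_worker", rng)
task_labels = label_population(60, 0.3, "hard_task", "easy_task", rng)
assignment = assign_workers(list(range(60)), list(range(20)), 5, rng)
print(assignment[0])  # the five workers assigned to task 0
```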
Calculate Worker Accuracy & Task Difficulty

Workerview
Worker  Task  Task Label  Truth  Task Difficulty
1       1     hard_task   age    0.54
1       2     hard_task   age    0.42
1       3     hard_task   age    0.45
1       4     easy_task   age    0.80
1       5     easy_task   age    0.70

Taskview
Worker  Task  Worker Label  Truth  Worker Accuracy
1       1     good_worker   age    0.75
2       1     poor_worker   age    0.58
3       1     good_worker   age    0.78
4       1     good_worker   age    0.95
5       1     poor_worker   age    0.54
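One way to attach a numeric score to each label is to draw it from a label-specific range. The sketch below is only an assumption about how such values could be generated: the ranges, the uniform distribution, and the name `draw_scores` are hypothetical and merely chosen to resemble the values in the tables above.

```python
import random

# Assumed sampling ranges (not taken from CrowdED), loosely matching the tables.
ACCURACY_RANGE = {"good_worker": (0.70, 1.00), "poor_worker": (0.50, 0.70)}
DIFFICULTY_RANGE = {"easy_task": (0.70, 1.00), "hard_task": (0.40, 0.70)}

def draw_scores(labels, ranges, rng):
    """Draw one score per label, uniformly within that label's range."""
    return [round(rng.uniform(*ranges[label]), 2) for label in labels]

rng = random.Random(1)
worker_accuracy = draw_scores(["good_worker", "poor_worker", "good_worker"], ACCURACY_RANGE, rng)
task_difficulty = draw_scores(["hard_task", "easy_task"], DIFFICULTY_RANGE, rng)
print(worker_accuracy, task_difficulty)
```

Note that, as in the tables, a higher "Task Difficulty" value belongs to an easy task, so the score behaves like the chance that the task is answered correctly.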
Simulate Worker Answer

Workerview
Worker  Task  Task Label  Truth  Task Difficulty  Worker Answer
1       1     hard_task   age    0.54             age
1       2     hard_task   age    0.42             tissue
1       3     hard_task   age    0.45             disease
1       4     easy_task   age    0.80             age
1       5     easy_task   age    0.70             age

Taskview
Worker  Task  Worker Label  Truth  Worker Accuracy  Worker Answer
1       1     good_worker   age    0.75             age
2       1     poor_worker   age    0.58             age
3       1     good_worker   age    0.78             age
4       1     good_worker   age    0.95             tissue
5       1     poor_worker   age    0.54             age
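A simulated answer can be drawn as a biased coin flip between the true answer and a wrong one. The slides do not say how worker accuracy and task difficulty are combined, so the product used below is an assumption, as are the name `simulate_answer` and the contents of `ANSWER_KEY`.

```python
import random

ANSWER_KEY = ["age", "tissue", "disease", "treatment"]  # hypothetical answer options

def simulate_answer(truth, worker_accuracy, task_difficulty, answer_key, rng):
    """Return the truth with probability accuracy * difficulty, else a wrong option."""
    p_correct = worker_accuracy * task_difficulty      # assumed combination rule
    if rng.random() < p_correct:
        return truth
    wrong_options = [a for a in answer_key if a != truth]
    return rng.choice(wrong_options)

rng = random.Random(7)
answer = simulate_answer("age", 0.95, 0.80, ANSWER_KEY, rng)
print(answer)
```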
Calculate Worker Performance
• Avg. proportion of times a worker is in agreement with the other workers on a given task, relative to all tasks performed by the worker
• Range: [0…1]
• A threshold on this value identifies Good vs. Poor workers and Easy vs. Hard tasks
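The agreement-based performance score can be sketched as below, using the five answers from the Taskview of task 1. The function name `worker_performance`, the tuple layout, and the 0.5 threshold are assumptions made for illustration.

```python
from collections import defaultdict

def worker_performance(answers):
    """answers: list of (worker, task, answer). Returns {worker: score in [0, 1]}.

    For each task a worker answered, take the share of the other workers on that
    task who gave the same answer, then average over the worker's tasks.
    """
    by_task = defaultdict(list)
    for worker, task, answer in answers:
        by_task[task].append((worker, answer))

    agreement = defaultdict(list)
    for panel in by_task.values():
        for worker, answer in panel:
            others = [a for w, a in panel if w != worker]
            if others:
                agreement[worker].append(sum(a == answer for a in others) / len(others))
    return {w: sum(v) / len(v) for w, v in agreement.items()}

answers = [(1, 1, "age"), (2, 1, "age"), (3, 1, "age"), (4, 1, "tissue"), (5, 1, "age")]
scores = worker_performance(answers)
best = [w for w, s in scores.items() if s >= 0.5]   # 0.5 is an assumed threshold
print(scores[1], scores[4], best)  # 0.75 0.0 [1, 2, 3, 5]
```

Worker 4 has the highest simulated accuracy in the table but disagrees with everyone on this task, which is why agreement alone is combined with other information later in the deck.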
Easy Tasks

Taskview
Worker  Task  Worker Label  Truth  Worker Accuracy  Worker Answer
1       1     good_worker   age    0.75             age
2       1     poor_worker   age    0.58             age
3       1     good_worker   age    0.78             age
4       1     good_worker   age    0.95             tissue
5       1     poor_worker   age    0.54             age

Hard Tasks

Taskview
Worker  Task  Worker Label  Truth  Worker Accuracy  Worker Answer
2       2     good_worker   age    0.75             treatment
3       2     poor_worker   age    0.58             disease
15      2     good_worker   age    0.78             age
17      2     poor_worker   age    0.95             tissue
20      2     poor_worker   age    0.54
2 Stages

Stage 1:
• Train all workers on a proportion of the tasks
• Identify the best workers and the hard tasks

Stage 2:
• Assign the best workers to the hard tasks and the remaining tasks
• Calculate Overall Accuracy
Stage 2
• Workers: Good / Poor
• Tasks: Easy / Hard
Simulate Worker Answer
Stage 2: simulate the answers of the Good workers on the Hard tasks and the Remaining tasks

Workerview
Worker  Task  Task Label  Truth  Task Difficulty  Worker Answer
1       1     hard_task   age    0.54             age
1       2     hard_task   age    0.42             tissue
1       3     hard_task   age    0.45             disease
1       4     easy_task   age    0.80             age
1       5     easy_task   age    0.70             age
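Restricting Stage 2 to the best workers could look like the following sketch. The name `assign_stage2`, the threshold-based definition of "best", and the random choice of a panel among the best workers are assumptions, not details given in the slides.

```python
import random

def assign_stage2(tasks, performance, threshold, workers_per_task, rng):
    """Assign each Stage 2 task a panel drawn only from workers above the threshold."""
    best = sorted(w for w, score in performance.items() if score >= threshold)
    if len(best) < workers_per_task:
        raise ValueError("not enough workers pass the threshold")
    return {task: rng.sample(best, workers_per_task) for task in tasks}

# Hypothetical Stage 1 performance scores and Stage 2 task ids
performance = {1: 0.75, 2: 0.75, 3: 0.75, 4: 0.0, 5: 0.75}
stage2_assignment = assign_stage2([2, 3, 6, 7], performance, 0.5, 3, random.Random(3))
print(stage2_assignment)
```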
Merge Stage 1 and 2 & Assign Answers

Taskview
Worker  Task  Worker Label  Truth  Worker Accuracy  Worker Answer
1       1     good_worker   age    0.75             age
2       1     poor_worker   age    0.58             age
3       1     good_worker   age    0.78             age
4       1     good_worker   age    0.95             tissue
5       1     poor_worker   age    0.54             age

Answer = age
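Assigning one answer per task from the merged worker answers can be sketched with a majority vote. Requiring a strict majority (more than half of the votes) and returning `None` otherwise is an assumption; the function name `consensus` is hypothetical.

```python
from collections import Counter

def consensus(task_answers):
    """Return the answer given by more than half of the workers, else None."""
    if not task_answers:
        return None
    top_answer, votes = Counter(task_answers).most_common(1)[0]
    return top_answer if votes > len(task_answers) / 2 else None

print(consensus(["age", "age", "age", "tissue", "age"]))      # age
print(consensus(["treatment", "disease", "age", "tissue"]))   # None
```

The first call uses the answers from the Taskview above; the second uses the four recorded answers of the hard task shown earlier, where no answer reaches a majority.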
Assessing Design
• Merged Dataset: calculate Overall Accuracy
• Overall Accuracy: avg. of all the tasks which had consensus

Taskview
Worker  Task  Worker Label  Truth  Worker Accuracy  Worker Answer
1       1     good_worker   age    0.75             age
2       1     poor_worker   age    0.58             age
3       1     good_worker   age    0.78             age
4       1     good_worker   age    0.95             tissue
5       1     poor_worker   age    0.54             age
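A possible reading of the overall accuracy measure is the share of tasks with a consensus whose consensus answer matches the truth. That reading is an assumption, since the slide only says "avg. of all the tasks which had consensus"; the name `overall_accuracy` and the dictionary inputs are hypothetical.

```python
def overall_accuracy(consensus_by_task, truth_by_task):
    """Share of tasks with a consensus whose consensus answer equals the truth."""
    decided = {t: a for t, a in consensus_by_task.items() if a is not None}
    if not decided:
        return 0.0                      # no task reached consensus
    correct = sum(answer == truth_by_task[task] for task, answer in decided.items())
    return correct / len(decided)

# Hypothetical merged result: task 2 had no consensus, task 3 agreed on a wrong answer
consensus_by_task = {1: "age", 2: None, 3: "tissue"}
truth_by_task = {1: "age", 2: "age", 3: "age"}
print(overall_accuracy(consensus_by_task, truth_by_task))  # 0.5
```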
Experimental Evaluation
• tasks = [60, 80, 100, 120, 140, 160, 180]
• workers = [20, 30, 40]
• answers key = ["liver", "blood", "lung", "brain", "heart"]
• good workers = [0.1, 0.3, 0.5, 0.7, 0.9]
• hard tasks = [0.1, 0.3, 0.5, 0.7, 0.9]
• proportion of training tasks = [0.2, 0.3, 0.4, 0.5, 0.6]
• workers per task = [3, 5, 7, 9, 11]

13,125 combinations
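The number of combinations follows from the Cartesian product of the six varied parameters (7 × 3 × 5 × 5 × 5 × 5); the answer key is a single fixed list and does not multiply the count. A short sketch that enumerates the grid, with dictionary key names chosen here for illustration:

```python
from itertools import product

grid = {
    "tasks": [60, 80, 100, 120, 140, 160, 180],
    "workers": [20, 30, 40],
    "good_workers": [0.1, 0.3, 0.5, 0.7, 0.9],
    "hard_tasks": [0.1, 0.3, 0.5, 0.7, 0.9],
    "train_proportion": [0.2, 0.3, 0.4, 0.5, 0.6],
    "workers_per_task": [3, 5, 7, 9, 11],
}

# One dict per experimental design, e.g. {"tasks": 60, "workers": 20, ...}
designs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(designs))  # 13125
```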
• Results support the intuition that a lower share of hard tasks (10%) results in higher accuracy
• Combining a worker's calculated performance with whether she was a good worker from the beginning ensures that she is among the best workers
• Adopting the two-staged algorithm ensures that only the best workers are chosen to perform all the tasks
Results
CrowdED recommendation
• No. of workers should be 40-60% of the total number of tasks
• Train workers on 40-60% of the tasks in Stage 1
• Set the number of workers per task to be either 3, 5 or 7 (fewer than 9)
• Reduce the number of hard tasks
• Adopt the two-staged algorithm to identify the best workers
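The numeric recommendations can be turned into a quick pre-flight check for a planned experiment. This is an illustrative helper only; the name `check_design` and its parameters are hypothetical and not part of CrowdED.

```python
def check_design(n_tasks, n_workers, train_proportion, workers_per_task):
    """Return a list of recommendations the planned design does not follow."""
    warnings = []
    if not 0.4 * n_tasks <= n_workers <= 0.6 * n_tasks:
        warnings.append("no. of workers should be 40-60% of the number of tasks")
    if not 0.4 <= train_proportion <= 0.6:
        warnings.append("train workers on 40-60% of the tasks in Stage 1")
    if workers_per_task not in (3, 5, 7):
        warnings.append("use 3, 5 or 7 workers per task")
    return warnings

print(check_design(n_tasks=100, n_workers=50, train_proportion=0.5, workers_per_task=5))  # []
print(len(check_design(n_tasks=100, n_workers=20, train_proportion=0.2, workers_per_task=9)))  # 3
```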
https://pedrohserrano.shinyapps.io/crowdapp/
Conclusion & Future Work
• Two-staged statistical design for optimal crowdsourcing experiments
• A priori estimation of the optimal worker and task assignment to obtain maximum accuracy on all tasks
• Implemented in Python, open source, Jupyter notebook
• Future work
  • Training the workers vs. not training
  • Real-world experiments and comparison with baseline approaches
  • Include budgetary constraints
  • Extend the interface to allow users to vary parameters and observe how sensitive the design is to various assumptions
@AmrapaliZ
amrapali.zaveri@maastrichtuniversity.nl
Thank You!
Questions?
Try it yourself
https://github.com/MaastrichtU-IDS/crowdED
Feedback welcome!
