Scheduling Human Intelligence
Tasks in Multi-Tenant
Crowd-Powered Systems
Djellel Eddine Difallah, University of Fribourg, CH
Gianluca Demartini, University of Sheffield, UK
Philippe Cudré-Mauroux, University of Fribourg, CH
Introduction
• Crowdsourcing relies on a large pool of humans to perform
complex tasks (paid workers, volunteers, players, etc.)
• A crowdsourcing platform (e.g., CrowdFlower, Amazon
MTurk) allows requesters to tap into a pool of paid workers
as a shared resource
• Requesters publish batches of similar tasks to be
completed in exchange for a monetary reward
• Workers can arrive and leave at any point in time and can
selectively focus on an arbitrary subset of the tasks only
Introduction
Observations
• A few workers perform many tasks, followed by a
long tail of workers performing few tasks [Ipeirotis
2010; Franklin et al. 2011]
• Large jobs are fast at the beginning, then lose
their momentum toward the end [Difallah et al. 2014]
• We suspect that this leads to batches being treated
unequally (Batch Size, Freshness, Requester,
Price) [Difallah et al. 2015]
[Figure: (a) Batch distribution per size, Count (Normalized) over Time (Day), Jan 01 to Apr 01; (b) Cumulative throughput per batch size, Throughput (Normalized) over Time (Day), Jan 01 to Apr 01.]
Introduction
Data Analysis
• Most of the batches
present on AMT have
10 HITs or fewer
• The overall platform
throughput is
dominated by larger
batches
Batch size classes (figure legend): Tiny [0, 10], Small [10, 100], Medium [100, 1000], Large [1000, Inf]
Motivation
The case of Multi-Tenant Crowd-powered Systems (CPS)
• Definition: A CPS serves multiple customers/users (e.g., a
Crowd DBMS)
• The system posts a batch of tasks on the crowdsourcing
platform per user query
• The CPS is in constant competition to attract workers
• With itself — multiple tenants
• With other requesters
• Job starvation is problematic in business applications
Contributions
• We design a novel crowdsourcing system
architecture that allows job scheduling for a CPS
on top of a traditional crowdsourcing platform
• We devise a scheduling algorithm that embodies
a set of general design requirements
• We empirically evaluate our setup on Amazon
MTurk, with a real crowd and a set of scheduling
algorithms
HIT-Bundle
Definition
• Scheduling requires that
we have control over the
serving process of tasks
• A HIT-Bundle is a batch
that contains
heterogeneous tasks
• All tasks that are generated
by the CPS are published
through the HIT-Bundle
[Diagram: a HIT-Bundle containing Batch 1, Batch 2, Batch 3, and Batch 4.]
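To make the HIT-Bundle idea concrete, here is a minimal Python sketch of a single container that holds several heterogeneous batches and serves one task at a time. The class name `HITBundle`, its methods, and the `choose` callback are hypothetical names introduced for illustration; they are not taken from the authors' code.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class HITBundle:
    """One published batch wrapping several heterogeneous batches (illustrative)."""
    queues: dict = field(default_factory=dict)  # batch id -> deque of pending tasks

    def add_batch(self, batch_id, tasks):
        # Tasks from any tenant are merged into the same bundle.
        self.queues.setdefault(batch_id, deque()).extend(tasks)

    def pending(self):
        # Number of unserved tasks per batch, omitting empty batches.
        return {b: len(q) for b, q in self.queues.items() if q}

    def next_task(self, choose):
        """Serve one task; `choose` picks a batch id among those with pending tasks."""
        candidates = [b for b, q in self.queues.items() if q]
        if not candidates:
            return None
        batch_id = choose(candidates)
        return batch_id, self.queues[batch_id].popleft()
```

Because every task is served through `next_task`, the system, rather than the worker, decides which batch progresses next; the `choose` callback is where a scheduling policy would plug in.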
HIT-Bundle
Micro Experiment
• Comparison of
batch execution
time using different
grouping strategies
• Distinct batches
• Combined in a
HIT-Bundle
[Figure: #HITs Remaining (0 to 100) over Time (seconds, 0 to 4000) for batches B6 and B7, run as distinct batches (B6, B7) and combined in a HIT-Bundle (B6 - Bundle, B7 - Bundle).]
Proposed CPS
Architecture
[Architecture diagram. Labels recoverable from the figure: Multi-Tenant Crowd-Powered System; Crowdsourcing Decision Engine; HIT-Bundle Manager; Batch Catalog (Batch A $$, Batch B $$$, Batch C $, ..); Batch Merging; HIT-Bundle Creation/Update; Batch Input Merger; Resource Tracker; Progress Monitor API; HIT Scheduler; Scheduler; Queue (c1 a1 b3 ..); Crowdsourcing App; HIT Manager; HIT Results Aggregator; External HIT Page; HIT Collection and Reward; Status; META; System; config_file; Crowdsourced queries; Crowdsourcing Platform; Human Workers.]
Scheduling for the Crowd
Design Guidelines
• (R1) Runtime Scalability: Adopt a runtime scheduler that a)
dynamically adapts to the current availability of the crowd, and b)
scales to make real-time scheduling decisions as the work
demand grows
• (R2) Fairness: The scheduler must provide steady progress to
large requests without blocking or starving the smaller requests
• (R3) Priority: The scheduler must be sensitive to clients who
have higher priority (e.g., those who pay more)
• (R4) Human Aware: Unlike machines, people's performance is
affected by many factors, including context switching, training
effects, boredom, task difficulty, and interestingness
(Weighted) Fair Scheduler
• Fair Scheduling, FS (R1) (R2):
• Keep track of how many tasks per
batch are currently assigned
(running_tasks)
• Assign a task from the batch with the minimum running_tasks
• The Weighted Fair Sharing (WFS) variant
(R3):
• Compute a weight based on priority
(e.g., price)
• weight(Bj) = p(Bj)/sum(p(B))
• Assign a task from the batch with the minimum
running_tasks/weight
• Pros: ensures that all batches receive a
proportional share of the available workers
• Cons: does not satisfy (R4) Human
Awareness
[Diagram: a HIT-Bundle with 7 tasks running; step 1 is a get_task() call, for which FS and WFS each return a task (the returned tasks are shown graphically); three batches are annotated p = $0.10, w = 0.5; p = $0.05, w = 0.25; p = $0.05, w = 0.25.]
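The FS and WFS selection rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `weights` and `pick_batch`, the batch ids A, B, C, and the way the 7 running tasks are split across batches are all hypothetical. Prices are given in cents so that the weights come out exact.

```python
def weights(prices):
    """weight(Bj) = p(Bj) / sum(p(B)) for every batch Bj."""
    total = sum(prices.values())
    return {b: p / total for b, p in prices.items()}


def pick_batch(running_tasks, pending, weight=None):
    """Return the batch to serve next, or None if nothing is pending.

    FS  (weight is None): batch with the fewest running tasks.
    WFS (weight given):   batch with the smallest running_tasks / weight.
    Ties are broken by dictionary order (an arbitrary choice for this sketch).
    """
    candidates = [b for b in running_tasks if pending.get(b, 0) > 0]
    if not candidates:
        return None
    if weight is None:
        return min(candidates, key=lambda b: running_tasks[b])
    return min(candidates, key=lambda b: running_tasks[b] / weight[b])


# Prices of 10, 5 and 5 cents give weights 0.5, 0.25 and 0.25.
w = weights({"A": 10, "B": 5, "C": 5})
running = {"A": 3, "B": 2, "C": 2}   # assumed split of 7 running tasks
pending = {"A": 4, "B": 4, "C": 4}   # assumed number of unassigned tasks
print(pick_batch(running, pending))      # FS picks B (fewest running tasks)
print(pick_batch(running, pending, w))   # WFS picks A (3 / 0.5 = 6 < 2 / 0.25 = 8)
```

In this example the better-paid batch A is chosen by WFS even though it already has the most running tasks, which is the priority effect that (R3) asks for.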
Worker Context Switch
Micro Experiment
• We run a HIT-Bundle with
heterogeneous tasks
• Compute the average execution
time for each HIT
• RR: Round Robin, task type
changes every time
• SEQ10 / SEQ25: task type
changes every 10 and every 25
tasks, respectively
• The mean task execution time
is significantly lower for SEQ25
[Figure: execution time per HIT (seconds) by experiment type (RR, SEQ10, SEQ25); a significant difference is marked ** (p-value = 0.023).]
Worker Conscious Fair
Scheduling WCFS
• Goal: Reduce the context switching introduced by having
workers continuously switch task types
• We modify Fair Sharing with Delayed Scheduling [Zaharia
et al. 2010]
• A task will give up its priority K times until a worker
who just completed a similar task is available again
• Pros: satisfies all our design requirements; a worker
receives longer sequences of similar tasks
• Cons: K needs to be set
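One possible reading of the delayed-scheduling rule above is sketched below in Python. It is not the authors' implementation: the class name, the per-batch `skipped` counter, and the fallback that serves the head-of-line batch rather than leaving a worker idle are assumptions made for this illustration.

```python
class WorkerConsciousFairScheduler:
    """Fair sharing where a batch yields its turn up to k times (sketch)."""

    def __init__(self, pending, weight, k):
        self.pending = dict(pending)                # batch id -> unassigned tasks
        self.weight = dict(weight)                  # batch id -> share weight
        self.k = k                                  # max turns a batch gives up
        self.running = {b: 0 for b in pending}      # tasks currently assigned
        self.skipped = {b: 0 for b in pending}      # turns given up so far

    def get_task(self, last_batch=None):
        """Pick a batch for a worker whose previous task came from last_batch."""
        # Fair-share order: smallest running_tasks / weight first.
        order = sorted(
            (b for b, n in self.pending.items() if n > 0),
            key=lambda b: self.running[b] / self.weight[b],
        )
        if not order:
            return None
        chosen = order[0]  # fallback (assumption): never leave the worker idle
        for b in order:
            similar = last_batch is None or b == last_batch
            if similar or self.skipped[b] >= self.k:
                chosen = b
                break
            self.skipped[b] += 1  # b gives up its turn once more
        self.skipped[chosen] = 0
        self.pending[chosen] -= 1
        self.running[chosen] += 1
        return chosen

    def task_done(self, batch):
        self.running[batch] -= 1
```

With two equally weighted batches A and B, K = 2, and a worker who keeps finishing tasks from B, this sketch serves B, B, and then A: batch A yields twice before it insists on its fair share. Setting K = 0 reduces it to plain (weighted) fair sharing.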
Experiments
Controlled Setup
• On Amazon Mechanical Turk (no simulations)
• HIT-Bundle with 5 different task types
• We artificially ensure that we have num_workers
>10 before starting an experiment
• We compare against basic schedulers First In First
Out (FIFO), Round Robin (RR), Shortest Job First
(SJF)
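For readers unfamiliar with the three baselines, the following Python sketch shows the order in which each would serve tasks from a small bundle. The function names and the toy task ids are hypothetical; the shortest-first order is passed in explicitly rather than computed.

```python
from itertools import chain, zip_longest


def fifo(batches):
    """Serve batches in arrival order, one whole batch after the other."""
    return list(chain.from_iterable(batches.values()))


def sjf(batches, order):
    """Serve whole batches following a given shortest-first order."""
    return list(chain.from_iterable(batches[b] for b in order))


def round_robin(batches):
    """Alternate between batches, taking one task from each in turn."""
    skip = object()  # sentinel marking exhausted batches
    rounds = zip_longest(*batches.values(), fillvalue=skip)
    return [t for r in rounds for t in r if t is not skip]


batches = {
    "B1": ["b1-1", "b1-2", "b1-3"],
    "B2": ["b2-1"],
    "B3": ["b3-1", "b3-2"],
}
print(fifo(batches))
print(sjf(batches, order=["B2", "B3", "B1"]))
print(round_robin(batches))
```

FIFO and SJF both finish one batch before starting the next and differ only in the batch order, whereas Round Robin interleaves batches and therefore changes the task type a worker sees on every request.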
Controlled Experiments
Latency
All experiments are run in parallel
FIFO order [B1, B2, B3, B4, B5]
SJF order [B4, B3, B5, B2, B1] based on
previous evidence
• FIFO finishes jobs one after the other
• SJF finishes the shortest jobs
first
• FS and RR offer a balanced
workforce
[Figure: (a) Batch Latency, Time (Seconds) per Batch (B1 to B5) for FIFO, FS, RR, and SJF; (b) Overall Experiment Latency, Time (Seconds) per Scheduling Scheme (FIFO, FS, RR, SJF).]
[Figure: (a) Vary the Price, Time (seconds) per Batch (B1 to B5) with B2 priced at $0.02 versus $0.05; (b) Vary the Workforce, Time (seconds) per Batch (B1 to B5) with 10 workers versus 20 workers.]
Experiments
Varying the Control Factors
Weighted Fair Scheduler is used
• (a) Effect of increasing B2’s
priority (Price) on batch
execution time
• B2 executes faster
• (b) Effect of varying the number
of crowd workers involved in the
completion of the HIT batches
• The load is rebalanced (albeit
with different proportions), but
all batches speed up
Experiments in the Wild
Execution Trace
[Figure: execution trace showing #Active Workers (0 to 30) over Time (12:20 to 12:50) in three panels: FS, Individual Batches, and WCFS.]
Conclusions
• Batch starvation in crowdsourcing is problematic for requesters
• We introduce a new scheduling layer that shares a pool of crowd
workers among multiple tenants of a crowd-powered system
• We perform evaluations in a real setup with real workers
• We show that a HIT-Bundle increases the overall throughput
• Our technique (Worker Conscious Fair Sharing), inspired by
large-scale data processing frameworks, minimises context switching
• Toward Service Level Agreement-aware scheduling for
crowdsourcing platforms.
Code: https://github.com/XI-lab/HIT-Scheduler

Crowd scheduling www2016

  • 1. Scheduling Human Intelligence Tasks in Multi-Tenant Crowd-Powered Systems Djellel Eddine Difallah, University of Fribourg, CH Gianluca Demartini, University of Sheffield, UK Philippe Cudré-Mauroux, University of Fribourg, CH
  • 2. Introduction • Crowdsourcing relies on a large pool of humans to perform complex tasks (paid workers, volunteers, players etc) • A Crowdsourcing platform (e.g., CrowdFlower, Amazon MTurk) allows requesters to tap into a pool of paid workers in a shared resources fashion • Requesters would publish batches of similar tasks to be completed in exchange of a monetary reward • Workers can arrive and leave at any point in time and can selectively focus on an arbitrary subset of the tasks only 2
  • 3. Introduction: Observations • A few workers perform many tasks, followed by a long tail of workers performing fewer tasks [Ipeirotis 2010; Franklin et al. 2011] • Large jobs are fast at the beginning, then lose their momentum toward the end [Difallah et al. 2014] • We suspect that this leads to batches being treated unequally, depending on batch size, freshness, requester, and price [Difallah et al. 2015]
  • 4. Introduction: Data Analysis • Most of the batches present on AMT have 10 HITs or fewer • The overall platform throughput is dominated by larger batches • [Figure: (a) Batch distribution per size, normalized count over time; (b) Cumulative throughput per batch size, normalized throughput over time; x-axis: Time (Day), Jan 01 to Apr 01; size classes: Tiny [0,10], Small [10,100], Medium [100,1000], Large [1000,Inf]]
  • 5. Motivation: The case of Multi-Tenant Crowd-Powered Systems (CPS) • Definition: A CPS serves multiple customers/users (e.g., a Crowd DBMS) • The system posts a batch of tasks on the crowdsourcing platform per user query • The CPS is in constant competition to attract workers: with itself (multiple tenants) and with other requesters • Job starvation is problematic in business applications
  • 6. Contributions • We design a novel crowdsourcing system architecture that allows job scheduling for a CPS on top of a traditional crowdsourcing platform • We devise a scheduling algorithm that embodies a set of general design requirements • We empirically evaluate our setup on Amazon MTurk, with a real crowd and a set of scheduling algorithms
  • 7. HIT-Bundle: Definition • Scheduling requires that we have control over the serving process of tasks • A HIT-Bundle is a batch that contains heterogeneous tasks • All tasks that are generated by the CPS are published through the HIT-Bundle • [Figure: a HIT-Bundle containing Batch 1, Batch 2, Batch 3, and Batch 4]
  • 8. HIT-Bundle: Micro Experiment • Comparison of batch execution time using different grouping strategies: distinct batches vs. combined in a HIT-Bundle • [Figure: number of HITs remaining vs. time (seconds) for B6 and B7, run as distinct batches and as part of a bundle]
  • 9. Proposed CPS Architecture • [Architecture diagram. Multi-Tenant Crowd-Powered System: Crowdsourcing Decision Engine, HIT-Bundle Manager (Batch Catalog listing Batch A $$, Batch B $$$, Batch C $; HIT-Bundle Creation/Update; Batch Merging; Batch Input Merger; Resource Tracker; Status; META; config_file), HIT Scheduler (Queue, Scheduler, Progress Monitor), HIT Manager, HIT Results Aggregator, HIT Collection and Reward, External HIT Page, Crowdsourcing App. Crowdsourcing Platform: API, Human Workers. Input: crowdsourced queries from the system]
  • 10. Scheduling for the Crowd: Design Guidelines • (R1) Runtime Scalability: Adopt a runtime scheduler that a) dynamically adapts to the current availability of the crowd, and b) scales to make real-time scheduling decisions as the work demand grows • (R2) Fairness: The scheduler must provide steady progress to large requests without blocking or starving the smaller requests • (R3) Priority: The scheduler must be sensitive to clients who have higher priority (e.g., those who pay more) • (R4) Human Aware: Unlike machines, people's performance is affected by many factors, including context switching, training effects, boredom, task difficulty, and how interesting the task is
  • 11. (Weighted) Fair Scheduler • Fair Scheduling, FS (R1) (R2): • Keep track of how many tasks per batch are currently assigned (running_tasks) • Assign a task from the batch with the minimum running_tasks • The Weighted Fair Sharing (WFS) variant (R3): • Compute a weight based on priority (e.g., price): weight(Bj) = p(Bj)/sum(p(B)) • Assign a task from the batch with the minimum running_tasks/weight • Pros: ensures that every batch receives a proportional share of the available workers • Cons: does not satisfy (R4) Human Awareness • [Figure: get_task() on a HIT-Bundle with 7 tasks running and three batches with p=$0.10, w=0.5; p=$0.05, w=0.25; p=$0.05, w=0.25; the FS and WFS return values are shown graphically]
  • 12. Worker Context Switch: Micro Experiment • We run a HIT-Bundle with heterogeneous tasks • Compute the average execution time for each HIT • RR: Round Robin, the task type changes every time • SEQ10 / SEQ25: Task types alternate every 10 and every 25 tasks, respectively • The mean task execution time is significantly lower for SEQ25 • [Figure: execution time per HIT (seconds) for experiment types RR, SEQ10, and SEQ25; significance marker ** (p-value = 0.023)]
  • 13. Worker Conscious Fair Scheduling (WCFS) • Goal: Reduce the context switching introduced by having the worker continuously switch task types • We modify Fair Sharing with Delay Scheduling [Zaharia et al. 2010] • A task gives up its priority up to K times, waiting for a worker who has just completed a similar task to become available again • Pros: satisfies all our design requirements; a worker receives longer sequences of similar tasks • Cons: K needs to be set
  • 14. Experiments: Controlled Setup • On Amazon Mechanical Turk (no simulations) • HIT-Bundle with 5 different task types • We artificially ensure that num_workers > 10 before starting an experiment • We compare against basic schedulers: First In First Out (FIFO), Round Robin (RR), and Shortest Job First (SJF)
  • 15. Controlled Experiments: Latency • All experiments are run in parallel • FIFO order: [B1, B2, B3, B4, B5] • SJF order: [B4, B3, B5, B2, B1], based on previous evidence • FIFO finishes jobs one after the other • SJF finishes the shortest jobs first • FS and RR offer a balanced workforce • [Figure: (a) Batch latency, time (seconds) per batch B1 to B5 for FIFO, FS, RR, and SJF; (b) Overall experiment latency, time (seconds) per scheduling scheme]
  • 16. Experiments: Varying the Control Factors • The Weighted Fair Scheduler is used • (a) Effect of increasing B2's priority (price) on batch execution time: B2 executes faster • (b) Effect of varying the number of crowd workers involved in completing the HIT batches: the load is rebalanced (albeit in different proportions), and all batches speed up • [Figure: (a) Vary the price, time (seconds) per batch B1 to B5 with B2 at $0.02 vs. $0.05; (b) Vary the workforce, time (seconds) per batch B1 to B5 with 10 vs. 20 workers]
  • 17. Experiments in the Wild: Execution Trace • [Figure: number of active workers over time (12:20 to 12:50), in three panels: FS, Individual Batches, and WCFS]
  • 18. Conclusions • Batch starvation in crowdsourcing is problematic for requesters • We introduce a new scheduling layer that shares a pool of crowd workers among multiple tenants of a crowd-powered system • We perform evaluations in a real setup with real workers • We show that a HIT-Bundle increases the overall throughput • Our technique (Worker Conscious Fair Sharing), inspired by large-scale data processing frameworks, minimises context switching • Future work: toward Service Level Agreement aware scheduling for crowdsourcing platforms • Code: https://github.com/XI-lab/HIT-Scheduler
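
The scheduling rules on slides 11 and 13 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation (see the repository linked above for that): the `Batch` class, its fields, and the exact way a batch "gives up its priority" are hypothetical choices made here to keep the example small. With `k=0` the function behaves like FS or WFS; with `k>0` it adds a delay-scheduling step in the spirit of WCFS.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    # All field names are hypothetical; they are not taken from the slides.
    name: str
    task_type: str
    price: float      # reward per task, used as the priority signal
    pending: int      # tasks not yet handed to a worker
    running: int = 0  # tasks currently assigned (running_tasks on slide 11)
    skipped: int = 0  # times this batch gave up its turn (used when k > 0)


def weights(batches):
    """weight(Bj) = p(Bj) / sum(p(B)), the formula on the WFS slide."""
    total = sum(b.price for b in batches)
    return {b.name: b.price / total for b in batches}


def fair_order(batches, weighted=False):
    """Batches that still have work, most underserved first."""
    w = weights(batches) if weighted else {b.name: 1.0 for b in batches}
    live = [b for b in batches if b.pending > 0]
    return sorted(live, key=lambda b: b.running / w[b.name])


def get_task(batches, last_type=None, k=0, weighted=False):
    """Choose the batch that serves the next task to a requesting worker.

    k == 0: plain FS (or WFS when weighted=True).
    k > 0: a batch whose type differs from the worker's last task type
    gives up its turn, at most k times, in favor of a matching batch.
    """
    order = fair_order(batches, weighted)
    if not order:
        return None
    chosen = order[0]
    if k > 0 and any(b.task_type == last_type for b in order):
        for b in order:
            if b.task_type == last_type or b.skipped >= k:
                chosen = b  # matching type, or it has already waited k times
                break
            b.skipped += 1  # give up this turn to avoid a context switch
    chosen.skipped = 0
    chosen.pending -= 1
    chosen.running += 1  # a real system would decrement this on completion
    return chosen
```

For example, with three batches priced $0.10, $0.05, and $0.05 that currently have 3, 2, and 2 running tasks (made-up numbers), FS picks one of the cheaper batches because it has the fewest running tasks, while WFS picks the $0.10 batch because 3/0.5 is smaller than 2/0.25. The skip counter is one simple way to bound the wait; the value of K remains a tuning parameter, as the WCFS slide notes.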