2. Roadmap
• A Brief History of Amazon’s Mechanical Turk
– Its rise to fame, enduring popularity, & strengths & limitations
• A capability analysis of 7 alternative crowd work platforms
– Ascertain distinguishing capabilities not found in MTurk
Matt Lease <ml@utexas.edu>
Take-away: Predominant use of Mechanical Turk in research
on paid crowd work has let its unique vagaries & limitations
unduly shape the broader crowdsourcing research agenda.
Greater awareness of alternative crowd work platforms can
accelerate practice, conceptual understanding, & progress.
2/20
3. Amazon Mechanical Turk (MTurk)
• Online marketplace for paid crowd work (micro-tasks)
• Human labor on-demand as a programmatic service
• Launched in November 2005
4. The Rise of Mechanical Turk
J. Pontin. Artificial Intelligence, With Help From the
Humans. New York Times, March 2007
Su et al., WWW 2007: “a web-based human data
collection system that we [call] ‘System M’ ”
5. 2008: Gold Rush begins in Comp. Sci.
Snow et al., EMNLP 2008
• Annotating human language
• 22,000 labels for only US $26 !!!
• Crowd’s consensus labels can
replace traditional expert labels
“Discovery” sparks rush for “gold” data across areas
• Alonso et al., SIGIR Forum (Information Retrieval)
• Kittur et al., CHI (Human-Computer Interaction)
• Sorokin and Forsyth, CVPR (Computer Vision)
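The finding by Snow et al. rests on label redundancy: collect several cheap, noisy labels per item, then take the consensus. A minimal majority-vote sketch (the `consensus_labels` helper and the `votes` data are illustrative, not from the paper; Snow et al. additionally correct for per-worker bias):

```python
from collections import Counter

def consensus_labels(labels_by_item):
    """Aggregate redundant crowd labels by simple majority vote.

    labels_by_item: dict mapping item id -> list of worker labels.
    Returns dict mapping item id -> most common label.
    """
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in labels_by_item.items()}

# Hypothetical annotations: three workers label each item's sentiment.
votes = {
    "headline-1": ["pos", "pos", "neg"],
    "headline-2": ["neg", "neg", "neg"],
}
print(consensus_labels(votes))  # {'headline-1': 'pos', 'headline-2': 'neg'}
```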
6. MTurk for Social & Behavioral Sciences
• A Guide to Behavioral Experiments
on Mechanical Turk
– W. Mason and S. Suri (2010). SSRN online.
• Crowdsourcing for Human Subjects Research
– L. Schmidt (CrowdConf 2010)
• Crowdsourcing Content Analysis for Behavioral Research:
Insights from Mechanical Turk
– Conley & Tosti-Kharas (2010). Academy of Management
• Amazon's Mechanical Turk: A New Source of
Inexpensive, Yet High-Quality, Data?
– M. Buhrmester et al. (2011). Perspectives… 6(1):3-5.
– see also: Amazon Mechanical Turk Guide for Social Scientists
7. MTurk Strengths
• On-demand, 24/7, vast U.S. & Indian workforce
• Traditional economic friction and geographic
work barriers dramatically streamlined
– Cheap & fast recruitment & job execution
– Hiring/firing, communication, transfer, payment
• Programmatic control via intermediary API
• Worker pseudonyms and diverse participant
pool amenable to human subjects research
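The programmatic control is what distinguishes crowd work from ordinary hiring: a micro-task (HIT) is posted with an API call rather than a job ad. A hedged sketch of assembling a CreateHIT request; `build_hit_request` is a hypothetical helper, the values are placeholders, and the boto3 call is shown commented out since it needs AWS credentials:

```python
def build_hit_request(title, description, reward_usd, assignments=3):
    """Assemble the parameter dict for posting one micro-task (HIT).

    Field names follow the MTurk CreateHIT operation; the values here
    are illustrative placeholders, not a real task.
    """
    return {
        "Title": title,
        "Description": description,
        "Reward": f"{reward_usd:.2f}",       # MTurk expects a string, in USD
        "MaxAssignments": assignments,        # redundant workers per item
        "LifetimeInSeconds": 24 * 3600,       # how long the HIT stays posted
        "AssignmentDurationInSeconds": 300,   # time allotted per worker
    }

params = build_hit_request("Label image sentiment", "Pick pos/neg", 0.05)
# With credentials configured, the request could be submitted via boto3:
# import boto3
# boto3.client("mturk").create_hit(Question=question_xml, **params)
print(params["Reward"])  # 0.05
```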
8. But also challenges (& much research)
• Reliability & data quality
– Risk & uncertainty may preclude adoption;
management overhead may offset savings
– Limited reputation system (no ratings/ reviews)
– Real & perceived fraud in marketplace
• Basic usage difficulties (e.g., posting & search)
• How to accomplish non-trivial tasks?
– Lack of worker profiles (skills & interests)
– interaction, collaboration, & workflow design
• How to process sensitive data?
– Private records, trade secrets, military use
• How to assess and navigate ethics?
9. Amazon Mechanical Turk (MTurk)
• Online marketplace for paid crowd work (micro-tasks)
• Human labor on-demand as a programmatic service
• Launched in Nov. 2005, remains in “beta” in 2015
11. Why Eytan Adar hates MTurk Research
(CHI 2011 CHC Workshop)
• Overly-narrow research focus on MTurk
– Distinguish general vs. platform-specific problems
– Distinguish research vs. industry concerns
• Should researchers really focus on…
– “...writing the user’s manual for MTurk ...”?
– “…struggl[ing] against the limits of the platform...”?
“…by rewarding quick demonstrations of the tool’s
use, we fail to attain a deeper understanding of the
problems to which it is applied…”
14. Many Crowd Work Platforms Today
And More!
JobBoy, microWorkers, MiniFreelance,
MiniJobz, MinuteWorkers, MyEasyTask,
OpTask, ShortTask, SimpleWorkers
Paid Crowd Work ≠ Mechanical Turk!
15. The Future of Crowd Work
Paper @ ACM CSCW 2013 by
Kittur, Nickerson, Bernstein,
Gerber, Shaw, Zimmerman
Lease, & Horton
“With diminished visibility and communication…
workers may be treated as exchangeable and
untrustworthy, having … strong motivations to
shirk. Workers may become equally cynical…”
“…we surveyed a number of … popular crowd work
platforms... Mechanical Turk, oDesk, Freelancer,
Crowdflower, MobileWorks, ManPower…”
But our platform coverage was pretty superficial…
16. Here: Qualitative Study of 7 Platforms
We shared preliminary findings with personnel from
all platforms, requesting & incorporating feedback
18. Capabilities Beyond MTurk (1 of 2)
• Basic platform operations mature beyond “beta”
– e.g. effective task search, preventing task starvation
• Payment models beyond per-task completed
– e.g., hourly pay or stronger models of contract work
• Rich worker profiles & reputation systems
– Some rigorously curate and assess workforce
– Work quality, desired demographics, task routing
• Rich interaction & collaboration support
– Support for a variety of specific task verticals
19. Capabilities Beyond MTurk (2 of 2)
• Truly global workforce & deeper geographic foci
– Still accepting new international workers…
• Supporting confidential & bring-your-own crowds
– Ability to crowdsource sensitive data
• “Fair trade” models for work with career ladders
– Explicit missions & designs for ethical work practices
• Legal terms-of-use beyond marketplace broker
– e.g., avoid the risk of the independent-contractor
vs. employee debate (Wolfson & Lease, ASIS&T’11)
20. Summary
• Qualitative study of 7 platforms for crowd work
– Detailed analysis & comparison of current state-of-the-art
– Highlighted key capabilities available beyond MTurk
• Further inform current practice & future research
Take-away: Predominant use of Mechanical Turk in research
on paid crowd work has let its unique vagaries & limitations
unduly shape the broader crowdsourcing research agenda.
Greater awareness of alternative crowd work platforms can
accelerate practice, conceptual understanding, & progress.
23. AAAI HCOMP 2013 Industry Panel
Anand Kulkarni: “How do we
dramatically reduce the complexity of
getting work done with the crowd?”
Greg Little: “How can we post a task and
with 98% confidence know we’ll get a
quality result?”
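Little's 98% target can be made concrete under a toy model: if each worker answers correctly with independent probability p, the chance that a majority vote of n workers is right follows a binomial sum. The `majority_correct_prob` helper below is an illustrative simplification (real workers are neither independent nor uniformly accurate), showing how much redundancy 80%-accurate workers would need:

```python
from math import comb

def majority_correct_prob(p, n):
    """P(majority of n independent workers, each correct w.p. p, is right).

    Assumes odd n (no ties) and independent errors -- both strong
    simplifications of real crowd behavior.
    """
    k_min = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

# Workers correct 80% of the time: how close does redundancy get to 98%?
for n in (3, 5, 7):
    print(n, round(majority_correct_prob(0.8, n), 3))
# -> 3 0.896, 5 0.942, 7 0.967: still short of 98%
```

Under this toy model, even seven 80%-accurate workers fall short of 98%, which hints why plain redundancy alone rarely satisfies quality guarantees.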
24. Why Paid Crowd Work?
• Why can’t we just…
– …ask people to volunteer?
– …create fun games which produce work as a by-product?
– …design services that produce work invisibly (e.g. re-Captcha)?
• Established models of paying work exist for a reason
– Even volunteers need to work to survive
– Demand will likely always outstrip volunteer supply
– There will always be forms of uninteresting but needed work
• Volunteer approaches bring their own set of challenges…
• But what about ethics?
– Are volunteering approaches truly more ethical?
– Since paying work isn’t going away, promote awareness & study
25. CACM, August 2013
Paul Hyman. Communications of the ACM, Vol. 56 No. 8, Pages 19-21, August 2013.
26. What about ethics?
• Silberman, Irani, and Ross (2010)
– “How should we… conceptualize the role of these
people who we ask to power our computing?”
– Power dynamics between parties
• What are the consequences for a worker
when your actions harm their reputation?
– “Abstraction hides detail”
• Fort, Adda, and Cohen (2011)
– “…opportunities for our community to deliberately
value ethics above cost savings.”
27. What about Human Subjects Research?
• “What are the characteristics of MTurk workers?... the MTurk
system is set up to strictly protect workers’ anonymity….”
28.
A MTurk worker’s ID is
also their customer
ID on Amazon. Public
profile pages can link
worker ID to name.
Lease et al., SSRN’13
29. Crowdsourcing
• Take a task traditionally
performed by a known agent
(often an employee)
• Advertise it to an undefined,
generally large group of
people via an open call
30. The Unreasonable Effectiveness of Data
Banko and Brill (2001)
An AI system’s effectiveness in practice is often
limited by the amount of available training data