Matt Lease
School of Information @mattlease
University of Texas at Austin ml@ischool.utexas.edu
Crowdsourcing & Human Computation
Labeling Data & Building Hybrid Systems
Slides: www.slideshare.net/mattlease
Roadmap
• A Quick Example
• Crowd-powered data collection & applications
• Crowdsourcing, Incentives, & Demographics
• Mechanical Turk & Other Platforms
• Designing for Crowds & Statistical QA
• Open Problems
• Broader Considerations & a Darker Side
2
What is Crowdsourcing?
• Let’s start with a simple example!
• Goal
– See a concrete example of real crowdsourcing
– Ground later discussion of abstract concepts
– Provide a specific example with which we will
contrast other forms of crowdsourcing
3
Human Intelligence Tasks (HITs)
4
5
6
Jane saw the man with the binoculars
Traditional Data Collection
• Setup data collection software / harness
• Recruit participants / annotators / assessors
• Pay a flat fee for experiment or hourly wage
• Characteristics
– Slow
– Expensive
– Difficult and/or Tedious
– Sample Bias…
7
“Hello World” Demo
• Let’s create and run a simple MTurk HIT
• This is a teaser highlighting concepts
– Don’t worry about details; we’ll revisit them
• Goal
– See a concrete example of real crowdsourcing
– Ground our later discussion of abstract concepts
– Provide a specific example with which we will
contrast other forms of crowdsourcing
8
DEMO
9
10
PHASE 1: DATA COLLECTION
NLP: Snow et al. (EMNLP 2008)
• MTurk annotation for 5 Tasks
– Affect recognition
– Word similarity
– Recognizing textual entailment
– Event temporal ordering
– Word sense disambiguation
• 22K labels for US $26
• High agreement between
consensus labels and
gold-standard labels
11
Computer Vision:
Sorokin & Forsyth (CVPR 2008)
• 4K labels for US $60
12
IR: Alonso et al. (SIGIR Forum 2008)
• MTurk for Information Retrieval (IR)
– Judge relevance of search engine results
• Many follow-on studies (design, quality, cost)
13
User Studies: Kittur, Chi, & Suh (CHI 2008)
• “…make creating believable invalid responses as
effortful as completing the task in good faith.”
14
Remote Usability Testing
• Liu, Bias, Lease, and Kuipers, ASIS&T, 2012
• Remote usability testing via MTurk & CrowdFlower
vs. traditional on-site testing
• Advantages
– More (Diverse) Participants
– High Speed
– Low Cost
• Disadvantages
– Lower Quality Feedback
– Less Interaction
– Greater need for quality control
– Less Focused User Groups
15
16
Human Subjects Research:
Surveys, Demographics, etc.
• A Guide to Behavioral Experiments
on Mechanical Turk
– W. Mason and S. Suri (2010). SSRN online.
• Crowdsourcing for Human Subjects Research
– L. Schmidt (CrowdConf 2010)
• Crowdsourcing Content Analysis for Behavioral Research:
Insights from Mechanical Turk
– Conley & Tosti-Kharas (2010). Academy of Management
• Amazon's Mechanical Turk : A New Source of
Inexpensive, Yet High-Quality, Data?
– M. Buhrmester et al. (2011). Perspectives… 6(1):3-5.
– see also: Amazon Mechanical Turk Guide for Social Scientists
17
• PhD Thesis, December 2005
• Law & von Ahn, Book, June 2011
18
LUIS VON AHN, CMU
ESP Game (Games With a Purpose)
L. von Ahn and L. Dabbish (2004)
19
reCaptcha
L. von Ahn et al. (2008). In Science.
20
DuoLingo (Launched Nov. 2011)
21
MORE DATA COLLECTION EXAMPLES
22
Crowd Sensing
• Steve Kelling, et al. A Human/Computer Learning
Network to Improve Biodiversity Conservation
and Research. AI Magazine 34.1 (2012): 10.
23
Tracking Sentiment in Online Media
Brew et al., PAIS 2010
• Volunteer-crowd
• Judge in exchange for
access to rich content
• Balance system needs
with user interest
• Daily updates to non-
stationary distribution
24
PHASE 2: FROM DATA COLLECTION
TO HUMAN COMPUTATION
25
What is a Computer?
26
Princeton University Press, 2005
• What was old is new
• Crowdsourcing: A New Branch
of Computer Science
– D.A. Grier, March 29, 2011
• Tabulating the heavens:
computing the Nautical
Almanac in 18th-century
England - M. Croarken’03
27
Human Computation
J. Pontin. Artificial Intelligence, With Help From
the Humans. New York Times (March 25, 2007)
The Mechanical Turk
28
Constructed and unveiled in 1770 by Wolfgang von Kempelen (1734–1804)
The Human Processing Unit (HPU)
• Davis et al. (2010)
HPU
29
Human Computation
• Having people do stuff instead of computers
• Investigates use of people to execute certain
computations for which capabilities of current
automated methods are more limited
• Explores the metaphor of computation for
characterizing attributes, capabilities, and
limitations of human task performance
30
APPLYING HUMAN COMPUTATION:
CROWD-POWERED APPLICATIONS
31
32
Crowd-Assisted Search: “Amazon Remembers”
Translation by monolingual speakers
• C. Hu, CHI 2009
33
Soylent: A Word Processor with a Crowd Inside
• Bernstein et al., UIST 2010
34
fold.it
S. Cooper et al. (2010)
Alice G. Walton. Online Gamers Help Solve Mystery of
Critical AIDS Virus Enzyme. The Atlantic, October 8, 2011.
35
PlateMate (Noronha et al., UIST’11)
36
Image Analysis and more: Eatery
37
VizWiz
Bigham et al. (UIST 2010)
38
39
Crowd Sensing: Waze
40
SO WHAT IS CROWDSOURCING?
41
42
From Outsourcing to Crowdsourcing
• Take a job traditionally
performed by a known agent
(often an employee)
• Outsource it to an undefined,
generally large group of
people via an open call
• New application of principles
from open source movement
• Evolving & broadly defined ...
43
Crowdsourcing models
• Micro-tasks & citizen science
• Co-Creation
• Open Innovation, Contests
• Prediction Markets
• Crowd Funding and Charity
• “Gamification” (not serious gaming)
• Transparent
• cQ&A, Social Search, and Polling
• Physical Interface/Task
44
What is Crowdsourcing?
• Mechanisms and methodology for directing
crowd action to achieve some goal(s)
– E.g., novel ways of collecting data from crowds
• Powered by internet-connectivity
• Related topics:
– Human computation
– Collective intelligence
– Crowd/Social computing
– Wisdom of Crowds
– People services, Human Clouds, Peer-production, …
45
What is not crowdsourcing?
• Analyzing existing datasets (no matter source)
– Data mining
– Visual analytics
• Use of few people
– Mixed-initiative design
– Active learning
• Conducting a survey or poll… (*)
– Novelty?
46
Crowdsourcing Key Questions
• What are the goals?
– Purposeful directing of human activity
• How can you incentivize participation?
– Incentive engineering
– Who are the target participants?
• Which model(s) are most appropriate?
– How to adapt them to your context and goals?
47
Wisdom of Crowds (WoC)
Requires
• Diversity
• Independence
• Decentralization
• Aggregation
Input: large, diverse sample
(to increase likelihood of overall pool quality)
Output: consensus or selection (aggregation)
48
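A minimal sketch, in Python with hypothetical data, of the aggregation step above: pool many independent inputs, then reduce them to a consensus estimate (for numeric guesses) or a plurality selection (for categorical labels).

```python
# Wisdom-of-Crowds aggregation sketch; the estimates and labels are made up.
from statistics import median
from collections import Counter

estimates = [270, 310, 295, 350, 280]                 # independent numeric guesses
labels = ["relevant", "relevant", "not", "relevant"]  # independent judgments of one item

consensus_estimate = median(estimates)                # robust to a few wild guesses
consensus_label, votes = Counter(labels).most_common(1)[0]  # plurality selection

print(consensus_estimate, consensus_label, votes)     # 295 relevant 3
```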
What do you want to accomplish?
• Create
• Execute task/computation
• Fund
• Innovate and/or discover
• Learn
• Monitor
• Predict
49
INCENTIVE ENGINEERING
50
Why should your crowd participate?
• Earn Money (real or virtual)
• Have fun (or pass the time)
• Socialize with others
• Obtain recognition or prestige (leaderboards, badges)
• Do Good (altruism)
• Learn something new
• Obtain something else
• Create self-serving resource
Multiple incentives can often operate in parallel (*caveat)
51
Example: Wikipedia
• Earn Money (real or virtual)
• Have fun (or pass the time)
• Socialize with others
• Obtain recognition or prestige
• Do Good (altruism)
• Learn something new
• Obtain something else
• Create self-serving resource
52
Example: DuoLingo
• Earn Money (real or virtual)
• Have fun (or pass the time)
• Socialize with others
• Obtain recognition or prestige
• Do Good (altruism)
• Learn something new
• Obtain something else
• Create self-serving resource
53
Example:
• Earn Money (real or virtual)
• Have fun (or pass the time)
• Socialize with others
• Obtain recognition or prestige
• Do Good (altruism)
• Learn something new
• Obtain something else
• Create self-serving resource
54
Example: ESP
55
• Earn Money (real or virtual)
• Have fun (or pass the time)
• Socialize with others
• Obtain recognition or prestige
• Do Good (altruism)
• Learn something new
• Obtain something else
• Create self-serving resource
Example: fold.it
• Earn Money (real or virtual)
• Have fun (or pass the time)
• Socialize with others
• Obtain recognition or prestige
• Do Good (altruism)
• Learn something new
• Obtain something else
• Create self-serving resource
56
Example: FreeRice
• Earn Money (real or virtual)
• Have fun (or pass the time)
• Socialize with others
• Obtain recognition or prestige
• Do Good (altruism)
• Learn something new
• Obtain something else
• Create self-serving resource
57
Example: cQ&A
• Earn Money (real or virtual)
• Have fun (or pass the time)
• Socialize with others
• Obtain recognition or prestige
• Do Good (altruism)
• Learn something new
• Obtain something else
• Create self-serving resource
58
Example: reCaptcha
• Earn Money (real or virtual)
• Have fun (or pass the time)
• Socialize with others
• Obtain recognition or prestige
• Do Good (altruism)
• Learn something new
• Obtain something else
• Create self-serving resource
59
Is there an existing human
activity you can harness
for another purpose?
Example: Mechanical Turk
• Earn Money (real or virtual)
• Have fun (or pass the time)
• Socialize with others
• Obtain recognition or prestige
• Do Good (altruism)
• Learn something new
• Obtain something else
• Create self-serving resource
60
Dan Pink – YouTube video
“The Surprising Truth about what Motivates us”
61
Who are
the workers?
• A. Baio, November 2008. The Faces of Mechanical Turk.
• P. Ipeirotis. March 2010.
The New Demographics of Mechanical Turk
• J. Ross, et al. Who are the Crowdworkers?... CHI 2010.
62
MTurk Demographics
• 2008-2009 studies found
less global and diverse
than previously thought
– US
– Female
– Educated
– Bored
– Money is secondary
63
2010 shows increasing diversity
47% US, 34% India, 19% other (P. Ipeirotis, March 2010)
64
How Much to Pay?
• Price commensurate with task effort
– Ex: $0.02 for yes/no answer + $0.02 bonus for optional feedback
• Ethics & market-factors: W. Mason and S. Suri, 2010.
– e.g. non-profit SamaSource involves workers in refugee camps
– Predict right price given market & task: Wang et al. CSDM’11
• Uptake & time-to-completion vs. Cost & Quality
– Too little $$, no interest or slow – too much $$, attract spammers
– Real problem is lack of reliable QA substrate
• Accuracy & quantity
– More pay = more work, not better (W. Mason and D. Watts, 2009)
• Heuristics: start small, watch uptake and bargaining feedback
• Worker retention (“anchoring”)
65
See also: L.B. Chilton et al. KDD-HCOMP 2010.
MTURK & OTHER PLATFORMS
66
Does anyone really use it? Yes!
http://www.mturk-tracker.com (P. Ipeirotis’10)
From 1/09 – 4/10, 7M HITs from 10K requestors
worth $500,000 USD (significant under-estimate)
67
MTurk: The Requester
• Sign up with your Amazon account
• Amazon payments
• Purchase prepaid HITs
• There is no minimum or up-front fee
• MTurk collects a 10% commission
• The minimum commission charge is $0.005 per HIT
68
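A minimal sketch of the requester cost arithmetic implied by the fee structure above (10% commission with a $0.005 minimum per HIT, as stated on this slide; fees may have changed since). The task size, redundancy, and reward below are hypothetical.

```python
# Total cost = items x redundancy x (reward + commission), with a per-HIT fee floor.
def hit_cost(reward, commission_rate=0.10, min_fee=0.005):
    return reward + max(commission_rate * reward, min_fee)

items = 1000       # examples to label (hypothetical)
redundancy = 5     # judgments collected per example
reward = 0.02      # USD paid per assignment

total = items * redundancy * hit_cost(reward)
print(f"Total cost: ${total:.2f}")   # $125.00 for this configuration
```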
MTurk Dashboard
• Three tabs
– Design
– Publish
– Manage
• Design
– HIT Template
• Publish
– Make work available
• Manage
– Monitor progress
69
70
MTurk: Dashboard - II
71
MTurk API
• Amazon Web Services API
• Rich set of services
• Command line tools
• More flexibility than dashboard
72
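To make the API concrete, here is a minimal sketch of publishing a HIT programmatically. It uses the present-day boto3 MTurk client rather than the command line tools of the original tutorial; the sandbox endpoint, reward, and question.xml file (an MTurk question definition) are assumptions for illustration only.

```python
import boto3

# Point at the requester sandbox while testing; drop endpoint_url for production.
mturk = boto3.client(
    "mturk",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

with open("question.xml") as f:        # hypothetical question definition
    question_xml = f.read()

response = mturk.create_hit(
    Title="Judge relevance of a search result",
    Description="Read the query and document, then rate relevance.",
    Keywords="relevance, judgment, search",
    Reward="0.02",                     # USD, passed as a string
    MaxAssignments=5,                  # redundant judgments per item
    LifetimeInSeconds=3 * 24 * 3600,   # how long the HIT stays available
    AssignmentDurationInSeconds=600,   # time allotted to each worker
    Question=question_xml,
)
print(response["HIT"]["HITId"])
```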
MTurk Dashboard vs. API
• Dashboard
– Easy to prototype
– Setup and launch an experiment in a few minutes
• API
– Ability to integrate AMT as part of a system
– Ideal if you want to run experiments regularly
– Schedule tasks
73
74
• Multiple Channels
• Gold-based tests
• Only pay for
“trusted” judgments
More Crowd Labor Platforms
• Clickworker
• CloudCrowd
• CloudFactory
• CrowdSource
• DoMyStuff
• Microtask
• MobileWorks (by Anand Kulkarni)
• myGengo
• SmartSheet
• vWorker
• Industry heavy-weights
– Elance
– Liveops
– oDesk
– uTest
• and more…
75
Many Factors Matter!
• Process
– Task design, instructions, setup, iteration
• Choose crowdsourcing platform (or roll your own)
• Human factors
– Payment / incentives, interface and interaction design,
communication, reputation, recruitment, retention
• Quality Control / Data Quality
– Trust, reliability, spam detection, consensus labeling
• Don’t write a paper saying “we collected data from
MTurk & then…” – details of method matter!
76
WORKFLOW DESIGN
77
PlateMate - Architecture
78
Turkomatic
Kulkarni et al., CSCW 2012
79
CrowdForge: Workers perform a task
or further decompose them
80
Kittur et al., CHI 2011
Kittur et al., CrowdWeaver, CSCW 2012
81
DESIGNING FOR CROWDS
82
Typical Workflow
• Define and design what to test
• Sample data
• Design the experiment
• Run experiment
• Collect data and analyze results
• Quality control
83
Development Framework
• Incremental approach (from Omar Alonso)
• Measure, evaluate, and adjust as you go
• Suitable for repeatable tasks
84
Survey Design
• One of the most important parts
• Part art, part science
• Instructions are key
• Prepare to iterate
85
Questionnaire Design
• Ask the right questions
• Workers may not be IR experts so don’t
assume the same understanding in terms of
terminology
• Show examples
• Hire a technical writer
– Engineer writes the specification
– Writer communicates
86
UX Design
• Time to apply all those usability concepts
• Generic tips
– Experiment should be self-contained.
– Keep it short and simple. Brief and concise.
– Be very clear with the relevance task.
– Engage with the worker. Avoid boring stuff.
– Always ask for feedback (open-ended question) in
an input box.
87
UX Design - II
• Presentation
• Document design
• Highlight important concepts
• Colors and fonts
• Need to grab attention
• Localization
88
Implementation
• Similar to a UX
• Build a mock up and test it with your team
– Yes, you need to judge some tasks
• Incorporate feedback and run a test on MTurk
with a very small data set
– Time the experiment
– Do people understand the task?
• Analyze results
– Look for spammers
– Check completion times
• Iterate and modify accordingly
89
Implementation – II
• Introduce quality control
– Qualification test
– Gold answers (honey pots)
• Adjust passing grade and worker approval rate
• Run experiment with new settings & same data
• Scale on data
• Scale on workers
90
Other design principles
• Text alignment
• Legibility
• Reading level: complexity of words and sentences
• Attractiveness (worker’s attention & enjoyment)
• Multi-cultural / multi-lingual
• Who is the audience (e.g. target worker community)
– Special needs communities (e.g. simple color blindness)
• Parsimony
• Cognitive load: mental rigor needed to perform task
• Exposure effect
91
The human side
• As a worker
– I hate when instructions are not clear
– I’m not a spammer – I just don’t get what you want
– Boring task
– A good pay is ideal but not the only condition for engagement
• As a requester
– Attrition
– Balancing act: a task that would produce the right results and
is appealing to workers
– I want your honest answer for the task
– I want qualified workers; system should do some of that for me
• Managing crowds and tasks is a daily activity
– more difficult than managing computers
92
QUALITY ASSURANCE
93
When to assess quality of work
• Beforehand (prior to main task activity)
– How: “qualification tests” or similar mechanism
– Purpose: screening, selection, recruiting, training
• During
– How: assess labels as worker produces them
• Like random checks on a manufacturing line
– Purpose: calibrate, reward/penalize, weight
• After
– How: compute accuracy metrics post-hoc
– Purpose: filter, calibrate, weight, retain (HR)
– E.g. Jung & Lease (2011), Tang & Lease (2011), ...
94
How do we measure work quality?
• Compare worker’s label vs.
– Known (correct, trusted) label
– Other workers’ labels
• P. Ipeirotis. Worker Evaluation in Crowdsourcing: Gold Data or
Multiple Workers? Sept. 2010.
– Model predictions of the above
• Model the labels (Ryu & Lease, ASIS&T11)
• Model the workers (Chen et al., AAAI’10)
• Verify worker’s label
– Yourself
– Tiered approach (e.g. Find-Fix-Verify)
• Quinn and B. Bederson’09, Bernstein et al.’10
95
Typical Assumptions
• Objective truth exists
– no minority voice / rare insights
– Can relax this to model “truth distribution”
• Automatic answer comparison/evaluation
– What about free text responses? Hope from NLP…
• Automatic essay scoring
• Translation (BLEU: Papineni, ACL’2002)
• Summarization (Rouge: C.Y. Lin, WAS’2004)
– Have people do it (yourself or find-verify crowd, etc.)
96
Distinguishing Bias vs. Noise
• Ipeirotis (HComp 2010)
• People often have consistent, idiosyncratic
skews in their labels (bias)
– E.g. I like action movies, so they get higher ratings
• Once detected, systematic bias can be
calibrated for and corrected (yeah!)
• Noise, however, seems random & inconsistent
– this is the real issue we want to focus on
97
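A minimal sketch of the bias-correction idea above (not Ipeirotis's exact algorithm): estimate a worker's confusion matrix against reference labels, then remap that worker's future labels. A consistent skew is recoverable this way; a near-uniform confusion matrix signals noise that cannot be corrected. Worker data and label names are hypothetical.

```python
from collections import defaultdict

def confusion(worker_labels, reference_labels):
    # counts[worker_label][reference_label] = how often they co-occur
    counts = defaultdict(lambda: defaultdict(int))
    for w, r in zip(worker_labels, reference_labels):
        counts[w][r] += 1
    return counts

def recalibrate(label, counts):
    # Map the worker's label to the reference label it most often stands for.
    row = counts.get(label)
    return label if not row else max(row, key=row.get)

# Hypothetical worker who systematically over-labels items as relevant.
worker    = ["rel", "rel", "rel", "not", "rel", "rel"]
reference = ["not", "rel", "not", "not", "not", "rel"]

c = confusion(worker, reference)
print(recalibrate("rel", c))   # -> "not": this worker's "rel" usually means "not"
```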
Comparing to known answers
• AKA: gold, honey pot, verifiable answer, trap
• Assumes you have known answers
• Cost vs. Benefit
– Producing known answers (experts?)
– % of work spent re-producing them
• Finer points
– Controls against collusion
– What if workers recognize the honey pots?
98
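A minimal sketch of honey-pot scoring, assuming hypothetical gold answers and (worker, item, label) triples: compute each worker's accuracy on the trap items only, then filter or down-weight workers below a chosen threshold.

```python
from collections import defaultdict

gold = {"q1": "rel", "q2": "not", "q3": "rel"}          # known answers (hypothetical)
labels = [                                               # (worker, item, label)
    ("w1", "q1", "rel"), ("w1", "q2", "not"), ("w1", "q7", "rel"),
    ("w2", "q1", "not"), ("w2", "q3", "not"), ("w2", "q9", "rel"),
]

hits, tries = defaultdict(int), defaultdict(int)
for worker, item, label in labels:
    if item in gold:                         # only the trap questions are scored
        tries[worker] += 1
        hits[worker] += int(label == gold[item])

accuracy = {w: hits[w] / tries[w] for w in tries}
trusted = {w for w, acc in accuracy.items() if acc >= 0.7}
print(accuracy, trusted)                     # w1: 1.0 (trusted), w2: 0.0 (filtered)
```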
Comparing to other workers
• AKA: consensus, plurality, redundant labeling
• Well-known metrics for measuring agreement
• Cost vs. Benefit: % of work that is redundant
• Finer points
– Is consensus “truth” or systematic bias of group?
– What if no one really knows what they’re doing?
• Low-agreement across workers indicates problem is with the
task (or a specific example), not the workers
– Risk of collusion
• Sheng et al. (KDD 2008)
99
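A minimal sketch of redundant labeling with plurality consensus, following the slide above: each item gets several judgments, and items with weak agreement are escalated for more judgments or an expert tie-break. The data and the 0.6 agreement threshold are hypothetical.

```python
from collections import Counter, defaultdict

labels = [                                   # (item, worker, label), made up
    ("d1", "w1", "rel"), ("d1", "w2", "rel"), ("d1", "w3", "not"),
    ("d2", "w1", "not"), ("d2", "w2", "rel"),
]

by_item = defaultdict(list)
for item, _, label in labels:
    by_item[item].append(label)

for item, votes in by_item.items():
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    if agreement < 0.6:                      # tie or weak plurality
        print(item, "-> escalate: collect more judgments or ask an expert")
    else:
        print(item, "->", label, f"(agreement {agreement:.2f})")
```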
Comparing to predicted label
• Ryu & Lease, ASIS&T11
• Catch-22 extremes
– If model is really bad, why bother comparing?
– If model is really good, why collect human labels?
• Exploit model confidence
– Trust predictions proportional to confidence
– What if model very confident and wrong?
• Active learning
– Time sensitive: Accuracy / confidence changes
100
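One way to make "trust predictions proportional to confidence" concrete is a weighted vote in which the model's predicted label counts for at most one worker, scaled by its reported confidence. This is only an illustrative weighting scheme, not the specific method of Ryu & Lease; all values are hypothetical.

```python
from collections import defaultdict

worker_votes = ["rel", "not", "not"]
model_label, model_confidence = "rel", 0.9   # classifier output (hypothetical)

weights = defaultdict(float)
for vote in worker_votes:
    weights[vote] += 1.0                     # each worker contributes one full vote
weights[model_label] += model_confidence     # model contributes a fractional vote

consensus = max(weights, key=weights.get)
print(dict(weights), "->", consensus)        # {'rel': 1.9, 'not': 2.0} -> not
```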
Compare to predicted worker labels
• Chen et al., AAAI’10
• Avoid inefficiency of redundant labeling
– See also: Dekel & Shamir (COLT’2009)
• Train a classifier for each worker
• For each example labeled by a worker
– Compare to predicted labels for all other workers
• Issues
• Sparsity: workers have to stick around to train model…
• Time-sensitivity: New workers & incremental updates?
101
Methods for measuring agreement
• What to look for
– Agreement, reliability, validity
• Inter-agreement level
– Agreement between judges
– Agreement between judges and the gold set
• Some statistics
– Percentage agreement
– Cohen’s kappa (2 raters)
– Fleiss’ kappa (any number of raters)
– Krippendorff’s alpha
• With majority vote, what if 2 say relevant, 3 say not?
– Use expert to break ties (Kochhar et al, HCOMP’10; GQR)
– Collect more judgments as needed to reduce uncertainty
102
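A minimal sketch of two of the agreement statistics listed above, percentage agreement and Cohen's kappa for two raters, computed directly from their definitions on hypothetical labels (libraries such as scikit-learn also provide cohen_kappa_score).

```python
from collections import Counter

rater_a = ["rel", "rel", "not", "rel", "not", "not"]   # hypothetical judgments
rater_b = ["rel", "not", "not", "rel", "not", "rel"]

n = len(rater_a)
p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Expected chance agreement from each rater's marginal label distribution.
pa, pb = Counter(rater_a), Counter(rater_b)
p_chance = sum((pa[c] / n) * (pb[c] / n) for c in set(rater_a) | set(rater_b))

kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"agreement={p_observed:.2f}, kappa={kappa:.2f}")   # agreement=0.67, kappa=0.33
```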
Other practical tips
• Sign up as worker and do some HITs
• “Eat your own dog food”
• Monitor discussion forums
• Address feedback (e.g., poor guidelines,
payments, passing grade, etc.)
• Everything counts!
– Overall design only as strong as weakest link
103
OPEN PROBLEMS
104
Why Eytan Adar hates MTurk Research
(CHI 2011 CHC Workshop)
• Overly-narrow focus on MTurk
– Identify general vs. platform-specific problems
– Academic vs. Industrial problems
• Inattention to prior work in other disciplines
• Turks aren’t Martians
– Just human behavior…
105
What about sensitive data?
• Not all data can be publicly disclosed
– User data (e.g. AOL query log, Netflix ratings)
– Intellectual property
– Legal confidentiality
• Need to restrict who is in your crowd
– Separate channel (workforce) from technology
– Hot question for adoption at enterprise level
106
A Few Open Questions
• How should we balance automation vs.
human computation? Which does what?
• Who’s the right person for the job?
• How do we handle complex tasks? Can we
decompose them into smaller tasks? How?
107
What about ethics?
• Silberman, Irani, and Ross (2010)
– “How should we… conceptualize the role of these
people who we ask to power our computing?”
– Power dynamics between parties
• What are the consequences for a worker
when your actions harm their reputation?
– “Abstraction hides detail”
• Fort, Adda, and Cohen (2011)
– “…opportunities for our community to deliberately
value ethics above cost savings.”
108
Example: SamaSource
109
Davis et al. (2010) The HPU.
HPU
110
HPU: “Abstraction hides detail”
• Not just turning a mechanical crank
111
Micro-tasks & Task Decomposition
• Small, simple tasks can be completed faster by
reducing extraneous context and detail
– e.g. “Can you name who is in this photo?”
• Current workflow research investigates how to
decompose complex tasks into simpler ones
112
Context & Informed Consent
• What is the larger task I’m contributing to?
• Who will benefit from it and how?
113
Worker Privacy
Each worker is assigned an alphanumeric ID
114
Requesters see only Worker IDs
115
Issues of Identity Fraud
• Compromised & exploited worker accounts
• Sybil attacks: use of multiple worker identities
• Script bots masquerading as human workers
116
Robert Sim, MSR Faculty Summit’12
Safeguarding Personal Data
• “What are the characteristics of MTurk workers?... the MTurk system is set up to strictly protect workers’ anonymity….”
117
Amazon profile page URLs use the same IDs used on MTurk!
Paper: MTurk is Not Anonymous
118
What about the regulation?
• Wolfson & Lease (ASIS&T 2011)
• As usual, technology is ahead of the law
– employment law
– patent inventorship
– data security and the Federal Trade Commission
– copyright ownership
– securities regulation of crowdfunding
• Take-away: don’t panic, but be mindful
– Understand risks of “just in-time compliance”
119
Digital Dirty Jobs
• NY Times: Policing the Web’s Lurid Precincts
• Gawker: Facebook content moderation
• CultureDigitally: The dirty job of keeping
Facebook clean
• Even LDC annotators reading typical
news articles report stress & nightmares!
120
Jeff Howe Vision vs. Reality?
• Vision of empowering worker freedom:
– work whenever you want for whomever you want
• When $$$ is at stake, populations at risk may
be compelled to perform work by others
– Digital sweat shops? Digital slaves?
– We really don’t know (and need to learn more…)
– Traction? Human Trafficking at MSR Summit’12
121
A DARK SIDE OF CROWDSOURCING
122
Putting the shoe on the other foot:
Spam
123
What about trust?
• Some reports of robot “workers” on MTurk
– E.g. McCreadie et al. (2011)
– Violates terms of service
• Why not just use a captcha?
124
Captcha Fraud
125
Requester Fraud on MTurk
“Do not do any HITs that involve: filling in
CAPTCHAs; secret shopping; test our web page;
test zip code; free trial; click my link; surveys or
quizzes (unless the requester is listed with a
smiley in the Hall of Fame/Shame); anything
that involves sending a text message; or
basically anything that asks for any personal
information at all—even your zip code. If you
feel in your gut it’s not on the level, IT’S NOT.
Why? Because they are scams...”
126
Defeating CAPTCHAs with crowds
127
Gaming the System: SEO, etc.
WWW’12
129
Robert Sim, MSR Summit’12
130
Conclusion
• Crowdsourcing is quickly transforming practice
in industry and academia via greater efficiency
• Crowd computing enables a new design space
for applications, augmenting state-of-the-art AI
with human computation to offer
new capabilities and user experiences
• With people at the center of this new computing
paradigm, important research questions
bridge technological & social considerations
131
The Future of Crowd Work
Paper @ ACM CSCW 2013
Kittur, Nickerson, Bernstein, Gerber,
Shaw, Zimmerman, Lease, and Horton
132
Brief Digression: Information Schools
• At 30 universities in N. America, Europe, Asia
• Study human-centered aspects of information
technologies: design, implementation, policy, …
133
www.ischools.org
Wobbrock et
al., 2009
REFERENCES & RESOURCES
134
• Aniket Kittur, Jeffrey Nickerson, Michael S. Bernstein, Elizabeth
Gerber, Aaron Shaw, John Zimmerman, Matthew Lease, and
John J. Horton. The Future of Crowd Work. In ACM Computer
Supported Cooperative Work (CSCW), February 2013.
• Alex Quinn and Ben Bederson. Human Computation: A Survey
and Taxonomy of a Growing Field. In Proceedings of CHI 2011.
• Law and von Ahn (2011). Human Computation
135
Surveys
2013 Crowdsourcing
• 1st year of HComp as AAAI conference
• TREC 2013 Crowdsourcing Track
• Springer’s Information Retrieval (articles online):
Crowdsourcing for Information Retrieval
• 4th CrowdConf (San Francisco, Fall)
• 1st Crowdsourcing Week (Singapore, April)
136
TREC Crowdsourcing Track
• Year 1 (2011) – horizontals
– Task 1 (hci): collect crowd relevance judgments
– Task 2 (stats): aggregate judgments
– Organizers: Kazai & Lease
– Sponsors: Amazon, CrowdFlower
• Year 2 (2012) – content types
– Task 1 (text): judge relevance
– Task 2 (images): judge relevance
– Organizers: Ipeirotis, Kazai, Lease, & Smucker
– Sponsors: Amazon, CrowdFlower, MobileWorks
137
2012 Workshops & Conferences
• AAAI: Human Computation (HComp) (July 22-23)
• AAAI Spring Symposium: Wisdom of the Crowd (March 26-28)
• ACL: 3rd Workshop of the People's Web meets NLP (July 12-13)
• AMCIS: Crowdsourcing Innovation, Knowledge, and Creativity in Virtual Communities(August 9-12)
• CHI: CrowdCamp (May 5-6)
• CIKM: Multimodal Crowd Sensing (CrowdSens) (Oct. or Nov.)
• Collective Intelligence (April 18-20)
• CrowdConf 2012 -- 3rd Annual Conference on the Future of Distributed Work (October 23)
• CrowdNet - 2nd Workshop on Cloud Labor and Human Computation (Jan 26-27)
• EC: Social Computing and User Generated Content Workshop (June 7)
• ICDIM: Emerging Problem- specific Crowdsourcing Technologies (August 23)
• ICEC: Harnessing Collective Intelligence with Games (September)
• ICML: Machine Learning in Human Computation & Crowdsourcing (June 30)
• ICWE: 1st International Workshop on Crowdsourced Web Engineering (CroWE) (July 27)
• KDD: Workshop on Crowdsourcing and Data Mining (August 12)
• Multimedia: Crowdsourcing for Multimedia (Nov 2)
• SocialCom: Social Media for Human Computation (September 6)
• TREC-Crowd: 2nd TREC Crowdsourcing Track (Nov. 14-16)
• WWW: CrowdSearch: Crowdsourcing Web search (April 17)
138
2011 Workshops & Conferences
• AAAI-HCOMP: 3rd Human Computation Workshop (Aug. 8)
• ACIS: Crowdsourcing, Value Co-Creation, & Digital Economy Innovation (Nov. 30 – Dec. 2)
• Crowdsourcing Technologies for Language and Cognition Studies (July 27)
• CHI-CHC: Crowdsourcing and Human Computation (May 8)
• CIKM: BooksOnline (Oct. 24, “crowdsourcing … online books”)
• CrowdConf 2011 -- 2nd Conf. on the Future of Distributed Work (Nov. 1-2)
• Crowdsourcing: Improving … Scientific Data Through Social Networking (June 13)
• EC: Workshop on Social Computing and User Generated Content (June 5)
• ICWE: 2nd International Workshop on Enterprise Crowdsourcing (June 20)
• Interspeech: Crowdsourcing for speech processing (August)
• NIPS: Second Workshop on Computational Social Science and the Wisdom of Crowds (Dec. TBD)
• SIGIR-CIR: Workshop on Crowdsourcing for Information Retrieval (July 28)
• TREC-Crowd: 1st TREC Crowdsourcing Track (Nov. 16-18)
• UbiComp: 2nd Workshop on Ubiquitous Crowdsourcing (Sep. 18)
• WSDM-CSDM: Crowdsourcing for Search and Data Mining (Feb. 9)
139
2011 Tutorials and Keynotes
• By Omar Alonso and/or Matthew Lease
– CLEF: Crowdsourcing for Information Retrieval Experimentation and Evaluation (Sep. 20, Omar only)
– CrowdConf: Crowdsourcing for Research and Engineering
– IJCNLP: Crowd Computing: Opportunities and Challenges (Nov. 10, Matt only)
– WSDM: Crowdsourcing 101: Putting the WSDM of Crowds to Work for You (Feb. 9)
– SIGIR: Crowdsourcing for Information Retrieval: Principles, Methods, and Applications (July 24)
• AAAI: Human Computation: Core Research Questions and State of the Art
– Edith Law and Luis von Ahn, August 7
• ASIS&T: How to Identify Ducks In Flight: A Crowdsourcing Approach to Biodiversity Research and
Conservation
– Steve Kelling, October 10, ebird
• EC: Conducting Behavioral Research Using Amazon's Mechanical Turk
– Winter Mason and Siddharth Suri, June 5
• HCIC: Quality Crowdsourcing for Human Computer Interaction Research
– Ed Chi, June 14-18 (about HCIC)
– Also see his: Crowdsourcing for HCI Research with Amazon Mechanical Turk
• Multimedia: Frontiers in Multimedia Search
– Alan Hanjalic and Martha Larson, Nov 28
• VLDB: Crowdsourcing Applications and Platforms
– Anhai Doan, Michael Franklin, Donald Kossmann, and Tim Kraska
• WWW: Managing Crowdsourced Human Computation
– Panos Ipeirotis and Praveen Paritosh
140
Students
– Catherine Grady (iSchool)
– Hyunjoon Jung (iSchool)
– Jorn Klinger (Linguistics)
– Adriana Kovashka (CS)
– Abhimanu Kumar (CS)
– Hohyon Ryu (iSchool)
– Wei Tang (CS)
– Stephen Wolfson (iSchool)
Matt Lease - ml@ischool.utexas.edu - @mattlease
Thank You!
141
ir.ischool.utexas.edu/crowd
More Books
July 2010, kindle-only: “This book introduces you to the
top crowdsourcing sites and outlines step by step with
photos the exact process to get started as a requester on
Amazon Mechanical Turk.“
142
Resources
A Few Blogs
 Behind Enemy Lines (P.G. Ipeirotis, NYU)
 Deneme: a Mechanical Turk experiments blog (Greg Little, MIT)
 CrowdFlower Blog
 http://experimentalturk.wordpress.com
 Jeff Howe
A Few Sites
 The Crowdsortium
 Crowdsourcing.org
 CrowdsourceBase (for workers)
 Daily Crowdsource
MTurk Forums and Resources
 Turker Nation: http://turkers.proboards.com
 http://www.turkalert.com (and its blog)
 Turkopticon: report/avoid shady requestors
 Amazon Forum for MTurk
143
Bibliography
 J. Barr and L. Cabrera. “AI gets a Brain”, ACM Queue, May 2006.
 Bernstein, M. et al. Soylent: A Word Processor with a Crowd Inside. UIST 2010. Best Student Paper award.
 Bederson, B.B., Hu, C., & Resnik, P. Translation by Iterative Collaboration between Monolingual Users, Proceedings of Graphics
Interface (GI 2010), 39-46.
 N. Bradburn, S. Sudman, and B. Wansink. Asking Questions: The Definitive Guide to Questionnaire Design, Jossey-Bass, 2004.
 C. Callison-Burch. “Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk”, EMNLP 2009.
 P. Dai, Mausam, and D. Weld. “Decision-Theoretic Control of Crowd-Sourced Workflows”, AAAI, 2010.
 J. Davis et al. “The HPU”, IEEE Computer Vision and Pattern Recognition Workshop on Advancing Computer Vision with Human
in the Loop (ACVHL), June 2010.
 M. Gashler, C. Giraud-Carrier, T. Martinez. Decision Tree Ensemble: Small Heterogeneous Is Better Than Large Homogeneous, ICMLA 2008.
 D. A. Grier. When Computers Were Human. Princeton University Press, 2005. ISBN 0691091579
 JS. Hacker and L. von Ahn. “Matchin: Eliciting User Preferences with an Online Game”, CHI 2009.
 J. Heer, M. Bostock. “Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design”, CHI 2010.
 P. Heymann and H. Garcia-Molina. “Human Processing”, Technical Report, Stanford Info Lab, 2010.
 J. Howe. “Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business”. Crown Business, New York, 2008.
 P. Hsueh, P. Melville, V. Sindhwani. “Data Quality from Crowdsourcing: A Study of Annotation Selection Criteria”. NAACL HLT
Workshop on Active Learning and NLP, 2009.
 B. Huberman, D. Romero, and F. Wu. “Crowdsourcing, attention and productivity”. Journal of Information Science, 2009.
 P.G. Ipeirotis. The New Demographics of Mechanical Turk. March 9, 2010. PDF and Spreadsheet.
 P.G. Ipeirotis, R. Chandrasekar and P. Bennett. Report on the human computation workshop. SIGKDD Explorations v11 no 2 pp. 80-83, 2010.
 P.G. Ipeirotis. Analyzing the Amazon Mechanical Turk Marketplace. CeDER-10-04 (Sept. 11, 2010)
144
Bibliography (2)
 A. Kittur, E. Chi, and B. Suh. “Crowdsourcing user studies with Mechanical Turk”, SIGCHI 2008.
 Aniket Kittur, Boris Smus, Robert E. Kraut. CrowdForge: Crowdsourcing Complex Work. CHI 2011
 Adriana Kovashka and Matthew Lease. “Human and Machine Detection of … Similarity in Art”. CrowdConf 2010.
 K. Krippendorff. "Content Analysis", Sage Publications, 2003
 G. Little, L. Chilton, M. Goldman, and R. Miller. “TurKit: Tools for Iterative Tasks on Mechanical Turk”, HCOMP 2009.
 T. Malone, R. Laubacher, and C. Dellarocas. Harnessing Crowds: Mapping the Genome of Collective Intelligence.
2009.
 W. Mason and D. Watts. “Financial Incentives and the ’Performance of Crowds’”, HCOMP Workshop at KDD 2009.
 J. Nielsen. “Usability Engineering”, Morgan-Kaufman, 1994.
 A. Quinn and B. Bederson. “A Taxonomy of Distributed Human Computation”, Technical Report HCIL-2009-23, 2009
 J. Ross, L. Irani, M. Six Silberman, A. Zaldivar, and B. Tomlinson. “Who are the Crowdworkers?: Shifting
Demographics in Amazon Mechanical Turk”. CHI 2010.
 F. Scheuren. “What is a Survey” (http://www.whatisasurvey.info) 2004.
 R. Snow, B. O’Connor, D. Jurafsky, and A. Y. Ng. “Cheap and Fast But is it Good? Evaluating Non-Expert Annotations
for Natural Language Tasks”. EMNLP-2008.
 V. Sheng, F. Provost, P. Ipeirotis. “Get Another Label? Improving Data Quality … Using Multiple, Noisy Labelers”
KDD 2008.
 S. Weber. “The Success of Open Source”, Harvard University Press, 2004.
 L. von Ahn. Games with a purpose. Computer, 39 (6), 92–94, 2006.
 L. von Ahn and L. Dabbish. “Designing Games with a purpose”. CACM, Vol. 51, No. 8, 2008.
145
Bibliography (3)
 Shuo Chen et al. What if the Irresponsible Teachers Are Dominating? A Method of Training on Samples and
Clustering on Teachers. AAAI 2010.
 Paul Heymann, Hector Garcia-Molina: Turkalytics: analytics for human computation. WWW 2011.
 Florian Laws, Christian Scheible and Hinrich Schütze. Active Learning with Amazon Mechanical Turk.
EMNLP 2011.
 C.Y. Lin. Rouge: A package for automatic evaluation of summaries. Proceedings of the workshop on text
summarization branches out (WAS), 2004.
 C. Marshall and F. Shipman “The Ownership and Reuse of Visual Media”, JCDL, 2011.
 Hohyon Ryu and Matthew Lease. Crowdworker Filtering with Support Vector Machine. ASIS&T 2011.
 Wei Tang and Matthew Lease. Semi-Supervised Consensus Labeling for Crowdsourcing. ACM SIGIR
Workshop on Crowdsourcing for Information Retrieval (CIR), 2011.
 S. Vijayanarasimhan and K. Grauman. Large-Scale Live Active Learning: Training Object Detectors with
Crawled Data and Crowds. CVPR 2011.
 Stephen Wolfson and Matthew Lease. Look Before You Leap: Legal Pitfalls of Crowdsourcing. ASIS&T 2011.
146
Recent Work
• Della Penna, N, and M D Reid. (2012). “Crowd & Prejudice: An Impossibility Theorem for Crowd Labelling without a Gold
Standard.” in Proceedings of Collective Intelligence. Arxiv preprint arXiv:1204.3511.
• Demartini, Gianluca, D.E. Difallah, and P. Cudre-Mauroux. (2012). “ZenCrowd: leveraging probabilistic reasoning and
crowdsourcing techniques for large-scale entity linking.” 21st Annual Conference on the World Wide Web (WWW).
• Donmez, Pinar, Jaime Carbonell, and Jeff Schneider. (2010). “A probabilistic framework to learn from multiple
annotators with time-varying accuracy.” in SIAM International Conference on Data Mining (SDM), 826-837.
• Donmez, Pinar, Jaime Carbonell, and Jeff Schneider. (2009). “Efficiently learning the accuracy of labeling sources for
selective sampling.” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and
data mining (KDD), 259-268.
• Fort, K., Adda, G., and Cohen, K. (2011). Amazon Mechanical Turk: Gold mine or coal mine? Computational
Linguistics, 37(2):413–420.
• Ghosh, A, Satyen Kale, and Preston McAfee. (2012). “Who Moderates the Moderators? Crowdsourcing Abuse Detection
in User-Generated Content.” in Proceedings of the 12th ACM conference on Electronic commerce.
• Ho, C J, and J W Vaughan. (2012). “Online Task Assignment in Crowdsourcing Markets.” in Twenty-Sixth AAAI Conference
on Artificial Intelligence.
• Jung, Hyun Joon, and Matthew Lease. (2012). “Inferring Missing Relevance Judgments from Crowd Workers via
Probabilistic Matrix Factorization.” in Proceedings of the 36th international ACM SIGIR conference on Research and
development in information retrieval.
• Kamar, E, S Hacker, and E Horvitz. (2012). “Combining Human and Machine Intelligence in Large-scale Crowdsourcing.” in
Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
• Karger, D R, S Oh, and D Shah. (2011). “Budget-optimal task allocation for reliable crowdsourcing systems.” Arxiv preprint
arXiv:1110.3564.
• Kazai, Gabriella, Jaap Kamps, and Natasa Milic-Frayling. (2012). “An Analysis of Human Factors and Label Accuracy in
Crowdsourcing Relevance Judgments.” Springer's Information Retrieval Journal: Special Issue on Crowdsourcing.
147
Recent Work (2)
• Lin, C.H. and Mausam and Weld, D.S. (2012). “Crowdsourcing Control: Moving Beyond Multiple Choice.” in in
Proceedings of the 4th Human Computation Workshop (HCOMP) at AAAI.
• Liu, C, and Y M Wang. (2012). “TrueLabel + Confusions: A Spectrum of Probabilistic Models in Analyzing Multiple
Ratings.” in Proceedings of the 29th International Conference on Machine Learning (ICML).
• Liu, Di, Randolph Bias, Matthew Lease, and Rebecca Kuipers. (2012). “Crowdsourcing for Usability Testing.” in
Proceedings of the 75th Annual Meeting of the American Society for Information Science and Technology (ASIS&T).
• Ramesh, A, A Parameswaran, Hector Garcia-Molina, and Neoklis Polyzotis. (2012). Identifying Reliable Workers Swiftly.
• Raykar, Vikas, Yu, S., Zhao, L.H., Valadez, G.H., Florin, C., Bogoni, L., and Moy, L. (2010). “Learning From Crowds.” Journal
of Machine Learning Research 11:1297-1322.
• Raykar, Vikas, Yu, S., Zhao, L.H., Jerebko, A., Florin, C., Valadez, G.H., Bogoni, L., and Moy, L. (2009). “Supervised
learning from multiple experts: whom to trust when everyone lies a bit.” in Proceedings of the 26th Annual
International Conference on Machine Learning (ICML), 889-896.
• Raykar, Vikas C, and Shipeng Yu. (2012). “Eliminating Spammers and Ranking Annotators for Crowdsourced Labeling
Tasks.” Journal of Machine Learning Research 13:491-518.
• Wauthier, Fabian L., and Michael I. Jordan. (2012). “Bayesian Bias Mitigation for Crowdsourcing.” in Advances in neural
information processing systems (NIPS).
• Weld, D.S., Mausam, and Dai, P. (2011). “Execution control for crowdsourcing.” in Proceedings of the 24th ACM
symposium adjunct on User interface software and technology (UIST).
• Weld, D.S., Mausam, and Dai, P. (2011). “Human Intelligence Needs Artificial Intelligence.” in in Proceedings of the 3rd
Human Computation Workshop (HCOMP) at AAAI.
• Welinder, Peter, Steve Branson, Serge Belongie, and Pietro Perona. (2010). “The Multidimensional Wisdom of
Crowds.” in Advances in Neural Information Processing Systems (NIPS), 2424-2432.
• Welinder, Peter, and Pietro Perona. (2010). “Online crowdsourcing: rating annotators and obtaining cost-effective
labels.” in IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 25-32.
• Whitehill, J, P Ruvolo, T Wu, J Bergsma, and J Movellan. (2009). “Whose Vote Should Count More: Optimal Integration
of Labels from Labelers of Unknown Expertise.” in Advances in Neural Information Processing Systems (NIPS).
• Yan, Y, and R Rosales. (2011). “Active learning from crowds.” in Proceedings of the 28th Annual International
Conference on Machine Learning (ICML).
148
149
The Rise of Crowd Computing - 2016The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016Matthew Lease
 
Toward Better Crowdsourcing Science
 Toward Better Crowdsourcing Science Toward Better Crowdsourcing Science
Toward Better Crowdsourcing ScienceMatthew Lease
 
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsBeyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsMatthew Lease
 
Toward Effective and Sustainable Online Crowd Work
Toward Effective and Sustainable Online Crowd WorkToward Effective and Sustainable Online Crowd Work
Toward Effective and Sustainable Online Crowd WorkMatthew Lease
 
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...Matthew Lease
 
Crowdsourcing: From Aggregation to Search Engine Evaluation
Crowdsourcing: From Aggregation to Search Engine EvaluationCrowdsourcing: From Aggregation to Search Engine Evaluation
Crowdsourcing: From Aggregation to Search Engine EvaluationMatthew Lease
 
Crowdsourcing Transcription Beyond Mechanical Turk
Crowdsourcing Transcription Beyond Mechanical TurkCrowdsourcing Transcription Beyond Mechanical Turk
Crowdsourcing Transcription Beyond Mechanical TurkMatthew Lease
 

Mehr von Matthew Lease (20)

Automated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey ResponsesAutomated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey Responses
 
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
 
Explainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopExplainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loop
 
AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd
 
Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation
 
But Who Protects the Moderators?
But Who Protects the Moderators?But Who Protects the Moderators?
But Who Protects the Moderators?
 
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
 
Fact Checking & Information Retrieval
Fact Checking & Information RetrievalFact Checking & Information Retrieval
Fact Checking & Information Retrieval
 
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
 
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
Systematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s ClothingSystematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s Clothing
 
The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)The Rise of Crowd Computing (July 7, 2016)
The Rise of Crowd Computing (July 7, 2016)
 
The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016The Rise of Crowd Computing - 2016
The Rise of Crowd Computing - 2016
 
Toward Better Crowdsourcing Science
 Toward Better Crowdsourcing Science Toward Better Crowdsourcing Science
Toward Better Crowdsourcing Science
 
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsBeyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms
 
Toward Effective and Sustainable Online Crowd Work
Toward Effective and Sustainable Online Crowd WorkToward Effective and Sustainable Online Crowd Work
Toward Effective and Sustainable Online Crowd Work
 
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
 
Crowdsourcing: From Aggregation to Search Engine Evaluation
Crowdsourcing: From Aggregation to Search Engine EvaluationCrowdsourcing: From Aggregation to Search Engine Evaluation
Crowdsourcing: From Aggregation to Search Engine Evaluation
 
Crowdsourcing Transcription Beyond Mechanical Turk
Crowdsourcing Transcription Beyond Mechanical TurkCrowdsourcing Transcription Beyond Mechanical Turk
Crowdsourcing Transcription Beyond Mechanical Turk
 

Kürzlich hochgeladen

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 

Kürzlich hochgeladen (20)

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 

  • 24. Tracking Sentiment in Online Media Brew et al., PAIS 2010 • Volunteer-crowd • Judge in exchange for access to rich content • Balance system needs with user interest • Daily updates to non- stationary distribution 24
  • 25. PHASE 2: FROM DATA COLLECTION TO HUMAN COMPUTATION 25
  • 26. What is a Computer? 26
  • 27. Human Computation • D. A. Grier, When Computers Were Human (Princeton University Press, 2005) • What was old is new • Crowdsourcing: A New Branch of Computer Science – D.A. Grier, March 29, 2011 • Tabulating the heavens: computing the Nautical Almanac in 18th-century England – M. Croarken ’03 27
  • 28. J. Pontin. Artificial Intelligence, With Help From the Humans. New York Times (March 25, 2007) The Mechanical Turk 28 Constructed and unveiled in 1770 by Wolfgang von Kempelen (1734–1804)
  • 29. The Human Processing Unit (HPU) • Davis et al. (2010) HPU 29
  • 30. Human Computation • Having people do stuff instead of computers • Investigates use of people to execute certain computations for which capabilities of current automated methods are more limited • Explores the metaphor of computation for characterizing attributes, capabilities, and limitations of human task performance 30
  • 33. Translation by monolingual speakers • C. Hu, CHI 2009 33
  • 34. Soylent: A Word Processor with a Crowd Inside • Bernstein et al., UIST 2010 34
  • 35. fold.it S. Cooper et al. (2010) Alice G. Walton. Online Gamers Help Solve Mystery of Critical AIDS Virus Enzyme. The Atlantic, October 8, 2011. 35
  • 36. PlateMate (Noronha et al., UIST 2011) 36
  • 37. Image Analysis and more: Eatery 37
  • 38. VizWiz – Bigham et al. (UIST 2010) 38
  • 39. 39
  • 41. SO WHAT IS CROWDSOURCING? 41
  • 42. 42
  • 43. From Outsourcing to Crowdsourcing • Take a job traditionally performed by a known agent (often an employee) • Outsource it to an undefined, generally large group of people via an open call • New application of principles from open source movement • Evolving & broadly defined ... 43
  • 44. Crowdsourcing models • Micro-tasks & citizen science • Co-Creation • Open Innovation, Contests • Prediction Markets • Crowd Funding and Charity • “Gamification” (not serious gaming) • Transparent • cQ&A, Social Search, and Polling • Physical Interface/Task 44
  • 45. What is Crowdsourcing? • Mechanisms and methodology for directing crowd action to achieve some goal(s) – E.g., novel ways of collecting data from crowds • Powered by internet-connectivity • Related topics: – Human computation – Collective intelligence – Crowd/Social computing – Wisdom of Crowds – People services, Human Clouds, Peer-production, … 45
  • 46. What is not crowdsourcing? • Analyzing existing datasets (no matter source) – Data mining – Visual analytics • Use of few people – Mixed-initiative design – Active learning • Conducting a survey or poll… (*) – Novelty? 46
  • 47. Crowdsourcing Key Questions • What are the goals? – Purposeful directing of human activity • How can you incentivize participation? – Incentive engineering – Who are the target participants? • Which model(s) are most appropriate? – How to adapt them to your context and goals? 47
  • 48. Wisdom of Crowds (WoC) Requires • Diversity • Independence • Decentralization • Aggregation Input: large, diverse sample (to increase likelihood of overall pool quality) Output: consensus or selection (aggregation) 48
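The aggregation step above can be made concrete with a little arithmetic: if k labelers judge a binary item independently and each is correct with probability p > 0.5, a majority vote is correct with the binomial tail probability below. A minimal sketch, assuming independent workers of equal accuracy (an idealization real crowds only approximate); the function name is illustrative:

```python
from math import comb

def majority_vote_accuracy(p: float, k: int) -> float:
    """Probability that a majority of k independent labelers, each correct
    with probability p, yields the correct binary label.
    Ties (possible when k is even) are counted as incorrect."""
    needed = k // 2 + 1
    return sum(comb(k, j) * p**j * (1 - p)**(k - j) for j in range(needed, k + 1))

if __name__ == "__main__":
    # With p = 0.7: k=1 -> 0.70, k=3 -> 0.78, k=5 -> 0.84, and it keeps rising with k.
    for k in (1, 3, 5, 11):
        print(k, round(majority_vote_accuracy(0.7, k), 3))
```

The same calculation also shows why independence matters: correlated errors (e.g., everyone misreading the same ambiguous guideline) do not average out.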
  • 49. What do you want to accomplish? • Create • Execute task/computation • Fund • Innovate and/or discover • Learn • Monitor • Predict 49
  • 51. Why should your crowd participate? • Earn Money (real or virtual) • Have fun (or pass the time) • Socialize with others • Obtain recognition or prestige (leaderboards, badges) • Do Good (altruism) • Learn something new • Obtain something else • Create self-serving resource Multiple incentives can often operate in parallel (*caveat) 51
  • 52. Example: Wikipedia • Earn Money (real or virtual) • Have fun (or pass the time) • Socialize with others • Obtain recognition or prestige • Do Good (altruism) • Learn something new • Obtain something else • Create self-serving resource 52
  • 53. Example: DuoLingo • Earn Money (real or virtual) • Have fun (or pass the time) • Socialize with others • Obtain recognition or prestige • Do Good (altruism) • Learn something new • Obtain something else • Create self-serving resource 53
  • 54. Example: • Earn Money (real or virtual) • Have fun (or pass the time) • Socialize with others • Obtain recognition or prestige • Do Good (altruism) • Learn something new • Obtain something else • Create self-serving resource 54
  • 55. Example: ESP 55 • Earn Money (real or virtual) • Have fun (or pass the time) • Socialize with others • Obtain recognition or prestige • Do Good (altruism) • Learn something new • Obtain something else • Create self-serving resource
  • 56. Example: fold.it • Earn Money (real or virtual) • Have fun (or pass the time) • Socialize with others • Obtain recognition or prestige • Do Good (altruism) • Learn something new • Obtain something else • Create self-serving resource 56
  • 57. Example: FreeRice • Earn Money (real or virtual) • Have fun (or pass the time) • Socialize with others • Obtain recognition or prestige • Do Good (altruism) • Learn something new • Obtain something else • Create self-serving resource 57
  • 58. Example: cQ&A • Earn Money (real or virtual) • Have fun (or pass the time) • Socialize with others • Obtain recognition or prestige • Do Good (altruism) • Learn something new • Obtain something else • Create self-serving resource 58
  • 59. Example: reCaptcha • Earn Money (real or virtual) • Have fun (or pass the time) • Socialize with others • Obtain recognition or prestige • Do Good (altruism) • Learn something new • Obtain something else • Create self-serving resource 59 Is there an existing human activity you can harness for another purpose?
  • 60. Example: Mechanical Turk • Earn Money (real or virtual) • Have fun (or pass the time) • Socialize with others • Obtain recognition or prestige • Do Good (altruism) • Learn something new • Obtain something else • Create self-serving resource 60
  • 61. Dan Pink – YouTube video “The Surprising Truth about what Motivates us” 61
  • 62. Who are the workers? • A. Baio, November 2008. The Faces of Mechanical Turk. • P. Ipeirotis. March 2010. The New Demographics of Mechanical Turk • J. Ross, et al. Who are the Crowdworkers?... CHI 2010. 62
  • 63. MTurk Demographics • 2008-2009 studies found less global and diverse than previously thought – US – Female – Educated – Bored – Money is secondary 63
  • 64. 2010 shows increasing diversity: 47% US, 34% India, 19% other (P. Ipeirotis, March 2010) 64
  • 65. How Much to Pay? • Price commensurate with task effort – Ex: $0.02 for yes/no answer + $0.02 bonus for optional feedback • Ethics & market-factors: W. Mason and S. Suri, 2010. – e.g. non-profit SamaSource involves workers in refugee camps – Predict right price given market & task: Wang et al. CSDM’11 • Uptake & time-to-completion vs. Cost & Quality – Too little $$, no interest or slow – too much $$, attract spammers – Real problem is lack of reliable QA substrate • Accuracy & quantity – More pay = more work, not better (W. Mason and D. Watts, 2009) • Heuristics: start small, watch uptake and bargaining feedback • Worker retention (“anchoring”) 65 See also: L.B. Chilton et al. KDD-HCOMP 2010.
  • 66. MTURK & OTHER PLATFORMS 66
  • 67. Does anyone really use it? Yes! http://www.mturk-tracker.com (P. Ipeirotis’10) From 1/09 – 4/10, 7M HITs from 10K requestors worth $500,000 USD (significant under-estimate) 67
  • 68. MTurk: The Requester • Sign up with your Amazon account • Amazon payments • Purchase prepaid HITs • There is no minimum or up-front fee • MTurk collects a 10% commission • The minimum commission charge is $0.005 per HIT 68
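Combining the pricing heuristics of slide 65 with the fee figures quoted above gives a quick budget estimate before launching a batch. A hedged sketch using the slide's historical numbers (Amazon's current fee schedule differs, so treat the defaults as placeholders); function and parameter names are mine:

```python
def reward_for_target_wage(hourly_wage: float, seconds_per_hit: float) -> float:
    """Back out a per-assignment reward from a target effective hourly wage."""
    return round(hourly_wage * seconds_per_hit / 3600.0, 2)

def estimate_batch_cost(num_hits: int, assignments_per_hit: int, reward: float,
                        commission_rate: float = 0.10,      # 10% per the slide (historical)
                        min_commission: float = 0.005) -> float:  # $0.005 minimum per the slide
    """Rough requester-side cost: worker rewards plus platform commission."""
    n = num_hits * assignments_per_hit
    fee = max(reward * commission_rate, min_commission)
    return n * (reward + fee)

if __name__ == "__main__":
    reward = reward_for_target_wage(hourly_wage=6.0, seconds_per_hit=30)   # -> $0.05
    print(reward, estimate_batch_cost(num_hits=1000, assignments_per_hit=5, reward=reward))
```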
  • 69. MTurk Dashboard • Three tabs – Design – Publish – Manage • Design – HIT Template • Publish – Make work available • Manage – Monitor progress 69
  • 70. 70
  • 72. MTurk API • Amazon Web Services API • Rich set of services • Command line tools • More flexibility than dashboard 72
  • 73. MTurk Dashboard vs. API • Dashboard – Easy to prototype – Setup and launch an experiment in a few minutes • API – Ability to integrate AMT as part of a system – Ideal if you want to run experiments regularly – Schedule tasks 73
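For the API route, the present-day equivalent of the toolkit described above is the MTurk client in boto3, which postdates these slides. A minimal sketch of publishing a HIT against the requester sandbox (assumes boto3 is installed and AWS credentials are configured); the title, reward, and question HTML are placeholders, and a production HIT also needs the answer-submission wiring described in the MTurk documentation:

```python
import boto3

# Use the requester sandbox so experiments cost nothing while you iterate.
SANDBOX = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"
mturk = boto3.client("mturk", region_name="us-east-1", endpoint_url=SANDBOX)

print("Balance:", mturk.get_account_balance()["AvailableBalance"])

# Bare-bones HTMLQuestion; the real form/submit markup is omitted for brevity.
question_xml = """
<HTMLQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2011-11-11/HTMLQuestion.xsd">
  <HTMLContent><![CDATA[
    <html><body>
      <p>Is this search result relevant to the query "jaguar speed"?</p>
    </body></html>
  ]]></HTMLContent>
  <FrameHeight>450</FrameHeight>
</HTMLQuestion>
"""

hit = mturk.create_hit(
    Title="Judge the relevance of a search result",
    Description="Read a query and a result snippet, then answer yes/no.",
    Keywords="relevance, search, labeling",
    Reward="0.05",                    # dollars, passed as a string
    MaxAssignments=5,                 # redundant labels for quality control
    LifetimeInSeconds=24 * 3600,
    AssignmentDurationInSeconds=600,
    Question=question_xml,
)
print("HITId:", hit["HIT"]["HITId"])
```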
  • 74. 74 • Multiple Channels • Gold-based tests • Only pay for “trusted” judgments
  • 75. More Crowd Labor Platforms • Clickworker • CloudCrowd • CloudFactory • CrowdSource • DoMyStuff • Microtask • MobileWorks (by Anand Kulkarni ) • myGengo • SmartSheet • vWorker • Industry heavy-weights – Elance – Liveops – oDesk – uTest • and more… 75
  • 76. Many Factors Matter! • Process – Task design, instructions, setup, iteration • Choose crowdsourcing platform (or roll your own) • Human factors – Payment / incentives, interface and interaction design, communication, reputation, recruitment, retention • Quality Control / Data Quality – Trust, reliability, spam detection, consensus labeling • Don’t write a paper saying “we collected data from MTurk & then…” – details of method matter! 76
  • 79. Kulkarni et al., CSCW 2012 Turkomatic 79
  • 80. CrowdForge: Workers perform a task or further decompose it 80 Kittur et al., CHI 2011
  • 81. Kittur et al., CrowdWeaver, CSCW 2012 81
  • 83. Typical Workflow • Define and design what to test • Sample data • Design the experiment • Run experiment • Collect data and analyze results • Quality control 83
  • 84. Development Framework • Incremental approach (from Omar Alonso) • Measure, evaluate, and adjust as you go • Suitable for repeatable tasks 84
  • 85. Survey Design • One of the most important parts • Part art, part science • Instructions are key • Prepare to iterate 85
  • 86. Questionnaire Design • Ask the right questions • Workers may not be IR experts so don’t assume the same understanding in terms of terminology • Show examples • Hire a technical writer – Engineer writes the specification – Writer communicates 86
  • 87. UX Design • Time to apply all those usability concepts • Generic tips – Experiment should be self-contained. – Keep it short and simple. Brief and concise. – Be very clear with the relevance task. – Engage with the worker. Avoid boring stuff. – Always ask for feedback (open-ended question) in an input box. 87
  • 88. UX Design - II • Presentation • Document design • Highlight important concepts • Colors and fonts • Need to grab attention • Localization 88
  • 89. Implementation • Similar to a UX • Build a mock up and test it with your team – Yes, you need to judge some tasks • Incorporate feedback and run a test on MTurk with a very small data set – Time the experiment – Do people understand the task? • Analyze results – Look for spammers – Check completion times • Iterate and modify accordingly 89
  • 90. Implementation – II • Introduce quality control – Qualification test – Gold answers (honey pots) • Adjust passing grade and worker approval rate • Run experiment with new settings & same data • Scale on data • Scale on workers 90
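The gold-answer step above is mostly bookkeeping: score each worker on the trap items mixed into the stream, and only keep (or approve) workers who clear the passing grade on enough gold. A small sketch under those assumptions; the data layout and thresholds are illustrative, not prescribed by the slides:

```python
from collections import defaultdict

def score_against_gold(labels, gold, passing_grade=0.8, min_gold_seen=5):
    """labels: iterable of (worker_id, item_id, label); gold: {item_id: label}.
    Returns per-worker accuracy on gold items and the set of workers who pass."""
    hits = defaultdict(int)
    seen = defaultdict(int)
    for worker, item, label in labels:
        if item in gold:                       # this item is a honey pot
            seen[worker] += 1
            hits[worker] += int(label == gold[item])
    accuracy = {w: hits[w] / seen[w] for w in seen}
    passed = {w for w, acc in accuracy.items()
              if seen[w] >= min_gold_seen and acc >= passing_grade}
    return accuracy, passed

if __name__ == "__main__":
    gold = {"q7": "relevant", "q12": "not_relevant"}
    labels = [("w1", "q7", "relevant"),     ("w1", "q12", "not_relevant"),
              ("w2", "q7", "not_relevant"), ("w2", "q12", "not_relevant")]
    print(score_against_gold(labels, gold, passing_grade=0.8, min_gold_seen=2))
```

As later slides caution, rotate gold items and keep them indistinguishable from regular work so workers cannot learn to recognize them.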
  • 91. Other design principles • Text alignment • Legibility • Reading level: complexity of words and sentences • Attractiveness (worker’s attention & enjoyment) • Multi-cultural / multi-lingual • Who is the audience (e.g. target worker community) – Special needs communities (e.g. simple color blindness) • Parsimony • Cognitive load: mental rigor needed to perform task • Exposure effect 91
  • 92. The human side • As a worker – I hate when instructions are not clear – I’m not a spammer – I just don’t get what you want – Boring task – A good pay is ideal but not the only condition for engagement • As a requester – Attrition – Balancing act: a task that would produce the right results and is appealing to workers – I want your honest answer for the task – I want qualified workers; system should do some of that for me • Managing crowds and tasks is a daily activity – more difficult than managing computers 92
  • 94. When to assess quality of work • Beforehand (prior to main task activity) – How: “qualification tests” or similar mechanism – Purpose: screening, selection, recruiting, training • During – How: assess labels as worker produces them • Like random checks on a manufacturing line – Purpose: calibrate, reward/penalize, weight • After – How: compute accuracy metrics post-hoc – Purpose: filter, calibrate, weight, retain (HR) – E.g. Jung & Lease (2011), Tang & Lease (2011), ... 94
  • 95. How do we measure work quality? • Compare worker’s label vs. – Known (correct, trusted) label – Other workers’ labels • P. Ipeirotis. Worker Evaluation in Crowdsourcing: Gold Data or Multiple Workers? Sept. 2010. – Model predictions of the above • Model the labels (Ryu & Lease, ASIS&T11) • Model the workers (Chen et al., AAAI’10) • Verify worker’s label – Yourself – Tiered approach (e.g. Find-Fix-Verify) • Quinn and B. Bederson’09, Bernstein et al.’10 95
  • 96. Typical Assumptions • Objective truth exists – no minority voice / rare insights – Can relax this to model “truth distribution” • Automatic answer comparison/evaluation – What about free text responses? Hope from NLP… • Automatic essay scoring • Translation (BLEU: Papineni, ACL’2002) • Summarization (Rouge: C.Y. Lin, WAS’2004) – Have people do it (yourself or find-verify crowd, etc.) 96
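When answers are free text, exact-match comparison breaks down; a crude but serviceable stand-in for the n-gram metrics named above is token-overlap F1 between a worker's answer and a reference answer. A rough sketch, not an implementation of BLEU or ROUGE:

```python
from collections import Counter

def token_f1(answer: str, reference: str) -> float:
    """Unigram-overlap F1 between two free-text strings
    (a very rough proxy for the n-gram metrics cited on the slide)."""
    a, r = answer.lower().split(), reference.lower().split()
    overlap = sum((Counter(a) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(a), overlap / len(r)
    return 2 * precision * recall / (precision + recall)

print(token_f1("the jaguar is a large cat", "jaguars are large cats"))   # -> 0.2
```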
  • 97. Distinguishing Bias vs. Noise • Ipeirotis (HComp 2010) • People often have consistent, idiosyncratic skews in their labels (bias) – E.g. I like action movies, so they get higher ratings • Once detected, systematic bias can be calibrated for and corrected (yeah!) • Noise, however, seems random & inconsistent – this is the real issue we want to focus on 97
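The bias-versus-noise distinction shows up directly in a per-worker confusion matrix against gold or consensus labels: a biased worker's errors concentrate in particular off-diagonal cells (and can be calibrated away), while a noisy worker's errors scatter. A minimal sketch, with hypothetical data illustrating the two patterns:

```python
from collections import Counter, defaultdict

def confusion_by_worker(labels, reference):
    """labels: iterable of (worker_id, item_id, label); reference: {item_id: label}.
    Returns {worker_id: Counter((reference_label, worker_label) -> count)}."""
    cm = defaultdict(Counter)
    for worker, item, label in labels:
        if item in reference:
            cm[worker][(reference[item], label)] += 1
    return cm

if __name__ == "__main__":
    ref = {"m1": "like", "m2": "dislike", "m3": "dislike"}
    labels = [
        ("action_fan", "m1", "like"), ("action_fan", "m2", "like"),
        ("action_fan", "m3", "like"),            # errors pile up in one cell: systematic skew
        ("random_clicker", "m1", "dislike"), ("random_clicker", "m2", "like"),
        ("random_clicker", "m3", "dislike"),     # errors scattered: noise, not bias
    ]
    for worker, counts in confusion_by_worker(labels, ref).items():
        print(worker, dict(counts))
```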
  • 98. Comparing to known answers • AKA: gold, honey pot, verifiable answer, trap • Assumes you have known answers • Cost vs. Benefit – Producing known answers (experts?) – % of work spent re-producing them • Finer points – Controls against collusion – What if workers recognize the honey pots? 98
  • 99. Comparing to other workers • AKA: consensus, plurality, redundant labeling • Well-known metrics for measuring agreement • Cost vs. Benefit: % of work that is redundant • Finer points – Is consensus “truth” or systematic bias of group? – What if no one really knows what they’re doing? • Low-agreement across workers indicates problem is with the task (or a specific example), not the workers – Risk of collusion • Sheng et al. (KDD 2008) 99
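Redundant labeling supports both steps at once: take the majority as the consensus label, and score each worker against the majority of the other workers (leave-one-out), so nobody votes on their own reference. A sketch under the slide's objective-truth assumption; ties are broken arbitrarily:

```python
from collections import Counter, defaultdict

def consensus_and_agreement(labels):
    """labels: iterable of (worker_id, item_id, label).
    Returns (majority label per item, leave-one-out agreement rate per worker)."""
    by_item = defaultdict(list)
    for worker, item, label in labels:
        by_item[item].append((worker, label))

    consensus = {item: Counter(l for _, l in votes).most_common(1)[0][0]
                 for item, votes in by_item.items()}

    agree, total = defaultdict(int), defaultdict(int)
    for votes in by_item.values():
        for worker, label in votes:
            others = [l for w, l in votes if w != worker]
            if not others:
                continue                                  # nobody to compare against
            peer_majority = Counter(others).most_common(1)[0][0]  # ties broken arbitrarily
            total[worker] += 1
            agree[worker] += int(label == peer_majority)
    agreement = {w: agree[w] / total[w] for w in total}
    return consensus, agreement

if __name__ == "__main__":
    labels = [("w1", "d1", "rel"), ("w2", "d1", "rel"), ("w3", "d1", "nonrel"),
              ("w1", "d2", "nonrel"), ("w2", "d2", "nonrel"), ("w3", "d2", "nonrel")]
    print(consensus_and_agreement(labels))
```

As the slide notes, uniformly low agreement is usually a signal about the task or the item, not about the workers.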
  • 100. Comparing to predicted label • Ryu & Lease, ASIS&T11 • Catch-22 extremes – If model is really bad, why bother comparing? – If model is really good, why collect human labels? • Exploit model confidence – Trust predictions proportional to confidence – What if model very confident and wrong? • Active learning – Time sensitive: Accuracy / confidence changes 100
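One way to act on "trust predictions proportional to confidence" is to grade a worker only on items where the model is confident, which softens the catch-22 above (though, as the slide warns, a confidently wrong model still misleads). A sketch assuming the model exposes a predicted label and probability; names are illustrative:

```python
def grade_with_model(worker_labels, model_predictions, confidence_threshold=0.9):
    """worker_labels: {item_id: label}; model_predictions: {item_id: (label, prob)}.
    Returns (agreement rate on confidently predicted items, number of items graded)."""
    graded, agreed = 0, 0
    for item, worker_label in worker_labels.items():
        pred_label, prob = model_predictions.get(item, (None, 0.0))
        if prob < confidence_threshold:
            continue                      # model too unsure to serve as a reference
        graded += 1
        agreed += int(worker_label == pred_label)
    return (agreed / graded if graded else None), graded

if __name__ == "__main__":
    worker = {"d1": "rel", "d2": "rel", "d3": "nonrel"}
    model = {"d1": ("rel", 0.97), "d2": ("nonrel", 0.55), "d3": ("nonrel", 0.93)}
    print(grade_with_model(worker, model))   # -> (1.0, 2): d2 skipped as low-confidence
```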
  • 101. Compare to predicted worker labels • Chen et al., AAAI’10 • Avoid inefficiency of redundant labeling – See also: Dekel & Shamir (COLT’2009) • Train a classifier for each worker • For each example labeled by a worker – Compare to predicted labels for all other workers • Issues • Sparsity: workers have to stick around to train model… • Time-sensitivity: New workers & incremental updates? 101
  • 102. Methods for measuring agreement • What to look for – Agreement, reliability, validity • Inter-agreement level – Agreement between judges – Agreement between judges and the gold set • Some statistics – Percentage agreement – Cohen’s kappa (2 raters) – Fleiss’ kappa (any number of raters) – Krippendorff’s alpha • With majority vote, what if 2 say relevant, 3 say not? – Use expert to break ties (Kochhar et al, HCOMP’10; GQR) – Collect more judgments as needed to reduce uncertainty 102
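Of the statistics listed above, percentage agreement and Cohen's kappa for two raters are short enough to compute by hand; kappa subtracts the agreement expected by chance given each rater's own label frequencies. A from-scratch sketch (in practice scikit-learn's cohen_kappa_score, and dedicated packages for Fleiss' kappa and Krippendorff's alpha, are the usual route):

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of items on which two raters gave the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa for two raters over the same items:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is chance agreement from each rater's marginal label distribution."""
    n = len(a)
    p_o = percent_agreement(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    p_e = sum((freq_a[label] / n) * (freq_b[label] / n)
              for label in set(a) | set(b))
    if p_e == 1.0:
        return 1.0                        # degenerate case: no room above chance
    return (p_o - p_e) / (1 - p_e)

if __name__ == "__main__":
    r1 = ["rel", "rel", "nonrel", "rel", "nonrel", "nonrel"]
    r2 = ["rel", "nonrel", "nonrel", "rel", "nonrel", "rel"]
    print(percent_agreement(r1, r2), round(cohens_kappa(r1, r2), 3))   # 0.667 and 0.333
```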
  • 103. Other practical tips • Sign up as worker and do some HITs • “Eat your own dog food” • Monitor discussion forums • Address feedback (e.g., poor guidelines, payments, passing grade, etc.) • Everything counts! – Overall design only as strong as weakest link 103
  • 105. Why Eytan Adar hates MTurk Research (CHI 2011 CHC Workshop) • Overly-narrow focus on MTurk – Identify general vs. platform-specific problems – Academic vs. Industrial problems • Inattention to prior work in other disciplines • Turks aren’t Martians – Just human behavior… 105
  • 106. What about sensitive data? • Not all data can be publicly disclosed – User data (e.g. AOL query log, Netflix ratings) – Intellectual property – Legal confidentiality • Need to restrict who is in your crowd – Separate channel (workforce) from technology – Hot question for adoption at enterprise level 106
  • 107. A Few Open Questions • How should we balance automation vs. human computation? Which does what? • Who’s the right person for the job? • How do we handle complex tasks? Can we decompose them into smaller tasks? How? 107
  • 108. What about ethics? • Silberman, Irani, and Ross (2010) – “How should we… conceptualize the role of these people who we ask to power our computing?” – Power dynamics between parties • What are the consequences for a worker when your actions harm their reputation? – “Abstraction hides detail” • Fort, Adda, and Cohen (2011) – “…opportunities for our community to deliberately value ethics above cost savings.” 108
  • 110. Davis et al. (2010) The HPU. HPU 110
  • 111. HPU: “Abstraction hides detail” • Not just turning a mechanical crank 111
  • 112. Micro-tasks & Task Decomposition • Small, simple tasks can be completed faster by reducing extraneous context and detail – e.g. “Can you name who is in this photo?” • Current workflow research investigates how to decompose complex tasks into simpler ones 112
  • 113. Context & Informed Consent • What is the larger task I’m contributing to? • Who will benefit from it and how? 113
  • 114. Worker Privacy Each worker is assigned an alphanumeric ID 114
  • 115. Requesters see only Worker IDs 115
  • 116. Issues of Identity Fraud • Compromised & exploited worker accounts • Sybil attacks: use of multiple worker identities • Script bots masquerading as human workers 116 Robert Sim, MSR Faculty Summit’12
  • 117. Safeguarding Personal Data • “What are the characteristics of MTurk workers?... the MTurk system is set up to strictly protect workers’ anonymity….” 117
  • 118. Amazon profile page URLs use the same IDs used on MTurk! Paper: MTurk is Not Anonymous 118
  • 119. What about regulation? • Wolfson & Lease (ASIS&T 2011) • As usual, technology is ahead of the law – employment law – patent inventorship – data security and the Federal Trade Commission – copyright ownership – securities regulation of crowdfunding • Take-away: don’t panic, but be mindful – Understand risks of “just-in-time compliance” 119
  • 120. Digital Dirty Jobs • NY Times: Policing the Web’s Lurid Precincts • Gawker: Facebook content moderation • CultureDigitally: The dirty job of keeping Facebook clean • Even LDC annotators reading typical news articles report stress & nightmares! 120
  • 121. Jeff Howe Vision vs. Reality? • Vision of empowering worker freedom: – work whenever you want for whomever you want • When $$$ is at stake, populations at risk may be compelled to perform work by others – Digital sweat shops? Digital slaves? – We really don’t know (and need to learn more…) – Traction? Human Trafficking at MSR Summit’12 121
  • 122. A DARK SIDE OF CROWDSOURCING 122
  • 123. Putting the shoe on the other foot: Spam 123
  • 124. What about trust? • Some reports of robot “workers” on MTurk – E.g. McCreadie et al. (2011) – Violates terms of service • Why not just use a captcha? 124
  • 126. Requester Fraud on MTurk “Do not do any HITs that involve: filling in CAPTCHAs; secret shopping; test our web page; test zip code; free trial; click my link; surveys or quizzes (unless the requester is listed with a smiley in the Hall of Fame/Shame); anything that involves sending a text message; or basically anything that asks for any personal information at all—even your zip code. If you feel in your gut it’s not on the level, IT’S NOT. Why? Because they are scams...” 126
  • 127. Defeating CAPTCHAs with crowds 127
  • 128. Gaming the System: SEO, etc.
  • 130. Robert Sim, MSR Summit’12 130
  • 131. Conclusion • Crowdsourcing is quickly transforming practice in industry and academia via greater efficiency • Crowd computing enables a new design space for applications, augmenting state-of-the-art AI with human computation to offer new capabilities and user experiences • With people at the center of this new computing paradigm, important research questions bridge technological & social considerations 131
  • 132. The Future of Crowd Work Paper @ ACM CSCW 2013 Kittur, Nickerson, Bernstein, Gerber, Shaw, Zimmerman, Lease, and Horton 132
  • 133. Brief Digression: Information Schools • At 30 universities in N. America, Europe, Asia • Study human-centered aspects of information technologies: design, implementation, policy, … 133 www.ischools.org Wobbrock et al., 2009
  • 135. Surveys • Aniket Kittur, Jeffrey V. Nickerson, Michael S. Bernstein, Elizabeth Gerber, Aaron Shaw, John Zimmerman, Matthew Lease, and John J. Horton. The Future of Crowd Work. In ACM Computer Supported Cooperative Work (CSCW), February 2013. • Alex Quinn and Ben Bederson. Human Computation: A Survey and Taxonomy of a Growing Field. In Proceedings of CHI 2011. • Law and von Ahn (2011). Human Computation 135
  • 136. 2013 Crowdsourcing • 1st year of HComp as AAAI conference • TREC 2013 Crowdsourcing Track • Springer’s Information Retrieval (articles online): Crowdsourcing for Information Retrieval • 4th CrowdConf (San Francisco, Fall) • 1st Crowdsourcing Week (Singapore, April) 136
  • 137. TREC Crowdsourcing Track • Year 1 (2011) – horizontals – Task 1 (hci): collect crowd relevance judgments – Task 2 (stats): aggregate judgments – Organizers: Kazai & Lease – Sponsors: Amazon, CrowdFlower • Year 2 (2012) – content types – Task 1 (text): judge relevance – Task 2 (images): judge relevance – Organizers: Ipeirotis, Kazai, Lease, & Smucker – Sponsors: Amazon, CrowdFlower, MobileWorks 137
  • 138. 2012 Workshops & Conferences • AAAI: Human Computation (HComp) (July 22-23) • AAAI Spring Symposium: Wisdom of the Crowd (March 26-28) • ACL: 3rd Workshop of the People's Web meets NLP (July 12-13) • AMCIS: Crowdsourcing Innovation, Knowledge, and Creativity in Virtual Communities(August 9-12) • CHI: CrowdCamp (May 5-6) • CIKM: Multimodal Crowd Sensing (CrowdSens) (Oct. or Nov.) • Collective Intelligence (April 18-20) • CrowdConf 2012 -- 3rd Annual Conference on the Future of Distributed Work (October 23) • CrowdNet - 2nd Workshop on Cloud Labor and Human Computation (Jan 26-27) • EC: Social Computing and User Generated Content Workshop (June 7) • ICDIM: Emerging Problem- specific Crowdsourcing Technologies (August 23) • ICEC: Harnessing Collective Intelligence with Games (September) • ICML: Machine Learning in Human Computation & Crowdsourcing (June 30) • ICWE: 1st International Workshop on Crowdsourced Web Engineering (CroWE) (July 27) • KDD: Workshop on Crowdsourcing and Data Mining (August 12) • Multimedia: Crowdsourcing for Multimedia (Nov 2) • SocialCom: Social Media for Human Computation (September 6) • TREC-Crowd: 2nd TREC Crowdsourcing Track (Nov. 14-16) • WWW: CrowdSearch: Crowdsourcing Web search (April 17) 138
  • 139. 2011 Workshops & Conferences • AAAI-HCOMP: 3rd Human Computation Workshop (Aug. 8) • ACIS: Crowdsourcing, Value Co-Creation, & Digital Economy Innovation (Nov. 30 – Dec. 2) • Crowdsourcing Technologies for Language and Cognition Studies (July 27) • CHI-CHC: Crowdsourcing and Human Computation (May 8) • CIKM: BooksOnline (Oct. 24, “crowdsourcing … online books”) • CrowdConf 2011 -- 2nd Conf. on the Future of Distributed Work (Nov. 1-2) • Crowdsourcing: Improving … Scientific Data Through Social Networking (June 13) • EC: Workshop on Social Computing and User Generated Content (June 5) • ICWE: 2nd International Workshop on Enterprise Crowdsourcing (June 20) • Interspeech: Crowdsourcing for speech processing (August) • NIPS: Second Workshop on Computational Social Science and the Wisdom of Crowds (Dec. TBD) • SIGIR-CIR: Workshop on Crowdsourcing for Information Retrieval (July 28) • TREC-Crowd: 1st TREC Crowdsourcing Track (Nov. 16-18) • UbiComp: 2nd Workshop on Ubiquitous Crowdsourcing (Sep. 18) • WSDM-CSDM: Crowdsourcing for Search and Data Mining (Feb. 9) 139
  • 140. 2011 Tutorials and Keynotes • By Omar Alonso and/or Matthew Lease – CLEF: Crowdsourcing for Information Retrieval Experimentation and Evaluation (Sep. 20, Omar only) – CrowdConf: Crowdsourcing for Research and Engineering – IJCNLP: Crowd Computing: Opportunities and Challenges (Nov. 10, Matt only) – WSDM: Crowdsourcing 101: Putting the WSDM of Crowds to Work for You (Feb. 9) – SIGIR: Crowdsourcing for Information Retrieval: Principles, Methods, and Applications (July 24) • AAAI: Human Computation: Core Research Questions and State of the Art – Edith Law and Luis von Ahn, August 7 • ASIS&T: How to Identify Ducks In Flight: A Crowdsourcing Approach to Biodiversity Research and Conservation – Steve Kelling, October 10, ebird • EC: Conducting Behavioral Research Using Amazon's Mechanical Turk – Winter Mason and Siddharth Suri, June 5 • HCIC: Quality Crowdsourcing for Human Computer Interaction Research – Ed Chi, June 14-18, about HCIC) – Also see his: Crowdsourcing for HCI Research with Amazon Mechanical Turk • Multimedia: Frontiers in Multimedia Search – Alan Hanjalic and Martha Larson, Nov 28 • VLDB: Crowdsourcing Applications and Platforms – Anhai Doan, Michael Franklin, Donald Kossmann, and Tim Kraska) • WWW: Managing Crowdsourced Human Computation – Panos Ipeirotis and Praveen Paritosh 140
  • 141. Students – Catherine Grady (iSchool) – Hyunjoon Jung (iSchool) – Jorn Klinger (Linguistics) – Adriana Kovashka (CS) – Abhimanu Kumar (CS) – Hohyon Ryu (iSchool) – Wei Tang (CS) – Stephen Wolfson (iSchool) Matt Lease - ml@ischool.utexas.edu - @mattlease Thank You! 141 ir.ischool.utexas.edu/crowd
  • 142. More Books July 2010, kindle-only: “This book introduces you to the top crowdsourcing sites and outlines step by step with photos the exact process to get started as a requester on Amazon Mechanical Turk.“ 142
  • 143. Resources A Few Blogs • Behind Enemy Lines (P.G. Ipeirotis, NYU) • Deneme: a Mechanical Turk experiments blog (Greg Little, MIT) • CrowdFlower Blog • http://experimentalturk.wordpress.com • Jeff Howe A Few Sites • The Crowdsortium • Crowdsourcing.org • CrowdsourceBase (for workers) • Daily Crowdsource MTurk Forums and Resources • Turker Nation: http://turkers.proboards.com • http://www.turkalert.com (and its blog) • Turkopticon: report/avoid shady requesters • Amazon Forum for MTurk 143
  • 144. Bibliography  J. Barr and L. Cabrera. “AI gets a Brain”, ACM Queue, May 2006.  Bernstein, M. et al. Soylent: A Word Processor with a Crowd Inside. UIST 2010. Best Student Paper award.  Bederson, B.B., Hu, C., & Resnik, P. Translation by Iteractive Collaboration between Monolingual Users, Proceedings of Graphics Interface (GI 2010), 39-46.  N. Bradburn, S. Sudman, and B. Wansink. Asking Questions: The Definitive Guide to Questionnaire Design, Jossey-Bass, 2004.  C. Callison-Burch. “Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk”, EMNLP 2009.  P. Dai, Mausam, and D. Weld. “Decision-Theoretic of Crowd-Sourced Workflows”, AAAI, 2010.  J. Davis et al. “The HPU”, IEEE Computer Vision and Pattern Recognition Workshop on Advancing Computer Vision with Human in the Loop (ACVHL), June 2010.  M. Gashler, C. Giraud-Carrier, T. Martinez. Decision Tree Ensemble: Small Heterogeneous Is Better Than Large Homogeneous, ICMLA 2008.  D. A. Grier. When Computers Were Human. Princeton University Press, 2005. ISBN 0691091579  JS. Hacker and L. von Ahn. “Matchin: Eliciting User Preferences with an Online Game”, CHI 2009.  J. Heer, M. Bobstock. “Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design”, CHI 2010.  P. Heymann and H. Garcia-Molina. “Human Processing”, Technical Report, Stanford Info Lab, 2010.  J. Howe. “Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business”. Crown Business, New York, 2008.  P. Hsueh, P. Melville, V. Sindhwami. “Data Quality from Crowdsourcing: A Study of Annotation Selection Criteria”. NAACL HLT Workshop on Active Learning and NLP, 2009.  B. Huberman, D. Romero, and F. Wu. “Crowdsourcing, attention and productivity”. Journal of Information Science, 2009.  P.G. Ipeirotis. The New Demographics of Mechanical Turk. March 9, 2010. PDF and Spreadsheet.  P.G. Ipeirotis, R. Chandrasekar and P. Bennett. Report on the human computation workshop. SIGKDD Explorations v11 no 2 pp. 80-83, 2010.  P.G. Ipeirotis. Analyzing the Amazon Mechanical Turk Marketplace. CeDER-10-04 (Sept. 11, 2010) 144
  • 145. Bibliography (2)  A. Kittur, E. Chi, and B. Suh. “Crowdsourcing user studies with Mechanical Turk”, SIGCHI 2008.  Aniket Kittur, Boris Smus, Robert E. Kraut. CrowdForge: Crowdsourcing Complex Work. CHI 2011  Adriana Kovashka and Matthew Lease. “Human and Machine Detection of … Similarity in Art”. CrowdConf 2010.  K. Krippendorff. "Content Analysis", Sage Publications, 2003  G. Little, L. Chilton, M. Goldman, and R. Miller. “TurKit: Tools for Iterative Tasks on Mechanical Turk”, HCOMP 2009.  T. Malone, R. Laubacher, and C. Dellarocas. Harnessing Crowds: Mapping the Genome of Collective Intelligence. 2009.  W. Mason and D. Watts. “Financial Incentives and the ’Performance of Crowds’”, HCOMP Workshop at KDD 2009.  J. Nielsen. “Usability Engineering”, Morgan-Kaufman, 1994.  A. Quinn and B. Bederson. “A Taxonomy of Distributed Human Computation”, Technical Report HCIL-2009-23, 2009  J. Ross, L. Irani, M. Six Silberman, A. Zaldivar, and B. Tomlinson. “Who are the Crowdworkers?: Shifting Demographics in Amazon Mechanical Turk”. CHI 2010.  F. Scheuren. “What is a Survey” (http://www.whatisasurvey.info) 2004.  R. Snow, B. O’Connor, D. Jurafsky, and A. Y. Ng. “Cheap and Fast But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks”. EMNLP-2008.  V. Sheng, F. Provost, P. Ipeirotis. “Get Another Label? Improving Data Quality … Using Multiple, Noisy Labelers” KDD 2008.  S. Weber. “The Success of Open Source”, Harvard University Press, 2004.  L. von Ahn. Games with a purpose. Computer, 39 (6), 92–94, 2006.  L. von Ahn and L. Dabbish. “Designing Games with a purpose”. CACM, Vol. 51, No. 8, 2008. 145
  • 146. Bibliography (3)  Shuo Chen et al. What if the Irresponsible Teachers Are Dominating? A Method of Training on Samples and Clustering on Teachers. AAAI 2010.  Paul Heymann, Hector Garcia-Molina: Turkalytics: analytics for human computation. WWW 2011.  Florian Laws, Christian Scheible and Hinrich Schütze. Active Learning with Amazon Mechanical Turk. EMNLP 2011.  C.Y. Lin. Rouge: A package for automatic evaluation of summaries. Proceedings of the workshop on text summarization branches out (WAS), 2004.  C. Marshall and F. Shipman “The Ownership and Reuse of Visual Media”, JCDL, 2011.  Hohyon Ryu and Matthew Lease. Crowdworker Filtering with Support Vector Machine. ASIS&T 2011.  Wei Tang and Matthew Lease. Semi-Supervised Consensus Labeling for Crowdsourcing. ACM SIGIR Workshop on Crowdsourcing for Information Retrieval (CIR), 2011.  S. Vijayanarasimhan and K. Grauman. Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. CVPR 2011.  Stephen Wolfson and Matthew Lease. Look Before You Leap: Legal Pitfalls of Crowdsourcing. ASIS&T 2011. 146
  • 147. Recent Work • Della Penna, N, and M D Reid. (2012). “Crowd & Prejudice: An Impossibility Theorem for Crowd Labelling without a Gold Standard.” in Proceedings of Collective Intelligence. Arxiv preprint arXiv:1204.3511. • Demartini, Gianluca, D.E. Difallah, and P. Cudre-Mauroux. (2012). “ZenCrowd: leveraging probabilistic reasoning and crowdsourcing techniques for large-scale entity linking.” 21st Annual Conference on the World Wide Web (WWW). • Donmez, Pinar, Jaime Carbonnel, and Jeff Schneider. (2010). “A probabilistic framework to learn from multiple annotators with time-varying accuracy.” in SIAM International Conference on Data Mining (SDM), 826-837. • Donmez, Pinar, Jaime Carbonnel, and Jeff Schneider. (2009). “Efficiently learning the accuracy of labeling sources for selective sampling.” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD), 259-268. • Fort, K., Adda, G., and Cohen, K. (2011). Amazon Mechanical Turk: Gold mine or coal mine? Computational Linguistics, 37(2):413–420. • Ghosh, A, Satyen Kale, and Preson McAfee. (2012). “Who Moderates the Moderators? Crowdsourcing Abuse Detection in User-Generated Content.” in Proceedings of the 12th ACM conference on Electronic commerce. • Ho, C J, and J W Vaughan. (2012). “Online Task Assignment in Crowdsourcing Markets.” in Twenty-Sixth AAAI Conference on Artificial Intelligence. • Jung, Hyun Joon, and Matthew Lease. (2012). “Inferring Missing Relevance Judgments from Crowd Workers via Probabilistic Matrix Factorization.” in Proceeding of the 36th international ACM SIGIR conference on Research and development in information retrieval. • Kamar, E, S Hacker, and E Horvitz. (2012). “Combining Human and Machine Intelligence in Large-scale Crowdsourcing.” in Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS). • Karger, D R, S Oh, and D Shah. (2011). “Budget-optimal task allocation for reliable crowdsourcing systems.” Arxiv preprint arXiv:1110.3564. • Kazai, Gabriella, Jaap Kamps, and Natasa Milic-Frayling. (2012). “An Analysis of Human Factors and Label Accuracy in Crowdsourcing Relevance Judgments.” Springer's Information Retrieval Journal: Special Issue on Crowdsourcing. 147
  • 148. Recent Work (2) • Lin, C.H. and Mausam and Weld, D.S. (2012). “Crowdsourcing Control: Moving Beyond Multiple Choice.” in in Proceedings of the 4th Human Computation Workshop (HCOMP) at AAAI. • Liu, C, and Y M Wang. (2012). “TrueLabel + Confusions: A Spectrum of Probabilistic Models in Analyzing Multiple Ratings.” in Proceedings of the 29th International Conference on Machine Learning (ICML). • Liu, Di, Ranolph Bias, Matthew Lease, and Rebecca Kuipers. (2012). “Crowdsourcing for Usability Testing.” in Proceedings of the 75th Annual Meeting of the American Society for Information Science and Technology (ASIS&T). • Ramesh, A, A Parameswaran, Hector Garcia-Molina, and Neoklis Polyzotis. (2012). Identifying Reliable Workers Swiftly. • Raykar, Vikas, Yu, S., Zhao, L.H., Valadez, G.H., Florin, C., Bogoni, L., and Moy, (2010). “Learning From Crowds.” Journal of Machine Learning Research 11:1297-1322. • Raykar, Vikas, Yu, S., Zhao, L.H., Jerebko, A., Florin, C., Valadez, G.H., Bogoni, L., and Moy, L. (2009). “Supervised learning from multiple experts: whom to trust when everyone lies a bit.” in Proceedings of the 26th Annual International Conference on Machine Learning (ICML), 889-896. • Raykar, Vikas C, and Shipeng Yu. (2012). “Eliminating Spammers and Ranking Annotators for Crowdsourced Labeling Tasks.” Journal of Machine Learning Research 13:491-518. • Wauthier, Fabian L., and Michael I. Jordan. (2012). “Bayesian Bias Mitigation for Crowdsourcing.” in Advances in neural information processing systems (NIPS). • Weld, D.S., Mausam, and Dai, P. (2011). “Execution control for crowdsourcing.” in Proceedings of the 24th ACM symposium adjunct on User interface software and technology (UIST). • Weld, D.S., Mausam, and Dai, P. (2011). “Human Intelligence Needs Artificial Intelligence.” in in Proceedings of the 3rd Human Computation Workshop (HCOMP) at AAAI. • Welinder, Peter, Steve Branson, Serge Belongie, and Pietro Perona. (2010). “The Multidimensional Wisdom of Crowds.” in Advances in Neural Information Processing Systems (NIPS), 2424-2432. • Welinder, Peter, and Pietro Perona. (2010). “Online crowdsourcing: rating annotators and obtaining cost-effective labels.” in IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 25-32. • Whitehill, J, P Ruvolo, T Wu, J Bergsma, and J Movellan. (2009). “Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise.” in Advances in Neural Information Processing Systems (NIPS). • Yan, Y, and R Rosales. (2011). “Active learning from crowds.” in Proceedings of the 28th Annual International Conference on Machine Learning (ICML). 148
  • 149. 149