Matt Lease
School of Information
The University of Texas at Austin
Adventures in Crowdsourcing:
Toward Safer Content Moderation & Better
Supporting Complex Annotation Tasks
1
Lab: ir.ischool.utexas.edu
@mattlease
Slides: slideshare.net/mattlease
Roadmap
• Context: UT Good Systems & iSchool
• Two parts to talk today
– Content Moderation
– Aggregating Complex Annotations
2
3
Goal: Design a future of Artificial Intelligence (AI)
technologies to meet society’s needs and values.
http://goodsystems.utexas.edu
Good Systems: an 8-year, $10M
UT Austin Grand Challenge
“The place where people & technology meet”
~ Wobbrock et al., 2009
“iSchools” now exist at over 100 universities around the world
4
What’s an Information School?
Anubrata Das, Brandon Dang and Matthew Lease
School of Information
The University of Texas at Austin
Fast, Accurate, and Healthier:
Interactive Blurring Helps Moderators
Reduce Exposure to Harmful Content
5
Today’s Talk: Content Moderation
- Social media platforms are hubs of user generated content
- Some types of content are unacceptable or may cause harm
- pornography & nudity, depictions of violence, hate speech, mis/disinformation
- What is considered acceptable varies by platform and region
- Further issues of free speech & due process in content removal & remediation
- e.g., Moderate Globally, Impact Locally: The Global Impacts of Content Moderation (Yale, Nov. 2020)
6
Alon Halevy et al. "Preserving integrity in online social networks." arXiv preprint, September 25, 2020.
Scale of Content Moderation
7
Paul M. Barrett (2020). Who Moderates the Social Media Giants? A Call to End Outsourcing.
Facebook, YouTube
Can’t we just use AI?
• High cost of errors -> very high accuracy required
• Continually evolving content and moderation policies
– also regional variants, cultural issues, and adversarial attacks
• While AI systems are often advertised/perceived as fully-automated, in
practice, human labor is typically required and often hidden
– Gray and Suri (2019) “ghost work”, Ekbia and Nardi (2014) “heteromation”,
Irani and Silberman (2013) “invisible work”
• Human moderators today: Facebook ~15K, YouTube ~10K
• No free lunch: human annotators still needed to create training data
8
Barr & Cabrera, ACM Queue 2006
9
“Software developers with innovative ideas for businesses
and technologies are constrained by the limits of artificial
intelligence… If software developers could programmatically
access and incorporate human intelligence into their
applications, a whole new class of innovative businesses
and applications would be possible. This is the goal of
Amazon Mechanical Turk… people are freer to innovate because
they can now imbue software with real human intelligence.”
10
Implication on Moderators
“The psychological effects of viewing harmful content is well
documented, with reports of moderators experiencing
posttraumatic stress disorder (PTSD) symptoms and other
mental health issues as a result of the disturbing content they
are exposed to.” (Cambridge Consultants, 2019)
11
“From my own interviews with more than 100 moderators… a
significant number [get PTSD]. And many other employees
develop long-lasting mental health symptoms that stop short
of full-blown PTSD, including depression, anxiety, and
insomnia.” (Casey Newton, 2020)
Volume quotas (akin to a call center) - “constant measurement
for accuracy is as pressurizing as a quota” (Dwoskin 2019)
Image Source: The Verge
The Great Irony
12
The sort of task we most want an algorithm to do
(emotionally disturbing) is what people are doing
because the algorithm isn’t good enough
BUT WHO PROTECTS THE
MODERATORS? (HCOMP 2018)
BRANDON DANG (1), MARTIN J. RIEDL (2), AND MATTHEW LEASE (1)
(1) School of Information & (2) School of Journalism (both students contributed equally)
The University of Texas at Austin
AAAI HCOMP -&- ACM Collective Intelligence
July 2018, Zurich, Switzerland
Research Question
14
By revealing less of an image, can we reduce the emotional
labor of image moderation without compromising
moderator accuracy and efficiency?
Design and Demo
http://ir.ischool.utexas.edu/CM/demo/
15
Dang, Brandon, Martin J. Riedl, and Matthew Lease. "But who protects the moderators? The case of crowdsourced image moderation." arXiv preprint arXiv:1804.10999 (2018).
Code: https://github.com/budang/content-moderation
Exposure and Control
“shielding moderators from harm begins with giving them
more control of what they’re seeing and how they’re seeing it,
so just the existence of ...preferences helps” (Sullivan 2019)
16
“Scientifically, do we know how much [exposure] is too much?
The answer is no, we don’t... If there’s something that were to
keep me up at night... it’s that question”
(Facebook psychologist Chris Harrison)
“Finding the right balance between content reviewer well-
being and resiliency, quality, and productivity is very
challenging at the scale we operate in. We are continually
working to get this balance right.” (Facebook’s Carolyn
Glanville)
Source: https://images.fastcompany.net/image/upload/w_596,c_limit,q_auto:best,f_auto/wp-cms/uploads/2019/06/Quick-Settings.png
Exposure and Control
- Industry moving towards establishing best practices for providing control & tools
- Such interventions include grayscaling, muting videos, and blurring
- Not well understood how effective such practices are
- Google: Ramakrishnan and Karunakaran (HCOMP 2019) report that grayscaling of images and videos reduces harm; they also study static blurring
Source: https://docs.microsoft.com/en-us/azure/cognitive-services/content-moderator/images/video-review-default-view.png;
https://docs.microsoft.com/en-us/azure/cognitive-services/content-moderator/
19
HCOMP’20: MTurk Moderation Task
20
Survey: Well-being and Usability
21
01 Positive and Negative Experience
5-point Likert scale: how often they experience the following emotions: positive, negative, good, bad, pleasant, unpleasant, etc. (SPANE)
(Diener et al., 2010)
02 Positive and Negative Affect
7-point Likert scale: what emotions they are currently feeling (I-PANAS-SF)
(Thompson 2007)
03 Emotional Exhaustion
Slightly modified version of emotional exhaustion scale
(Wharton 1993) (Cates and Howe 2015)
04 Usefulness
Perceived usefulness and perceived ease of use
(Davis 1989; Venkatesh and Davis 2000)
Experiment
22
- Random sample of 60 synthetic & real images
across categories: 180 total images
- Divided into groups of 9, balanced over classes
- 20 HITs, five workers per HIT
- Workers restricted to a single HIT
- Adult content qualification, >98% approval rate
with 300+ submitted HITs
- $7.25/hour
Results
Performance
- Accuracy
- Time taken
- Effort*
- # Clicks
- # Mouse Movement
Well-being
- Worker comfort
- Experience
- Affect
- Emotional Exhaustion
- Usefulness
*Brandon Dang, Miles Hutson, & Matthew Lease. MmmTurkey: A Crowdsourcing Framework for Deploying Tasks
and Recording Worker Behavior on Amazon Mechanical Turk. HCOMP 2016. https://github.com/budang/turkey-lite
Speed and Accuracy Are Not Impacted by Interactive Blurring
24
Worker Accuracy Time
Similar Effort Across Designs (except for “Click”)
25
# Clicks # Mouse Movement
Slider is Perceived to be the Most Usable Interface
26
Perceived Usefulness Perceived Ease of Use
Hover is perceived as most comfortable
27
SPANE-B score for all interventions except for click is
higher than the unblurred baseline
28
Positive and Negative Experience Overall Experience
Overall emotional exhaustion is lowest for hover
30
Increased mean positive affect with increasing level of blur
31
Positive and Negative Affect
Summary: Hover is the Champion for Adoption
32
B: Baseline, **p< 0.05, ***p< 0.005
- Slider and hover are both top performers
- Hover shows significantly lower emotional exhaustion while keeping comparatively high accuracy
- If key goal is to keep accuracy intact & reduce emotional impact, we recommend hover design
33
01 Contribution
Proposed and extensively evaluated an intervention that improves moderator well-being
02 Conclusion
Unlike static blurring, which decreases accuracy, interactive blurring improves well-being without sacrificing accuracy or speed
03 Future Work
• Qualitative Analysis
• Intelligent Unblurring
• Early warning for severity
Alex Braylan (1) and Matthew Lease (2)
(1) Dept. of Computer Science & (2) School of Information
The University of Texas at Austin
Modeling and Aggregation of Complex
Annotations via Annotation Distance
34
ml@utexas.edu
Encore: Dec 11 talk @NeurIPS Crowd Science Workshop (https://research.yandex.com/workshops/crowd/neurips-2020)
Code & Data: https://github.com/Praznat/annotationmodeling
Simple annotation & aggregation
• Classification
– sentiment analysis
– image categorization
• Ordinal rating
– product & movie reviews
– search relevance
• Multiple choice selection
– quizzes
Aggregation
• Crowdsourcing: quality control
• Experts: wisdom of crowds
• Goal: select best label available
for each item (no label fusion)
35
What’s the capital of Texas?
Austin
Austin
Houston
Majority Vote
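A minimal sketch of majority vote for simple labels such as the capital-of-Texas example. The function name `majority_vote` is illustrative and not from the talk, and breaking ties in favor of the label seen first is an assumption.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(labels)
    return counts.most_common(1)[0][0]


answers = ["Austin", "Austin", "Houston"]
winner = majority_vote(answers)
print(winner)
```

With free-text answers every label is usually unique, so the vote degenerates into an arbitrary pick, which is the failure mode the next slides describe.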
37
Caption this image:
A cat is eating
The cat eats
A beautiful picture
When majority voting falls short
Problem: large label space, exact match doesn’t work!
What about complex annotations?
Ranked lists
Parse trees
A1: A cat is eating
A2: The cat eats
A3: A beautiful picture
Image captions
Range sequences
40
Outline
• Prior work
• Approach
• Experiments
• Conclusion
41
Aggregating Simple Labels
• Hundreds of papers
• Multiple benchmarking studies
• Rich body of Bayesian modeling
• General-purpose aggregation
models for simple labels don’t
support complex labels!
Dawid-Skene MACE
Hierarchical Dawid-Skene
Item Difficulty
Logistic Random Effects
Source:
Paun et al 2018
“Comparing bayesian
models of annotation”
42
Task-specific models
• Pros:
– Task specialization
maximizes accuracy
• Cons:
– Need new model for
every task
– Complicated, difficult
to formulate
Nguyen et al 2017 (Sequences)
Lin, Mausam, and Weld 2012 (Math)
43
Task-specific workflows
• Pros:
– Empower workers
for complex tasks
• Cons:
– Need new workflow
for every task
– Complicated, difficult
to formulate
Noronha et al 2011
(image analysis)
Lasecki et al 2012
(transcription)
44
Our goals
• We want aggregation for complex data types
– Build on ideas from simple label aggregation models
• We want to generalize across many labeling tasks
– Can we reduce problem to common simpler state space?
45
Outline
• Prior work
• Approach
• Experiments
• Conclusion
46
Key Insight
• Partial credit matching via task-specific distance function
– Encapsulate task-specific label features into requester distance function
– Model annotation distances rather than annotations
– Distance functions already exist for most tasks because people need
evaluation functions to compare predicted labels vs gold
47
Distance functions
Properties of distance functions: non-negativity, symmetry, triangle inequality
Data                    Free Text     Rankings
Example evaluation fn   BLEU(x, y)
Example distance fn
Non-negativity          ✓             ✓
Symmetry                ✓             ✓
Triangle inequality     ✓             ✓
Calculate distances
• Example task: text annotation
• Example distance function: string edit distance
“a cat is eating” “cat is eating”
“a beautiful picture” “the cat eats”
Pairwise distance labels shown on the figure: 0.05, 0.1, 0.1, 0.8, 0.82, 0.82
A1: A cat is eating
A2: The cat eats
A3: A beautiful
picture
0.1 0.6
0.3
52
All tasks reduce to matrices of
annotation distances
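As a sketch of how a set of captions could reduce to a distance matrix, the code below uses character-level Levenshtein distance scaled by the longer string's length. The talk names string edit distance but does not say how it is scaled, so the normalization and all function names here are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance scaled to [0, 1] by the longer length (an assumption)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def distance_matrix(annotations, dist):
    """Square matrix of pairwise distances between one item's annotations."""
    n = len(annotations)
    return [[dist(annotations[u], annotations[v]) for v in range(n)]
            for u in range(n)]


captions = ["a cat is eating", "the cat eats", "a beautiful picture"]
D = distance_matrix(captions, normalized_distance)
for row in D:
    print([round(x, 2) for x in row])
```

Any other task only needs a different `dist` argument; the matrix-building step stays the same.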
How to aggregate given distances
• Local selection model
• Global selection model
• Combined
53
Current item
Other items
Local approach: Smallest Avg Distance
• For each item:
1. Compute average distance between
annotations for the item
2. Choose annotation with smallest
average distance
• Generalization of majority vote
• Independence between items
• Local approach does not model
annotator reliability
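The local approach can be sketched as follows, under the assumption that each item's annotations arrive as a list and the requester supplies a distance function. The word-level Jaccard distance is a stand-in chosen for brevity, not the distance used in the talk, and both function names are hypothetical.

```python
def word_jaccard_distance(a, b):
    """1 minus word-set overlap; a stand-in distance for illustration only."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return 1.0 - len(wa & wb) / len(union) if union else 0.0


def smallest_average_distance(annotations, dist):
    """Index of the annotation with the smallest mean distance to the others."""
    best_index, best_avg = None, float("inf")
    for u, a in enumerate(annotations):
        others = [dist(a, b) for v, b in enumerate(annotations) if v != u]
        avg = sum(others) / len(others) if others else 0.0
        if avg < best_avg:  # strict comparison: ties keep the earlier annotation
            best_index, best_avg = u, avg
    return best_index


captions = ["A cat is eating", "The cat eats", "A beautiful picture"]
chosen = captions[smallest_average_distance(captions, word_jaccard_distance)]
print(chosen)
```

With a 0/1 exact-match distance the same function picks a label from the largest group of identical answers, which is why it generalizes majority vote.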
54
Current item
Other items
Global approach: Best Available User
• For each annotator:
– Score by average distance over full dataset
• For each item:
– Choose label by best-scoring annotator
• Fixed annotator reliability
• Global approach does not model how
well annotators did on specific items
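A sketch of the global approach, assuming the dataset is a mapping from item to `{annotator: label}` and that an annotator's score is the mean distance to co-annotators on shared items. The data layout, the toy questions, and the annotator names are all hypothetical.

```python
from collections import defaultdict


def exact_match_distance(a, b):
    return 0.0 if a == b else 1.0


def annotator_scores(data, dist):
    """Mean distance from each annotator to co-annotators over the full dataset."""
    totals, counts = defaultdict(float), defaultdict(int)
    for labels in data.values():
        for u, a in labels.items():
            for v, b in labels.items():
                if u != v:
                    totals[u] += dist(a, b)
                    counts[u] += 1
    return {u: totals[u] / counts[u] for u in totals}


def best_available_user(data, dist):
    """For each item, take the label from its best-scoring (lowest) annotator."""
    scores = annotator_scores(data, dist)
    return {
        item: labels[min(labels, key=lambda u: scores.get(u, float("inf")))]
        for item, labels in data.items()
    }


# Toy data, invented for illustration.
data = {
    "q1": {"ann_a": "Austin", "ann_b": "Austin", "ann_c": "Houston"},
    "q2": {"ann_a": "Paris", "ann_b": "Paris", "ann_c": "Paris"},
    "q3": {"ann_a": "Lima", "ann_c": "Quito"},
}
chosen = best_available_user(data, exact_match_distance)
print(chosen)
```

On item `q3` the two annotators disagree and a local vote would tie; the global score breaks the tie toward the annotator who agreed with others more often elsewhere.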
55
Current item
Other items
Can we get best of both worlds?
• Want a method that combines:
– Best available user (global)
– Smallest avg distance (local)
• Should build on rich history of work on Bayesian annotation modeling
• Need a principled framework for modeling annotation distance matrices
Figure labels: votes, weights, weighted voting
56
Multidimensional Annotation Scaling (MAS)
• Based on Multidimensional
Scaling (Kruskal & Wish 1978)
• Probabilistic model of multi-
item distance matrices
• “Hierarchical Bayesian”
– Additional learned parameters
represent crowd effects such as
worker reliability
A cat is
eating
The cat
eats
A beautiful
picture
58
MAS Objective 1: Likelihood
Multidimensional Scaling objective:
$D_{iuv} \sim \mathcal{N}(\lVert \varepsilon_{iu} - \varepsilon_{iv} \rVert, \sigma)$
• $D_{iuv}$: observed distance
• $\varepsilon_{iu}$: annotation embedding
• $\sigma$: error scale
“a cat is eating” “cat is eating”
“a beautiful picture” “the cat eats”
Pairwise distance labels shown on the figure: 0.05, 0.1, 0.1, 0.8, 0.82, 0.82
60
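To make the likelihood term concrete, the sketch below scores a candidate set of embeddings for a single item by summing Gaussian log-densities of the observed distances around the embedding distances. It reads u and v as annotators on one item and σ as a standard deviation, both of which are assumptions; it is not the authors' implementation and it omits the prior and the inference procedure. All numbers are made up.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def log_likelihood(observed, embeddings, sigma):
    """Sum over pairs of log N(observed distance | embedding distance, sigma)."""
    total = 0.0
    for (u, v), d in observed.items():
        mu = euclidean(embeddings[u], embeddings[v])
        total += (-0.5 * math.log(2 * math.pi * sigma ** 2)
                  - (d - mu) ** 2 / (2 * sigma ** 2))
    return total


# Hypothetical observed distances between three annotations of one item.
observed = {(0, 1): 0.1, (0, 2): 0.8, (1, 2): 0.8}

# Two candidate 2-D embeddings: one consistent with the distances, one not.
good = {0: (0.0, 0.0), 1: (0.1, 0.0), 2: (0.0, 0.8)}
bad = {0: (0.0, 0.0), 1: (0.8, 0.0), 2: (0.1, 0.0)}

print(log_likelihood(observed, good, sigma=0.1))
print(log_likelihood(observed, bad, sigma=0.1))
```

Embeddings whose pairwise distances match the observed matrix get a higher log-likelihood, so fitting would push similar annotations close together.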
MAS Objective 2: Prior
“a cat is eating”
“cat is eating”
“a beautiful picture”
“the cat eats”
Pseudo-gold
65
Outline
• Prior work
• Approach
• Experiments
• Conclusion
66
Tasks & datasets
SYNTHETIC DATASETS
• Syntactic parse trees
– Distance function: evalb
• Ranked lists
– Distance function: Kendall’s tau
REAL DATASETS
• Biomedical text sequences
– Distance function: Span F1
• Urdu-English translations
– Distance function: GLEU
67
Nguyen et al 2017
Zaidan and Callison-Burch 2011
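For ranked lists, one common way to turn Kendall's tau into a distance is the normalized Kendall tau distance: the fraction of item pairs the two rankings order differently. The slides name Kendall's tau but do not give the exact formula used, so this variant is an assumption, and it expects both rankings to contain the same items with no ties.

```python
from itertools import combinations


def kendall_tau_distance(rank_a, rank_b):
    """Fraction of item pairs ordered differently by the two rankings."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    pairs = list(combinations(rank_a, 2))
    if not pairs:
        return 0.0
    discordant = sum(
        1 for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return discordant / len(pairs)


print(kendall_tau_distance(["a", "b", "c"], ["a", "c", "b"]))
```

The result is 0 for identical rankings and 1 for fully reversed ones, so it can be passed directly as the distance function in the aggregation step.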
Methods
Baselines:
• Random User (RU): pick one label randomly
• ZenCrowd (ZC) (Demartini et al. 2012)
– Weighted voting based on exact match (rare!)
• Crowd Hidden Markov Model (CHMM) (Nguyen et al. 2017)
– Sequence annotation task only
Upper bound: Oracle (OR) (always picks best label)
• Even if 5 workers answer, limited by best answer any of them gave
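The Random User baseline and the Oracle upper bound can be sketched for a single item as below. The word-overlap distance and the gold caption are hypothetical stand-ins; in the experiments the task's own evaluation function would play that role.

```python
import random


def word_jaccard_distance(a, b):
    """1 minus word-set overlap; a stand-in distance for illustration only."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return 1.0 - len(wa & wb) / len(union) if union else 0.0


def random_user(annotations, rng):
    """Baseline: pick one annotation uniformly at random."""
    return rng.choice(annotations)


def oracle(annotations, gold, dist):
    """Upper bound: pick the available annotation closest to the gold label."""
    return min(annotations, key=lambda a: dist(a, gold))


captions = ["A cat is eating", "The cat eats", "A beautiful picture"]
gold = "The cat is eating"  # hypothetical gold caption

print(random_user(captions, random.Random(0)))
print(oracle(captions, gold, word_jaccard_distance))
```

The oracle can only return one of the submitted annotations, so its score is capped by the best answer any worker gave.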
68
Results
Task          Metric  RU     ZC     CHMM   MAS    Oracle
Translations  GLEU    0.185  0.188  -      0.217  0.246
Sequences     F1      0.561  0.569  0.702  0.709  0.827
Parses        EVALB   0.812  0.819  -      0.932  0.939
Rankings              0.491  0.495  -      0.710  0.724
72
• Diverse complex label datasets
• MAS scores highest among the non-oracle methods on all four tasks, with no model alteration between datasets
Conclusion
• Goal: general-purpose probabilistic model to aggregate complex annotations
– Categorical-based methods insufficient
– Custom models difficult to design for new annotation types
• Solution: Model annotation distances via task-specific distance functions
– Transforms problem into general-purpose variable space
• Multi-dimensional Annotation Scaling (MAS)
– Allows unsupervised weighted voting with inferred annotator reliability
• Not covered in talk (see paper)
– Semi-supervised learning
– Partial credit
73
Ongoing work
• Generalization to more tasks (e.g., image bounding boxes & keypoints)
• Generalization to simple annotation tasks (“one ring to rule them all”)
• Support for multiple latent objects per item
• Merging annotations rather than selecting best one
– e.g. guessing weight of an ox
– MAS vs. non-embedding EM model, varying noise, fewer annotations, …
74
Code & Data: https://github.com/Praznat/annotationmodeling
A1: A cat is eating
A2: The cat eats
A3: A beautiful picture
Thank you!
75
Matt Lease (University of Texas at Austin)
Lab: ir.ischool.utexas.edu
@mattlease
Slides: slideshare.net/mattlease
We thank our many talented crowd workers for their contributions to our research!

Weitere ähnliche Inhalte

Was ist angesagt?

Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Databricks
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataHaluan Irsad
 
Data storytelling with personas, Utrecht
Data storytelling with personas, UtrechtData storytelling with personas, Utrecht
Data storytelling with personas, UtrechtCREATIVE COMPANION
 
This Isn't 'Big Data.' It's Just Bad Data.
This Isn't 'Big Data.' It's Just Bad Data.This Isn't 'Big Data.' It's Just Bad Data.
This Isn't 'Big Data.' It's Just Bad Data.Peter Orszag
 
Data Transformation Powerpoint Presentation Slides
Data Transformation Powerpoint Presentation SlidesData Transformation Powerpoint Presentation Slides
Data Transformation Powerpoint Presentation SlidesSlideTeam
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeKent Graziano
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for DummiesRodney Joyce
 
Tableau Visual Guidebook
Tableau Visual GuidebookTableau Visual Guidebook
Tableau Visual GuidebookAndy Kriebel
 

Was ist angesagt? (10)

Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Data storytelling with personas, Utrecht
Data storytelling with personas, UtrechtData storytelling with personas, Utrecht
Data storytelling with personas, Utrecht
 
This Isn't 'Big Data.' It's Just Bad Data.
This Isn't 'Big Data.' It's Just Bad Data.This Isn't 'Big Data.' It's Just Bad Data.
This Isn't 'Big Data.' It's Just Bad Data.
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Data Transformation Powerpoint Presentation Slides
Data Transformation Powerpoint Presentation SlidesData Transformation Powerpoint Presentation Slides
Data Transformation Powerpoint Presentation Slides
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on Snowflake
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for Dummies
 
Tableau Visual Guidebook
Tableau Visual GuidebookTableau Visual Guidebook
Tableau Visual Guidebook
 
Big data, Big decision
Big data, Big decisionBig data, Big decision
Big data, Big decision
 

Ähnlich wie Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Supporting Complex Annotation Tasks

Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Matthew Lease
 
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Matthew Lease
 
TOO4TO Module 7 / Artificial Intelligence and Sustainability: Part 3
TOO4TO Module 7 / Artificial Intelligence and Sustainability: Part 3TOO4TO Module 7 / Artificial Intelligence and Sustainability: Part 3
TOO4TO Module 7 / Artificial Intelligence and Sustainability: Part 3TOO4TO
 
But Who Protects the Moderators?
But Who Protects the Moderators?But Who Protects the Moderators?
But Who Protects the Moderators?Matthew Lease
 
Ethics and sustainability for techies
Ethics and sustainability for techiesEthics and sustainability for techies
Ethics and sustainability for techiesClaudia Melo
 
Don't look at me that way! - Understanding User Attitudes Towards Data Glasse...
Don't look at me that way! - Understanding User Attitudes Towards Data Glasse...Don't look at me that way! - Understanding User Attitudes Towards Data Glasse...
Don't look at me that way! - Understanding User Attitudes Towards Data Glasse...EISLab
 
Mobimooc Week 2 - Planning mLearning
Mobimooc Week 2 - Planning mLearningMobimooc Week 2 - Planning mLearning
Mobimooc Week 2 - Planning mLearningJudy Brown
 
Violence Detection: Introducing a Machine Learning Based Novel Method
Violence Detection: Introducing a Machine Learning Based Novel Method Violence Detection: Introducing a Machine Learning Based Novel Method
Violence Detection: Introducing a Machine Learning Based Novel Method Arindam Paul
 
Explainable AI is not yet Understandable AI
Explainable AI is not yet Understandable AIExplainable AI is not yet Understandable AI
Explainable AI is not yet Understandable AIepsilon_tud
 
TL_Thompson.pptx.ppt
TL_Thompson.pptx.pptTL_Thompson.pptx.ppt
TL_Thompson.pptx.pptRGowthamRao
 
Global Learn Conference Summary
Global Learn Conference SummaryGlobal Learn Conference Summary
Global Learn Conference SummaryMichael Coghlan
 
Violence_Detection.pptx
Violence_Detection.pptxViolence_Detection.pptx
Violence_Detection.pptxArindam Paul
 
Detection and Minimization Influence of Rumor in Social Network
Detection and Minimization Influence of Rumor in Social NetworkDetection and Minimization Influence of Rumor in Social Network
Detection and Minimization Influence of Rumor in Social NetworkIRJET Journal
 
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Saurabh Mishra
 
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...Amit Sheth
 
University Public Driven Applications - Big Data and Organizational Design
University Public Driven Applications - Big Data and Organizational Design University Public Driven Applications - Big Data and Organizational Design
University Public Driven Applications - Big Data and Organizational Design maria chiara pettenati
 
Proactive Displays IIIA 20080627
Proactive Displays IIIA 20080627Proactive Displays IIIA 20080627
Proactive Displays IIIA 20080627Joe McCarthy
 
Algorithmic Accountability & Learning Analytics (UCL)
Algorithmic Accountability & Learning Analytics (UCL)Algorithmic Accountability & Learning Analytics (UCL)
Algorithmic Accountability & Learning Analytics (UCL)Simon Buckingham Shum
 

Ähnlich wie Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Supporting Complex Annotation Tasks (20)

Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
 
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
 
TOO4TO Module 7 / Artificial Intelligence and Sustainability: Part 3
TOO4TO Module 7 / Artificial Intelligence and Sustainability: Part 3TOO4TO Module 7 / Artificial Intelligence and Sustainability: Part 3
TOO4TO Module 7 / Artificial Intelligence and Sustainability: Part 3
 
But Who Protects the Moderators?
But Who Protects the Moderators?But Who Protects the Moderators?
But Who Protects the Moderators?
 
Ethics and sustainability for techies
Ethics and sustainability for techiesEthics and sustainability for techies
Ethics and sustainability for techies
 
Don't look at me that way! - Understanding User Attitudes Towards Data Glasse...
Don't look at me that way! - Understanding User Attitudes Towards Data Glasse...Don't look at me that way! - Understanding User Attitudes Towards Data Glasse...
Don't look at me that way! - Understanding User Attitudes Towards Data Glasse...
 
Mobimooc Week 2 - Planning mLearning
Mobimooc Week 2 - Planning mLearningMobimooc Week 2 - Planning mLearning
Mobimooc Week 2 - Planning mLearning
 
Violence Detection: Introducing a Machine Learning Based Novel Method
Violence Detection: Introducing a Machine Learning Based Novel Method Violence Detection: Introducing a Machine Learning Based Novel Method
Violence Detection: Introducing a Machine Learning Based Novel Method
 
Explainable AI is not yet Understandable AI
Explainable AI is not yet Understandable AIExplainable AI is not yet Understandable AI
Explainable AI is not yet Understandable AI
 
TL_Thompson.pptx.ppt
TL_Thompson.pptx.pptTL_Thompson.pptx.ppt
TL_Thompson.pptx.ppt
 
Global Learn Conference Summary
Global Learn Conference SummaryGlobal Learn Conference Summary
Global Learn Conference Summary
 
Cpdp slides
Cpdp slidesCpdp slides
Cpdp slides
 
Violence_Detection.pptx
Violence_Detection.pptxViolence_Detection.pptx
Violence_Detection.pptx
 
Detection and Minimization Influence of Rumor in Social Network
Detection and Minimization Influence of Rumor in Social NetworkDetection and Minimization Influence of Rumor in Social Network
Detection and Minimization Influence of Rumor in Social Network
 
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
 
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...
 
University Public Driven Applications - Big Data and Organizational Design
University Public Driven Applications - Big Data and Organizational Design University Public Driven Applications - Big Data and Organizational Design
University Public Driven Applications - Big Data and Organizational Design
 
Proactive Displays IIIA 20080627
Proactive Displays IIIA 20080627Proactive Displays IIIA 20080627
Proactive Displays IIIA 20080627
 
Performance Profile
Performance ProfilePerformance Profile
Performance Profile
 
Algorithmic Accountability & Learning Analytics (UCL)
Algorithmic Accountability & Learning Analytics (UCL)Algorithmic Accountability & Learning Analytics (UCL)
Algorithmic Accountability & Learning Analytics (UCL)
 

Mehr von Matthew Lease

Automated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey ResponsesAutomated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey ResponsesMatthew Lease
 
Explainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopExplainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopMatthew Lease
 
AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd Matthew Lease
 
Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Matthew Lease
 
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Supporting Complex Annotation Tasks

  • 1. Adventures in Crowdsourcing: Toward Safer Content Moderation & Better Supporting Complex Annotation Tasks. Matt Lease, School of Information, The University of Texas at Austin. Lab: ir.ischool.utexas.edu @mattlease Slides: slideshare.net/mattlease
  • 2. Roadmap • Context: UT Good Systems & iSchool • Two parts to today’s talk – Content Moderation – Aggregating Complex Annotations
  • 3. Good Systems: an 8-year, $10M UT Austin Grand Challenge. Goal: Design a future of Artificial Intelligence (AI) technologies to meet society’s needs and values. http://goodsystems.utexas.edu
  • 4. What’s an Information School? “The place where people & technology meet” (Wobbrock et al., 2009). “iSchools” now exist at over 100 universities around the world.
  • 5. Fast, Accurate, and Healthier: Interactive Blurring Helps Moderators Reduce Exposure to Harmful Content. Anubrata Das, Brandon Dang and Matthew Lease, School of Information, The University of Texas at Austin. Lab: ir.ischool.utexas.edu @mattlease Slides: slideshare.net/mattlease
  • 6. Today’s Talk: Content Moderation - Social media platforms are hubs of user-generated content - Some types of content are unacceptable or may cause harm - pornography & nudity, depictions of violence, hate speech, mis/disinformation - What is considered acceptable varies by platform and region - Further issues of free speech & due process in content removal & remediation - e.g., Moderate Globally, Impact Locally: The Global Impacts of Content Moderation (Yale, Nov. 2020). Source: Alon Halevy et al. "Preserving integrity in online social networks." arXiv preprint, September 25, 2020.
  • 7. Scale of Content Moderation (Facebook, YouTube). Source: Paul M. Barrett. (2020). Who Moderates the Social Media Giants? A Call to End Outsourcing.
  • 8. Can’t we just use AI? • High cost of errors -> very high accuracy required • Continually evolving content and moderation policies – also regional variants, cultural issues, and adversarial attacks • While AI systems are often advertised/perceived as fully-automated, in practice, human labor is typically required and often hidden – Gray and Suri (2019) “ghost work”, Ekbia and Nardi (2014) “heteromation”, Irani and Silberman (2013) “invisible work” • Human moderators today: Facebook ~15K, YouTube ~10K • No free lunch: human annotators still needed to create training data
  • 9. Barr & Cabrera, ACM Queue 2006: “Software developers with innovative ideas for businesses and technologies are constrained by the limits of artificial intelligence… If software developers could programmatically access and incorporate human intelligence into their applications, a whole new class of innovative businesses and applications would be possible. This is the goal of Amazon Mechanical Turk… people are freer to innovate because they can now imbue software with real human intelligence.”
  • 11. Implications for Moderators. “The psychological effects of viewing harmful content is well documented, with reports of moderators experiencing posttraumatic stress disorder (PTSD) symptoms and other mental health issues as a result of the disturbing content they are exposed to.” (Cambridge Consultants, 2019) “From my own interviews with more than 100 moderators… a significant number [get PTSD]. And many other employees develop long-lasting mental health symptoms that stop short of full-blown PTSD, including depression, anxiety, and insomnia.” (Casey Newton, 2020) Volume quotas (akin to a call center) - “constant measurement for accuracy is as pressurizing as a quota” (Dwoskin 2019) Image Source: The Verge
  • 12. The Great Irony: The sort of task we most want an algorithm to do (emotionally disturbing) is what people are doing because the algorithm isn’t good enough
  • 13. But Who Protects the Moderators? (HCOMP 2018). Brandon Dang (1), Martin J. Riedl (2), and Matthew Lease (1); (1) School of Information & (2) School of Journalism, The University of Texas at Austin (both students contributed equally). AAAI HCOMP & ACM Collective Intelligence, July 2018, Zurich, Switzerland
  • 14. Research Question: By revealing less of an image, can we reduce the emotional labor of image moderation without compromising moderator accuracy and efficiency?
  • 15. Design and Demo: http://ir.ischool.utexas.edu/CM/demo/ Source: Dang, Brandon, Martin J. Riedl, and Matthew Lease. "But who protects the moderators? The case of crowdsourced image moderation." arXiv preprint arXiv:1804.10999 (2018). Code: https://github.com/budang/content-moderation
  • 16. Exposure and Control “shielding moderators from harm begins with giving them more control of what they’re seeing and how they’re seeing it, so just the existence of ...preferences helps” (Sullivan 2019) “Scientifically, do we know how much [exposure] is too much? The answer is no, we don’t... If there’s something that were to keep me up at night... it’s that question” (Facebook psychologist Chris Harrison) “Finding the right balance between content reviewer well-being and resiliency, quality, and productivity is very challenging at the scale we operate in. We are continually working to get this balance right.” (Facebook’s Carolyn Glanville) Source: https://images.fastcompany.net/image/upload/w_596,c_limit,q_auto:best,f_auto/wp-cms/uploads/2019/06/Quick-Settings.png
  • 17. Exposure and Control - Industry moving towards establishing best practices for providing control & tools
  • 19. Exposure and Control - Industry moving towards establishing best practices for providing control & tools - Such interventions include grayscaling, muting videos, and blurring - Not well understood how effective such practices are - Google: Ramakrishnan and Karunakaran (HCOMP 2019) report that grayscaling of images and videos reduces harm; they also study static blurring.
  • 21. Survey: Well-being and Usability. 01 Positive and Negative Experience: 5-point Likert scale on how often they experience the following emotions: positive, negative, good, bad, pleasant, unpleasant, etc. (SPANE) (Diener et al., 2010). 02 Positive and Negative Affect: 7-point Likert scale on what emotions they are currently feeling (I-PANAS-SF) (Thompson 2007). 03 Emotional Exhaustion: slightly modified version of emotional exhaustion scale (Wharton 1993) (Cates and Howe 2015). 04 Usefulness: perceived usefulness and perceived ease of use (Davis 1989; Venkatesh and Davis 2000).
  • 22. Experiment - Random sample of 60 synthetic & real images across categories: 180 total images - Divided into groups of 9, balanced over classes - 20 HITs, five workers per HIT - Workers restricted to a single HIT - Adult content qualification, >98% approval rate with 300+ submitted HITs - $7.25/hour
  • 23. Results. Performance: - Accuracy - Time taken - Effort* (# Clicks, # Mouse Movement). Well-being: - Worker comfort - Experience - Affect - Emotional Exhaustion - Usefulness. *Brandon Dang, Miles Hutson, & Matthew Lease. MmmTurkey: A Crowdsourcing Framework for Deploying Tasks and Recording Worker Behavior on Amazon Mechanical Turk. HCOMP 2016. https://github.com/budang/turkey-lite
  • 24. Speed and accuracy are not impacted by interactive blurring (figure panels: Worker Accuracy; Time)
  • 25. Similar Effort Across Designs (except for “Click”) (figure panels: # Clicks; # Mouse Movement)
  • 26. Slider is Perceived to be the Most Usable Interface (figure panels: Perceived Usefulness; Perceived Ease of Use)
  • 27. Hover is perceived as most comfortable
  • 28. SPANE-B score for all interventions except for click is higher than the unblurred baseline (figure panels: Positive and Negative Experience; Overall Experience)
  • 29. Overall emotional exhaustion is lowest for hover
  • 30. Increased mean positive affect with increasing level of blur (figure panel: Positive and Negative Affect)
  • 31. Summary: Hover is the Champion for Adoption (B: Baseline, **p < 0.05, ***p < 0.005) - Slider and hover are both top performers - Hover shows significantly lower emotional exhaustion with comparatively high accuracy - If the key goal is to keep accuracy intact & reduce emotional impact, we recommend the hover design
  • 32. 01 Contribution: Proposed and extensively evaluated an intervention that improves moderator well-being. 02 Conclusion: As opposed to static blurring, which decreases accuracy, interactive blurring improves well-being without sacrificing accuracy and speed. 03 Future Work: • Qualitative Analysis • Intelligent Unblurring • Early warning for severity
  • 33. Modeling and Aggregation of Complex Annotations via Annotation Distance. Alex Braylan (1) and Matthew Lease (2); (1) Dept. of Computer Science & (2) School of Information, The University of Texas at Austin. ml@utexas.edu @mattlease Slides: slideshare.net/mattlease Encore: Dec 11 talk @NeurIPS Crowd Science Workshop (https://research.yandex.com/workshops/crowd/neurips-2020) Code & Data: https://github.com/Praznat/annotationmodeling
  • 34. Simple annotation & aggregation • Classification – sentiment analysis – image categorization • Ordinal rating – product & movie reviews – search relevance • Multiple choice selection – quizzes. Aggregation • Crowdsourcing: quality control • Experts: wisdom of crowds • Goal: select best label available for each item (no label fusion)
  • 35. What’s the capital of Texas? Answers: Austin, Austin, Houston
  • 36. What’s the capital of Texas? Answers: Austin, Austin, Houston. Aggregation: Majority Vote
  • 37. Caption this image: “A cat is eating”, “The cat eats”, “A beautiful picture”
  • 38. Caption this image: When majority voting falls short. Captions: “A cat is eating”, “The cat eats”, “A beautiful picture”. Problem: large label space, exact match doesn’t work!
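The contrast between the two examples above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `majority_vote` and the tie rule (return `None` when no label has a strict plurality) are assumptions, not something specified in the slides.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label, or None if the top count is tied."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no unique winner under exact matching
    return counts[0][0]

# Small label space: exact matches are common, so voting works.
capital = majority_vote(["Austin", "Austin", "Houston"])

# Large label space: every caption is distinct, so each gets one vote.
caption = majority_vote(["A cat is eating", "The cat eats", "A beautiful picture"])
```

With three distinct captions every label receives exactly one vote, so exact-match voting has nothing to choose between, even though two of the captions clearly agree in meaning.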
  • 39. What about complex annotations? Ranked lists; Parse trees; Image captions (A1: A cat is eating, A2: The cat eats, A3: A beautiful picture); Range sequences
  • 40. Outline • Prior work • Approach • Experiments • Conclusion
  • 41. Aggregating Simple Labels • Hundreds of papers • Multiple benchmarking studies • Rich body of Bayesian modeling • General-purpose aggregation models for simple labels don’t support complex labels! Example models: Dawid-Skene, MACE, Hierarchical Dawid-Skene, Item Difficulty, Logistic Random Effects. Source: Paun et al. 2018, “Comparing Bayesian models of annotation”
  • 42. Task-specific models • Pros: – Task specialization maximizes accuracy • Cons: – Need new model for every task – Complicated, difficult to formulate. Examples: Nguyen et al 2017 (Sequences); Lin, Mausam, and Weld 2012 (Math)
  • 43. Task-specific workflows • Pros: – Empower workers for complex tasks • Cons: – Need new workflow for every task – Complicated, difficult to formulate. Examples: Noronha et al 2011 (image analysis); Lasecki et al 2012 (transcription)
  • 44. Our goals • We want aggregation for complex data types – Build on ideas from simple label aggregation models • We want to generalize across many labeling tasks – Can we reduce problem to common simpler state space?
  • 45. Outline • Prior work • Approach • Experiments • Conclusion
  • 46. Key Insight • Partial credit matching via task-specific distance function – Encapsulate task-specific label features into requester distance function – Model annotation distances rather than annotations – Distance functions already exist for most tasks because people need evaluation functions to compare predicted labels vs gold
  • 47. Distance functions. Properties of distance functions: Non-negativity, Symmetry, Triangle inequality.
      Data                  | Free Text  | Rankings
      Example evaluation fn | BLEU(x, y) |
      Example distance fn   |            |
      Non-negativity        | ✓          | ✓
      Symmetry              | ✓          | ✓
      Triangle inequality   | ✓          | ✓
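One way to make the three properties concrete is to test them empirically on a small sample of labels. The helper below is a hypothetical sketch (the name `check_metric_properties` and the tolerance are assumptions). Passing on a sample does not prove that a function satisfies the properties in general, but a failure is a genuine counterexample.

```python
from itertools import permutations, product

def check_metric_properties(dist, items, tol=1e-9):
    """Empirically test three distance-function properties on sample items."""
    pairs = list(product(items, repeat=2))
    return {
        "non_negativity": all(dist(x, y) >= -tol for x, y in pairs),
        "symmetry": all(abs(dist(x, y) - dist(y, x)) <= tol for x, y in pairs),
        "triangle_inequality": all(
            dist(x, z) <= dist(x, y) + dist(y, z) + tol
            for x, y, z in permutations(items, 3)
        ),
    }

sample = [0.0, 1.0, 2.0]
# Absolute difference satisfies all three properties on this sample.
absolute = check_metric_properties(lambda x, y: abs(x - y), sample)
# Squared difference is non-negative and symmetric but breaks the triangle
# inequality: d(0, 2) = 4 exceeds d(0, 1) + d(1, 2) = 2.
squared = check_metric_properties(lambda x, y: (x - y) ** 2, sample)
```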
  • 48. Calculate distances • Example task: text annotation • Example distance function: string edit distance. Annotations: “a cat is eating”, “cat is eating”, “a beautiful picture”, “the cat eats”
  • 49. Calculate distances • Example task: text annotation • Example distance function: string edit distance. Annotations: “a cat is eating”, “cat is eating”, “a beautiful picture”, “the cat eats”. Pairwise distances shown in figure: 0.05, 0.1, 0.1
  • 50. Calculate distances • Example task: text annotation • Example distance function: string edit distance. Annotations: “a cat is eating”, “cat is eating”, “a beautiful picture”, “the cat eats”. Pairwise distances shown in figure: 0.8, 0.82, 0.82, 0.05, 0.1, 0.1
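A minimal sketch of how such a pairwise distance matrix could be computed is given below. It assumes character-level Levenshtein distance normalized by the length of the longer string; the slides name only "string edit distance", so the normalization and the function names are assumptions, and the resulting values are not expected to reproduce the illustrative numbers in the figure.

```python
def edit_distance(a, b):
    """Character-level Levenshtein distance using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def normalized_distance(a, b):
    """Edit distance scaled to [0, 1] by the longer string's length (assumed)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0

def distance_matrix(annotations):
    """Square matrix of pairwise distances between all annotations of one item."""
    return [[normalized_distance(x, y) for y in annotations] for x in annotations]

annotations = ["a cat is eating", "cat is eating", "a beautiful picture", "the cat eats"]
matrix = distance_matrix(annotations)
```

Under this choice, near-paraphrases such as the first two annotations end up much closer to each other than either is to the off-topic caption, which is the partial-credit behavior the slides describe.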
  • 51. All tasks reduce to matrices of annotation distances. Example: A1: A cat is eating; A2: The cat eats; A3: A beautiful picture (figure values: 0.1, 0.6, 0.3)
  • 52. How to aggregate given distances • Local selection model • Global selection model • Combined (figure labels: Current item; Other items)
  • 53. Local approach: Smallest Avg Distance • For each item: 1. Compute each annotation’s average distance to the other annotations for the item 2. Choose annotation with smallest average distance • Generalization of majority vote • Independence between items • Local approach does not model annotator reliability (figure labels: Current item; Other items)
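The two steps above can be sketched as follows. This is an illustrative sketch, not the authors' released code: the function names are hypothetical, and the word-level Jaccard distance is a simple stand-in chosen to keep the example self-contained. Any task-specific distance function could be passed instead.

```python
def smallest_average_distance(annotations, dist):
    """Return the index of the annotation closest on average to the others."""
    n = len(annotations)
    if n < 2:
        return 0

    def average_distance(i):
        others = (dist(annotations[i], annotations[j]) for j in range(n) if j != i)
        return sum(others) / (n - 1)

    return min(range(n), key=average_distance)

def exact_match_distance(a, b):
    """0/1 distance; with it the rule behaves like majority vote."""
    return 0.0 if a == b else 1.0

def word_jaccard_distance(a, b):
    """Stand-in partial-credit distance over lowercase word sets (assumed)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return 1.0 - len(words_a & words_b) / len(union) if union else 0.0

answers = ["Austin", "Austin", "Houston"]
captions = ["A cat is eating", "The cat eats", "A beautiful picture"]
best_answer = answers[smallest_average_distance(answers, exact_match_distance)]
best_caption = captions[smallest_average_distance(captions, word_jaccard_distance)]
```

With the 0/1 distance the selected label is the majority answer, which illustrates the "generalization of majority vote" bullet; with a partial-credit distance the same rule can still pick a caption when no two annotations match exactly.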
  • 54. Global approach: Best Available User • For each annotator: – Score by average distance over full dataset • For each item: – Choose label by best-scoring annotator • Fixed annotator reliability • Global approach does not model how well annotators did on specific items (figure labels: Current item; Other items)
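A rough sketch of this global rule is shown below. It assumes that each annotator is scored by the mean distance between their labels and other annotators' labels on the same items (the slides say only "average distance over full dataset"), and the data layout, function name, and annotator IDs are hypothetical.

```python
from collections import defaultdict

def best_available_user(data, dist):
    """data maps item -> {annotator: annotation}; lower score means more reliable."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for labels in data.values():
        for u, label_u in labels.items():
            for v, label_v in labels.items():
                if u != v:
                    totals[u] += dist(label_u, label_v)
                    counts[u] += 1
    scores = {u: totals[u] / counts[u] for u in counts}

    chosen = {}
    for item, labels in data.items():
        # Pick the label from the best-scoring annotator who labeled this item.
        best = min(labels, key=lambda u: scores.get(u, float("inf")))
        chosen[item] = labels[best]
    return chosen, scores

# Toy numeric annotations with absolute difference as the distance.
data = {
    "q1": {"ann_a": 10, "ann_b": 11, "ann_c": 30},
    "q2": {"ann_a": 5, "ann_b": 5, "ann_c": 9},
    "q3": {"ann_b": 7, "ann_c": 2},
}
chosen, scores = best_available_user(data, lambda x, y: abs(x - y))
```

Because the score is fixed per annotator, the same person wins on every item they labeled, which is exactly the limitation the last bullet points out.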
  • 55. Can we get best of both worlds? • Want a method that combines: – Best available user (global) – Smallest avg distance (local) • Should build on rich history of work on Bayesian annotation modeling • Need a principled framework for modeling annotation distance matrices (figure labels: weights; votes; weighted voting)
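To illustrate what combining the two signals might look like, the sketch below weights each annotator's "vote" (their distance to a candidate label) by a reliability weight. This is a simple heuristic for intuition only and is not the MAS model; the function name, the weighting scheme, and the example weights are all assumptions.

```python
def weighted_smallest_distance(labels, weights, dist):
    """labels: {annotator: annotation}; weights: {annotator: reliability weight}.

    Returns the annotator whose label has the lowest reliability-weighted
    average distance to the other annotators' labels.
    """
    def cost(u):
        others = [v for v in labels if v != u]
        total_weight = sum(weights[v] for v in others)
        if total_weight == 0:
            return 0.0
        return sum(weights[v] * dist(labels[u], labels[v]) for v in others) / total_weight

    return min(labels, key=cost)

labels = {"ann_a": 10, "ann_b": 11, "ann_c": 20, "ann_d": 21, "ann_e": 22}
absolute = lambda x, y: abs(x - y)

# Equal weights reduce to the local rule: the larger cluster wins.
uniform = {u: 1.0 for u in labels}
local_pick = weighted_smallest_distance(labels, uniform, absolute)

# Trusting ann_a and ann_b more (hypothetical weights) shifts the choice.
trusted = {"ann_a": 5.0, "ann_b": 5.0, "ann_c": 1.0, "ann_d": 1.0, "ann_e": 1.0}
weighted_pick = weighted_smallest_distance(labels, trusted, absolute)
```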
  • 56. Multidimensional Annotation Scaling (MAS) • Based on Multidimensional Scaling (Kruskal & Wish 1978) • Probabilistic model of multi-item distance matrices • “Hierarchical Bayesian” – Additional learned parameters represent crowd effects such as worker reliability (example annotations: A cat is eating; The cat eats; A beautiful picture)
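  • 57. MAS Objective 1: Likelihood. Multidimensional Scaling objective: $D_{iuv} \sim \mathcal{N}(\lVert \varepsilon_{iu} - \varepsilon_{iv} \rVert, \sigma)$ • $D_{iuv}$: observed distance • $\varepsilon_{iu}$: annotation embedding • $\sigma$: error scale. Example annotations: “a cat is eating”, “cat is eating”, “a beautiful picture”, “the cat eats” (figure distances: 0.8, 0.82, 0.82, 0.05, 0.1, 0.1)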
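The likelihood term can be evaluated directly for a single item, as in the sketch below. It assumes that i indexes the item and u, v index annotators, that σ is the standard deviation of the Normal, and that embeddings are given as coordinate tuples; the function name and the example values are hypothetical. In the actual model the embeddings are inferred rather than supplied.

```python
import math

def mds_log_likelihood(distances, embeddings, sigma):
    """Gaussian log-likelihood of observed distances given annotation embeddings.

    distances: {(u, v): observed distance D_uv for one item}
    embeddings: {u: tuple of coordinates}
    """
    total = 0.0
    for (u, v), observed in distances.items():
        expected = math.dist(embeddings[u], embeddings[v])  # Euclidean norm
        total += (-0.5 * math.log(2 * math.pi * sigma ** 2)
                  - (observed - expected) ** 2 / (2 * sigma ** 2))
    return total

# Hypothetical observed distances between three annotations of one item.
observed = {("u1", "u2"): 0.1, ("u1", "u3"): 0.9, ("u2", "u3"): 0.8}

# One-dimensional embeddings that reproduce the distances well ...
good_fit = {"u1": (0.0,), "u2": (0.1,), "u3": (0.9,)}
# ... versus embeddings that collapse every annotation to one point.
poor_fit = {"u1": (0.0,), "u2": (0.0,), "u3": (0.0,)}

ll_good = mds_log_likelihood(observed, good_fit, sigma=0.1)
ll_poor = mds_log_likelihood(observed, poor_fit, sigma=0.1)
```

Embeddings whose pairwise Euclidean distances match the observed annotation distances receive a higher likelihood, which is the sense in which the model "scales" annotations into a common space.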
  • 59. MAS Objective 2: Prior. Example annotations: “a cat is eating”, “cat is eating”, “a beautiful picture”, “the cat eats” (figure label: Pseudo-gold)
  • 64. Outline • Prior work • Approach • Experiments • Conclusion
  • 65. Tasks & datasets. SYNTHETIC DATASETS • Syntactic parse trees – Distance function: evalb • Ranked lists – Distance function: Kendall’s tau. REAL DATASETS • Biomedical text sequences (Nguyen et al 2017) – Distance function: Span F1 • Urdu-English translations (Zaidan and Callison-Burch 2011) – Distance function: GLEU
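As an example of one of these task-specific distance functions, the sketch below computes a normalized Kendall tau distance between two rankings of the same items: the fraction of item pairs that the two rankings order differently. The slides name only "Kendall's tau", so the exact normalization used in the experiments is an assumption here, as is the function name.

```python
from itertools import combinations

def kendall_tau_distance(rank_a, rank_b):
    """Fraction of item pairs ordered differently by two rankings (0 to 1)."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    pairs = list(combinations(rank_a, 2))
    if not pairs:
        return 0.0
    discordant = sum(
        1 for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return discordant / len(pairs)

same = kendall_tau_distance(["a", "b", "c"], ["a", "b", "c"])
swapped = kendall_tau_distance(["a", "b", "c"], ["a", "c", "b"])
reversed_order = kendall_tau_distance(["a", "b", "c"], ["c", "b", "a"])
```

Similarity scores such as F1 or GLEU can be turned into distances in a similar spirit, for instance by subtracting the score from one, though whether the result satisfies the triangle inequality has to be checked per metric.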
  • 66. Methods. Baselines: • Random User (RU): pick one label randomly • ZenCrowd (ZC) (Demartini et al. 2012) – Weighted voting based on exact match (rare!) • Crowd Hidden Markov Model (CHMM) (Nguyen et al. 2017) – Sequence annotation task only. Upper bound: Oracle (OR) (always picks best label) • Even if 5 workers answer, limited by best answer any of them gave
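The relationship between the Random User baseline and the Oracle upper bound can be illustrated with a small calculation. The sketch assumes that each annotation has already been scored against gold with a higher-is-better metric; the function name and the scores are made up for illustration and are unrelated to the reported results.

```python
def oracle_and_random_scores(item_scores):
    """item_scores: one list per item of each annotation's score against gold.

    Returns (oracle mean, expected random-user mean) over all items.
    """
    n_items = len(item_scores)
    oracle = sum(max(scores) for scores in item_scores) / n_items
    random_user = sum(sum(scores) / len(scores) for scores in item_scores) / n_items
    return oracle, random_user

# Hypothetical per-annotation scores for two items with three workers each.
scores_by_item = [[0.2, 0.5, 0.9], [0.4, 0.4, 0.7]]
oracle, random_user = oracle_and_random_scores(scores_by_item)
```

Any method that selects one of the submitted labels per item must score between these two values in expectation or beat random selection only by choosing well; it can never exceed the oracle, which is why the oracle column in the results is below a perfect score.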
  • 70. Results • Diverse complex-label datasets
    Task          Metric       RU     ZC     CHMM   MAS    Oracle
    Translations  GLEU         0.185  0.188  -      0.217  0.246
    Sequences     F1           0.561  0.569  0.702  0.709  0.827
    Parses        EVALB        0.812  0.819  -      0.932  0.939
    Rankings      (not given)  0.491  0.495  -      0.710  0.724
    • MAS aggregation scores closest to the Oracle upper bound on all four datasets, with no change to the model between datasets
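One way to read these numbers is to ask how much of the gap between Random User and the Oracle each MAS score closes. The snippet below is only arithmetic on the values reported on the slide; the `gap_closed` helper and this framing are ours, not part of the talk.

```python
# (RU, MAS, Oracle) scores as reported on the results slide.
results = {
    "Translations": (0.185, 0.217, 0.246),
    "Sequences": (0.561, 0.709, 0.827),
    "Parses": (0.812, 0.932, 0.939),
    "Rankings": (0.491, 0.710, 0.724),
}

def gap_closed(ru, mas, oracle):
    """Fraction of the RU-to-Oracle gap that MAS recovers."""
    return (mas - ru) / (oracle - ru)

closed = {task: gap_closed(*scores) for task, scores in results.items()}
for task, fraction in closed.items():
    print(f"{task}: {fraction:.2f}")
```

By this measure MAS recovers a little over half of the gap on the two real datasets (translations and sequences) and more than 90% on the two synthetic ones (parses and rankings).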
  • 71. Conclusion • Goal: general-purpose probabilistic model to aggregate complex annotations – Categorical-based methods are insufficient – Custom models are difficult to design for new annotation types • Solution: model annotation distances via task-specific distance functions – Transforms the problem into a general-purpose variable space • Multidimensional Annotation Scaling (MAS) – Allows unsupervised weighted voting with inferred annotator reliability • Not covered in talk (see paper) – Semi-supervised learning – Partial credit
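The idea of unsupervised weighted voting over annotation distances can be illustrated with a much simpler stand-in than MAS. The sketch below is explicitly not the MAS model: it uses a one-pass heuristic in which a worker's weight is the inverse of their mean distance to other workers, and the selected annotation is the one with the smallest weight-averaged distance to the rest. All names and distance values are hypothetical.

```python
def estimate_weights(items, eps=1e-6):
    """Heuristic reliability: 1 / (eps + mean distance from a worker to all others).

    items: list of per-item dicts, distances[u][v] = distance between the
    annotations of workers u and v on that item.
    """
    totals, counts = {}, {}
    for distances in items:
        for u, row in distances.items():
            for v, d in row.items():
                if u != v:
                    totals[u] = totals.get(u, 0.0) + d
                    counts[u] = counts.get(u, 0) + 1
    return {u: 1.0 / (eps + totals[u] / counts[u]) for u in totals}

def select_annotation(distances, weights):
    """Pick the worker whose annotation is closest, on weighted average, to the others."""
    def cost(u):
        others = [v for v in distances if v != u]
        total_weight = sum(weights[v] for v in others)
        return sum(weights[v] * distances[u][v] for v in others) / total_weight
    return min(distances, key=cost)

# Hypothetical distances for one item with three annotations:
# A1 "A cat is eating", A2 "The cat eats", A3 "A beautiful picture".
item = {
    "A1": {"A2": 0.1, "A3": 0.8},
    "A2": {"A1": 0.1, "A3": 0.82},
    "A3": {"A1": 0.8, "A2": 0.82},
}

weights = estimate_weights([item])
selected = select_annotation(item, weights)
print(selected, weights)
```

The outlying annotation receives the lowest weight and has little influence on the selection; MAS instead infers reliability jointly with the embeddings inside a probabilistic model.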
  • 72. Ongoing work • Generalization to more tasks (e.g., image bounding boxes & keypoints) • Generalization to simple annotation tasks (“one ring to rule them all”) • Support for multiple latent objects per item • Merging annotations rather than selecting the best one – e.g., guessing the weight of an ox – MAS vs. non-embedding EM model, varying noise, fewer annotations, … • Code & Data: https://github.com/Praznat/annotationmodeling • Example annotations: A1: “A cat is eating”; A2: “The cat eats”; A3: “A beautiful picture”
  • 73. Thank you! • Matt Lease (University of Texas at Austin) • Lab: ir.ischool.utexas.edu • @mattlease • Slides: slideshare.net/mattlease • We thank our many talented crowd workers for their contributions to our research!