Data Excellence:
Better Data for Better AI
ODSC 2020
Lora Aroyo
http://lora-aroyo.org
@laroyo
By Scanned from The Magic of M. C. Escher. (Harry N. Abrams, Inc. ISBN
0-8109-6720-0) by Justin Foote (talk)., Fair use,
https://en.wikipedia.org/w/index.php?curid=3955850
TAKE HOME MESSAGE
data lifecycle - just like in software - is needed to
guide data research & development practices
data is the compass for AI - AI advances where
there is data
data is at the center - the success of AI systems
depends on the quality of their data
https://en.wikipedia.org/wiki/Metamorphosis_II
data quality must be addressed in AI practices
- multitude of notions of truth
- necessity for data quality standards
data lifecycle is the backbone for data
excellence tools and practices to stay ahead of
future unintended AI behaviours
The Rise of the Machines
“AI Winter”
lab experiments
Expert Systems
small scale
experiments
The Rise of the Machines
“AI Winter” → “AI Breakthroughs in Games”
IBM Watson Jeopardy
DeepMind AlphaGo
beat the humans
The Rise of the Machines
“AI Winter” → “AI Breakthroughs in Games” → “Real World Tasks”
Health diagnostics
Flu prediction
Weather prediction
Text, Image and Video classification
Text Generation
Text Translation
Conversational AI
support the humans
Mainstream Deployment of AI
“Real World Tasks” deployed in the wild → Unintended behaviors
Microsoft Tay bot
IBM Watson Oncology
Amazon Rekognition
Google Photos
Apple Face ID
Facebook chat bots
Various Speech Assistants
getting computers to “see”
the diversity of data
data quality is essential for
guiding AI away from
unintended behaviours
Data is the compass for AI
The Life of AI Data
“It exists!”
bootstrapping AI with data
Caltech101
LabelMe
Berkeley-3D
https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
The Life of AI Data
“It exists!” → “It is bigger!”
data hungry AI
ImageNet
SIFT10M
OpenImages
COCO
Web 1T 5-Gram
https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
but before it got better ...
The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
but before it got better ...
it got worse ...
Unintended Behaviors in AI
Adapted from “AI in the Open World: Discovering Blind Spots of AI”, SafeAI 2020, Ece Kamar
The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
but before it got better ...
reactive
data improvement
The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
to reach here
we need proactive
data improvement
The Life of AI Data
Alon Halevy, Peter Norvig, and Fernando Pereira. 2009. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems 24, 2 (2009)
In the decade since then, the research community has done a lot
with quantity, but quality has been left behind
In the 90’s we introduced standards
to achieve Software reliability
introduced software engineering lifecycle
- requirements, design and testing
established processes for software maintenance
- version control, sharing, documenting
established software quality metrics & processes
Ben Hutchinson, 2020
Now we need the same for Data
introduce data lifecycle
- requirements, design and testing
establish processes for dataset maintenance
- version control, sharing, documenting
establish data quality metrics & processes
Ben Hutchinson, 2020
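One way to make "version control" for datasets concrete is to give every dataset release a content fingerprint, much as a commit hash identifies a software release. The sketch below is only an illustration under stated assumptions: the function name `dataset_fingerprint` and the record layout are hypothetical and do not come from the talk.

```python
import hashlib
import json


def dataset_fingerprint(records):
    """Return a SHA-256 hex digest of the records, independent of their order."""
    # Serialize each record deterministically, then sort the serialized lines
    # so that shuffling the dataset does not change the fingerprint.
    lines = sorted(json.dumps(record, sort_keys=True) for record in records)
    digest = hashlib.sha256()
    for line in lines:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Hypothetical toy releases of a labeled dataset.
v1 = [{"id": 1, "label": "guitar"}, {"id": 2, "label": "not_guitar"}]
v1_shuffled = list(reversed(v1))
v2 = [{"id": 1, "label": "guitar"}, {"id": 2, "label": "guitar"}]

same = dataset_fingerprint(v1) == dataset_fingerprint(v1_shuffled)
changed = dataset_fingerprint(v1) != dataset_fingerprint(v2)
print(same, changed)
```

Recording such a fingerprint next to each documented release makes it possible to tell which exact version of the data a model was trained or evaluated on.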
data quality problems are typically not
caused by software bugs or just
by human errors
datasets are not easy to debug
data quality is typically the result of:
- how well a dataset
represents the actual task
- how the annotation is done
- whether the quality metrics
are adequate
Data Quality is not easy ...
it is not easy to give a Y/N answer
for most of our AI tasks
Do these images depict a GUITAR ?
Data Quality is not only human error
[Images not shown: the example pictures are marked with a mix of ✓ and ✘]
Do these images depict NEW ZEALAND ?
Data Quality should consider context of use
it is not easy to give a Y/N answer
for most of our AI tasks
the answer typically depends on
the context, on the task, on the
usage, etc
[Images not shown: the example pictures are marked with a mix of ✓ and ✘]
Do these images depict a WEDDING ?
Data Quality should include real world diversity
it is not easy to give a Y/N answer
for most of our AI tasks
the answer typically depends on
the context, on the task, on the
usage, etc
disagreement is a signal of
diversity and should be included
in AI training
[Images not shown: the example pictures are marked with a mix of ✓ and ✘]
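One possible way to "include disagreement in AI training" is to keep the full distribution of annotator votes as a soft label instead of collapsing it to the majority answer. The following is a minimal sketch, assuming each image was judged by several annotators; the helper names `soft_label` and `majority_label` and the votes are hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Collapse the votes to the single most frequent answer."""
    return Counter(votes).most_common(1)[0][0]


def soft_label(votes, classes):
    """Keep the disagreement: return the share of votes for each class."""
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    return {c: counts[c] / len(votes) for c in classes}


# Hypothetical votes from five annotators on "Does this image depict a WEDDING?"
votes = ["yes", "yes", "no", "yes", "no"]

print(majority_label(votes))             # yes
print(soft_label(votes, ["yes", "no"]))  # {'yes': 0.6, 'no': 0.4}
```

The majority label hides the fact that two of the five annotators saw the image differently; the soft label preserves it and can serve as a training target for models that accept probability distributions.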
Does the sentence express a TREATS relation between Chloroquine and Malaria?
Data Quality is difficult even with experts
For prevention of malaria, use only in individuals traveling to malarious
areas where CHLOROQUINE resistant P. falciparum MALARIA
has not been reported.
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Among 56 subjects reporting to a clinic with symptoms of MALARIA
53 (95%) had ordinarily effective levels of CHLOROQUINE in blood.
✓
✘
✓
DISAGREEMENT IS SIGNAL
Variety of sources for disagreement
Model of semantic interpretation
TRIANGLE OF MEANING
“Three Sides of CrowdTruth”, Human Computation Journal, v1, 2014, L. Aroyo, C. Welty
Workshop on “Subjectivity, Ambiguity and Disagreement (SAD) in Crowdsourcing”, The Web Conference 2019, https://sadworkshop.wordpress.com/
Annotator disagreement
is signal, not noise
Annotator disagreement
is indicative of
variation in human
interpretation
Annotator disagreement
is indicative of
ambiguity, vagueness,
similarity, over-generality,
& quality
Three sides of human interpretation
CROWDTRUTH Disagreement provides
guidance in task analysis:
● items with poor semantics
● items with salient terms
● items difficult to classify
● items that are ambiguous
● subjective annotations
● time-sensitive annotations
● difficult annotation tasks
● mis-translated annotations
● users with/without
specific knowledge
● communities of thought
● spammers
You can’t remove the corners…
“Three Sides of CrowdTruth”, Human Computation Journal, v1, 2014, L. Aroyo, C. Welty
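To illustrate how disagreement can guide task analysis, the sketch below ranks items by the Shannon entropy of their annotations, so that the most contested items surface first. This is a deliberately simple stand-in and not the CrowdTruth metrics themselves; the item names and votes are hypothetical.

```python
import math
from collections import Counter


def label_entropy(votes):
    """Shannon entropy (in bits) of the label distribution for one item."""
    if not votes:
        raise ValueError("at least one vote is required")
    total = len(votes)
    # 0.0 means full agreement; higher values mean more disagreement.
    return sum(-(n / total) * math.log2(n / total) for n in Counter(votes).values())


# Hypothetical annotations for three sentences in a relation-extraction task.
items = {
    "sent_1": ["treats", "treats", "treats", "treats", "treats"],
    "sent_2": ["treats", "treats", "none", "treats", "none"],
    "sent_3": ["treats", "none", "treats", "none"],
}

# Most contested items first: candidates for ambiguity or unclear guidelines.
ranked = sorted(items, key=lambda name: label_entropy(items[name]), reverse=True)
print(ranked)  # ['sent_3', 'sent_2', 'sent_1']
```

Items at the top of such a ranking are worth inspecting by hand: they may be ambiguous, hard to classify, or a sign that the annotation task itself is unclear.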
THE WORLD IS A SMOOTH SPECTRUM OF TRUTH
One truth: knowledge acquisition typically assumes one
correct interpretation for every example
Experts rule: knowledge is captured from domain experts
One is enough: single expert’s knowledge is sufficient
Disagreement bad: when people disagree, they must not
understand the problem
Detailed explanations help: if examples cause
disagreement - adding instructions should help
Once done, forever valid: knowledge is not updated; new
data not aligned with old
All examples are created equal: triples are triples, one is
not more important than another, they are all either true or
false
… and we force the smoothness into a binary form
7 Myths about Human Annotation
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
High Quality Data
represents a phenomenon
accurately and consistently over time
and is replicable, reproducible,
and maintainable over time;
has empirical and explanatory power;
and is collected, stored, and used
responsibly.
Rigorous Evaluation of AI Systems workshop, 2019, Human Computation (HCOMP), http://eval.how/
Evaluating Evaluation for AI Systems workshop, 2020, Association for the Advancement of Artificial Intelligence (AAAI), http://eval.how/aaai-2020/
From Data Quality to Data Excellence
Data Quality is
- a point-estimate of goodness of data
Data Excellence is
- the set of practices and tools that result in
high quality data
How do we achieve Data Excellence?
Maintainability
Well documented datasets with
owners, which follow best practices
for data at any scale.
Reproducibility
Basic and critical regression tests
for datasets which support solid
conclusions for decision making.
Reliability
Datasets which are internally sound
and consistent; factors that affect
the data are addressed or disclosed.
Fidelity
Data which faithfully, accurately, and
comprehensively represents the
captured phenomenon.
Validity
Datasets which explain aspects of
the phenomena that they represent
in terms of external measures.
1st International Workshop on Data Excellence: http://eval.how/dew2020/
Utility
Data which adequately and
accurately achieves the intended
product behavior.
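The "basic and critical regression tests for datasets" named under Reproducibility could, for example, be a small set of checks that run whenever the dataset changes. The sketch below is one possible shape for such checks; the function `check_dataset`, the record layout, and the threshold are assumptions for illustration only.

```python
from collections import Counter


def check_dataset(records, allowed_labels, min_class_share):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # Check 1: every record id must be unique.
    ids = [r["id"] for r in records]
    if len(ids) != len(set(ids)):
        problems.append("duplicate ids")

    # Check 2: only labels from the agreed label set may appear.
    labels = [r["label"] for r in records]
    unknown = sorted(set(labels) - set(allowed_labels))
    if unknown:
        problems.append(f"unknown labels: {unknown}")

    # Check 3: each allowed class must make up a minimum share of the data.
    counts = Counter(labels)
    for label in allowed_labels:
        share = counts[label] / len(records) if records else 0.0
        if share < min_class_share:
            problems.append(f"class '{label}' share {share:.2f} is below {min_class_share:.2f}")

    return problems


# Hypothetical dataset versions.
good = [
    {"id": 1, "label": "wedding"},
    {"id": 2, "label": "not_wedding"},
    {"id": 3, "label": "wedding"},
    {"id": 4, "label": "not_wedding"},
]
bad = [{"id": 1, "label": "wedding"}, {"id": 1, "label": "party"}]

print(check_dataset(good, ["wedding", "not_wedding"], 0.25))
print(check_dataset(bad, ["wedding", "not_wedding"], 0.25))
```

Running such checks on every new dataset version plays the same role as a regression test suite in software: a change that breaks an assumption is caught before it reaches a model.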
much like in software lifecycles, cutting corners at each stage
cascades to subsequent versions, which leads to technical debt
Dataset [Requirements] Analysis
Requirements Analysis
Stakeholder Input
Privacy, compliance
Trust & safety planning
Dataset Maintenance
Updating data over time
Extending to other languages
Version control
Storage and accessibility
Dataset Design
Data acquisition methodology
Rater guidelines
Construct validation
Dataset Testing
Representation metrics
Fairness metrics
Reliability metrics
Approval process
Dataset Implementation
Human labeled data
Logging interaction data
Data
Lifecycle
Ben Hutchinson, 2020
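As an example of a reliability metric for the Dataset Testing stage, the sketch below computes Cohen's kappa, a common chance-corrected agreement score for two raters. The slides do not name a specific metric, so this choice is an assumption, and the rater data is hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items on which the raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected agreement if each rater labeled at random with their own label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two raters on four items.
rater_a = ["yes", "yes", "no", "no"]
rater_b = ["yes", "no", "no", "no"]
print(cohens_kappa(rater_a, rater_b))  # 0.5
```

A low score is not automatically a sign of bad raters. In line with the rest of the talk, it can also point to ambiguous items or guidelines, so it is best read together with an item-level look at where the raters differ.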
TAKE HOME MESSAGE
https://en.wikipedia.org/wiki/Metamorphosis_II
data lifecycle - just like in software - is needed to
guide data research & development practices
data is the compass for AI - AI advances where
there is data
data is at the center - the success of AI systems
depends on the quality of their data
data quality must be addressed in AI practices
- multitude of notions of truth
- necessity for data quality standards
data lifecycle is the backbone for data
excellence tools and practices to stay ahead of
future unintended AI behaviours
Collaborators
EthicalAI
Ben Hutchinson
Crowd Platform
Amol Wankhede
Anurag Batra
People + AI Research (PAIR)
Nithya Sambasivan
Kristen Olson
Shivani Kapania
Jess Holbrook
Andrew Zaldivar
Mahima Pushkarna
Maysam Moussalem
Praveen Paritosh
Ka Wong
Lora Aroyo
Devi Krishna
Likert team
high-profile data failures
not bugs in the software, not human mistakes
problems caused by the quality of the data
just like software quality in the 90’s - the same has to happen with data
examples of questionable data
crowdtruth relation extraction
how would you annotate it
how do we know and measure the quality of the data
how well does it represent the actual task we are trying to solve
like software we need to establish data quality standards

Data excellence: Better data for better AI

  • 1. Data Excellence: Better Data for Better AI ODSC 2020 Lora Aroyo http://lora-aroyo.org @laroyo By Scanned from The Magic of M. C. Escher. (Harry N. Abrams, Inc. ISBN 0-8109-6720-0) by Justin Foote (talk)., Fair use, https://en.wikipedia.org/w/index.php?curid=3955850
  • 2. http://lora-aroyo.org @laroyo TAKE HOME MESSAGE 2 data lifecycle - just like in software - is needed to guide data research & development practices data is the compass for AI - AI advances where there is data data is at the center - AI systems success depends on the quality of their data https://en.wikipedia.org/wiki/Metamorphosis_II data quality must be addressed in AI practices - multitude of notions of truth - necessity for data quality standards data lifecycle is the backbone for data excellence tools and practices to stay ahead of future unintended AI behaviours
  • 3. The Rise of the Machines: “AI Winter”: lab experiments, Expert Systems, small scale experiments
  • 4. The Rise of the Machines: “AI Winter” → “AI Breakthroughs in Games”: IBM Watson Jeopardy, DeepMind AlphaGo; beat the humans
  • 5. The Rise of the Machines: “AI Winter” → “AI Breakthroughs in Games” → “Real World Tasks”: Health diagnostics, Flu prediction, Weather prediction, Text, Image and Video classification, Text Generation, Text Translation, Conversational AI; support the humans
  • 6. Mainstream Deployment of AI: “Real World Tasks” deployed in the wild → Unintended behaviors: Microsoft Tay bot, IBM Watson Oncology, Amazon Rekognition, Google Photos, Apple Face ID, Facebook chat bots, Various Speech Assistants
  • 7. Data is the compass for AI: getting computers to “see” the diversity of data; data quality is essential for guiding AI away from unintended behaviours
  • 8. The Life of AI Data: “It exists!”: bootstrapping AI with data: Caltech101, LabelMe, Berkley-3D. https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
  • 9. The Life of AI Data: “It exists!” → “It is bigger!”: data hungry AI: ImageNet, SIFT10M, OpenImages, COCO, Web 1T 5-Gram. https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
  • 10. The Life of AI Data: “It exists!” → “It is bigger!” → “It is better!” but before it got better ...
  • 11. The Life of AI Data: “It exists!” → “It is bigger!” → “It is better!” but before it got better ... it got worse ...
  • 12. Unintended Behaviors in AI. Adapted from “AI in the Open World: Discovering Blind Spots of AI”, SafeAI 2020, Ece Kamar
  • 13. The Life of AI Data: “It exists!” → “It is bigger!” → “It is better!” but before it got better ... reactive data improvement
  • 14. The Life of AI Data: “It exists!” → “It is bigger!” → “It is better!” to reach here we need proactive data improvement
  • 15. The Life of AI Data: Alon Halevy, Peter Norvig, and Fernando Pereira. 2009. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems 24, 2 (2009). In the decade since then, the research community has done a lot with quantity, but quality has been left behind
  • 16. In the 90’s we introduced standards to achieve Software reliability: introduced software engineering lifecycle (requirements, design and testing); established processes for software maintenance (version control, sharing, documenting); established software quality metrics & processes. Ben Hutchinson, 2020
  • 17. Now we need the same for Data: introduce data lifecycle (requirements, design and testing); establish processes for dataset maintenance (version control, sharing, documenting); establish data quality metrics & processes. Ben Hutchinson, 2020
  • 18. Data Quality is not easy ...: poor data quality is typically not caused by software bugs or just by human errors; datasets are not easy to debug; data quality is typically the result of: how well a dataset represents the actual task, how the annotation is done, and whether the quality metrics are adequate
  • 19. Data Quality is not only human error: Do these images depict a GUITAR? ✓ ✓ ✓ ✘ ✘ ✘ ✘ ✓ ✓; it is not easy to give a Y/N answer for most of our AI tasks
  • 20. Data Quality should consider context of use: Do these images depict NEW ZEALAND? ✓ ✘ ✓ ✓ ✘ ✘; it is not easy to give a Y/N answer for most of our AI tasks; the answer typically depends on the context, on the task, on the usage, etc.
  • 21. Data Quality should include real world diversity: Do these images depict a WEDDING? ✓ ✘ ✓ ✓ ✘ ✓; it is not easy to give a Y/N answer for most of our AI tasks; the answer typically depends on the context, on the task, on the usage, etc.; disagreement is a signal for diversity and should be included in AI training
  • 22. Data Quality is difficult even with experts: Does the sentence express a TREATS relation between Chloroquine and Malaria? “For prevention of malaria, use only in individuals traveling to malarious areas where CHLOROQUINE resistant P. falciparum MALARIA has not been reported.” “Rheumatoid arthritis and MALARIA have been treated with CHLOROQUINE for decades.” “Among 56 subjects reporting to a clinic with symptoms of MALARIA 53 (95%) had ordinarily effective levels of CHLOROQUINE in blood.” ✓ ✘ ✓
  • 23. DISAGREEMENT IS SIGNAL: Variety of sources for disagreement
  • 24. TRIANGLE OF MEANING: Model of semantic interpretation. Annotator disagreement is signal, not noise; annotator disagreement is indicative of variation in human interpretation; annotator disagreement is indicative of ambiguity, vagueness, similarity, over-generality, & quality. “Three Sides of CrowdTruth”, Human Computation Journal, v1, 2014, L. Aroyo, C. Welty. Workshop on “Subjectivity, Ambiguity and Disagreement (SAD) in Crowdsourcing”, The Web Conference 2019, https://sadworkshop.wordpress.com/
  • 25. CROWDTRUTH: Three sides of human interpretation. Disagreement provides guidance in task analysis: items with poor semantics; items with salient terms; items difficult to classify; items that are ambiguous; subjective annotations; time-sensitive annotations; difficult annotation tasks; mis-translated annotations; users with/without specific knowledge; communities of thought; spammers. You can’t remove the corners… “Three Sides of CrowdTruth”, Human Computation Journal, v1, 2014, L. Aroyo, C. Welty
  • 26. THE WORLD IS A SMOOTH SPECTRUM OF TRUTH
  • 27. 7 Myths about Human Annotation: (1) One truth: knowledge acquisition typically assumes one correct interpretation for every example; (2) Experts rule: knowledge is captured from domain experts; (3) One is enough: single expert’s knowledge is sufficient; (4) Disagreement bad: when people disagree, they must not understand the problem; (5) Detailed explanations help: if examples cause disagreement, adding instructions should help; (6) Once done, forever valid: knowledge is not updated; new data not aligned with old; (7) All examples are created equal: triples are triples, one is not more important than another, they are all either true or false. … and we force the smoothness into a binary form. “Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
  • 28. High Quality Data represents a phenomenon accurately and consistently over time and is replicable, reproducible, and maintainable over time; has empirical and explanatory power; and is collected, stored, and used responsibly. Rigorous Evaluation of AI Systems workshop, 2019, Human Computation (HCOMP), http://eval.how/ Evaluating Evaluation for AI Systems workshop, 2020, Association for the Advancement of Artificial Intelligence (AAAI), http://eval.how/aaai-2020/
  • 29. From Data Quality to Data Excellence: Data Quality is a point-estimate of goodness of data; Data Excellence is the set of practices and tools that result in high quality data
  • 30. How do we achieve Data Excellence? Maintainability: Well documented datasets with owners, which follow best practices for data at any scale. Reproducibility: Basic and critical regression tests for datasets which support solid conclusions for decision making. Reliability: Datasets which are internally sound and consistent; factors that affect the data are addressed or disclosed. Fidelity: Data which faithfully, accurately, and comprehensively represents the captured phenomenon. Validity: Datasets which explain aspects of the phenomena that they represent in terms of external measures. Utility: Data which adequately and accurately achieves the intended product behavior. 1st International Workshop on Data Excellence: http://eval.how/dew2020/
  • 31. Data Lifecycle: much like in software lifecycles, cutting corners at each stage cascades to subsequent versions, which leads to technical debt. Dataset [Requirements] Analysis: requirements analysis; stakeholder input; privacy, compliance; trust & safety planning. Dataset Design: data acquisition methodology; rater guidelines; construct validation. Dataset Implementation: human labeled data; logging interaction data. Dataset Testing: representation metrics; fairness metrics; reliability metrics; approval process. Dataset Maintenance: updating data over time; extending to other languages; version control; storage and accessibility. Ben Hutchinson, 2020
  • 32. TAKE HOME MESSAGE: data lifecycle - just like in software - is needed to guide data research & development practices; data is the compass for AI - AI advances where there is data; data is at the center - AI systems' success depends on the quality of their data; data quality must be addressed in AI practices (multitude of notions of truth, necessity for data quality standards); data lifecycle is the backbone for data excellence tools and practices to stay ahead of future unintended AI behaviours. https://en.wikipedia.org/wiki/Metamorphosis_II
  • 33. Collaborators EthicalAI Ben Hutchinson Crowd Platform Amol Wankhede Anurag Batra People + AI Research (PAIR) Nithya Sambasivan Kristen Olson Shivani Kapania Jess Holbrook Andrew Zaldivar Mahima Pushkarna Maysam Moussalem Praveen Paritosh Ka Wong Lora Aroyo Devi Krishna Likert team
  • 34. Data Excellence: Better Data for Better AI ODSC 2020 Lora Aroyo http://lora-aroyo.org @laroyo By Scanned from The Magic of M. C. Escher. (Harry N. Abrams, Inc. ISBN 0-8109-6720-0) by Justin Foote (talk)., Fair use, https://en.wikipedia.org/w/index.php?curid=3955850
  • 35. high profile data failure: not bugs in the software, not mistakes of humans; problems caused by quality in the data. just like software quality in the 90’s, the same has to happen with data. examples of questionable data: crowdtruth relation extraction; how would you annotate it? how do we know and measure the quality of the data? how well does it represent the actual task we are trying to solve? like software, we need to establish data quality standards
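
The "disagreement is signal" idea from slides 23-25 can be illustrated with a small sketch. This is not the CrowdTruth metric from the cited papers; it is a simpler stand-in that scores each item by the normalized Shannon entropy of its annotator labels, so that items with split votes surface first for task analysis. The function name, the item ids, and the votes are all hypothetical.

```python
from collections import Counter
from math import log2


def disagreement(labels, num_classes=2):
    """Normalized Shannon entropy of one item's labels.

    0.0 means all annotators agree; 1.0 means the votes are split
    evenly across num_classes. This is an illustrative stand-in,
    not the CrowdTruth metric.
    """
    counts = Counter(labels)
    if len(counts) <= 1:
        # No labels, or every annotator gave the same label.
        return 0.0
    total = len(labels)
    entropy = -sum((c / total) * log2(c / total) for c in counts.values())
    return entropy / log2(num_classes)


# Hypothetical yes/no votes for "Do these images depict a GUITAR?"
annotations = {
    "img_1": ["yes", "yes", "yes", "yes"],
    "img_2": ["yes", "no", "yes", "no"],
    "img_3": ["no", "no", "no", "yes"],
}

scores = {item: disagreement(votes) for item, votes in annotations.items()}

# Items with the most disagreement come first; these are candidates for
# ambiguity or task-design review rather than automatic discarding.
ranked = sorted(scores, key=scores.get, reverse=True)

for item in ranked:
    print(f"{item}: {scores[item]:.3f}")
```

Keeping the per-item score alongside the majority label, rather than collapsing to a single Y/N answer, is one way to preserve the diversity signal that the slides argue should reach AI training.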