Fairness in Machine Learning: are you
sure there is no bias in your
predictions?
Azzurra Ragone - Innovation and Diversity Advisor
Slides will be shared, follow @azzurraragone
Me…
Innovation and Diversity Advisor
Previously @Google DevRel team
Before that, research fellow at:
➢ Univ. Milano Bicocca,
➢ University of Michigan
➢ Politecnico of Bari
➢ University of Trento
People worry that computers will get too
smart and take over the world, but the
real problem is that they’re too stupid and
they’ve already taken over the world
The Master Algorithm
Pedro Domingos, 2015
How to make my ML system fair?
...and why care?
Our success, happiness and
wellbeing can be affected by
decisions that others make about us
Life-changing decisions:
➔ Admission to schools
➔ Job offers
➔ Patient screening
➔ Mortgage approval
➔ ...
Arbitrary, inconsistent, or faulty decision-making thus
raises serious concerns because it risks limiting our
ability to achieve the goals that we have set for ourselves
and access the opportunities for which we are qualified.
Fairness and Machine Learning
S. Barocas, M. Hardt, A. Narayanan
How do we ensure that these decisions are
made the right way and for the right reasons?
Fairness and Machine Learning
S. Barocas, M. Hardt, A. Narayanan
The ML promise:
make decisions more consistent,
accurate and rigorous.
B. C. Russell, A. Torralba, C. Liu, R. Fergus, W. T. Freeman.
Object Recognition by Scene Alignment.
Advances in Neural Information Processing Systems, 2007.
...but there are serious risks in learning
from examples.
Generalizing from examples
Source: https://design.google/library/fair-not-default/
Quick, Draw!
Generalizing from examples
Provide good examples:
- a sufficiently large and diverse set
- well annotated
Quick, Draw!
Source: https://design.google/library/fair-not-default/
Historical examples may reflect:
- Prejudices against a social group
- Cultural stereotypes
- Demographic inequalities
and finding patterns in these data means replicating these
same dynamics
Source: https://gluon-cv.mxnet.io/build/examples_datasets/imagenet.html
45% of ImageNet data comes from the USA (4% of the world population)
3% of ImageNet data comes from China and India (36% of the world population)
Ref: Nature 559 and Shankar, S. et al. (2017)
Geo bias
Photo Credit: Left: iStock/Getty; Right: Prakash Singh/AFP/Getty (from Nature 559, 324-326 (2018))
Labels assigned to the left photo: bride, dress, woman, wedding
Labels assigned to the right photo: performance art, costume
Word Embeddings
Debiasing Word Embeddings
Bolukbasi, T., Chang, K.-W., Zou, J., Saligrama, V. & Kalai, A. Adv. Neural Inf. Proc. Syst. 2016, 4349–4357 (2016).
Credit: Pictures by Pixabay
The Machine Learning Loop
State of the world → (measurement) → Data → (learning) → Model → (action) → Individuals → (feedback) → State of the world
Source: Fairness and Machine Learning
S. Barocas, M. Hardt, A. Narayanan
The Machine Learning Loop: State of the world → (measurement) → Data
Provenance of data
is crucial.
Data cleaning is
mandatory.
The world is “messy”
Photo by pasja1000 on Pixabay
Measurement defines:
- your variables of interest,
- the process for turning your
observations into numbers,
- how you actually collect the
data
[Fairness and Machine Learning, 2018]
Photo by Iker Urteaga on Unsplash
The target variable is the
hardest to measure.
It is made up for the purpose
of the problem.
It is not a property that
people possess or lack
Ex. “creditworthiness”, “good
employee”, “attractiveness”
[Fairness and Machine Learning, 2018]
Photo by David Paschke on Unsplash
[Diagram: The Machine Learning Loop]
ML will extract
stereotypes the same
way that it extracts
knowledge
Sample size disparity
ML works better with more data, so it will tend to work less well for
members of minority groups, who contribute fewer training examples
[Figure labels: Training set, Training data]
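The effect of sample size can be made concrete with a small calculation. The sketch below is not a trained model: it only uses the standard error of an estimated proportion, and the group sizes (10,000 and 100) are hypothetical values chosen for illustration.

```python
import math

def standard_error(rate: float, n: int) -> float:
    """Standard error of a proportion estimated from n independent samples."""
    return math.sqrt(rate * (1.0 - rate) / n)

# Hypothetical group sizes: a majority group and a minority group.
majority_n = 10_000
minority_n = 100

# Same underlying rate (0.5) in both groups; only the sample size differs.
se_majority = standard_error(0.5, majority_n)
se_minority = standard_error(0.5, minority_n)

print(f"majority: +/- {se_majority:.3f}")
print(f"minority: +/- {se_minority:.3f}")
print(f"ratio: {se_minority / se_majority:.1f}x")
```

With 100 times fewer examples, anything estimated for the smaller group is about 10 times less certain, because the error shrinks only with the square root of the sample size.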
[Diagram: The Machine Learning Loop]
Predictions - actions - outcome
Photo by Pixabay
[Diagram: The Machine Learning Loop]
House price prediction
If you predict future prices (and publicize them), you create a self-fulfilling
feedback loop: houses with a lower predicted sale price deter buyers,
demand goes down, and the final price is even lower
Photo by Deva Darshan on Unsplash
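A toy simulation of this loop, as a rough sketch only. The pricing rule, the `buyer_sensitivity` parameter, and all numbers are assumptions made up for illustration; they are not taken from any real housing model.

```python
def simulate_published_predictions(initial_prediction: float,
                                   neighbourhood_average: float,
                                   buyer_sensitivity: float,
                                   rounds: int) -> list[float]:
    """Return the sale price after each round of predict, publish, sell, retrain."""
    prediction = initial_prediction
    sale_prices = []
    for _ in range(rounds):
        # Relative shortfall of the published prediction against the local average.
        shortfall = max(0.0, (neighbourhood_average - prediction) / neighbourhood_average)
        # Assumption: buyers read a low prediction as a warning sign and bid lower still.
        sale_price = prediction * (1.0 - buyer_sensitivity * shortfall)
        sale_prices.append(sale_price)
        # The model is then retrained on the very sale it influenced.
        prediction = sale_price
    return sale_prices

prices = simulate_published_predictions(
    initial_prediction=190_000,
    neighbourhood_average=200_000,
    buyer_sensitivity=0.5,
    rounds=5,
)
for round_number, price in enumerate(prices, start=1):
    print(f"round {round_number}: {price:,.0f}")
```

Under these assumptions a house that starts only 5% below the local average loses value in every round, although nothing about the house itself has changed.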
Some communities may be disproportionately targeted, with people being
arrested for crimes that might be ignored in other communities.
Ref.: Saunders, J., Hunt, P. & Hollywood, J. S. J. Exp. Criminol. 12, 347–371 (2016).
Self-fulfilling predictions
Photo by Jacques Tiberi on Pixabay
“Feedback loops occur when data discovered on the
basis of predictions are used to update the model.”
Danielle Ensign et al.,
“Runaway Feedback Loops in Predictive Policing,” 2017
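A deliberately simplified, deterministic sketch of such a loop. It is not the model analysed by Ensign et al.; the two districts, their starting counts, and the "patrol where records are highest" rule are hypothetical.

```python
def simulate_patrols(recorded: list[int],
                     true_daily_incidents: list[int],
                     days: int) -> list[int]:
    """Send one patrol per day to the district with the most recorded incidents."""
    recorded = list(recorded)  # work on a copy, leave the caller's list untouched
    for _ in range(days):
        # The "model": patrol wherever past records are highest.
        target = recorded.index(max(recorded))
        # Incidents are only recorded where the patrol is present.
        recorded[target] += true_daily_incidents[target]
    return recorded

# Both districts have the same true incident rate; district 0 starts one record ahead.
final_counts = simulate_patrols(recorded=[11, 10], true_daily_incidents=[1, 1], days=100)
print(final_counts)
```

District 0 ends with 111 recorded incidents and district 1 still has 10, even though the true rates are identical: the data the model sees are a product of its own earlier predictions.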
[Diagram: The Machine Learning Loop]
Training data encode the demographic disparities in our society, and
some stereotypes can be reinforced by ML (due to feedback loops)
The state of society
Photo by Cory Schadt on Unsplash
Solutions?
Bias may lurk in your data...
Analyze your data
Source: Google Machine Learning Crash Course
★ Are there missing feature values for a large number of observations?
★ Are there features that are missing that might affect other features?
★ Are there any unexpected feature values?
★ What signs of data skew do you see?
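The first and third questions can be checked with a few lines of code before any training happens. This is a minimal sketch in plain Python; the records, feature names, and plausible ranges are invented for illustration.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "income": 52_000, "rooms": 3},
    {"age": None, "income": 48_000, "rooms": 2},
    {"age": 29, "income": None, "rooms": 4},
    {"age": 41, "income": 61_000, "rooms": -1},  # a negative room count is implausible
    {"age": None, "income": 55_000, "rooms": 3},
]

def missing_share(rows, feature):
    """Fraction of rows in which the feature is missing."""
    return sum(row[feature] is None for row in rows) / len(rows)

def out_of_range(rows, feature, low, high):
    """Rows whose non-missing value falls outside the expected [low, high] range."""
    return [row for row in rows
            if row[feature] is not None and not (low <= row[feature] <= high)]

for feature in ("age", "income", "rooms"):
    print(f"{feature}: {missing_share(records, feature):.0%} missing")

print("unexpected rooms:", out_of_range(records, "rooms", 0, 20))
```

What counts as "a large number" of missing values or an "unexpected" value depends on the dataset, so the range passed to `out_of_range` is a judgment call rather than a fixed rule.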
Missing feature values
Source: California Housing dataset,
Google Machine Learning Crash Course
Skewed data (geographical bias)
Source: California Housing dataset,
Google Machine Learning Crash Course
Facets Overview
Source: Facets tool
(https://pair-code.github.io/facets/)
Facets Overview, an
interactive
visualization tool to
explore datasets.
Quickly analyze the
distribution of
values across the
datasets.
Facets Overview
Source: Facets tool
(https://pair-code.github.io/facets/)
⅔ of examples
represent males,
while we would
expect the
breakdown
between
genders to be
closer to 50/50
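The same check can be done without a visual tool. The sketch below compares the observed share of each group with an expected share; the column contents, the 50/50 expectation, and the 10-point tolerance are assumptions for illustration.

```python
from collections import Counter

def group_shares(values):
    """Share of each distinct value in a column."""
    counts = Counter(values)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

def flag_skew(shares, expected, tolerance=0.10):
    """Groups whose observed share differs from the expected share by more than tolerance."""
    flagged = {}
    for group, expected_share in expected.items():
        difference = shares.get(group, 0.0) - expected_share
        if abs(difference) > tolerance:
            flagged[group] = round(difference, 3)
    return flagged

# Hypothetical column in which two thirds of the examples are male.
sex_column = ["Male"] * 200 + ["Female"] * 100

shares = group_shares(sex_column)
skewed = flag_skew(shares, expected={"Male": 0.5, "Female": 0.5})
print(shares)
print(skewed)
```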
Facets Dive
Source: Facets tool
(https://pair-code.github.io/facets/)
The data are faceted by the
marital-status
feature. Males
outnumber females
by more than 5:1.
Married women are
underrepresented in
our data.
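Faceting amounts to counting one feature inside each value of another, which can surface a skew that is hidden in the overall totals. A minimal sketch with made-up (marital status, sex) pairs, not the real dataset:

```python
from collections import Counter

# Hypothetical (marital_status, sex) pairs.
rows = (
    [("Married", "Male")] * 60
    + [("Married", "Female")] * 10
    + [("Never-married", "Male")] * 40
    + [("Never-married", "Female")] * 35
)

def facet_counts(pairs):
    """Count rows for every (facet value, group) combination."""
    return Counter(pairs)

def ratio_within_facet(counts, facet, numerator, denominator):
    """Ratio of two groups inside a single facet value."""
    return counts[(facet, numerator)] / counts[(facet, denominator)]

counts = facet_counts(rows)
for facet in ("Married", "Never-married"):
    ratio = ratio_within_facet(counts, facet, "Male", "Female")
    print(f"{facet}: {ratio:.2f} males per female")
```

In this invented sample the overall male-to-female ratio is about 2:1, but inside the "Married" facet it is 6:1, which is the kind of subgroup gap that a single overall count would miss.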
“What-if” tool
Analyze an ML model
without writing code.
Given pointers to a TF
model and a dataset,
the What-If Tool
offers an interactive
visual interface for
exploring model
results.
Counterfactuals
It is possible to
compare a datapoint
to the most similar
point where your
model predicts a
different result.
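The idea behind a nearest counterfactual can be sketched in a few lines. This is not the What-If Tool's implementation: `toy_model` is a hypothetical stand-in for a trained classifier, and the unnormalized Euclidean distance is an assumption made to keep the example short.

```python
import math

def toy_model(point):
    """Hypothetical stand-in for a trained classifier: predicts '>50K' or '<=50K'."""
    score = 0.03 * point["age"] + 0.02 * point["hours_per_week"]
    return ">50K" if score >= 2.0 else "<=50K"

def distance(a, b, features):
    """Plain Euclidean distance over the chosen features (no normalization)."""
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))

def nearest_counterfactual(selected, dataset, model, features):
    """Closest datapoint for which the model predicts a different result, or None."""
    target = model(selected)
    candidates = [p for p in dataset if model(p) != target]
    if not candidates:
        return None
    return min(candidates, key=lambda p: distance(selected, p, features))

dataset = [
    {"age": 30, "hours_per_week": 40},
    {"age": 42, "hours_per_week": 40},
    {"age": 55, "hours_per_week": 60},
    {"age": 25, "hours_per_week": 20},
]
selected = {"age": 38, "hours_per_week": 40}

match = nearest_counterfactual(selected, dataset, toy_model, ["age", "hours_per_week"])
print(toy_model(selected), "->", match, toy_model(match))
```

In practice features would be scaled before measuring distance, otherwise the feature with the largest numeric range dominates the comparison.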
Counterfactuals
a minor difference in
age and an
occupation change
flipped the model’s
prediction (earning
>50K)
Edit a datapoint
Edit a datapoint and see
how your model performs.
Edit, add or remove
features or feature values
for any selected datapoint
and then run inference to
test model performance.
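Editing a datapoint and re-running inference is a simple perturbation test. A minimal sketch, again with a hypothetical `toy_model` standing in for a real classifier and made-up feature values:

```python
def toy_model(point):
    """Hypothetical stand-in for a trained classifier."""
    score = 0.03 * point["age"] + 0.02 * point["hours_per_week"]
    return ">50K" if score >= 2.0 else "<=50K"

def edit_and_predict(point, model, **changes):
    """Return (edited copy, prediction before the edit, prediction after the edit)."""
    edited = {**point, **changes}  # the original datapoint is left untouched
    return edited, model(point), model(edited)

original = {"age": 38, "hours_per_week": 40}
edited, before, after = edit_and_predict(original, toy_model, hours_per_week=50)

print("before:", before)
print("after: ", after)
```

Running the same edit on a sensitive feature, or on a likely proxy for one, shows whether that feature alone can flip the prediction.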
★ Measurement is crucial
★ Know your data (and how data were collected and annotated)
★ Try to discover hidden biases (missing values, data skew, subgroups, etc.)
★ Ask questions. Don’t train the model and then walk away
★ Avoid feedback loops
★ Use tools that allow you to do such investigation
Key Takeaways
Thanks!
@azzurraragone
❏ AI can be sexist and racist — it’s time to make it fair James Zou &
Londa Schiebinger - Nature 559, 324-326 (2018)
❏ The Master Algorithm Pedro Domingos, 2015
❏ Fairness and Machine Learning S. Barocas, M. Hardt, A. Narayanan
❏ No Classification without Representation: Assessing Geodiversity
Issues in Open Data Sets for the Developing World Shreya Shankar,
Yoni Halpern, Eric Breck, James Atwood, Jimbo Wilson, D. Sculley
❏ Man is to computer programmer as woman is to homemaker?
Debiasing word embeddings T. Bolukbasi, K.-W. Chang, J. Y. Zou, V.
Saligrama, A. T. Kalai. Adv. Neural Inf. Process. Syst. 2016,
4349–4357 (2016)
References
❏ There is a blind spot in AI research, Kate Crawford & Ryan Calo, Nature
538, 311–313 (20 October 2016)
❏ Semantics Derived Automatically from Language Corpora Contain
Human-Like Biases, Aylin Caliskan, Joanna J. Bryson, and Arvind
Narayanan, Science 356, no. 6334 (2017): 183–86
❏ Predictions Put Into Practice: a Quasi-experimental Evaluation of
Chicago's Predictive Policing Pilot Saunders, J., Hunt, P. & Hollywood,
J. S. J. Exp. Criminol. 12, 347–371 (2016).
❏ Runaway Feedback Loops in Predictive Policing Danielle Ensign et al.
arXiv:1706.09847
References
❏ Object Recognition by Scene Alignment. B. C. Russell, A. Torralba, C.
Liu, R. Fergus, W. T. Freeman. Advances in Neural Information
Processing Systems, 2007.
❏ Fair Is Not the Default (https://design.google/library/fair-not-default/)
❏ “Playing with fairness” - David Weinberger.
❏ Google Machine Learning Crash Course
❏ What-if tool: https://pair-code.github.io/what-if-tool/
❏ Facets tool: https://pair-code.github.io/facets/
References
APPENDIX
★ Group unaware: disregard the gender mix of the applicants and exclude gender and
gender-proxy information from the data set
★ Group thresholds: adjust the confidence thresholds for different groups
independently
★ Demographic parity: the composition of the approved set should reflect the
composition of the applicant pool
★ Equal opportunity: individuals who qualify for a desirable outcome should have an
equal chance of being correctly classified for this outcome (equal true positive rate)
★ Equal accuracy: the system ought to be tuned so that the share of decisions it gets
wrong, across approvals and denials together, is the same for both groups (equal rate
of false positives + false negatives)
Types of Fairness
“Playing with fairness”
by David Weinberger.
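Three of these definitions can be compared on the same set of decisions. The sketch below computes, per group, the approval rate (one common way to check demographic parity), the true positive rate (equal opportunity), and the accuracy (equal accuracy). The groups "A" and "B" and every decision in the list are invented for illustration.

```python
def rate(numerator, denominator):
    """Safe division that returns NaN for an empty denominator."""
    return numerator / denominator if denominator else float("nan")

def group_metrics(records):
    """records: iterable of (group, actual, predicted) tuples with 0/1 labels."""
    stats = {}
    for group, actual, predicted in records:
        s = stats.setdefault(
            group, {"n": 0, "approved": 0, "qualified": 0, "tp": 0, "correct": 0})
        s["n"] += 1
        s["approved"] += predicted
        s["qualified"] += actual
        s["tp"] += actual * predicted          # qualified and approved
        s["correct"] += int(actual == predicted)
    return {
        group: {
            "approval_rate": rate(s["approved"], s["n"]),
            "true_positive_rate": rate(s["tp"], s["qualified"]),
            "accuracy": rate(s["correct"], s["n"]),
        }
        for group, s in stats.items()
    }

# Hypothetical decisions: (group, actually qualified, approved by the model).
decisions = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 0), ("A", 0, 0), ("A", 0, 1),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0), ("B", 0, 0),
]

metrics = group_metrics(decisions)
for group, values in metrics.items():
    print(group, {name: round(value, 2) for name, value in values.items()})
```

In this invented sample, group A does better on approval rate and true positive rate while group B does better on accuracy, which illustrates why the choice of definition matters: a system can look fair under one of them and unfair under another.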
Computer scientist Arvind Narayanan gave a talk:
“21 fairness definitions and their politics”
Watch it on Youtube!
★ Reporting bias (ex. book reviews)
★ Automation bias
★ Selection bias (ex. phone survey):
○ Coverage bias
○ Non-response bias
○ Sampling bias
★ Group attribution bias (ex. university)
★ Implicit bias (ex. Confirmation bias)
Types of Bias

Don't blindly trust your ML System, it may change your life (Azzurra Ragone, Independent consultant)

  • 25. State of the world Data Individuals Model Measurement Learning Action Feedback The Machine Learning Loop
  • 26. ML will extract stereotypes the same way that it extracts knowledge
  • 27. ML works better with more data, so it will work less well for members of minority groups Sample size disparity Training set Training data
  • 28. State of the world Data Individuals Model Measurement Learning Action Feedback The Machine Learning Loop
  • 29. Predictions - actions - outcome Photo by Pixabay
  • 30. State of the world Data Individuals Model Measurement Learning Action Feedback The Machine Learning Loop
  • 31. If you predict future prices (and publicize them) you create a self-fulfilling feedback loop: houses with lower predicted sale prices deter buyers, demand goes down and the final price is even lower House price prediction Photo by Deva Darshan on Unsplash
  • 32. Some communities may be disproportionately targeted, with people being arrested for crimes that might be ignored in other communities. Ref.: Saunders, J., Hunt, P. & Hollywood, J. S. J. Exp. Criminol. 12, 347–371 (2016). Self-fulfilling predictions Photo by Jacques Tiberi on Pixabay
  • 33. “Feedback loops occur when data discovered on the basis of predictions are used to update the model.” Danielle Ensign et al., “Runaway Feedback Loops in Predictive Policing,” 2017
  • 34. State of the world Data Individuals Model Measurement Learning Action Feedback The Machine Learning Loop
  • 35. Training data encode the demographic disparities in our society, and some stereotypes can be reinforced by ML (due to feedback loops) The state of society Photo by Cory Schadt on Unsplash
  • 37. Bias may lurk in your data...
  • 38. Analyze your data Source: Google Machine Learning Crash Course ★ Are there missing feature values for a large number of observations? ★ Are there features that are missing that might affect other features? ★ Are there any unexpected feature values? ★ What signs of data skew do you see?
  • 39. Missing feature values Source: California Housing dataset, Google Machine Learning Crash Course
  • 40. Skewed data (geographical bias) Source: California Housing dataset, Google Machine Learning Crash Course
  • 41. Facets Overview Source: Facets tool (https://pair-code.github.io/facets/) Facets Overview, an interactive visualization tool to explore datasets. Quickly analyze the distribution of values across the datasets.
  • 42. Facets Overview Source: Facets tool (https://pair-code.github.io/facets/) ⅔ of examples represent males, while we would expect the breakdown between genders to be closer to 50/50
  • 43. Facets Dive Source: Facets tool (https://pair-code.github.io/facets/) Data are faceted by the marital-status feature. Males outnumber females by more than 5:1. Married women are underrepresented in our data.
  • 44. “What-if” tool Analyze ML model without writing code. Given pointers to a TF model and a dataset, the What-If Tool offers an interactive visual interface for exploring model results.
  • 45. Counterfactuals It is possible to compare a datapoint to the most similar point where your model predicts a different result.
  • 46. Counterfactuals a minor difference in age and an occupation change flipped the model’s prediction (earning >50K)
  • 47. Edit a datapoint Edit a datapoint and see how your model performs. Edit, add or remove features or feature values for any selected datapoint and then run inference to test model performance.
  • 48. ★ Measurement is crucial ★ Know your data (and how data were collected and annotated) ★ Try to discover hidden biases (missing values, data skew, subgroups, etc.) ★ Ask questions. Don’t train the model and then walk away ★ Avoid feedback loops ★ Use tools that allow you to do such investigations Key Takeaways
  • 50. ❏ AI can be sexist and racist — it’s time to make it fair James Zou & Londa Schiebinger - Nature 559, 324-326 (2018) ❏ The Master Algorithm Pedro Domingos, 2015 ❏ Fairness and Machine Learning S. Barocas, M. Hardt, A. Narayanan ❏ No Classification without Representation: Assessing Geodiversity Issues in Open Data Sets for the Developing World Shreya Shankar, Yoni Halpern, Eric Breck, James Atwood, Jimbo Wilson, D. Sculley ❏ Man is to computer programmer as woman is to homemaker? Debiasing word embeddings T. Bolukbasi, K.-W. Chang, J. Y. Zou, V. Saligrama, A. T. Kalai. Adv. Neural Inf. Process. Syst. 2016, 4349–4357 (2016) References
  • 51. ❏ There is a blind spot in AI research, Kate Crawford & Ryan Calo, Nature 538, 311–313 (20 October 2016) ❏ Semantics Derived Automatically from Language Corpora Contain Human-Like Biases, Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan, Science 356, no. 6334 (2017): 183–86 ❏ Predictions Put Into Practice: a Quasi-experimental Evaluation of Chicago's Predictive Policing Pilot Saunders, J., Hunt, P. & Hollywood, J. S. J. Exp. Criminol. 12, 347–371 (2016). ❏ Runaway Feedback Loops in Predictive Policing Danielle Ensign et al. arXiv:1706.09847 References
  • 52. ❏ Object Recognition by Scene Alignment. B. C. Russell, A. Torralba, C. Liu, R. Fergus, W. T. Freeman. Advances in Neural Information Processing Systems, 2007. ❏ Fair Is Not the Default (https://design.google/library/fair-not-default/) ❏ “Playing with fairness” - David Weinberger. ❏ Google Machine Learning Crash Course ❏ What-if tool: https://pair-code.github.io/what-if-tool/ ❏ Facets tool: https://pair-code.github.io/facets/ References
  • 54. ★ Group unaware: disregard the gender mix of the applicants, exclude gender and gender-proxy information from the data set ★ Group thresholds: adjust the confidence thresholds for different groups independently ★ Demographic parity: The composition of the set should reflect the percentage of applicants ★ Equal opportunity: Individuals who qualify for a desirable outcome should have an equal chance of being correctly classified for this outcome (= equal true positive rate) ★ Equal accuracy: the system ought to be tuned so that the percentage of times it's wrong in the total of approvals and denials is the same for both groups (= equal rate of false positives + false negatives) Types of Fairness “Playing with fairness” by David Weinberger.
  • 55. Computer scientist Arvind Narayanan gave a talk: “21 fairness definitions and their politics” Watch it on Youtube!
  • 56. ★ Reporting bias (ex. book reviews) ★ Automation bias ★ Selection bias (ex. phone survey): ○ Coverage bias ○ Non-response bias ○ Sampling bias ★ Group attribution bias (ex. university) ★ Implicit bias (ex. Confirmation bias) Types of Bias
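
Three of the criteria on the "Types of Fairness" slide can be checked with simple per-group rates. The sketch below is only an illustration under stated assumptions: the function name `group_rates` and the toy records are hypothetical, labels are binary (1 = desirable outcome), and selection rate, true positive rate and accuracy stand in for demographic parity, equal opportunity and equal accuracy respectively.

```python
from collections import defaultdict


def group_rates(records):
    """Compute per-group rates from (group, y_true, y_pred) triples with 0/1 labels."""
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "pos": 0, "tp": 0, "correct": 0})
    for group, y_true, y_pred in records:
        c = counts[group]
        c["n"] += 1
        c["pred_pos"] += y_pred                       # approved by the model
        c["pos"] += y_true                            # actually qualified
        c["tp"] += 1 if (y_true == 1 and y_pred == 1) else 0
        c["correct"] += 1 if y_true == y_pred else 0
    rates = {}
    for group, c in counts.items():
        rates[group] = {
            # Demographic parity compares this value across groups.
            "selection_rate": c["pred_pos"] / c["n"],
            # Equal opportunity compares this value across groups.
            "true_positive_rate": c["tp"] / c["pos"] if c["pos"] else None,
            # Equal accuracy compares this value across groups.
            "accuracy": c["correct"] / c["n"],
        }
    return rates


# Hypothetical toy data: (group, true label, predicted label).
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 1), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]
rates = group_rates(records)
for group, r in sorted(rates.items()):
    print(group, r)
```

In this toy data both groups have the same accuracy, yet group A is selected three times as often and has twice the true positive rate of group B, so satisfying one criterion does not imply satisfying the others.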
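
The "Analyze your data" questions (missing feature values, data skew) can also be asked in a few lines of plain Python before reaching for a visual tool. This is a minimal sketch, not the Facets implementation: the helper names `missing_fraction` and `value_shares` and the six toy rows are assumptions made for illustration.

```python
from collections import Counter


def missing_fraction(rows, feature):
    """Share of rows where `feature` is absent, None, or an empty string."""
    missing = sum(1 for row in rows if row.get(feature) in (None, ""))
    return missing / len(rows)


def value_shares(rows, feature):
    """Share of each observed value of `feature`, ignoring missing entries."""
    values = [row[feature] for row in rows if row.get(feature) not in (None, "")]
    counts = Counter(values)
    return {value: n / len(values) for value, n in counts.items()}


# Hypothetical toy rows; a real audit would load the full dataset instead.
rows = [
    {"sex": "Male", "age": 39},
    {"sex": "Male", "age": None},
    {"sex": "Female", "age": 28},
    {"sex": "Male", "age": 52},
    {"sex": "Female", "age": 31},
    {"sex": "Male", "age": 45},
]

age_missing = missing_fraction(rows, "age")
sex_shares = value_shares(rows, "sex")
print(f"missing age: {age_missing:.2f}")
print("sex shares:", sex_shares)
```

A share far from what you would expect in the target population (for example two thirds male where roughly half is expected) is a prompt to ask how the data were collected, not proof of bias by itself.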
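
The counterfactual idea from the What-If Tool slides (compare a datapoint to the most similar point where the model predicts a different result) can be sketched as a nearest-neighbour search. Everything named here is an assumption for illustration: `toy_predict` is a hypothetical stand-in for a trained model, the features are made up, and plain L1 distance on unscaled features is used only to keep the example short.

```python
def l1_distance(a, b):
    """Sum of absolute differences over the numeric features of two datapoints."""
    return sum(abs(a[key] - b[key]) for key in a)


def nearest_counterfactual(point, dataset, predict, distance=l1_distance):
    """Return the datapoint closest to `point` whose prediction differs, or None."""
    target = predict(point)
    candidates = [other for other in dataset if predict(other) != target]
    if not candidates:
        return None
    return min(candidates, key=lambda other: distance(point, other))


def toy_predict(p):
    """Hypothetical model: a fixed linear rule, not a trained classifier."""
    return ">50K" if p["hours"] + 2 * p["education_years"] >= 70 else "<=50K"


dataset = [
    {"age": 30, "hours": 40, "education_years": 12},
    {"age": 42, "hours": 45, "education_years": 16},
    {"age": 33, "hours": 44, "education_years": 13},
    {"age": 58, "hours": 20, "education_years": 10},
]
query = {"age": 31, "hours": 40, "education_years": 13}

match = nearest_counterfactual(query, dataset, toy_predict)
print("query prediction:", toy_predict(query))
print("nearest counterfactual:", match, "->", toy_predict(match))
```

In practice features would usually be normalized before computing distances, so that a feature with a large numeric range does not dominate the search.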