Shift AI was a success, connecting hundreds of professionals who were eager to propel the progress of AI and discuss the newest technologies in data mining, machine learning and neural networks. More at https://ai.shiftconf.co/.
Talk description:
With all the breakthroughs in the Machine Learning space, ML models are now, more than ever, being used to make decisions that affect human lives. Judging the quality of a model can therefore no longer rely on accuracy, precision, and recall alone. It is important to establish that every individual and group of people is treated equitably, without perpetuating any historical bias present in the data. This talk focuses on some of the many potential ways to establish fairness metrics for ML models in your organization, along with the learnings and challenges I encountered while building a fairness tool for data scientists and business stakeholders.
Demo: The Algorithmic Fairness Tool (AFT) was an innovation project at Accenture The Dock, focused on taking the latest research from academia and building a tool for industry.
2. Agenda
01 Algorithmic Bias: Why is it important?
02 Set the scene: Fairness Definitions
03 Introduction to Fairness Metrics
04 Introduction to the Accenture Fairness Tool
3. Introduction | Context
Our society's growing reliance on algorithmic decision making, particularly in social and economic areas, has raised a concern that these systems may inadvertently discriminate against certain groups.
"Business needs to consider society as a stakeholder." - Cennydd Bowles, Future Ethics
Objective decision-making is a challenge.
"Algorithmic Fairness is a practice that aims to mitigate unconscious bias against any individual or group of people in Machine Learning."
4. "Data biases are inevitable. We must design algorithms that account for them."
VS
"The model summarizes the data correctly. If the data is biased, it is not the algorithm's fault."
From "Tutorial: 21 fairness definitions and their politics" on YouTube
6. Definitions:
Protected feature:
"The Treaty on the Functioning of the European Union (TFEU) prohibits discrimination on grounds of nationality. It also encourages combating discrimination based on sex, racial or ethnic origin, religion or belief, disability, age or sexual orientation."
Status definition: Privileged vs Unprivileged
Example: In criminal risk assessment tools, a common example of a protected feature is race, with the designated levels: white defendants (privileged group) vs black defendants (unprivileged group).
Group Bias vs Individual Bias:
"Group fairness approaches partition the population/sample into groups and seek to equalize a statistical measure across the groups."
"Individual fairness seeks to understand whether similar individuals are treated similarly, irrespective of their membership of any of the groups."
7. Metrics Introduction
Myriad of metrics: which one to choose?
• Mutual Information: identifies proxies for the protected feature
• Prevalence Analysis: fraction of a population that satisfies a given outcome
• Disparate Impact: quantifies the disparity of outcomes for different protected groups
• Predictive Parity, False Positive Rates: proportion of all negatives that still yield positive test outcomes
• Predictive Parity, True Positive Rates
• Predictive Parity, Positive Predictive Power
• Predictive Parity, False Negative Rates
• Individual fairness
• …
From "Tutorial: 21 fairness definitions and their politics" on YouTube
8. Mutual Information
Approach: Quantifies the amount of information obtained about one random variable by observing another random variable.
Mutual Information for a protected variable assesses the relationship between the protected variable and the unprotected variables (which could be used in the model build as proxies for sensitive ones and so introduce bias).
Objectives:
• Identify proxies
• Provoke further analysis:
  • Is 'blindness' w.r.t. the protected feature enough?
  • What is the predictive power of the proxies with respect to the model's target?
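To make the proxy-detection idea concrete, here is a minimal sketch (not the actual AFT implementation) using scikit-learn's mutual_info_classif; the dataframe and the column name "race" are illustrative assumptions, and candidate features are assumed to be categorical or already discretized.

```python
# Minimal sketch of proxy detection via mutual information (illustrative, not the AFT).
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

def proxy_scores(df: pd.DataFrame, protected: str) -> pd.Series:
    """Mutual information between each candidate feature and the protected feature."""
    X = df.drop(columns=[protected]).copy()
    for col in X.columns:                       # encode categoricals as integer codes
        if X[col].dtype == object:
            X[col] = pd.factorize(X[col])[0]
    y = pd.factorize(df[protected])[0]
    scores = mutual_info_classif(X, y, discrete_features=True, random_state=0)
    return pd.Series(scores, index=X.columns).sort_values(ascending=False)

# Features with high scores may act as proxies for the protected feature, so simply
# dropping the protected column ('blindness') is not enough:
# proxy_scores(defendants_df, protected="race")
```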
9. Prevalence
Approach: The prevalence of a certain outcome in a certain population can be defined as the fraction of that population that satisfies the given outcome, say Y = reoffended. The Prevalence Ratio for a given protected variable is then defined as the ratio of the prevalence in the privileged population to the prevalence in the unprivileged population. The Prevalence Ratio is calculated on the ground truth. For example:
Prevalence Ratio (White vs Black) = prevalence(White) / prevalence(Black) = 34%
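As an illustration, here is a minimal sketch of the Prevalence Ratio calculation in Python, using the ground-truth counts from the confusion matrices shown on the following slides (with these counts the ratio comes out near 36%, close to the 34% quoted above).

```python
# Minimal sketch of the Prevalence Ratio, computed on ground-truth labels (not predictions).
recidivated = {"White": 147 + 94, "Black": 677 + 681}   # FN + TP per group (actual reoffenders)
total       = {"White": 2_154,    "Black": 4_359}        # group population sizes

prevalence = {g: recidivated[g] / total[g] for g in total}
prevalence_ratio = prevalence["White"] / prevalence["Black"]   # privileged / unprivileged
print(f"Prevalence Ratio (White vs Black) = {prevalence_ratio:.0%}")   # ~36% with these counts
```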
10. Disparate Impact
Approach: Unintentional bias is encoded via disparate impact, which occurs when a selection process has widely different outcomes for different groups, even as it appears to be neutral.
Calculation: Disparate Impact for a protected variable is the ratio of the % of the privileged population with a predicted outcome to the % of the unprivileged population with that predicted outcome.
US Law: Originally, the Uniform Guidelines on Employee Selection Procedures provided a simple "80 percent" rule for determining that a company's selection system was having an "adverse impact" on a minority group.
Cautionary points: "Courts in the U.S. have questioned the arbitrary nature of the 80 percent rule."
Black (Total Pop = 4,359)    Predicted: low risk    Predicted: high risk
  Did not recidivate         TN = 2,812             FP = 189
  Recidivated                FN = 677               TP = 681
White (Total Pop = 2,154)    Predicted: low risk    Predicted: high risk
  Did not recidivate         TN = 1,886             FP = 26
  Recidivated                FN = 147               TP = 94
% of Black population predicted high risk: (189 + 681) / 4,359 = 0.20
% of White population predicted high risk: (26 + 94) / 2,154 = 0.06
DI = 0.06 / 0.20 = 30%
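A minimal sketch of the same Disparate Impact calculation in Python, using the counts from the tables above (illustrative only):

```python
# Minimal sketch of Disparate Impact: ratio of predicted high-risk rates across groups.
predicted_high_risk = {"White": 26 + 94, "Black": 189 + 681}   # FP + TP per group
total               = {"White": 2_154,   "Black": 4_359}

rate = {g: predicted_high_risk[g] / total[g] for g in total}
disparate_impact = rate["White"] / rate["Black"]   # privileged rate / unprivileged rate
print(f"Disparate Impact = {disparate_impact:.0%}")   # ~28% exactly; ~30% with the rounded rates above
```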
11. False Positive Rate (FPR)
Approach: Parity for False Positive Rates (FPR) implies that the false positive rates are equal among the privileged and unprivileged populations.
Calculation: The Error Ratio for a given protected variable is defined as the ratio of the error rate in the privileged population to the error rate in the unprivileged population.
Legislation: There is no legal precedent for the error ratio; however, an approach similar to DI can be applied by using the 80% rule (bias when the ratio of rates <= 0.8).
Black (Total Pop = 4,359)    Predicted: low risk    Predicted: high risk
  Did not recidivate         TN = 2,812             FP = 189
  Recidivated                FN = 677               TP = 681
White (Total Pop = 2,154)    Predicted: low risk    Predicted: high risk
  Did not recidivate         TN = 1,886             FP = 26
  Recidivated                FN = 147               TP = 94
FPR (Black) = 189 / (2,812 + 189) = 0.063
FPR (White) = 26 / (1,886 + 26) ≈ 0.014
Error Ratio ≈ 0.014 / 0.063 ≈ 0.22
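And the corresponding sketch for False Positive Rate parity, again with the counts from the tables above:

```python
# Minimal sketch of FPR parity: FPR = FP / (FP + TN) per group, then the Error Ratio.
fp = {"White": 26,    "Black": 189}
tn = {"White": 1_886, "Black": 2_812}

fpr = {g: fp[g] / (fp[g] + tn[g]) for g in fp}
error_ratio = fpr["White"] / fpr["Black"]   # privileged FPR / unprivileged FPR
print(f"FPR White = {fpr['White']:.3f}, FPR Black = {fpr['Black']:.3f}")
print(f"Error Ratio = {error_ratio:.2f}")   # ~0.22, well below the 80% threshold
```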
12. Learnings
There is now a consensus among academics and data scientists that algorithmic fairness cannot be achieved by the application of data science alone.
The complexity of choosing the right solution, for example to allow for both group and individual fairness, to trade off accuracy against fairness, or to handle the complexities that come with scale, is a challenge in itself.
All of this is further compounded by a myriad of non-science factors: what may be statistically fair can quite often fall short ethically, or may not be viable from a business perspective.
"Bias is a feature of statistical models. Fairness is a feature of human value judgments."
https://www.semanticscholar.org/paper/Fairness-aware-machine-learning%3A-a-perspective-Žliobaitė/69c7bf934e9ac7673be590f7656bcb38fcb9da48
13. What are the main challenges we encountered when assessing real-life use cases for potential bias?
• Metric selection
• The academia-industry gap
• Non-binary protected features
• More than one protected feature
• Legislation and guidelines
14. Accenture Fairness Tool: How does the tool work?
Data scientists can solve many fairness problems from a technical perspective by using statistical metrics, but this is not just a data science problem; it requires input from the broader organisation.
The tool starts with the data scientist and is integrated with JupyterHub. We want to add fairness as a step in the current data science workflow. Analyses are pushed to a repository for business users.
The business user can explore the interactive analyses and embed them in reports for dissemination to the broader business for decision making. As a communication tool, it facilitates a deeper understanding of the challenge.
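The slides do not show the tool's internals, but purely as an illustrative sketch of what "fairness as a step in the workflow" could look like from a notebook, the hypothetical helper below computes a couple of the group metrics discussed earlier and writes them to a report file that could be pushed to a shared repository for business users; the function name, metric selection, and output format are all assumptions, not the actual Accenture Fairness Tool API.

```python
# Hypothetical sketch only; not the Accenture Fairness Tool API.
import json
import pandas as pd

def fairness_report(y_true, y_pred, protected, privileged, out_path="fairness_report.json"):
    """Compute per-group selection rates and false positive rates, plus disparate impact,
    and write them to a JSON report a business user could review alongside accuracy."""
    # Group labels are assumed to be strings (e.g. "White", "Black").
    df = pd.DataFrame({"y_true": y_true, "y_pred": y_pred, "group": protected})
    report = {}
    for group, sub in df.groupby("group"):
        tn = int(((sub.y_true == 0) & (sub.y_pred == 0)).sum())
        fp = int(((sub.y_true == 0) & (sub.y_pred == 1)).sum())
        report[group] = {
            "selection_rate": float(sub.y_pred.mean()),
            "false_positive_rate": fp / (fp + tn) if (fp + tn) else None,
        }
    # Disparate impact: privileged selection rate over each unprivileged group's rate,
    # matching the definition used earlier in the talk.
    report["disparate_impact"] = {
        g: report[privileged]["selection_rate"] / report[g]["selection_rate"]
        for g in report if g != privileged and report[g]["selection_rate"]
    }
    with open(out_path, "w") as f:
        json.dump(report, f, indent=2)
    return report
```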