2. Automated Decision Making: Pros
Handles large volumes of data (Google search, airline reservations, online markets, ...)
Avoids certain kinds of bias
Parole judges being more lenient after a meal
Making hiring decisions based on the name of the person
Subjectivity in evaluations of papers, music, teaching, etc.
Human judgment in NYC's stop-and-frisk policy:
4.4 million people were stopped between 2004 and 2012
88% of the stops led to no further action
83% of the people stopped were Black or Hispanic, though these groups make up only about half of the population
3. Complex and Opaque Decisions
Decisions are hard to understand and make sense of
Values, biases, and potential discrimination are built in
The code is opaque and often a trade secret
Examples: Facebook's newsfeed algorithm, recidivism algorithms, genetic testing
4. Gatekeeping Function
Algorithms decide what gets attention, what is published, and what is censored
Google's search results for geopolitical queries might depend on location, e.g., showing different maps of Pakistan or India
Learning algorithms that make hiring decisions (a minimal sketch follows):
Pattern: low commute time predicts low turnover
Policy: don't hire from far-off places with bad public transportation
Impact: people from poor and far-off neighborhoods may not be hired
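To make the pattern concrete, here is a minimal sketch in Python, with invented data and a hypothetical "commute time" feature, of how a model trained only on commute time can reproduce the neighborhood-based impact described above:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented historical data: longer commutes correlate with turnover.
rng = np.random.default_rng(0)
commute_min = rng.uniform(5, 90, 1000)                      # minutes to work
left_within_year = (rng.uniform(0, 90, 1000) < commute_min).astype(int)

# Neighborhood is never an input, yet the model learns its proxy.
model = LogisticRegression().fit(commute_min.reshape(-1, 1), left_within_year)

# Applicants from far-off, transit-poor neighborhoods all have long
# commutes, so the model scores them as high turnover risk.
print(model.predict_proba(np.array([[75.0], [85.0]]))[:, 1])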
5. Subjective Decision Making
Algorithms now understand and translate language, drive cars, pilot planes, and diagnose diseases
Some decisions have no single right answer; they call for judgment and values
Example: detecting and removing terrorist content on social networks
The definitions of key terms such as 'terrorist' and 'extreme content' are controversial
The scale makes manual intervention difficult
Algorithmic decisions may not be as good as people's
6. Machine Learning
Programs might use protected attributes such as race and gender to make predictions
Even if protected attributes are not used, programs may rely on "proxy" attributes with the same effect, e.g., zip code
Recommendations based on earlier actions can create bubbles, e.g., detecting trends on Twitter
Example: predictive policing (simulated below)
Predicting the neighborhoods most likely to be involved in future crime based on crime statistics
Rational, but may be indistinguishable from racial profiling
More police in a neighborhood lead to more arrests there
This can create a positive feedback loop and become a self-fulfilling prophecy
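The feedback loop can be seen in a toy Python simulation: two neighborhoods with identical true crime rates, where patrols are allocated in proportion to past arrest counts (all numbers below are invented):

import random
random.seed(1)

true_crime_rate = [0.3, 0.3]   # identical underlying crime rates
arrests = [60, 40]             # a historical imbalance in the records

for year in range(10):
    # Predictive allocation: patrols follow past arrest counts.
    patrols = [100 * a / sum(arrests) for a in arrests]
    for i in range(2):
        # A crime is recorded only if a patrol is there to observe it.
        arrests[i] += sum(1 for _ in range(int(patrols[i]))
                          if random.random() < true_crime_rate[i])

print(arrests)  # the initially over-policed neighborhood pulls further ahead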
7. Data Privacy
Who owns your browser data?
Can your insurance company get access to your grocery list or peek into your fridge?
Can hospitals get access to consumer data to predict who is going to get sick?
Can your employer access your grades?
8. Transparency and Notification
If the algorithm is opaque, there is no understanding of or trust in the program, e.g., in medical or hiring decisions
Google's search algorithm was judged not demonstrably anti-competitive in the US
The European Commission, however, successfully pursued an antitrust investigation
Many points require trust: the algorithm, its input, the learning data, the control surfaces, the assumptions and models the algorithm uses, etc.
Complete transparency makes a system vulnerable to hacking, and it does not guarantee scrutiny
Consumers might demand the right to be notified when their information is used, or demand that their personal information be excluded
9. Algorithmic Accountability
How search engines censor violent or sexual search terms
What influences Facebook's newsfeed program or Google's advertisements
We need causal explanations that link our digital experience to the data it is based upon
10. Government Regulation
The destabilizing effect of high-speed trading systems led to demands for transparency of these algorithms and for the ability to modify them
Should search algorithms be forced to follow some "search neutrality" rules?
This would require public officials to have access to the program and to modify it in the public interest
There is no one right answer to the queries Google handles, which makes regulation difficult
11. Case Study: Recidivism Assessment
COMPAS is a program that assesses the recidivism risk of prisoners: their propensity to commit a crime within 3 years of release
ProPublica analyzed data on about 10,000 defendants in a Florida county
Predictions are summarized in a confusion table (below); there is one such table for Blacks and another for Whites, and the threshold θ is chosen for each group separately
False Positive Rate: FPR = FP / (FP + TN)
Positive Predictive Value: PPV = TP / (TP + FP)
ProPublica: FPR(Blacks) ≈ 2 × FPR(Whites)
Northpointe: PPV(Blacks) = PPV(Whites)
Recidivism | Score ≤ θ | Score > θ
False      | TN        | FP
True       | FN        | TP
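A small Python helper makes the two metrics concrete; the confusion-table counts below are made up so that the PPVs match while the Black FPR is double the White FPR, mirroring the two findings above:

def rates(tp, fp, tn, fn):
    fpr = fp / (fp + tn)   # False Positive Rate
    ppv = tp / (tp + fp)   # Positive Predictive Value
    return fpr, ppv

# One confusion table per group (tp, fp, tn, fn); counts are invented.
groups = {"White": (300, 200, 800, 300), "Black": (600, 400, 600, 400)}
for name, counts in groups.items():
    fpr, ppv = rates(*counts)
    print(f"{name}: FPR = {fpr:.2f}, PPV = {ppv:.2f}")
# White: FPR = 0.20, PPV = 0.60;  Black: FPR = 0.40, PPV = 0.60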
12. Conflicting Demands on Fairness
[Figure: predicted-positive outcomes for the White and Black groups among Recidivism = True and False cases; red = false positives (FP), blue = true positives (TP)]
Assumptions:
The prevalence, or rate of recidivism, is higher for one group (say, Blacks)
Positive Predictive Value PPV = TP / (TP + FP) is the same for both groups
Then the False Positive Rate FPR = FP / (FP + TN) is higher for Blacks
13. Fairness of Recidivism Scores
Recidivism | Low Score | High Score
False      | TN        | FP
True       | FN        | TP
False Positive Rate: FPR = FP / (FP + TN)
False Negative Rate: FNR = FN / (FN + TP)
Prevalence: p = (FN + TP) / (FN + FP + TN + TP)
These quantities are linked by the identity (derived below):
FPR = [p / (1 − p)] · [(1 − PPV) / PPV] · (1 − FNR)
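The identity follows directly from the definitions above; a short derivation, writing N for the group's total count:

\begin{align*}
TP  &= pN\,(1 - FNR) && \text{since } FNR = \tfrac{FN}{FN + TP} \text{ and } FN + TP = pN \\
FP  &= TP \cdot \frac{1 - PPV}{PPV} && \text{since } PPV = \tfrac{TP}{TP + FP} \\
FPR &= \frac{FP}{(1 - p)N} = \frac{p}{1 - p} \cdot \frac{1 - PPV}{PPV} \cdot (1 - FNR) && \text{since } FP + TN = (1 - p)N
\end{align*}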
Conclusion: if the prevalence p differs between two groups and the PPVs are the same, then FNR or FPR or both must differ (see the numeric check below)
The differences in FPR and FNR lead to disparate impacts: more penalty for Blacks than for Whites in both recidivism groups
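A quick numeric check of the conclusion in Python, holding PPV and FNR equal across two groups with illustrative prevalences:

def fpr(p, ppv, fnr):
    # The identity: FPR = [p / (1 - p)] * [(1 - PPV) / PPV] * (1 - FNR)
    return (p / (1 - p)) * ((1 - ppv) / ppv) * (1 - fnr)

ppv, fnr = 0.6, 0.35                       # held equal for both groups
for name, p in [("White", 0.39), ("Black", 0.51)]:
    print(f"{name}: prevalence = {p:.2f}, FPR = {fpr(p, ppv, fnr):.2f}")
# Prints FPR ≈ 0.28 for White and ≈ 0.45 for Black: matched PPV and FNR
# force the higher-prevalence group to have the higher FPR.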
14. Summary
It is mathematically impossible to achieve both equal PPV and equal FPR across groups whose prevalence differs
The differences in FPR and FNR persist in subgroups of defendants
However, evidence suggests that data-driven risk-assessment tools (in medicine) are more accurate than human judgment
Human decisions are themselves prone to racial bias, e.g., in paroles, sentencing, stop-and-frisk, and arrests
15. Case Study: Online Market Places
How do we ensure that sellers are honest about the quality of their goods?
Study: in the early 2000s, eBay merchants misrepresented the quality of their sports trading cards
The problem was largely solved by feedback and reputation systems
New development: demand for more information
Study (2012): subjects rated the trustworthiness of potential borrowers from photographs of them
People who looked trustworthy were more likely to get loans
They were also more likely to repay their loans
More information leads to more freedom
People can now choose whom to do business with based on looks
A growing body of evidence suggests this leads to discrimination
16. Discrimination in Online Markets
Airbnb study: 20 profiles were sent to 6,400 hosts
The profiles were identical except that 10 had names common among white people and the rest names common among Black people
Result: requests with black-sounding names were 16% less successful
Discrimination was pervasive; most of the hosts who rejected had never hosted a Black guest
Other areas of discrimination: credit, labor markets, housing
Discrimination also occurs in algorithmic decisions
Searches for black-sounding names on Google were more likely to bring up ads about arrest records
Why? The algorithm learns from past search data
17. Principles and Recommendations
Don't ignore potential discrimination
Collect good data, including race and gender statistics
Produce regular reports and occasional audits
Publicly disclose discrimination-related data
Keep an experimental mindset to evaluate different design options, e.g., Airbnb withholding host pictures from its ads
18. Design Decisions
Control the information, its timing, and its salience
When can you see the picture of your Uber driver?
Increase automation and charge for control
Make Instant Book the default on Airbnb and charge a fee if the host wants to approve the guest first
Prioritize discrimination issues
Remind the host about anti-discrimination policies at the time of the transaction
Make algorithms discrimination-aware
Set explicit objectives, e.g., "I want my Black and white customers to be rejected at the same rate" (a minimal sketch follows)
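One way to operationalize that objective is a per-group score threshold chosen so each group is rejected at the same target rate; a minimal Python sketch, with the scores and group labels invented:

import numpy as np

rng = np.random.default_rng(0)
# Invented guest-quality scores for two groups of applicants.
scores = {"white": rng.beta(5, 2, 500), "black": rng.beta(4, 3, 500)}

target_rejection = 0.20
for group, s in scores.items():
    threshold = np.quantile(s, target_rejection)  # bottom 20% of each group
    rate = np.mean(s < threshold)
    print(f"{group}: threshold = {threshold:.2f}, rejection rate = {rate:.2f}")
# Both groups are rejected at (approximately) the same 20% rate.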
19. Virtual Screens
In the mid-1960s, fewer than 10% of the musicians in the big five US orchestras were women
Orchestras moved away from face-to-face auditions to behind-the-screen auditions
The success rate of female musicians increased by 160%
Online markets allow virtual screens between buyers and sellers, and between employers and employees
20. Case Study: Gerrymandering
Background
In the US, states are divided into congressional districts every 10 years
Each state is divided into precincts of equal population
The precincts are clustered into congressional districts
Whoever wins the majority of precincts in a district wins that district
Gerrymandering (named after Elbridge Gerry) refers to the manipulation of districts to influence the outcome of an election
Packing: pack most of the voters of the opposing side into a small number of districts
Cracking: split the voters of the opposing side across several districts where they are a minority
[Figure: the original 1812 political cartoon of the gerrymandered map of Essex County, Massachusetts]
21. Impact of Gerrymandering
Racial gerrymandering that intentionally reduces minority representation was ruled illegal in 1960
In 1982, the Voting Rights Act was amended to make states redraw maps that had a racially discriminatory impact
Partisan gerrymandering has not been ruled illegal
When Republicans drew the maps (17 states), they won about 53 percent of the vote and 72 percent of the seats
When Democrats drew the maps (6 states), they won about 56 percent of the vote and 71 percent of the seats
Proportional representation: each party wins roughly the same percentage of seats as its percentage of the votes
Wasted votes: votes cast for the losing side, plus votes above the minimum the winner needed
Efficiency gap: the difference between the two parties' wasted votes, divided by the total votes cast; it is intended to measure partisan bias (a worked example follows)
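Both definitions can be computed directly from district returns; a worked Python example with fabricated vote counts, where party A is packed into one district and cracked across the rest:

def wasted(a, b):
    # Wasted votes: all of the loser's votes, plus the winner's votes
    # beyond the bare majority needed to win.
    need = (a + b) // 2 + 1
    return (a - need, b) if a > b else (a, b - need)

# (party A votes, party B votes) in four equal-sized districts
districts = [(70, 30), (45, 55), (45, 55), (45, 55)]
wa = sum(wasted(a, b)[0] for a, b in districts)
wb = sum(wasted(a, b)[1] for a, b in districts)
total = sum(a + b for a, b in districts)
print(f"efficiency gap = {(wa - wb) / total:+.0%}")
# Party A wins 51% of the votes but only 1 of 4 seats; the gap is +28%,
# reflecting the packing-and-cracking bias against A.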
22. Wisconsin's Redistricting in 2011
Wisconsin's Republican-led redistricting was struck down by a three-judge panel; the case was heard by the Supreme Court on October 3, 2017, and a decision is pending
The plaintiffs' arguments:
A big efficiency gap indicates bias, especially if it is persistent; Wisconsin's gap is the biggest ever
It violates voters' right to equal treatment
It discriminates against their views (a First Amendment argument)
The defendants' arguments:
Efficiency gaps arise naturally, e.g., when Democrats pack into cities
Courts should stay out of it; states can appoint independent commissions if they are concerned
Justice Kennedy's vote is likely to be decisive
23. Discussion
Suppose you are heading an independent commission to recommend a fair redistricting approach.
How do you define fair redistricting? Why?
How would you go about implementing your recommendation?
What role do computer algorithms play?