Key takeaways
You will learn how to identify and plan for bias in Machine Learning applications.
You will learn how to implement a series of specific steps in any software project to understand the data in your systems.
As we use Machine Learning in our software, we need to understand its impact on what we build. The Design team at Google has created a framework named Human-Centered Machine Learning (HCML) to help focus and guide that understanding. I will introduce this concept and show how you can use it in your development process. I will show how HCML can be used to answer important questions like: Is ML right for this problem? What unique solution does ML provide? Are we using the right information to train our system? What is the impact of wrong results? Just like with the web and mobile revolutions, ML will force us to consider new possibilities for every experience we build. We must stay grounded in human needs while solving them in unique ways. HCML provides techniques to help us accomplish this.
2. MIT Media Lab Study - looking at the imSitu and COCO image sets
https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html
@mikewolfson
3. Bias Amplification
“Machine Learning software trained on the datasets didn’t just mirror the biases, it amplified them”
https://www.wired.com/story/machines-taught-by-photos-learn-a-sexist-view-of-women/
8. HCML - Human Centered Machine Learning
By Google UX
Focus discussion about how ML is changing our world
Grounded in human needs
Practical steps
https://medium.com/google-design/human-centered-machine-learning-a770d10562cd
9. HCML Steps
1. Don’t expect ML to figure out what problems to solve
2. Will ML address the identified problem in a unique way?
3. Fake it
4. Understand the costs of false positives and negatives
5. Plan for change, and adapt the system over time
6. Teach your algorithm using the right labels
7. ML is a creative process; involve everyone
10. Step 1: Don’t expect ML to figure out what problems to solve
16. DON’T EXPECT ML TO FIGURE OUT WHAT PROBLEMS TO SOLVE
The tools we use today are still needed to guide ML.
Find human needs through:
◦ User interviews
◦ Surveys
◦ Analyzing logs and support tickets
◦ Other ways of discovering needs
17. Step 2: Will ML address the problem in a unique way?
18. WILL ML ADDRESS THE PROBLEM IN A UNIQUE WAY?
To understand the value of ML, ask:
◦ How would a “human” solve the problem today?
◦ How would you guide the “human” to improve their outcomes?
◦ What assumptions would you want a “human” to make?
◦ Does the problem require ML?
19. WILL ML ADDRESS THE PROBLEM IN A UNIQUE WAY?
[Chart: confusion matrix weighing User Impact against ML Impact]
20. WILL ML ADDRESS THE PROBLEM IN A UNIQUE WAY?
Maybe this will work:
SELECT * FROM …
GROUP BY ...
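The point of this slide is that a plain aggregate query is often a sufficient non-ML baseline. A minimal sketch of that idea, using a hypothetical `events` table (the table and columns are invented for illustration; the deck's own query is left elided):

```python
# Non-ML baseline: a simple GROUP BY may already answer the question.
# The `events` table and its columns are hypothetical examples.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, action TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "click"), (1, "buy"), (2, "click"), (2, "click"), (3, "buy")],
)
# Heuristic answer to "what do users do most?" with no model at all:
rows = conn.execute(
    "SELECT action, COUNT(*) AS n FROM events GROUP BY action ORDER BY n DESC"
).fetchall()
print(rows)  # [('click', 3), ('buy', 2)]
```

If a query like this solves the problem well enough, ML is not adding unique value.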
25. UNDERSTAND THE COSTS OF FALSE POSITIVES OR NEGATIVES
Confusion matrix (rows = REFERENCE, columns = PREDICTION):

                     Predicted Positive | Predicted Negative
Actually Positive:   True Positive      | False Negative
Actually Negative:   False Positive     | True Negative
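The matrix above can be tallied directly from labeled outcomes, and the two error types can be weighted differently to reflect their real-world cost. A minimal sketch with hypothetical labels and cost weights:

```python
# Tally a binary confusion matrix and weight the two error types differently.
# The labels and cost values below are hypothetical examples.
def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn) counts for binary 0/1 labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp, fp, fn, tn

def expected_cost(actual, predicted, cost_fp, cost_fn):
    """Average per-example cost; the weights encode which error hurts more."""
    _, fp, fn, _ = confusion_counts(actual, predicted)
    return (fp * cost_fp + fn * cost_fn) / len(actual)

actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
print(confusion_counts(actual, predicted))  # (3, 1, 1, 3)
# In a law-enforcement face match, a false positive is far costlier:
print(expected_cost(actual, predicted, cost_fp=100.0, cost_fn=1.0))  # 12.625
```

Making the cost weights explicit forces the team to decide, up front, which kind of mistake the system must avoid.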
28. FALSE POSITIVE IS NOT OK
Georgetown Law estimate: 117 million American adults are in face recognition databases used by law enforcement.
https://www.perpetuallineup.org/
29. FALSE POSITIVE IS NOT OK
ACLU test (using Amazon Rekognition):
● Used 25K public mugshots
● Misidentified 28 members of Congress
● 38% of incorrect classifications were for people of color
https://venturebeat.com/2019/01/24/amazon-rekognition-bias-mit/
32. Step 5: Plan for change, and adapt the system over time
33. 5. PLAN FOR CHANGE AND ADAPT THE SYSTEM OVER TIME
As users and use cases expand, measure:
◦ Accuracy
◦ Errors
Get in situ feedback over the entire product lifecycle
Provide easy opportunities for user feedback
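One way to make "measure accuracy as use cases expand" concrete is to track accuracy per time window, so drift shows up in the numbers. A minimal sketch with hypothetical log records:

```python
# Track accuracy per release window so degradation over time is visible.
# The window names and log records are hypothetical examples.
from collections import defaultdict

def accuracy_by_window(records):
    """records: iterable of (window, actual, predicted). Returns window -> accuracy."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for window, actual, predicted in records:
        totals[window] += 1
        hits[window] += int(actual == predicted)
    return {w: hits[w] / totals[w] for w in totals}

logs = [
    ("2019-Q1", 1, 1), ("2019-Q1", 0, 0), ("2019-Q1", 1, 1), ("2019-Q1", 0, 1),
    ("2019-Q2", 1, 0), ("2019-Q2", 0, 1), ("2019-Q2", 1, 1), ("2019-Q2", 0, 0),
]
print(accuracy_by_window(logs))  # {'2019-Q1': 0.75, '2019-Q2': 0.5}
```

A drop between windows, as in this toy data, is the signal to retrain or revisit the labels.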
36. 6. TEACH YOUR ALGORITHM USING THE RIGHT LABELS
Labels: a group of sample data is tagged by people. Known labels can be used to identify other data that is unknown.
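The slide's idea, that human-tagged samples let the system label unknown data, can be sketched with the simplest possible classifier: 1-nearest-neighbor over hand-labeled examples. The feature vectors and labels below are hypothetical:

```python
# Human-applied labels on sample data classify unknown data:
# a 1-nearest-neighbor sketch. Features and labels are hypothetical examples.
def nearest_label(labeled, unknown):
    """labeled: list of (features, label) pairs; returns the label of the closest sample."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(labeled, key=lambda item: sq_dist(item[0], unknown))[1]

# People tagged these samples by hand:
labeled = [((0.9, 0.1), "cat"), ((0.8, 0.2), "cat"), ((0.1, 0.9), "dog")]
print(nearest_label(labeled, (0.85, 0.15)))  # cat
print(nearest_label(labeled, (0.2, 0.8)))   # dog
```

Everything the classifier "knows" comes from those hand-applied tags, which is exactly why choosing the right labels matters.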
37. 6. TEACH YOUR ALGORITHM USING THE RIGHT LABELS
Labeling is difficult when the goal of the model is subjective to the user (e.g., when trying to predict taste or interest).
Models take time and expense to train; it is important to understand intentions before creating complex ML systems.
Getting this wrong has an impact on viability.
38. 6. TEACH YOUR ALGORITHM USING THE RIGHT LABELS
Create “Content Specialists”:
◦ Find domain experts
◦ Make assumptions and discuss with a diverse group of collaborators
◦ Identify the most relevant assumptions, then validate them
▫ “Fake it”
◦ Repeat
◦ Use validated results to create a portfolio of examples to guide ML
39. 6. TEACH YOUR ALGORITHM USING THE RIGHT LABELS
Assumptions take this form:
“For _____ users, in _____ situations, we assume they’ll prefer _____, and not _____.”
40. Step 7: ML is a creative process; involve everyone
43. GOOD NEWS!
◦ MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) - an algorithm that compensates for under-represented data during training
◦ IBM Research - a new algorithm using head shape and intra-eye distance rather than skin color and gender
▫ A new dataset of 1M faces annotated with these features
◦ Joy Buolamwini - founder of the Algorithmic Justice League
▫ https://www.ajlunited.org
▫ @jovialjoy
◦ Google Inclusive Images Competition + PAIR Initiative
▫ https://ai.googleblog.com/2018/09/introducing-inclusive-images-competition.html
▫ https://ai.google
https://www.technologyreview.com/s/612846/making-face-recognition-less-biased-doesnt-make-it-less-scary/
48. GOOD NEWS! - Google PAIR initiative
https://ai.google/research/teams/brain/pair
50. LEARN MORE
◦ NY Times - race and facial recognition:
▫ https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html
◦ Wired story about gender bias:
▫ https://www.wired.com/story/machines-taught-by-photos-learn-a-sexist-view-of-women/
◦ Medium post from Google UX about HCML:
▫ https://medium.com/google-design/human-centered-machine-learning-a770d10562cd
◦ VentureBeat - image classification review:
▫ https://venturebeat.com/2019/01/24/amazon-rekognition-bias-mit
◦ Presentation template by SlidesCarnival
◦ Photographs by Unsplash