What does it take to build a good data product or service? Data practitioners routinely think about the technology, the user experience, and the commercial viability, but rarely about the implications of the systems they build. This talk sheds light on the impact of AI systems and the unintended consequences of the use of data in different products. It also discusses our role, as data practitioners, in planting the seeds of fairness in the systems we build.
16. "Data is the new oil, in the way that oil is a ubiquitous commodity that requires incredible resource allocation to extract value from, deep expertise to manage – and even when all that goes well – can have universally consequential negative externalities."*
Drew Conway, Founder & CEO
29. No Classification without Representation: Assessing Geodiversity Issues in Open Data Sets for the Developing World*
Shreya Shankar, Yoni Halpern, Eric Breck, James Atwood, Jimbo Wilson, D. Sculley (Google Brain Team)
[Chart: geographic origin of images in Open Images and ImageNet; the US is the most represented country in both datasets.]
30. WEDDING PHOTOS
Photos of bridegrooms from different countries, aligned by the log-likelihood that the classifier trained on Open Images assigns to the bridegroom class (Source). The chart's annotation marks the direction of better and more consistent classification.
31. The WEIRDest people in the world?
Joseph Henrich, Steven J. Heine, Ara Norenzayan
University of British Columbia*
32. The WEIRDest people in the world?
Western
Educated
Industrialized
Rich
Democratic
35. "Amazon's system TAUGHT ITSELF that male candidates were preferable. It penalized resumes that included the word 'women's,' as in 'women's chess club captain.' And it downgraded graduates of two all-women's colleges, according to people familiar with the matter. They did not specify the names of the schools."
36. LEARNED FROM HUMANS
39. PRODUCT-X
"- Removes steps like resume reviews, phone screens, and traditional assessments from their recruiting processes.
- Uses AI to give you more insight into candidates, so you can make better decisions."
40. Practitioners consistently:
- overestimate their model's accuracy,
- propagate feedback loops, and
- fail to notice data leaks (see the sketch below).
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
https://arxiv.org/pdf/1602.04938.pdf
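To make the "data leaks" point concrete, here is a minimal sketch (assuming Python with scikit-learn; the data is a synthetic stand-in) of the most common leak: preprocessing fitted on the full dataset before the train/test split, so test-set statistics quietly inform training.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))          # synthetic stand-in features
y = rng.integers(0, 2, 1000)             # synthetic stand-in labels

# LEAKY: the scaler is fit on ALL rows, so test-set statistics
# leak into the features the model trains on.
X_leaky = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X_leaky, y, random_state=0)

# CORRECT: split first, then fit the scaler on training rows only.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)
model = LogisticRegression().fit(scaler.transform(X_tr), y_tr)
print(model.score(scaler.transform(X_te), y_te))
```

The fix is purely about ordering: anything fitted to the data must see only the training rows.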
42. IT IS HUMANS WHO COLLECT/LABEL DATA, WRITE ALGORITHMS, DEFINE METRICS.
When we collect and label data, bias enters through:
- REPRESENTATION
- DISTRIBUTION
- LABELS
AND MORE… (see the audit sketch below)
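What checking those three bullets can look like in practice, as a minimal sketch: this assumes Python with pandas, and the file and column names ("training_data.csv", "country", "label") are hypothetical stand-ins for your own dataset.

```python
import pandas as pd

df = pd.read_csv("training_data.csv")  # hypothetical dataset

# REPRESENTATION: which groups are in the data at all, and in what share?
print(df["country"].value_counts(normalize=True))

# DISTRIBUTION: how are labels spread within each group? Skews here
# become skews in whatever the model learns.
print(pd.crosstab(df["country"], df["label"], normalize="index"))

# LABELS: spot-check a sample per group, since labeling errors are
# rarely uniform across groups.
print(df.groupby("country").sample(5, random_state=0))
```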
43. IT IS HUMANS WHO COLLECT/LABEL DATA, WRITE ALGORITHMS, DEFINE METRICS.
When we write algorithms, bias enters through:
- TRAIN/TEST SPLIT
- FEATURES/PROXIES
- BLACK-BOX MODELS
AND MORE… (see the sketch below)
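A minimal sketch of the first two bullets (assuming Python with pandas and scikit-learn; the file and column names are hypothetical): a stratified train/test split so the test set mirrors the population, and a quick check for a feature acting as a proxy for a protected attribute.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("training_data.csv")  # hypothetical dataset

# TRAIN/TEST SPLIT: stratify so the test set mirrors the label mix
# instead of being dominated by the majority class.
train, test = train_test_split(df, test_size=0.2,
                               stratify=df["label"], random_state=0)

# FEATURES/PROXIES: "zip_code" never mentions the protected attribute,
# but if it predicts group membership it smuggles the attribute in.
print(pd.crosstab(df["zip_code"], df["group"], normalize="index"))
```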
44. IT IS HUMANS WHO COLLECT/LABEL DATA, WRITE ALGORITHMS, DEFINE METRICS.
When we define metrics, the questions are ours to answer:
- WHAT IS THE IMPACT OF DIFFERENT ERROR TYPES ON DIFFERENT GROUPS?
- WHAT DO YOU OPTIMIZE FOR?
(A per-group error sketch follows below.)
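One way to start answering both questions is to break the confusion matrix down per group instead of reporting a single global accuracy. A minimal sketch, assuming Python with scikit-learn and synthetic stand-in arrays:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)       # stand-in for test labels
y_pred = rng.integers(0, 2, 500)       # stand-in for model predictions
group = rng.choice(["A", "B"], 500)    # stand-in for group membership

for g in np.unique(group):
    m = group == g
    tn, fp, fn, tp = confusion_matrix(y_true[m], y_pred[m]).ravel()
    # False positives and false negatives hurt differently (wrongly
    # flagged vs. wrongly rejected); compare rates, not just accuracy.
    print(f"group {g}: FPR={fp / (fp + tn):.2f}, FNR={fn / (fn + tp):.2f}")
```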