The spread of Artificial Intelligence, mostly in the form of representation learning, has introduced an issue that, at first glance, seems difficult to fix: bias.
Over the last few years, more and more examples have emerged: image-recognition systems mistook people for animals, chatbots turned racist, and machine learning solutions, notably those used by recruitment companies and translation services, made the news over bias-related issues.
In this presentation, we will see the impact bias has on people and how to fix it without having to dive deep into the data and remove it manually.
2. Wilder Rodrigues
• Machine Learning Engineer at Quby;
• Coursera Mentor;
• City.AI Ambassador;
• School of AI Dean [Utrecht];
• IBM Watson AI XPRIZE contestant;
• Kaggler;
• Public speaker;
• Family man and father of 3.
@wilderrodrigues
https://medium.com/@wilder.rodrigues
3. How do you see racism?
• Before you proceed, please watch this video: https://www.youtube.com/watch?v=5F_atkP3pqs
• The audio is in Portuguese, but in the next slide you will find translations for what people said in the interviews.
Source: Canal de TV da FAP (the Astrojildo Pereira Foundation's TV channel)
4. Translations
• Group I
• He is late;
• She is a fashion designer;
• Holds an executive position in either the HR or Finance area;
• Taking care of his garden. Doesn’t look like a gardener;
• She is cleaning her own house; the countertop;
• Graffiti artist; it’s an art, it’s not vandalism.
• Group II
• Vandalising the wall; she is a tagger;
• She is a housekeeper; cleaning the house;
• He is a gardener;
• He looks like a security guard or a chauffeur;
• Seamstress; saleswoman;
• He is running away; he is a thief.
5. Unconscious bias
• Blue is for boys, pink for girls.
• Boys are better at maths and science.
• Tall people make better leaders.
• New mothers are more absent from work than new fathers.
• People with tattoos are rebellious.
• Younger people are better with technology than older people.
6. "AI is just an extension of our existing culture."
– Joanna Bryson, University of Bath and Princeton University
7. Racialized code & Unregulated algorithms
Source: https://www.theguardian.com/technology/2017/dec/04/racist-facial-recognition-white-coders-black-people-police
Joy Buolamwini, Code4Rights and MIT Media Lab Researcher.
8. How white engineers built racist code – and why it's dangerous for black people
Source: https://www.theguardian.com/technology/2017/dec/04/racist-facial-recognition-white-coders-black-people-police
9. Implicit Association Test
Both black and white Americans, for example, are faster at associating names like “Brad” and “Courtney” with words like “happy” and “sunrise,” and names like “Leroy” and “Latisha” with words like “hatred” and “vomit,” than vice versa.
Source: http://www.sciencemag.org/news/2017/04/even-artificial-intelligence-can-acquire-biases-against-race-and-gender
10. WEAT (Word Embedding Association Test)
Embeddings for names like “Brett” and “Allison” were more similar to those for positive words such as “love” and “laughter,” while those for names like “Alonzo” and “Shaniqua” were more similar to those for negative words like “cancer” and “failure.”
Source: http://www.sciencemag.org/news/2017/04/even-artificial-intelligence-can-acquire-biases-against-race-and-gender
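As a sketch of what the test computes (following Caliskan et al., 2017, the paper behind this article): each target word's association score is its mean cosine similarity to one attribute set minus its mean similarity to the other, and the effect size normalises the gap between the two target sets. Here `vec` stands for a hypothetical word-to-vector lookup (e.g. a dict of GloVe vectors); it is an assumption, not part of the article.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def weat_association(w, A, B, vec):
    # s(w, A, B): mean similarity of w to the attribute words in A
    # minus its mean similarity to the attribute words in B.
    return (np.mean([cosine(vec[w], vec[a]) for a in A])
            - np.mean([cosine(vec[w], vec[b]) for b in B]))

def weat_effect_size(X, Y, A, B, vec):
    # Difference between the mean associations of the two target
    # sets X and Y, normalised by the pooled standard deviation.
    x = [weat_association(w, A, B, vec) for w in X]
    y = [weat_association(w, A, B, vec) for w in Y]
    return (np.mean(x) - np.mean(y)) / np.std(x + y, ddof=1)
```

With X as one set of names, Y as the other, A as pleasant words and B as unpleasant words, a positive effect size reproduces the pattern described above.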
11. WEFAT (Word-Embedding Factual Association Test)
WEFAT measured how closely related the embeddings for words like “hygienist” and “librarian” were to those for words like “female” and “woman.” It then compared this computer-generated gender-association measure to the actual percentage of women in each occupation.
Source: http://www.sciencemag.org/news/2017/04/even-artificial-intelligence-can-acquire-biases-against-race-and-gender
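WEFAT uses the same association score, but computes it per occupation word, normalises it, and correlates it with real-world statistics. A minimal sketch, reusing `cosine` and `weat_association` from the block above; the occupation list and percentages here are placeholders for illustration, not the paper's data:

```python
import numpy as np
from scipy.stats import pearsonr

def wefat_statistic(w, A, B, vec):
    # Association of w with A vs B, normalised by the standard
    # deviation of w's similarities to all attribute words.
    sims = [cosine(vec[w], vec[x]) for x in A + B]
    return weat_association(w, A, B, vec) / np.std(sims, ddof=1)

# Placeholder occupations and made-up shares of women, for shape only.
occupations = ['hygienist', 'librarian', 'carpenter']
pct_women = [0.95, 0.80, 0.02]
scores = [wefat_statistic(w, ['female', 'woman'], ['male', 'man'], vec)
          for w in occupations]
r, _ = pearsonr(scores, pct_women)  # Caliskan et al. report r ≈ 0.90
```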
12. Word Embeddings
$$\cos(A, B) = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\;\sqrt{\sum_{i=1}^{n} B_i^2}}$$
Source: https://medium.com/cityai/deep-learning-for-natural-language-processing-part-i-8369895ffb98
Father (L2 norm): 5.31
Mother (L2 norm): 5.63
d (dot product): 26.67
p (product of norms): 29.89
Similarity: d / p = 0.89

Car (L2 norm): 5.73
Bird (L2 norm): 4.83
d (dot product): 5.96
p (product of norms): 27.67
Similarity: d / p = 0.21
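The `d` and `p` terms on these example slides map directly onto the formula: `d` is the dot product and `p` the product of the two L2 norms. A minimal numpy sketch, assuming a hypothetical `word_to_vec_map` dict of pre-trained vectors (e.g. GloVe):

```python
import numpy as np

def cosine_similarity(a, b):
    d = np.dot(a, b)                           # dot product
    p = np.linalg.norm(a) * np.linalg.norm(b)  # product of L2 norms
    return d / p

# With pre-trained vectors loaded into word_to_vec_map, this yields
# roughly the numbers on the slide:
# cosine_similarity(word_to_vec_map['father'], word_to_vec_map['mother'])  # ~0.89
# cosine_similarity(word_to_vec_map['car'], word_to_vec_map['bird'])       # ~0.21
```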
15. Neutralising bias from non-gender-specific words
$$e_{\text{bias\_comp}} = \frac{e \cdot g}{\|g\|_2^2}\, g$$
$$e_{\text{debiased}} = e - e_{\text{bias\_comp}}$$
Source: Bolukbasi et al., 2016, https://arxiv.org/pdf/1607.06520.pdf
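A minimal sketch of this formula, following Bolukbasi et al. (2016): project the embedding onto the gender direction `g` and subtract that component. Here `g` is assumed to be a simple one-pair axis such as e_woman − e_man (the paper derives it with PCA over several definitional pairs), and `word_to_vec_map` is the hypothetical lookup used earlier:

```python
import numpy as np

def neutralize(word, g, word_to_vec_map):
    # Remove the component of the embedding that lies along the
    # gender direction g, leaving only the orthogonal part.
    e = word_to_vec_map[word]
    e_bias_comp = (np.dot(e, g) / np.sum(g ** 2)) * g  # projection onto g
    return e - e_bias_comp                             # e_debiased
```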
16. Does it work?
• Cosine similarity between receptionist and gender, before neutralising: 0.3307794175059373
• Cosine similarity between receptionist and gender, after neutralising: 5.2021694209043796e-17
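Reusing `cosine_similarity` and `neutralize` from the sketches above, the check looks like this (the single-pair gender axis is an assumption):

```python
g = word_to_vec_map['woman'] - word_to_vec_map['man']  # simple gender axis
e_before = word_to_vec_map['receptionist']
e_after = neutralize('receptionist', g, word_to_vec_map)

print(cosine_similarity(e_before, g))  # ~0.33: leans female
print(cosine_similarity(e_after, g))   # ~5e-17: zero up to float rounding
```

Note that 5.2e-17 is not merely small: it is floating-point noise around an exact zero, meaning no gender component is left.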
18. Equalising gender-specific words
• Cosine similarity between actor and gender, before equalising: -0.08387555382505694
• Cosine similarity between actress and gender, before equalising: 0.33422494897899785
• Cosine similarity between actor and gender, after equalising: -0.8796563888581831
• Cosine similarity between actress and gender, after equalising: 0.879656388858183
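Neutralising would erase the legitimate gender meaning of actor/actress, so such pairs are equalised instead: both words keep their shared content but end up exactly opposite along the gender axis. A sketch of one common formulation of the equalisation step from Bolukbasi et al. (2016), assuming unit-normalised embeddings:

```python
import numpy as np

def equalize(pair, g, word_to_vec_map):
    # Equalise a gender-specific pair, e.g. ('actor', 'actress'),
    # so both words differ from the gender axis g only in sign.
    e_w1, e_w2 = word_to_vec_map[pair[0]], word_to_vec_map[pair[1]]
    mu = (e_w1 + e_w2) / 2                          # midpoint of the pair
    mu_B = (np.dot(mu, g) / np.sum(g ** 2)) * g     # midpoint along g
    mu_orth = mu - mu_B                             # shared, gender-free part
    e_w1B = (np.dot(e_w1, g) / np.sum(g ** 2)) * g  # each word's part along g
    e_w2B = (np.dot(e_w2, g) / np.sum(g ** 2)) * g
    # Rescale the gender components so the equalised vectors stay unit-length.
    scale = np.sqrt(np.abs(1 - np.sum(mu_orth ** 2)))
    e_w1B_corr = scale * (e_w1B - mu_B) / np.linalg.norm(e_w1 - mu_orth - mu_B)
    e_w2B_corr = scale * (e_w2B - mu_B) / np.linalg.norm(e_w2 - mu_orth - mu_B)
    return e_w1B_corr + mu_orth, e_w2B_corr + mu_orth
```

That is what the numbers above show: after equalising, actor and actress sit at ±0.88 from the gender axis, mirror images of each other.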
19. How far is actor from babysitter?
• Cosine similarity between actor and babysitter, before neutralising: 0.2766562472128601
• Cosine similarity between actress and babysitter, before neutralising: 0.3378475317457311
• Cosine similarity between actor and babysitter, after neutralising: 0.1408988327631711
• Cosine similarity between actress and babysitter, after neutralising: 0.14089883276317122