Claudia Wagner Oxford, May 2015
Inequalities and Biases
in socio-technical systems
Doleac & Stein,
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1615149
How to detect biases?
• Audit Studies / Field Experiments
• Observational Study / Secondary Analysis of Data
Gender Inequalities on Wikipedia
4.4 times more likely to be reported in articles about women
3.8 times more likely to be reported in articles about women
What do they have in common?
The Smurfette Principle
What about other stuff on Wikipedia?
Unequal Chances
Thank You
claudia.wagner@gesis.org

Slam about "Discrimination and Inequalities in socio-computational systems"

Editor's Notes

  1. Welcome to my talk about biases and inequalities in socio-technical systems. The organizers told me that the main purpose of this invited talk is not to be funny. The main purpose is to do something against inequalities. Since we have had only one female slammer so far, I am here to push the number up. That's good for me, since I already achieved the goal by simply walking up on stage. That was easy. And it's also good for me because the topic of my talk is not exactly the funniest topic in the world; it's in fact a bit depressing. But I'll try to focus on the less serious stuff for now.
  2. So let's look at some examples of how biases manifest in different systems. Let me introduce you to José Zamora, who was desperately looking for a job online. Since he was not very successful with his online profile, he decided to simply remove the "s" from his name. So José became Joe Zamora, and a week later, he says, his inbox was full. Another example comes from Stanford, where researchers tried to sell exactly the same product on an e-commerce site. Once the product was held in a black hand and once in a white hand. It turned out that whatever you hold in a white hand sells for more money. These two examples show how pre-existing biases also manifest in the online world. But online is even worse than offline. Besides these pre-existing biases, technical biases and emergent biases also exist in the online world. An example of a technical bias is the face recognition and motion tracking software from HP that could not detect black people. Emergent biases arise from the interplay between algorithms and people, i.e., when algorithms adapt their behavior based on the usage patterns they encounter. For example, an algorithm might learn that hosts with darker skin color are less popular on Airbnb and may therefore recommend them less often.
  3. So how can we detect those biases? Audit studies are typically field experiments in which researchers intrude into the social process or system that is studied (e.g., sock puppet audits, scraping audits, code audits, ...). We can do what José, or Joe, did. That means running small self-experiments. If we are a bit more ambitious, we can run medium-scale or large-scale experiments. However, if you decide to do that, I need to warn you. I recently had a chat with Aniko, who studied price discrimination and personalization with her colleagues at Northeastern, and she told me a bit about her experience with running these medium-scale experiments for detecting biases. This often involves creating fake accounts, which means she had to create several hundred Google accounts. Her first strategy was to email all her Facebook friends (some of them eventually decided to unfriend her). When she decided to try creating these accounts automatically, the whole IP range of the university got banned from Google (again, many people love you if that happens and it's your fault). Finally they were desperate enough to go to the Mac store, since they thought there are all these laptops one can use. Unfortunately those are all in the same IP range, and that's why the Mac store's IP got banned as well. So you see, running these experiments is a fairly adventurous process.
  4. In our work we could be a bit less adventurous, since we were interested in exploring gender inequalities on Wikipedia. We could simply grab articles about notable men and women and compare them. The good news is that we found it's very easy for women to make it onto Wikipedia. You don't even need to have a name (or at least a first name is certainly sufficient). Just make sure you marry a really famous guy. And if, on top of a famous husband, you also have a famous brother, you are notable for sure.
  5. So it looks like it's easy to make it onto Wikipedia if you are a woman, but what do they then write about us? Well, they deeply care about the relationship status of women (and that of course has nothing to do with the fact that the editors are mainly male). When comparing articles about men and women on a lexical level, we found that if an article on Wikipedia mentions that a person is divorced, it's 4.4 times more likely to be about a woman, and an article that mentions that a person is married is 3.8 times more likely to be about a woman. So it looks like either notable women get married and divorced more frequently, or the Wikipedia editor community considers different aspects important depending on whether they document the life of a man or a woman.
  6. So I have a quiz question for you guys which seems to be off-topic but actually is not! What do The Muppet Show, the Smurfs and Star Wars have in common? There is only one woman in the entire Galactic Empire, the entire Smurf land and the whole Muppet Show.
  7. Katha Pollitt called this observation, that in fiction mainly boys define a group and that girls often only exist in relation to boys, the Smurfette Principle (or token minority). So that principle applies to fiction editors, but what about Wikipedia editors? We constructed a network of articles about men and women by extracting links between articles and compared the k-coreness distributions of articles about men and articles about women. It turns out that more men are part of large and well-connected groups. So the Smurfette Principle also seems to hold on Wikipedia.
  8. So maybe you think now: OK, but this male bias only becomes visible in the biography articles on Wikipedia, right? So let me give you one other example: articles about professions. Some colleagues and I recently went for dinner, and just for fun we were looking through Wikipedia pages about different professions (you might wonder why, but it's not uncommon for people in academia; we do that every two years). What we saw on Wikipedia was pretty shocking. It turned out to be so difficult to find an article about a profession with a picture of a woman that we started making a game out of it while waiting for the food. Everyone had one guess. For example, I would say "hairdresser, German Wikipedia", and if that's correct I get a point. If you ever play this game I can give you some tips: hairdresser, model, number girl, bar girl and hostess work in many language editions. Well, we had a lot of fun, and the next day we crawled all pictures from all professions in all language editions and set up a CrowdFlower task.
  9. In German, a male and a female title exist for many professions. The German Wikipedia community has decided to use the male title as the default and tries to redirect from the female profession title to the male one. However, they keep a list of professions for which only a female form exists or which are predominantly practiced by women. The list is short and contains things like: number girl, bar girl, hostess, prostitute.
  10. When analyzing all pictures from articles about professions (which we found by matching a list of around 100 professions to different language editions of Wikipedia), we found that more than 2/3 of all images that depict one or several people show men (or depict several people with the men being dominant). That is based on 900 pictures in total.
  11. Let's look at individual language editions. We found that Italian girls will not find much evidence for the fact that women also have jobs. So let's hope that Italian girls don't try to make career decisions based on Wikipedia. German, French and English Wikipedia are slightly better. What's the most gender-balanced language edition? Any guesses? The most gender-balanced Wikipedia edition turns out to be Esperanto, the most widely spoken constructed language in the world! It has 9 female and 17 male pictures in our sample of professions. Italian has 42 male pictures. So it looks like we have two options: change Wikipedia or teach our kids Esperanto (if you have girls). So to sum up, what's the take-away message from my talk? If you are female, enjoy the fact that you have the entire Galactic Empire to yourself. Make sure that you marry the most famous guy in the empire. Only care about notability; that is enough. You will probably get divorced anyway, and the whole world will know it. If you have kids, make sure that your little girls learn Esperanto, and don't allow them to watch the Smurfs or The Muppet Show.
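The "4.4 times" and "3.8 times" figures in note 5 are ratios of how often a term shows up in articles about women compared to articles about men. The notes do not spell out the exact estimator, so the following is only a minimal sketch of one plausible reading: the share of articles about women that mention a term, divided by the share of articles about men that mention it. The function name and all counts are hypothetical and are not taken from the study.

```python
def relative_likelihood(term_in_women, total_women, term_in_men, total_men):
    """Ratio of P(term | article about a woman) to P(term | article about a man).

    Assumption: each count is the number of biography articles that
    mention the term at least once, and the totals are the number of
    biography articles per gender.
    """
    if total_women <= 0 or total_men <= 0:
        raise ValueError("article totals must be positive")
    rate_women = term_in_women / total_women
    rate_men = term_in_men / total_men
    if rate_men == 0:
        raise ValueError("term never occurs in articles about men; ratio undefined")
    return rate_women / rate_men


# Hypothetical toy counts, chosen only to illustrate the arithmetic.
ratio = relative_likelihood(
    term_in_women=30, total_women=1_000,   # 3% of articles about women
    term_in_men=50, total_men=5_000,       # 1% of articles about men
)
print(f"The term is {ratio:.1f} times as frequent in articles about women")
```

Dividing by the per-gender totals matters because there are far more biographies of men than of women; comparing raw counts would hide the effect.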
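The k-coreness comparison in note 7 can be sketched in a few lines. A node has coreness k if it belongs to a subgraph in which every node has at least k neighbours, but to no such subgraph for k+1. The example below uses the standard peeling procedure (repeatedly remove a node of smallest remaining degree). The toy link list, the gender labels, and the choice to treat article links as undirected are all assumptions made for illustration; the notes do not say how link direction was handled.

```python
from collections import Counter, defaultdict


def build_undirected(edges):
    """Turn a list of (source, target) links into a symmetric adjacency map."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-links
            adjacency[u].add(v)
            adjacency[v].add(u)
    return dict(adjacency)


def core_numbers(adjacency):
    """Return the k-coreness of every node by peeling off minimum-degree nodes."""
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])  # coreness never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for neighbour in adjacency[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core


# Hypothetical toy network: three tightly linked articles about men,
# and two articles about women that each link to only one of them.
edges = [("m1", "m2"), ("m2", "m3"), ("m1", "m3"), ("m1", "w1"), ("w2", "m3")]
gender = {"m1": "male", "m2": "male", "m3": "male", "w1": "female", "w2": "female"}

core = core_numbers(build_undirected(edges))

# Coreness distribution per gender: {gender: Counter({k: number_of_articles})}
distribution = defaultdict(Counter)
for node, k in core.items():
    distribution[gender[node]][k] += 1

for label, counts in sorted(distribution.items()):
    print(label, dict(sorted(counts.items())))
```

In this toy network the three articles about men form a triangle and end up in the 2-core, while the two articles about women hang off it with coreness 1, which is the pattern the note describes as the Smurfette Principle.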