SlideShare ist ein Scribd-Unternehmen logo
1 von 24
Downloaden Sie, um offline zu lesen
Auditing Search Engines for
Differential Satisfaction
across Demographics
Rishabh Mehrotra, Ashton Anderson, Fernando Diaz,
Amit Sharma, Hanna Wallach, Emine Yilmaz
Microsoft Research New York
From public libraries to search engines
Motivation for auditing
• Ethical
• Equal access to everyone
• Practical
• Equal access helps attract a large and
diverse population of users
• Service providers are scrutinized for
seemingly unfair behavior [1,2,3]
• We offer methods for auditing a
system’s performance for detection
of differences in user satisfaction
across demographics
[1] N. Diakopoulos. Algorithmic accountability. Digital Journalism, 3(3):398–415, 2015
[2] S. Barocas and A. D. Selbst. Big data’s disparate impact. California Law Review, 104, 2016.
[3] C. Munoz, M. Smith, and D. Patel. Big data: A report on algorithmic systems, opportunity, and civil rights.
Technical report, Executive Office of the President of the United States, May 2016.
Tricky: straightforward optimization
can lead to differential performance
• Search engine uses a standard metric: time spent on clicked
result page as an indicator of satisfaction.
• Goal: estimate difference in user satisfaction between these two
demographic groups.
• Suppose older users issue more of “retirement planning” queries
Age: >50 years
80% users 10% users
Age: <30 years
…
1. Overall metrics can hide
differential satisfaction
• Average user satisfaction for “retirement planning”
may be high.
But,
• Average satisfaction for younger users=0.7
• Average satisfaction for older users=0.2
2. Query-level metrics can hide
differential satisfaction
<query>
<query>
<query>
<query>
<query>
<query>
retirement planning
<query>
<query>
retirement planning
retirement planning
<query>
retirement planning
…
Same user satisfaction for
“retirement planning” for both older and
younger users = 0.7
What if average satisfaction for
<query>=0.9?
Older users still receiving more of lower-
quality results than younger users.
Younger users
Older users
3. More critically, even individual-
level metrics can also hide differential
satisfaction
Reading time for the same webpage result for the
same user satisfaction
Time spent on a webpage
Younger Users
Older Users
We must control for natural
demographic variation to
meaningfully audit for differential
satisfaction.
Data: Demographic characteristics
of search engine users
• Internal logs from Bing.com for two weeks
• 4 M users | 32 M impressions | 17 M sessions
• Demographics: Age & Gender
• Age:
• post-Millenial: <18
• Millenial: 18-34
• Generation X: 35-54
• Baby Boomer: 55 - 74
Demographic distribution of user activity
Age Groups
Overall metrics across Demographics
Four metrics:
Graded Utility (GU) Reformulation Rate (RR)
Successful Click Count (SCC) Page Click Count (PCC)
Pitfalls with Overall Metrics
• Conflate two separate effects:
• natural demographic variation caused by the differing
traits among the different demographic groups e.g.
• Different queries issued
• Different information need for the same query
• Even for the same satisfaction, demographic A tends to click
more than demographic B
• Systemic difference in user satisfaction due to the search
engine
Utilize work from causal inference
Information
Need
Demographics
Metric
User
satisfaction
Query
Search
Results
I. Context Matching: selecting for
activity with near-identical context
Information
Need
Demographics
Metric
User
satisfaction
Query
Search
Results
Context
Information
Need
Demographics
Metric
User
satisfaction
Query
Search
Results
Context
For any two users from different demographics,
1. Same Query
2. Same Information Need:
1. Control for user intent: same final SAT click
2. Only consider navigational queries
3. Identical top-8 Search Results
1.2 M impressions, 19K unique queries, 617K users
Age-wise differences in metrics disappear
• General auditing tool: robust
• Very low coverage across queries
• Did we control for too much?
II. Query-level hierarchical model:
Differential satisfaction for the same query
Information
Need
Demographics
Metric
User
satisfaction
Query
Search
Results
• Simply fitting different models for each query will not
work for less popular queries.
• We formulate a hierarchical model that borrows
strength from more popular queries:
• Consider metric for each impression---query and user---as a
deviation from overall metric based on:
• Query Topic
• User demographics
Age-wise differences appear again:
bigger differences for harder queries
III. Query-level pairwise model:
Estimating satisfaction directly by
considering pairs of users
Information
Need
Demographics
Metric
User
satisfaction
Query
Search
Results
Estimating absolute satisfaction is non-trivial
• Instead, Estimate relative satisfaction by considering pairs of users
for the same query
• Conservative proxy for pairwise satisfaction by only considering “big”
differences in observed metric for the same query
• Logistic regression model for estimating probability of impression i
being more satisfied than impression j:
Again, see a small age-wise difference in
satisfaction
• Auditing is more nuanced than merely measuring
metrics on demographically-binned traffic.
• We find light trend towards older users being more
satisfied.
• General framework for auditing systems
• Plug-in different metrics
• Plug-in different demographics/user groups
• Suggests recalibration of metrics based on
demographics
Discussion
Thank You!
Amit Sharma
Postdoctoral Researcher
http://www.amitsharma.in
@amt_shrma
amshar@microsoft.com
Auditing is more nuanced than merely measuring metrics on demographically-binned
traffic.
General framework for auditing systems
Plug-in different metrics
Plug-in different demographics/user groups
Paper: http://datworkshop.org/papers/dat16-final41.pdf

Weitere ähnliche Inhalte

Was ist angesagt?

Crowdsourcing Predictors of Behavioral Outcomes
Crowdsourcing Predictors of Behavioral OutcomesCrowdsourcing Predictors of Behavioral Outcomes
Crowdsourcing Predictors of Behavioral OutcomesAlekya Yermal
 
THE SURVEY OF SENTIMENT AND OPINION MINING FOR BEHAVIOR ANALYSIS OF SOCIAL MEDIA
THE SURVEY OF SENTIMENT AND OPINION MINING FOR BEHAVIOR ANALYSIS OF SOCIAL MEDIATHE SURVEY OF SENTIMENT AND OPINION MINING FOR BEHAVIOR ANALYSIS OF SOCIAL MEDIA
THE SURVEY OF SENTIMENT AND OPINION MINING FOR BEHAVIOR ANALYSIS OF SOCIAL MEDIAIJCSES Journal
 
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...ijaia
 
Data science concept by Raj Krishna Paul
Data science concept by Raj Krishna PaulData science concept by Raj Krishna Paul
Data science concept by Raj Krishna PaulSubir Paul
 
Active Learning in Collaborative Filtering Recommender Systems : a Survey
Active Learning in Collaborative Filtering Recommender Systems : a SurveyActive Learning in Collaborative Filtering Recommender Systems : a Survey
Active Learning in Collaborative Filtering Recommender Systems : a SurveyUniversity of Bergen
 
POLITICAL PREDICTION ANALYSIS USING TEXT MINING
POLITICAL PREDICTION ANALYSIS USING TEXT MININGPOLITICAL PREDICTION ANALYSIS USING TEXT MINING
POLITICAL PREDICTION ANALYSIS USING TEXT MININGVishwambhar Deshpande
 
Scalable Exploration of Relevance Prospects to Support Decision Making
Scalable Exploration of Relevance Prospects to Support Decision MakingScalable Exploration of Relevance Prospects to Support Decision Making
Scalable Exploration of Relevance Prospects to Support Decision MakingKatrien Verbert
 
Guide to Recommender Systems
Guide to Recommender SystemsGuide to Recommender Systems
Guide to Recommender SystemsAmancio Bouza
 
Big data analytics and its impact on internet users
Big data analytics and its impact on internet usersBig data analytics and its impact on internet users
Big data analytics and its impact on internet usersStruggler Ever
 
Recommender Systems and Misinformation: The Problem or the Solution?
Recommender Systems and Misinformation: The Problem or the Solution?Recommender Systems and Misinformation: The Problem or the Solution?
Recommender Systems and Misinformation: The Problem or the Solution?Alejandro Bellogin
 
Recommendation systems
Recommendation systems  Recommendation systems
Recommendation systems Badr Hirchoua
 
How NOT to Aggregrate Polling Data
How NOT to Aggregrate Polling DataHow NOT to Aggregrate Polling Data
How NOT to Aggregrate Polling DataDataCards
 

Was ist angesagt? (18)

Crowdsourcing Predictors of Behavioral Outcomes
Crowdsourcing Predictors of Behavioral OutcomesCrowdsourcing Predictors of Behavioral Outcomes
Crowdsourcing Predictors of Behavioral Outcomes
 
Sns e wom_ks_iccsa_pham
Sns e wom_ks_iccsa_phamSns e wom_ks_iccsa_pham
Sns e wom_ks_iccsa_pham
 
THE SURVEY OF SENTIMENT AND OPINION MINING FOR BEHAVIOR ANALYSIS OF SOCIAL MEDIA
THE SURVEY OF SENTIMENT AND OPINION MINING FOR BEHAVIOR ANALYSIS OF SOCIAL MEDIATHE SURVEY OF SENTIMENT AND OPINION MINING FOR BEHAVIOR ANALYSIS OF SOCIAL MEDIA
THE SURVEY OF SENTIMENT AND OPINION MINING FOR BEHAVIOR ANALYSIS OF SOCIAL MEDIA
 
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...
 
Data science concept by Raj Krishna Paul
Data science concept by Raj Krishna PaulData science concept by Raj Krishna Paul
Data science concept by Raj Krishna Paul
 
Active Learning in Collaborative Filtering Recommender Systems : a Survey
Active Learning in Collaborative Filtering Recommender Systems : a SurveyActive Learning in Collaborative Filtering Recommender Systems : a Survey
Active Learning in Collaborative Filtering Recommender Systems : a Survey
 
POLITICAL PREDICTION ANALYSIS USING TEXT MINING
POLITICAL PREDICTION ANALYSIS USING TEXT MININGPOLITICAL PREDICTION ANALYSIS USING TEXT MINING
POLITICAL PREDICTION ANALYSIS USING TEXT MINING
 
How online social ties and product-related risks influence purchase intention...
How online social ties and product-related risks influence purchase intention...How online social ties and product-related risks influence purchase intention...
How online social ties and product-related risks influence purchase intention...
 
Scalable Exploration of Relevance Prospects to Support Decision Making
Scalable Exploration of Relevance Prospects to Support Decision MakingScalable Exploration of Relevance Prospects to Support Decision Making
Scalable Exploration of Relevance Prospects to Support Decision Making
 
Model bias in AI
Model bias in AIModel bias in AI
Model bias in AI
 
Poster_final
Poster_finalPoster_final
Poster_final
 
Guide to Recommender Systems
Guide to Recommender SystemsGuide to Recommender Systems
Guide to Recommender Systems
 
Mr1480.appa
Mr1480.appaMr1480.appa
Mr1480.appa
 
Recommender systems to help people move forward
Recommender systems to help people move forwardRecommender systems to help people move forward
Recommender systems to help people move forward
 
Big data analytics and its impact on internet users
Big data analytics and its impact on internet usersBig data analytics and its impact on internet users
Big data analytics and its impact on internet users
 
Recommender Systems and Misinformation: The Problem or the Solution?
Recommender Systems and Misinformation: The Problem or the Solution?Recommender Systems and Misinformation: The Problem or the Solution?
Recommender Systems and Misinformation: The Problem or the Solution?
 
Recommendation systems
Recommendation systems  Recommendation systems
Recommendation systems
 
How NOT to Aggregrate Polling Data
How NOT to Aggregrate Polling DataHow NOT to Aggregrate Polling Data
How NOT to Aggregrate Polling Data
 

Andere mochten auch

Causal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practicesCausal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practicesAmit Sharma
 
Agenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzoAgenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzocolegiommc
 
Обзор периодической печати колледжа.
Обзор периодической печати колледжа.Обзор периодической печати колледжа.
Обзор периодической печати колледжа.Димка Куликов
 
типы химических связей
типы химических связейтипы химических связей
типы химических связейOlga Pishchik
 
The role of social connections in shaping our preferences
The role of social connections in shaping our preferencesThe role of social connections in shaping our preferences
The role of social connections in shaping our preferencesAmit Sharma
 
11 al 15 de julio
11 al 15 de julio11 al 15 de julio
11 al 15 de juliocolegiommc
 
бенефис почтенной книге
бенефис почтенной книгебенефис почтенной книге
бенефис почтенной книгеДимка Куликов
 
Logistica elecciones 2014
Logistica elecciones 2014Logistica elecciones 2014
Logistica elecciones 2014colegiommc
 
Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)colegiommc
 
фотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулезафотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулезаДимка Куликов
 
Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ. Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ. Димка Куликов
 
From Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and IconsFrom Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and IconsOfficeReports
 
Injury prevention for female field hockey players sport coaching pedagogy
Injury prevention for female field hockey players   sport coaching pedagogyInjury prevention for female field hockey players   sport coaching pedagogy
Injury prevention for female field hockey players sport coaching pedagogyniinaflaherty
 
Reporting in Excel, PowerPoint and Word
Reporting in Excel, PowerPoint and WordReporting in Excel, PowerPoint and Word
Reporting in Excel, PowerPoint and WordOfficeReports
 

Andere mochten auch (20)

Causal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practicesCausal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practices
 
Agenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzoAgenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzo
 
Обзор периодической печати колледжа.
Обзор периодической печати колледжа.Обзор периодической печати колледжа.
Обзор периодической печати колледжа.
 
типы химических связей
типы химических связейтипы химических связей
типы химических связей
 
Semana 19
Semana 19Semana 19
Semana 19
 
гид2013
гид2013гид2013
гид2013
 
Semana 20 (1)
Semana 20 (1)Semana 20 (1)
Semana 20 (1)
 
The role of social connections in shaping our preferences
The role of social connections in shaping our preferencesThe role of social connections in shaping our preferences
The role of social connections in shaping our preferences
 
Semana 24
Semana 24Semana 24
Semana 24
 
11 al 15 de julio
11 al 15 de julio11 al 15 de julio
11 al 15 de julio
 
бенефис почтенной книге
бенефис почтенной книгебенефис почтенной книге
бенефис почтенной книге
 
Logistica elecciones 2014
Logistica elecciones 2014Logistica elecciones 2014
Logistica elecciones 2014
 
Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)
 
тюз
тюзтюз
тюз
 
фотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулезафотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулеза
 
Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ. Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ.
 
From Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and IconsFrom Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and Icons
 
Injury prevention for female field hockey players sport coaching pedagogy
Injury prevention for female field hockey players   sport coaching pedagogyInjury prevention for female field hockey players   sport coaching pedagogy
Injury prevention for female field hockey players sport coaching pedagogy
 
Reporting in Excel, PowerPoint and Word
Reporting in Excel, PowerPoint and WordReporting in Excel, PowerPoint and Word
Reporting in Excel, PowerPoint and Word
 
урок по-географии (1)
урок по-географии (1)урок по-географии (1)
урок по-географии (1)
 

Ähnlich wie Auditing search engines for differential satisfaction across demographics

Measuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systemsMeasuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systemsAmit Sharma
 
Developing and testing search engine algorithms –
Developing and testing search engine algorithms –Developing and testing search engine algorithms –
Developing and testing search engine algorithms –vonreventlow
 
Measuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kimMeasuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kimJin Young Kim
 
Social media data analysis
Social media data analysisSocial media data analysis
Social media data analysisShweta Patnaik
 
Influence of Timeline and Named-entity Components on User Engagement
Influence of Timeline and Named-entity Components on User Engagement Influence of Timeline and Named-entity Components on User Engagement
Influence of Timeline and Named-entity Components on User Engagement Roi Blanco
 
Social media presentation for fcn 3-6-14 jao
Social media presentation for fcn   3-6-14 jaoSocial media presentation for fcn   3-6-14 jao
Social media presentation for fcn 3-6-14 jaoJessica Orquina
 
Dasts16 a koene_un_bias
Dasts16 a koene_un_biasDasts16 a koene_un_bias
Dasts16 a koene_un_biasAnsgar Koene
 
Tutorial on metrics of user engagement -- Applications to Search & E- commerce
Tutorial on metrics of user engagement -- Applications to Search & E- commerceTutorial on metrics of user engagement -- Applications to Search & E- commerce
Tutorial on metrics of user engagement -- Applications to Search & E- commerceMounia Lalmas-Roelleke
 
Simple Measures, Big Results: Measuring Program Impact Data
Simple Measures, Big Results: Measuring Program Impact DataSimple Measures, Big Results: Measuring Program Impact Data
Simple Measures, Big Results: Measuring Program Impact DataTechSoup
 
Personalization of the Web Search
Personalization of the Web SearchPersonalization of the Web Search
Personalization of the Web SearchIJMER
 
Structural Balance Theory Based Recommendation for Social Service Portal
Structural Balance Theory Based Recommendation for Social Service PortalStructural Balance Theory Based Recommendation for Social Service Portal
Structural Balance Theory Based Recommendation for Social Service PortalYogeshIJTSRD
 
The influence of social media advertising on consumer.pptx
The influence of social media advertising on consumer.pptxThe influence of social media advertising on consumer.pptx
The influence of social media advertising on consumer.pptxMuhammadBilal23726
 
Personalization of the Web Search
Personalization of the Web SearchPersonalization of the Web Search
Personalization of the Web SearchIJMER
 
Analytics Slides for IDEA Programme @Cattolica
Analytics Slides for IDEA Programme @CattolicaAnalytics Slides for IDEA Programme @Cattolica
Analytics Slides for IDEA Programme @CattolicaGiorgio Suighi
 

Ähnlich wie Auditing search engines for differential satisfaction across demographics (20)

Measuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systemsMeasuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systems
 
Developing and testing search engine algorithms –
Developing and testing search engine algorithms –Developing and testing search engine algorithms –
Developing and testing search engine algorithms –
 
Using social network sites
Using social network sites Using social network sites
Using social network sites
 
MKT 380 Week 10
MKT 380 Week 10MKT 380 Week 10
MKT 380 Week 10
 
Measuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kimMeasuring the Quality of Online Service - Jinyoung kim
Measuring the Quality of Online Service - Jinyoung kim
 
Social media data analysis
Social media data analysisSocial media data analysis
Social media data analysis
 
Influence of Timeline and Named-entity Components on User Engagement
Influence of Timeline and Named-entity Components on User Engagement Influence of Timeline and Named-entity Components on User Engagement
Influence of Timeline and Named-entity Components on User Engagement
 
Social media presentation for fcn 3-6-14 jao
Social media presentation for fcn   3-6-14 jaoSocial media presentation for fcn   3-6-14 jao
Social media presentation for fcn 3-6-14 jao
 
Dasts16 a koene_un_bias
Dasts16 a koene_un_biasDasts16 a koene_un_bias
Dasts16 a koene_un_bias
 
Tutorial on metrics of user engagement -- Applications to Search & E- commerce
Tutorial on metrics of user engagement -- Applications to Search & E- commerceTutorial on metrics of user engagement -- Applications to Search & E- commerce
Tutorial on metrics of user engagement -- Applications to Search & E- commerce
 
CS-IS 027
CS-IS 027CS-IS 027
CS-IS 027
 
An engaging click
An engaging clickAn engaging click
An engaging click
 
Simple Measures, Big Results: Measuring Program Impact Data
Simple Measures, Big Results: Measuring Program Impact DataSimple Measures, Big Results: Measuring Program Impact Data
Simple Measures, Big Results: Measuring Program Impact Data
 
Personalization of the Web Search
Personalization of the Web SearchPersonalization of the Web Search
Personalization of the Web Search
 
Structural Balance Theory Based Recommendation for Social Service Portal
Structural Balance Theory Based Recommendation for Social Service PortalStructural Balance Theory Based Recommendation for Social Service Portal
Structural Balance Theory Based Recommendation for Social Service Portal
 
The influence of social media advertising on consumer.pptx
The influence of social media advertising on consumer.pptxThe influence of social media advertising on consumer.pptx
The influence of social media advertising on consumer.pptx
 
Qs1 group a
Qs1 group a Qs1 group a
Qs1 group a
 
SM&WA_S1-2.pptx
SM&WA_S1-2.pptxSM&WA_S1-2.pptx
SM&WA_S1-2.pptx
 
Personalization of the Web Search
Personalization of the Web SearchPersonalization of the Web Search
Personalization of the Web Search
 
Analytics Slides for IDEA Programme @Cattolica
Analytics Slides for IDEA Programme @CattolicaAnalytics Slides for IDEA Programme @Cattolica
Analytics Slides for IDEA Programme @Cattolica
 

Mehr von Amit Sharma

Dowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceDowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceAmit Sharma
 
Alleviating Privacy Attacks Using Causal Models
Alleviating Privacy Attacks Using Causal ModelsAlleviating Privacy Attacks Using Causal Models
Alleviating Privacy Attacks Using Causal ModelsAmit Sharma
 
DoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolDoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolAmit Sharma
 
Causal inference in data science
Causal inference in data scienceCausal inference in data science
Causal inference in data scienceAmit Sharma
 
Equivalence causal frameworks: SEMs, Graphical models and Potential Outcomes
Equivalence causal frameworks: SEMs, Graphical models and Potential OutcomesEquivalence causal frameworks: SEMs, Graphical models and Potential Outcomes
Equivalence causal frameworks: SEMs, Graphical models and Potential OutcomesAmit Sharma
 
Estimating influence of online activity feeds on people's actions
Estimating influence of online activity feeds on people's actionsEstimating influence of online activity feeds on people's actions
Estimating influence of online activity feeds on people's actionsAmit Sharma
 
From prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systemsFrom prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systemsAmit Sharma
 
Causal inference in practice
Causal inference in practiceCausal inference in practice
Causal inference in practiceAmit Sharma
 
Causal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhereCausal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhereAmit Sharma
 
The interplay of personal preference and social influence in sharing networks...
The interplay of personal preference and social influence in sharing networks...The interplay of personal preference and social influence in sharing networks...
The interplay of personal preference and social influence in sharing networks...Amit Sharma
 
[RecSys '13]Pairwise Learning: Experiments with Community Recommendation on L...
[RecSys '13]Pairwise Learning: Experiments with Community Recommendation on L...[RecSys '13]Pairwise Learning: Experiments with Community Recommendation on L...
[RecSys '13]Pairwise Learning: Experiments with Community Recommendation on L...Amit Sharma
 
RSWEB 2013: A research platform for social recommendation
RSWEB 2013: A research platform for social recommendationRSWEB 2013: A research platform for social recommendation
RSWEB 2013: A research platform for social recommendationAmit Sharma
 

Mehr von Amit Sharma (12)

Dowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceDowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inference
 
Alleviating Privacy Attacks Using Causal Models
Alleviating Privacy Attacks Using Causal ModelsAlleviating Privacy Attacks Using Causal Models
Alleviating Privacy Attacks Using Causal Models
 
DoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolDoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End tool
 
Causal inference in data science
Causal inference in data scienceCausal inference in data science
Causal inference in data science
 
Equivalence causal frameworks: SEMs, Graphical models and Potential Outcomes
Equivalence causal frameworks: SEMs, Graphical models and Potential OutcomesEquivalence causal frameworks: SEMs, Graphical models and Potential Outcomes
Equivalence causal frameworks: SEMs, Graphical models and Potential Outcomes
 
Estimating influence of online activity feeds on people's actions
Estimating influence of online activity feeds on people's actionsEstimating influence of online activity feeds on people's actions
Estimating influence of online activity feeds on people's actions
 
From prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systemsFrom prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systems
 
Causal inference in practice
Causal inference in practiceCausal inference in practice
Causal inference in practice
 
Causal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhereCausal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhere
 
The interplay of personal preference and social influence in sharing networks...
The interplay of personal preference and social influence in sharing networks...The interplay of personal preference and social influence in sharing networks...
The interplay of personal preference and social influence in sharing networks...
 
[RecSys '13]Pairwise Learning: Experiments with Community Recommendation on L...
[RecSys '13]Pairwise Learning: Experiments with Community Recommendation on L...[RecSys '13]Pairwise Learning: Experiments with Community Recommendation on L...
[RecSys '13]Pairwise Learning: Experiments with Community Recommendation on L...
 
RSWEB 2013: A research platform for social recommendation
RSWEB 2013: A research platform for social recommendationRSWEB 2013: A research platform for social recommendation
RSWEB 2013: A research platform for social recommendation
 

Kürzlich hochgeladen

Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformationAnnie Melnic
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are successPratikSingh115843
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etclalithasri22
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfnikeshsingh56
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfPratikPatil591646
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 

Kürzlich hochgeladen (17)

Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformation
 
2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are success
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etc
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdf
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 

Auditing search engines for differential satisfaction across demographics

  • 1. Auditing Search Engines for Differential Satisfaction across Demographics Rishabh Mehrotra, Ashton Anderson, Fernando Diaz, Amit Sharma, Hanna Wallach, Emine Yilmaz Microsoft Research New York
  • 2. From public libraries to search engines
  • 3. Motivation for auditing • Ethical • Equal access to everyone • Practical • Equal access helps attract a large and diverse population of users • Service providers are scrutinized for seemingly unfair behavior [1,2,3] • We offer methods for auditing a system’s performance for detection of differences in user satisfaction across demographics [1] N. Diakopoulos. Algorithmic accountability. Digital Journalism, 3(3):398–415, 2015 [2] S. Barocas and A. D. Selbst. Big data’s disparate impact. California Law Review, 104, 2016. [3] C. Munoz, M. Smith, and D. Patel. Big data: A report on algorithmic systems, opportunity, and civil rights. Technical report, Executive Office of the President of the United States, May 2016.
  • 4. Tricky: straightforward optimization can lead to differential performance • Search engine uses a standard metric: time spent on clicked result page as an indicator of satisfaction. • Goal: estimate difference in user satisfaction between these two demographic groups. • Suppose older users issue more of “retirement planning” queries Age: >50 years 80% users 10% users Age: <30 years …
  • 5. 1. Overall metrics can hide differential satisfaction • Average user satisfaction for “retirement planning” may be high. But, • Average satisfaction for younger users=0.7 • Average satisfaction for older users=0.2
  • 6. 2. Query-level metrics can hide differential satisfaction <query> <query> <query> <query> <query> <query> retirement planning <query> <query> retirement planning retirement planning <query> retirement planning … Same user satisfaction for “retirement planning” for both older and younger users = 0.7 What if average satisfaction for <query>=0.9? Older users still receiving more of lower- quality results than younger users. Younger users Older users
  • 7. 3. More critically, even individual- level metrics can also hide differential satisfaction Reading time for the same webpage result for the same user satisfaction Time spent on a webpage Younger Users Older Users
  • 8. We must control for natural demographic variation to meaningfully audit for differential satisfaction.
  • 9. Data: Demographic characteristics of search engine users • Internal logs from Bing.com for two weeks • 4 M users | 32 M impressions | 17 M sessions • Demographics: Age & Gender • Age: • post-Millenial: <18 • Millenial: 18-34 • Generation X: 35-54 • Baby Boomer: 55 - 74
  • 10. Demographic distribution of user activity Age Groups
  • 11. Overall metrics across Demographics Four metrics: Graded Utility (GU) Reformulation Rate (RR) Successful Click Count (SCC) Page Click Count (PCC)
  • 12. Pitfalls with Overall Metrics • Conflate two separate effects: • natural demographic variation caused by the differing traits among the different demographic groups e.g. • Different queries issued • Different information need for the same query • Even for the same satisfaction, demographic A tends to click more than demographic B • Systemic difference in user satisfaction due to the search engine
  • 13. Utilize work from causal inference Information Need Demographics Metric User satisfaction Query Search Results
  • 14. I. Context Matching: selecting for activity with near-identical context Information Need Demographics Metric User satisfaction Query Search Results Context
  • 15. Information Need Demographics Metric User satisfaction Query Search Results Context For any two users from different demographics, 1. Same Query 2. Same Information Need: 1. Control for user intent: same final SAT click 2. Only consider navigational queries 3. Identical top-8 Search Results 1.2 M impressions, 19K unique queries, 617K users
  • 16. Age-wise differences in metrics disappear • General auditing tool: robust • Very low coverage across queries • Did we control for too much?
  • 17. II. Query-level hierarchical model: Differential satisfaction for the same query Information Need Demographics Metric User satisfaction Query Search Results
  • 18. • Simply fitting different models for each query will not work for less popular queries. • We formulate a hierarchical model that borrows strength from more popular queries: • Consider metric for each impression---query and user---as a deviation from overall metric based on: • Query Topic • User demographics
  • 19. Age-wise differences appear again: bigger differences for harder queries
  • 20. III. Query-level pairwise model: Estimating satisfaction directly by considering pairs of users Information Need Demographics Metric User satisfaction Query Search Results
  • 21. Estimating absolute satisfaction is non-trivial • Instead, Estimate relative satisfaction by considering pairs of users for the same query • Conservative proxy for pairwise satisfaction by only considering “big” differences in observed metric for the same query • Logistic regression model for estimating probability of impression i being more satisfied than impression j:
  • 22. Again, see a small age-wise difference in satisfaction
  • 23. • Auditing is more nuanced than merely measuring metrics on demographically-binned traffic. • We find light trend towards older users being more satisfied. • General framework for auditing systems • Plug-in different metrics • Plug-in different demographics/user groups • Suggests recalibration of metrics based on demographics Discussion
  • 24. Thank You! Amit Sharma Postdoctoral Researcher http://www.amitsharma.in @amt_shrma amshar@microsoft.com Auditing is more nuanced than merely measuring metrics on demographically-binned traffic. General framework for auditing systems Plug-in different metrics Plug-in different demographics/user groups Paper: http://datworkshop.org/papers/dat16-final41.pdf