Preventing Filter Bubbles and
Underprovision in Online
Communities with Social Curation
Algorithms:
Data-driven approaches to measuring “bias”
Jahna Otterbacher
Open University Cyprus, Nicosia CYPRUS
Libby Hemphill
Illinois Institute of Technology, Chicago USA
Social Curation Algorithms
in Online Communities
• Low barriers to entry
• Users contribute to a collection of shared content
• Users judge the value of content via binary voting
• Aggregated votes used in information display(s)
Aarhus University, 3 October 2013
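A minimal sketch of how binary votes might be aggregated into a default information display. The `Item` class, the "fraction of positive votes" score and the tie-break on total votes are illustrative assumptions; the slides do not state which formula any particular site uses.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A contribution with binary votes (hypothetical structure)."""
    text: str
    up: int = 0    # "useful" / thumbs-up votes
    down: int = 0  # "not useful" / thumbs-down votes

    def helpful_fraction(self) -> float:
        total = self.up + self.down
        return self.up / total if total else 0.0


def default_display(items):
    """Rank by fraction of positive votes; break ties by total votes."""
    return sorted(
        items,
        key=lambda it: (it.helpful_fraction(), it.up + it.down),
        reverse=True,
    )


items = [Item("a", 8, 2), Item("b", 1, 0), Item("c", 40, 10), Item("d")]
ranked = [it.text for it in default_display(items)]
```

Even in this toy version, an item with a single positive vote outranks one with forty, which hints at how the choice of aggregation rule can shape what users see first.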
Bias
• Content with particular properties systematically ranked
higher/lower than others
• Information display gives users a particular take on “what
others think”
• Prominently displayed content is what users see and read
• Users often do not change default settings
• They place trust in information displays
Gender bias at IMDb
Editing bias at Amazon, IMDb and Yelp
Underprovision problem
• When social curation is used:
“too many people rely on others to contribute without
doing so themselves.” [Gilbert, 2013]
• Study of Reddit
• Most communities suffer from some degree of free riding
• At Reddit, users’ contributions being buried led to disincentives for
contributions
• “…it’s such an incredible resource when the comments are flowing,
but if your post gets buried for whatever reason, it’s painfully
anti-climactic.”
Our perspective
• Bias is inevitable and is not necessarily bad
• Presence of bias could be revealed to users
• Research questions
• What types of biases may occur?
• Under what circumstances?
• How can we study bias across systems?
Proposed framework
• Find diverse examples of systems
• Taxonomy of biases
• Participation rates and participant roles
• Examine correlations between system/participant
characteristics and observed biases
• Generate ideas of how to respond
Bias taxonomy
• Contributor characteristics
• Demographics
• Level, type of activities
• Information disclosure
• Contribution characteristics
• Writing style (e.g., narrative/reporting)
• Content (e.g., uniqueness/conformity)
• Metadata (e.g., time posted)
Participation rates & roles
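The speaker notes for this slide describe three activities (consuming, sharing and voting on content) and seven possible participant roles formed from their combinations. A minimal sketch of that enumeration follows; the names `ACTIVITIES` and `participant_roles` are hypothetical and not taken from the paper.

```python
from itertools import combinations

# The three activities named in the speaker notes.
ACTIVITIES = ("consume", "share", "vote")


def participant_roles(activities=ACTIVITIES):
    """Every non-empty combination of activities is one participant role."""
    roles = []
    for size in range(1, len(activities) + 1):
        for combo in combinations(activities, size):
            roles.append(frozenset(combo))
    return roles


roles = participant_roles()
fully_engaged = frozenset(ACTIVITIES)  # the role that performs all three
```

With three activities there are 2^3 - 1 = 7 non-empty combinations, which matches the seven roles mentioned in the notes.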
Correlations
• How are system and participant characteristics correlated
to the biases that we observe?
• Are more information displays necessarily better?
• Which default display leads to more/less diversity with
respect to a given characteristic of content?
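One way the last question might be made measurable is to compare the diversity of the top-k items under different displays. The sketch below uses Shannon entropy over a single content characteristic; the entropy measure, the field names (`style`, `votes`, `posted`) and the toy data are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Entropy in bits of the label distribution (0.0 when all labels match)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def top_k_diversity(items, sort_key, attribute, k=10):
    """Diversity of `attribute` among the k items a display shows first."""
    top = sorted(items, key=sort_key, reverse=True)[:k]
    return shannon_entropy([item[attribute] for item in top])


# Toy data: hypothetical reviews with a writing-style label.
items = [
    {"style": "reporting", "votes": 9, "posted": 4},
    {"style": "reporting", "votes": 8, "posted": 1},
    {"style": "narrative", "votes": 3, "posted": 3},
    {"style": "narrative", "votes": 2, "posted": 2},
]

by_votes = top_k_diversity(items, lambda it: it["votes"], "style", k=2)
by_recency = top_k_diversity(items, lambda it: it["posted"], "style", k=2)
```

In this toy data the vote-ordered display shows only one writing style in its top two, while the recency-ordered display shows both.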
Final thoughts
• Can we exploit bias in order to
• Entice users to participate in all activities?
• Convince users to question default information displays?
Thank you!
jahna.otterbacher@gmail.com

Editor's Notes

  1. I’d like to start out by showing you some examples of the types of OCS that we’re studying. This is a screenshot from the Internet Movie Database (IMDb), a shared space where users can contribute ratings and reviews of films. The Godfather is currently listed as #2 on IMDb’s top movie list, and you can see that over 1500 users have contributed reviews. In terms of what users are doing at this OCS, of course they’re critiquing films and expressing their opinions. However, they’re also often trying to make sense of the deeper meaning of a film. Now, I’d like to point out how reviews are organized: users are asked whether or not a review was useful. Votes are then aggregated over all users and displayed above the review, and reviews are listed in rank order by usefulness. That ordering is the default filter, “Best.”
  2. The second example is from Yahoo News, which aggregates news stories and provides a place for users to leave comments. This is obviously an article that was in the news this week. At the time I took the screenshot, the article had been published for about 2.5 hours and there were almost 2000 user comments on it. In terms of what we can observe users doing here, of course there’s a good deal of sounding off about issues, but we also observe users trying to process an event that has happened or make sense of it. In terms of organization, we again see that a voting mechanism is in place, only here users can give a thumbs up or down on a comment. The default display is “Popular Now,” which prioritizes contributions that are recent and that have received positive votes.
  3. We are interested in the biases in the information display that result from the use of binary voting in OCS. Our working definition is… In other words, the information display gives users a particular take on “what others think.” The prominently displayed content is what users see and read, and if they have approached the OCS in an effort to make sense of a news item or a film or an issue, their impression of “what others think” is likely to be based on the highly ranked content. And we know that’s likely to happen because there’s a great deal of research on how we interact with information, particularly when it’s presented to us as a ranked list of items.
  4. Here is an example of a type of bias that many of us would consider undesirable. In our examination of the IMDb voting mechanism, we’ve found that, generally speaking, women’s contributions are perceived as less “useful” than men’s. Here, I show two reviews of the classic movie Casablanca. They illustrate the differences in writing style between genders, as well as the fact that women’s reviews receive fewer total votes and fewer “useful” votes. We’re quite convinced that when a user looks up a film at IMDb in order to see “what people think,” what he or she is getting is the men’s impressions of the film.
  5. Now, to contrast, here’s one that many might find desirable. At three different OCS, we studied samples of forums reviewing various products, movies and services. A consistent finding was that front-page reviews (i.e., those deemed helpful/useful by the crowd) are better edited than those on later pages. In this example, you see a review of a phone at Amazon. The content of the review is actually pretty comparable to that of highly ranked reviews, but you can see that the presentation is not very standard (no capitalization, non-standard abbreviations).
  6. We want to better understand social voting bias and we need a way to study bias systematically. The current paper sketches our framework for a cross-system study of bias in OCS using binary voting mechanisms.
  7. Along with our students, we’ve been collecting a diverse set of examples of OCS, and we’ve been characterizing them with respect to their organizational genomes, based on Malone and colleagues’ framework. As you can see, we’re able to identify OCS that have essentially the same organizational genomes across many domains (such as health, news and entertainment). We’re discovering the “social voting genomes” of these systems as well. A social voting genome has three genes: the voting construct, the default information display, and the alternative displays. It’s interesting to note that different voting constructs are used, and it will be interesting in future work to consider whether it makes any difference which constructs users vote on.
  8. Casting our net wide, what we’re doing right now is having a look at our example OCS in order to come up with a taxonomy of biases that might occur. In other words, we’re trying to understand which characteristics of contributors and of contributions might be susceptible to ranking bias.
  9. One obvious challenge for us in our study of bias in OCS is getting a feel for who does what. There are really only three activities in the OCS (consuming, sharing and voting on content), and this table shows the seven possible participant roles, based on combinations of these activities. Ideally, every participant would be fully engaged, but of course we know that’s very unlikely. What we plan to do is to survey participants across several OCS (with similar genomes). What we want to examine is how the distribution of participant roles relates to the biases we observe. Of course, the less fully engaged users are, the more undesirable biases we might expect.
  10. Finally,
  11. We’ve advocated for a better understanding of the biases that result when binary social voting mechanisms are used in OCS. Once we develop ways to detect and better understand which types of biases occur in a given system, we argue that we might be able to exploit that bias by revealing its presence to users. We’d like to entice them to increase their participation across all activities in an OCS, as well as to go beyond just using the default information displays. For example, in our IMDb example, if we revealed to users that the majority of top reviews of a film of interest were written by men, that might make them curious enough to explore the alternative displays that you see here, and perhaps become exposed to more diverse content.