Collaborative Filtering for fun …and profit!
Elissa Brown
Erin Shellman
Stella Rowlett
How’s it done? 
‣ Collaborative filtering! 
‣ Look for people who like the stuff 
you like, and recommend the 
things they’ve rated positively 
that you haven’t seen yet.
Step 1: Collect ratings
          Leaf Shorts   Floral Shorts
Elissa    5             5
Erin      2             5
Stella    4             1
[Figure: scatter plot of each person's ratings. x-axis: Leaf Shorts (1 to 5); y-axis: Floral Shorts (1 to 5); points: Elissa, Erin, Stella.]
Step 2: Find someone 
similar for Mr. New Customer 
‣ A new customer wants to 
buy a pair of shorts for 
his new “girlfriend.” 
 
‣ Which of us is most 
similar to Justin? 
 
 

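Finding the nearest neighbor
‣ Remember the Pythagorean Theorem?
$a^2 + b^2 = c^2$
$c = \sqrt{a^2 + b^2}$
$\text{distance} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
$\sqrt{(1 - 5)^2 + (4 - 5)^2} \approx 4.12$ (Justin to Elissa)
[Figure: scatter plot of ratings. x-axis: Leaf Shorts (1 to 5); y-axis: Floral Shorts (1 to 5); points: Elissa, Erin, Justin, Stella.]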
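The calculation above can be checked in a few lines of Python. This is only a minimal sketch: the helper name `euclidean_distance` and the tuple layout (Leaf Shorts rating first, Floral Shorts rating second) are assumptions made for illustration and do not come from the slides.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two 2-D points (hypothetical helper)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Ratings as (Leaf Shorts, Floral Shorts), taken from the slides.
justin = (1, 4)
elissa = (5, 5)

distance = round(euclidean_distance(justin, elissa), 2)
print(distance)  # 4.12
```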
What’s the distance
between Justin and Stella?
[Figure: scatter plot of ratings. x-axis: Leaf Shorts (1 to 5); y-axis: Floral Shorts (1 to 5); points: Elissa, Erin, Justin, Stella.]
$\sqrt{(1 - 4)^2 + (4 - 1)^2} \approx 4.24$
Meant to be!
‣ Justin and Erin are a match made in heaven.
$\sqrt{(1 - 2)^2 + (4 - 5)^2} \approx 1.41$
‣ How can this help us shop for Justin’s girlfriend (not Erin)?
[Figure: scatter plot of ratings. x-axis: Leaf Shorts (1 to 5); y-axis: Floral Shorts (1 to 5); points: Elissa, Erin, Justin, Stella.]
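Finding the match amounts to computing Justin's distance to each person and picking the smallest. The sketch below does that for the three raters; the variable names (`ratings`, `ranked`, `nearest`) and the tuple layout are assumptions for illustration.

```python
import math

# Ratings as (Leaf Shorts, Floral Shorts), taken from the slides.
ratings = {
    "Elissa": (5, 5),
    "Erin": (2, 5),
    "Stella": (4, 1),
}
justin = (1, 4)

# Pair each person with their distance to Justin, then sort closest-first.
ranked = sorted(
    (round(math.hypot(justin[0] - leaf, justin[1] - floral), 2), name)
    for name, (leaf, floral) in ratings.items()
)

for distance, name in ranked:
    print(name, distance)

nearest = ranked[0][1]  # the person with the smallest distance
```

Putting the distance first in each tuple lets a plain `sorted` call order the neighbors without a custom key.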
Step 3: Make a suggestion!
          Leaf Shorts   Floral Shorts   Jungle Shorts
Elissa    5             5               1
Erin      2             5               5
Stella    4             1               2
Justin    1             4               ?
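One simple way to fill in the question mark is to borrow the nearest neighbor's rating for anything Justin has not rated. The sketch below assumes that rule and hard-codes Erin as the nearest neighbor, as worked out on the earlier slides; the name `suggestions` is hypothetical.

```python
# Ratings table from the slide; Justin has not rated the Jungle Shorts.
user_ratings = {
    "Elissa": {"Leaf Shorts": 5, "Floral Shorts": 5, "Jungle Shorts": 1},
    "Erin": {"Leaf Shorts": 2, "Floral Shorts": 5, "Jungle Shorts": 5},
    "Stella": {"Leaf Shorts": 4, "Floral Shorts": 1, "Jungle Shorts": 2},
    "Justin": {"Leaf Shorts": 1, "Floral Shorts": 4},
}

nearest = "Erin"  # assumption: Justin's closest match from the earlier slides

# Borrow the neighbor's rating for everything Justin has not rated yet.
suggestions = {
    item: rating
    for item, rating in user_ratings[nearest].items()
    if item not in user_ratings["Justin"]
}
print(suggestions)  # {'Jungle Shorts': 5}
```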
Let’s go shopping! 

How should we store our ratings data?
A dictionary! 
# Store user ratings
user_ratings = {'Elissa': {'Leaf Shorts': 5, 'Floral Shorts': 5},
                'Erin': {'Leaf Shorts': 2, 'Floral Shorts': 5},
                'Stella': {'Leaf Shorts': 4, 'Floral Shorts': 1}
                }
Python Warm-up!
# Store user ratings
user_ratings = {'Elissa': {'Leaf Shorts': 5, 'Floral Shorts': 5},
                'Erin': {'Leaf Shorts': 2, 'Floral Shorts': 5},
                'Stella': {'Leaf Shorts': 4, 'Floral Shorts': 1}
                }

# Print Stella's ratings.
print(user_ratings['Stella'])

# What did Elissa think of the floral ones?
print(user_ratings['Elissa']['Floral Shorts'])

def what_did_they_think(rating):
    # Ratings above 3 count as a positive review.
    if rating > 3:
        opinion = 'LOVED IT!'
    else:
        opinion = 'HATED IT!'
    return opinion

my_humble_opinion = what_did_they_think(user_ratings['Erin']['Leaf Shorts'])

print("Erin's review of the Leaf Shorts: " + my_humble_opinion)
What functions do we need 
to build a recommender?
1. Function to compute distances 
2. Function to find nearby people 
3. Function to recommend items I haven’t rated yet
Pseudocode: 
Function to compute distances
compute_distance 
def compute_distance(user1_ratings, user2_ratings):
    """
    This function computes the distance between two users'
    ratings. Both arguments should be dictionaries that map
    items to ratings.
    """
    distances = []
    # Only items rated by both users contribute to the distance.
    for key in user1_ratings:
        if key in user2_ratings:
            distances.append((user1_ratings[key] - user2_ratings[key]) ** 2)
    total_distance = round(sum(distances) ** 0.5, 2)
    return total_distance
Pseudocode: 
Function to find closest match
find_nearest_neighbors 
def find_nearest_neighbors(username, user_ratings):
    """
    Returns the list of neighbors, ordered by distance.
    Call like this: find_nearest_neighbors('Erin', user_ratings)
    """
    distances = []
    for user in user_ratings:
        if user != username:
            distance = compute_distance(user_ratings[user], user_ratings[username])
            # Distance goes first so that sort() orders by it.
            distances.append((distance, user))
    distances.sort()
    return distances
Pseudocode: 
Function to make recommendations
get_recommendations 
def get_recommendations(username, user_ratings):
    """
    Return a list of (item, rating) recommendations, highest rating first.
    """
    nearest_users = find_nearest_neighbors(username, user_ratings)
    recommendations = []

    # Input user's ratings
    ratings = user_ratings[username]

    for neighbor in nearest_users:
        neighbor_name = neighbor[1]
        for item in user_ratings[neighbor_name]:
            # Only suggest items the user has not rated yet.
            if item not in ratings:
                recommendations.append((item, user_ratings[neighbor_name][item]))

    return sorted(recommendations,
                  key=lambda recommendation: recommendation[1],
                  reverse=True)
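To see the three functions work together, the sketch below runs them on the Step 3 ratings table. The function bodies are condensed restatements of the slide code, written as comprehensions so the example is self-contained; they are meant to behave the same way but are not the authors' exact text.

```python
def compute_distance(user1_ratings, user2_ratings):
    # Euclidean distance over the items both users have rated.
    shared = [item for item in user1_ratings if item in user2_ratings]
    total = sum((user1_ratings[i] - user2_ratings[i]) ** 2 for i in shared)
    return round(total ** 0.5, 2)

def find_nearest_neighbors(username, user_ratings):
    # (distance, user) pairs for everyone else, closest first.
    return sorted(
        (compute_distance(user_ratings[user], user_ratings[username]), user)
        for user in user_ratings
        if user != username
    )

def get_recommendations(username, user_ratings):
    # Every neighbor's rating for items the user has not rated, best first.
    ratings = user_ratings[username]
    recommendations = [
        (item, user_ratings[neighbor][item])
        for _, neighbor in find_nearest_neighbors(username, user_ratings)
        for item in user_ratings[neighbor]
        if item not in ratings
    ]
    return sorted(recommendations, key=lambda pair: pair[1], reverse=True)

# Ratings table from the "Step 3" slide.
user_ratings = {
    "Elissa": {"Leaf Shorts": 5, "Floral Shorts": 5, "Jungle Shorts": 1},
    "Erin": {"Leaf Shorts": 2, "Floral Shorts": 5, "Jungle Shorts": 5},
    "Stella": {"Leaf Shorts": 4, "Floral Shorts": 1, "Jungle Shorts": 2},
    "Justin": {"Leaf Shorts": 1, "Floral Shorts": 4},
}

neighbors = find_nearest_neighbors("Justin", user_ratings)
recommendations = get_recommendations("Justin", user_ratings)
print(neighbors)
# [(1.41, 'Erin'), (4.12, 'Elissa'), (4.24, 'Stella')]
print(recommendations)
# [('Jungle Shorts', 5), ('Jungle Shorts', 2), ('Jungle Shorts', 1)]
```

Note that the Jungle Shorts appear three times: every neighbor's rating is collected, however far away that neighbor is, and the final sort is by rating rather than by distance. Keeping only the closest neighbor's ratings would be one possible refinement.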
Limitations 
‣ What happens if everyone rated the 
same group of items? 
‣ What if there’s no overlap? 
‣ What about new users with no ratings?
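The "no overlap" question has a concrete answer for the distance function shown earlier: when two users share no items, the sum of squared differences is empty, so the distance comes out as 0 and a total stranger looks like a perfect match. The sketch below shows one possible guard; `compute_distance_or_none` is a hypothetical variant, not part of the original deck, and `newcomer` is made-up data.

```python
def compute_distance_or_none(user1_ratings, user2_ratings):
    """Euclidean distance over shared items, or None if nothing is shared.

    Hypothetical variant of compute_distance, not from the original slides.
    """
    shared = [item for item in user1_ratings if item in user2_ratings]
    if not shared:
        return None  # no evidence either way, so report "unknown"
    total = sum((user1_ratings[i] - user2_ratings[i]) ** 2 for i in shared)
    return round(total ** 0.5, 2)

erin = {"Leaf Shorts": 2, "Floral Shorts": 5}
newcomer = {"Jungle Shorts": 3}  # made-up user with no items in common

no_overlap = compute_distance_or_none(erin, newcomer)
identical = compute_distance_or_none(erin, dict(erin))
print(no_overlap)  # None
print(identical)   # 0.0
```

A caller would then need to skip neighbors whose distance is `None` before sorting, since `None` cannot be compared with numbers in Python 3.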
Now what? 
‣Ask your classmates to rate their electives, 
and make a recommender to help students 
pick their classes. 
‣Poll recent graduates from your school about 
their college, and make a recommender to 
help students pick colleges.
 
https://github.com/erinshellman/girls-who-code-recommender
