Recommender Systems

This presentation covers the theory and basics of recommender systems, with a focus on collaborative and content-based filtering.

Vipul Rajan
Lead Developer
Whiteklay
Vipul holds a keen interest in machine learning and recommender
systems. He has worked extensively with Apache Spark, leveraging the
platform to provide optimal solutions to a variety of problems across a
wide range of use cases.

www.whiteklay.com

Editor's Notes

Recommender System Basics
Recommender systems (RS) help match users with items.
In this presentation we discuss only personalised recommendations, as opposed to group-based or cluster-based ones: 1. Collaborative Filtering 2. Content-Based Recommendations 3. Knowledge-Based Recommendations 4. Hybrid
Personalised recommendations involve a user, from whom we can derive profile and user data, and a recommender system that outputs a list of items, each with a score stating the user's affinity for that item.
In collaborative filtering the result is the same kind of list, stating the affinity of a user for each item. The input additionally includes community data, from which we can derive similar usage behaviours and make recommendations based on them.
As shown on the slide, the input is a matrix of ratings given by different users, and the output is a predicted rating for a new item. If we treat each user's ratings as coordinates in Euclidean space, we can calculate the “distance” between any two users. We can then predict the current user's rating from the rating given by the most similar user, or from some combination of the n most similar users.
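The nearest-user idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the user names, item names, and ratings are invented, and the distance is computed over co-rated items only, which is one of several reasonable choices.

```python
import math

# Hypothetical ratings matrix: user -> {item: rating}. All values are made up.
ratings = {
    "alice": {"item1": 5, "item2": 3, "item3": 4},
    "bob":   {"item1": 5, "item2": 3, "item3": 4, "item4": 4},
    "carol": {"item1": 1, "item2": 5, "item3": 2, "item4": 1},
}

def euclidean_distance(a, b):
    """Euclidean distance between two users over the items both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return math.inf  # no overlap: treat the users as infinitely far apart
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in shared))

def predict_from_nearest(user, item, ratings):
    """Return the rating given to `item` by the closest user who has rated it."""
    candidates = [
        (euclidean_distance(ratings[user], other), name)
        for name, other in ratings.items()
        if name != user and item in other
    ]
    if not candidates:
        return None
    _, nearest = min(candidates)
    return ratings[nearest][item]

prediction = predict_from_nearest("alice", "item4", ratings)
```

Here alice and bob agree on every co-rated item, so bob is the nearest user and his rating for item4 becomes the prediction. Combining the n most similar users would replace the `min` step with a weighted average.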
In collaborative filtering we are agnostic to what the items are and make recommendations based solely on the behaviour of other users.
In a content-based approach we consider the attributes of the items as well. Instead of calculating the distance between two users, we directly calculate the distance between a user and the different items.
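A minimal sketch of the user-to-item distance, assuming the user profile and the items are described in the same attribute space. The attribute axes, item names, and numbers below are hypothetical.

```python
import math

# Hypothetical item attributes on made-up axes: (action, romance, comedy).
items = {
    "film_a": (0.9, 0.1, 0.2),
    "film_b": (0.1, 0.8, 0.6),
    "film_c": (0.6, 0.3, 0.1),
}

# Hypothetical user profile expressed on the same axes.
user_profile = (0.8, 0.1, 0.1)

# Rank items from closest to farthest from the user profile.
ranked = sorted(items, key=lambda name: math.dist(user_profile, items[name]))
```

The closest items come first in `ranked`, so the head of the list is what would be recommended. Note that no other user's behaviour is involved.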
In a knowledge-based approach we also supply some additional data along with the item and user features. For example, a sweater might be the closest item by distance to a particular user, but if we have the additional knowledge that it is summer, it makes sense not to recommend a sweater to that user.
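The sweater example could be expressed as a rule applied on top of a distance-based ranking. The candidate list, the `seasons` field, and the function name are assumptions made for illustration only.

```python
# Hypothetical candidates, already ordered by distance to the user (closest first).
candidates = [
    {"name": "sweater", "seasons": {"winter", "autumn"}},
    {"name": "t-shirt", "seasons": {"summer", "spring"}},
    {"name": "sunglasses", "seasons": {"summer"}},
]

def apply_season_rule(candidates, current_season):
    """Keep only the candidates that suit the current season, preserving order."""
    return [c["name"] for c in candidates if current_season in c["seasons"]]

recommended = apply_season_rule(candidates, "summer")
```

The sweater is the closest item but is dropped because the extra knowledge (the season) rules it out.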
A hybrid recommender system takes a mixed approach that draws on all the other approaches. It leverages all available inputs, which may or may not include user data, product features, additional information, and even community data.
When using a collaborative filtering approach, we need a way to obtain a rating for every user-item pair. There are three ways to do that:
Explicit rating: the system lets users assign ratings to items directly, such as a number of stars or likes and dislikes (e.g. YouTube, Uber).
Implicit rating: the system derives ratings from user behaviour, such as the number of visits to a page, the number of recharges, or the number of purchases of an item (e.g. Amazon, YouTube).
Hybrid: a combination of explicit and implicit ratings.
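As a rough illustration of an implicit rating, a behaviour count can be mapped onto a rating scale. The capping rule, the 1 to 5 scale, and the sample data below are assumptions chosen for simplicity; the slides do not prescribe a particular mapping.

```python
def implicit_rating(purchase_count, max_rating=5):
    """Map a purchase count to a score in 1..max_rating; no purchases means no rating."""
    if purchase_count <= 0:
        return None
    return min(max_rating, purchase_count)

# Hypothetical purchase counts per (user, item) pair.
purchases = {("u1", "phone_case"): 1, ("u1", "charger"): 7, ("u2", "charger"): 0}

derived = {pair: implicit_rating(count) for pair, count in purchases.items()}
```

Pairs with no observed behaviour stay unrated rather than receiving a low score, since absence of a purchase is not necessarily a dislike.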
Once you have a way to calculate ratings, the next step is to decide how to calculate distance, also called similarity. There are many ways to calculate the similarity between two points mathematically; one of the most common is Pearson correlation.
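A minimal sketch of Pearson correlation between two users, computed over the items both have rated. The sample ratings are invented, and returning 0.0 when there is too little overlap or no variance is an assumption of this sketch rather than part of the definition.

```python
import math

def pearson(a, b):
    """Pearson correlation between two users' ratings over co-rated items."""
    shared = sorted(set(a) & set(b))
    n = len(shared)
    if n < 2:
        return 0.0  # assumption: too little overlap counts as "no correlation"
    mean_a = sum(a[i] for i in shared) / n
    mean_b = sum(b[i] for i in shared) / n
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in shared)
    var_a = sum((a[i] - mean_a) ** 2 for i in shared)
    var_b = sum((b[i] - mean_b) ** 2 for i in shared)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a user with constant ratings has undefined correlation
    return cov / math.sqrt(var_a * var_b)

# Hypothetical ratings: bob rates everything one star lower than alice,
# while carol's tastes run roughly opposite to alice's.
alice = {"item1": 5, "item2": 3, "item3": 4}
bob = {"item1": 4, "item2": 2, "item3": 3}
carol = {"item1": 1, "item2": 5, "item3": 2}
```

Because each user's mean is subtracted first, alice and bob come out as perfectly correlated even though bob rates consistently lower, which a plain Euclidean distance would not capture.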
In collaborative filtering, instead of assigning items to users, you can just as well reverse the whole process and use an item-based approach.
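One way to picture the reversal is to transpose the ratings matrix so that each item is described by the ratings it received. The data and the helper name `transpose` are hypothetical.

```python
from collections import defaultdict

# Hypothetical ratings matrix: user -> {item: rating}.
ratings = {
    "alice": {"item1": 5, "item2": 3},
    "bob":   {"item1": 4, "item3": 2},
}

def transpose(ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = defaultdict(dict)
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            by_item[item][user] = rating
    return dict(by_item)

item_view = transpose(ratings)
```

Any distance or similarity measure used to compare two users can then be applied unchanged to two entries of `item_view` to compare two items.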
In a content-based approach you have to find a way to calculate item attributes. We look at TF-IDF as an example. TF: term frequency, measures how often a term appears in a document (its density). IDF: inverse document frequency, aims to reduce the weight of terms that appear in all documents.
Given a keyword i and a document j:
TF(i, j): term frequency of keyword i in document j
IDF(i): inverse document frequency, calculated as IDF(i) = log(N / n(i))
N: number of all recommendable documents
n(i): number of documents from N in which keyword i appears
TF-IDF is calculated as: TF-IDF(i, j) = TF(i, j) * IDF(i)
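The formulas above can be sketched directly in Python. The three tiny documents are invented, and normalising the term count by document length is an assumption of this sketch; the notes do not say which TF variant is intended.

```python
import math
from collections import Counter

# Hypothetical corpus: document id -> list of tokens.
documents = {
    "doc1": "spark streaming with kafka".split(),
    "doc2": "kafka connect basics".split(),
    "doc3": "spark machine learning".split(),
}

def tf(term, tokens):
    """Term frequency: count of `term` divided by document length (assumed variant)."""
    return Counter(tokens)[term] / len(tokens)

def idf(term, documents):
    """IDF(i) = log(N / n(i)); returns 0.0 if the term appears in no document."""
    n_i = sum(1 for tokens in documents.values() if term in tokens)
    if n_i == 0:
        return 0.0
    return math.log(len(documents) / n_i)

def tf_idf(term, doc_id, documents):
    """TF-IDF(i, j) = TF(i, j) * IDF(i)."""
    return tf(term, documents[doc_id]) * idf(term, documents)

kafka_weight = tf_idf("kafka", "doc1", documents)
streaming_weight = tf_idf("streaming", "doc1", documents)
```

In this corpus "streaming" appears in only one document while "kafka" appears in two, so "streaming" receives the higher weight in doc1 and is the more distinctive attribute for that document.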