Recommender Systems
Twenty years of research
Lior Rokach
Dept. of Software and Information Systems Eng.,
Ben-Gurion University of the Negev
Recommender Systems
• A recommender system (RS) helps users who lack the expertise or time to evaluate the potentially overwhelming number of alternatives offered by a website.
– In their simplest form, RSs present each user with a personalized, ranked list of items.
The Impact of RecSys
• 35% of the purchases on Amazon are the result of their
recommender system, according to McKinsey.
• During the Chinese global shopping festival of November 11, 2016, Alibaba increased its conversion rate by up to 20% using personalized landing pages, according to Alizila.
• Recommendations are responsible for 70% of the time
people spend watching videos on YouTube.
• 75% of what people are watching on Netflix comes
from recommendations, according to McKinsey
https://tryolabs.com/blog/introduction-to-recommender-systems/
The Rise of the Recommender System
[Figure: number of papers in Microsoft Academic per year, 1990 to 2018, rising from a handful per year in the early 1990s to about 3,300 in 2018 (the 2018 value is estimated).]
Recommendation Models
Model commonness: number of systems, out of the 16 surveyed, that use each model.
Surveyed systems: Jinni, Taste Kid, Nanocrowd, Clerkdogs, Criticker, IMDb, Flixster, Movielens, Netflix, Shazam, Pandora, LastFM, YooChoose, Think Analytics, Itunes, Amazon.
• Collaborative Filtering: 12
• Content-Based Techniques: 11
• Knowledge-Based Techniques: 7
• Stereotype-Based Recommender Systems: 7
• Ontologies and Semantic Web Technologies for Recommender Systems: 3
• Community Based Recommender Systems: 7
• Demographic Based Recommender Systems: 1
• Context Aware Recommender Systems: 6
• Conversational/Critiquing Recommender Systems: 2
• Hybrid Techniques: 5
Collaborative Filtering: Overview
The Idea
• Try to predict the opinion the user will have of the different items, and recommend the “best” items to each user, based on the user’s previous likings and the opinions of other like-minded (“similar”) users.
[Figure: grid of positive and negative ratings with one unknown rating marked “?”.]
24.04.2022
Collaborative Filtering: Various Tasks
• Input:
– Rating data
– Event data
– Explicit feedback (rating, like/dislike) vs. implicit feedback (viewed item page, time spent on page)
• Goal:
– Rating prediction
– Purchase prediction
– Top-n recommendation
– Etc.
Collaborative Filtering: Rating Matrix
• The ratings that users give to items are represented in a user-item matrix.
[Figure: Example of Rating Matrix]
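As a minimal sketch of the rating-matrix representation, the snippet below builds a small matrix from (user, item, rating) triples. The names, ratings, and the choice of rows for users and columns for items are assumptions made for illustration; NaN stands for a rating that has not been given.

```python
import numpy as np

# Hypothetical (user, item, rating) triples; names and values are illustrative only.
ratings = [
    ("Jane", "Item 1", 5), ("Jane", "Item 2", 4),
    ("Tim", "Item 2", 5), ("Tim", "Item 3", 1),
    ("Tom", "Item 1", 4), ("Tom", "Item 3", 2),
]

users = sorted({u for u, _, _ in ratings})
items = sorted({i for _, i, _ in ratings})
user_index = {u: k for k, u in enumerate(users)}
item_index = {i: k for k, i in enumerate(items)}

# Assumed layout: one row per user, one column per item; NaN marks an unknown rating.
R = np.full((len(users), len(items)), np.nan)
for user, item, value in ratings:
    R[user_index[user], item_index[item]] = value

known = ~np.isnan(R)
sparsity = 1.0 - known.sum() / R.size  # fraction of missing entries
```

The `sparsity` value gives a quick sense of how many entries a collaborative filtering method would have to predict.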
Collaborative Filtering: Rating Prediction Task
Given a set of users U who have rated some set of items M, predict, for each rating not yet present, the rating $r_{ij}$ that user $u_i$ will give item $m_j$.
Collaborative Filtering: Popular Techniques
• Nearest Neighbor
• Matrix Factorization
• Deep Learning
Collaborative Filtering
Approach 1: Nearest Neighbors
“People who liked this also liked…”
User-to-User
• Recommendations are made by finding users with similar tastes. Jane and Tim both liked Item 2 and disliked Item 3; it seems they might have similar taste, which suggests that in general Jane agrees with Tim. This makes Item 1 a good recommendation for Tim.
• This approach does not scale well to millions of users.
Item-to-Item
• Recommendations are made by finding items that have similar appeal to many users. Tom and Sandra are two users who liked both Item 1 and Item 4. That suggests that, in general, people who liked Item 4 will also like Item 1, so Item 1 will be recommended to Tim.
• This approach is scalable to millions of users and millions of items.
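The two neighborhood styles can be sketched on the Jane/Tim/Tom/Sandra example. The data below follows the text where it is explicit; Jane's liking of Item 1 and Tim's liking of Item 4 are assumptions added so the example is complete, and the similarity rules (agreement counts, co-like counts) are deliberately simplistic stand-ins rather than the measures any real system uses.

```python
# 1 = liked, 0 = disliked; a missing key means the user has not rated the item.
# Jane's rating of Item 1 and Tim's rating of Item 4 are assumptions added for
# this sketch.
ratings = {
    "Jane":   {"Item 1": 1, "Item 2": 1, "Item 3": 0},
    "Tim":    {"Item 2": 1, "Item 3": 0, "Item 4": 1},
    "Tom":    {"Item 1": 1, "Item 4": 1},
    "Sandra": {"Item 1": 1, "Item 4": 1},
}

def user_agreement(a, b):
    """Agreements minus disagreements on the items both users rated."""
    shared = ratings[a].keys() & ratings[b].keys()
    return sum(1 if ratings[a][i] == ratings[b][i] else -1 for i in shared)

def user_to_user(target):
    """Recommend unseen items liked by the most similar user."""
    neighbor = max((u for u in ratings if u != target),
                   key=lambda u: user_agreement(target, u))
    return sorted(i for i, r in ratings[neighbor].items()
                  if r == 1 and i not in ratings[target])

def co_likes(item_a, item_b):
    """Number of users who liked both items."""
    return sum(1 for r in ratings.values()
               if r.get(item_a) == 1 and r.get(item_b) == 1)

def item_to_item(target):
    """Recommend the unseen item most often co-liked with the target's liked items."""
    liked = [i for i, r in ratings[target].items() if r == 1]
    all_items = {i for r in ratings.values() for i in r}
    candidates = all_items - ratings[target].keys()
    return max(candidates, key=lambda c: sum(co_likes(c, i) for i in liked))
```

Under these assumptions both `user_to_user("Tim")` and `item_to_item("Tim")` point at Item 1. The item-to-item scores depend only on item pairs, so they can be precomputed offline, which is one reason that variant scales better.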
Nearest Neighbor Technique: Popular Methods
• Using predefined similarity measures (such as Pearson correlation or Hamming distance)
• Learning the similarity weights via optimization
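A predefined similarity measure such as Pearson correlation can be sketched as follows. The user names, item ids, and ratings are hypothetical, and restricting the computation to co-rated items (returning 0.0 when fewer than two are shared) is one common convention assumed here.

```python
from math import sqrt

def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    shared = sorted(ratings_a.keys() & ratings_b.keys())
    if len(shared) < 2:
        return 0.0
    a = [ratings_a[i] for i in shared]
    b = [ratings_b[i] for i in shared]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = (sqrt(sum((x - mean_a) ** 2 for x in a))
            * sqrt(sum((y - mean_b) ** 2 for y in b)))
    return cov / norm if norm else 0.0

# Hypothetical star ratings on a 1-5 scale.
alice = {"m1": 5, "m2": 3, "m3": 1}
bob = {"m1": 4, "m2": 2, "m3": 1, "m4": 5}
carol = {"m1": 1, "m2": 3, "m3": 5}
```

Here `pearson(alice, bob)` is close to 1 (similar tastes) while `pearson(alice, carol)` is -1 (opposite tastes).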
Nearest Neighbor: Using a Predefined Similarity Measure
[Figure: like/dislike matrix (0 = dislike, 1 = like, ? = unknown) of the current user and the other users, from the 1st to the 14th item rating; each user model is the user's interaction history. Hamming distances shown: 5, 6, 6, 5, 4, 8.]
• Unknown rating: this user did not rate the item. We will try to predict a rating according to the user's neighbors.
• Other users: there are other users who rated the same item. We are interested in the nearest neighbors.
• Nearest neighbors: we are looking for the nearest neighbor, the one with the lowest Hamming distance.
• Prediction: the prediction is made based on the nearest neighbor.
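The four steps above can be sketched in a few lines. The vectors are illustrative (not the slide's matrix), and counting only positions where both ratings are known is an assumption about how unknown entries are handled.

```python
def hamming(a, b):
    """Number of positions where both ratings are known and differ."""
    return sum(1 for x, y in zip(a, b)
               if x is not None and y is not None and x != y)

def predict(current, others, item):
    """Copy the rating of the nearest neighbor that has rated `item`."""
    raters = [u for u in others if u[item] is not None]
    nearest = min(raters, key=lambda u: hamming(current, u))
    return nearest[item]

# 1 = like, 0 = dislike, None = unknown (illustrative data).
current = [1, None, 0, 1, 1, 0]
others = [
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 1, 1],
]

distances = [hamming(current, u) for u in others]
prediction = predict(current, others, item=1)
```

A natural extension is to take a majority vote over the k nearest neighbors instead of copying a single neighbor's rating.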
Nearest Neighbor: Using Optimization
A basic model:
$$\min \sum_{(u,i)} \left( r_{ui} - \hat{r}_{ui} \right)^2$$
Collaborative Filtering
Approach 2: Matrix factorization
Factorization
• In the recommender systems field, SVD models users and items as vectors of latent features whose dot product produces the rating of the user for the item.
• With SVD, a matrix is factored into a series of linear approximations that expose the underlying structure of the matrix.
• The goal is to uncover latent features that explain the observed ratings.
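A small numerical sketch of the factorization idea, using NumPy's `numpy.linalg.svd` on an illustrative, fully observed matrix. Plain SVD needs every entry to be present, so real rating data with missing values requires the optimization-based variants; the matrix values and the choice of k = 2 are assumptions.

```python
import numpy as np

# Illustrative, fully observed rating matrix (rows = users, columns = items).
X = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

U, s, Vt = np.linalg.svd(X, full_matrices=False)  # X = U @ diag(s) @ Vt

k = 2  # number of latent factors to keep
user_factors = U[:, :k] * s[:k]   # each row is a user's latent vector
item_factors = Vt[:k, :].T        # each row is an item's latent vector

# The predicted rating is the dot product of a user vector and an item vector.
X_hat = user_factors @ item_factors.T
```

Keeping only the largest singular values gives the best rank-k approximation in the least-squares sense, which is why a few latent factors can summarize most of the rating structure.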
The Netflix Prize
• Started in Oct. 2006
• $1,000,000 Grand Prize
• Training dataset: 100 million ratings (1, 2, 3, 4 or 5 stars) from 480K customers on 18K movies.
• Qualifying set (2,817,131 ratings) consisting of:
– Test set (1,408,789 ratings), used to determine winners
– Quiz set (1,408,342 ratings), used to calculate leaderboard scores
• Goal:
– Improve on Netflix's existing algorithm by at least 10%
– Reduce the RMSE from 0.9525 to below 0.8572
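The target follows directly from the baseline: a 10% reduction of 0.9525 is 0.9525 × 0.9 = 0.85725. The sketch below shows that arithmetic together with a plain RMSE function; the example ratings are made up and are not Netflix data.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error between two equally long rating lists."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

baseline_rmse = 0.9525                     # RMSE of the existing Netflix algorithm
target_rmse = baseline_rmse * (1 - 0.10)   # a 10% improvement

# Illustrative ratings and predictions, not Netflix data.
example = rmse([4, 3, 5, 1], [3.5, 3.0, 4.0, 2.0])
```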
20 min
later
The Prize Goes To …
• Once a team succeeded in improving the RMSE by 10%, the jury issued a last call, giving all teams 30 days to send their submissions.
• On July 25, 2009 the team “The Ensemble” achieved a 10.09% improvement.
• After some dispute …
Lessons Learned from the Netflix Prize
• Competition is an excellent way for companies to:
– Outsource their challenges
– Get PR
– Hire top talent
• SVD has become the method of choice in CF.
• Ensembling is crucial for winning.
• Regularization is important for alleviating over-fitting.
• When abundant training data is available, content features (e.g. genre and actors) were found to be useless.
• Methods that were developed during competitions are not always useful for real systems.
Latent Factor Models: Example
[Figure: SVD process. SVD maps users and ratings to latent concepts or factors, revealing hidden connections and their strength (the hidden concept).]
[Figure: Recommendation. From a user's ratings and the latent concepts, SVD reveals a movie this user might like.]
[Figure: Latent Factor Models, concept space.]
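Popular Factorization
• SVD
$$X_{m \times n} \approx U_{m \times d} \, \Sigma_{d \times d} \, V_{n \times d}^{T}, \qquad d = \min(m, n)$$
$\Sigma$ is a diagonal matrix whose singular values indicate the importance of each factor.
• Low Rank Factorization
$$X_{m \times n} \approx U_{m \times d} \, V_{n \times d}^{T}$$
• Code-Book
$$X_{m \times n} \approx U_{m \times d} \, B_{d \times l} \, V_{n \times l}^{T}$$
The slide marks a permutation matrix in the code-book factorization.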
Estimate latent factors through optimization
• Decision Variables:
– Matrices U, V
• Goal function:
– Minimize some loss function on available entries in the
training rating matrix
– Most frequently MSE is used:
• Easy to optimize
• A proxy for other predictive performance measures
• Methods:
– e.g. use stochastic gradient descent
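A minimal sketch of estimating U and V with stochastic gradient descent on the squared error of the available entries. The learning rate, the L2 regularization term, the number of epochs, and the toy ratings are assumptions chosen for illustration, not values from the slides.

```python
import numpy as np

def factorize(R, mask, d=2, lr=0.02, reg=0.02, epochs=1000, seed=0):
    """Fit U (users x d) and V (items x d) so that U @ V.T matches R where mask is True."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((R.shape[0], d))
    V = 0.1 * rng.standard_normal((R.shape[1], d))
    observed = np.argwhere(mask)
    for _ in range(epochs):
        rng.shuffle(observed)              # visit the known ratings in random order
        for u, i in observed:
            err = R[u, i] - U[u] @ V[i]    # prediction error on one rating
            u_old = U[u].copy()
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * u_old - reg * V[i])
    return U, V

# Illustrative ratings; 0 marks an unknown entry and is excluded via the mask.
R = np.array([[5, 4, 0, 1],
              [4, 0, 1, 1],
              [1, 1, 5, 0],
              [0, 1, 4, 5]], dtype=float)
mask = R > 0
U, V = factorize(R, mask)
train_mse = np.mean((R - U @ V.T)[mask] ** 2)
```

After fitting, `U @ V.T` also contains values for the unknown entries, which serve as the predicted ratings.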
Three Related Issues
• Sparseness
• Long Tail
– Many items in the Long Tail have only a few ratings
• Cold Start
– System cannot draw any
inferences for users or items
about which it has not yet
gathered sufficient data
Transfer Learning (TL)
[Figure: Traditional machine learning trains a separate learning system for each of the different tasks. In transfer learning, knowledge from the learning system of the source domain is passed to the learning system of the target domain.]
Transfer previously learned “knowledge” to new domains, making it possible to learn a model from very few training examples.
Transfer Learning: Share-Nothing
[Figure: the Games and Music domains, each with Best seller, Trendy and Classic groups.]
Rating Matrix
$$X_{m \times n} \approx U_{m \times d} \, B_{d \times l} \, V_{n \times l}^{T}$$
Codebook Transfer
[Figure: rating matrices (users a to f by items 1 to 7, ratings 1 to 3, “?” for unknown) of the source domain (music) and the target domain (games), shown before and after permutation of their rows and columns, together with the cluster-level code-books (user clusters X, Y, Z by item clusters A, B, C).]
• Assumption: related domains share similar cluster-level rating patterns.
Why does it make sense?
• The rows/columns in the code-book matrix represent the users’/items’ rating distributions.
[Figure: a rating matrix of users a to f over items A to J (ratings 1 to 5), with two histograms of rating frequency over the values 1 to 5.]
• Fewer training instances are required to match users/items to existing patterns than to rediscover these patterns.
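To make the code-book idea concrete, the sketch below reconstructs a rating matrix as U B Vᵀ, where U and V assign each user and item to exactly one cluster. The code-book values and the cluster assignments are illustrative assumptions.

```python
import numpy as np

# Illustrative code-book: rows are user clusters, columns are item clusters.
B = np.array([[3.0, 1.0, 2.0],
              [2.0, 3.0, 3.0],
              [3.0, 2.0, 1.0]])

def one_hot(assignments, n_clusters):
    """Membership matrix with a single 1 per row."""
    M = np.zeros((len(assignments), n_clusters))
    M[np.arange(len(assignments)), assignments] = 1.0
    return M

U = one_hot([0, 2, 1, 0], n_clusters=3)      # 4 users -> user clusters
V = one_hot([1, 1, 0, 2, 0], n_clusters=3)   # 5 items -> item clusters

X_hat = U @ B @ V.T   # every entry is looked up in the code-book
```

Only the cluster assignments have to be learned for a new domain, which is why far fewer ratings are needed than for learning the patterns in B from scratch.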
TALMUD
TrAnsfer Learning from MUltiple Domains
• Extends the codebook transfer concept to support:
• Multiple source domains with varying levels of relevance.
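TALMUD: Problem Definition
1. Objective: minimize the MSE (mean squared error) in the target domain
2. Variables:
• $U_n$, $V_n$: the user and item cluster memberships in each source domain n
• $\alpha_n$: relatedness coefficient between each source domain n and the target domain

$$\min_{\substack{U_n \in \{0,1\}^{p \times k_n} \\ V_n \in \{0,1\}^{q \times l_n} \\ \alpha_n \in \mathbb{R} \;\; \forall n \in N}} \left\| \left( X_{tgt} - \sum_{n=1}^{N} \alpha_n U_n B_n V_n^{T} \right) \circ W \right\|^{2} \quad \text{s.t.} \quad U_n \mathbf{1} = \mathbf{1}, \; V_n \mathbf{1} = \mathbf{1}$$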
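The objective can be evaluated numerically as sketched below. It is assumed here that W is a binary mask of the observed target ratings and that ∘ is the element-wise product; the function returns the sum of squared errors rather than the mean, and all matrices are toy values.

```python
import numpy as np

def talmud_loss(X_tgt, W, alphas, Us, Bs, Vs):
    """Squared error on the observed target entries for a weighted sum of code-books."""
    X_hat = sum(a * (U @ B @ V.T) for a, U, B, V in zip(alphas, Us, Bs, Vs))
    return float(np.sum(((X_tgt - X_hat) * W) ** 2))

# Toy single-source setup (N = 1); every value is illustrative.
B = np.array([[1.0, 3.0], [3.0, 1.0]])
U = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])  # 3 target users
V = np.array([[1.0, 0.0], [0.0, 1.0]])              # 2 target items
X_tgt = U @ B @ V.T
W = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])  # 1 = observed rating

perfect = talmud_loss(X_tgt, W, [1.0], [U], [B], [V])
halved = talmud_loss(X_tgt, W, [0.5], [U], [B], [V])
```

With the correct memberships and α = 1 the loss is zero; shrinking α to 0.5 leaves a residual on every observed entry.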
The TALMUD Algorithm
• Step 1: Create a code-book $B_n$ (clusters) for each source domain.
• Step 2: Learn the target cluster memberships based on all source domains simultaneously.
– 2.1: Find the users’ corresponding clusters:
$$j = \arg\min_{j} \left\| [X_{tgt}]_{i*} - \sum_{n=1}^{N} \alpha_n \left[ B_n \left( V_n^{(t-1)} \right)^{T} \right]_{j*} \right\|^{2}_{W_{i*}}$$
– 2.2: Find the items’ corresponding clusters:
$$j = \arg\min_{j} \left\| [X_{tgt}]_{*i} - \sum_{n=1}^{N} \alpha_n \left[ U_n^{(t)} B_n \right]_{*j} \right\|^{2}_{W_{*i}}$$
– 2.3: Learn the coefficients $\alpha_n$.
• Step 3: Calculate the filled-in target rating matrix:
$$\tilde{X}_{tgt} = W \circ X_{tgt} + (1 - W) \circ \sum_{n=1}^{N} \alpha_n \left( U_n B_n V_n^{T} \right)$$
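Steps 2.1 and 3 can be sketched for the simplified case of a single source domain (N = 1) with a fixed α. This is an illustrative reduction, not the full algorithm: W is assumed to be the binary mask of observed ratings, and the helper names and toy matrices are hypothetical.

```python
import numpy as np

def assign_user_clusters(X_tgt, W, alpha, B, V):
    """Step 2.1 for a single source: pick, per user, the code-book row with the
    smallest squared error on that user's observed ratings."""
    patterns = alpha * (B @ V.T)                      # one candidate row per user cluster
    diffs = X_tgt[:, None, :] - patterns[None, :, :]  # users x clusters x items
    errors = np.sum((diffs ** 2) * W[:, None, :], axis=2)
    best = np.argmin(errors, axis=1)
    U = np.zeros((X_tgt.shape[0], B.shape[0]))
    U[np.arange(X_tgt.shape[0]), best] = 1.0
    return U

def fill_in(X_tgt, W, alpha, U, B, V):
    """Step 3: keep observed ratings, fill the rest from the model."""
    return W * X_tgt + (1 - W) * (alpha * (U @ B @ V.T))

B = np.array([[1.0, 3.0], [3.0, 1.0]])
V = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])     # 3 target items
X_tgt = np.array([[1.0, 0.0, 3.0], [0.0, 1.0, 1.0]])   # 0 = missing
W = (X_tgt > 0).astype(float)

U = assign_user_clusters(X_tgt, W, 1.0, B, V)
X_filled = fill_in(X_tgt, W, 1.0, U, B, V)
```

Step 2.2 is symmetric (columns instead of rows), and the full procedure alternates 2.1 to 2.3 over all sources until the memberships stop changing.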
Forward Selection of Sources
1) Add sources gradually:
• Begin with an empty set of sources
• Examine the addition of each source
• Add the source that improves the model the most
• A wrapper approach is used to decide when to stop
2) Retrain using the entire dataset with the selected sources
[Figure: data split. Step 1 divides the data into training, validation and test parts; step 2 uses training and test parts only.]
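The greedy loop of step 1 can be sketched as below. The stopping rule (stop as soon as no remaining source lowers the validation error) is an assumed reading of the wrapper approach, `validation_error` is a hypothetical callback that would train and evaluate a model, and the error table is made up.

```python
def forward_selection(sources, validation_error):
    """Greedily add the source that lowers validation error the most; stop when
    no addition helps."""
    selected = []
    best_error = validation_error(selected)
    remaining = list(sources)
    while remaining:
        errors = {s: validation_error(selected + [s]) for s in remaining}
        candidate = min(errors, key=errors.get)
        if errors[candidate] >= best_error:
            break  # assumed stopping rule: no improvement on validation data
        selected.append(candidate)
        remaining.remove(candidate)
        best_error = errors[candidate]
    return selected, best_error

# Hypothetical validation errors for each subset of sources (made-up numbers).
toy_errors = {
    frozenset(): 90.0,
    frozenset({"Netflix"}): 60.0,
    frozenset({"Jester"}): 75.0,
    frozenset({"MovieLens"}): 65.0,
    frozenset({"Netflix", "Jester"}): 55.0,
    frozenset({"Netflix", "MovieLens"}): 58.0,
    frozenset({"Netflix", "Jester", "MovieLens"}): 57.0,
}
selected, error = forward_selection(
    ["Netflix", "Jester", "MovieLens"], lambda s: toy_errors[frozenset(s)])
```

In this toy table the third source makes the validation error worse, so the loop stops after two sources.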
Datasets
• Public datasets (source domains)
– Netflix (Movies)
– Jester (Jokes)
– MovieLens (Movies)
• Target domains
– Music loads
– Games loads
– BookCrossing (Books)
Comparison Results
MAE by target domain:
Method    Games    Music    BookCrossing
TALMUD    48.67    74.84    49.56
CBT       53.38    78.10    133.30
RMGM      54.58    78.06    120.50
SVD       61.17    85.21    103.15
CB        88.11    96.16    219.21
Curse of Sources
Too many sources lead to over-fitting.
Not all given source domains should be used.
[Figure: Target Games. MAE versus number of sources (0 to 4), showing the test error and the train error of complete forward selection.]
SVD Implementation
dot product
Deep Implementation
How to win Netflix Prize with a few
lines of code:
from tensorflow.keras.layers import Concatenate, Dense, Embedding, Flatten, Input
from tensorflow.keras.models import Model

movie_count = 17771
user_count = 2649430

# tr and ts are assumed to be NumPy arrays whose rows are
# (movie_id, user_id, rating); they are not defined on the slide.
movie_in = Input(shape=(1,))
user_in = Input(shape=(1,))
movie_vec = Flatten()(Embedding(movie_count, 60)(movie_in))  # 60-dim movie embedding
user_vec = Flatten()(Embedding(user_count, 20)(user_in))     # 20-dim user embedding

# Concatenate both embeddings and pass them through three sigmoid layers.
x = Concatenate()([movie_vec, user_vec])
for _ in range(3):
    x = Dense(64, activation='sigmoid')(x)
rating = Dense(1)(x)

model = Model([movie_in, user_in], rating)
model.compile(loss='mean_squared_error', optimizer='adadelta')
model.fit([tr[:, 0].reshape(-1, 1), tr[:, 1].reshape(-1, 1)], tr[:, 2].reshape(-1, 1),
          batch_size=24000, epochs=42,
          validation_data=([ts[:, 0].reshape(-1, 1), ts[:, 1].reshape(-1, 1)],
                           ts[:, 2].reshape(-1, 1)))
Item2Vec: Item Embedding
• Represent each item with a low-dimensional
vector
• Item similarity = vector similarity
• Learned from users’ sessions.
• Inspired by Word2Vec
– Words = Items
– Sentences = Users’ Sessions
Continuous Bag of Items
• E.g. given a user’s session of (I1, I2, I3, I4, I5)
• Window size = 2
[Figure: the context items I1, I2, I4 and I5 are used to predict the middle item I3.]
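A short sketch of how a session is turned into (context, target) training pairs with a window of 2. The function name is hypothetical, and truncating the window at the session boundaries is an assumption.

```python
def cbow_pairs(session, window=2):
    """Return (context_items, target_item) for every position in the session."""
    pairs = []
    for pos, target in enumerate(session):
        left = session[max(0, pos - window):pos]
        right = session[pos + 1:pos + 1 + window]
        pairs.append((left + right, target))
    return pairs

pairs = cbow_pairs(["I1", "I2", "I3", "I4", "I5"], window=2)
```

For the middle item, `pairs[2]` holds the context `["I1", "I2", "I4", "I5"]` and the target `"I3"`, matching the example above.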
[Figure: CBOW network. Input layer: the one-hot V-dim vectors $x_{I2}$ and $x_{I4}$ of the context items; hidden layer: N-dim; output layer: V-dim, predicting I3.]
• V is the size of the product catalog
• N is the size of the embedding vector
• We must learn $W_{V \times N}$ and $W'_{N \times V}$
Input to hidden layer: multiplying by a one-hot vector selects the item's embedding, e.g. a 1 in the second position selects the second column (2.4, 2.6, …, 1.8) of $W^{T}$:
$$W_{V \times N}^{T} \, x_{I1} = v_{I1}$$
Hidden layer: the context embeddings are averaged:
$$v = \frac{v_{I2} + v_{I4}}{2}$$
Output layer:
$$z = {W'}_{N \times V}^{T} \, v, \qquad \hat{y} = \mathrm{softmax}(z)$$
We would prefer $\hat{y}$ close to $y_{I3}$.
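The forward pass described above can be sketched with random weights. The catalog size, embedding size, and item ids are toy assumptions, and the cross-entropy loss at the end is the usual way to express "ŷ close to the true item" but is not stated on the slides.

```python
import numpy as np

V_size, N = 8, 3                            # catalog size and embedding size (toy values)
rng = np.random.default_rng(0)
W = rng.standard_normal((V_size, N))        # input embeddings, one row per item
W_out = rng.standard_normal((N, V_size))    # output weights W'

def one_hot(index, size):
    x = np.zeros(size)
    x[index] = 1.0
    return x

def cbow_forward(context_ids):
    """Average the context embeddings and return a probability for every item."""
    vectors = [W.T @ one_hot(i, V_size) for i in context_ids]  # selects rows of W
    v = np.mean(vectors, axis=0)
    z = W_out.T @ v
    exp_z = np.exp(z - z.max())             # subtract the max for numerical stability
    return exp_z / exp_z.sum()

y_hat = cbow_forward([1, 3])                # context items with ids 1 and 3
target = one_hot(2, V_size)
loss = -np.log(y_hat @ target)              # assumed cross-entropy against the true item
```

Training adjusts `W` and `W_out` to reduce this loss over all (context, target) pairs; the rows of `W` are then used as the item embeddings.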
Some interesting results
Given that the algorithm was not exposed to item titles or descriptions:
• Similarity: the most similar items to Samsung Galaxy S7 G930V are:
– Samsung Galaxy S7 G930A
– Samsung Galaxy S7 Edge
• Item analogy: Apple iPhone 5C − Apple iPhone 4s + Samsung Galaxy S5 Edge = Samsung Galaxy S6 Edge
Why Are Analogy Relations Preserved?
Other items in the session:
Target item    Prepaid Micro Sim    Prepaid Nano Sim    Samsung Charger Cable    Apple Earpods
+ iPhone 5     0                    1                   0                        1
− iPhone 4     1                    0                   0                        1
+ Galaxy S5    1                    0                   1                        0
= Galaxy S6    0                    1                   1                        0
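The arithmetic in the table can be checked directly: adding and subtracting the session co-occurrence rows gives exactly the Galaxy S6 row. The sketch uses the table's values; the exact-match lookup is a simplification of the nearest-vector search used with real embeddings.

```python
# Co-occurrence vectors from the table: (Prepaid Micro Sim, Prepaid Nano Sim,
# Samsung Charger Cable, Apple Earpods).
vectors = {
    "iPhone 5":  (0, 1, 0, 1),
    "iPhone 4":  (1, 0, 0, 1),
    "Galaxy S5": (1, 0, 1, 0),
    "Galaxy S6": (0, 1, 1, 0),
}

def analogy(plus_a, minus_b, plus_c):
    """Return the item whose vector equals vec(a) - vec(b) + vec(c), if any."""
    query = tuple(a - b + c for a, b, c in
                  zip(vectors[plus_a], vectors[minus_b], vectors[plus_c]))
    matches = [name for name, vec in vectors.items() if vec == query]
    return matches[0] if matches else None

result = analogy("iPhone 5", "iPhone 4", "Galaxy S5")
```

Subtracting iPhone 4 from iPhone 5 swaps the Micro Sim for the Nano Sim while cancelling the Apple accessory, so adding Galaxy S5 lands on the Samsung device that shares sessions with the Nano Sim.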
Beyond Accuracy:
Future Trends in RecSys
• Diversity & Serendipity
• Incorporating price in RecSys models
• Explainable RecSys
• Counteract the effect of the existing RecSys and isolate the
organic browsing of the users
• Knowledge-based RecSys
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 

Rokach-GomaxSlides (1).pptx

  • 1. Recommender Systems Twenty years of research Lior Rokach Dept. of Software and Information Systems Eng., Ben-Gurion University of the Negev
  • 2. Recommender Systems • A recommender system (RS) helps users who lack the competence or time to evaluate the potentially overwhelming number of alternatives offered by a web site. – In their simplest form, RSs recommend to their users personalized and ranked lists of items
  • 3. The Impact of RecSys • 35% of the purchases on Amazon are the result of their recommender system, according to McKinsey. • During the Chinese global shopping festival of November 11, 2016, Alibaba achieved growth of up to 20% of their conversion rate using personalized landing pages, according to Alizila. • Recommendations are responsible for 70% of the time people spend watching videos on YouTube. • 75% of what people are watching on Netflix comes from recommendations, according to McKinsey https://tryolabs.com/blog/introduction-to-recommender-systems/
  • 4. The Rise of the Recommender System [Bar chart: number of papers in Microsoft Academic per year, 1990 to 2018, growing from single digits in the early 1990s to more than 3,000 per year at the end of the period; the 2018 value is estimated.]
  • 5. Recommendation Models Model Commonness Used By: Jinni Taste Kid Nanocrowd Clerkdogs Criticker IMDb Flixster Movielens Netflix Shazam Pandora LastFM YooChoose Think Analytics Itunes Amazon Collaborative Filtering v v v v v v v v v v v v Content-Based Techniques v v v v v v v v v v v Knowledge-Based Techniques v v v v v v v Stereotype-Based Recommender Systems v v v v v v v Ontologies and Semantic Web Technologies for Recommender Systems v v v Community Based Recommender Systems v v v v v v v Demographic Based Recommender Systems v Context Aware Recommender Systems v v v v v v Conversational/Critiquing Recommender Systems v v Hybrid Techniques v v v v v
  • 6. Collaborative Filtering Overview: The Idea. Trying to predict the opinion the user will have on the different items and be able to recommend the “best” items to each user based on the user’s previous likings and the opinions of other like-minded (“similar”) users. [Figure legend: positive rating, negative rating, ? = unknown rating]
  • 7. 24.04.2022 Collaborative Filtering: Various Tasks
    – Input: Rating Data; Event Data; Explicit Feedback (Rating, Like/Dislike) vs. Implicit Feedback (viewed item page, time spent on page)
    – Goal: Rating Prediction; Purchase Prediction; Top-n Recommendation; etc.
  • 8. Collaborative Filtering: Rating Matrix. The ratings that users give to items are represented in a matrix. [Figure: example of a rating matrix]
  • 9. Collaborative Filtering: Rating Prediction Task. Given a set of users U that have rated some set of items M, for each rating not yet present, predict the rating $r_{ij}$ that user $u_i$ will give item $m_j$.
  • 10. Collaborative Filtering: Popular Techniques • Nearest Neighbor • Matrix Factorization • Deep Learning
  • 11. Collaborative Filtering Approach 1: Nearest Neighbors (“People who liked this also liked…”)
    – User-to-User: Recommendations are made by finding users with similar tastes. Jane and Tim both liked Item 2 and disliked Item 3; it seems they might have similar taste, which suggests that in general Jane agrees with Tim. This makes Item 1 a good recommendation for Tim. This approach does not scale well for millions of users.
    – Item-to-Item: Recommendations are made by finding items that have similar appeal to many users. Tom and Sandra are two users who liked both Item 1 and Item 4. That suggests that, in general, people who liked Item 4 will also like Item 1, so Item 1 will be recommended to Tim. This approach is scalable to millions of users and millions of items.
  • 12. Nearest Neighbor Technique: Popular Methods
    – Using predefined similarity measures (such as Pearson or Hamming distance)
    – Learning the similarity relation weights via optimization
  • 13. Nearest Neighbor Using a Predefined Similarity Measure: Hamming distance. User model = interaction history, one entry per item (1st item rate … 14th item rate), with 0 = dislike, 1 = like, ? = unknown. Current user: 1 ? 0 1 1 0 1 1 0 1 1 1 1 0. [Figure: the current user's row next to the rows of the other users, whose Hamming distances are 5, 6, 6, 5, 4, 8]
    – Unknown Rating: This user did not rate the item. We will try to predict a rating according to his neighbors.
    – Other Users: There are other users who rated the same item. We are interested in the nearest neighbors.
    – Nearest Neighbors: We are looking for the nearest neighbor, the one with the lowest Hamming distance.
    – Prediction: The prediction was made based on the nearest neighbor.
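The nearest-neighbor prediction on the Hamming-distance slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the toy rating vectors, the function names `hamming` and `predict`, and the choice to skip positions where either user has an unknown rating are all assumptions made for the example.

```python
from typing import List, Optional

Rating = Optional[int]  # 1 = like, 0 = dislike, None = unknown


def hamming(a: List[Rating], b: List[Rating]) -> int:
    """Count items that both users rated and on which they disagree."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def predict(current: List[Rating], others: List[List[Rating]], item: int) -> Rating:
    """Copy the rating of the nearest neighbor among users who rated `item`."""
    candidates = [u for u in others if u[item] is not None]
    if not candidates:
        return None  # nobody rated the item, so no prediction is possible
    nearest = min(candidates, key=lambda u: hamming(current, u))
    return nearest[item]


# Hypothetical toy data: the current user has not rated item 1.
current = [1, None, 0, 1, 1, 0]
others = [
    [1, 1, 0, 1, 1, 0],  # agrees everywhere: distance 0
    [0, 0, 1, 0, 1, 0],  # disagrees on items 0, 2, 3: distance 3
    [1, 0, 0, 0, 0, 0],  # disagrees on items 3, 4: distance 2
]
prediction = predict(current, others, item=1)
```

With a single nearest neighbor the prediction is simply that neighbor's rating; a k-nearest variant would vote or average over several neighbors instead.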
  • 14. Nearest Neighbor Using Optimization: A basic model. $\min \sum_{(u,i)} \left( r_{ui} - \hat{r}_{ui} \right)^2$
  • 15. Collaborative Filtering Approach 2: Matrix Factorization
    – In the recommender systems field, SVD models users and items as vectors of latent features whose dot product produces the rating of the item by the user.
    – With SVD, a matrix is factored into a series of linear approximations that expose the underlying structure of the matrix.
    – The goal is to uncover latent features that explain observed ratings.
  • 16. The Netflix Prize
    – Started in October 2006
    – $1,000,000 Grand Prize
    – Training dataset: 100 million ratings (1 to 5 stars) from 480K customers on 18K movies.
    – Qualifying set (2,817,131 ratings) consisting of:
        • Test set (1,408,789 ratings), used to determine winners
        • Quiz set (1,408,342 ratings), used to calculate leaderboard scores
    – Goal: improve Netflix's existing algorithm by at least 10%, i.e. reduce the RMSE from 0.9525 to below 0.8572
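To make the goal concrete, the sketch below computes RMSE for a list of predictions and derives the 10% target from the baseline quoted on the slide. The function name `rmse` and the tiny example ratings are illustrative assumptions; only the 0.9525 baseline and the 10% threshold come from the slide.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between true and predicted ratings."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


baseline_rmse = 0.9525                       # RMSE of the existing algorithm (from the slide)
target_rmse = baseline_rmse * (1 - 0.10)     # a 10% improvement: 0.85725

# Hypothetical ratings, only to show the call.
example = rmse([5, 3, 4, 1], [4, 3, 5, 1])
```

The computed target, 0.85725, is the value that the slide rounds down to "RMSE < 0.8572".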
  • 19. The Prize Goes To …
    – Once a team succeeded in improving the RMSE by 10%, the jury issued a last call, giving all teams 30 days to send their submissions.
    – On July 25, 2009 the team “The Ensemble” achieved a 10.09% improvement.
    – After some dispute …
  • 20. Lessons Learned from the Netflix Prize
    – Competition is an excellent way for companies to: outsource their challenges, get PR, and hire top talent.
    – SVD has become the method of choice in CF.
    – Ensembles are crucial for winning.
    – Regularization is important for alleviating over-fitting.
    – When abundant training data is available, content features (e.g. genre and actors) were found to be useless.
    – Methods that were developed during competitions are not always useful for real systems.
  • 21. Latent Factor Models Example: the SVD process maps users and ratings to latent concepts (factors). SVD reveals hidden connections and their strength (the hidden concepts). [Figure: user ratings decomposed by SVD into latent concepts]
  • 22. Latent Factor Models Example: Recommendation. SVD revealed a movie this user might like! [Figure: users and ratings linked through latent concepts or factors]
  • 23. Latent Factor Models: Concept space
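  • 24. Popular Factorization
    • SVD: $X_{m \times n} \approx U_{m \times d} \cdot \Sigma_{d \times d} \cdot V_{n \times d}^T$, with $d = \min(m, n)$; $\Sigma$ is a diagonal matrix whose singular values indicate the factor importance
    • Low Rank Factorization: $X_{m \times n} \approx U_{m \times d} \cdot V_{n \times d}^T$
    • Code-Book: $X_{m \times n} \approx U_{m \times d} \cdot B_{d \times l} \cdot V_{n \times l}^T$ (slide annotation: permutation matrix)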
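The SVD and low-rank forms on this slide can be checked numerically with NumPy. The sketch below is only an illustration under simplifying assumptions: the 4×4 matrix is invented, every entry is treated as observed (a real rating matrix is mostly missing), and `d = 2` is an arbitrary choice of rank.

```python
import numpy as np

# Hypothetical dense toy matrix: rows are users, columns are items.
X = np.array([
    [5.0, 5.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
])

# Thin SVD: X = U @ diag(s) @ Vt, singular values sorted in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_full = U @ np.diag(s) @ Vt

# Low-rank approximation: keep only the d most important factors.
d = 2
X_low = U[:, :d] @ np.diag(s[:d]) @ Vt[:d, :]

# The Frobenius error equals the energy of the discarded singular values.
error = np.linalg.norm(X - X_low)
```

Keeping all factors reproduces `X` exactly, while truncating to `d` factors gives the best rank-`d` approximation in the least-squares sense, which is why the singular values are read as factor importance.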
  • 25. Estimate latent factors through optimization • Decision Variables: – Matrices U, V • Goal function: – Minimize some loss function on available entries in the training rating matrix – Most frequently MSE is used: • Easy to optimize • A proxy to other predictive performance measures • Methods: – e.g. use stochastic gradient descent
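The optimization described above (decision variables U and V, squared error on the available entries, stochastic gradient descent) might look roughly like the following. This is a sketch, not the method of any particular system: the learning rate, the L2 regularization term, the number of epochs, and the toy ratings are all assumed values chosen for illustration.

```python
import numpy as np


def factorize(ratings, n_users, n_items, d=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Learn U (n_users x d) and V (n_items x d) by SGD on observed ratings only."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, d))
    V = 0.1 * rng.standard_normal((n_items, d))
    for _ in range(epochs):
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - U[u] @ V[i]          # prediction error on one known rating
            u_old = U[u].copy()            # use the pre-update user vector for V's step
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * u_old - reg * V[i])
    return U, V


def mse(ratings, U, V):
    """Mean squared error over the given (user, item, rating) triples."""
    return float(np.mean([(r - U[u] @ V[i]) ** 2 for u, i, r in ratings]))


# Hypothetical training triples: (user index, item index, rating).
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0), (1, 2, 1.0),
           (2, 1, 1.0), (2, 2, 5.0), (3, 0, 5.0), (3, 2, 2.0)]

U0, V0 = factorize(ratings, n_users=4, n_items=3, epochs=0)   # untrained factors
U, V = factorize(ratings, n_users=4, n_items=3)
predicted = U[3] @ V[1]   # estimate for a rating that was never observed
```

Only the observed entries contribute to the loss, so missing ratings never pull the factors toward zero; the unobserved cells are filled in afterwards from the dot products.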
  • 26. Three Related Issues • Sparseness • Long Tail – many items in the Long Tail have only few ratings • Cold Start – System cannot draw any inferences for users or items about which it has not yet gathered sufficient data
  • 27. Transfer Learning (TL): Transfer previously learned “knowledge” to new domains, making them capable of learning a model from very few training examples. [Figure: in traditional machine learning, a separate learning system is trained for each of the different tasks; in transfer learning, knowledge from the learning system of the source domain is passed to the learning system of the target domain.]
  • 33. Why does it make sense?
    • The rows/columns in the code-book matrix represent the users’/items’ rating distribution:
          J  I  H  G  F  E  D  C  B  A
          2  2  3  1  1  2  2  1  1  3   a
          3  3  5  4  5  5  5  4  4  2   b
          1  5  2  4  3  4  2  3  5  1   c
          1  4  4  3  2  2  3  2  1  2   d
          1  2  2  3  4  3  3  5  1  3   e
          2  3  2  1  2  1  3  1  5  3   f
      [Figure: two histograms of rating frequency (0 to 0.7) over the rating values 1 to 5]
    • Fewer training instances are required to match users/items to existing patterns than to rediscover these patterns
  • 34. TALMUD: TrAnsfer Learning from MUltiple Domains • Extends the codebook transfer concept to support: • Multiple source domains with varying levels of relevance.
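  • 35. TALMUD: Problem Definition
    1. Objective: minimizing the MSE (mean squared error) in the target domain
    2. Variables:
        • $U_n$, $V_n$: user and item cluster memberships in each source domain $n$
        • $\alpha_n$: relatedness coefficient between source domain $n$ and the target domain
    $$\min_{\substack{U_n \in \{0,1\}^{p \times k_n},\; V_n \in \{0,1\}^{q \times l_n} \\ \alpha_n \in \mathbb{R},\; \forall n \in N}} \left\| \left( X_{tgt} - \sum_{n=1}^{N} \alpha_n U_n B_n V_n^T \right) \circ W \right\|^2 \quad \text{s.t. } U_n \mathbf{1} = \mathbf{1},\; V_n \mathbf{1} = \mathbf{1}$$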
  • 36. The TALMUD Algorithm
    • Step 1: Create a cluster (codebook $B_n$) for each source domain
    • Step 2: Learn the target cluster memberships based on all source domains simultaneously.
        2.1: Find the users’ corresponding clusters: $j = \arg\min_j \left\| [X_{tgt}]_{i*} - \sum_{n=1}^{N} \alpha_n \left[ B_n V_n^{(t-1)T} \right]_{j*} \right\|^2_{W_{i*}}$
        2.2: Find the items’ corresponding clusters: $j = \arg\min_j \left\| [X_{tgt}]_{*i} - \sum_{n=1}^{N} \alpha_n \left[ U_n^{(t)} B_n \right]_{*j} \right\|^2_{W_{*i}}$
        2.3: Learn the coefficients $\alpha_n$
    • Step 3: Calculate the filled-in target rating matrix: $\tilde{X}_{tgt} = W \circ X_{tgt} + (1 - W) \circ \sum_{n=1}^{N} \alpha_n \left( U_n B_n V_n^T \right)$
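Step 3, the fill-in of the target matrix, is the easiest part of the algorithm to illustrate. The sketch below assumes that the memberships, codebooks and coefficients have already been learned in Steps 1 and 2; the function name `fill_in` and all numeric values are hypothetical, and W is taken to be a 0/1 mask marking the observed target ratings.

```python
import numpy as np


def fill_in(X_tgt, W, alphas, Us, Bs, Vs):
    """Keep observed ratings (W == 1); fill the rest from the weighted codebooks."""
    approx = sum(a * (U @ B @ V.T) for a, U, B, V in zip(alphas, Us, Bs, Vs))
    return W * X_tgt + (1 - W) * approx


# Hypothetical target domain with 2 users x 2 items; zeros mark missing ratings.
X_tgt = np.array([[5.0, 0.0],
                  [0.0, 1.0]])
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])

# Two hypothetical source domains, each with a 2x2 codebook and 0/1 memberships.
I2 = np.eye(2)                                # user i -> cluster i, item j -> cluster j
Bs = [np.array([[4.0, 2.0], [3.0, 1.0]]),
      np.array([[2.0, 4.0], [1.0, 3.0]])]
alphas = [0.5, 0.5]                           # assumed relatedness coefficients

X_filled = fill_in(X_tgt, W, alphas, [I2, I2], Bs, [I2, I2])
```

Observed entries pass through unchanged, and each missing entry becomes the relatedness-weighted sum of the codebook values that the user and item clusters point to.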
  • 37. Forward Selection of Sources
    1) Adding sources gradually:
        • Begin with an empty set of sources
        • Examine the addition of each source
        • Add the source that improves the model the most
        • A wrapper approach is used to decide when to stop.
    2) Retrain using the entire dataset with the selected sources
    [Figure: in 1) the data is split into training, validation and test sets; in 2) it is split into training and test sets]
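The greedy loop behind forward selection can be sketched as follows. This is a generic illustration rather than the TALMUD implementation: `evaluate` stands in for training a model with a given set of sources and measuring its validation error, and the error values in the lookup table are made up.

```python
from typing import Callable, List, Sequence, Tuple


def forward_selection(
    sources: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Greedily add the source that lowers validation error most; stop when none does."""
    selected: List[str] = []
    best_error = evaluate(selected)            # start from the empty set of sources
    remaining = list(sources)
    while remaining:
        errors = {s: evaluate(selected + [s]) for s in remaining}
        best = min(errors, key=errors.get)
        if errors[best] >= best_error:         # wrapper-style stopping rule
            break
        selected.append(best)
        remaining.remove(best)
        best_error = errors[best]
    return selected, best_error


# Hypothetical validation errors for each subset of sources (invented numbers).
validation_error = {
    frozenset(): 1.00,
    frozenset({"netflix"}): 0.90,
    frozenset({"jester"}): 0.95,
    frozenset({"movielens"}): 0.92,
    frozenset({"netflix", "jester"}): 0.88,
    frozenset({"netflix", "movielens"}): 0.91,
    frozenset({"netflix", "jester", "movielens"}): 0.93,
}

chosen, error = forward_selection(
    ["netflix", "jester", "movielens"],
    lambda subset: validation_error[frozenset(subset)],
)
```

In this toy table the third source makes the validation error worse, so the loop stops with two sources; the final model would then be retrained on the full data using only those.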
  • 38. Datasets
    • Public datasets (source domains): Netflix (movies), Jester (jokes), MovieLens (movies)
    • Target domains: Music loads, Games loads, BookCrossing (books)
  • 40. Curse of Sources: Too many sources lead to over-fitting. Not all given source domains should be used. [Figure: MAE (axis 0 to 100) against the number of sources (0 to 4) for the target Games, showing the test error and the train error of complete forward selection]
  • 44. How to win Netflix Prize with a few lines of code:
        # Keras 1.x API (Merge layer, nb_epoch argument)
        from keras.models import Sequential
        from keras.layers import Embedding, Merge, Flatten, Dense, Activation

        # tr, ts: training/test arrays with columns (movie id, user id, rating);
        # L, M: their numbers of rows
        movie_count = 17771
        user_count = 2649430

        # One embedding per side: 60 dimensions for movies, 20 for users
        model_left = Sequential()
        model_left.add(Embedding(movie_count, 60, input_length=1))
        model_right = Sequential()
        model_right.add(Embedding(user_count, 20, input_length=1))

        # Concatenate both embeddings and feed them to three sigmoid layers
        model = Sequential()
        model.add(Merge([model_left, model_right], mode='concat'))
        model.add(Flatten())
        model.add(Dense(64))
        model.add(Activation('sigmoid'))
        model.add(Dense(64))
        model.add(Activation('sigmoid'))
        model.add(Dense(64))
        model.add(Activation('sigmoid'))
        model.add(Dense(1))
        model.compile(loss='mean_squared_error', optimizer='adadelta')
        model.fit([tr[:,0].reshape((L,1)), tr[:,1].reshape((L,1))],
                  tr[:,2].reshape((L,1)),
                  batch_size=24000, nb_epoch=42,
                  validation_data=([ts[:,0].reshape((M,1)), ts[:,1].reshape((M,1))],
                                   ts[:,2].reshape((M,1))))
  • 45. Item2Vec: Item Embedding • Represent each item with a low-dimensional vector • Item similarity = vector similarity • Learned from users’ sessions. • Inspired by Word2Vec – Words = Items – Sentences = Users’ Sessions
  • 46. Continuous Bag of Items • E.g. given a user’s session of (I1, I2, I3, I4, I5) • Window size = 2 [Figure: the context items I1, I2, I4 and I5 are used to predict the target item I3]
  • 47. Input layer: one-hot vectors for I2 and I4 (V-dim each), each multiplied by $W_{V \times N}$; hidden layer (N-dim); output layer (V-dim), reached through $W'_{N \times V}$. V is the size of the product catalog; N is the size of the embedding vector. We must learn W and W′.
  • 48. Input layer: one-hot vectors $x_{I2}$ and $x_{I4}$ (V-dim); hidden layer (N-dim); output layer: I3 (V-dim). Multiplying the transposed weight matrix by a one-hot vector selects that item's embedding, $W_{V \times N}^T \times x_{I1} = v_{I1}$:
    $$\begin{pmatrix} 0.1 & 2.4 & 1.6 & 1.8 & 0.5 & 0.9 & \cdots & 3.2 \\ 0.5 & 2.6 & 1.4 & 2.9 & 1.5 & 3.6 & \cdots & 6.1 \\ \vdots & & & & & & & \vdots \\ 0.6 & 1.8 & 2.7 & 1.9 & 2.4 & 2.0 & \cdots & 1.2 \end{pmatrix} \times (0, 1, 0, 0, 0, 0, 0, 0, \ldots, 0)^T = (2.4, 2.6, \ldots, 1.8)^T$$
  • 49. Input layer: $x_{I2}$, $x_{I4}$ (V-dim); hidden layer (N-dim); output layer: I3. The hidden layer averages the context vectors, $\hat{v} = \frac{v_{I2} + v_{I4}}{2}$; the output scores are $W'_{V \times N} \times \hat{v} = z$ and $\hat{y} = \mathrm{softmax}(z)$, e.g. $\hat{y} = (0.01, 0.02, 0.00, 0.02, 0.01, 0.02, 0.01, 0.7, \ldots, 0.00)$. We would prefer $\hat{y}$ close to $y_{I3}$.
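The forward pass walked through on the last three slides (look up the context embeddings, average them, project to catalog size, apply softmax) can be sketched with NumPy. The catalog size, embedding size, random weights and item indices below are assumptions for illustration; training, i.e. actually learning W and W′, is not shown.

```python
import numpy as np


def softmax(z: np.ndarray) -> np.ndarray:
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()


def cbow_forward(W: np.ndarray, W_out: np.ndarray, context_ids) -> np.ndarray:
    """Predict a distribution over the catalog from the context items."""
    v_hat = W[context_ids].mean(axis=0)   # average of the context embeddings, shape (N,)
    z = v_hat @ W_out                     # one score per catalog item, shape (V,)
    return softmax(z)


rng = np.random.default_rng(0)
V_SIZE, N_DIM = 8, 3                      # hypothetical catalog and embedding sizes
W = rng.normal(size=(V_SIZE, N_DIM))      # input embeddings, one row per item
W_out = rng.normal(size=(N_DIM, V_SIZE))  # output weights

# Multiplying W^T by a one-hot vector is the same as picking one row of W.
one_hot = np.zeros(V_SIZE)
one_hot[1] = 1.0
embedding = W.T @ one_hot

# Context = items at indices 1 and 3; the target item sits at index 2.
y_hat = cbow_forward(W, W_out, [1, 3])
loss = -np.log(y_hat[2])                  # cross-entropy against the one-hot target
```

Training would adjust `W` and `W_out` to reduce this loss over many sessions, after which the rows of `W` serve as the item vectors.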
  • 50. Some interesting results. Given that the algorithm was not exposed to item title or description:
    • Similarity: most similar items to Samsung Galaxy S7 G930V: Samsung Galaxy S7 G930A, Samsung Galaxy S7 Edge
    • Item Analogy: + Apple iPhone 5C - Apple iPhone 4s + Samsung Galaxy S5 Edge = Samsung Galaxy S6 Edge
  • 51. Why Analogy Relations Are Preserved? Other items in the session, per target item:
        Target Item | Prepaid Micro Sim | Prepaid Nano Sim | Samsung Charger Cable | Apple Earpods
        iPhone 5    | 0 | 1 | 0 | 1
        iPhone 4    | 1 | 0 | 0 | 1
        Galaxy S5   | 1 | 0 | 1 | 0
        Galaxy S6   | 0 | 1 | 1 | 0
    Analogy: + iPhone 5 - iPhone 4 + Galaxy S5 = Galaxy S6
  • 52. Beyond Accuracy: Future Trends in RecSys • Diversity & Serendipity • Incorporating price in RecSys models • Explainable RecSys • Counteract the effect of the existing RecSys and isolate the organic browsing of the users • Knowledge-based RecSys
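The analogy on this slide can be verified directly from the co-occurrence table. The sketch below treats each row as the item's vector, which is a simplification: learned Item2Vec embeddings are dense and the match is approximate, whereas here the arithmetic is exact. The dictionary layout and variable names are illustrative.

```python
import numpy as np

# Columns: Prepaid Micro Sim, Prepaid Nano Sim, Samsung Charger Cable, Apple Earpods
items = {
    "iPhone 5":  np.array([0, 1, 0, 1]),
    "iPhone 4":  np.array([1, 0, 0, 1]),
    "Galaxy S5": np.array([1, 0, 1, 0]),
    "Galaxy S6": np.array([0, 1, 1, 0]),
}

# + iPhone 5 - iPhone 4 + Galaxy S5
query = items["iPhone 5"] - items["iPhone 4"] + items["Galaxy S5"]

# The answer is the item whose vector lies closest to the query.
answer = min(items, key=lambda name: np.linalg.norm(items[name] - query))
```

Subtracting iPhone 4 from iPhone 5 cancels the shared Apple accessory and leaves the "newer SIM type" direction, which, added to Galaxy S5, lands exactly on the Galaxy S6 row.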

Editor's notes

  1. While the term was coined in the early 1990s, it became popular in 1997 with the important special issue on RS edited by Paul Resnick in Communications of the ACM
  2. Simple but very effective!!!
  3. Matrix factorization models (SVD, SVD++, and time-aware): [41] Latent factor models approach collaborative filtering with the holistic goal of uncovering latent features that explain observed ratings; this type of method includes SVD (Singular Value Decomposition), SVD++ and time-aware factor models. SVD models users and items as vectors of latent features whose dot product produces the rating of the item by the user. In SVD we face an optimization problem consisting of finding the best values for each user and item vector. SVD++ is shown to offer accuracy superior to SVD. The improvement is achieved by incorporating implicit feedback into the SVD model, especially for users who provide more implicit than explicit data. Time-aware factor models capture temporal effects such as changes in user biases, item biases and user preferences over time. These models can also be extended to consider just Boolean ratings, such as purchased/not-purchased or visited/not-visited, which may be easier to collect in real scenarios.
  4. This will be done by developing an algorithm that integrates the rating patterns of all the source domains into one model, making it possible to predict the missing values of the target matrix.