Scalable Recommendation Algorithms for Massive Data
Maruf Aytekin
PhD Candidate
Computer Engineering Department
Bahcesehir University
Outline
• Introduction
• Collaborative Filtering (CF) and Scalability Problem
• Locality Sensitive Hashing (LSH) for Recommendation
• Improvement for LSH methods
• Preliminary Results
• Work Plan
Recommender Systems
• Applied to various domains:
  • Book/movie/news recommendations
  • Contextual advertising
  • Search engine personalization
  • Matchmaking
• Two types of problems:
  • Preference elicitation (prediction)
  • Set-based recommendations (top-N)
Recommender Systems
• Content-based filtering
• Collaborative filtering (CF)
• Model-based
• Neighborhood-based
Neighborhood-based Methods
The idea: similar users behave in a similar way.
• User-based: rely on the opinions of like-minded users to predict a rating.
• Item-based: look at the ratings given to similar items.
Both require computing similarity weights to select the trusted neighbors whose ratings are used in the prediction.
Neighborhood-based Methods
Problem
• All users/items must be compared to find the trusted neighbors (k-nearest neighbors).
• This does not scale well with data size (number of users/items).
Computational Complexity
              Space     Model Build   Query
User-based    O(m^2)    O(m^2 n)      O(m)
Item-based    O(n^2)    O(n^2 m)      O(n)
m : number of users
n : number of items
Various Methods
Model-based recommendation techniques
• Dimensionality reduction (SVD, PCA, Random projections)
• Classification (like, dislike)
• Neural network classifier
• Clustering (ANN)
• Bayesian inference techniques
Distributed computation
• Map-reduce
• Distributed CF algorithms
Locality Sensitive Hashing (LSH)
• An approximate nearest neighbor (ANN) search method.
• Avoids searching all of the data to find the nearest neighbors.
• Finds the nearest neighbors quickly in basic neighborhood-based methods.
Locality Sensitive Hashing (LSH)
General approach:
• “Hash” items several times, in such a way that similar items are more likely to be hashed to the same bucket than dissimilar items are.
• Pairs hashed to the same bucket are candidate pairs.
• Check only the candidate pairs for similarity.
Locality-Sensitive Functions
A function h “hashes” items, and the decision is based on whether or not the results are equal:
• h(x) = h(y): make x and y a candidate pair.
• h(x) ≠ h(y): do not make x and y a candidate pair.
Several functions can be combined into one function g:
g = h1 AND h2 AND h3 …
or
g = h1 OR h2 OR h3 …
A collection of functions of this form is called a family of functions.
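The two combinations above can be sketched as follows. This is a minimal illustration, assuming each hash function is a plain Python callable; the helper names `and_construction` and `or_construction` and the toy hash functions are hypothetical and do not come from the slides.

```python
from typing import Callable, Sequence

Hash = Callable[[int], int]

def and_construction(hs: Sequence[Hash], x: int, y: int) -> bool:
    """x and y are a candidate pair only if every h agrees on them."""
    return all(h(x) == h(y) for h in hs)

def or_construction(hs: Sequence[Hash], x: int, y: int) -> bool:
    """x and y are a candidate pair if at least one h agrees on them."""
    return any(h(x) == h(y) for h in hs)

# Toy hash functions on integers, chosen only for illustration.
hs = [lambda v: v % 2, lambda v: v % 3]

print(and_construction(hs, 1, 7))  # both functions agree on 1 and 7
print(or_construction(hs, 1, 3))   # only the first function agrees on 1 and 3
```

The AND form makes it harder for a pair to become a candidate, while the OR form makes it easier.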
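LSH for Cosine
Charikar defines a family of hash functions for cosine similarity as follows. Let u and v be rating vectors and let r be a randomly generated vector whose components are +1 and −1.
The family of hash functions (H) is generated as:
\[ h_r(u) = \begin{cases} 1 & \text{if } r \cdot u \ge 0 \\ 0 & \text{if } r \cdot u < 0 \end{cases} \]
where
\[ \Pr[h_r(u) = h_r(v)] = 1 - \frac{\theta(u, v)}{\pi} \]
gives the probability of u and v being declared a candidate pair, with \theta(u, v) the angle between u and v.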
LSH for Cosine
Example:
u1 = [5, 4, 0, 4, 1]
u2 = [2, 1, 1, 1, 4]
u3 = [4, 3, 0, 5, 2]
r1 = [-1, 1, 1,-1,-1]
r2 = [ 1, 1, 1,-1,-1]
r3 = [-1,-1, 1,-1, 1]
r4 = [-1, 1,-1, 1,-1]
h1(u1) = u1.r1 = -6 => 0
h2(u1) = u1.r2 = 4 => 1
h3(u1) = u1.r3 = -12 => 0
h4(u1) = u1.r4 = 2 => 1
AND-construction: g(u1) = 0101
g(u1) = 0 1 0 1
g(u2) = 0 0 1 0
g(u3) = 0 1 0 1
With 4 hash functions there are at most 2^4 = 16 buckets.
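The example can be reproduced with a few lines of Python. This is a sketch rather than the author's implementation: the function name `signature` is hypothetical, and mapping a dot product of exactly zero to 1 is an assumption, chosen because it matches g(u3) in the example (u3.r2 = 0).

```python
def signature(u, rs):
    """Concatenate one bit per random vector: 1 if u . r >= 0, else 0."""
    bits = []
    for r in rs:
        dot = sum(a * b for a, b in zip(u, r))
        bits.append("1" if dot >= 0 else "0")  # zero maps to 1 (assumption)
    return "".join(bits)

rs = [
    [-1, 1, 1, -1, -1],
    [1, 1, 1, -1, -1],
    [-1, -1, 1, -1, 1],
    [-1, 1, -1, 1, -1],
]
users = {
    "u1": [5, 4, 0, 4, 1],
    "u2": [2, 1, 1, 1, 4],
    "u3": [4, 3, 0, 5, 2],
}
keys = {name: signature(u, rs) for name, u in users.items()}
print(keys)  # {'u1': '0101', 'u2': '0010', 'u3': '0101'}
```

u1 and u3 receive the same key, so they land in the same bucket and become a candidate pair; u2 does not.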
LSH Model Build
[Figure: users U1 … Um are hashed by K = 4 hash functions h1 to h4, each producing a bit in {0, 1}. The AND-construction concatenates the bits into a key, and users with the same key share a bucket, e.g. bucket 1 (key 0101), bucket 2 (key 1110), bucket 3 (key 1101), bucket 4 (key 1001).]
Hash Tables (Bands)
[Figure: L = 2 hash tables, each with K = 4 hash functions. The candidate set for U5, C(U5), collects the users that share a bucket with U5 in hash table 1 or hash table 2 (U1, U2, U3, U6, …).]
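The model build and the candidate lookup can be sketched as below. This is a minimal illustration under stated assumptions: the names `signature`, `build_tables`, and `candidate_set` are hypothetical, the random vectors are drawn with a fixed seed only to make the run repeatable, and the third user is made up to show two vectors with the same direction sharing a bucket.

```python
import random
from collections import defaultdict

def signature(vec, rs):
    """K-bit key: one bit per random vector, 1 if vec . r >= 0 else 0."""
    return "".join(
        "1" if sum(a * b for a, b in zip(vec, r)) >= 0 else "0" for r in rs
    )

def build_tables(vectors, L, K, seed=0):
    """Return L (random_vectors, buckets) pairs; buckets map key -> user ids."""
    rng = random.Random(seed)
    dim = len(next(iter(vectors.values())))
    tables = []
    for _ in range(L):
        rs = [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(K)]
        buckets = defaultdict(list)
        for uid, vec in vectors.items():
            buckets[signature(vec, rs)].append(uid)
        tables.append((rs, buckets))
    return tables

def candidate_set(uid, vectors, tables):
    """Union of the buckets uid falls into, over all tables (uid excluded)."""
    found = set()
    for rs, buckets in tables:
        found.update(buckets.get(signature(vectors[uid], rs), []))
    found.discard(uid)
    return found

vectors = {
    "U1": [5, 4, 0, 4, 1],
    "U2": [2, 1, 1, 1, 4],
    "U3": [10, 8, 0, 8, 2],  # same direction as U1, so it always shares U1's key
}
tables = build_tables(vectors, L=2, K=4)
print(candidate_set("U1", vectors, tables))
```

Each table applies the AND-construction over its K bits, and taking the union over the L tables plays the role of the OR-construction.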
LSH Methods
• Clustering Based:
• UB-KNN-LSH: User-based CF prediction with LSH
• IB-KNN-LSH: Item-based CF with LSH
• Frequency Based:
• UB-LSH1: User-based prediction with LSH
• IB-LSH1: Item-based prediction with LSH
LSH Methods for Prediction
UB-KNN-LSH
• Find the candidate set, C, for target user u with LSH.
• Find the k nearest neighbors of u in C that have rated item i.
• Use the k nearest neighbors to generate a prediction for u on i.
IB-KNN-LSH
• Find the candidate set, C, for target item i with LSH.
• Find the k nearest neighbors of i in C that user u has rated.
• Use the k nearest neighbors to generate a prediction for u on item i.
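The user-based steps can be sketched as follows. The slides do not give the prediction formula, so the similarity-weighted average used here is an assumption, as are the function names and the convention that a rating of 0 means "not rated". The candidate set is passed in as if it had already been retrieved with LSH.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def predict_ub_knn_lsh(u, i, ratings, candidates, k):
    """Predict u's rating of item index i from candidates who rated i."""
    # Similarities are computed only against the LSH candidates, not all users.
    sims = {v: cosine(ratings[u], ratings[v])
            for v in candidates if ratings[v][i] > 0}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]
    weight = sum(abs(sims[v]) for v in neighbors)
    if weight == 0:
        return None  # no usable neighbor: the prediction is not covered
    return sum(sims[v] * ratings[v][i] for v in neighbors) / weight

ratings = {
    "u1": [5, 4, 0, 4, 1],
    "u2": [2, 1, 1, 1, 4],
    "u3": [4, 3, 0, 5, 2],
}
print(predict_ub_knn_lsh("u1", 2, ratings, {"u2", "u3"}, k=2))
```

The item-based variant is symmetric: it looks up candidates for item i and keeps those that u has rated.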

LSH Methods: Prediction
UB-LSH1
• Find the list of candidate users, Cl, for u who have rated i, with LSH.
• Calculate the frequency of each user in Cl who rated i.
• Sort the candidate users by frequency and take the top k users.
• Use the frequency as the weight to predict the rating for u on i with user-based prediction.
IB-LSH1
• Find the list of candidate items, Cl, for i with LSH.
• Calculate the frequency of the items in Cl that are rated by u.
• Sort the candidate items by frequency and take the top k items.
• Use the frequency as the weight to predict the rating for u on i with item-based prediction.
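The frequency-based steps for the user-based case can be sketched as follows. Reading "use frequency as weight" as a frequency-weighted average is an assumption, as are the function name and the data layout (a dict of per-user rating dicts). The candidate list is passed in with duplicates, one entry per hash table in which the user collided with u.

```python
from collections import Counter

def predict_ub_lsh1(i, ratings, candidate_list, k):
    """Frequency-weighted average of the top-k candidates' ratings of item i."""
    # Count how often each candidate who rated i appears in the list.
    freq = Counter(v for v in candidate_list if i in ratings[v])
    top = freq.most_common(k)  # sorted by frequency, highest first
    total = sum(f for _, f in top)
    if total == 0:
        return None  # nobody in the list rated i
    return sum(f * ratings[v][i] for v, f in top) / total

ratings = {"a": {"i1": 4}, "b": {"i1": 2}, "c": {}}
candidate_list = ["a", "b", "a", "c"]  # "a" collided with u in two tables
print(predict_ub_lsh1("i1", ratings, candidate_list, k=2))
```

No similarity is computed at prediction time; the collision count stands in for it.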
Improvement: Prediction
UB-LSH2
• Find the list of candidate users, Cl, for u who have rated i, with LSH.
• Select k users from Cl randomly.
• Predict the rating for u on i with user-based prediction as the average rating of the k users.
IB-LSH2
• Find the list of candidate items, Cl, for i with LSH.
• Select k items rated by u from Cl randomly.
• Predict the rating for u on i with item-based prediction as the average rating of the k items.
- Eliminates the frequency calculation and sorting.
- Users or items that are frequent in Cl have a higher chance of being selected randomly.
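The random-selection variant can be sketched as follows for the user-based case. The function name, the data layout, and the choice to sample list positions without replacement (so a user who appears several times may be picked more than once) are assumptions made for illustration.

```python
import random

def predict_ub_lsh2(i, ratings, candidate_list, k, rng=None):
    """Average the ratings of k randomly chosen candidates who rated item i."""
    rng = rng or random.Random()
    # Duplicates are kept on purpose: frequent users are more likely to be drawn.
    rated = [v for v in candidate_list if i in ratings[v]]
    if not rated:
        return None
    chosen = rng.sample(rated, min(k, len(rated)))
    return sum(ratings[v][i] for v in chosen) / len(chosen)

ratings = {"a": {"i1": 4}, "b": {"i1": 2}, "c": {}}
candidate_list = ["a", "b", "a", "c"]
print(predict_ub_lsh2("i1", ratings, candidate_list, k=1, rng=random.Random(0)))
```

Compared with the frequency-based sketch, there is no counting or sorting step; the duplicates in the list bias the draw instead.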
Complexity: Prediction
              Space    Model Build   Prediction
User-based    O(m)     O(m^2)        O(mn)
Item-based    O(n)     O(n^2)        O(mn)
UB-KNN-LSH    O(mL)    O(mLKt)       O(L + |C|n + k)
IB-KNN-LSH    O(nL)    O(nLKt)       O(L + |C|m + k)
UB-LSH1       O(mL)    O(mLKt)       O(L + |Cl| + |Cl| lg|Cl| + k)
IB-LSH1       O(nL)    O(nLKt)       O(L + |Cl| + |Cl| lg|Cl| + k)
UB-LSH2       O(mL)    O(mLKt)       O(L + 2k)
IB-LSH2       O(nL)    O(nLKt)       O(L + 2k)
m : number of users
n : number of items
L : number of hash tables
K : number of hash functions
t : time to evaluate a hash function
C : candidate user (or item) set ( |C| ≤ Lm / 2^K or |C| ≤ Ln / 2^K )
Cl : candidate user (or item) list ( |Cl| ≤ Lm / 2^K or |Cl| ≤ Ln / 2^K )
Candidate List (Cl): Prediction
[Figure, two panels. Left: |Cl| and m (number of users) versus number of hash functions (1 to 10), with |Cl| ≤ Lm / 2^K, L = 5, m = 16,042. Right: |Cl| and n (number of items) versus number of hash functions (1 to 10), with |Cl| ≤ Ln / 2^K, L = 5, n = 17,454.]
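The bound on the candidate list can be tabulated directly from the formula. This short sketch only evaluates L · size / 2^K for the user-side values quoted on the slide (L = 5, m = 16,042); the function name `candidate_bound` is hypothetical.

```python
def candidate_bound(L, size, K):
    """Upper bound L * size / 2**K on the candidate list length."""
    return L * size / 2 ** K

# User-side values from the slide: L = 5 hash tables, m = 16,042 users.
for K in range(1, 11):
    print(K, round(candidate_bound(5, 16042, K)))
```

For K = 1 the bound is 40,105, which is larger than m itself; it falls below m from K = 3 onward and halves with every additional hash function.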
Results
Model Build
Results: Prediction
[Figure: MAE versus number of hash functions (4 to 13). Panels: MovieLens 1M, Amazon Movies. Series: UB-KNN, IB-KNN, UB-KNN-LSH, IB-KNN-LSH, UB-LSH1, UB-LSH2, IB-LSH1, IB-LSH2.]
[Figure: run time (ms) versus number of hash functions (4 to 13). Panels: MovieLens 1M, Amazon Movies. Series: UB-KNN-LSH, IB-KNN-LSH, UB-LSH1, UB-LSH2, IB-LSH1, IB-LSH2.]
[Figure: run time (ms) versus number of hash functions (4 to 13). Panels: MovieLens 1M, Amazon Movies. Series: UB-LSH1, UB-LSH2, IB-LSH1, IB-LSH2.]
[Figure: prediction coverage versus number of hash functions (4 to 13). Panels: MovieLens 1M, Amazon Movies. Series: UB-KNN-LSH, IB-KNN-LSH, UB-LSH1, UB-LSH2, IB-LSH1, IB-LSH2.]
[Figure: performance-coverage tradeoff, coverage (higher is better) versus run time (lower is better); upper left is better. Panels: MovieLens 1M, Amazon Movies. Series: UB-LSH1, UB-LSH2, IB-LSH1, IB-LSH2.]
[Figure: MAE-performance tradeoff, running time in ms (lower is better) versus MAE (lower is better); lower left is better. Panels: MovieLens 1M, Amazon Movies. Series: UB-LSH1, UB-LSH2, IB-LSH1, IB-LSH2.]
LSH Methods for Top-N Recommendation
UB-LSH1
• Find the candidate set, C, for user u with LSH.
• For each user v in C, retrieve the items rated by v and add them to a running candidate list, Cl.
• Calculate the frequency of the items in Cl.
• Sort Cl by frequency.
• Recommend the N most frequent items to u.
IB-LSH1
• For each item i that u rated, retrieve the candidate set, C, for i with LSH and add C to a running candidate list, Cl.
• Calculate the frequency of the items in Cl.
• Sort Cl by frequency.
• Recommend the N most frequent items to u.
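The user-based top-N steps can be sketched as follows. The function name and data layout are hypothetical, the candidate set is passed in as if already retrieved with LSH, and breaking frequency ties by item id is an assumption added only to make the output deterministic.

```python
from collections import Counter

def top_n_ub_lsh1(rated_items, candidate_set, n):
    """Recommend the n items rated most often by the candidate users."""
    # Running candidate list Cl: every item rated by every candidate user.
    cl = [item for v in candidate_set for item in rated_items[v]]
    freq = Counter(cl)
    ranked = sorted(freq, key=lambda item: (-freq[item], item))
    return ranked[:n]

rated_items = {
    "v1": {"a", "b"},
    "v2": {"b", "c"},
    "v3": {"a", "b"},
}
print(top_n_ub_lsh1(rated_items, ["v1", "v2", "v3"], n=2))  # ['b', 'a']
```

Here "b" is rated by all three candidates and "a" by two, so they are recommended in that order.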
Improvement: Top-N Recommendation
UB-LSH2
• Find the candidate set, C, for user u with LSH.
• For each user v in C, retrieve the items rated by v and add them to a running candidate list, Cl.
• Select N items from Cl randomly and recommend them to u.
IB-LSH2
• For each item i that u rated, retrieve the candidate set, C, for i with LSH and add it to a running candidate list, Cl.
• Select N items from Cl randomly and recommend them to u.
Eliminates the frequency calculation and sorting.
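The random top-N variant can be sketched as follows. The slides only say that N items are selected randomly from Cl; drawing one item at a time and removing all of its copies, so that the N recommendations are distinct, is an assumption, as are the function name and data layout.

```python
import random

def top_n_ub_lsh2(rated_items, candidate_set, n, rng=None):
    """Pick n distinct items at random from the running candidate list Cl."""
    rng = rng or random.Random()
    # Duplicates stay in Cl, so frequently rated items are more likely to be drawn.
    cl = [item for v in candidate_set for item in sorted(rated_items[v])]
    picked = []
    while cl and len(picked) < n:
        item = rng.choice(cl)
        picked.append(item)
        cl = [c for c in cl if c != item]  # drop all copies of the drawn item
    return picked

rated_items = {
    "v1": {"a", "b"},
    "v2": {"b", "c"},
    "v3": {"a", "b"},
}
print(top_n_ub_lsh2(rated_items, ["v1", "v2", "v3"], n=2, rng=random.Random(0)))
```

No counting or sorting of Cl is needed, which is the saving the slide refers to.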
Complexity: Top-N Recommendation
              Space    Model Build   Top-N Recommendation
User-based    O(m)     O(m^2)        O(mn)
Item-based    O(n)     O(n^2)        O(mn)
UB-LSH1       O(mL)    O(mLKt)       O(L + |C| + |Cl| + |Cl| lg|Cl|)
IB-LSH1       O(nL)    O(nLKt)       O(pL + |Cl| + |Cl| lg|Cl|)
UB-LSH2       O(mL)    O(mLKt)       O(L + |C| + N)
IB-LSH2       O(nL)    O(nLKt)       O(pL + N)
m : number of users
n : number of items
p : number of ratings of a user
L : number of hash tables
K : number of hash functions
t : time to evaluate a hash function
C : candidate user (or item) set ( |C| ≤ Lm / 2^K or |C| ≤ Ln / 2^K )
Cl : candidate item list ( |Cl| ≤ p|C| for UB-LSH1 and IB-LSH1, such that |Cl| ≤ Lpn / 2^K )
Candidate List (Cl): Top-N Recommendation
[Figure: min |Cl|, max |Cl|, and n (number of items) versus number of hash functions (4 to 13), with |Cl| ≤ Lpn / 2^K, L = 5, n = 1000, p = 100 (avg. number of ratings for a user).]
Results: Top-N Recommendation
[Figure: precision versus number of hash functions (4 to 13). Panels: MovieLens 1M, Amazon Movies. Series: IB-TOP-N, UB-TOP-N, IB-LSH1, IB-LSH2, UB-LSH1, UB-LSH2.]
[Figure: average recommendation time (ms) versus number of hash functions (4 to 13). Panels: MovieLens 1M, Amazon Movies. Series: IB-LSH1, IB-LSH2, UB-LSH1, UB-LSH2.]
[Figure: aggregate diversity versus number of hash functions (4 to 13). Panels: MovieLens 1M, Amazon Movies. Series: IB-TOP-N, UB-TOP-N, IB-LSH1, IB-LSH2, UB-LSH1, UB-LSH2.]
[Figure: diversity versus number of hash functions (4 to 13). Panels: MovieLens 1M, Amazon Movies. Series: IB-TOP-N, UB-TOP-N, IB-LSH1, IB-LSH2, UB-LSH1, UB-LSH2.]
[Figure: novelty versus number of hash functions (4 to 13). Panels: MovieLens 1M, Amazon Movies. Series: IB-TOP-N, UB-TOP-N, IB-LSH1, IB-LSH2, UB-LSH1, UB-LSH2.]
Our improvement is simple but efficient:
• Improves:
  • Performance
  • Diversity
  • Coverage
  • Novelty
• but costs accuracy.
Work Plan
• LSH as a real-time stream recommendation algorithm
• Dimensionality reduction methods (e.g., Matrix Factorization)
• Other ANN methods:
  • Tree based
  • Clustering based
Q & A

 
Digital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxDigital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptx
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
 
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
 

Scalable Recommendation Algorithms with LSH

  • 1. Scalable Recommendation Algorithms for Massive Data Maruf Aytekin PhD Candidate 
 Computer Engineering Department
 Bahcesehir University
  • 2. Outline • Introduction • Collaborative Filtering (CF) and Scalability Problem • Locality Sensitive Hashing (LSH) for Recommendation • Improvement for LSH methods • Preliminary Results • Work Plan
  • 3. Recommender Systems • Recommender systems • Applied to various domains: • Book/movie/news recommendations • Contextual advertising • Search engine personalization • Matchmaking • Two types of problems: • Preference elicitation (prediction) • Set-based recommendations (top-N)
  • 4. Recommender Systems • Content-based filtering • Collaborative filtering (CF) • Model-based • Neighborhood-based
  • 5. Neighborhood-based Methods The idea: similar users behave in a similar way. • User-based: rely on the opinions of like-minded users to predict a rating. • Item-based: look at the ratings given to similar items. Both require computing similarity weights to select the trusted neighbors whose ratings are used in the prediction.
  • 6. Neighborhood-based Methods: Problem • All users/items must be compared to find the trusted neighbors (k-nearest-neighbors). • This does not scale well with data size (number of users/items).
 Computational Complexity
             Space    Model Build  Query
 User-based  O(m^2)   O(m^2 n)     O(m)
 Item-based  O(n^2)   O(n^2 m)     O(n)
 m: number of users
 n: number of items
  • 7. Various Methods Model-based recommendation techniques • Dimensionality reduction (SVD, PCA, Random projections) • Classification (like, dislike) • Neural network classifier • Clustering (ANN) • Bayesian inference techniques Distributed computation • Map-reduce • Distributed CF algorithms
  • 8. Locality Sensitive Hashing (LSH) • ANN search method • Provides a way to eliminate searching all of the data to find the nearest neighbors • Finds the nearest neighbors fast in basic neighbourhood based methods.
  • 9. Locality Sensitive Hashing (LSH) General approach: • “Hash” items several times, in such a way that similar items are more likely to be hashed to the same bucket than dissimilar items are. • Pairs hashed to the same bucket are candidate pairs. • Check only the candidate pairs for similarity.
  • 10. Locality-Sensitive Functions The function h will “hash” items, and the decision is based on whether or not the results are equal. • h(x) = h(y): make x and y a candidate pair. • h(x) ≠ h(y): do not make x and y a candidate pair. Functions can be combined: g = h1 AND h2 AND h3 … or g = h1 OR h2 OR h3 … A collection of functions of this form is called a family of functions.
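 The AND and OR combinations on slide 10 can be sketched as follows. This is a minimal illustration, assuming each item has already been reduced to a tuple of hash values (h1(x), h2(x), ...); the function names and the sample tuples are hypothetical and not taken from the slides.

```python
def and_candidate(sig_x, sig_y):
    """g = h1 AND h2 AND ...: candidate pair only if every hash value agrees."""
    return all(a == b for a, b in zip(sig_x, sig_y))


def or_candidate(sig_x, sig_y):
    """g = h1 OR h2 OR ...: candidate pair if at least one hash value agrees."""
    return any(a == b for a, b in zip(sig_x, sig_y))


# Hypothetical hash values for three items under four hash functions.
x = (0, 1, 0, 1)
y = (0, 0, 1, 0)
z = (0, 1, 0, 1)

and_xy = and_candidate(x, y)  # x and y differ in three positions
and_xz = and_candidate(x, z)  # x and z agree everywhere
or_xy = or_candidate(x, y)    # x and y agree in the first position
```

 AND makes it harder for a pair to become a candidate (fewer false positives), while OR makes it easier (fewer false negatives).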
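  • 11. LSH for Cosine Charikar defines a family of functions for cosine similarity as follows: let u and v be rating vectors and let r be a randomly generated vector whose components are +1 and −1. The family of hash functions (H) generated: $h_r(u) = 1$ if $r \cdot u \ge 0$ and $h_r(u) = 0$ otherwise, where $\Pr[h_r(u) = h_r(v)] = 1 - \theta(u, v)/\pi$ is the probability of u and v being declared a candidate pair and $\theta(u, v)$ is the angle between u and v.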
  • 12. LSH for Cosine: Example
 Rating vectors: u1 = [5, 4, 0, 4, 1], u2 = [2, 1, 1, 1, 4], u3 = [4, 3, 0, 5, 2]
 Random vectors: r1 = [-1, 1, 1, -1, -1], r2 = [1, 1, 1, -1, -1], r3 = [-1, -1, 1, -1, 1], r4 = [-1, 1, -1, 1, -1]
 h1(u1) = u1 · r1 = -6 => 0
 h2(u1) = u1 · r2 = 4 => 1
 h3(u1) = u1 · r3 = -12 => 0
 h4(u1) = u1 · r4 = 2 => 1
 AND-construction: g(u1) = 0101, g(u2) = 0010, g(u3) = 0101
 With 4 hash functions there are at most 2^4 = 16 buckets.
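 The example on slide 12 can be reproduced with a few lines of Python. This is a sketch rather than the author's implementation; it assumes that a dot product of exactly zero maps to bit 1, which is the tie rule that matches the slide's value for g(u3). The helper names are made up for illustration.

```python
# Random +1/-1 vectors and rating vectors copied from the slide.
R = [
    [-1, 1, 1, -1, -1],
    [1, 1, 1, -1, -1],
    [-1, -1, 1, -1, 1],
    [-1, 1, -1, 1, -1],
]
USERS = {
    "u1": [5, 4, 0, 4, 1],
    "u2": [2, 1, 1, 1, 4],
    "u3": [4, 3, 0, 5, 2],
}


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def signature(u, projections=R):
    """One bit per hash function: 1 if r . u >= 0, else 0 (assumed tie rule)."""
    return "".join("1" if dot(u, r) >= 0 else "0" for r in projections)


signatures = {name: signature(vec) for name, vec in USERS.items()}
```

 Here u1 and u3 receive the same key and land in the same bucket, so they form a candidate pair, while u2 does not.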
  • 13. LSH Model Build [Diagram: each user U1 … Um is passed through K = 4 hash functions h1 … h4, each producing 0 or 1. The AND-construction concatenates the bits into a bucket key, and users with the same key share a bucket, e.g. bucket 1 (key 0101), bucket 2 (key 1110), bucket 3 (key 1101), bucket 4 (key 1001).]
  • 14. Hash Tables (Bands) [Diagram: L = 2 hash tables, each built with K = 4 hash functions. The candidate set for U5, C(U5), consists of the users that share a bucket with U5 in hash table 1 or hash table 2, here U1, U2, U3, U6, …]
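 Model build and candidate lookup with L hash tables can be sketched as below. This is a minimal sketch under stated assumptions: the function names, the seed, and the three toy vectors are hypothetical, and the random vectors are drawn with Python's `random` module rather than taken from the slides.

```python
import random
from collections import defaultdict


def make_tables(num_tables, num_hashes, dim, seed=0):
    """Draw K random +1/-1 vectors for each of the L hash tables."""
    rng = random.Random(seed)
    return [
        [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(num_hashes)]
        for _ in range(num_tables)
    ]


def bucket_key(vec, projections):
    """AND-construction: concatenate one sign bit per hash function."""
    return "".join(
        "1" if sum(x * r for x, r in zip(vec, proj)) >= 0 else "0"
        for proj in projections
    )


def build_index(vectors, tables):
    """Model build: place every vector into one bucket per hash table."""
    index = [defaultdict(set) for _ in tables]
    for name, vec in vectors.items():
        for buckets, projections in zip(index, tables):
            buckets[bucket_key(vec, projections)].add(name)
    return index


def candidate_set(name, vectors, tables, index):
    """Union of the buckets that `name` falls into across all tables."""
    found = set()
    for buckets, projections in zip(index, tables):
        found |= buckets.get(bucket_key(vectors[name], projections), set())
    found.discard(name)
    return found


# Toy data: U2 is a positive multiple of U1, so the angle between them is zero.
vectors = {"U1": [5, 4, 0, 4, 1], "U2": [10, 8, 0, 8, 2], "U3": [1, 0, 5, 0, 4]}
tables = make_tables(num_tables=2, num_hashes=4, dim=5)
index = build_index(vectors, tables)
candidates_u1 = candidate_set("U1", vectors, tables, index)
```

 Because U2 points in the same direction as U1, it gets the same key in every table and always appears in the candidate set of U1, whatever the random vectors are.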
  • 15. LSH Methods • Clustering Based: • UB-KNN-LSH: User-based CF prediction with LSH • IB-KNN-LSH: Item-based CF with LSH • Frequency Based: • UB-LSH1: User-based prediction with LSH • IB-LSH1: Item-based prediction with LSH
  • 17. LSH Methods: Prediction
 UB-KNN-LSH
 • Find the candidate set, C, for the target user, u, with LSH.
 • Find the k nearest neighbors of u in C that have rated item i.
 • Use the k nearest neighbors to generate a prediction for u on i.
 IB-KNN-LSH
 • Find the candidate set, C, for the target item, i, with LSH.
 • Find the k nearest neighbors of i in C that user u has rated.
 • Use the k nearest neighbors to generate a prediction for u on item i.
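 The UB-KNN-LSH steps can be sketched as follows, assuming the candidate set C has already been retrieved from the hash tables. The slides do not state the similarity measure or the prediction formula, so cosine similarity and a similarity-weighted average are assumptions here, and all names and ratings are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    dot = sum(a[i] * b[i] for i in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_ub_knn_lsh(user, item, candidates, ratings, k):
    """Predict ratings[user][item] from the k most similar candidates."""
    # Keep only the candidates that actually rated the target item.
    raters = [v for v in candidates if item in ratings[v]]
    scored = sorted(
        ((cosine(ratings[user], ratings[v]), v) for v in raters), reverse=True
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        return None  # no usable neighbor, so no prediction (affects coverage)
    # Assumed prediction rule: similarity-weighted average of neighbor ratings.
    return sum(sim * ratings[v][item] for sim, v in scored) / total


ratings = {
    "u": {"a": 5, "b": 4},
    "v1": {"a": 5, "b": 4, "x": 4},
    "v2": {"a": 1, "b": 5, "x": 2},
    "v3": {"a": 5},
}
prediction = predict_ub_knn_lsh("u", "x", {"v1", "v2", "v3"}, ratings, k=1)
```

 Similarities are computed only against the candidates in C instead of all m users, which is where the saving over plain user-based kNN comes from.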
  • 18. LSH Methods: Prediction
 UB-LSH1
 • Find the candidate users list, Cl, for u with LSH, keeping the users who rated i.
 • Calculate the frequency of each user in Cl who rated i.
 • Sort the candidate users by frequency and take the top k users.
 • Use frequency as the weight to predict the rating for u on i with user-based prediction.
 IB-LSH1
 • Find the candidate items list, Cl, for i with LSH.
 • Calculate the frequency of the items in Cl that are rated by u.
 • Sort the candidate items by frequency and take the top k items.
 • Use frequency as the weight to predict the rating for u on i with item-based prediction.
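 The frequency-based UB-LSH1 prediction can be sketched as below. It assumes the candidate list Cl is the concatenation of the buckets found in the L hash tables, so a user appears once per table in which they collide with u. The frequency-weighted average is one reading of "use frequency as weight"; the names and data are hypothetical.

```python
from collections import Counter


def predict_ub_lsh1(item, candidate_list, ratings, k):
    """Frequency-weighted average over the k most frequent candidate users."""
    # Count how often each candidate who rated the item appears in Cl.
    freq = Counter(v for v in candidate_list if item in ratings[v])
    # most_common sorts by frequency, which is the sorting step of the method.
    top = freq.most_common(k)
    total = sum(count for _, count in top)
    if total == 0:
        return None  # nobody in Cl rated the item
    return sum(count * ratings[v][item] for v, count in top) / total


ratings = {"v1": {"x": 4}, "v2": {"x": 2}, "v3": {"y": 5}}
candidate_list = ["v1", "v2", "v1", "v3", "v1"]  # v1 collided in three tables
prediction = predict_ub_lsh1("x", candidate_list, ratings, k=2)
```

 No similarity is computed at query time: the number of collisions stands in for similarity.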
  • 19. Improvement: Prediction
 UB-LSH2
 • Find the candidate users list, Cl, for u with LSH, keeping the users who rated i.
 • Select k users from Cl randomly.
 • Predict the rating for u on i with user-based prediction as the average of the ratings of the k users.
 IB-LSH2
 • Find the candidate items list, Cl, for i with LSH.
 • Select k items rated by u from Cl randomly.
 • Predict the rating for u on i with item-based prediction as the average of the ratings of the k items.
 - Eliminates frequency calculation and sorting.
 - Frequent users or items in Cl have a higher chance of being selected randomly.
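 The random-selection variant UB-LSH2 can be sketched as follows. It assumes Cl already contains only users who rated item i and keeps duplicates, so sampling positions from the list favors frequent users without counting them. Sampling without replacement over list positions is an assumption; the slides only say "randomly". Names and data are hypothetical.

```python
import random


def predict_ub_lsh2(item, candidate_list, ratings, k, rng=random):
    """Average the ratings of k entries drawn at random from Cl."""
    if not candidate_list:
        return None  # empty candidate list, no prediction
    # Duplicates in Cl are kept on purpose: a user who collided in several
    # tables occupies several positions and is more likely to be drawn.
    chosen = rng.sample(candidate_list, min(k, len(candidate_list)))
    return sum(ratings[v][item] for v in chosen) / len(chosen)


ratings = {"v1": {"x": 4}, "v2": {"x": 2}}
candidate_list = ["v1", "v1", "v1", "v2"]
rng = random.Random(0)  # fixed seed only to make the sketch repeatable
prediction = predict_ub_lsh2("x", candidate_list, ratings, k=4, rng=rng)
```

 With k equal to the length of Cl every entry is drawn, so the result equals the frequency-weighted average; for smaller k the result varies from run to run.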
  • 20. Complexity: Prediction
             Space   Model Build  Prediction
 User-based  O(m)    O(m^2)       O(mn)
 Item-based  O(n)    O(n^2)       O(mn)
 UB-KNN-LSH  O(mL)   O(mLKt)      O(L + |C|n + k)
 IB-KNN-LSH  O(nL)   O(nLKt)      O(L + |C|m + k)
 UB-LSH1     O(mL)   O(mLKt)      O(L + |Cl| + |Cl| lg|Cl| + k)
 IB-LSH1     O(nL)   O(nLKt)      O(L + |Cl| + |Cl| lg|Cl| + k)
 UB-LSH2     O(mL)   O(mLKt)      O(L + 2k)
 IB-LSH2     O(nL)   O(nLKt)      O(L + 2k)
 m: number of users
 n: number of items
 L: number of hash tables
 K: number of hash functions
 t: time to evaluate a hash function
 C: candidate user (or item) set (|C| ≤ Lm / 2^K or |C| ≤ Ln / 2^K)
 Cl: candidate user (or item) list (|Cl| ≤ Lm / 2^K or |Cl| ≤ Ln / 2^K)
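 The candidate-size bound from slide 20 is easy to tabulate. The sketch below evaluates Lm / 2^K for the user counts quoted on the following slide (L = 5, m = 16,042); the function name is hypothetical.

```python
def candidate_bound(num_tables, population, num_hashes):
    """Bound L * m / 2**K on the candidate list size from slide 20."""
    return num_tables * population / 2 ** num_hashes


# Bound on |Cl| for K = 1 .. 10 with L = 5 tables and m = 16,042 users.
bounds = {k: candidate_bound(5, 16042, k) for k in range(1, 11)}
```

 Each additional hash function halves the bound: it is 40,105 for K = 1 and about 5,013 for K = 4.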
  • 21. Candidate List (Cl): Prediction [Figure: two plots of the candidate list size against the number of hash functions (1 to 10), with L = 5. Left: number of users, |Cl| ≤ Lm / 2^K, m = 16,042. Right: number of items, |Cl| ≤ Ln / 2^K, n = 17,454.]
  • 23. Results: Prediction [Figure: MAE vs. number of hash functions (4 to 13) for UB-KNN, IB-KNN, UB-KNN-LSH, IB-KNN-LSH, UB-LSH1, UB-LSH2, IB-LSH1, and IB-LSH2. Panels: MovieLens 1M and Amazon Movies.]
  • 24. Results: Prediction [Figure: run time (ms) vs. number of hash functions (4 to 13) for UB-KNN-LSH, IB-KNN-LSH, UB-LSH1, UB-LSH2, IB-LSH1, and IB-LSH2. Panels: MovieLens 1M and Amazon Movies.]
  • 25. Results: Prediction [Figure: run time (ms) vs. number of hash functions (4 to 13) for UB-LSH1, UB-LSH2, IB-LSH1, and IB-LSH2. Panels: MovieLens 1M and Amazon Movies.]
  • 26. Results: Prediction [Figure: prediction coverage vs. number of hash functions (4 to 13) for UB-KNN-LSH, IB-KNN-LSH, UB-LSH1, UB-LSH2, IB-LSH1, and IB-LSH2. Panels: MovieLens 1M and Amazon Movies.]
  • 27. Results: Prediction [Figure: performance-coverage tradeoff, coverage (higher is better) vs. run time (lower is better), for UB-LSH1, UB-LSH2, IB-LSH1, and IB-LSH2; upper left is better. Panels: MovieLens 1M and Amazon Movies.]
  • 28. Results: Prediction [Figure: MAE-performance tradeoff, running time in ms (lower is better) vs. MAE (lower is better), for UB-LSH1, UB-LSH2, IB-LSH1, and IB-LSH2; lower left is better. Panels: MovieLens 1M and Amazon Movies.]
  • 30. LSH Methods: Top-N Recommendation
 UB-LSH1
 • Find the candidate set, C, for user u with LSH.
 • For each user, v, in C, retrieve the items rated by v and add them to a running candidate list, Cl.
 • Calculate the frequency of the items in Cl.
 • Sort Cl by frequency.
 • Recommend the N most frequent items to u.
 IB-LSH1
 • For each item, i, that u rated, retrieve the candidate set, C, for i with LSH and add C to a running candidate list, Cl.
 • Calculate the frequency of the items in Cl.
 • Sort Cl by frequency.
 • Recommend the N most frequent items to u.
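 The user-based top-N procedure (UB-LSH1) can be sketched as below, assuming the candidate set C has already been retrieved with LSH. Skipping items the target user has already rated is an assumption added here; the slides do not say how such items are handled. Names and data are hypothetical.

```python
from collections import Counter


def recommend_ub_lsh1(user, candidates, ratings, n):
    """Recommend the n items rated most often by the candidate users."""
    seen = set(ratings[user])  # assumption: do not recommend rated items
    # Running candidate list Cl, counted on the fly instead of stored.
    freq = Counter(
        item for v in candidates for item in ratings[v] if item not in seen
    )
    # most_common performs the sort-by-frequency step.
    return [item for item, _ in freq.most_common(n)]


ratings = {
    "u": {"a": 5},
    "v1": {"a": 4, "b": 5, "c": 3},
    "v2": {"b": 4, "d": 2},
    "v3": {"b": 1, "c": 5},
}
top_items = recommend_ub_lsh1("u", ["v1", "v2", "v3"], ratings, n=2)
```

 Item b is rated by all three candidates and c by two, so they are recommended in that order.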
  • 31. Improvement: Top-N Recommendation
 UB-LSH2
 • Find the candidate set, C, for user u with LSH.
 • For each user, v, in C, retrieve the items rated by v and add them to a running candidate list, Cl.
 • Select N items from Cl randomly and recommend them to u.
 IB-LSH2
 • For each item, i, that u rated, retrieve the candidate set, C, for i with LSH and add it to a running candidate list, Cl.
 • Select N items from Cl randomly and recommend them to u.
 Eliminates frequency calculation and sorting.
  • 32. Complexity: Top-N Recommendation
             Space   Model Build  Top-N Recommendation
 User-based  O(m)    O(m^2)       O(mn)
 Item-based  O(n)    O(n^2)       O(mn)
 UB-LSH1     O(mL)   O(mLKt)      O(L + |C| + |Cl| + |Cl| lg|Cl|)
 IB-LSH1     O(nL)   O(nLKt)      O(pL + |Cl| + |Cl| lg|Cl|)
 UB-LSH2     O(mL)   O(mLKt)      O(L + |C| + N)
 IB-LSH2     O(nL)   O(nLKt)      O(pL + N)
 m: number of users
 n: number of items
 p: number of ratings of a user
 L: number of hash tables
 K: number of hash functions
 t: time to evaluate a hash function
 C: candidate user (or item) set (|C| ≤ Lm / 2^K or |C| ≤ Ln / 2^K)
 Cl: candidate item list (|Cl| ≤ p|C| for UB-LSH1 and IB-LSH1, so that |Cl| ≤ Lpn / 2^K)
  • 33. Candidate List (Cl): Top-N Recommendation [Figure: minimum and maximum |Cl|, and n, plotted as number of items against the number of hash functions (4 to 13), with |Cl| ≤ Lpn / 2^K, L = 5, n = 1000, p = 100 (average number of ratings for a user).]
  • 34. Results: Top-N Recommendation [Figure: precision vs. number of hash functions (4 to 13) for IB-TOP-N, UB-TOP-N, IB-LSH1, IB-LSH2, UB-LSH1, and UB-LSH2. Panels: MovieLens 1M and Amazon Movies.]
  • 35. Results: Top-N Recommendation [Figure: average recommendation time (ms) vs. number of hash functions (4 to 13) for IB-LSH1, IB-LSH2, UB-LSH1, and UB-LSH2. Panels: MovieLens 1M and Amazon Movies.]
  • 36. Results: Top-N Recommendation [Figure: aggregate diversity vs. number of hash functions (4 to 13) for IB-TOP-N, UB-TOP-N, IB-LSH1, IB-LSH2, UB-LSH1, and UB-LSH2. Panels: MovieLens 1M and Amazon Movies.]
  • 37. Results: Top-N Recommendation [Figure: diversity vs. number of hash functions (4 to 13) for IB-TOP-N, UB-TOP-N, IB-LSH1, IB-LSH2, UB-LSH1, and UB-LSH2. Panels: MovieLens 1M and Amazon Movies.]
  • 38. Results: Top-N Recommendation [Figure: novelty vs. number of hash functions (4 to 13) for IB-TOP-N, UB-TOP-N, IB-LSH1, IB-LSH2, UB-LSH1, and UB-LSH2. Panels: MovieLens 1M and Amazon Movies.]
  • 39. Results: Top-N Recommendation Our improvement is simple but efficient. • It improves: • Performance • Diversity • Coverage • Novelty • but it costs accuracy.
  • 40. Work Plan • LSH as a real-time stream recommendation algorithm • Dimensionality reduction methods (e.g., Matrix Factorization) • Other ANN methods: • Tree based • Clustering based
  • 41. Q & A