2. Agenda
1. Introduction to Machine Learning and Deep Learning RecSys
2. Collaborative Recommender System
3. Methods and Techniques
4. Related Work
5. Case Study
6. Conclusion
3. Machine learning is an application of artificial intelligence that gives systems
the ability to learn and improve automatically from experience without being
explicitly programmed.
Machine learning focuses on the development of computer programs that can
access data and use it to learn for themselves.
Machine learning
4. Why does Deep Learning have potential for RecSys?
The explosive growth of e-commerce and online
environments has made the issue of information search
and selection increasingly serious [1].
9. A recommender system (sometimes replacing "system"
with a synonym such as platform or engine) is a
subclass of information filtering system that seeks to
predict the "rating" or "preference" a user would give to
an item.
Recommender System
10. Introduction
Recommender Systems are responsible for:
37% of sales
2/3 of watched movies
38% of top news visualization
11. Scalability, Cold Start, Sparsity and Accuracy.
Different users might use different scales
Finding similar users/user groups isn’t very easy
New user: No preferences available
New item: No ratings available
Demographic filtering is required
Multi-criteria ratings are required
Problems in Collaborative Filtering RecSys
13. Introduction
"We are leaving the Information Age and entering the
Recommendation Age."
Chris Anderson, "The Long Tail"
14. CHALLENGES
B. Data Sparsity
Most users do not rate most items, so the user-item rating
matrix is “sparse”; the probability of finding a set of users
with significantly similar ratings is therefore usually low.
Even the most active users will have rated only a small subset of
the overall database. Thus even the most popular items have very
few ratings.
C. First rater: an item that has not been previously rated
cannot be recommended.
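The sparsity and first-rater challenges can be made concrete by measuring them on a rating matrix. The following is a minimal sketch with NumPy, assuming a small hypothetical matrix in which 0 marks a missing rating; the function names are illustrative and not part of any library.

```python
import numpy as np

# Hypothetical 4-user x 5-item rating matrix; 0 marks a missing rating.
ratings = np.array([
    [5, 0, 0, 1, 0],
    [0, 0, 4, 0, 0],
    [0, 3, 0, 0, 0],
    [0, 0, 0, 0, 0],   # a user who has rated nothing
])

def sparsity(matrix):
    """Fraction of user-item cells that hold no rating."""
    return 1.0 - np.count_nonzero(matrix) / matrix.size

def unrated_items(matrix):
    """Column indices of items nobody has rated (the first-rater problem)."""
    return np.flatnonzero(np.count_nonzero(matrix, axis=0) == 0).tolist()

print(sparsity(ratings))
print(unrated_items(ratings))
```

Items returned by `unrated_items` cannot be recommended by a pure collaborative filter until at least one user rates them.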
15. The process of filtering or evaluating items using the opinions of other
people [3].
Uses other users' recommendations (ratings) to judge an item’s utility.
A method of making automatic predictions (filtering) about the interests of a
user by collecting preferences or taste information from many users
(collaborating).
Collaborative Filtering
16. QUESTIONS:
1. What is Collaborative Filtering?
2. How do you decide which movie to watch?
Figure 3: Overall flow of the recommender system.
Collaborative filtering
21. Method
Collaborative filtering methods are based on collecting and
analysing a large amount of information on users’ behaviour,
activity or preferences and predicting what users will like based
on their similarity to other users.
Item-to-item filtering is the most common type of
collaborative filtering (people who buy x also buy y).
User-based: like asking a friend for a recommendation.
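The "people who buy x also buy y" idea can be illustrated by counting how often two items appear in the same user's purchases. This is a minimal sketch that assumes hypothetical baskets and uses raw co-purchase counts as the similarity signal; production systems typically normalise these counts.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, one set of item ids per user.
baskets = [
    {"x", "y", "z"},
    {"x", "y"},
    {"x", "w"},
    {"y", "z"},
]

def co_purchase_counts(baskets):
    """Count how often each unordered pair of items is bought by the same user."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts

def also_bought(item, baskets, top_n=2):
    """Items most often bought together with `item`, most frequent first."""
    related = Counter()
    for (a, b), n in co_purchase_counts(baskets).items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return [other for other, _ in related.most_common(top_n)]

print(also_bought("x", baskets))
```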
22. A. Memory Based Collaborative Filtering
Memory-based CF uses user-to-user or item-to-item
correlations based on users' rating behaviour to recommend or
predict ratings for users on future items.
Correlations can be measured by various distance metrics,
such as Pearson correlation coefficient, cosine distance, and
Euclidean distance.
Memory-based collaborative filtering uses the whole training
set each time it computes a prediction, even on large data sets. [4]
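The three metrics named above can be computed directly on rating vectors. The sketch below assumes three hypothetical users who rated the same four items; the helper names are illustrative only.

```python
import numpy as np

def pearson(u, v):
    """Pearson correlation coefficient between two rating vectors."""
    return float(np.corrcoef(u, v)[0, 1])

def cosine_distance(u, v):
    """One minus the cosine similarity of the two vectors."""
    return float(1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def euclidean_distance(u, v):
    """Straight-line distance between the two rating vectors."""
    return float(np.linalg.norm(u - v))

# Hypothetical ratings of the same four items by three users.
alice = np.array([5.0, 3.0, 4.0, 4.0])
bob = np.array([3.0, 1.0, 2.0, 3.0])
carol = np.array([1.0, 5.0, 5.0, 2.0])

for name, other in (("bob", bob), ("carol", carol)):
    print(name, pearson(alice, other), cosine_distance(alice, other),
          euclidean_distance(alice, other))
```

Pearson correlation ignores each user's rating offset, so Bob, who rates everything lower than Alice but in the same order, still comes out as strongly correlated with her.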
23. B. Model Based Collaborative Filtering
Unlike memory-based CF, it does not use the whole data set to
compute a prediction.
It builds a model of the data based on a training set and uses that
model to predict future ratings.
A very successful model-based method is the Singular Value
Decomposition (SVD) which represents the data by a set of
vectors, one for each item and user after models are constructed,
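As an illustration of representing users and items by vectors, the sketch below applies a truncated SVD to a small, fully observed, hypothetical rating matrix. It assumes there are no missing ratings; real recommender data has gaps, and model-based systems usually fit the factors to the observed entries only.

```python
import numpy as np

# Hypothetical fully observed 4-user x 4-item rating matrix.
R = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

def truncated_svd(matrix, k):
    """Return user vectors, singular values and item vectors for rank k."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def reconstruct(U, s, Vt):
    """Approximate ratings as the product of user and item factors."""
    return U @ np.diag(s) @ Vt

U, s, Vt = truncated_svd(R, k=2)
R_hat = reconstruct(U, s, Vt)
print(np.round(R_hat, 2))
```

Each row of `U` is a user vector and each column of `Vt` is an item vector; once they are computed, a rating estimate only needs these small factors rather than the whole data set.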
26. Why does Deep Learning have potential for RecSys?
Feature extraction directly from the content (e.g., image, text,
audio)
Heterogeneous data handled easily
Dynamic behaviour modeling with recurrent neural
networks
More accurate representation learning of users and items
○ Natural extensions of CF
○ RecSys is a complex domain
○ Deep learning worked well in other complex domains
27. The Deep Learning era of RecSys
2007: Deep Boltzmann Machines for rating prediction
Calm before the storm
2015: A few seminal papers
2016: First Deep Learning RS workshop and papers at RecSys, KDD,
SIGIR (Special Interest Group on Information Retrieval)
2017-2018: Continued increase
28. Research directions in Deep Learning RecSys
Deep Collaborative Filtering
Learning item embeddings
Feature extraction directly from the content
Session-based recommendations with
recurrent neural networks
And their combinations...
29. CF algorithms are item-based in the sense that they
analyze item-item relations in order to produce item
similarities.
Learning Item Embeddings: CF Filtering algorithms
30. ● Learning user representation
Follows paragraph2vec
User embedding added as global context
Input: user + products purchased except
for the i-th
Target: i-th product purchased by the user
User embeddings are used to produce predictions for the user
prod2vec skip-gram model
Learning Item Embeddings
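The input/target scheme above can be sketched as a generator of training examples. This is only an illustration of how the examples are formed, under the assumption that the whole remaining purchase history serves as context (no sliding window); it does not train any embeddings, and the function name and data are hypothetical.

```python
def user2vec_training_pairs(purchases):
    """Yield (user, context_products, target_product) triples.

    For every user and every position i, the context is the user's other
    purchases and the target is the i-th purchase.
    """
    for user, products in purchases.items():
        for i, target in enumerate(products):
            context = products[:i] + products[i + 1:]
            yield user, context, target

# Hypothetical purchase histories keyed by user id.
purchases = {
    "u1": ["phone", "case", "charger"],
    "u2": ["book"],
}

pairs = list(user2vec_training_pairs(purchases))
for pair in pairs:
    print(pair)
```

In a full model, the user id and the context products would each be mapped to embedding vectors and combined to predict the target product.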
32. Why does Deep Learning have potential for RecSys?
Feature extraction directly from the content (e.g., image, text,
audio)
Heterogeneous data handled easily
Dynamic behaviour modeling with recurrent neural
networks
More accurate representation learning of users and items
○ Natural extensions of CF
○ RecSys is a complex domain
○ Deep learning worked well in other complex domains
33. Feature extraction from unstructured data
Images
● CNN
Text
● 1D CNN
● RNNs
● Weighted word embeddings
Audio/Music
● CNN
● RNN
35. Wide & Deep Learning in CRS (Cheng et al., 2016)
Joint training of two models
Deep Neural Network: focused on generalization
Linear Model: focused on memorization
Improved online performance
+2.9% deep over wide
+3.9% deep & wide over wide
Deep Collaborative Filtering
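The joint model can be sketched as a forward pass in which the linear (wide) part and the neural (deep) part each produce a logit, and the sum feeds a single sigmoid. The NumPy code below is an illustrative sketch only: the layer sizes, the single hidden layer, and the random parameters are assumptions, and no training step is shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def wide_and_deep_forward(x_wide, x_deep, params):
    """Predicted probability from the summed wide and deep logits."""
    # Wide part: a linear model over sparse features (memorization).
    wide_logit = x_wide @ params["w_wide"]
    # Deep part: a one-hidden-layer ReLU network over dense features
    # (generalization).
    hidden = np.maximum(0.0, x_deep @ params["W1"] + params["b1"])
    deep_logit = hidden @ params["w_out"]
    # Joint model: both logits feed one sigmoid output.
    return sigmoid(wide_logit + deep_logit + params["bias"])

rng = np.random.default_rng(0)
params = {
    "w_wide": rng.normal(size=6),
    "W1": rng.normal(size=(4, 8)),
    "b1": np.zeros(8),
    "w_out": rng.normal(size=8),
    "bias": 0.0,
}
x_wide = rng.integers(0, 2, size=(3, 6)).astype(float)  # 3 examples, 6 binary features
x_deep = rng.normal(size=(3, 4))                         # 3 examples, 4 dense features
probs = wide_and_deep_forward(x_wide, x_deep, params)
print(probs)
```

Because both parts share one output and one loss, the gradients of a training step would update the wide and deep weights together, which is what distinguishes joint training from an ensemble of separately trained models.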
36. 1. Explicit data collection
Asking a user to rate an item on a sliding scale.
Presenting two items to a user and asking him/her to choose the better one of them.
Asking a user to create a list of items that he/she likes.
2. Implicit data collection
Observing the items that a user views in an online store.
Keeping a record of the items that a user purchases online.
Data collection methods
37. Outbrain Click Prediction (Kaggle)
Kaggle is a platform for predictive modelling and analytics competitions in which
companies and researchers post data, and statisticians and data miners compete to
produce the best models for predicting and describing the data.
Dataset
● Sample of users' page views and clicks during 14 days in June 2016
● 2 billion page views
● 17 million click records
● 700 million unique users
● 560 sites
Wide & Deep Estimation Prediction in Collaborative RecSys
38. Wide & Deep Model code database
Source: https://github.com/gabrielspmoreira/kaggle_outbrain_click_prediction_google_cloud_ml_engine
Deep Neural Network Linear Combined Classifier Estimator for
Collaborative Recommender System.
39. Wide & Deep Model database
Source: https://github.com/gabrielspmoreira/kaggle_outbrain_click_prediction_google_cloud_ml_engine
Wide and Deep features
41. To investigate, design, implement, and evaluate a deep
learning meta-architecture for news recommendation,
in order to improve the accuracy of recommendations
provided by news portals, satisfying readers' dynamic
information needs in such a challenging recommendation
scenario.
Research Objective on Collaborative Recommender
42. Key is to find users/user groups whose interests match with the current user
More users, more ratings: better results
Can account for items dissimilar to the ones seen in the past too
Collaborative filtering Deep process
User-Based CF: compute similarity based on users.
Item-Based CF: compute similarity based on items.
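The difference between the two variants comes down to which axis of the rating matrix is compared. The sketch below assumes a small hypothetical matrix where 0 means "not rated", uses cosine similarity, and predicts a rating as a similarity-weighted average; the helper names are illustrative, and the prediction helper assumes at least one other user has rated the item.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 = not rated.
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
])

def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / norms
    return unit @ unit.T

user_sim = cosine_similarity_matrix(R)     # user-based: compare rows
item_sim = cosine_similarity_matrix(R.T)   # item-based: compare columns

def predict_user_based(R, user_sim, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    others = [u for u in range(R.shape[0]) if u != user and R[u, item] > 0]
    weights = np.array([user_sim[user, u] for u in others])
    ratings = np.array([R[u, item] for u in others])
    return float(weights @ ratings / weights.sum())

print(predict_user_based(R, user_sim, user=0, item=2))
```

User 0 is far more similar to user 1 than to user 2, so the prediction for item 2 is pulled toward user 1's low rating.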
46. Related Survey
1. Guangping and Xueli, “A Framework for Multi-Type
Recommendations”: deals with the field of web mining and concerns
some drawbacks of collaborative filtering as well as multi-type
recommendation.
CF suffers from some weaknesses: problems with new users (cold
start), data sparseness, difficulty in spotting "malicious" or
"unreliable" users, and so on. [5]
47. Related Survey
Additionally, CF can’t recommend different types of items at the
same time.
To adapt to new Web applications, such as urban computing, visit
schedule planning and so on, the authors introduced a new
recommendation framework, which combines CF and case-based
reasoning (CBR) to improve the performance of RS in Deep
learning.
Based on this framework, the authors have developed a semantic
search demo system.
48. Related Survey
2. Ibrahim Almosallam and Yi Shang [8], “A New Adaptive
Framework for Collaborative Filtering Prediction”
The paper focuses on memory-based collaborative filtering (CF).
Existing CF techniques work well on dense data but poorly on sparse
data.
To address this weakness, the paper proposes using z-scores instead
of explicit ratings and introduces a mechanism that adaptively
combines global statistics with item-based values based on the data
density level, which needs an implementation of Deep Learning.
The authors present a new adaptive framework that encapsulates various CF
algorithms and the relationships among them. [4]
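The z-score substitution mentioned above can be illustrated in a few lines. This sketch only shows per-user z-score conversion and its inverse; it does not reproduce the paper's adaptive combination of global statistics with item-based values, and the function names and sample ratings are hypothetical.

```python
import numpy as np

def user_z_scores(ratings):
    """Convert one user's ratings to z-scores; returns (z, mean, std)."""
    mean = ratings.mean()
    std = ratings.std()
    if std == 0:          # the user gave every item the same rating
        return np.zeros_like(ratings), mean, std
    return (ratings - mean) / std, mean, std

def from_z_scores(z, mean, std):
    """Map z-scores back onto the user's own rating scale."""
    return z * std + mean

# Two hypothetical users with the same preferences but different scales.
harsh = np.array([1.0, 2.0, 3.0])
generous = np.array([3.0, 4.0, 5.0])

z_harsh, m_h, s_h = user_z_scores(harsh)
z_generous, m_g, s_g = user_z_scores(generous)
print(z_harsh, z_generous)
```

After conversion the two users have identical z-scores, so a similarity computed on them no longer depends on how strict or lenient each user's personal scale is.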
49. Case study: General Steps for Deep Learning Collaborative RecSys
Data Prep
• Problem definition (user-based, item-based, ratings/binary…)
• Map-Reduce, cleansing, massaging data (input matrix)
• Training set, validation set
Normalize
• Bias removal: z-score, mean-centering, log
Similarity weights/Neighbors
• Pearson correlation coefficient
• Cosine similarity
• K-nearest neighbor
Train
• Training model (only in model-based approaches)
Predict
• Predict missing ratings
• Top-N predictions for every user
Denormalize
• Reverse of normalization
Evaluate Accuracy
• Accuracy, precision, recall
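The final Predict and Evaluate steps can be sketched for a single user as follows. The predicted scores, the set of already-rated items, and the held-out "liked" items are all hypothetical values chosen for illustration; the helper names are not from any library.

```python
import numpy as np

def top_n(predicted_scores, already_rated, n):
    """Indices of the n highest-scoring items the user has not rated yet."""
    scores = np.where(already_rated, -np.inf, predicted_scores)
    return np.argsort(-scores)[:n].tolist()

def precision_recall(recommended, relevant):
    """Precision and recall of a recommendation list against a relevant set."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical predicted ratings for one user over six items.
predicted = np.array([4.5, 2.0, 3.9, 4.8, 1.2, 4.1])
rated = np.array([False, False, False, True, False, False])  # item 3 was in training
held_out_liked = {0, 2, 4}   # items the user liked in the validation set

recs = top_n(predicted, rated, n=3)
print(recs, precision_recall(recs, held_out_liked))
```

Item 3 has the highest score but is excluded because the user already rated it; the evaluation then compares the remaining top-3 list with the validation set.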
50. Recommender Approaches
Item Hierarchy (You bought a printer, you will also need ink - BestBuy)
Collaborative Filtering, User-User Similarity (People like you who bought beer also bought diapers - Target)
Attribute-based recommendations (You like action movies starring Clint Eastwood, you might like “Good, Bad and the Ugly” - Netflix)
Collaborative Filtering, Item-Item similarity (You like Godfather so you will like Scarface - Netflix)
Social+Interest Graph Based (Your friends like Lady Gaga so you will like Lady Gaga, PYMK - Facebook, LinkedIn)
Model Based: training, singular value decomposition, implicit features
Proposed Study data flow Diagram
51. Open Source Tools
Software | Description | Language | URL
Apache Mahout | Hadoop ML library that includes Collaborative Filtering | Java | http://mahout.apache.org/
Cofi | Collaborative Filtering Library | Java | http://www.nongnu.org/cofi/
Crab | Components to create recommender systems | Python | https://github.com/muricoca/crab
easyrec | Recommender for web pages | Java | http://easyrec.org/
LensKit | Collaborative Filtering algorithms from GroupLens Research | Java | http://lenskit.grouplens.org/
MyMediaLite | Recommender system algorithms | C#/Mono | http://mloss.org/software/view/282/
SVDFeature | Toolkit for Feature-based Matrix Factorization | C++ | http://mloss.org/software/view/333/
Vogoo PHP LIB | Collaborative Filtering for personalized web sites | PHP | http://sourceforge.net/projects/vogoo/
recommenderlab | R library for developing and testing collaborative filtering systems | R | http://cran.r-project.org/web/packages/recommenderlab/index.html
Scikit-learn | Python module integrating classic ML algorithms in scientific Python packages (numpy, scipy, matplotlib) | Python | http://scikit-learn.org/stable/
Open source software for creation of Deep Learning Recommender System
52. Collaborative Filtering is the most widely used filtering technique, but it has
issues related to sparsity, accuracy, scalability, etc.
Model-based CF methods usually achieve less accurate predictions than memory-based
methods on dense data sets, where a large fraction of user-item values are available in the
training set, but perform better on sparse data sets.
The surveyed works all focus on Scalability, Cold Start, Sparsity and Accuracy, but not
much work has been done on the sparsity issue.
Internet data is growing fast today, so sparsity also increases as new
records, items, music, data, etc. are added day by day. [3]
Conclusion
55. References
[1] Pazzani, M., Billsus, D. “Content-based Recommendation Systems.” In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.):
The Adaptive Web: Methods and Strategies of Web Personalization, Lecture Notes in Computer Science, Vol. 4321.
Springer-Verlag, Berlin Heidelberg New York (2007).
[2] Breese, J.S., Heckerman, D., Kadie, C. “Empirical Analysis of Predictive Algorithms for Collaborative Filtering.” In:
Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI), Madison, Wisconsin (1998).
Morgan Kaufmann, pp. 43-52.
[3] Linden, G., Smith, B., York; Big Data, Data Mining and Machine Learning by Jared Dean; Recommender
Systems Handbook by Francesco Ricci, Lior Rokach, Bracha Shapira, Paul B. Kantor.
[4] Guangping Zhuo, Jingyu Sun and Xueli Yu. “A Framework for Multi-Type Recommendations.” Eighth International
Conference on Fuzzy Systems and Knowledge Discovery, 2007.
[5] https://www.r-bloggers.com/recommender-systems-101-a-step-by-step-practical-example-in-r/
[6] Collaborative Recommendation with Multi-Criteria Ratings (PDF). Available from:
https://www.researchgate.net/publication/228578993_Collaborative_Recommendation_with_Multi-Criteria_Ratings
[accessed Oct 10 2018].