In this talk, we provide an overview of how Deep Learning techniques have recently been applied to Recommender Systems. I also give a brief view of my ongoing PhD research on News Recommender Systems with Deep Learning.
7. Why does Deep Learning have potential for RecSys?
● Feature extraction directly from the content (e.g., image,
text, audio)
● Heterogeneous data handled easily
● Dynamic behaviour modeling with RNNs
● More accurate representation learning of users and items
○ Natural extensions of CF
● RecSys is a complex domain
○ Deep learning worked well in other complex domains
8. The Deep Learning era of RecSys
2007 - Restricted Boltzmann Machines for rating prediction
... calm before the storm ...
2015 - a few seminal papers
2016 - first DLRS workshop and papers at RecSys, KDD, SIGIR
2017-2018 - continued increase
9. Research directions in DL-RecSys
● Learning Item embeddings
● Feature Extraction directly from the content
● Session-based recommendations with RNNs
● Deep Collaborative Filtering
● ... and their combinations
11. Item embeddings
● Embedding: a (learned) real value vector representing an entity
● Also known as Latent feature vector / (Latent) representation
● Similar entities’ embeddings are similar
● Uses in recommenders:
○ Initialization of item representation in more advanced
algorithms
○ Item-to-item recommendations
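Item-to-item recommendation over learned embeddings reduces to a nearest-neighbor search by embedding similarity. A minimal sketch in plain Python (the toy 3-d embeddings and item names are illustrative):

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def most_similar(item_id, embeddings, k=2):
    # Rank all other items by cosine similarity to the query item
    query = embeddings[item_id]
    scores = [(other, cosine(query, emb))
              for other, emb in embeddings.items() if other != item_id]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]

# Toy embeddings: two "sports" items close together, one "politics" item apart
embeddings = {
    "sports_article_1": [0.9, 0.1, 0.0],
    "sports_article_2": [0.8, 0.2, 0.1],
    "politics_article": [0.0, 0.1, 0.9],
}
print(most_similar("sports_article_1", embeddings, k=1))
```

With good embeddings, "similar entities have similar embeddings" makes this top-k list a direct item-to-item recommendation.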
Learning Item Embeddings
12. Prod2Vec (Grbovic et al., 2015)
● Based on word2vec and paragraph2vec
● Skip-gram model on products
○ Input: i-th product purchased by the user
○ Context: the other purchases of the user
● Learning user representation
○ Follows paragraph2vec
○ User embedding added as global context
○ Input: user + products purchased except for
the i-th
○ Target: i-th product purchased by the user
User embeddings used for user-to-product predictions
prod2vec skip-gram model
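In skip-gram terms, each purchased product predicts the user's other nearby purchases. A minimal sketch of the training-pair generation only (product IDs are made up; a real implementation would feed these pairs to a word2vec-style model):

```python
def skipgram_pairs(purchase_sequence, window=2):
    # For each product, pair it with the other products in its context window,
    # as in the prod2vec skip-gram setup
    pairs = []
    for i, target in enumerate(purchase_sequence):
        lo, hi = max(0, i - window), min(len(purchase_sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, purchase_sequence[j]))
    return pairs

# One user's purchase sequence (illustrative product IDs)
print(skipgram_pairs(["p1", "p2", "p3"], window=1))
# → [('p1', 'p2'), ('p2', 'p1'), ('p2', 'p3'), ('p3', 'p2')]
```

The paragraph2vec-style user variant adds the user ID as a global context token shared by all of that user's pairs.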
14. Feature extraction from unstructured data
● Images: CNN
● Text: 1D CNN, RNNs, weighted word embeddings
● Audio/Music: CNN, RNN
15. Content Features in Hybrid Recommenders
● Initializing
○ Obtain an item representation based on metadata
○ Use this representation as initial item features
● Regularizing
○ Obtain metadata-based representations
○ Keep the interaction-based representation close to the metadata-based
one (add their difference as a regularizing term to the loss)
● Joining
○ Have the item feature vector be a concatenation of:
■ a fixed metadata-based part
■ a learned interaction-based part
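The "regularizing" strategy above can be sketched as an extra loss term that penalizes the distance between the interaction-based and metadata-based item vectors (the toy vectors and the weight `lam` are illustrative, not from any specific paper):

```python
def regularization_term(interaction_vec, metadata_vec, lam=0.1):
    # Squared L2 distance between the two item representations, scaled by lam;
    # during training this is added to the recommender's main loss, pulling
    # the learned interaction-based vector toward the metadata-based one
    sq_dist = sum((a - b) ** 2 for a, b in zip(interaction_vec, metadata_vec))
    return lam * sq_dist

print(regularization_term([1.0, 0.0], [0.0, 1.0], lam=0.5))  # → 1.0
```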
17. Wide & Deep Learning (Cheng et al., 2016)
● Joint training of two models
○ Deep Neural Network - focused on generalization
○ Linear Model - focused on memorization
● Improved online performance
○ +2.9% deep over wide
○ +3.9% deep & wide over wide
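Joint training combines the two models' outputs into a single logit before the sigmoid. A minimal sketch of the scoring step, where a one-layer ReLU network stands in for the DNN and all weights are illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def wide_deep_score(wide_features, deep_features,
                    wide_weights, deep_weights, bias=0.0):
    # Wide part: linear model over (typically crossed) sparse features
    wide_logit = sum(w * x for w, x in zip(wide_weights, wide_features))
    # Deep part: a single ReLU layer stands in for the full DNN here
    hidden = [max(0.0, sum(w * x for w, x in zip(row, deep_features)))
              for row in deep_weights]
    deep_logit = sum(hidden)
    # Joint prediction: one sigmoid over the summed logits,
    # so both parts are trained against the same click label
    return sigmoid(wide_logit + deep_logit + bias)

score = wide_deep_score([1.0, 0.0], [0.5, 0.5],
                        wide_weights=[0.3, -0.2],
                        deep_weights=[[0.4, 0.4]])
print(round(score, 3))
```

The key design choice is joint training (both parts see the same gradient) rather than ensembling two separately trained models.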
Deep Collaborative Filtering
18. Outbrain Click Prediction - Kaggle competition
Dataset
● Sample of users' page views and clicks over 14 days in June 2016
● 2 billion page views
● 17 million click records
● 700 million unique users
● 560 sites
Can you predict which recommended content each user will click?
19. Wide & Deep Model example
Outbrain Click Prediction - Data Model
Feature groups: Numerical, Spatial, Temporal, Categorical, Target
20. Wide & Deep Model example
Source: https://github.com/gabrielspmoreira/kaggle_outbrain_click_prediction_google_cloud_ml_engine
DNNLinearCombinedClassifier estimator
21. Wide & Deep Model example
Wide and Deep features
22. Wide & Deep Model example
POC Results
Framework / Platform           | Model            | Mean Average Precision (MAP)
Vowpal Wabbit                  | Linear           | 0.6751
Tensorflow / Google ML Engine  | Linear (Wide)    | 0.6779
Tensorflow / Google ML Engine  | Deep             | 0.6674
Tensorflow / Google ML Engine  | Wide & Deep      | 0.6765
25. News Recommender Systems
News RS Challenges
● Sparse user profiling (Li et al., 2011; Lin et al., 2014; Peláez et al., 2016)
● Fast-growing number of items (Peláez et al., 2016; Mohallick and Özgöbek, 2017)
● Accelerated decay of item value (Das et al., 2007)
● Users' preferences shift (Peláez et al., 2016; Epure et al., 2017)
26. To investigate, design, implement, and evaluate a
deep learning meta-architecture for news
recommendation, in order to improve the accuracy
of recommendations provided by news portals,
satisfying readers' dynamic information needs in
such a challenging recommendation scenario.
Research Objective
27. Conceptual model of factors affecting news relevance
[Diagram] News relevance is modeled as a function of:
● News article - static properties (topics, entities, publisher) and dynamic properties (recency, popularity)
● User - current context (time, location, device, referrer) and interests (long-term and short-term)
● Global factors - seasonality, breaking events, popular topics
29. Meta-Architecture Requirements
● RQ1 - to provide personalized news recommendations in extreme
cold-start scenarios, as most news articles are fresh and most users cannot be
identified
● RQ2 - to use Deep Learning to automatically learn news
representations from textual content and news metadata, minimizing
the need for manual feature engineering
● RQ3 - to leverage user session information for session-based
recommendations, as the sequence of news read may indicate the user's
short-term preferences
● RQ4 - to leverage the user's past session information, when available, to
model long-term interests for session-aware recommendations
30. Meta-Architecture Requirements
● RQ5 - to leverage users' contextual information as a rich data source,
given the scarcity of information about the user
● RQ6 - to explicitly model contextual news properties – popularity and
recency – as those are important factors in the news interest life cycle
● RQ7 - to support an increasing number of new items and users by
incremental model retraining (online learning), without the need to
retrain on the whole historical dataset
● RQ8 - to provide a modular structure for news recommendation,
allowing its modules to be instantiated by different and increasingly
advanced neural network architectures and methods
31. CHAMELEON Meta-Architecture for News RS
[Architecture diagram] Two modules:
● Article Content Representation (ACR) - runs when a news article is published. Inputs: content word embeddings (e.g. "New York is a multicultural city, ...") and article metadata attributes (publisher, category, tags, entities). Sub-modules: Textual Features Representation (TFR) and Metadata Prediction (MP). Output: the Article Content Embedding.
● Next-Article Recommendation (NAR) - runs when a user reads a news article. Inputs: the active article's content embedding, article context (popularity, recency), user context (time, location, device), the active user session, users' past sessions, and candidate next articles (positive and negative). Sub-modules: Contextual Article Representation (CAR), Session Representation (SR), and Recommendations Ranking (RR). Outputs: the Predicted Next-Article Embedding and the recommended articles.
32. CHAMELEON - ACR module
[Diagram: ACR module - content word embeddings and article metadata attributes feed the TFR and MP sub-modules to produce the Article Content Embedding]
T-SNE visualization of article embeddings colored by category, with similar articles highlighted (e.g. "Sports", "Berlin")
33. Article Embeddings Similarity Evaluation (e.g. "sports")

Article Content (Title | Kicker | Description) | Similarity
Kommentar: Generation Mut | Kommentar von Alexander Wölffing zur Handball-Nationalmannschaft bei der EM | Der offensive Ansatz des Deutschen Handball-Bundes um Bob Hanning zahlt sich aus. Alexander Wölffing, stellvertretender Chefredakteur und Leiter Sports bei SPORT1, kommentiert. | -
"Seahawks werden Brady bearbeiten" | NFL-Legende Hines Ward analysiert den Super Bowl | Hines Ward favorisiert Seattle. Der frühere MVP des Endspiels sieht aber Vollmer und Co. sowie einen X-Faktor als Chance der Patriots. Für SPORT1 analysiert er den Super Bowl. | 0.970
Bald kein "Hack-a-Drummond" mehr? | Foul als taktisches Mittel: NBA will "hack-a-player"-Regel ändern | Andre Drummond und alle schlechten Freiwerfer in der NBA wird es freuen: Die Liga will Fouls als taktisches Mittel anders bestrafen, um den Sport nicht kaputt zu machen. | 0.964
Nummer 2: Leno findet sich damit ab | Bernd Leno spricht im SPORT1-Interview über Manuel Neuer | Bayer Leverkusens Bernd Leno nimmt bei SPORT1 Stellung zur Torhüter-Situation im DFB-Team sowie Vorbild Manuel Neuer - und spricht auch über Fehler und Frust. | 0.962
34. CHAMELEON - NAR module
[Diagram: the NAR module highlighted within the meta-architecture]
35. CHAMELEON - NAR module
[Diagram: NAR module - sessions in a batch]
Sessions in a batch, e.g. S1 = (I1,1 ... I1,5), S2 = (I2,1, I2,2), S3 = (I3,1, I3,2, I3,3)
Input: I1,1 I1,2 I1,3 I1,4
Expected Output (labels): I1,2 I1,3 I1,4 I1,5
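The input/label layout above can be sketched as follows: for each session in the batch, the inputs are all items but the last, and the labels are the same sequence shifted by one, i.e. each step predicts the next click:

```python
def next_click_batch(sessions):
    # For each session: input = items[:-1], label = items[1:] (the next click)
    inputs = [items[:-1] for items in sessions]
    labels = [items[1:] for items in sessions]
    return inputs, labels

sessions = [["I1,1", "I1,2", "I1,3", "I1,4", "I1,5"],
            ["I2,1", "I2,2"],
            ["I3,1", "I3,2", "I3,3"]]
inputs, labels = next_click_batch(sessions)
print(inputs[0])  # → ['I1,1', 'I1,2', 'I1,3', 'I1,4']
print(labels[0])  # → ['I1,2', 'I1,3', 'I1,4', 'I1,5']
```

In a real RNN batch the variable-length sessions would additionally be padded or bucketed to a common length.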
36. CHAMELEON - NAR module
[Diagram: NAR module - negative sampling]
Negative Sampling strategy
● Buffer: articles read by any user in the last hour
● Negative samples: articles read in other user sessions in the batch
● Positive sample: the next article read by the user in their session
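A minimal sketch of this sampling scheme (function and variable names are illustrative): negatives for one session are drawn from items clicked in the other sessions of the batch plus the recent-clicks buffer, excluding the session's own items:

```python
import random

def sample_negatives(batch_sessions, session_idx, recent_buffer,
                     n_neg=2, seed=42):
    # Candidate negatives: items from the other sessions in the batch,
    # plus the buffer of articles read by any user in the last hour
    own_items = set(batch_sessions[session_idx])
    candidates = {item
                  for i, sess in enumerate(batch_sessions) if i != session_idx
                  for item in sess}
    candidates |= set(recent_buffer)
    candidates -= own_items  # never sample the user's own clicks as negatives
    rng = random.Random(seed)  # fixed seed only for reproducibility here
    return rng.sample(sorted(candidates), min(n_neg, len(candidates)))

batch = [["a1", "a2"], ["a3"], ["a4", "a5"]]
print(sample_negatives(batch, session_idx=0, recent_buffer=["a6"], n_neg=2))
```

Sampling from in-batch clicks skews negatives toward currently popular articles, which matches the buffer-based strategy described above.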
37. CHAMELEON - NAR module
[Diagram: NAR module - Recommendations Ranking (RR) sub-module]
Recommendations Ranking (RR) sub-module:
● Eq. 4 - Relevance score of an item for a user session
● Eq. 5 - Cosine similarity
● Eq. 6 - Softmax over the relevance scores (Huang et al., 2013)
● Eq. 7 - Loss function (Huang et al., 2013)
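Following the Huang et al. (2013) formulation referenced above, the ranking objective can be sketched as a softmax over cosine relevance scores between the predicted session embedding and the candidate articles (positive plus sampled negatives), minimizing the negative log-likelihood of the clicked one. The smoothing factor `gamma` and the toy vectors are illustrative:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def ranking_loss(session_emb, positive_emb, negative_embs, gamma=10.0):
    # Relevance score = smoothed cosine similarity (Eqs. 4-5);
    # softmax over positive + negatives (Eq. 6); -log P(positive) (Eq. 7)
    scores = [gamma * cosine(session_emb, positive_emb)]
    scores += [gamma * cosine(session_emb, neg) for neg in negative_embs]
    exp_scores = [math.exp(s) for s in scores]
    p_positive = exp_scores[0] / sum(exp_scores)
    return -math.log(p_positive)

loss = ranking_loss([1.0, 0.0],
                    positive_emb=[0.9, 0.1],
                    negative_embs=[[0.0, 1.0], [-1.0, 0.0]])
print(round(loss, 4))
```

The loss approaches zero when the positive candidate's score dominates the negatives', as in this example.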
38. CHAMELEON - NAR module loss function
[Code screenshot] Recommendation loss function implemented in TensorFlow
39. ● Predict the next article the user will read in a session
● Train on interactions up to a given hour, and evaluate predictions of
next-item clicks on the sessions of the following hour
● Ranking Metrics
○ Recall@3 - scores when the actually clicked article is among the top-3
items in the recommended ranking list
○ NDCG@3 - takes into account the position of the actually clicked item
in the ranking (position 1 scores higher than position 2)
● Evaluation Benchmarks
○ Popular Recent - keeps a buffer of all clicks in the last hour and
returns the most popular articles in that time window
○ Co-occurrent - recommends the articles most commonly read
together in user sessions
○ Content-Based - returns the articles whose content embeddings are the
most similar
CHAMELEON - Offline evaluation protocol
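For a single next-click prediction (one relevant item per click), the two metrics above can be sketched as:

```python
import math

def recall_at_k(ranked_items, clicked_item, k=3):
    # 1.0 if the actually clicked article is among the top-k, else 0.0
    return 1.0 if clicked_item in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, clicked_item, k=3):
    # With a single relevant item the ideal DCG is 1, so
    # NDCG@k = 1 / log2(rank + 1) if ranked in the top-k (1-based), else 0
    if clicked_item not in ranked_items[:k]:
        return 0.0
    rank = ranked_items.index(clicked_item) + 1
    return 1.0 / math.log2(rank + 1)

ranking = ["a7", "a2", "a9", "a4"]
print(recall_at_k(ranking, "a2"))  # → 1.0
print(ndcg_at_k(ranking, "a2"))    # → ~0.631 (1 / log2(3))
```

Averaging these per-click scores over all test sessions gives the reported Recall@3 and NDCG@3.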
47. ● ACR module
○ Try different word embeddings (Word2Vec, GloVe)
○ Try different textual feature extractors (CNNs, RNNs) to generate
article embeddings
● NAR module
○ Test different RNN architectures (LSTM, GRU)
○ Try different negative sampling strategies
○ Leverage users' past sessions to initialize RNNs for new sessions
○ Evaluate on different (and larger) news datasets
○ Hyperparameter tuning
Next steps