Paper presentation at the 2016 ACM Recommender Systems conference in Boston (MIT).
Computing useful recommendations for cold-start users is a major challenge in the design of recommender systems, and additional data is often required to compensate for the scarcity of user feedback. In this paper we address this problem in a target domain by exploiting user preferences from a related auxiliary domain. Following a rigorous methodology for cold-start, we evaluate a number of recommendation methods on a dataset with positive-only feedback in the movie and music domains, both in single and cross-domain scenarios. Comparing the methods in terms of item ranking accuracy, diversity and catalog coverage, we show that cross-domain preference data is useful to provide more accurate suggestions when user feedback in the target domain is scarce or not available at all, and may lead to more diverse recommendations depending on the target domain. Moreover, evaluating the impact of the user profile size and diversity in the source domain, we show that, in general, the quality of target recommendations increases with the size of the profile, but may deteriorate with overly diverse profiles.
RecSys 2016 - Accuracy and Diversity in Cross-domain Recommendations for Cold-Start Users with Positive-only Feedback
1. Accuracy and Diversity in Cross-domain
Recommendations for Cold-Start Users
with Positive-only Feedback
Ignacio Fernández-Tobías¹, Paolo Tomeo²,
Iván Cantador¹, Tommaso Di Noia², Eugenio Di Sciascio²
¹ Autonomous University of Madrid, Spain
{ignacio.fernandezt, ivan.cantador}@uam.es
² Polytechnic University of Bari, Italy
{paolo.tomeo, tommaso.dinoia, eugenio.disciascio}@poliba.it
2. User Cold-Start Problem
Cold-Start
Extreme Cold-Start
Items
Users
Little or no information about some users
(usually new users)
3. Cross-domain recommendation
A simple way to combine different domains
is to horizontally concatenate the user-item matrices
Movies
Users
Music
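The concatenation described on this slide can be sketched as follows. This is a minimal illustration on hypothetical toy data, assuming binary like matrices stored as plain lists in which row i refers to the same user in both domains; the function name `concat_domains` is an assumption, not something taken from the paper.

```python
# Toy binary "like" matrices (hypothetical data); row i is the same user in both domains.
movies = [
    [1, 0, 1],
    [0, 0, 0],  # extreme cold-start user: no movie likes at all
    [0, 1, 0],
]
music = [
    [1, 0],
    [1, 1],
    [0, 1],
]


def concat_domains(target, source):
    """Horizontally concatenate two user-item matrices that share the same user rows."""
    if len(target) != len(source):
        raise ValueError("both matrices must have one row per user, in the same order")
    return [t_row + s_row for t_row, s_row in zip(target, source)]


# Columns 0-2 are movie items, columns 3-4 are music items.
cross_domain = concat_domains(movies, music)
```

After concatenation, the user with an empty movie row still has a non-empty profile, which is what allows a collaborative method to find neighbours for that user.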
4. Research Questions
1. Introduction
1.1. Motivation
RQ1 - How beneficial, in terms of accuracy, is it to exploit
cross-domain information for cold-start users?
RQ2 - Is cross-domain information really useful to
improve the recommendation diversity?
RQ3 - What is the impact of the size and diversity of
the source user profile on the target recommendation
accuracy?
5. Positive-only Dataset
1 - Facebook likes extracted
by using Graph API
2 - Items mapped to DBpedia
entities by using SPARQL
6. Dataset Statistics
Domain    Users    Items (Facebook pages)    Likes
Music     50K      5K                        5M
Movies    27K      4K                        800K

Metrics
Accuracy: MRR
Individual Diversity: ILD@10, BinomDiv@10
Profile Diversity: ILD
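As a rough illustration of two of the metrics listed above, the sketch below computes mean reciprocal rank (MRR) and intra-list diversity (ILD) in their usual textbook form. The Jaccard distance over item feature sets is an assumption made for the example; the slides do not say which item distance the authors used, and BinomDiv is omitted here.

```python
from itertools import combinations


def reciprocal_rank(ranked_items, relevant):
    """Return 1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings, relevant_sets):
    """Average the reciprocal rank over users (two parallel lists)."""
    values = [reciprocal_rank(r, rel) for r, rel in zip(rankings, relevant_sets)]
    return sum(values) / len(values) if values else 0.0


def jaccard_distance(a, b):
    """Assumed item distance: 1 - Jaccard similarity of two feature sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def intra_list_diversity(items, features, k=None):
    """Average pairwise distance between the (top-k) items of a list."""
    top = items[:k] if k else items
    pairs = list(combinations(top, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(features[i], features[j]) for i, j in pairs) / len(pairs)
```

With `k=10`, `intra_list_diversity` corresponds to ILD@10 on a recommendation list; applied to a user's liked items without a cutoff, the same function gives a profile-level ILD.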
7. Evaluation Setting
5-fold cross validation
Splitting:
training → 10 likes
validation → 5 likes
test → the remaining likes, at least 1
Simulation of different user profile sizes (from 0 to 10 likes)
evaluated with the same test set [Kluver and Konstan, RecSys ’14]
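The splitting and the profile-size simulation can be sketched per user as below. This is only an assumed reading of the slide: the helper names, the random shuffling, and the fixed seed are illustrative choices, and the 5-fold partition of users is left out.

```python
import random


def split_user_likes(likes, n_train=10, n_valid=5, seed=0):
    """Split one user's likes into training / validation / test lists.

    Follows the sizes on the slide: 10 training likes, 5 validation likes,
    and the remaining likes (at least 1) for testing.
    """
    if len(likes) < n_train + n_valid + 1:
        raise ValueError("user needs at least n_train + n_valid + 1 likes")
    shuffled = list(likes)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def nested_profiles(train):
    """Return profiles of size 0..len(train); each extends the previous by one like."""
    return [train[:size] for size in range(len(train) + 1)]
```

Every profile returned by `nested_profiles` would be evaluated against the same `test` list, so results for different profile sizes stay comparable.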
8. Recommendation algorithms
3. Recommendation models
3.3. Baseline models
• Popularity-based (POP)
• User-based Nearest Neighbors (UNN)
• Item-based Nearest Neighbors (INN)
• Implicit Matrix Factorization (IMF) [Hu et al., 2008]
• HeteRec [Yu et al., 2014]
• PathRank [Lee et al., 2012]
Prefix “CD-” indicates cross-domain version (e.g. CD-UNN)
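To make the difference between a single-domain method and its "CD-" version concrete, here is a small sketch of a user-based nearest-neighbour recommender over sets of liked items. It is not the authors' implementation: the cosine similarity on binary profiles, the `movie:`/`music:` identifier prefixes, the function names and the toy data are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sets of liked items (binary feedback)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend_unn(user, profiles, target_items, k=2, n=10):
    """Rank unseen target-domain items by the summed similarity of the k nearest users."""
    neighbours = sorted(
        ((cosine(profiles[user], likes), other)
         for other, likes in profiles.items() if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for sim, other in neighbours:
        if sim <= 0.0:
            continue  # users with no overlap contribute nothing
        for item in profiles[other] & target_items:
            if item not in profiles[user]:
                scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]


# Hypothetical cross-domain profiles: u1 has music likes but no movie likes.
cross_profiles = {
    "u1": {"music:a", "music:b"},
    "u2": {"music:a", "music:b", "movie:x"},
    "u3": {"music:c", "movie:y"},
}
movie_items = {"movie:x", "movie:y"}

# Single-domain view: keep only movie likes, so u1's profile becomes empty.
single_profiles = {u: {i for i in likes if i.startswith("movie:")}
                   for u, likes in cross_profiles.items()}

cd_unn = recommend_unn("u1", cross_profiles, movie_items)
unn = recommend_unn("u1", single_profiles, movie_items)
```

In this toy case the single-domain variant has nothing to go on for `u1`, while the cross-domain variant finds `u2` through shared music likes and can suggest a movie.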
10. Which algorithm is more accurate?
…and which one provides more diversity?
11. Impact of source profile size
12. Impact of source profile diversity
13. Conclusions
5. Conclusions and future work
Cross-domain recommendation may improve accuracy (RQ1), but
does not always improve diversity (RQ2)
The choice of the recommendation algorithm depends on the
domain and the amount of user information available
Recommendation accuracy increases with the size of the source profile,
but may deteriorate with its diversity (RQ3)
Future work
Investigating which characteristics of the datasets could explain
the differences in the obtained results
Extending the analysis to more domains and sophisticated
methods
Editor's notes
This work shows some of the results obtained during my visit to the Information Retrieval Group of the Autonomous University of Madrid, where I worked with Iván Cantador and Ignacio Fernández-Tobías.
We evaluated some state-of-the-art algorithms in terms of recommendation accuracy and diversity in cold-start user scenarios.
In particular, we exploited cross-domain information using a dataset composed of Facebook likes, and thus with positive-only feedback.
We have already seen in the previous talks the definition of a cold-start user. Here we also consider the extreme cold-start situation: users with no information at all.
Finding accurate recommendations for cold-start users is obviously a non-trivial problem.
A possible solution is to exploit additional information about the users.
A simple way to combine different domains is to horizontally concatenate the corresponding matrices.
In this work, we used Facebook likes of movie and music pages.
Therefore, we identified three main research questions:
First of all, what is the impact of using cross-domain for recommending accurate and diverse items to cold-start users?
We know that diversity is important for user satisfaction but, despite some conjectures, no previous work has evaluated the diversity of cross-domain recommendations.
IVAN: Which are the (addressed) questions? I would clearly state them (RQ1:…, RQ2:… here and give the corresponding answers in the conclusions slide)
For that, take into account the title/keywords: cold-start, positive-only feedback, cross-recommendation
To simulate different user profile sizes (from 0 to 10 likes), we repeat the training and the evaluation eleven times, starting with no likes in the training set and then adding them one at a time. Each profile size is evaluated with the same test set, to avoid any potential bias in the evaluation due to different test set sizes.
Let’s see the difference for each method with and without cross-domain information.
In the paper there is a table with all the experimental results.
For the sake of presentation, here we can see a summarized table, where the green up arrow indicates that adding cross-domain information improves the results of the method in that row.
As we can see, some methods benefit from using cross-domain information, while others are penalized. It is noteworthy that the improvements in terms of diversity strongly depend on the domain: using music as the source, movie recommendations are less diverse; conversely, using the movie domain as the source, the methods generally give more diverse music recommendations, except for PathRank.
Then we looked for the more accurate algorithm.
In general, the quality of target recommendations improves as more information about the user’s preferences is available in the source domain.
The only exception is IMF: we can see a slight decrease for users with more than 100 likes.
Conversely, source profile diversity and recommendation quality seem almost inversely proportional, in particular when music is used to recommend movies.