Slides presenting the winning approach of the Recsys Challenge 2014 workshop, presented at the RecSys 2014 conference on Oct 10, in Foster City (CA, USA) by Frédéric Guillou.
User Engagement as Evaluation: a Ranking or a Regression Problem?
1. User Engagement as Evaluation: a Ranking or a Regression Problem?
Frederic Guillou, Romaric Gaudel, Jeremie Mary and Philippe Preux
Recsys - 10/10/2014
F. Guillou, R. Gaudel; J. Mary P. Preux A Ranking or a Regression Problem? October 10, 2014 1 / 17
2. Plan
1 Introduction
2 Recsys Challenge
Description
User Engagement as Evaluation
The Retweet Effect
3 Method
LambdaMART Model
Regression Approach
4 Experiments
Experimental Results
Relevant Features
5 Conclusion
3. Introduction
Standard approaches for recommender systems
Try to predict ratings or interests by minimizing prediction error
Such approaches have flaws: good predictions do not mean good
recommendations
Learning to Rank (LTR)
Use a set of query-item pairs described by a set of input features and
a score
Try to predict the most accurate ranking of items for a query
4. Recsys Challenge / Description
Specific recommendation task
Find items with the highest user engagement
Rank a set of tweets according to their success (retweet, favorite)
Use a regression or an LTR method?
Data
Tweets from MovieTweetings dataset: tweets representing ratings on
IMDb app
Input features are metadata about the user, the tweet and the movie
6. Recsys Challenge / User Engagement as Evaluation
The overall evaluation: average of the NDCG@10 calculated for each
user
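A minimal sketch of this evaluation, as an illustration rather than the challenge's official scorer. The slides do not state the gain and discount used, so the common exponential gain (2^engagement - 1) with a log2 discount is assumed here. The value returned for a user with no engagement at all (`no_gain_value`) is also an assumption, and all function names are hypothetical.

```python
import math

def dcg_at_k(engagements, k=10):
    """DCG of the first k engagement values, listed in predicted rank order.

    Assumption: exponential gain (2^e - 1) and log2 position discount.
    """
    return sum((2 ** e - 1) / math.log2(rank + 2)
               for rank, e in enumerate(engagements[:k]))

def ndcg_at_k(engagements, k=10, no_gain_value=0.0):
    """NDCG@k for one user; `engagements` follows the predicted ranking."""
    ideal = dcg_at_k(sorted(engagements, reverse=True), k)
    if ideal == 0:
        # No tweet has any engagement: every ranking is equivalent.
        # The value to report in this case is an assumption.
        return no_gain_value
    return dcg_at_k(engagements, k) / ideal

def mean_ndcg(per_user, k=10):
    """Average of the per-user NDCG@k (the overall evaluation)."""
    return sum(ndcg_at_k(e, k) for e in per_user.values()) / len(per_user)

# Each list holds the true engagement of a user's tweets, in predicted order.
predictions = {"user_a": [1, 0], "user_b": [0, 1]}
print(mean_ndcg(predictions))
```

Here `user_a` is ranked perfectly while `user_b` has the successful tweet in second position, so the mean falls between the two per-user scores.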
Metric                      Value
Tweets                      170,285
Unique users                22,079
Unique items                13,618
Tweets with 0 engagement    162,107 (95.2%)
Unsuccessful users          17,502 (79.27%)
Figure: Distribution of user engagement for successful tweets (engagement bins: 1, 2, 3, 4, 5-14, 15-51)
Almost a binary classification problem?
Low number of successful tweets: most users only have tweets with
zero engagement
Even among successful tweets, the engagement is mostly 1
8. Recsys Challenge / User Engagement as Evaluation
Figure: Distribution of the number of tweets per user (bins as extracted: 1, 2, 3-10, 11-20, 20)
Some cases where any ranking is equivalent:
The user doesn't have any successful tweets
The user has only one tweet or tweets with the exact same engagement
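Both cases above reduce to one condition: all of the user's tweets carry the same engagement value. A small sketch of that check follows; the function names and the toy data are hypothetical.

```python
def ranking_is_informative(engagements):
    """True if reordering this user's tweets can change the ranking score.

    A single tweet, only zero-engagement tweets, or tweets that all share
    the same engagement leave exactly one distinct value.
    """
    return len(set(engagements)) > 1

def informative_users(per_user):
    """Keep only the users whose ranking can actually be evaluated."""
    return {user: e for user, e in per_user.items()
            if ranking_is_informative(e)}

per_user = {
    "no_success": [0, 0, 0],
    "single_tweet": [4],
    "same_engagement": [2, 2],
    "useful": [0, 3, 1],
}
print(sorted(informative_users(per_user)))
```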
9. Recsys Challenge / The Retweet Effect
Some of the tweets are retweets, not original tweets
This has strong consequences:
All these tweets have a strictly positive engagement
The retweet count of the tweet is directly accessible through the
original tweet metadata
Metric                                Value
Number of retweets                    1,808
Percentage among successful tweets    22.10%
Artificial successful users           28.44%
After removal of the retweet effect and cleaning users, 2,792 users
provide a clean ranking
11. Method / LambdaMART
A listwise approach
Consider the whole list of items as a single learning instance
Try to optimize a performance measure directly
A combination of two previous algorithms
MART: a pointwise approach based on a boosted tree model
Output is a linear combination of the outputs of a set of regression trees
Can be viewed as performing gradient descent in a function space using
regression trees
LambdaRank: a method based on neural networks
Express gradients based on the ranks of documents
Lambda terms: rules defining how to change the ranks of items in
order to optimize a performance measure
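To make the lambda terms concrete, here is a rough sketch of how they can be computed for one list, following the usual LambdaRank formulation with NDCG as the performance measure. It is not the authors' implementation: the exponential gain, the untruncated NDCG, the `sigma` parameter and the sign convention (a positive value means "push this item up") are assumptions, and the function names are hypothetical. In LambdaMART, the regression trees would then be fitted to values of this kind.

```python
import math

def gain(relevance):
    # Assumption: exponential gain, as in common NDCG definitions.
    return 2 ** relevance - 1

def discount(rank):
    # rank is 0-based: the top position has discount 1.
    return 1.0 / math.log2(rank + 2)

def lambda_terms(scores, relevances, sigma=1.0):
    """One lambda per item of a single list (one user or query).

    Positive lambda: the item should move up; negative: it should move down.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    rank_of = {item: rank for rank, item in enumerate(order)}
    ideal = sum(gain(r) * discount(k)
                for k, r in enumerate(sorted(relevances, reverse=True)))
    lambdas = [0.0] * n
    if ideal == 0:
        return lambdas  # no relevant item: nothing to learn from this list
    for i in range(n):
        for j in range(n):
            if relevances[i] <= relevances[j]:
                continue  # only pairs where i should be ranked above j
            # |change in NDCG| if items i and j swapped positions
            delta = abs((gain(relevances[i]) - gain(relevances[j]))
                        * (discount(rank_of[i]) - discount(rank_of[j]))) / ideal
            # Large when the pair is currently ordered wrongly.
            force = sigma * delta / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
            lambdas[i] += force
            lambdas[j] -= force
    return lambdas

# The successful tweet (relevance 1) is currently scored below the other one.
print(lambda_terms(scores=[0.1, 0.9], relevances=[1, 0]))
```

The pairwise forces cancel out within a list, so the lambdas always sum to zero: ranks are only traded between items.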
13. Method / LambdaMART
[...]ed dataset with only useful users
Very small dataset, with most users having only a few successful tweets
Lambda-MART score: 0.838
Lambda-MART cannot exactly capture the information about the
retweet effect
15. Method / Regression methods
The retweet effect can be taken into account through three steps:
1 Remove the retweet count of the original tweet from the features
2 Modify the user engagement by subtracting the retweet count of the
original tweet
3 Add the retweet count of the original tweet to the result returned by
the learned model
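The three steps above can be sketched as a thin wrapper around any regression model exposing `fit` and `predict`. This is an illustration under assumptions, not the authors' code: `RetweetWrapper`, `MeanRegressor` and `retweet_index` are hypothetical names, features are plain lists, the retweet count of the original tweet is assumed to be 0 for tweets that are not retweets, and the toy `MeanRegressor` merely stands in for Linear Regression or Random Forests.

```python
class MeanRegressor:
    """Toy stand-in for a real regressor: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class RetweetWrapper:
    """Wrap a regressor so the retweet count is handled outside the model."""

    def __init__(self, base_model, retweet_index):
        self.base_model = base_model
        self.retweet_index = retweet_index  # column of the original retweet count

    def _without_retweet(self, row):
        # Step 1: remove the retweet count from the features.
        return [v for k, v in enumerate(row) if k != self.retweet_index]

    def fit(self, X, engagement):
        # Step 2: learn the engagement minus the original retweet count.
        target = [e - row[self.retweet_index] for row, e in zip(X, engagement)]
        self.base_model.fit([self._without_retweet(r) for r in X], target)
        return self

    def predict(self, X):
        # Step 3: add the retweet count back to the model output.
        base = self.base_model.predict([self._without_retweet(r) for r in X])
        return [b + row[self.retweet_index] for b, row in zip(base, X)]


# Columns: [followers_count, retweet count of the original tweet]
X_train = [[10, 3], [20, 0], [30, 5]]
y_train = [4, 1, 5]
model = RetweetWrapper(MeanRegressor(), retweet_index=1).fit(X_train, y_train)
print(model.predict([[15, 7]]))
```

The wrapped model only has to explain the part of the engagement that the retweet count does not already give away.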
Pointwise approaches: build a regression model and try to predict
the score of items
Use of Linear Regression and Random Forests
Lin / WrapLin : 0.806 / 0.843
RF / WrapRF : 0.823 / 0.858
16. Experiments / Results
Table: NDCG@10 on test dataset
Model                              NDCG@10
Retweet                            0.806
λ-MART                             0.838
RF / WrapRF                        0.823 / 0.858
Lin / WrapLin                      0.806 / 0.843
Mean(λ-MART, Retweet)              0.876
Mean(λ-MART, WrapRF, WrapLin)      0.874
OptAvg(λ-MART, WrapRF, WrapLin)    0.876
OptAvg(λ-MART, RF, Retweet)        0.878
The improvement through wrapping shows the importance of the
retweet effect
All the best NDCG scores involve the LTR method LambdaMART
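The slides do not define Mean and OptAvg. One plausible reading, sketched below, is that Mean averages the scores of several models with equal weights, and OptAvg uses a weighted average whose weights are tuned on held-out data. The function names, the grid search and the `metric` callable (for instance a validation NDCG@10) are assumptions for illustration only.

```python
from itertools import product

def blend(score_lists, weights):
    """Weighted sum of the scores several models give to the same tweets."""
    return [sum(w * s for w, s in zip(weights, scores))
            for scores in zip(*score_lists)]

def best_weights(score_lists, metric, steps=4):
    """Grid search over non-negative weights summing to 1.

    `metric` maps blended scores to a number to maximize.
    """
    best = None
    for counts in product(range(steps + 1), repeat=len(score_lists)):
        if sum(counts) != steps:
            continue  # keep only weight vectors on the simplex
        weights = [c / steps for c in counts]
        value = metric(blend(score_lists, weights))
        if best is None or value > best[0]:
            best = (value, weights)
    return best[1]

# "Mean" of two models: equal weights.
print(blend([[1, 3], [3, 5]], [0.5, 0.5]))

# "OptAvg" with a toy metric that prefers the first blended score near 0.75.
print(best_weights([[1, 0], [0, 1]], lambda s: -abs(s[0] - 0.75)))
```

Models with very different score scales may need normalization before blending; that step is left out of this sketch.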
17. Experiments / Relevant Features with Random Forests
Figure: Random Forest feature importances in three panels (x-axis from 0.0 to 0.6), top 10 features per panel in the order shown
Panel 1 (all features): retweeted_fav, retweeted_retweet, listed_count, followers_count, retweeted_status, lang_ar, rating, votes, release_date, created_at
Panel 2 (without retweet features): followers_count, listed_count, favourites_count, lang_ar, rating, media, friends_count, user_mentions, time_tweet_scrap, created_at
Panel 3 (without user features): rating, lang_ar, time_tweet_movierelease, release_date, media, user_mentions, budget, time_tweet_scrap, lang_fa, votes
All features: features of the original tweet contribute the most
Without retweet: user features are the most important ones
Without user features: the rating given to the movie by the user
contributes the most
18. Conclusion
Recommendation problems are Learning to Rank problems
The best approach builds upon an LTR strategy
We can combine LambdaMART and Random Forests through a linear
combination
Scarcity of data makes the robustness of the model difficult to prove
The success of Twitter or IMDb should make data collection easier,
allowing bigger datasets and further analysis and experiments
19. Thank you for your attention
Questions?
21. Discussions
Temporal aspects
Difference of time interval between train/test sets influences prediction
or evaluation
Possibly a large impact of time on some features:
A user has more chances to gather followers throughout a longer period
of time
Some movies are released at a specific time of the year
A sequential approach?
The evolution of a user or the popularity of a movie can be seen as
sequential problems
NDCG does not capture this information ⇒ see metrics such as
regret
23. Thank you for your attention
Now is the real question time