This deck is an overview of research into context-aware recommender systems built on factorization models. It traces the development of factorization methods from early context-aware tensor models (iTALS, iTALSx) to a general factorization framework, focusing on modeling implicit feedback and context and on improving scalability with conjugate gradient learning. Future work includes estimating the utility of context dimensions, modeling continuous context variables, and optimizing models with pairwise ranking loss functions.
Factorization Methods for Utilizing Additional Information in Recommendations
1. Utilizing additional information in
factorization methods
- Research overview -
- 11 April 2014 -
Balázs Hidasi
balazs@hidasi.eu
2. About me
• Data mining researcher at Gravity R&D
• PhD student at BME (BUTE)
• Research interests:
• Machine learning & Data mining
• Algorithm research and development
• Currently: recommender systems
• Previously: time series classification
3. Gravity R&D
• Recommender service provider, based in Hungary
• Founded by team Gravity after the Netflix Prize
• Started working there: January 2010
• Data analysis
• Algorithm development & implementation
• Research
4. Budapest University of Technology
and Economics
• Leading tech university in Hungary
• Faculty of Electrical Engineering and Informatics
• Computer science and engineering B.Sc./M.Sc.
• Ph.D. student since September 2011.
• Department of Telecommunications and Media
Informatics
• Data Science and Content Technologies Laboratory (DC Lab)
5. RecSys research – aims & roots
• Aims: developing novel algorithms that let factorization models use additional information to improve recommendation accuracy on implicit feedback based recommendation tasks
• Roots:
• Implicit feedback
• Context
• Factorization
• In addition:
• ALS learning
• Recall based evaluation
6. Implicit feedback
• Transactions provide no explicit user preference
• View, buy, etc.
• Presence of an event → noisy positive feedback
• Absence of an event → ?
• Negative feedback is not available
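The bullets above describe the core difficulty of implicit feedback: observed events are noisy positives, and missing events are unknowns rather than negatives. One common way to encode this (the preference/confidence split of Hu, Koren & Volinsky; the event counts and the value of alpha below are illustrative, not necessarily the scheme used in this work) is:

```python
import numpy as np

# Hypothetical event counts: rows = users, columns = items;
# r[u, i] = number of times user u viewed/bought item i.
r = np.array([[3, 0, 1],
              [0, 5, 0]])

# Presence of an event -> noisy positive feedback (preference 1);
# absence -> preference 0, but with low confidence, since it is
# unknown rather than a true negative.
preference = (r > 0).astype(float)

# One common confidence scheme: confidence grows with the number
# of observed events, so repeated views count as stronger evidence.
alpha = 40.0
confidence = 1.0 + alpha * r
```

Unobserved cells keep the baseline confidence of 1, so they contribute only weakly to the loss, which is how the "absence of an event?" ambiguity is typically softened.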
10. Recall based evaluation
• Recall: the number of items that are both relevant and recommended, divided by the total number of relevant items
• @N: only the top N items are considered
• Nowadays less common in RecSys than rank-aware metrics such as MAP and NDCG
• Practical point of view
• Rank does not matter as long as the item is shown
• TopN list presented in chunks
• TopN list should contain the relevant items
• True for many practical scenarios; there are exceptions
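As a concrete reading of the definition above, a minimal recall@N computation (the function name is mine) can look like this; note that, matching the practical point of view on the slide, the rank of an item within the top N does not affect the score:

```python
def recall_at_n(recommended, relevant, n):
    """Recall@N: the fraction of relevant items that appear among the
    top-N recommended items. Rank within the top N does not matter."""
    if not relevant:
        return 0.0
    top_n = set(recommended[:n])
    return len(top_n & set(relevant)) / len(relevant)
```

For example, `recall_at_n(["a", "b", "c", "d"], ["b", "x"], 3)` returns 0.5: one of the two relevant items ("b") appears in the top 3.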
11. RecSys research – overview
• Injecting additional info into MF (through initialization)
• Context-aware methods: iTALS, iTALSx
• Scalability improvement: CD/CG learning
• General factorization framework
• Modeling context
• Pairwise ranking loss with ALS
16. LS/CD/CG comparison
• Little to no degradation in recall
• Training time: CG < CD < LS
• CD is unstable for models that use higher-order interaction terms
[Figure: running time (s, 0 to 1000) vs. number of features K (5 to 100) for iTALS, iTALS-CG (N_I = 2), and iTALS-CD (N_I = 2)]
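The CG variant above replaces the exact least-squares solve in each ALS step with a few conjugate gradient iterations (the legend's N_I = 2). A minimal sketch of such an approximate solver for a symmetric positive definite system, with an illustrative function name and example system of my choosing:

```python
import numpy as np

def cg_solve(A, b, n_iter=2, x0=None):
    """Approximate the solution of A x = b (A symmetric positive
    definite) with a fixed small number of conjugate gradient
    iterations instead of an exact least-squares solve."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs = r @ r
    for _ in range(n_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)   # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < 1e-10:
            break
        p = r + (rs_new / rs) * p  # new A-conjugate direction
        rs = rs_new
    return x
```

Each iteration costs only a matrix-vector product, which is why a small fixed N_I gives the training-time ordering CG < CD < LS reported above while losing little accuracy.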
17. General factorization
framework
• Goal: fully flexible framework that allows
experimentation with arbitrary linear factorization
models
• State-of-the-art methods use fixed models or model classes
• Designed for implicit but supports explicit feedback as
well
• wRMSE+ALS based learning
• Approximate LS with CG for better scaling
• No restrictions on the number and meaning of the used dimensions
• Even items and/or users can be omitted
• Duplication of dimensions is allowed
20. User-item-context relations
• Basically 3 types:
• UCI: user-item relation is reweighted by the feature
vector of the current context
• IC: context dependent item bias
• UC: context dependent user bias
• Does not play a role in ranking
• Different context dimensions for different roles
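A minimal sketch of how the three relation types could enter a score function. The symbols (p_u, q_i, v_c), the elementwise reweighting, and the random values are my assumptions based on the bullets above, not the exact model equations:

```python
import numpy as np

K = 4  # number of latent features (illustrative)
rng = np.random.default_rng(0)
p_u = rng.normal(size=K)  # user feature vector
q_i = rng.normal(size=K)  # item feature vector
v_c = rng.normal(size=K)  # context feature vector

# UCI: the user-item relation, reweighted elementwise by the
# feature vector of the current context.
uci = np.sum(p_u * v_c * q_i)

# IC: context-dependent item bias.
ic = v_c @ q_i

# UC: context-dependent user bias. For a fixed user and context this
# term is identical for every candidate item, which is why it does
# not play a role in ranking.
uc = v_c @ p_u

score = uci + ic + uc
```

Only the UCI and IC terms vary with the item, so they alone determine the top-N list; the UC term shifts all item scores by the same constant.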
21. Context modeling – Utility of
standard context dimensions
• Quality of context dimension
• Huge impact on accuracy
• Can we measure it?
• Which context for which role?
• CA item bias / CA user bias / reweighting user-item
relations
• Can it be predetermined?
• Usefulness of a context dimension
• Given a number of already defined dimensions
• Can it be measured without training?
22. Context modeling – Non-standard
context dimensions
• Composite context
• E.g. transactions of the current session
• General factorization framework handles it
• Continuous context (& ordered context)
• E.g. time or distance based context
• Problems:
• Context-state rigidness
• Context-state ordinality
• Context-state continuity
• A solution: to be presented Sunday at CaRR 2014
23. Summary
• Context-aware factorization methods mainly for the implicit
feedback based problem
• From improved MF,
• through context-aware tensor methods
• to a fully flexible general framework
• On the way:
• Improving scalability
• Future:
• Context modeling
• Automatic model learning
• Option for pairwise ranking loss
24. Thank you for your attention!
Papers & slides available
through my website:
http://hidasi.eu
[Roadmap diagram: MF initialization → iTALS, iTALSx → scalability (CG/CD) → general framework → model learning, pairwise ranking loss, context utility estimation, continuous context modeling; foundations: implicit feedback, context, factorization, (ALS)]