SlideShare ist ein Scribd-Unternehmen logo
1 von 82
Downloaden Sie, um offline zu lesen
Factorization Models & Polynomial Regression Factorization Machines Applications Summary 
Factorization Machines 
Steen Rendle 
Current aliation: Google Inc. 
Work was done at University of Konstanz 
MLConf, November 14, 2014 
Steen Rendle 1 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Models 
Linear/ Polynomial Regression 
Comparison 
Factorization Machines 
Applications 
Summary 
Steen Rendle 2 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization 
Example for data: Matrix Factorization: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
^ Y := W Ht ; W 2 RjUjk ;H 2 RjIjk 
k is the rank of the reconstruction. 
Steen Rendle 3 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization 
Example for data: Matrix Factorization: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
^ Y := W Ht ; W 2 RjUjk ;H 2 RjIjk 
^y(u; i) = ^yu;i = 
Xk 
f =1 
wu;f hi ;f = hwu; hi i 
k is the rank of the reconstruction. 
Steen Rendle 3 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization  Extensions 
Example for data: Examples for models: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
^yMF(u; i ) := 
Xk 
f =1 
vu;f vi ;f = hvu; vi i 
Steen Rendle 4 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization  Extensions 
Example for data: Examples for models: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
^yMF(u; i ) := 
Xk 
f =1 
vu;f vi ;f = hvu; vi i 
^ySVD++(u; i) := 
* 
vu + 
X 
j2N(u) 
vj ; vi 
+ 
^yFact-KNN(u; i ) := 
1 
jR(u)j 
X 
j2R(u) 
ru;j hvi ; vj i 
Steen Rendle 4 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization  Extensions 
Example for data: Examples for models: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
^yMF(u; i ) := 
Xk 
f =1 
vu;f vi ;f = hvu; vi i 
^ySVD++(u; i) := 
* 
vu + 
X 
j2N(u) 
vj ; vi 
+ 
^yFact-KNN(u; i ) := 
1 
jR(u)j 
X 
j2R(u) 
ru;j hvi ; vj i 
Rating 
Matrix 
time 
^ytimeSVD(u; i ; t) := hvu + vu;t ; vi i 
^ytimeTF(u; i ; t) := 
Xk 
f =1 
vu;f vi ;f vt;f 
: : : 
Steen Rendle 4 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Tensor Factorization 
Example for data: Examples for models: 
Triples of Subject, Predicate, Object 
^yPARAFAC(s; p; o) := 
Xk 
f =1 
vs;f vp;f vo;f 
^yPITF(s; p; o) := hvs ; vpi + hvs ; voi + hvp; voi 
: : : 
Steen Rendle 5 / 53 
[illustration from Drumond et al. 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Sequential Factorization Models 
Example for data: Examples for models: 
Bt Bt­3 
b 
b a b 
a c 
User 1 ? 
c 
e 
c c 
a 
? 
d c e e ? 
? 
User 2 
User 3 
User 4 
Bt­2 
Bt­1 
a 
^yFMC(u; i ; t) := 
X 
l2Bt1 
hvi ; vl i 
^yFPMC(u; i ; t) := hvu; vi i + 
X 
l2Bt1 
hvi ; vl i 
: : : 
Steen Rendle 6 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Models: Discussion 
I Advantages 
I Can estimate interactions between two (or more) variables even if 
the cross is not observed. 
I E.g. user  movie, current product  next product, user  query  
url, : : : 
Steen Rendle 7 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Models: Discussion 
I Advantages 
I Can estimate interactions between two (or more) variables even if 
the cross is not observed. 
I E.g. user  movie, current product  next product, user  query  
url, : : : 
I Downsides 
I Factorization models are usually build speci
cally for each problem. 
I Learning algorithms and implementations are tailored to individual 
models. 
Steen Rendle 7 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Models 
Linear/ Polynomial Regression 
Comparison 
Factorization Machines 
Applications 
Summary 
Steen Rendle 8 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Data and Variable Representation 
Many standard ML approaches work with real valued feature vectors as 
input. It allows to represent, e.g.: 
I any number of variables 
I categorical domains by using dummy indicator variables 
I numerical domains 
I set-categorical domains by using dummy indicator variables 
Using this representation allows to apply a wide variety of standard 
models (e.g. linear regression, SVM, etc.). 
Steen Rendle 9 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Linear Regression 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation: 
^y(x) := w0 + 
Xp 
i=1 
wi xi 
I Model parameters: 
w0 2 R; w 2 Rp 
O(p) model parameters. 
Steen Rendle 10 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Polynomial Regression 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation (degree 2): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
wi ;j xi xj 
I Model parameters: 
w0 2 R; w 2 Rp; W 2 Rpp 
O(p2) model parameters. 
Steen Rendle 11 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Models 
Linear/ Polynomial Regression 
Comparison 
Factorization Machines 
Applications 
Summary 
Steen Rendle 12 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Representation: Matrix/ Tensor vs. Feature Vectors 
Matrix/ Tensor data can be represented by feature vectors: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
Steen Rendle 13 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Representation: Matrix/ Tensor vs. Feature Vectors 
Matrix/ Tensor data can be represented by feature vectors: 
Movie 
TI NH SW ST ... 
5 3 1 ? ... 
? ? 4 5 ... 
1 ? 5 ? ... 
... ... ... ... ... 
A 
B 
C 
... 
User 
, 
# User Movie Rating 
1 Alice Titanic 5 
2 Alice Notting Hill 3 
3 Alice Star Wars 1 
4 Bob Star Wars 4 
5 Bob Star Trek 5 
6 Charlie Titanic 1 
7 Charlie Star Wars 5 
. . . . . . . . . . . . 
Steen Rendle 13 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Representation: Matrix/ Tensor vs. Feature Vectors 
Matrix/ Tensor data can be represented by feature vectors: 
# User Movie Rating 
1 Alice Titanic 5 
2 Alice Notting Hill 3 
3 Alice Star Wars 1 
4 Bob Star Wars 4 
5 Bob Star Trek 5 
6 Charlie Titanic 1 
7 Charlie Star Wars 5 
. . . . . . . . . . . . 
) 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
Target y 
5 
3 
1 y(3) 
4 
5 
1 
5 
y(1) 
y(2) 
y(4) 
y(5) 
y(6) 
y(7) 
Steen Rendle 13 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
Target y 
5 
3 
1 y(3) 
4 
5 
1 
5 
y(1) 
y(2) 
y(4) 
y(5) 
y(6) 
y(7) 
Applying regression models to this data leads to: 
Steen Rendle 14 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
Target y 
5 
3 
1 y(3) 
4 
5 
1 
5 
y(1) 
y(2) 
y(4) 
y(5) 
y(6) 
y(7) 
Applying regression models to this data leads to: 
Linear regression: ^y(x) = w0 + wu + wi 
Steen Rendle 14 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
Target y 
5 
3 
1 y(3) 
4 
5 
1 
5 
y(1) 
y(2) 
y(4) 
y(5) 
y(6) 
y(7) 
Applying regression models to this data leads to: 
Linear regression: ^y(x) = w0 + wu + wi 
Polynomial regression: ^y(x) = w0 + wu + wi + wu;i 
Steen Rendle 14 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
Target y 
5 
3 
1 y(3) 
4 
5 
1 
5 
y(1) 
y(2) 
y(4) 
y(5) 
y(6) 
y(7) 
Applying regression models to this data leads to: 
Linear regression: ^y(x) = w0 + wu + wi 
Polynomial regression: ^y(x) = w0 + wu + wi + wu;i 
Matrix factorization: ^y(u; i) = hwu; hi i 
Steen Rendle 14 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
I ) Linear regression is not expressive enough. 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
I ) Linear regression is not expressive enough. 
I Polynomial regression includes pairwise interactions but cannot 
estimate them from the data. 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
I ) Linear regression is not expressive enough. 
I Polynomial regression includes pairwise interactions but cannot 
estimate them from the data. 
I n  p2: number of cases is much smaller than number of model 
parameters. 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
I ) Linear regression is not expressive enough. 
I Polynomial regression includes pairwise interactions but cannot 
estimate them from the data. 
I n  p2: number of cases is much smaller than number of model 
parameters. 
I Max.-likelihood estimator for a pairwise eect is: 
wi ;j = 
( 
y  w0  wi  wu; if (i ; j ; y) 2 S: 
not de
ned; else 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Application to Sparse Feature Vectors 
For the data of the example: 
I Linear regression has no user-item interaction. 
I ) Linear regression is not expressive enough. 
I Polynomial regression includes pairwise interactions but cannot 
estimate them from the data. 
I n  p2: number of cases is much smaller than number of model 
parameters. 
I Max.-likelihood estimator for a pairwise eect is: 
wi ;j = 
( 
y  w0  wi  wu; if (i ; j ; y) 2 S: 
not de
ned; else 
I Polynomial regression cannot generalize to any unobserved pairwise 
eect. 
Steen Rendle 15 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Model 
Examples 
Properties 
Learning 
libFM Software 
Applications 
Summary 
Steen Rendle 16 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Machine (FM) 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation (degree 2): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Model parameters: 
w0 2 R; w 2 Rp; V 2 Rpk 
Steen Rendle 17 / 53 
[Rendle 2010, Rendle 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Machine (FM) 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation (degree 2): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Model parameters: 
w0 2 R; w 2 Rp; V 2 Rpk 
Compared to Polynomial regression: 
I Model equation (degree 2): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
wi ;j xi xj 
I Model parameters: 
w0 2 R; w 2 Rp; W 2 Rpp 
Steen Rendle 17 / 53 
[Rendle 2010, Rendle 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Machine (FM) 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation (degree 2): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Model parameters: 
w0 2 R; w 2 Rp; V 2 Rpk 
Steen Rendle 17 / 53 
[Rendle 2010, Rendle 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Machine (FM) 
I Let x 2 Rp be an input vector with p predictor variables. 
I Model equation (degree 3): 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
+ 
Xp 
i=1 
Xp 
ji 
Xp 
lj 
Xk 
f =1 
v(3) 
i ;f v(3) 
j ;f v(3) 
l ;f xi xj xl 
I Model parameters: 
w0 2 R; w 2 Rp; V 2 Rpk ; V(3) 2 Rpk 
Steen Rendle 17 / 53 
[Rendle 2010, Rendle 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorization Machines: Discussion 
I FMs work with real valued input. 
I FMs include variable interactions like polynomial regression. 
I Model parameters for interactions are factorized. 
I Number of model parameters is O(k p) (instead of O(p2) for poly. 
regr.). 
Steen Rendle 18 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Model 
Examples 
Properties 
Learning 
libFM Software 
Applications 
Summary 
Steen Rendle 19 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Matrix Factorization and Factorization Machines 
Two categorical variables encoded with real valued predictor variables: 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
With this data, the FM is identical to MF with biases1: 
^y(x) = w0 + wu + wi + hvu; vi i | {z } 
MF 
1libFM, k = 128, MCMC inference, Net
ix RMSE=0.8937 
Steen Rendle 20 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
RDF-Triple Prediction with Factorization Machines 
Three categorical variables encoded with real valued predictor variables: 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 0 0 0 1 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
S1 S2 S3 ... P1 P2 P3 P4 ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
1 
0 
0 
0 
0 
0 
0 
... 
... 
... 
... 
... 
0 0 0 1 ... 
O1 O2 O3 O4 ... 
Subject Predicate Object 
With this data, the FM is equivalent to the PITF model: 
^y(x) := w0 + ws + wp + wo + hvs ; vpi + hvs ; voi + hvp; voi 
[PITF: Rendle et al. 2010, WSDM Best Student Paper, ECML 2009 Best DC Award] 
Steen Rendle 21 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Time with Factorization Machines 
Two categorical variables and time as linear predictor: 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
0.2 
0.6 
0.61 
0.3 
0.5 
0.1 
0.8 
Time 
The FM model would correspond to: 
^y(x) := w0 + wi + wu + t wtime + hvu; vi i + t hvu; vtimei + t hvi ; vtimei 
Steen Rendle 22 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Time with Factorization Machines 
Two categorical variables and time discretized in bins (b(t)): 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
User Movie 
1 
0 
0 
1 
0 
1 
0 
0 
1 
1 
0 
1 
0 
0 
Time 
0 
0 
0 
0 
0 
0 
1 
T1 T2 T3 
The FM model would correspond to:2 
^y(x) := w0 + wi + wu + wb(t) + hvu; vi i + hvu; vb(t)i + hvi ; vb(t)i 
2libFM, k = 128, MCMC inference, Net
ix RMSE=0.8873 
Steen Rendle 22 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
SVD++ 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 0.3 0.3 0.3 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
A B C ... TI NH SW ST ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
0.3 
0.3 
0 
0 
0.5 
0.3 
0.3 
0 
0 
0 
0.3 
0.3 
0.5 
0.5 
0.5 
0 
0 
0.5 
0.5 
0 
... 
... 
... 
... 
... 
0.5 0 0.5 0 ... 
TI NH SW ST ... 
User Movie Other Movies rated 
With this data, the FM3 is identical to: 
^y(x) = 
SVD++ z }| { 
w0 + wu + wi + hvu; vi i + 
1 p 
jNuj 
X 
l2Nu 
hvi ; vl i 
+ 
1 p 
jNuj 
X 
l2Nu 
0 
@wl + hvu; vl i + 
1 p 
jNuj 
X 
l 02Nu ;l 0l 
hvl ; v0 
l i 
1 
A 
3libFM, k = 128, MCMC inference, Net
ix RMSE=0.8865 
Steen Rendle 23 / 53 
[Koren, 2008]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Factorizing Personalized Markov Chains (FPMC) 
Two categorical variables (u,i ), one set categorical (Bt1): 
1 0 0 ... 
1 0 0 ... 
x(3) 1 0 0 ... 0 0 1 0 ... 0.5 0.5 0 0 ... 
0 1 0 ... 
0 1 0 ... 
0 0 1 ... 
1 
0 
0 
0 
1 
0 
1 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
1 
0 
... 
... 
... 
... 
... 
0 0 1 ... 0 0 1 0 ... 
u1 u2 u3 ... A B C D ... 
x(1) 
x(2) 
x(4) 
x(5) 
x(6) 
x(7) 
Feature vector x 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
1 
0 
0 
0 
0 
0 
0 
... 
... 
... 
... 
... 
1 0 0 0 ... 
User Product 
A B C D ... 
Last Basket 
Sequential Baskets 
u1 A,B C 
u2 C D 
u3 A C 
FM is equivalent to 
^y(x) := w0 + wu + wi + 
1 
jBt1j 
X 
j2Bt1 
wj + hvu; vi i + 
1 
jBt1j 
X 
j2Bt1 
hvi ; vj i + ::: 
Steen Rendle 24 / 53 
[Rendle et al. 2010, WWW Best Paper]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Model 
Examples 
Properties 
Learning 
libFM Software 
Applications 
Summary 
Steen Rendle 25 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Computation Complexity 
Factorization Machine model equation: 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Trivial computation: O(p2 k) 
Steen Rendle 26 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Computation Complexity 
Factorization Machine model equation: 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Trivial computation: O(p2 k) 
I Ecient computation can be done in: O(p k) 
Steen Rendle 26 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Computation Complexity 
Factorization Machine model equation: 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
I Trivial computation: O(p2 k) 
I Ecient computation can be done in: O(p k) 
I Making use of many zeros in x even in: O(Nz (x) k), where Nz (x) is 
the number of non-zero elements in vector x. 
Steen Rendle 26 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Ecient Computation 
The model equation of an FM can be computed in O(p k). 
Steen Rendle 27 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Ecient Computation 
The model equation of an FM can be computed in O(p k). 
Proof: 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
= w0 + 
Xp 
i=1 
wi xi + 
1 
2 
Xk 
f =1 
2 
4 
  Xp 
i=1 
xi vi ;f 
!2 
 
Xp 
i=1 
(xi vi ;f )2 
3 
5 
Steen Rendle 27 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Ecient Computation 
The model equation of an FM can be computed in O(p k). 
Proof: 
^y(x) := w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
ji 
hvi ; vj i xi xj 
= w0 + 
Xp 
i=1 
wi xi + 
1 
2 
Xk 
f =1 
2 
4 
  Xp 
i=1 
xi vi ;f 
!2 
 
Xp 
i=1 
(xi vi ;f )2 
3 
5 
I In the sums over i , only non-zero xi elements have to be summed up 
) O(Nz (x) k). 
I (The complexity of polynomial regression is O(Nz (x)2).) 
Steen Rendle 27 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Multilinearity 
FMs are multilinear: 
8 2  = fw0;w1; : : : ;wp; v1;1; : : : ; vp;kg : ^y(x; ) = h()(x)  + g()(x) 
where g() and h() do not depend on the value of . 
Steen Rendle 28 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Multilinearity 
FMs are multilinear: 
8 2  = fw0;w1; : : : ;wp; v1;1; : : : ; vp;kg : ^y(x; ) = h()(x)  + g()(x) 
where g() and h() do not depend on the value of . 
E.g. for second order eects ( = vl ;f ): 
^y(x; vl;f ) := 
g(vl;f )(x) 
z }| { 
w0 + 
Xp 
i=1 
wi xi + 
Xp 
i=1 
Xp 
j=i+1 
Xk 
f 0=1 
(f 06=f )_(l62fi ;jg) 
vi ;f 0 vj;f 0 xi xj 
+ vl;f xl 
X 
i=1;i6=l 
vi ;f xi 
| {z } 
h(vl;f )(x) 
Steen Rendle 28 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Model 
Examples 
Properties 
Learning 
libFM Software 
Applications 
Summary 
Steen Rendle 29 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Learning 
Using these properties, learning algorithms can be developed: 
I L2-regularized regression and classi
cation: 
I Stochastic gradient descent [Rendle, 2010] 
I Alternating least squares/ Coordinate Descent [Rendle et al., 2011, 
Rendle 2012] 
I Markov Chain Monte Carlo (for Bayesian FMs) [Freudenthaler et al. 
2011, Rendle 2012] 
I L2-regularized ranking: 
I Stochastic gradient descent [Rendle, 2010] 
All the proposed learning algorithms have a runtime of O(k Nz (X) i ), 
where i is the number of iterations and Nz (X) the number of non-zero 
elements in the design matrix X. 
Steen Rendle 30 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Stochastic Gradient Descent (SGD) 
I For each training case (x; y) 2 S, SGD updates the FM model 
parameter  using: 
0 =    
 
(^y(x)  y)h()(x) + () 
 
I  is the learning rate / step size. 
I () is the regularization value of the parameter . 
I SGD can easily be applied to other loss functions. 
Steen Rendle 31 / 53 
[Rendle, 2010]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Coordinate Descent (CD) 
I CD updates each FM model parameter  using: 
0 = 
P 
(x;y)2S 
 
y  g()(x) 
 
h()(x) 
P 
(x;y)2S h2 
()(x) + () 
I Using caches of intermediate results, the runtime for updating all 
model parameters is O(k Nz (X)). 
I CD can be extended to classi
cation [Rendle, 2012]. 
Steen Rendle 32 / 53 
[Rendle et al., 2011]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Gibbs Sampling (MCMC) 
I Gibbs sampling with a block for each FM model parameter : 
jS; n fg  N 
  
 
P 
(x;y)2S 
 
y  g()(x) 
 
h()(x) 
 
P 
(x;y)2S h2 
()(x) + () 
; 
1 
 
P 
(x;y)2S h2 
()(x) + () 
! 
I Mean is the same as for CD ) computational complexity is also 
O(k Nz (X)). 
I MCMC can be extended to classi
cation using link functions. 
Steen Rendle 33 / 53 
[Freudenthaler et al. 2011, Rendle 2012]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Learning Regularization Values 
 v ,v 
yi 
w ,w 
wj 
w0 
xij 
i=1,...,n 
v j 
 
j=1,...,p 
 , 0 ,0 
yi 
wj 
w0 
w 
xij 
i=1,...,n 
v j 
 
j=1,...,p 
w0 ,w0 
0 ,0 
w0 ,w0 
w  v v 
Standard FM with priors. Two level FM with hyperpriors. 
Steen Rendle 34 / 53 
[Freudenthaler et al., 2011]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Model 
Examples 
Properties 
Learning 
libFM Software 
Applications 
Summary 
Steen Rendle 35 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
libFM Software 
libFM is an implementation of FMs 
I Model: second-order FMs 
I Learning/ inference: SGD, ALS, MCMC 
I Classi
cation and regression 
I Uses the same data format as LIBSVM, LIBLINEAR [Lin et. al], 
SVMlight [Joachims]. 
I Supports variable grouping. 
I Open source: GPLv3. 
Steen Rendle 36 / 53 
[http://www.libfm.org/]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Applications 
Recommender Systems 
Link Prediction in Social Networks 
Clickthrough Prediction 
Personalized Ranking 
Student Performance Prediction 
Kaggle Competitions 
Summary 
Steen Rendle 37 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
(Context-aware) Rating Prediction 
I Main variables: 
I User ID (categorical) 
I Item ID (categorical) 
I Additional variables: 
I time 
I mood 
I user pro
le 
I item meta data 
I . . . 
I Examples: Net
ix prize, Movielens, KDDCup 2011 
+ 
♪ + + 
Song 
User Time Mood 
Steen Rendle 38 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Net
ix Prize 
Netflix Prize: Prediction Error 
Public Leaderboard 
RMS Error 
0.86 0.87 0.88 0.89 0.90 
user, movie 
user, movie, day 
user, movie, impl. 
user, movie, 
day, impl. 
SGD Matrix 
Factorization 
user, movie, 
day, impl., 
freq, lin. day 
$1M Prize 
I k = 128 factors, 512 MCMC samples (no burnin phase, initialization 
from random) 
I MCMC inference (no hyperparameters (learning rate, regularization) 
to specify) 
Steen Rendle 39 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Net
ix Prize 
Method (Name) Ref. Learning Method k Quiz RMSE 
Models using user ID and item ID 
Probabilistic Matrix Factorization [14, 13] Batch GD 40 *0.9170 
Probabilistic Matrix Factorization [14, 13] Batch GD 150 0.9211 
Matrix Factorization [6] Variational Bayes 30 *0.9141 
Matchbox [15] Variational Bayes 50 *0.9100 
ALS-MF [7] ALS 100 0.9079 
ALS-MF [7] ALS 1000 *0.9018 
SVD/ MF [3] SGD 100 0.9025 
SVD/ MF [3] SGD 200 *0.9009 
Bayesian Probablistic Matrix Factorization 
[13] MCMC 150 0.8965 
(BPMF) 
Bayesian Probablistic Matrix Factorization 
(BPMF) 
[13] MCMC 300 *0.8954 
FM, pred. var: user ID, movie ID - MCMC 128 0.8937 
Models using implicit feedback 
Probabilistic Matrix Factorization with Cons- 
traints 
[14] Batch GD 30 *0.9016 
SVD++ [3] SGD 100 0.8924 
SVD++ [3] SGD 200 *0.8911 
BSRM/F [18] MCMC 100 0.8926 
BSRM/F [18] MCMC 400 *0.8874 
FM, pred. var: user ID, movie ID, impl. - MCMC 128 0.8865 
Steen Rendle 40 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Net
ix Prize 
Method (Name) Ref. Learning Method k Quiz RMSE 
Models using time information 
Bayesian Probabilistic Tensor Factorization 
[17] MCMC 30 *0.9044 
(BPTF) 
FM, pred. var: user ID, movie ID, day - MCMC 128 0.8873 
Models using time and implicit feedback 
timeSVD++ [5] SGD 100 0.8805 
timeSVD++ [5] SGD 200 *0.8799 
FM, pred. var: user ID, movie ID, day, impl. - MCMC 128 0.8809 
FM, pred. var: user ID, movie ID, day, impl. - MCMC 256 0.8794 
Assorted models 
BRISMF/UM NB corrected [16] SGD 1000 *0.8904 
BMFSI plus side information [8] MCMC 100 *0.8875 
timeSVD++ plus frequencies [4] SGD 200 0.8777 
timeSVD++ plus frequencies [4] SGD 2000 *0.8762 
FM, pred. var: user ID, movie ID, day, impl., 
- MCMC 128 0.8779 
freq., lin. day 
FM, pred. var: user ID, movie ID, day, impl., 
freq., lin. day 
- MCMC 256 0.8771 
Steen Rendle 40 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Applications 
Recommender Systems 
Link Prediction in Social Networks 
Clickthrough Prediction 
Personalized Ranking 
Student Performance Prediction 
Kaggle Competitions 
Summary 
Steen Rendle 41 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Link Prediction in Social Networks 
I Main variables: 
I Actor A ID 
I Actor B ID 
I Additional variables: 
I pro
les 
I actions 
I . . . 
+ 
Actor A Actor B 
Steen Rendle 42 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
KDDCup 2012: Track 1 
KDDCup 2012 Track 1: Prediction Quality 
Public Leaderboard Private Leaderboard 
Mean Average Precision @3 
0.32 0.34 0.36 0.38 0.40 0.42 
none 
gender, age, ... 
keywords 
friends 
all 
none 
gender, age, ... 
keywords 
friends 
all 
Top 1 
Top 5 
Top 10 
Top 100 
I k = 22 factors, 512 MCMC samples (no burnin phase, initialization 
from random) 
I MCMC inference (no hyperparameters (learning rate, regularization) 
to specify) 
Steen Rendle 43 / 53 
[Awarded 2nd place (out of 658 teams)]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Applications 
Recommender Systems 
Link Prediction in Social Networks 
Clickthrough Prediction 
Personalized Ranking 
Student Performance Prediction 
Kaggle Competitions 
Summary 
Steen Rendle 44 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Clickthrough Prediction 
I Main variables: 
I User ID 
I Query ID 
I Ad/ Link ID 
I Additional variables: 
I query tokens 
I user pro
le 
I . . . 
+ 
keyword... + 
Link 1 
Link 2 
Link 3 
User Query Ad/ Link 
Steen Rendle 45 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
KDDCup 2012: Track 2 
Model Inference wAUC (public) wAUC (private) 
ID-based model (k = 0) SGD 0.78050 0.78086 
Attribute-based model (k = 8) MCMC 0.77409 0.77555 
Mixed model (k = 8) SGD 0.79011 0.79321 
Final ensemble n/a 0.79857 0.80178 
Ensemble 
I Rank positions (not predicted clickthrough rates) are used. 
I The MCMC attribute-based model and dierent variations of the 
. SGD models are included. 
Steen Rendle 46 / 53 
[Awarded 3rd place (out of 171 teams)]
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
Outline 
Factorization Models  Polynomial Regression 
Factorization Machines 
Applications 
Recommender Systems 
Link Prediction in Social Networks 
Clickthrough Prediction 
Personalized Ranking 
Student Performance Prediction 
Kaggle Competitions 
Summary 
Steen Rendle 47 / 53
Factorization Models  Polynomial Regression Factorization Machines Applications Summary 
ECML/PKDD Discovery Challenge 2013 
I Problem: Recommend given names. 
I Main variables: 
I User ID 
I Name ID 
I Additional variables: 
I session info 
I string representation for each name 
I . . . 
I FM approach won 1st place (online track) and 2nd (oine track). 
Steen Rendle 48 / 53

Weitere ähnliche Inhalte

Was ist angesagt?

Warsaw Data Science - Factorization Machines Introduction
Warsaw Data Science -  Factorization Machines IntroductionWarsaw Data Science -  Factorization Machines Introduction
Warsaw Data Science - Factorization Machines IntroductionBartlomiej Twardowski
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsYONG ZHENG
 
Training and Inference for Deep Gaussian Processes
Training and Inference for Deep Gaussian ProcessesTraining and Inference for Deep Gaussian Processes
Training and Inference for Deep Gaussian ProcessesKeyon Vafa
 
Deep learning based recommender systems (lab seminar paper review)
Deep learning based recommender systems (lab seminar paper review)Deep learning based recommender systems (lab seminar paper review)
Deep learning based recommender systems (lab seminar paper review)hyunsung lee
 
Training language models to follow instructions with human feedback.pdf
Training language models to follow instructions
with human feedback.pdfTraining language models to follow instructions
with human feedback.pdf
Training language models to follow instructions with human feedback.pdfPo-Chuan Chen
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsJustin Basilico
 
Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...
Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...
Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...Edureka!
 
Mrbml004 : Introduction to Information Theory for Machine Learning
Mrbml004 : Introduction to Information Theory for Machine LearningMrbml004 : Introduction to Information Theory for Machine Learning
Mrbml004 : Introduction to Information Theory for Machine LearningJaouad Dabounou
 
How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ? How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ? HackerEarth
 
Scala Data Pipelines for Music Recommendations
Scala Data Pipelines for Music RecommendationsScala Data Pipelines for Music Recommendations
Scala Data Pipelines for Music RecommendationsChris Johnson
 
Knowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsKnowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsEnrico Palumbo
 
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...Simplilearn
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Hwa Pyung Kim
 
Deep learning for audio-based music recommendation
Deep learning for audio-based music recommendationDeep learning for audio-based music recommendation
Deep learning for audio-based music recommendationRussia.AI
 
Pr045 deep lab_semantic_segmentation
Pr045 deep lab_semantic_segmentationPr045 deep lab_semantic_segmentation
Pr045 deep lab_semantic_segmentationTaeoh Kim
 
Tutorial: Context In Recommender Systems
Tutorial: Context In Recommender SystemsTutorial: Context In Recommender Systems
Tutorial: Context In Recommender SystemsYONG ZHENG
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012Jinwon Lee
 
Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNNShuai Zhang
 

Was ist angesagt? (20)

Warsaw Data Science - Factorization Machines Introduction
Warsaw Data Science -  Factorization Machines IntroductionWarsaw Data Science -  Factorization Machines Introduction
Warsaw Data Science - Factorization Machines Introduction
 
Matrix Factorization In Recommender Systems
Matrix Factorization In Recommender SystemsMatrix Factorization In Recommender Systems
Matrix Factorization In Recommender Systems
 
Training and Inference for Deep Gaussian Processes
Training and Inference for Deep Gaussian ProcessesTraining and Inference for Deep Gaussian Processes
Training and Inference for Deep Gaussian Processes
 
Matrix Factorization
Matrix FactorizationMatrix Factorization
Matrix Factorization
 
Deep learning based recommender systems (lab seminar paper review)
Deep learning based recommender systems (lab seminar paper review)Deep learning based recommender systems (lab seminar paper review)
Deep learning based recommender systems (lab seminar paper review)
 
Training language models to follow instructions with human feedback.pdf
Training language models to follow instructions
with human feedback.pdfTraining language models to follow instructions
with human feedback.pdf
Training language models to follow instructions with human feedback.pdf
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender Systems
 
Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...
Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...
Autoencoders Tutorial | Autoencoders In Deep Learning | Tensorflow Training |...
 
Probability Assignment Help
Probability Assignment HelpProbability Assignment Help
Probability Assignment Help
 
Mrbml004 : Introduction to Information Theory for Machine Learning
Mrbml004 : Introduction to Information Theory for Machine LearningMrbml004 : Introduction to Information Theory for Machine Learning
Mrbml004 : Introduction to Information Theory for Machine Learning
 
How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ? How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ?
 
Scala Data Pipelines for Music Recommendations
Scala Data Pipelines for Music RecommendationsScala Data Pipelines for Music Recommendations
Scala Data Pipelines for Music Recommendations
 
Knowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsKnowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender Systems
 
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)
 
Deep learning for audio-based music recommendation
Deep learning for audio-based music recommendationDeep learning for audio-based music recommendation
Deep learning for audio-based music recommendation
 
Pr045 deep lab_semantic_segmentation
Pr045 deep lab_semantic_segmentationPr045 deep lab_semantic_segmentation
Pr045 deep lab_semantic_segmentation
 
Tutorial: Context In Recommender Systems
Tutorial: Context In Recommender SystemsTutorial: Context In Recommender Systems
Tutorial: Context In Recommender Systems
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012
 
Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNN
 

Andere mochten auch

Intro to Factorization Machines
Intro to Factorization MachinesIntro to Factorization Machines
Intro to Factorization MachinesPavel Kalaidin
 
Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFMLiangjie Hong
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Xavier Amatriain
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineNYC Predictive Analytics
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architectureLiang Xiang
 
Ted Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFTed Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFMLconf
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFMLconf
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFMLconf
 
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFLise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFMLconf
 
Quoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFQuoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFMLconf
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...Sri Ambati
 
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFAmeet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFMLconf
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning SystemsXavier Amatriain
 
NIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank AggregationNIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank Aggregationsleepy_yoshi
 
Evan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATLEvan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATLMLconf
 
Naive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsNaive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsDKALab
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFMLconf
 
Warsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick ReviewWarsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick ReviewBartlomiej Twardowski
 
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15MLconf
 
Recommendation Engine Demystified
Recommendation Engine DemystifiedRecommendation Engine Demystified
Recommendation Engine DemystifiedDKALab
 

Andere mochten auch (20)

Intro to Factorization Machines
Intro to Factorization MachinesIntro to Factorization Machines
Intro to Factorization Machines
 
Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFM
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
 
Ted Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFTed Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SF
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
 
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SFLise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
Lise Getoor, Professor, Computer Science, UC Santa Cruz at MLconf SF
 
Quoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SFQuoc Le, Software Engineer, Google at MLconf SF
Quoc Le, Software Engineer, Google at MLconf SF
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...
 
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SFAmeet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
Ameet Talwalkar, assistant professor of Computer Science, UCLA at MLconf SF
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 
NIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank AggregationNIPS2010読み会: A New Probabilistic Model for Rank Aggregation
NIPS2010読み会: A New Probabilistic Model for Rank Aggregation
 
Evan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATLEvan Estola – Data Scientist, Meetup.com at MLconf ATL
Evan Estola – Data Scientist, Meetup.com at MLconf ATL
 
Naive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsNaive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event Models
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SF
 
Warsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick ReviewWarsaw Data Science - Recsys2016 Quick Review
Warsaw Data Science - Recsys2016 Quick Review
 
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
Ben Hamner, Co-founder and CTO, Kaggle at MLconf SF - 11/13/15
 
Recommendation Engine Demystified
Recommendation Engine DemystifiedRecommendation Engine Demystified
Recommendation Engine Demystified
 

Ähnlich wie Steffen Rendle, Research Scientist, Google at MLconf SF

PF_IDETC_2012_Souma
PF_IDETC_2012_SoumaPF_IDETC_2012_Souma
PF_IDETC_2012_SoumaMDO_Lab
 
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...Francisco (Paco) Florez-Revuelta
 
CP3_SDM_2010_Souma
CP3_SDM_2010_SoumaCP3_SDM_2010_Souma
CP3_SDM_2010_SoumaMDO_Lab
 
Применение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботамиПрименение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботамиSkolkovo Robotics Center
 
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...Kanika Anand
 
The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...Thanh Hieu
 
PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)Nidhi Gopal
 
A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...JuanPabloCarbajal3
 
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010Souma Chowdhury
 
Prpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesPrpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesMohammad
 
Projection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsProjection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsUniversity of Glasgow
 
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...Sergey Staroletov
 
PF_MAO_2010_Souam
PF_MAO_2010_SouamPF_MAO_2010_Souam
PF_MAO_2010_SouamMDO_Lab
 
Development of Multi-Level ROM
Development of Multi-Level ROMDevelopment of Multi-Level ROM
Development of Multi-Level ROMMohammad
 

Ähnlich wie Steffen Rendle, Research Scientist, Google at MLconf SF (20)

PF_IDETC_2012_Souma
PF_IDETC_2012_SoumaPF_IDETC_2012_Souma
PF_IDETC_2012_Souma
 
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
A Multiple Kernel Learning Based Fusion Framework for Real-Time Multi-View Ac...
 
CP3_SDM_2010_Souma
CP3_SDM_2010_SoumaCP3_SDM_2010_Souma
CP3_SDM_2010_Souma
 
Применение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботамиПрименение машинного обучения для навигации и управления роботами
Применение машинного обучения для навигации и управления роботами
 
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
An_expected_improvement_criterion_for_the_global_optimization_of_a_noisy_comp...
 
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
GDRR Opening Workshop - Variance Reduction for Reliability Assessment with St...
 
FMI output gap
FMI output gapFMI output gap
FMI output gap
 
Wp13105
Wp13105Wp13105
Wp13105
 
The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...The study on mining temporal patterns and related applications in dynamic soc...
The study on mining temporal patterns and related applications in dynamic soc...
 
Compact 3 d_resist_models_general
Compact 3 d_resist_models_generalCompact 3 d_resist_models_general
Compact 3 d_resist_models_general
 
PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)
 
A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...
 
Phd Defense 2007
Phd Defense 2007Phd Defense 2007
Phd Defense 2007
 
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
Comprehensive Product Platform Planning (CP3) - Souma - AIAA/SDM2010
 
Prpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesPrpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfaces
 
AbdoSummerANS_mod3
AbdoSummerANS_mod3AbdoSummerANS_mod3
AbdoSummerANS_mod3
 
Projection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsProjection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamics
 
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
Applying Model Checking Approach with Floating Point Arithmetic for Verificat...
 
PF_MAO_2010_Souam
PF_MAO_2010_SouamPF_MAO_2010_Souam
PF_MAO_2010_Souam
 
Development of Multi-Level ROM
Development of Multi-Level ROMDevelopment of Multi-Level ROM
Development of Multi-Level ROM
 

Mehr von MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingMLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLMLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 

Mehr von MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Kürzlich hochgeladen

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Kürzlich hochgeladen (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Steffen Rendle, Research Scientist, Google at MLconf SF

  • 1. Factorization Models & Polynomial Regression Factorization Machines Applications Summary Factorization Machines Steen Rendle Current aliation: Google Inc. Work was done at University of Konstanz MLConf, November 14, 2014 Steen Rendle 1 / 53
  • 2. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Models Linear/ Polynomial Regression Comparison Factorization Machines Applications Summary Steen Rendle 2 / 53
  • 3. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization Example for data: Matrix Factorization: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User ^ Y := W Ht ; W 2 RjUjk ;H 2 RjIjk k is the rank of the reconstruction. Steen Rendle 3 / 53
  • 4. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization Example for data: Matrix Factorization: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User ^ Y := W Ht ; W 2 RjUjk ;H 2 RjIjk ^y(u; i) = ^yu;i = Xk f =1 wu;f hi ;f = hwu; hi i k is the rank of the reconstruction. Steen Rendle 3 / 53
  • 5. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization Extensions Example for data: Examples for models: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User ^yMF(u; i ) := Xk f =1 vu;f vi ;f = hvu; vi i Steen Rendle 4 / 53
  • 6. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization Extensions Example for data: Examples for models: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User ^yMF(u; i ) := Xk f =1 vu;f vi ;f = hvu; vi i ^ySVD++(u; i) := * vu + X j2N(u) vj ; vi + ^yFact-KNN(u; i ) := 1 jR(u)j X j2R(u) ru;j hvi ; vj i Steen Rendle 4 / 53
  • 7. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization Extensions Example for data: Examples for models: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User ^yMF(u; i ) := Xk f =1 vu;f vi ;f = hvu; vi i ^ySVD++(u; i) := * vu + X j2N(u) vj ; vi + ^yFact-KNN(u; i ) := 1 jR(u)j X j2R(u) ru;j hvi ; vj i Rating Matrix time ^ytimeSVD(u; i ; t) := hvu + vu;t ; vi i ^ytimeTF(u; i ; t) := Xk f =1 vu;f vi ;f vt;f : : : Steen Rendle 4 / 53
  • 8. Factorization Models Polynomial Regression Factorization Machines Applications Summary Tensor Factorization Example for data: Examples for models: Triples of Subject, Predicate, Object ^yPARAFAC(s; p; o) := Xk f =1 vs;f vp;f vo;f ^yPITF(s; p; o) := hvs ; vpi + hvs ; voi + hvp; voi : : : Steen Rendle 5 / 53 [illustration from Drumond et al. 2012]
  • 9. Factorization Models Polynomial Regression Factorization Machines Applications Summary Sequential Factorization Models Example for data: Examples for models: Bt Bt­3 b b a b a c User 1 ? c e c c a ? d c e e ? ? User 2 User 3 User 4 Bt­2 Bt­1 a ^yFMC(u; i ; t) := X l2Bt1 hvi ; vl i ^yFPMC(u; i ; t) := hvu; vi i + X l2Bt1 hvi ; vl i : : : Steen Rendle 6 / 53
  • 10. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Models: Discussion I Advantages I Can estimate interactions between two (or more) variables even if the cross is not observed. I E.g. user movie, current product next product, user query url, : : : Steen Rendle 7 / 53
  • 11. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Models: Discussion I Advantages I Can estimate interactions between two (or more) variables even if the cross is not observed. I E.g. user movie, current product next product, user query url, : : : I Downsides I Factorization models are usually build speci
  • 12. cally for each problem. I Learning algorithms and implementations are tailored to individual models. Steen Rendle 7 / 53
  • 13. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Models Linear/ Polynomial Regression Comparison Factorization Machines Applications Summary Steen Rendle 8 / 53
  • 14. Factorization Models Polynomial Regression Factorization Machines Applications Summary Data and Variable Representation Many standard ML approaches work with real valued feature vectors as input. It allows to represent, e.g.: I any number of variables I categorical domains by using dummy indicator variables I numerical domains I set-categorical domains by using dummy indicator variables Using this representation allows to apply a wide variety of standard models (e.g. linear regression, SVM, etc.). Steen Rendle 9 / 53
  • 15. Factorization Models Polynomial Regression Factorization Machines Applications Summary Linear Regression I Let x 2 Rp be an input vector with p predictor variables. I Model equation: ^y(x) := w0 + Xp i=1 wi xi I Model parameters: w0 2 R; w 2 Rp O(p) model parameters. Steen Rendle 10 / 53
  • 16. Factorization Models Polynomial Regression Factorization Machines Applications Summary Polynomial Regression I Let x 2 Rp be an input vector with p predictor variables. I Model equation (degree 2): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji wi ;j xi xj I Model parameters: w0 2 R; w 2 Rp; W 2 Rpp O(p2) model parameters. Steen Rendle 11 / 53
  • 17. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Models Linear/ Polynomial Regression Comparison Factorization Machines Applications Summary Steen Rendle 12 / 53
  • 18. Factorization Models Polynomial Regression Factorization Machines Applications Summary Representation: Matrix/ Tensor vs. Feature Vectors Matrix/ Tensor data can be represented by feature vectors: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User Steen Rendle 13 / 53
  • 19. Factorization Models Polynomial Regression Factorization Machines Applications Summary Representation: Matrix/ Tensor vs. Feature Vectors Matrix/ Tensor data can be represented by feature vectors: Movie TI NH SW ST ... 5 3 1 ? ... ? ? 4 5 ... 1 ? 5 ? ... ... ... ... ... ... A B C ... User , # User Movie Rating 1 Alice Titanic 5 2 Alice Notting Hill 3 3 Alice Star Wars 1 4 Bob Star Wars 4 5 Bob Star Trek 5 6 Charlie Titanic 1 7 Charlie Star Wars 5 . . . . . . . . . . . . Steen Rendle 13 / 53
  • 20. Factorization Models Polynomial Regression Factorization Machines Applications Summary Representation: Matrix/ Tensor vs. Feature Vectors Matrix/ Tensor data can be represented by feature vectors: # User Movie Rating 1 Alice Titanic 5 2 Alice Notting Hill 3 3 Alice Star Wars 1 4 Bob Star Wars 4 5 Bob Star Trek 5 6 Charlie Titanic 1 7 Charlie Star Wars 5 . . . . . . . . . . . . ) 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie Target y 5 3 1 y(3) 4 5 1 5 y(1) y(2) y(4) y(5) y(6) y(7) Steen Rendle 13 / 53
  • 21. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie Target y 5 3 1 y(3) 4 5 1 5 y(1) y(2) y(4) y(5) y(6) y(7) Applying regression models to this data leads to: Steen Rendle 14 / 53
  • 22. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie Target y 5 3 1 y(3) 4 5 1 5 y(1) y(2) y(4) y(5) y(6) y(7) Applying regression models to this data leads to: Linear regression: ^y(x) = w0 + wu + wi Steen Rendle 14 / 53
  • 23. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie Target y 5 3 1 y(3) 4 5 1 5 y(1) y(2) y(4) y(5) y(6) y(7) Applying regression models to this data leads to: Linear regression: ^y(x) = w0 + wu + wi Polynomial regression: ^y(x) = w0 + wu + wi + wu;i Steen Rendle 14 / 53
  • 24. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie Target y 5 3 1 y(3) 4 5 1 5 y(1) y(2) y(4) y(5) y(6) y(7) Applying regression models to this data leads to: Linear regression: ^y(x) = w0 + wu + wi Polynomial regression: ^y(x) = w0 + wu + wi + wu;i Matrix factorization: ^y(u; i) = hwu; hi i Steen Rendle 14 / 53
  • 25. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. Steen Rendle 15 / 53
  • 26. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. I ) Linear regression is not expressive enough. Steen Rendle 15 / 53
  • 27. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. I ) Linear regression is not expressive enough. I Polynomial regression includes pairwise interactions but cannot estimate them from the data. Steen Rendle 15 / 53
  • 28. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. I ) Linear regression is not expressive enough. I Polynomial regression includes pairwise interactions but cannot estimate them from the data. I n p2: number of cases is much smaller than number of model parameters. Steen Rendle 15 / 53
  • 29. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. I ) Linear regression is not expressive enough. I Polynomial regression includes pairwise interactions but cannot estimate them from the data. I n p2: number of cases is much smaller than number of model parameters. I Max.-likelihood estimator for a pairwise eect is: wi ;j = ( y w0 wi wu; if (i ; j ; y) 2 S: not de
  • 30. ned; else Steen Rendle 15 / 53
  • 31. Factorization Models Polynomial Regression Factorization Machines Applications Summary Application to Sparse Feature Vectors For the data of the example: I Linear regression has no user-item interaction. I ) Linear regression is not expressive enough. I Polynomial regression includes pairwise interactions but cannot estimate them from the data. I n p2: number of cases is much smaller than number of model parameters. I Max.-likelihood estimator for a pairwise eect is: wi ;j = ( y w0 wi wu; if (i ; j ; y) 2 S: not de
  • 32. ned; else I Polynomial regression cannot generalize to any unobserved pairwise eect. Steen Rendle 15 / 53
  • 33. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Model Examples Properties Learning libFM Software Applications Summary Steen Rendle 16 / 53
  • 34. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Machine (FM) I Let x 2 Rp be an input vector with p predictor variables. I Model equation (degree 2): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Model parameters: w0 2 R; w 2 Rp; V 2 Rpk Steen Rendle 17 / 53 [Rendle 2010, Rendle 2012]
  • 35. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Machine (FM) I Let x 2 Rp be an input vector with p predictor variables. I Model equation (degree 2): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Model parameters: w0 2 R; w 2 Rp; V 2 Rpk Compared to Polynomial regression: I Model equation (degree 2): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji wi ;j xi xj I Model parameters: w0 2 R; w 2 Rp; W 2 Rpp Steen Rendle 17 / 53 [Rendle 2010, Rendle 2012]
  • 36. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Machine (FM) I Let x 2 Rp be an input vector with p predictor variables. I Model equation (degree 2): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Model parameters: w0 2 R; w 2 Rp; V 2 Rpk Steen Rendle 17 / 53 [Rendle 2010, Rendle 2012]
  • 37. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Machine (FM) I Let x 2 Rp be an input vector with p predictor variables. I Model equation (degree 3): ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj + Xp i=1 Xp ji Xp lj Xk f =1 v(3) i ;f v(3) j ;f v(3) l ;f xi xj xl I Model parameters: w0 2 R; w 2 Rp; V 2 Rpk ; V(3) 2 Rpk Steen Rendle 17 / 53 [Rendle 2010, Rendle 2012]
  • 38. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorization Machines: Discussion I FMs work with real valued input. I FMs include variable interactions like polynomial regression. I Model parameters for interactions are factorized. I Number of model parameters is O(k p) (instead of O(p2) for poly. regr.). Steen Rendle 18 / 53
  • 39. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Model Examples Properties Learning libFM Software Applications Summary Steen Rendle 19 / 53
  • 40. Factorization Models Polynomial Regression Factorization Machines Applications Summary Matrix Factorization and Factorization Machines Two categorical variables encoded with real valued predictor variables: 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie With this data, the FM is identical to MF with biases1: ^y(x) = w0 + wu + wi + hvu; vi i | {z } MF 1libFM, k = 128, MCMC inference, Net ix RMSE=0.8937 Steen Rendle 20 / 53
  • 41. Factorization Models Polynomial Regression Factorization Machines Applications Summary RDF-Triple Prediction with Factorization Machines Three categorical variables encoded with real valued predictor variables: 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 0 0 1 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... S1 S2 S3 ... P1 P2 P3 P4 ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x 1 0 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 ... ... ... ... ... 0 0 0 1 ... O1 O2 O3 O4 ... Subject Predicate Object With this data, the FM is equivalent to the PITF model: ^y(x) := w0 + ws + wp + wo + hvs ; vpi + hvs ; voi + hvp; voi [PITF: Rendle et al. 2010, WSDM Best Student Paper, ECML 2009 Best DC Award] Steen Rendle 21 / 53
  • 42. Factorization Models Polynomial Regression Factorization Machines Applications Summary Time with Factorization Machines Two categorical variables and time as linear predictor: 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie 0.2 0.6 0.61 0.3 0.5 0.1 0.8 Time The FM model would correspond to: ^y(x) := w0 + wi + wu + t wtime + hvu; vi i + t hvu; vtimei + t hvi ; vtimei Steen Rendle 22 / 53
  • 43. Factorization Models Polynomial Regression Factorization Machines Applications Summary Time with Factorization Machines Two categorical variables and time discretized in bins (b(t)): 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x User Movie 1 0 0 1 0 1 0 0 1 1 0 1 0 0 Time 0 0 0 0 0 0 1 T1 T2 T3 The FM model would correspond to:2 ^y(x) := w0 + wi + wu + wb(t) + hvu; vi i + hvu; vb(t)i + hvi ; vb(t)i 2libFM, k = 128, MCMC inference, Net ix RMSE=0.8873 Steen Rendle 22 / 53
  • 44. Factorization Models Polynomial Regression Factorization Machines Applications Summary SVD++ 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0.3 0.3 0.3 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... A B C ... TI NH SW ST ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x 0.3 0.3 0 0 0.5 0.3 0.3 0 0 0 0.3 0.3 0.5 0.5 0.5 0 0 0.5 0.5 0 ... ... ... ... ... 0.5 0 0.5 0 ... TI NH SW ST ... User Movie Other Movies rated With this data, the FM3 is identical to: ^y(x) = SVD++ z }| { w0 + wu + wi + hvu; vi i + 1 p jNuj X l2Nu hvi ; vl i + 1 p jNuj X l2Nu 0 @wl + hvu; vl i + 1 p jNuj X l 02Nu ;l 0l hvl ; v0 l i 1 A 3libFM, k = 128, MCMC inference, Net ix RMSE=0.8865 Steen Rendle 23 / 53 [Koren, 2008]
  • 45. Factorization Models Polynomial Regression Factorization Machines Applications Summary Factorizing Personalized Markov Chains (FPMC) Two categorical variables (u,i ), one set categorical (Bt1): 1 0 0 ... 1 0 0 ... x(3) 1 0 0 ... 0 0 1 0 ... 0.5 0.5 0 0 ... 0 1 0 ... 0 1 0 ... 0 0 1 ... 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 ... ... ... ... ... 0 0 1 ... 0 0 1 0 ... u1 u2 u3 ... A B C D ... x(1) x(2) x(4) x(5) x(6) x(7) Feature vector x 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 ... ... ... ... ... 1 0 0 0 ... User Product A B C D ... Last Basket Sequential Baskets u1 A,B C u2 C D u3 A C FM is equivalent to ^y(x) := w0 + wu + wi + 1 jBt1j X j2Bt1 wj + hvu; vi i + 1 jBt1j X j2Bt1 hvi ; vj i + ::: Steen Rendle 24 / 53 [Rendle et al. 2010, WWW Best Paper]
  • 46. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Model Examples Properties Learning libFM Software Applications Summary Steen Rendle 25 / 53
  • 47. Factorization Models Polynomial Regression Factorization Machines Applications Summary Computation Complexity Factorization Machine model equation: ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Trivial computation: O(p2 k) Steen Rendle 26 / 53
  • 48. Factorization Models Polynomial Regression Factorization Machines Applications Summary Computation Complexity Factorization Machine model equation: ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Trivial computation: O(p2 k) I Ecient computation can be done in: O(p k) Steen Rendle 26 / 53
  • 49. Factorization Models Polynomial Regression Factorization Machines Applications Summary Computation Complexity Factorization Machine model equation: ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj I Trivial computation: O(p2 k) I Ecient computation can be done in: O(p k) I Making use of many zeros in x even in: O(Nz (x) k), where Nz (x) is the number of non-zero elements in vector x. Steen Rendle 26 / 53
  • 50. Factorization Models Polynomial Regression Factorization Machines Applications Summary Ecient Computation The model equation of an FM can be computed in O(p k). Steen Rendle 27 / 53
  • 51. Factorization Models Polynomial Regression Factorization Machines Applications Summary Ecient Computation The model equation of an FM can be computed in O(p k). Proof: ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj = w0 + Xp i=1 wi xi + 1 2 Xk f =1 2 4 Xp i=1 xi vi ;f !2 Xp i=1 (xi vi ;f )2 3 5 Steen Rendle 27 / 53
  • 52. Factorization Models Polynomial Regression Factorization Machines Applications Summary Ecient Computation The model equation of an FM can be computed in O(p k). Proof: ^y(x) := w0 + Xp i=1 wi xi + Xp i=1 Xp ji hvi ; vj i xi xj = w0 + Xp i=1 wi xi + 1 2 Xk f =1 2 4 Xp i=1 xi vi ;f !2 Xp i=1 (xi vi ;f )2 3 5 I In the sums over i , only non-zero xi elements have to be summed up ) O(Nz (x) k). I (The complexity of polynomial regression is O(Nz (x)2).) Steen Rendle 27 / 53
  • 53. Factorization Models Polynomial Regression Factorization Machines Applications Summary Multilinearity FMs are multilinear: 8 2 = fw0;w1; : : : ;wp; v1;1; : : : ; vp;kg : ^y(x; ) = h()(x) + g()(x) where g() and h() do not depend on the value of . Steen Rendle 28 / 53
  • 54. Factorization Models Polynomial Regression Factorization Machines Applications Summary Multilinearity FMs are multilinear: 8 2 = fw0;w1; : : : ;wp; v1;1; : : : ; vp;kg : ^y(x; ) = h()(x) + g()(x) where g() and h() do not depend on the value of . E.g. for second order eects ( = vl ;f ): ^y(x; vl;f ) := g(vl;f )(x) z }| { w0 + Xp i=1 wi xi + Xp i=1 Xp j=i+1 Xk f 0=1 (f 06=f )_(l62fi ;jg) vi ;f 0 vj;f 0 xi xj + vl;f xl X i=1;i6=l vi ;f xi | {z } h(vl;f )(x) Steen Rendle 28 / 53
  • 55. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Model Examples Properties Learning libFM Software Applications Summary Steen Rendle 29 / 53
  • 56. Factorization Models Polynomial Regression Factorization Machines Applications Summary Learning Using these properties, learning algorithms can be developed: I L2-regularized regression and classi
  • 57. cation: I Stochastic gradient descent [Rendle, 2010] I Alternating least squares/ Coordinate Descent [Rendle et al., 2011, Rendle 2012] I Markov Chain Monte Carlo (for Bayesian FMs) [Freudenthaler et al. 2011, Rendle 2012] I L2-regularized ranking: I Stochastic gradient descent [Rendle, 2010] All the proposed learning algorithms have a runtime of O(k Nz (X) i ), where i is the number of iterations and Nz (X) the number of non-zero elements in the design matrix X. Steen Rendle 30 / 53
  • 58. Factorization Models Polynomial Regression Factorization Machines Applications Summary Stochastic Gradient Descent (SGD) I For each training case (x; y) 2 S, SGD updates the FM model parameter using: 0 = (^y(x) y)h()(x) + () I is the learning rate / step size. I () is the regularization value of the parameter . I SGD can easily be applied to other loss functions. Steen Rendle 31 / 53 [Rendle, 2010]
  • 59. Factorization Models Polynomial Regression Factorization Machines Applications Summary Coordinate Descent (CD) I CD updates each FM model parameter using: 0 = P (x;y)2S y g()(x) h()(x) P (x;y)2S h2 ()(x) + () I Using caches of intermediate results, the runtime for updating all model parameters is O(k Nz (X)). I CD can be extended to classi
  • 60. cation [Rendle, 2012]. Steen Rendle 32 / 53 [Rendle et al., 2011]
  • 61. Factorization Models Polynomial Regression Factorization Machines Applications Summary Gibbs Sampling (MCMC) I Gibbs sampling with a block for each FM model parameter : jS; n fg N P (x;y)2S y g()(x) h()(x) P (x;y)2S h2 ()(x) + () ; 1 P (x;y)2S h2 ()(x) + () ! I Mean is the same as for CD ) computational complexity is also O(k Nz (X)). I MCMC can be extended to classi
  • 62. cation using link functions. Steen Rendle 33 / 53 [Freudenthaler et al. 2011, Rendle 2012]
  • 63. Factorization Models Polynomial Regression Factorization Machines Applications Summary Learning Regularization Values  v ,v yi w ,w wj w0 xij i=1,...,n v j  j=1,...,p  , 0 ,0 yi wj w0 w xij i=1,...,n v j  j=1,...,p w0 ,w0 0 ,0 w0 ,w0 w  v v Standard FM with priors. Two level FM with hyperpriors. Steen Rendle 34 / 53 [Freudenthaler et al., 2011]
  • 64. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Model Examples Properties Learning libFM Software Applications Summary Steen Rendle 35 / 53
  • 65. Factorization Models Polynomial Regression Factorization Machines Applications Summary libFM Software libFM is an implementation of FMs I Model: second-order FMs I Learning/ inference: SGD, ALS, MCMC I Classi
  • 66. cation and regression I Uses the same data format as LIBSVM, LIBLINEAR [Lin et. al], SVMlight [Joachims]. I Supports variable grouping. I Open source: GPLv3. Steen Rendle 36 / 53 [http://www.libfm.org/]
  • 67. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 37 / 53
  • 68. Factorization Models Polynomial Regression Factorization Machines Applications Summary (Context-aware) Rating Prediction I Main variables: I User ID (categorical) I Item ID (categorical) I Additional variables: I time I mood I user pro
  • 69. le I item meta data I . . . I Examples: Net ix prize, Movielens, KDDCup 2011 + ♪ + + Song User Time Mood Steen Rendle 38 / 53
  • 70. Factorization Models Polynomial Regression Factorization Machines Applications Summary Net ix Prize Netflix Prize: Prediction Error Public Leaderboard RMS Error 0.86 0.87 0.88 0.89 0.90 user, movie user, movie, day user, movie, impl. user, movie, day, impl. SGD Matrix Factorization user, movie, day, impl., freq, lin. day $1M Prize I k = 128 factors, 512 MCMC samples (no burnin phase, initialization from random) I MCMC inference (no hyperparameters (learning rate, regularization) to specify) Steen Rendle 39 / 53
  • 71. Factorization Models Polynomial Regression Factorization Machines Applications Summary Net ix Prize Method (Name) Ref. Learning Method k Quiz RMSE Models using user ID and item ID Probabilistic Matrix Factorization [14, 13] Batch GD 40 *0.9170 Probabilistic Matrix Factorization [14, 13] Batch GD 150 0.9211 Matrix Factorization [6] Variational Bayes 30 *0.9141 Matchbox [15] Variational Bayes 50 *0.9100 ALS-MF [7] ALS 100 0.9079 ALS-MF [7] ALS 1000 *0.9018 SVD/ MF [3] SGD 100 0.9025 SVD/ MF [3] SGD 200 *0.9009 Bayesian Probablistic Matrix Factorization [13] MCMC 150 0.8965 (BPMF) Bayesian Probablistic Matrix Factorization (BPMF) [13] MCMC 300 *0.8954 FM, pred. var: user ID, movie ID - MCMC 128 0.8937 Models using implicit feedback Probabilistic Matrix Factorization with Cons- traints [14] Batch GD 30 *0.9016 SVD++ [3] SGD 100 0.8924 SVD++ [3] SGD 200 *0.8911 BSRM/F [18] MCMC 100 0.8926 BSRM/F [18] MCMC 400 *0.8874 FM, pred. var: user ID, movie ID, impl. - MCMC 128 0.8865 Steen Rendle 40 / 53
  • 72. Factorization Models Polynomial Regression Factorization Machines Applications Summary Net ix Prize Method (Name) Ref. Learning Method k Quiz RMSE Models using time information Bayesian Probabilistic Tensor Factorization [17] MCMC 30 *0.9044 (BPTF) FM, pred. var: user ID, movie ID, day - MCMC 128 0.8873 Models using time and implicit feedback timeSVD++ [5] SGD 100 0.8805 timeSVD++ [5] SGD 200 *0.8799 FM, pred. var: user ID, movie ID, day, impl. - MCMC 128 0.8809 FM, pred. var: user ID, movie ID, day, impl. - MCMC 256 0.8794 Assorted models BRISMF/UM NB corrected [16] SGD 1000 *0.8904 BMFSI plus side information [8] MCMC 100 *0.8875 timeSVD++ plus frequencies [4] SGD 200 0.8777 timeSVD++ plus frequencies [4] SGD 2000 *0.8762 FM, pred. var: user ID, movie ID, day, impl., - MCMC 128 0.8779 freq., lin. day FM, pred. var: user ID, movie ID, day, impl., freq., lin. day - MCMC 256 0.8771 Steen Rendle 40 / 53
  • 73. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 41 / 53
  • 74. Factorization Models Polynomial Regression Factorization Machines Applications Summary Link Prediction in Social Networks I Main variables: I Actor A ID I Actor B ID I Additional variables: I pro
  • 75. les I actions I . . . + Actor A Actor B Steen Rendle 42 / 53
  • 76. Factorization Models Polynomial Regression Factorization Machines Applications Summary KDDCup 2012: Track 1 KDDCup 2012 Track 1: Prediction Quality Public Leaderboard Private Leaderboard Mean Average Precision @3 0.32 0.34 0.36 0.38 0.40 0.42 none gender, age, ... keywords friends all none gender, age, ... keywords friends all Top 1 Top 5 Top 10 Top 100 I k = 22 factors, 512 MCMC samples (no burnin phase, initialization from random) I MCMC inference (no hyperparameters (learning rate, regularization) to specify) Steen Rendle 43 / 53 [Awarded 2nd place (out of 658 teams)]
  • 77. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 44 / 53
  • 78. Factorization Models Polynomial Regression Factorization Machines Applications Summary Clickthrough Prediction I Main variables: I User ID I Query ID I Ad/ Link ID I Additional variables: I query tokens I user pro
  • 79. le I . . . + keyword... + Link 1 Link 2 Link 3 User Query Ad/ Link Steen Rendle 45 / 53
  • 80. Factorization Models Polynomial Regression Factorization Machines Applications Summary KDDCup 2012: Track 2 Model Inference wAUC (public) wAUC (private) ID-based model (k = 0) SGD 0.78050 0.78086 Attribute-based model (k = 8) MCMC 0.77409 0.77555 Mixed model (k = 8) SGD 0.79011 0.79321 Final ensemble n/a 0.79857 0.80178 Ensemble I Rank positions (not predicted clickthrough rates) are used. I The MCMC attribute-based model and dierent variations of the . SGD models are included. Steen Rendle 46 / 53 [Awarded 3rd place (out of 171 teams)]
  • 81. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 47 / 53
  • 82. Factorization Models Polynomial Regression Factorization Machines Applications Summary ECML/PKDD Discovery Challenge 2013 I Problem: Recommend given names. I Main variables: I User ID I Name ID I Additional variables: I session info I string representation for each name I . . . I FM approach won 1st place (online track) and 2nd (oine track). Steen Rendle 48 / 53
  • 83. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 49 / 53
  • 84. Factorization Models Polynomial Regression Factorization Machines Applications Summary Student Performance Prediction I Main variables: I Student ID I Question ID I Additional variables: I question hierarchy I sequence of questions I skills required I . . . I Examples: KDDCup 2010, Grockit Challenge4 (FM placed 1st/241) + ? Student Question 4http://www.kaggle.com/c/WhatDoYouKnow Steen Rendle 50 / 53
  • 85. Factorization Models Polynomial Regression Factorization Machines Applications Summary Outline Factorization Models Polynomial Regression Factorization Machines Applications Recommender Systems Link Prediction in Social Networks Clickthrough Prediction Personalized Ranking Student Performance Prediction Kaggle Competitions Summary Steen Rendle 51 / 53
  • 86. Factorization Models Polynomial Regression Factorization Machines Applications Summary Kaggle Competitions FMs have been successfully applied to several Kaggle competitions: I Criteon Display Advertising Challenge: 1st place (team '3 idiots'). I Blue Book for Bulldozers: 1st place (team 'Leustagos Titericz'). I EMI Music Data Science Hackathon: 2nd place (team 'lns'). Steen Rendle 52 / 53
  • 87. Factorization Models Polynomial Regression Factorization Machines Applications Summary Summary I Factorization machines combine linear/polynomial regression with factorization models. I Feature interactions are learned with a low rank representation. I Estimation of unobserved interactions is possible. I Factorization machines can be computed eciently and have high prediction quality. Steen Rendle 53 / 53
  • 88. Factorization Models Polynomial Regression Factorization Machines Applications Summary L. Drumond, S. Rendle, and L. Schmidt-Thieme. Predicting rdf triples in incomplete knowledge bases with tensor factorization. In Proceedings of the 27th Annual ACM Symposium on Applied Computing, SAC '12, pages 326{331, New York, NY, USA, 2012. ACM. C. Freudenthaler, L. Schmidt-Thieme, and S. Rendle. Bayesian factorization machines. In NIPS workshop on Sparse Representation and Low-rank Approximation, 2011. Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative
  • 89. ltering model. In KDD '08: Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 426{434, New York, NY, USA, 2008. ACM. Y. Koren. The bellkor solution to the net ix grand prize. 2009. Steen Rendle 53 / 53
  • 90. Factorization Models Polynomial Regression Factorization Machines Applications Summary Y. Koren. Collaborative
  • 91. ltering with temporal dynamics. In KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 447{456, New York, NY, USA, 2009. ACM. Y. J. Lim and Y. W. Teh. Variational Bayesian approach to movie rating prediction. In Proceedings of KDD Cup and Workshop, 2007. I. Pilaszy, D. Zibriczky, and D. Tikk. Fast als-based matrix factorization for explicit and implicit feedback datasets. In RecSys '10: Proceedings of the fourth ACM conference on Recommender systems, pages 71{78, New York, NY, USA, 2010. ACM. I. Porteous, A. Asuncion, and M. Welling. Bayesian matrix factorization with side information and dirichlet process mixtures. In Proceedings of the Twenty-Fourth AAAI Conference on Arti
  • 92. cial Intelligence, AAAI 2010, pages 563{568, 2010. Steen Rendle 53 / 53
  • 93. Factorization Models Polynomial Regression Factorization Machines Applications Summary S. Rendle. Factorization machines. In Proceedings of the 2010 IEEE International Conference on Data Mining, ICDM '10, pages 995{1000, Washington, DC, USA, 2010. IEEE Computer Society. S. Rendle. Factorization machines with libFM. ACM Trans. Intell. Syst. Technol., 3(3):57:1{57:22, May 2012. S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme. Factorizing personalized markov chains for next-basket recommendation. In WWW '10: Proceedings of the 19th international conference on World wide web, pages 811{820, New York, NY, USA, 2010. ACM. S. Rendle, Z. Gantner, C. Freudenthaler, and L. Schmidt-Thieme. Fast context-aware recommendations with factorization machines. In Proceedings of the 34th ACM SIGIR Conference on Reasearch and Development in Information Retrieval. ACM, 2011. R. Salakhutdinov and A. Mnih. Steen Rendle 53 / 53
  • 94. Factorization Models Polynomial Regression Factorization Machines Applications Summary Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th international conference on Machine learning, ICML '08, pages 880{887, New York, NY, USA, 2008. ACM. R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 1257{1264, Cambridge, MA, 2008. MIT Press. D. H. Stern, R. Herbrich, and T. Graepel. Matchbox: large scale online bayesian recommendations. In Proceedings of the 18th international conference on World wide web, WWW '09, pages 111{120, New York, NY, USA, 2009. ACM. G. Takacs, I. Pilaszy, B. Nemeth, and D. Tikk. Scalable collaborative
  • 95. ltering approaches for large recommender systems. J. Mach. Learn. Res., 10:623{656, June 2009. Steen Rendle 53 / 53
  • 96. Factorization Models Polynomial Regression Factorization Machines Applications Summary L. Xiong, X. Chen, T.-K. Huang, J. Schneider, and J. G. Carbonell. Temporal collaborative
  • 97. ltering with bayesian probabilistic tensor factorization. In Proceedings of the SIAM International Conference on Data Mining, pages 211{222. SIAM, 2010. S. Zhu, K. Yu, and Y. Gong. Stochastic relational models for large-scale dyadic data using MCMC. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 1993{2000, 2009. Steen Rendle 53 / 53