SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
1. SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
Dr. Matthew Rowe
School of Computing and Communications
@mrowebot | m.rowe@lancaster.ac.uk
International Conference on Web Intelligence 2014
Warsaw, Poland
3. Latent Factor Models: Factor Consistency Problem
[Figure: a toy 3x3 user-item ratings matrix factorised into F latent factors, re-learnt at successive time points, with unknown alignment between the factor spaces]
• Cannot "accurately" align latent factors between models trained at different times
• Cannot tell how users' tastes have evolved
• F = #factors (set a priori)
4. Solution: Semantic Categories
[Figure: the same toy ratings matrix, with each item i mapped to a <URI> and its set of {<SKOS_CATEGORY>} entries]
• c = dimensionality of the category space
• Model the user's preference for category c at time s
5. Semantic Alignment of Datasets
For each movie item:
1. SPARQL query for candidate URIs from the movie's title
2. Get the semantic categories of each candidate
3. Disambiguate based on the movie's year
Output: {(ItemID, <URI>)}
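The candidate-lookup step can be sketched as a query builder (a minimal illustration; the query shape, variable names, and label-matching strategy are assumptions, not the paper's exact query):

```python
def candidate_uri_query(title: str) -> str:
    """Build a SPARQL query returning candidate DBpedia URIs (and their
    dcterms:subject categories) whose label contains the movie title.
    Hypothetical query shape for illustration only."""
    safe = title.replace('"', '\\"')  # naive escaping for the illustration
    return f"""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?uri ?category WHERE {{
  ?uri rdfs:label ?label ;
       dcterms:subject ?category .
  FILTER (CONTAINS(LCASE(STR(?label)), LCASE("{safe}")))
}}
"""

query = candidate_uri_query("Alien")
```

The returned candidates would then be filtered by release year, as in the pipeline above.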
[Fig. 1. Distribution of reviews per day across the MovieLens and MovieTweetings datasets. The first dashed blue line indicates the cutoff point for the training set, and the dashed red line indicates the cutoff point for the test set, i.e. every rating after that point is placed in the test set. The validation set contains the ratings between the blue and red dashed lines.]
For the film "Alien", released in 1979, which we shall now use as a running example, the following categories are found:

<http://dbpedia.org/resource/Alien_(film)>
    dcterms:subject category:Alien_(franchise)_films ;
    dcterms:subject category:1979_horror_films .

In this work we use DBpedia URIs, given their relation to Wikipedia pages.
6. Reduced Recommendation Datasets
• Semantic alignment = fewer elements
• Time-ordered datasets split for experiments: 80%/10%/10% for training/validation/testing
We also note that the reduction in the number of ratings is not as great; this suggests two things: (i) mapped items are popular, and thus dominate the ratings; and (ii) obscure items are present within the data.
TABLE I. Statistics of the revised review datasets used for our analysis and experiments. Reductions over the original datasets are shown in parentheses.

Dataset         #Users        #Items          #Ratings
MovieLens       5,390 (-11%)  3,231 (-12.1%)  841,602 (-6.7%)
MovieTweetings  2,357 (-89%)  7,913 (-30.8%)  73,397 (-38.2%)
Total           7,747         11,144          914,999
As Table I suggests, certain more "obscure" movies do not have DBpedia URIs; despite our use of the most recent DBpedia datasets (i.e. version 3.9), coverage is still limited in certain places. The reason for this lack of coverage is largely that an obscure film has no Wikipedia page. For instance, for the MovieLens dataset we fail to map the three movies "Never Met Picasso", "Diebinnen" and "Follow the Bitch": despite these films having IMDb pages, they have no Wikipedia page, and hence no DBpedia entry. For the MovieTweetings dataset we fail to map "Summer Coda" and …
Hipster Dilemma: occurs when obscure movie items cannot be aligned to Semantic Web URIs!
7. Forming Semantic Taste Profiles
1. Split the user's training ratings into 5 lifecycle stages
2. For each stage, derive the user's average rating per semantic category
3. Calculate the probability of the user rating each category highly: the taste profile P^u_s
From this point onwards we reserve characters for set notations, as follows: u denotes users, and i, j denote items. r denotes a known rating value (where r ∈ [1, 5] or r ∈ [1, 10], depending on the dataset), and r̂ denotes a predicted rating value. Ratings are provided as quadruples of the form (u, i, r, t), where t denotes the time of the rating, segmented into training (D_train), validation (D_valid) and test (D_test) sets by the above-mentioned splits. c denotes a semantic category that an item has been mapped to; cats(i) is a convenience function that returns the set of semantic categories of item i.

Taste profiles describe the preferences that a user holds at a point in time for given semantic categories. Understanding how a profile at one point in time relates to a profile at an earlier point in time indicates whether taste evolution has taken place. In prior work by McAuley and Leskovec [5], the assessment of taste evolution in the context of review platforms (e.g. BeerAdvocate and RateBeer) demonstrated the propensity of users' tastes to change over time.
From these definitions we then derived the discrete probability distribution of the user rating the category favourably as follows, defining the set $C^{u,s}_{train}$ as containing all unique categories of items rated by u in stage s:

$$\Pr(c \mid D^{u,s}_{train}) = \frac{avrating(D^{u,s,c}_{train})}{\sum_{c' \in C^{u,s}_{train}} avrating(D^{u,s,c'}_{train})} \qquad (4)$$
When implementing this approach, we only consider the categories that item URIs are directly mapped to; that is, only those categories that are connected to the URI by the dcterms:subject predicate. Prior work by Ostuni et al. [8] performed a mapping where grandparent categories were mapped to URIs; however, we chose the parent categories in this instance to open up the possibility of other mappings in the future, i.e. via linked data node vertex kernels.
B. User Taste Evolution: From Prior Taste Profiles

We now turn to looking at the evolution of users' tastes over time in order to understand how their preferences change. Given our use of probability distributions to model the lifecycle-stage-specific taste profile of each user, we can apply information-theoretic measures based on information entropy. One such measure is conditional entropy, which enables one to assess the user's ratings distribution per semantic category within the allotted time window (provided by the lifecycle stage of the user, as this denotes a closed interval, i.e. $s = [t, t'],\ t < t'$).
We formed a discrete probability distribution for category c at time period $s \in S$ (where S is the set of 5 lifecycle stages) by interpolating the user's ratings within the distribution. We first defined two sets, the former ($D^{u,s,c}_{train}$) corresponding to the ratings by u during period/stage s for items from category c, and the latter ($D^{u,s}_{train}$) corresponding to ratings by u during s, hence $D^{u,s,c}_{train} \subseteq D^{u,s}_{train}$; these sets are formed as follows:

$$D^{u,s,c}_{train} = \{(u, i, r, t) : (u, i, r, t) \in D_{train},\ t \in s,\ c \in cats(i)\} \qquad (1)$$

$$D^{u,s}_{train} = \{(u, i, r, t) : (u, i, r, t) \in D_{train},\ t \in s\} \qquad (2)$$
We then defined the function avrating to derive the average rating value from all rating quadruples in a given set:

$$avrating(D^{u,s}_{train}) = \frac{1}{|D^{u,s}_{train}|} \sum_{(u,i,r,t) \in D^{u,s}_{train}} r \qquad (3)$$
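The stage-specific sets and Eqs. 1-4 can be sketched as follows (a minimal illustration with made-up quadruples; the `cats` dictionary is a stand-in for the item-to-category mapping):

```python
from collections import defaultdict

# Toy training quadruples (user, item, rating, time) and item categories.
D_train = [("u1", "i1", 5, 1), ("u1", "i2", 3, 2), ("u1", "i3", 4, 2)]
cats = {"i1": {"horror"}, "i2": {"sci-fi"}, "i3": {"horror", "sci-fi"}}

def avrating(quads):
    """Eq. 3: mean rating over a set of rating quadruples."""
    return sum(r for (_, _, r, _) in quads) / len(quads)

def stage_profile(D, user, stage_times):
    """Eqs. 1, 2 and 4: per-category rating distribution for one stage."""
    D_us = [q for q in D if q[0] == user and q[3] in stage_times]   # Eq. 2
    per_cat = defaultdict(list)
    for q in D_us:
        for c in cats[q[1]]:
            per_cat[c].append(q)                                    # Eq. 1
    av = {c: avrating(qs) for c, qs in per_cat.items()}
    z = sum(av.values())
    return {c: v / z for c, v in av.items()}                        # Eq. 4

P_u1 = stage_profile(D_train, "u1", stage_times={1, 2})
```

Here `P_u1` sums to 1 over the categories rated in the stage, as Eq. 4 requires.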
C. User Taste Evolution: From Global Tastes

Our second measure uses transfer entropy to assess whether the local taste at step s has been influenced by the local and global tastes at the previous stage (s−1).
8. Taste Evolution from Taste Profiles
[Fig. 2. Conditional entropy between consecutive lifecycle stages (e.g. H(P_2 | P_3)) across the datasets, together with the bounds of the 95% confidence interval for the derived means. (a) MovieLens; (b) MovieTweetings.]
…users who posted ratings within the time interval of stage s. Now, assume that we have a random variable that describes the local categories that have been reviewed at the current stage ($Y_s$), a random variable of local categories at the previous stage ($Y_{s-1}$), and a third random variable of global categories at the previous stage ($X_{s-1}$); we then define the transfer entropy of one lifecycle stage to another as follows [11]:

$$T_{X \to Y} = H(Y_s \mid Y_{s-1}) - H(Y_s \mid Y_{s-1}, X_{s-1}) \qquad (6)$$
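The two measures can be sketched on discrete distributions in pure Python (the joint distributions here are toy dictionaries, not the paper's data):

```python
from math import log2

def cond_entropy(joint):
    """H(Y|X) from a joint distribution {(x, y): p}."""
    px = {}
    for (x, _), p in joint.items():
        px[x] = px.get(x, 0.0) + p
    return -sum(p * log2(p / px[x]) for (x, _), p in joint.items() if p > 0)

def transfer_entropy(joint_yyx):
    """Eq. 6: T_{X->Y} = H(Y_s | Y_{s-1}) - H(Y_s | Y_{s-1}, X_{s-1}),
    given a joint distribution {(y_prev, x_prev, y_now): p}."""
    # Marginalise out x_prev to get the joint of (y_prev, y_now).
    joint_yy = {}
    for (yp, xp, y), p in joint_yyx.items():
        joint_yy[(yp, y)] = joint_yy.get((yp, y), 0.0) + p
    h_y_given_yprev = cond_entropy(joint_yy)
    # Condition on the pair (y_prev, x_prev) jointly.
    h_y_given_both = cond_entropy(
        {((yp, xp), y): p for (yp, xp, y), p in joint_yyx.items()})
    return h_y_given_yprev - h_y_given_both
```

When the global categories carry no extra information about the current stage, the transfer entropy is zero, which is the intuition behind the "global influence" reading of Fig. 3.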
Prior Tastes Comparison
• Computed conditional entropy between consecutive profiles
• Increase: divergence from prior tastes
• Both datasets' users diverge from prior tastes
[Fig. 3. Transfer entropy between consecutive lifecycle stages across the datasets, together with the bounds of the 95% confidence interval for the derived means. (a) MovieLens; (b) MovieTweetings.]
Global Influence
• Computed transfer entropy of how global tastes have influenced users' tastes
• Decrease: global tastes have a stronger influence than prior tastes
• Difference between datasets in global influence's role
9. Putting it all together: SemanticSVD++!
…named SemanticSVD++, an extension of Koren et al.'s earlier SVD++ model [2]. The predictive function of the model is shown in full in Eq. 8; we now explain each component in greater detail.

$$\hat{r}_{ui} = \overbrace{\mu + b_i + b_u}^{\text{Static Biases}} + \overbrace{\alpha_i b_{i,cats(i)} + \alpha_u b_{u,cats(i)}}^{\text{Category Biases}} + \overbrace{q_i^{\top} \Big( p_u + |R(u)|^{-\frac{1}{2}} \sum_{j \in R(u)} y_j + |cats(R(u))|^{-\frac{1}{2}} \sum_{c \in cats(R(u))} z_c \Big)}^{\text{Personalisation Component}} \qquad (8)$$

A. Static Biases
Modified version of SVD++ with:
• User taste evolution captured in semantic category biases
• Semantic personalisation component: latent factor vectors z_c for each of the categories rated by the user
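The predictive function in Eq. 8 can be sketched with numpy (all inputs below are made-up toy values; in the real model they are learnt by SGD):

```python
import numpy as np

def predict(mu, b_i, b_u, b_i_cats, b_u_cats, alpha_i, alpha_u,
            q_i, p_u, y_js, z_cs):
    """Eq. 8: static biases + weighted category biases + personalisation."""
    static = mu + b_i + b_u
    category = alpha_i * b_i_cats + alpha_u * b_u_cats
    implicit = p_u \
        + len(y_js) ** -0.5 * np.sum(y_js, axis=0) \
        + len(z_cs) ** -0.5 * np.sum(z_cs, axis=0)
    return static + category + q_i @ implicit

rng = np.random.default_rng(0)
f = 4  # number of latent factors
r_hat = predict(3.5, 0.1, -0.2, 0.6, 0.7, 0.5, 0.5,
                rng.normal(size=f), rng.normal(size=f),
                rng.normal(size=(3, f)),   # y_j for 3 rated items R(u)
                rng.normal(size=(5, f)))   # z_c for 5 rated categories
```

With all latent vectors at zero, the prediction collapses to the bias terms alone, which makes the role of the personalisation component easy to see.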
10. Incorporating Taste Evolution with Biases
Given that a single item can be linked to many categories on the web of linked data, we take the average across all categories as the bias of the user given the categories of the item:

$$b_{u,cats(i)} = \frac{1}{|cats(i)|} \sum_{c \in cats(i)} \Pr(+ \mid c, u) \qquad (15)$$

Other schemes for calculating the biases towards categories (both item and user) could be used, e.g. choosing the maximum bias; however, we use the average as an initial scheme.
3) Weighting Category Biases: The above category biases are derived as static features within the recommendation model (Eq. 8), mined from the provided training portion; however, each user may be influenced by these factors in different ways when performing their ratings. To this end we included two weights, one for each category bias, defined as $\alpha_i$ and $\alpha_u$ for the item biases to categories and the user biases to categories respectively. As we will explain below, these weights are then learnt during the training phase of inducing the model.
The static biases include the general bias of the given dataset (μ), which is the mean rating score across all ratings in the training segment, together with the item bias ($b_i$) and the user bias ($b_u$).
1) Item Biases Towards Categories: We measured, per category, the proportional change in the probability of the category being rated highly ($Q_s(c)$) over the lifecycle stages, where k is the stage from which a monotonic increase or decrease in the probability of rating category c began:

$$\Delta_c = \frac{1}{4-k} \sum_{s=k}^{4} \frac{Q_{s+1}(c) - Q_s(c)}{Q_s(c)} \qquad (9)$$

From this we then calculated the conditional probability of a given category being rated highly by accounting for the change rate of rating preference for the category as follows:

$$\Pr(+ \mid c) = \overbrace{Q_5(c)}^{\text{Prior Rating}} + \overbrace{\Delta_c Q_5(c)}^{\text{Change Rate}} \qquad (10)$$

By averaging this over all categories for the item i we can calculate the evolving item bias from the provided training segment:

$$b_{i,cats(i)} = \frac{1}{|cats(i)|} \sum_{c \in cats(i)} \Pr(+ \mid c) \qquad (11)$$
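Eqs. 9-11 can be sketched as follows (the stage probabilities `Q[s][c]` are toy values; `k` marks where the monotonic run starts and must be below 4 for the formula's denominator):

```python
def change_rate(Q, c, k):
    """Eq. 9: average proportional change in Q_s(c) over stages k..4 (k < 4)."""
    return sum((Q[s + 1][c] - Q[s][c]) / Q[s][c] for s in range(k, 5)) / (4 - k)

def pr_plus(Q, c, k):
    """Eq. 10: prior rating Q_5(c) plus the change-rate term."""
    return Q[5][c] + change_rate(Q, c, k) * Q[5][c]

def item_bias(Q, item_cats, k):
    """Eq. 11: average Pr(+|c) over the item's categories."""
    return sum(pr_plus(Q, c, k) for c in item_cats) / len(item_cats)

# Toy stage probabilities Q[s][c] for stages 1..5: "horror" grows steadily.
Q = {s: {"horror": 0.1 * s} for s in range(1, 6)}
bias = item_bias(Q, {"horror"}, k=1)
```

A growing category probability yields a positive change rate, inflating the item bias beyond the final-stage prior, which is the intended "evolving bias" behaviour.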
2) User Biases Towards Categories: In the previous section, we induced per-user discrete probability distributions that captured the probability of the user u rating a given category c highly during lifecycle stage s: $P^u_s(c)$. Given that users' tastes evolve, our goal is to estimate the probability of the user rating an item highly given its categories, by capturing how the user's preferences for each category have changed in the past (decaying or growing). To capture the development of a user's preference for a category we derived the average change rate ($\Delta^u_c$) over the k lifecycle periods coming before the final lifecycle stage in the training set. The parameter k is the number of stages back in the training segment from which either a monotonic increase or decrease in the probability of rating category c began.

We also computed the transfer entropy for each user over time, modelling this as a global influence factor ($\phi^u$). We derive this as follows, based on measuring the proportional change in transfer entropy starting from the lifecycle period k that produced a monotonic increase or decrease in transfer entropy:

$$\phi^u = \frac{1}{4-k} \sum_{s=k}^{4} \frac{T^{s+1|s}_{Q \to P} - T^{s|s-1}_{Q \to P}}{T^{s|s-1}_{Q \to P}} \qquad (13)$$

By combining the average change rate ($\Delta^u_c$) of the user highly rating a given category c with the global influence factor ($\phi^u$), we then derived the conditional probability of a user rating a given category highly as follows, where $P^u_5$ denotes the taste profile of the user observed for the final lifecycle stage (5):

$$\Pr(+ \mid c, u) = \overbrace{P^u_5(c)}^{\text{Prior Rating}} + \overbrace{\Delta^u_c P^u_5(c)}^{\text{Change Rate}} + \overbrace{\phi^u Q_5(c)}^{\text{Global Influence}} \qquad (14)$$

Slide annotations: $Q_5(c)$ is the global category rating probability; $\phi^u$ is the user's average change in transfer entropy.
C. Personalisation Component

The personalisation component of the model builds on the existing SVD++ model [2]. The modified model has four latent factor vectors: $q_i$ denotes the f latent factors associated with item i; $p_u$ denotes the f latent factors associated with user u; $y_j$ denotes the f latent factors for item j from the set of rated items by user u, R(u); and we have defined $z_c \in \mathbb{R}^f$, which captures the latent factor vector for a given semantic category c. We denote this additional component as the category factors.
• General category biases: $b_{i,cats(i)}$
• User biases to categories: $b_{u,cats(i)}$
11. Evaluation Setup
• Tested three models (trained using Stochastic Gradient Descent):
  • SVD++ (baseline)
  • SB-SVD++: SVD++ with semantic category biases
  • S-SVD++: SB-SVD++ with the personalisation component
• Tuned hyperparameters over the validation splits
• Model testing: trained models with tuned hyperparameters using both the training and validation splits, then applied them to the held-out final 10% of reviews
• Evaluation measure: Root Mean Square Error (RMSE)
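The evaluation measure can be computed as follows (the standard RMSE definition; the values in the usage line are illustrative, not the paper's results):

```python
from math import sqrt

def rmse(pairs):
    """Root Mean Square Error over (predicted, actual) rating pairs."""
    return sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))

error = rmse([(4.0, 4.0), (3.0, 5.0)])
```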
12. Evaluation Results
• Significantly outperformed the SVD++ baseline
• MovieLens: the full model (S-SVD++) produces significantly superior performance
• MovieTweetings: marginal difference between SB-SVD++ and S-SVD++

TABLE III. Root Mean Square Error (RMSE) of the three models across the two datasets. Each dataset's best model is marked with *, with the p-value from the Mann-Whitney test against the next best model.

Model     MovieLens           MovieTweetings
SVD++     1.520               0.969
SB-SVD++  1.517               0.963
S-SVD++   1.513* (p < 0.001)  0.963* (p < 0.1)
13. Conclusions
• Semantic taste profiles can track users' tastes:
  • Overcomes the factor consistency problem
  • Enables modelling of global taste influence
  • SemanticSVD++ boosts recommendation performance
• Semantic categories are limited, however:
  • Hipster dilemma
  • Cold-start categories
14. Cold-start Categories
[Figure: an item linked via dcterms:subject to categories dbpedia:c1-dbpedia:c5, where dbpedia:c4 and dbpedia:c5 are categories unrated by the user despite 5* and 4* ratings on related items]
Transferring Semantic Categories with Vertex Kernels: Recommendations with SemanticSVD++. M. Rowe. To appear in the proceedings of the International Semantic Web Conference. Trentino, Italy. (2014)