Recommender
systems to help
people move
forward
RecSysNL meetup
Oct 17, 2018
Martijn Willemsen
M.C.Willemsen@tue.nl
PI of the recommender LAB
http://www.martijnwillemsen.nl/
@mcwillemsen
Decision Making, Process tracing, Cognition,
Recommender Systems, online behavior,
e-coaching, Data Science
My lab has a strong user-centric RecSys focus…, why?
• Because I failed as a (real) engineer?
MSc in EE
2nd BSc:
Technology and
Society
PhD in Decision
Making
Recommender
Systems
MSc Technology
and Society
PostDoc in
Process Tracing
Electrical
Engineering
My lab has a strong user-centric RecSys focus…, why?
• Because it is easier to get papers into the RecSys conference?
Surely not….
Because just optimizing accuracy is not enough…
REVEAL Workshop RecSys2018: (Joe Konstan)
“…if the recommender systems we are building are trained
to predict the very items that user founds by themselves
without recommendation (yes, I’m looking at you
Precision_at_k), then the usefulness of the recommender
becomes very debatable. ”
https://medium.com/@olivier.koch/recsys-2018-recommender-systems-that-care-16389e43114c
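For reference, a minimal sketch of the metric named in the quote. The function name and the toy item ids are illustrative assumptions, not taken from the talk.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that appear in the held-out relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Toy example with hypothetical item ids
recommended = ["a", "b", "c", "d", "e"]
held_out = {"b", "e", "z"}  # items the user consumed without any recommendation
score = precision_at_k(recommended, held_out, k=5)
print(score)
```

The held-out set consists of items the user already found unaided, which is exactly why a high score says little about how useful the recommender is.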
And optimizing for engagement/behavior is tricky…
Netflix trades off popularity, diversity and accuracy
AB tests to test ranking between and within rows
Source: RecSys 2016, 18 Sept: Talk by Xavier Amatriain
http://www.slideshare.net/xamat/past-present-and-future-of-recommender-systems-and-industry-perspective
We don’t need the user:
Let’s do AB Testing!
Netflix used 5-star rating scales to get
input from users (apart from log data)
Netflix reported an AB test of thumbs
up/down versus rating:
Yellin (Netflix VP of product): “The result
was that thumbs got 200% more ratings
than the traditional star-rating feature.”
So is the 5-star rating wrong?
or just different information?
Should we only trust the behavior?
However, over time, Netflix
realized that explicit star
ratings were less relevant than
other signals. Users would rate
documentaries with 5 stars,
and silly movies with just 3
stars, but still watch silly
movies more often than those
high-rated documentaries.
http://variety.com/2017/digital/news/netflix-thumbs-vs-stars-1202010492/
Behavior versus Experience
Looking at behavior…
• Testing a recommender against a random videoclip system, the
number of clicked clips and total viewing time went down!
Looking at user experience…
• Users found what they liked faster, with fewer ineffective clicks…
Behaviorism is not enough!
(Ekstrand & Willemsen, RecSys 2016)
We need to measure user experience
and relate it to user behavior…
We need to understand user goals and
develop Rec. Systems that help users
attain these goals!
User-Centric Research can help us understand…
• How users perceive recommender algorithms (Ekstrand et al. 2014)
• WHY users are satisfied… even when we reduce accuracy
by diversification (Willemsen et al. 2016)
• What inaction (non-behavior) means… (Zhao et al. 2018)
Or even help build systems that help people move forward
• Energy saving recommendations (Starke et al. 2017)
• Music Genre Explorer (Liang and Willemsen, 2018)
User-Centric Framework
Computer scientists (and marketing researchers) would study behavior…
(they hate asking the user, or just cannot, as in AB tests)
User-Centric Framework
Psychologists and HCI people are mostly interested in experience…
User-Centric Framework
Though it helps to triangulate experience and behavior…
User-Centric Framework
Our framework adds the intermediate construct of perception, which explains why
behavior and experience change due to our manipulations
User-Centric Framework
• And adds personal and situational characteristics
• Relations modeled using factor analysis and SEM
Knijnenburg, B.P., Willemsen, M.C., Gantner, Z., Soncu, H., Newell, C. (2012). Explaining the
User Experience of Recommender Systems. User Modeling and User-Adapted Interaction
(UMUAI), vol 22, p. 441-504 http://bit.ly/umuai
User Perceptions of
Differences in
Recommender Algorithms
Joint work with GroupLens
Michael Ekstrand, Max Harper and Joseph Konstan
Ekstrand, M.D., Harper, F.M., Willemsen, M.C. & Konstan, J.A. (2014). User Perception of
Differences in Recommender Algorithms. In Proceedings of the 8th ACM conference on
Recommender systems (pp. 161–168). New York, NY, USA: ACM
Going beyond accuracy…
McNee et al. (2006): Accuracy is not enough
“study recommenders from a user-centric perspective to
make them not only accurate and helpful, but also a
pleasure to use”
But wait!
we don’t even know how the standard algorithms are
perceived… and what differences there are…
Compare 3 classic algorithms (Item-Item, User-User
and SVD) side by side (joint evaluation) in terms of
preference and perceptions
The task provided to the user
First impression
Perceived Diversity
& novelty and
satisfaction
Choice of algo
First look at the measurement model
• only measurement model relating the concepts (no
conditions)
• All concepts are relative comparisons
– e.g. if they think list A is more diverse than B, they are also more
satisfied with list A than B
What algorithms do users prefer?
528 users completed the
questionnaire
Joint evaluation, 3 pairs of
comparing A with B
User-User CF significantly
loses to the other two
Item-Item and SVD are on par
Why?
– User-user more novel than either SVD or item-item
– User-user more diverse than SVD
– Item-item slightly more diverse than SVD (but diversity didn't
affect satisfaction)
[Figure: share of users choosing each algorithm (0% to 100%) in the three pairwise comparisons: I-I v. U-U, I-I v. SVD, SVD v. U-U]
Objective measures
No accuracy differences, but consistent with subjective data
RQ2: User-user more novel, SVD somewhat less diverse
Aligning objective with subjective measures
Objective and subjective metrics correlate consistently
But their effects on choice are mediated by the subjective
perceptions!
(Objective) obscurity only influences satisfaction if it increases
perceived novelty (i.e. if it is registered by the user)
Conclusions
Novelty is not always good: complex, largely negative effect
Diversity is important for satisfaction
Diversity/accuracy tradeoff does not seem to hold…
Subjective Perceptions and experience mediate the effect
of objective measures on choice / preference for algorithm
Brings the ‘WHY’: e.g. User-user is less satisfactory and less
often chosen because of its obscure items (which are
perceived as novel)
Choice difficulty and
satisfaction in RecSys
Applying latent feature diversification
Willemsen, M.C., Graus, M.P, & Knijnenburg, B.P. (2016). Understanding the role of latent
feature diversification on choice difficulty and satisfaction. User Modeling and User-
Adapted Interaction (UMUAI), vol 26 (4), 347-389 doi:10.1007/s11257-016-9178-6
Seminal example of choice overload
Satisfaction decreases with larger sets as increased
attractiveness is counteracted by choice difficulty
Can we reduce difficulty while controlling attractiveness?
• Larger set: more attractive, 3% sales
• Smaller set: less attractive, 30% sales, higher purchase satisfaction
From Iyengar and Lepper (2000)
Koren, Y., Bell, R., and Volinsky, C. 2009. Matrix Factorization Techniques
for Recommender Systems. IEEE Computer 42, 8, 30–37.
Dimensionality reduction
Users and items are
represented as vectors on a
set of latent features
Rating is the dot product of
these vectors (overall utility!)
Gus will like Dumb and Dumber
but hate Color Purple
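The dot-product prediction can be sketched as follows. The two-dimensional latent space and all vector values are invented for illustration; a trained model would learn them from ratings and would usually add bias terms.

```python
import numpy as np

# Hypothetical latent vectors; the numbers are made up for illustration only.
user_gus = np.array([-1.2, 0.8])
items = {
    "Dumb and Dumber": np.array([-1.5, 1.0]),
    "The Color Purple": np.array([1.4, -0.9]),
}

def predicted_utility(user_vec, item_vec):
    """Predicted rating (ignoring bias terms) as the dot product of user and item vectors."""
    return float(np.dot(user_vec, item_vec))

scores = {title: predicted_utility(user_gus, vec) for title, vec in items.items()}
for title, value in scores.items():
    print(title, round(value, 2))
```

An item whose vector points in the same direction as the user's vector gets a high predicted utility; one pointing the opposite way gets a low one.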
Use the properties of Matrix Factorization algorithms!
Latent feature diversification: high diversity/equal attractiveness
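One way to get high diversity at roughly equal attractiveness is to diversify only within the highest-predicted items. The sketch below uses a greedy max-min selection with city-block distance in latent space; the function name, the distance choice, and the toy data are assumptions, and the procedure in the cited paper may differ in its details.

```python
import numpy as np

def diversify_top_n(item_vectors, predicted, n_candidates, list_size):
    """Greedy max-min selection in latent space among the top-N predicted items."""
    # Candidate pool: indices of the n_candidates highest predicted ratings
    candidates = [int(i) for i in np.argsort(-predicted)[:n_candidates]]
    # Start from the item with the highest predicted rating
    selected = [candidates.pop(0)]
    while len(selected) < list_size and candidates:
        # City-block distance from a candidate to its nearest already selected item
        def min_dist(idx):
            return min(np.abs(item_vectors[idx] - item_vectors[s]).sum() for s in selected)
        # Pick the candidate that is farthest from everything selected so far
        best = max(candidates, key=min_dist)
        candidates.remove(best)
        selected.append(best)
    return selected

# Hypothetical latent vectors and predicted ratings for five items
item_vectors = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [0.0, 5.0], [9.0, 9.0]])
predicted = np.array([5.0, 4.9, 4.8, 4.7, 1.0])
chosen = diversify_top_n(item_vectors, predicted, n_candidates=4, list_size=3)
print(chosen)
```

Restricting the pool to the top candidates keeps predicted attractiveness close to that of a plain top-N list, while the max-min step spreads the chosen items across the latent space.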
Latent Feature Diversification
Psychology-informed diversity manipulation
Increased perceived diversity & attractiveness
Reduced difficulty & increased satisfaction
Fewer hovers
More choice for lower-ranked items
Diversification    Rank of chosen
None (top 5)       3.6
Medium             14.5
High               77.6
[Figure: choice satisfaction (standardized score) by diversification level: none, med, high]
Higher satisfaction for high
diversification, despite choice for
lower predicted/ranked items
Interpreting User
Inaction in
Recommender Systems
Zhao, Q., Willemsen, M. C., Adomavicius, G., Maxwell Harper, F., & Konstan, J. A. (2018).
Interpreting user inaction in recommender systems. In Proceedings of the 12th ACM
Conference on Recommender Systems (pp. 40-48). New York: Association for Computing
Machinery, Inc. DOI: 10.1145/3240323.3240366
Action and Inaction in MovieLens.org
Add into a wishlist
Not interested
Rating
Click to see
details
How to interpret user inaction?
Randomly pick one and ask users.
7 Categories of User Inaction
• Did not notice it (38.6%)
• Noticed but watched it before (14.6%)
• Noticed and have not watched it yet (46.8%)
– Would not enjoy it (5.8%)
– Others are better (9.5%)
– Okay but not now (18.2%)
– Plan to explore it soon (6.9%)
– Have decided to watch (5.8%)
Should MovieLens keep recommending this item to you?
[Figure: distribution of answers per inaction category; legend runs from most preferred to least preferred]
Based on this survey data we:
Built an inaction classification model
• Predictors: item attributes, position, user actions, predicted
rating and action probabilities (clicking, rating, adding to a
wishlist)
• Best accuracy: 48.5% (majority class is NotNoticed: 39.9%)
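To put the reported accuracy in context, the sketch below compares it with the majority-class baseline. The two percentages come from the slide; the helper function and its toy labels are illustrative assumptions.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

# Figures reported on the slide, in percent
model_accuracy = 48.5
baseline_accuracy = 39.9  # always predicting NotNoticed

gain_points = model_accuracy - baseline_accuracy
relative_gain = gain_points / baseline_accuracy
print(round(gain_points, 1), round(relative_gain, 3))

# The helper on hypothetical labels
print(majority_baseline_accuracy(["NotNoticed", "NotNoticed", "Watched"]))
```

The model improves on the baseline by 8.6 percentage points, which is a modest gain for a 7-class problem and explains why the slides speak of using the inferred class probabilities rather than hard predictions.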
Try to improve the recommender system: what to do with
inaction items?
• Utilize inferred 7-class probabilities
• Estimate the (in)action or adjust the recommendation timing…
• Hide an item or delay showing it!
How recommenders
can help users
achieve their goals
Research with
Alain Starke
(PhD student)
RecSys 2017
Recommending for Behavioral change
• Behavioral change is hard…
– Exercising more, eating healthily, reducing alcohol consumption
(or reducing binge watching on Netflix)
– Needs awareness, motivation and commitment
Combi model:
Klein, Mogles, Wissen
Journal of Biomedical Informatics, 2014
What can recommenders do?
• Persuasive Technology: focused on how to help people change their
behavior:
– personalize the message…
• Recommender systems can help with what to change and when to
act
– personalize what to do next…
• This requires different models/algorithms
– our past behavior/liking is not what we want to do now!
Central question
Can we design a recommender
interface which effectively supports a
user’s energy-saving goals?
Regular RecSys approaches, e.g. collaborative filtering,
are prone to reinforcing current behavior
• If we want consumers and users to achieve (energy-
saving) goals, we should not only focus on past
behavior but ‘move forward’ (cf. Ekstrand & Willemsen, 2016)
• We need a model which considers future goals
Energy-saving measures can be ordered as
increasingly difficult behavioral steps towards
attaining the goal of saving energy
(Kaiser et al., 2010; Urban & Scasny, 2014)
These steps reflect willingness & capacity to
save energy: a person’s energy-saving ability
(Kaiser et al., 2010; Urban & Scasny, 2014)
We infer behavioral difficulties
based on engagement frequencies
[Figure: INPUT: persons indicate which measures they perform. Measures range from easy / popular to difficult / obscure.]
In a similar vein, we infer
energy-saving abilities
[Figure: INPUT: persons indicate which measures they perform. Persons range from low ability (performs few) to high ability (performs many).]
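Both inference steps can be illustrated with a crude first-order approximation on a binary person-by-measure matrix. The response data, the helper names, and the smoothing constant are assumptions; a proper Rasch analysis would estimate these parameters by maximum likelihood rather than from raw proportions.

```python
import numpy as np

# Hypothetical responses: rows are persons, columns are measures; 1 = performs the measure
responses = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
])

def logit(p):
    """Log-odds of a proportion."""
    return np.log(p / (1.0 - p))

def smoothed_share(counts, total):
    """Proportion with a small correction so that 0% and 100% stay finite on the logit scale."""
    return (counts + 0.5) / (total + 1.0)

n_persons, n_measures = responses.shape
# Measures performed by few persons get a high difficulty
difficulty = -logit(smoothed_share(responses.sum(axis=0), n_persons))
# Persons who perform many measures get a high ability
ability = logit(smoothed_share(responses.sum(axis=1), n_measures))

print(np.round(difficulty, 2))
print(np.round(ability, 2))
```

Putting difficulty and ability on the same log-odds scale is what later allows measures to be matched to a person's ability.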
One’s energy-saving ability is a good starting
point to look for appropriate measures
A person has a 50% probability of performing a measure
with a difficulty equal to his/her ability
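The 50% statement is what a Rasch-type logistic model implies when ability and difficulty sit on the same logit scale. A minimal sketch, in which the function name and the example ability value are assumptions:

```python
import math

def p_perform(ability, difficulty):
    """Rasch-type model: probability that a person performs a measure (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

ability = 0.3  # hypothetical person ability in logits
for offset in (-1.0, 0.0, 1.0):
    # offset is the measure's difficulty relative to the person's ability
    print(offset, round(p_perform(ability, ability + offset), 3))
```

A measure one logit below a person's ability has a probability of about 0.73, and one logit above about 0.27, which the study design rounds to 75% and 25%.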
Study 2: How should advice be tailored to
support energy-efficient choices?
(And can fit scores help to persuade users to
pick more challenging measures?)
Web shop interface with three lists (tabs):
‘Base’, ‘Recommended’ and ‘Challenging’
• ‘Recommended’ contains 15 best-matching measures,
with fit scores ranging 100% to 60%
• ‘Base’ are easier, ‘challenging’ more difficult
Matched on their ability:
-1, 0 or +1 logit
(75%, 50% or 25% likelihood)
Show a fit score (or not)
3x2 between-subjects research design
• 3 levels of difficulty, determining contents of the
‘recommended’ list:
– Easy / below ability (~75% probability)
– Ability-tailored (~50% probability)
– Difficult / above ability (~25% probability)
• 2 levels of fit score: they were either shown or not
– The 100% score was consistent with the difficulty condition
– E.g. in the easy condition, measures below a user’s ability
(75%) had a 100% match score
Easy recommendations were perceived as
feasible and, in turn, supportive & satisfactory
SEM Statistics: χ²(140) = 198.693, p < 0.001, CFI = 0.992, TLI = 0.990
*** p < 0.001, ** p < 0.01, * p < 0.05.
[Path diagram: Rec difficulty → Perceived Feasibility: −.469***]
[Path diagram: Rec difficulty, Perceived Feasibility, Perceived Support, Choice Satisfaction; path coefficients −.469***, .506***, .234***, .221***]
• Users who felt supported selected more measures
• Satisfied users showed a higher % of follow-up
*** p < 0.001, ** p < 0.01, * p < 0.05.
[Path diagram: Rec difficulty, Perceived Feasibility, Perceived Support, Choice Satisfaction, No. of chosen items, % Executed items, Difficulty chosen items; path coefficients −.469***, .506***, .234***, .221***, .385***, −.113**]
Users chose slightly more measures
when presented with easier ones
(Showing fit scores did not really matter)
Fit scores boosted satisfaction levels for easy
measures, but backfired for difficult ones
Lessons learned
• A satisfactory user interface can lead to the adoption of more
energy-saving measures (within system + after 4 weeks)
• Easy tailored measures seem to be attractive, as they were
perceived as feasible and chosen more often
• Fit scores were merely self-reinforcing, not persuasive to
attain ‘more difficult goals’
General conclusions
• Recommender systems are all about good UX
• Taking a psychological, user-oriented approach
we can better account for how users perceive
recommendations and reach their goals
• Behaviorism is not enough: an integrated user-
centric approach offers many insights/benefits!
• Recommenders should take into account user
goals and should adapt their algorithms to them…
Recommender systems to help people move forward

  • 1. Recommender systems to help people move forward RecSysNL meetup Oct 17, 2018 Martijn Willemsen M.C.Willemsen@tue.nl PI of the recommender LAB http://www.martijnwillemsen.nl/ @mcwillemsen Decision Making, Process tracing, Cognition, Recommender Systems, online behavior, e-coaching, Data Science
  • 2. My lab has a strong user-centric RecSys focus…, why? • Because I failed as a (real) engineer ? 2 MSc in EE 2nd Bsc: Technology and Society PhD in Decision Making Recommender Systems MSc Technology and Society PostDoc in Process Tracing Electrical Engineering
  • 3. My lab has a strong user-centric RecSys focus…, why? • Because it is easier to get papers into the RecSys conference? 3
  • 4. Surely not…. Because just optimizing accuracy is not enough… REVEAL Workshop RecSys2018: (Joe Konstan) “…if the recommender systems we are building are trained to predict the very items that user founds by themselves without recommendation (yes, I’m looking at you Precision_at_k), then the usefulness of the recommender becomes very debatable. ” 4 https://medium.com/@olivier.koch/recsys- 2018-recommender-systems-that-care- 16389e43114c
  • 5. And optimizing for engagement/behavior is tricky… Netflix tradeoffs popularity, diversity and accuracy AB tests to test ranking between and within rows Source: RecSys 2016, 18 Sept: Talk by Xavier Amatriain http://www.slideshare.net/xamat/past-present-and-future-of-recommender-systems-and-industry-perspective
  • 6. We don’t need the user: Let’s do AB Testing! Netflix used 5-star rating scales to get input from users (apart from log data) Netflix reported an AB test of thumbs up/down versus rating: Yellin (Netflix VP of product): “The result was that thumbs got 200% more ratings than the traditional star-rating feature.” So is the 5-star rating wrong? or just different information? Should we only trust the behavior? 6 However, over time, Netflix realized that explicit star ratings were less relevant than other signals. Users would rate documentaries with 5 stars, and silly movies with just 3 stars, but still watch silly movies more often than those high-rated documentaries. http://variety.com/2017/digital/ne ws/netflix-thumbs-vs-stars- 1202010492/
  • 7. Behavior versus Experience Looking at behavior… • Testing a recommender against a random videoclip system, the number of clicked clips and total viewing time went down! Looking at user experience… • Users found what they liked faster with less ineffective clicks… Behaviorism is not enough! (Ekstrand & Willemsen, RecSys 2016) We need to measure user experience and relate it to user behavior… We need to understand user goals and develop Rec. Systems that help users attain these goals! 7
  • 8. User-Centric Research can help us understand… • How users perceive recommender algorithms (Ekstrand et al. 2014) • WHY users are satisfied… even when we reduce accuracy by diversification (Willemsen et al. 2016) • What inaction (non-behavior) means… (Zhao et al. 2018) Or even help build systems to help people move forward • Energy saving recommendations (Starke et al. 2017) • Music Genre Explorer (Liang and Willemsen, 2018)
  • 9. User-Centric Framework. Computer scientists (and marketing researchers) would study behavior… (they hate asking the user, or simply cannot, as in AB tests)
  • 10. User-Centric Framework Psychologists and HCI people are mostly interested in experience…
  • 11. User-Centric Framework Though it helps to triangulate experience and behavior…
  • 12. User-Centric Framework. Our framework adds the intermediate construct of perception, which explains why behavior and experience change due to our manipulations
  • 13. User-Centric Framework • And adds personal and situational characteristics • Relations modeled using factor analysis and SEM Knijnenburg, B.P., Willemsen, M.C., Gantner, Z., Soncu, H., Newell, C. (2012). Explaining the User Experience of Recommender Systems. User Modeling and User-Adapted Interaction (UMUAI), vol 22, p. 441-504 http://bit.ly/umuai
  • 14. User Perceptions of Differences in Recommender Algorithms. Joint work with GroupLens: Michael Ekstrand, Max Harper and Joseph Konstan. Ekstrand, M.D., Harper, F.M., Willemsen, M.C., & Konstan, J.A. (2014). User Perception of Differences in Recommender Algorithms. In Proceedings of the 8th ACM conference on Recommender systems (pp. 161–168). New York, NY, USA: ACM
  • 15. Going beyond accuracy… McNee et al. (2006): Accuracy is not enough “study recommenders from a user-centric perspective to make them not only accurate and helpful, but also a pleasure to use” But wait! we don’t even know how the standard algorithms are perceived… and what differences there are… Compare 3 classic algorithms (Item-Item, User-User and SVD) side by side (joint evaluation) in terms of preference and perceptions
  • 16. The task provided to the user First impression Perceived Diversity & novelty and satisfaction Choice of algo
  • 17. First look at the measurement model • Only the measurement model relating the concepts (no conditions) • All concepts are relative comparisons – e.g. if they think list A is more diverse than B, they are also more satisfied with list A than B
  • 18. What algorithms do users prefer? 528 users completed the questionnaire. Joint evaluation, 3 pairs comparing A with B. User-User CF significantly loses to the other two; Item-Item and SVD are on par. Why? – User-user more novel than either SVD or item-item – User-user more diverse than SVD – Item-item slightly more diverse than SVD (but diversity didn't affect satisfaction) [Chart: share of users (0% to 100%) choosing each algorithm in the three pairwise comparisons: I-I v. U-U, I-I v. SVD, SVD v. U-U]
  • 19. Objective measures No accuracy differences, but consistent with subjective data RQ2: User-user more novel, SVD somewhat less diverse
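Objective measures like the ones on this slide can be computed directly from the recommendation lists. The following is a minimal sketch, assuming novelty is proxied by item obscurity (low popularity) and diversity by the average pairwise dissimilarity within a list. The function names and the toy data are hypothetical, and these are not necessarily the exact metrics used in Ekstrand et al. (2014).

```python
import math
from itertools import combinations

def mean_obscurity(rec_list, rating_counts):
    """Average negative log-popularity of the listed items (higher = more obscure)."""
    total = sum(rating_counts.values())
    return sum(-math.log(rating_counts[i] / total) for i in rec_list) / len(rec_list)

def intra_list_diversity(rec_list, item_vectors):
    """Average pairwise cosine distance within the list (higher = more diverse)."""
    def cosine_distance(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return 1.0 - dot / norm
    pairs = list(combinations(rec_list, 2))
    return sum(cosine_distance(item_vectors[i], item_vectors[j]) for i, j in pairs) / len(pairs)

# Toy data, made up for illustration only.
rating_counts = {"a": 1000, "b": 800, "c": 10, "d": 5}
item_vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.5, 0.5]}

popular_list = ["a", "b"]
obscure_list = ["c", "d"]
print(mean_obscurity(popular_list, rating_counts), mean_obscurity(obscure_list, rating_counts))
print(intra_list_diversity(["a", "b"], item_vectors), intra_list_diversity(["a", "c"], item_vectors))
```

Whether such objective scores matter to users is a separate question: they only describe the list, not how it is perceived.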
  • 20. Aligning objective with subjective measures Objective and subjective metrics correlate consistently But their effects on choice are mediated by the subjective perceptions! (Objective) obscurity only influences satisfaction if it increases perceived novelty (i.e. if it is registered by the user)
  • 21. Conclusions Novelty is not always good: complex, largely negative effect Diversity is important for satisfaction Diversity/accuracy tradeoff does not seem to hold… Subjective Perceptions and experience mediate the effect of objective measures on choice / preference for algorithm Brings the ‘WHY’: e.g. User-user is less satisfactory and less often chosen because of its obscure items (which are perceived as novel)
  • 22. Choice difficulty and satisfaction in RecSys: Applying latent feature diversification. Willemsen, M.C., Graus, M.P., & Knijnenburg, B.P. (2016). Understanding the role of latent feature diversification on choice difficulty and satisfaction. User Modeling and User-Adapted Interaction (UMUAI), vol 26 (4), 347-389 doi:10.1007/s11257-016-9178-6
  • 23. Seminal example of choice overload. Satisfaction decreases with larger sets, as increased attractiveness is counteracted by choice difficulty. Large set: more attractive, 3% sales. Small set: less attractive, 30% sales, higher purchase satisfaction. From Iyengar and Lepper (2000). Can we reduce difficulty while controlling attractiveness?
  • 24. Use the properties of Matrix Factorization algorithms! Dimensionality reduction: users and items are represented as vectors on a set of latent features. The rating is the dot product of these vectors (overall utility!). Example: Gus will like Dumb and Dumber but hate Color Purple. Koren, Y., Bell, R., and Volinsky, C. 2009. Matrix Factorization Techniques for Recommender Systems. IEEE Computer 42, 8, 30–37.
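The dot-product idea can be illustrated with a few lines of code. This is a minimal sketch with a hypothetical two-dimensional latent space; the factor values are made up for illustration, and the global mean and bias terms that matrix factorization models usually add are left out.

```python
# Hypothetical latent factors (two dimensions, values invented for illustration).
user_factors = {"Gus": [-1.0, 0.9]}
item_factors = {
    "Dumb and Dumber": [-0.9, 0.8],
    "The Color Purple": [0.9, -0.8],
}

def predicted_utility(user, item):
    """Predicted rating (overall utility) as the dot product of user and item vectors."""
    return sum(u * i for u, i in zip(user_factors[user], item_factors[item]))

for title in item_factors:
    print(title, round(predicted_utility("Gus", title), 2))
```

Items whose vectors point in the same direction as the user's vector get a high predicted utility; items pointing the opposite way get a low one.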
  • 25. Latent feature diversification: high diversity/equal attractiveness
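One way to obtain a list with high diversity but roughly equal attractiveness is to start from a set of top-ranked items and greedily pick items that lie far apart in latent feature space. The sketch below assumes a greedy max-min selection with city-block distance; it illustrates the idea and is not necessarily the exact procedure of Willemsen et al. (2016). The names `diversify`, `top_n` and the item vectors are hypothetical.

```python
def manhattan(a, b):
    """City-block distance between two latent feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def diversify(candidates, item_vectors, k):
    """Greedily pick k items from `candidates` (already ranked by predicted rating).

    Starts with the top-ranked item, then repeatedly adds the candidate whose
    smallest distance to the items selected so far is largest.
    """
    selected = [candidates[0]]
    remaining = list(candidates[1:])
    while len(selected) < k and remaining:
        best = max(
            remaining,
            key=lambda c: min(manhattan(item_vectors[c], item_vectors[s]) for s in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy latent vectors for five top-ranked movies (made up for illustration).
item_vectors = {
    "m1": [0.9, 0.1],
    "m2": [0.8, 0.2],
    "m3": [0.1, 0.9],
    "m4": [0.5, 0.5],
    "m5": [0.85, 0.15],
}
top_n = ["m1", "m2", "m3", "m4", "m5"]
print(diversify(top_n, item_vectors, k=3))
```

Because the candidates all come from the top of the ranking, their predicted ratings stay close together while the selected items spread out over the latent features.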
  • 26. Latent Feature Diversification. Psychology-informed diversity manipulation; increased perceived diversity & attractiveness; reduced difficulty & increased satisfaction. Fewer hovers; more choice for lower ranked items. Rank of chosen item by diversification: None (top 5): 3.6; Medium: 14.5; High: 77.6. [Chart: standardized choice satisfaction score by diversification level (none, med, high)] Higher satisfaction for high diversification, despite choice for lower predicted/ranked items
  • 27. Interpreting User Inaction in Recommender Systems. Zhao, Q., Willemsen, M. C., Adomavicius, G., Harper, F. M., & Konstan, J. A. (2018). Interpreting user inaction in recommender systems. In Proceedings of the 12th ACM Conference on Recommender Systems (pp. 40-48). New York: Association for Computing Machinery, Inc. DOI: 10.1145/3240323.3240366
  • 28. Action and Inaction in MovieLens.org: add to a wishlist, not interested, rating, click to see details
  • 29. How to interpret user inaction? Randomly pick one and ask users.
  • 30. 7 Categories of User Inaction: Did not notice it (38.6%); Noticed but watched it before (14.6%); Noticed and have not watched it yet (46.8%), split into: Would not enjoy it (5.8%), Others are better (9.5%), Okay but not now (18.2%), Plan to explore it soon (6.9%), Have decided to watch (5.8%)
  • 31. Should MovieLens keep recommending this item to you? [Chart; labels: most preferred, least preferred]
  • 32. Based on this survey data we: Built an inaction classification model • Predictors: item attributes, position, user actions, predicted rating and action probabilities (clicking, rating, adding to a wishlist) • Best accuracy: 48.5% (majority class is NotNoticed: 39.9%) Try to improve the recommender system: what to do with inaction items? • Utilize inferred 7-class probabilities • Estimate the (in)action or adjust the recommendation timing… • Hide an item or delay showing an item!
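How the inferred 7-class probabilities might feed a hide/delay decision can be sketched as a simple rule. Everything below is a hypothetical illustration: the class labels are shorthand for the seven survey categories, and the grouping of classes and the 0.5 threshold are assumptions, not the policy evaluated by Zhao et al. (2018).

```python
# Shorthand labels for the seven inaction categories from the survey.
CATEGORIES = [
    "not_noticed", "watched_before", "would_not_enjoy", "others_better",
    "not_now", "explore_soon", "decided_to_watch",
]

def next_step(probs, threshold=0.5):
    """Map inferred inaction-class probabilities to 'hide', 'delay' or 'keep'.

    Assumption: 'watched before' and 'would not enjoy' argue for hiding an item,
    'okay but not now' argues for showing it again later.
    """
    hide_score = probs["watched_before"] + probs["would_not_enjoy"]
    delay_score = probs["not_now"]
    if hide_score >= threshold:
        return "hide"
    if delay_score >= threshold:
        return "delay"
    return "keep"

example = dict(zip(CATEGORIES, [0.10, 0.40, 0.20, 0.10, 0.10, 0.05, 0.05]))
print(next_step(example))
```

With a classifier that is only moderately accurate, a conservative threshold keeps most items in place and acts only on the clearer cases.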
  • 33. How recommenders can help users achieve their goals. Research with Alain Starke (PhD student), RecSys 2017
  • 34. Recommending for Behavioral change • Behavioral change is hard… – Exercising more, eating healthily, reducing alcohol consumption (reducing binge watching on Netflix) – Needs awareness, motivation and commitment. Combi model: Klein, Mogles, Wissen, Journal of Biomedical Informatics, 2014
  • 35. What can recommenders do? • Persuasive Technology: focused on how to help people change their behavior: – personalize the message… • Recommender systems can help with what to change and when to act – personalize what to do next… • This requires different models/algorithms – our past behavior/liking is not what we want to do now!
  • 36. Central question: Can we design a recommender interface which effectively supports a user’s energy-saving goals?
  • 37. Regular RecSys approaches, e.g. collaborative filtering, are prone to reinforcing current behavior • If we want consumers and users to achieve (energy-saving) goals, we should not only focus on past behavior but ‘move forward’ (cf. Ekstrand & Willemsen, 2016) • We need a model which considers future goals
  • 38. Energy-saving measures can be ordered as increasingly difficult behavioral steps towards attaining the goal of saving energy (Kaiser et al., 2010; Urban & Scasny, 2014)
  • 39. These steps reflect willingness & capacity to save energy: a person’s energy-saving ability (Kaiser et al., 2010; Urban & Scasny, 2014)
  • 40. We infer behavioral difficulties based on engagement frequencies. Input: persons indicate which measures they perform. Measures range from easy / popular to difficult / obscure.
  • 41. In a similar vein, we infer energy-saving abilities. Input: persons indicate which measures they perform. Persons range from low ability (performs few) to high ability (performs many).
  • 42. One’s energy-saving ability is a good starting point to look for appropriate measures. A person has a 50% probability of performing a measure with a difficulty equal to his/her ability
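The 50% rule is what a logistic (Rasch-type) model gives when ability and difficulty sit on the same logit scale. The sketch below assumes that functional form, which the slides do not spell out. `p_perform` and `rough_difficulty` are hypothetical helper names, and `rough_difficulty` is only a crude stand-in for proper model estimation from the engagement frequencies.

```python
import math

def p_perform(ability, difficulty):
    """Probability that a person performs a measure (both values in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def rough_difficulty(share_performing):
    """Crude difficulty estimate from the share of persons performing a measure.

    Popular measures (high share) come out as easy (negative logits),
    obscure measures (low share) as difficult (positive logits).
    """
    return math.log((1.0 - share_performing) / share_performing)

print(p_perform(ability=0.3, difficulty=0.3))    # ability equals difficulty
print(rough_difficulty(0.9), rough_difficulty(0.1))
```

Measures below a person's ability come out as likely to be performed already, and measures far above it as unlikely, which is why the ability estimate is a natural starting point for recommendations.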
  • 44. Study 2: How should advice be tailored to support energy-efficient choices? (And can fit scores help to persuade users to pick more challenging measures?)
  • 45. Web shop interface with three lists (tabs): ‘Base’, ‘Recommended’ and ‘Challenging’ • ‘Recommended’ contains the 15 best-matching measures, with fit scores ranging from 100% to 60% • ‘Base’ measures are easier, ‘Challenging’ ones more difficult. Matched on the user’s ability: -1, 0 or +1 logit (75%, 50% or 25% likelihood). Show a fit score (or not)
  • 46. 3x2 Between-subject research design • 3 levels of difficulty, determining contents of the ‘recommended’ list: – Easy / below ability (~75% probability) – Ability-tailored (~50% probability) – Difficult / above ability (~25% probability) • 2 levels of fit score: they were either shown or not – The 100% score was consistent with the difficulty condition – E.g. in the easy condition, measures below a user’s ability (75%) had a 100% match score
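The three difficulty conditions can be read as offsets of -1, 0 and +1 logit relative to a user's ability. The sketch below assumes a logistic model and shows that these offsets give roughly the 75%, 50% and 25% probabilities mentioned above. The rule for filling the ‘recommended’ list (the measures closest to the target difficulty) and all names are assumptions for illustration; how fit scores were computed in the study is not covered here.

```python
import math

# Offsets in logits relative to the user's ability (from the slides).
OFFSETS = {"easy": -1.0, "tailored": 0.0, "difficult": 1.0}

def target_probability(condition):
    """Probability of performing a measure at the condition's target difficulty.

    Target difficulty = ability + offset, so ability - difficulty = -offset.
    """
    return 1.0 / (1.0 + math.exp(OFFSETS[condition]))

def recommended_list(ability, difficulties, condition, size=15):
    """Pick the measures whose difficulty is closest to the target (an assumption)."""
    target = ability + OFFSETS[condition]
    ranked = sorted(difficulties, key=lambda m: abs(difficulties[m] - target))
    return ranked[:size]

for condition in OFFSETS:
    print(condition, round(target_probability(condition), 3))

# Toy measure difficulties in logits (made up for illustration).
difficulties = {"a": -1.0, "b": 0.0, "c": 1.0, "d": 3.0}
print(recommended_list(0.0, difficulties, "easy", size=2))
```

An offset of one logit yields about 73% and 27% rather than exactly 75% and 25%, which matches the approximate values on the slide.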
  • 47. Easy recommendations were perceived as feasible and, in turn, supportive & satisfactory. SEM Statistics: χ²(140) = 198.693, p < 0.001, CFI = 0.992, TLI = 0.990. *** p < 0.001, ** p < 0.01, * p < 0.05. [Path diagram: Rec difficulty to Perceived Feasibility, −.469***]
  • 48. Easy recommendations were perceived as feasible and, in turn, supportive & satisfactory. SEM Statistics: χ²(140) = 198.693, p < 0.001, CFI = 0.992, TLI = 0.990. *** p < 0.001, ** p < 0.01, * p < 0.05. [Path diagram: Rec difficulty to Perceived Feasibility, −.469***; positive paths among Perceived Feasibility, Perceived Support and Choice Satisfaction with coefficients .506***, .234*** and .221***]
  • 49. • Users who felt supported selected more measures • Satisfied users showed a higher % of follow-up. *** p < 0.001, ** p < 0.01, * p < 0.05. [Path diagram: constructs Rec difficulty, Perceived Feasibility, Perceived Support, Choice Satisfaction, No. of chosen items, % Executed items and Difficulty chosen items; path coefficients −.469***, .506***, .234***, .221***, .385***, −.113**]
  • 50. Users chose slightly more measures when presented easier ones (Showing fit scores did not really matter)
  • 51. Fit scores boosted satisfaction levels for easy measures, but backfired for difficult ones
  • 52. Lessons learned • A satisfactory user interface can lead to the adoption of more energy-saving measures (within system + after 4 weeks) • Easy tailored measures seem to be attractive, as they were perceived as feasible and chosen more often • Fit scores were merely self-reinforcing, not persuasive to attain ‘more difficult goals’
  • 53. General conclusions • Recommender systems are all about good UX • Taking a psychological, user-oriented approach we can better account for how users perceive recommendations and reach their goals • Behaviorism is not enough: an integrated user-centric approach offers many insights/benefits! • Recommenders should take user goals into account and adapt their algorithms to them…