Correlating Perception-Oriented Aspects in User-Centric Recommender System Evaluation
Alan Said, Brijnesh J. Jain,
Andreas Lommatzsch, Sahin Albayrak
Evaluating a Recommender Systems User Study
Recommender system evaluation generally measures the quality of an algorithm using an
information retrieval metric, e.g. precision or recall. These metrics do not always
reflect users' perceptions of the recommendations. Perception-related values are instead
measured through user studies.
In this work we set aside the quality of the recommender system and instead
focus on how aspects of users' perception of recommender systems relate to one another.
Based on a user study (N=132), we show how concepts such as usefulness,
ratings, obviousness, and serendipity correlate from the users' perspective.
Assumption
There is a correlation between different concepts used in user-centric recommender
systems evaluation.
The User Study
Our aim is to show how common concepts correlate to each other. For this purpose, a user
study was conducted. Participants were asked to rate a minimum of 10 movies drawn from a
random selection of the 500 most popular movies in the Movielens 10M dataset. Based on the
ratings, one of three recommendation algorithms generated recommendations, which were
presented to the participants. Following this, we asked them to answer the questions in
the questionnaire.
Participants and Collected Data
The link to the study was circulated on online social networking sites by people from
different lines of work (e.g. computer science professionals, lawyers, business management
professionals). The study ran during the second half of March 2012 and is still open for
participation at http://www.dai-lab.de/~alan/survey/.
Data used in this work:
• 132 users
• 19 countries
The questionnaire
For each recommended movie
• Have you seen the movie?
    • Yes: Please rate it (on a 1-5 star rating scale)
    • No:
        • Are you familiar with it?
        • Would you watch it?
For the complete set of movies
(on a 1-5 scale where 1=most disagree and 5=most agree)
• Are the recommendations novel?
• Are the recommendations obvious?
• Are the recommendations recognizable?
• Are the recommendations serendipitous?
• Are the recommendations useful?
• Would you use the system again?
Calculating the Correlations
The answers collected for each question were averaged per algorithm, creating a vector of
length 3 for each question. The vectors were correlated using the Pearson Product-Moment
Correlation Coefficient.
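The averaging and correlation step described under Calculating the Correlations can be sketched as follows. This is a minimal illustration, not the authors' code: the algorithm labels "A", "B", "C", the helper names, and the toy answers are all hypothetical, and only the procedure (average each question per algorithm, then correlate the resulting length-3 vectors with Pearson's r) comes from the poster.

```python
from collections import defaultdict
from math import sqrt


def mean_per_algorithm(responses, algorithms):
    """Average the answers to one question separately for each algorithm.

    `responses` is a list of (algorithm, answer) pairs; the result has one
    mean per entry in `algorithms`, i.e. a vector of length 3 in the study.
    """
    buckets = defaultdict(list)
    for algorithm, answer in responses:
        buckets[algorithm].append(answer)
    return [sum(buckets[a]) / len(buckets[a]) for a in algorithms]


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy answers on the 1-5 scale (NOT data from the study).
ALGORITHMS = ["A", "B", "C"]
usefulness = [("A", 4), ("A", 5), ("B", 3), ("B", 3), ("C", 2), ("C", 1)]
novelty = [("A", 2), ("A", 1), ("B", 3), ("B", 4), ("C", 5), ("C", 4)]

useful_vec = mean_per_algorithm(usefulness, ALGORITHMS)  # [4.5, 3.0, 1.5]
novel_vec = mean_per_algorithm(novelty, ALGORITHMS)      # [1.5, 3.5, 4.5]
r = pearson(useful_vec, novel_vec)
print(f"usefulness vs. novelty: r = {r:+.2f}")
```

Repeating the `pearson` call for every pair of questions would yield a symmetric matrix of coefficients with 1.00 on the diagonal.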
Results & Conclusion
The calculated correlations show that there seems to be a high overlap between the
concepts of ratings, serendipity, usefulness and questions regarding whether people
would watch a recommended item or use the recommender system again. The concepts of
novelty and recognizability seem to be completely disconnected from the other questions.
This work is still in an early phase; nevertheless, the correlation matrix below can be
used as a rule-of-thumb when considering creating a user study to evaluate a
recommender system.

                        (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)
(1) Ratings            1.00   0.97   0.99   1.00   0.94   0.67   0.72  -0.93  -0.92
(2) Serendipity        0.97   1.00   0.99   0.96   1.00   0.84   0.87  -0.99  -0.99
(3) Usefulness         0.99   0.99   1.00   0.98   0.98   0.77   0.81  -0.97  -0.96
(4) Would you watch    1.00   0.96   0.98   1.00   0.93   0.64   0.69  -0.91  -0.90
(5) Use again          0.94   1.00   0.98   0.93   1.00   0.88   0.91  -1.00  -1.00
(6) Obviousness        0.67   0.84   0.77   0.64   0.88   1.00   1.00  -0.90  -0.91
(7) Recognizability    0.72   0.87   0.81   0.69   0.91   1.00   1.00  -0.92  -0.93
(8) Novelty           -0.93  -0.99  -0.97  -0.91  -1.00  -0.90  -0.92   1.00   1.00
(9) Know the movie    -0.92  -0.99  -0.96  -0.90  -1.00  -0.91  -0.93   1.00   1.00

Columns (1)-(9) follow the row order. Color scale of the original figure: 1.00 (high
correlation) to -1.00 (low correlation).
Correlation of the questions from the user study
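One possible way to apply the correlation matrix as a rule-of-thumb is to list the question pairs whose coefficients are close to +1 or -1, since such questions may capture overlapping information. The sketch below uses the values from the poster's matrix; the cutoff of 0.95 and the helper name `strongly_related` are assumptions made for illustration and are not part of the original work.

```python
QUESTIONS = [
    "Ratings", "Serendipity", "Usefulness", "Would you watch", "Use again",
    "Obviousness", "Recognizability", "Novelty", "Know the movie",
]

# Correlation values as reported in the poster; rows and columns follow QUESTIONS.
R = [
    [1.00, 0.97, 0.99, 1.00, 0.94, 0.67, 0.72, -0.93, -0.92],
    [0.97, 1.00, 0.99, 0.96, 1.00, 0.84, 0.87, -0.99, -0.99],
    [0.99, 0.99, 1.00, 0.98, 0.98, 0.77, 0.81, -0.97, -0.96],
    [1.00, 0.96, 0.98, 1.00, 0.93, 0.64, 0.69, -0.91, -0.90],
    [0.94, 1.00, 0.98, 0.93, 1.00, 0.88, 0.91, -1.00, -1.00],
    [0.67, 0.84, 0.77, 0.64, 0.88, 1.00, 1.00, -0.90, -0.91],
    [0.72, 0.87, 0.81, 0.69, 0.91, 1.00, 1.00, -0.92, -0.93],
    [-0.93, -0.99, -0.97, -0.91, -1.00, -0.90, -0.92, 1.00, 1.00],
    [-0.92, -0.99, -0.96, -0.90, -1.00, -0.91, -0.93, 1.00, 1.00],
]


def strongly_related(labels, matrix, threshold):
    """Return (label_a, label_b, r) for each pair with |r| >= threshold.

    Only the upper triangle is scanned, so each pair appears once and the
    diagonal (a question correlated with itself) is skipped.
    """
    pairs = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((labels[i], labels[j], matrix[i][j]))
    return pairs


# 0.95 is a hypothetical cutoff chosen for this example only.
pairs = strongly_related(QUESTIONS, R, threshold=0.95)
for a, b, r in pairs:
    print(f"{a:>16} ~ {b:<16} r = {r:+.2f}")
```

Raising or lowering the threshold changes how many pairs are flagged, so the cutoff should be chosen with the goals of the planned user study in mind.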
Technische Universität Berlin {alan, jain, andreas, sahin}@dai-lab.de www.dai-lab.de