Correlating Perception-Oriented Aspects in User-Centric Recommender System Evaluation
Alan Said, Brijnesh J. Jain, Andreas Lommatzsch, Sahin Albayrak



  Evaluating a Recommender Systems User Study
  Recommender system evaluation generally measures the quality of an algorithm using
  an information retrieval metric, e.g. precision or recall. Such metrics do not always
  reflect users' perceptions of the recommendations. Perception-related values are instead
  measured through user studies.

  In this work we choose to neglect the quality of the recommender system and instead
  focus on the similarity of aspects related to users' perception of recommender systems.
  Based on a user study (N=132) we show the correlation of concepts such as usefulness,
  ratings, obviousness, and serendipity from the users' perspectives.



  Assumption
  There is a correlation between different concepts used in user-centric recommender
  systems evaluation.
  [Figure: The questionnaire]




  The User Study
  Our aim is to show how common concepts correlate with each other. For this purpose, a
  user study was conducted. Participants were asked to rate a minimum of 10 movies from
  a random selection of the 500 most popular movies in the MovieLens 10M dataset. Based
  on the ratings, one of three recommendation algorithms generated recommendations,
  which were presented to the participants.

  Following this, we asked them to answer these questions:
  For each recommended movie
  • Have you seen the movie?
     • Yes: Please rate it (on a 1-5 star rating scale)
     • No:
       • Are you familiar with it?
       • Would you watch it?
  For the complete set of movies
    (on a 1-5 scale where 1=most disagree and 5=most agree)
  • Are the recommendations novel?
  • Are the recommendations obvious?
  • Are the recommendations recognizable?
  • Are the recommendations serendipitous?
  • Are the recommendations useful?
  • Would you use the system again?

  Participants and Collected Data
  The link to the study was circulated on online social networking sites by people from
  different lines of work (e.g. computer science professionals, lawyers, business
  management professionals).
  The study ran during the second half of March 2012 and is still open for participation
  at http://www.dai-lab.de/~alan/survey/.

  Data used in this work:
  • 132 users
  • 19 countries

  Calculating the Correlations
  The answers collected for each question were averaged per algorithm, creating a vector
  of length 3 for each question. The vectors were correlated using the Pearson
  Product-Moment Correlation Coefficient.
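
  The averaging and correlation steps described in "Calculating the Correlations" can be
  sketched as follows. This is a minimal illustration in Python, not the authors'
  implementation: the function names `per_algorithm_means` and `pearson`, the algorithm
  labels "A", "B", "C", and all answer values are hypothetical.

```python
from math import sqrt
from statistics import mean


def per_algorithm_means(answers):
    """Average the answers to one question per algorithm.

    `answers` maps an algorithm label to the list of answers given by the
    participants who saw that algorithm. The result has one entry per
    algorithm, ordered by sorted label so that vectors are comparable.
    """
    return [mean(answers[alg]) for alg in sorted(answers)]


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)


# Hypothetical answers on the 1-5 scale, grouped by the algorithm shown.
usefulness = {"A": [4, 5, 3], "B": [2, 3, 3], "C": [1, 2, 2]}
novelty = {"A": [2, 1, 2], "B": [3, 4, 3], "C": [5, 4, 5]}

usefulness_vec = per_algorithm_means(usefulness)  # vector of length 3
novelty_vec = per_algorithm_means(novelty)        # vector of length 3
r = pearson(usefulness_vec, novelty_vec)
print(f"correlation(usefulness, novelty) = {r:.2f}")
```

  Repeating the `pearson` call for every pair of questions would fill a correlation matrix
  with one row and one column per question.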


  Results & Conclusion
  The calculated correlations show that there seems to be a high overlap between the
  concepts of ratings, serendipity, usefulness, and the questions regarding whether people
  would watch a recommended item or use the recommender system again. The concepts of
  novelty and recognizability seem to be completely disconnected from the other questions.
  This work is still in an early phase; nevertheless, the correlation table below can be
  used as a rule of thumb when designing a user study to evaluate a recommender system.

                            Ratings     Serendipity      Usefulness Would you watch       Use again     Obviousness Recognizability         Novelty  Know the movie
  Ratings                      1.00            0.97            0.99            1.00            0.94            0.67            0.72           -0.93           -0.92
  Serendipity                  0.97            1.00            0.99            0.96            1.00            0.84            0.87           -0.99           -0.99
  Usefulness                   0.99            0.99            1.00            0.98            0.98            0.77            0.81           -0.97           -0.96
  Would you watch              1.00            0.96            0.98            1.00            0.93            0.64            0.69           -0.91           -0.90
  Use again                    0.94            1.00            0.98            0.93            1.00            0.88            0.91           -1.00           -1.00
  Obviousness                  0.67            0.84            0.77            0.64            0.88            1.00            1.00           -0.90           -0.91
  Recognizability              0.72            0.87            0.81            0.69            0.91            1.00            1.00           -0.92           -0.93
  Novelty                     -0.93           -0.99           -0.97           -0.91           -1.00           -0.90           -0.92            1.00            1.00
  Know the movie              -0.92           -0.99           -0.96           -0.90           -1.00           -0.91           -0.93            1.00            1.00

  Color scale in the original figure: from 1.00 (high correlation) to -1.00 (low correlation).
  Correlation of the questions from the user study
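
  One possible way to use the reported correlations as a rule of thumb is to list the
  question pairs whose absolute correlation reaches a chosen cut-off. The sketch below
  does this in Python. The matrix values are copied from the figure; the function name
  `strongly_related` and the 0.95 cut-off are assumptions made for illustration and are
  not part of the study.

```python
LABELS = [
    "Ratings", "Serendipity", "Usefulness", "Would you watch", "Use again",
    "Obviousness", "Recognizability", "Novelty", "Know the movie",
]

# Correlation values as reported in the figure (rows and columns follow LABELS).
R = [
    [1.00, 0.97, 0.99, 1.00, 0.94, 0.67, 0.72, -0.93, -0.92],
    [0.97, 1.00, 0.99, 0.96, 1.00, 0.84, 0.87, -0.99, -0.99],
    [0.99, 0.99, 1.00, 0.98, 0.98, 0.77, 0.81, -0.97, -0.96],
    [1.00, 0.96, 0.98, 1.00, 0.93, 0.64, 0.69, -0.91, -0.90],
    [0.94, 1.00, 0.98, 0.93, 1.00, 0.88, 0.91, -1.00, -1.00],
    [0.67, 0.84, 0.77, 0.64, 0.88, 1.00, 1.00, -0.90, -0.91],
    [0.72, 0.87, 0.81, 0.69, 0.91, 1.00, 1.00, -0.92, -0.93],
    [-0.93, -0.99, -0.97, -0.91, -1.00, -0.90, -0.92, 1.00, 1.00],
    [-0.92, -0.99, -0.96, -0.90, -1.00, -0.91, -0.93, 1.00, 1.00],
]


def strongly_related(labels, matrix, threshold=0.95):
    """Return (question, question, r) for each pair with |r| >= threshold.

    Only the upper triangle is scanned, so each pair appears once and the
    diagonal (a question compared with itself) is skipped.
    """
    pairs = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((labels[i], labels[j], matrix[i][j]))
    return pairs


pairs = strongly_related(LABELS, R)  # 0.95 is a hypothetical cut-off
for first, second, r in pairs:
    print(f"{first} / {second}: {r:+.2f}")
```

  The sign of each value is kept in the output, so strongly negative pairs can be told
  apart from strongly positive ones.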




Technische Universität Berlin, {alan, jain, andreas, sahin}@dai-lab.de, www.dai-lab.de
