Goal-driven Collaborative Filtering
A Directional Error Based Approach




    Tamas Jambor and Jun Wang
     University College London
Structure of the talk



•   Background/Problem description
•   Goal-driven design
•   Experimental results
•   Conclusions
Collaborative filtering


• Predicting user preference
  towards unknown items
• Based on previously expressed preferences
           Love     Pulp      Crazy   White    Up in     A Single
           Actually Fiction   Heart   Ribbon   the Air   Man
  Sophie           ?               ?         ?
  Peter       ?                ?             ?
  Jaden           ?         ?           ?      
Evaluation metrics

• Root Mean Squared Error (RMSE): $\sqrt{E((\hat{r} - r)^2)}$
• Netflix recommendation competition
  adopted this metric
• The objective function for some of the SVD
  implementations is equivalent to the performance
  measure [Koren et al 2009]
• Criticism
  – Error criterion is uniform across rating scales
  – Is it consistent with user satisfaction?
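The criticism above can be made concrete with a minimal sketch of the metric. It assumes ratings arrive as two parallel lists; the function name `rmse` and the sample values are illustrative and do not come from the talk.

```python
import math

def rmse(predicted, observed):
    """Root mean squared error between two parallel lists of ratings."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Both errors have magnitude 2, so RMSE scores them identically, even though
# one over-predicts a disliked item and the other under-predicts a liked one.
over_predicted = rmse([3.0], [1.0])
under_predicted = rmse([3.0], [5.0])
```

Because the metric only sees the size of the error, it cannot tell these two cases apart.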
Goal-driven design


• We argue that
  – Measure does not always reflect user needs
  – Different user needs require different performance
    measures
• The algorithm should be defined based on user
  needs
  – Start from the user point of view, define measure and
    algorithm accordingly
Rating-prediction error offset (SVD)
[Figure: four prediction errors of the same size on the 1 to 5 rating scale]
• Observed 3, predicted 1
• Observed 3, predicted 5
• Observed 1, predicted 3
• Observed 5, predicted 3
Boundaries and the direction of error



• Taste boundary - interval between liked and
  disliked items
• Direction – error towards the boundary
• Magnitude – whether the error crosses taste
  boundary
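One possible reading of the direction and magnitude bullets, sketched in code. The interval limits 2.5 and 3.5, the helper names `side` and `describe_error`, and the exact definition of "crossing" (ending up on the opposite side of the interval) are assumptions made for illustration.

```python
LOW, HIGH = 2.5, 3.5  # assumed taste boundary interval on a 1-5 scale

def side(value):
    """-1 for disliked, 0 inside the boundary interval, +1 for liked."""
    if value <= LOW:
        return -1
    if value <= HIGH:
        return 0
    return 1

def describe_error(predicted, observed):
    """Return (towards_boundary, crosses_boundary) for one prediction."""
    centre = (LOW + HIGH) / 2
    # The error points towards the boundary when it has the same sign as
    # the step from the observed rating to the centre of the interval.
    towards = (predicted - observed) * (centre - observed) > 0
    # The error crosses the boundary when a disliked item is predicted
    # as liked, or a liked item as disliked.
    crosses = side(predicted) * side(observed) == -1
    return towards, crosses
```

Under these assumptions, an item rated 1 and predicted 5 yields an error that both points towards the boundary and crosses it, while an item rated 5 and predicted 4 yields neither.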
Directional risk preference of prediction
The two-dimensional weighting function



                r = 1,2   r=3   r = 4,5
    p <= 2.5      w1      w2      w3
   2.5<p<=3.5     w4      w5      w6
     p > 3.5      w7      w8      w9
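The table can be read as a lookup from a (prediction, rating) pair to one of nine weights. The sketch below assumes that reading; the names `band` and `weight` and the numeric values in `WEIGHTS` are hypothetical, chosen only to respect the ordering w7 > w8 > w4 used later in the talk.

```python
def band(value, low=2.5, high=3.5):
    """Index 0, 1 or 2 for values <= low, in (low, high], or > high."""
    if value <= low:
        return 0
    if value <= high:
        return 1
    return 2

def weight(predicted, observed, weights):
    """Look up w1..w9: rows follow the prediction p, columns the rating r."""
    return weights[band(predicted)][band(observed)]

# Hypothetical values, not taken from the talk.
WEIGHTS = [
    [1.0, 1.0, 1.0],  # w1 w2 w3  (p <= 2.5)
    [1.5, 1.0, 1.0],  # w4 w5 w6  (2.5 < p <= 3.5)
    [3.0, 2.0, 1.0],  # w7 w8 w9  (p > 3.5)
]
```

With these values, predicting 4.2 for an item rated 1 falls in sector w7 and receives the largest weight.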
Two-stage Optimization (in General)

[Diagram boxes: Learning the Directional Errors; Feedback/IR Metrics; Learning the Recom. Model; Testing]
Two-stage Optimization (An example)
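• Genetic algorithm, with NDCG as the fitness function
• Plug the learned weights into SVD training

$$\operatorname*{arg\,min}_{q,p} \sum_{u,i} w \left(r_{ui} - q_i^T p_u\right)^2 + \lambda \left(\lVert q_i \rVert^2 + \lVert p_u \rVert^2\right)$$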
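A minimal sketch of how the weighted training step might look with stochastic gradient descent. It is not the authors' implementation: the function names, hyperparameters, toy ratings and weight values are all hypothetical, and the weight is treated as a constant within each update step, which is a simplifying assumption.

```python
import random

def band(value, low=2.5, high=3.5):
    """Index 0, 1 or 2 for values <= low, in (low, high], or > high."""
    if value <= low:
        return 0
    return 1 if value <= high else 2

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train(ratings, n_users, n_items, weights, k=2, lr=0.02, reg=0.02,
          epochs=500, seed=0):
    """Fit user factors p and item factors q on (user, item, rating) triples."""
    rng = random.Random(seed)
    p = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    q = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            pred = dot(q[i], p[u])
            # Sector weight, looked up from the current prediction and the
            # observed rating, then held fixed for this step (assumption).
            w = weights[band(pred)][band(r)]
            err = r - pred
            for f in range(k):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (w * err * qi - reg * pu)
                q[i][f] += lr * (w * err * pu - reg * qi)
    return p, q

def squared_error(ratings, p, q):
    """Unweighted sum of squared errors, used here only as a sanity check."""
    return sum((r - dot(q[i], p[u])) ** 2 for u, i, r in ratings)

# Hypothetical toy data: 3 users, 2 items.
RATINGS = [(0, 0, 5), (0, 1, 1), (1, 0, 4), (1, 1, 2), (2, 0, 1), (2, 1, 5)]
WEIGHTS = [[1.0, 1.0, 1.0], [1.5, 1.0, 1.0], [3.0, 2.0, 1.0]]
```

A prediction for an unseen user-item pair is then `dot(q[i], p[u])`. Setting every weight to 1.0 recovers ordinary regularized matrix factorization.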
Genetic algorithms



• Search algorithms that work via the
  process of natural selection
• Start with a sample set of potential solutions (a set
  of weights)
• Evolve towards a set of more optimal solutions
• Poor solutions tend to die out (smaller NDCG)
• Better solutions remain in the population (higher
  NDCG)
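The steps above can be sketched as a small genetic search over a weight vector. Everything here is illustrative: `evolve` and `toy_fitness` are hypothetical names, and the selection, crossover and mutation schemes are common textbook choices rather than the ones used in the talk. In the talk the fitness is the NDCG of a model trained with the candidate weights; a stand-in function is used so the block runs on its own.

```python
import random

def evolve(fitness, n_genes=3, pop_size=20, generations=30,
           low=1.0, high=5.0, mutation_sd=0.2, seed=0):
    """Return the best weight vector found and the best fitness per step."""
    rng = random.Random(seed)
    population = [[rng.uniform(low, high) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Rank candidates; the weaker half dies out.
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            # Uniform crossover, then Gaussian mutation clipped to the range.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            child = [min(high, max(low, g + rng.gauss(0.0, mutation_sd)))
                     for g in child]
            children.append(child)
        population = survivors + children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

def toy_fitness(weights):
    """Stand-in for NDCG: peaks at an arbitrary, made-up target vector."""
    target = (1.5, 3.0, 2.0)
    return -sum((g - t) ** 2 for g, t in zip(weights, target))
```

Since survivors are carried over unchanged, the best fitness never decreases from one generation to the next.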
Experiments



•   MovieLens 100k dataset
•   1682 movies, 943 users
•   Only using ratings
•   Five-fold cross validation
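A minimal sketch of a five-fold split over rating records, assuming a simple random partition. The helper name `k_fold` is hypothetical, and the talk does not say how its folds were drawn.

```python
import random

def k_fold(items, k=5, seed=0):
    """Yield (train, test) pairs; each item is in exactly one test fold."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test
```

Each rating is used for testing once and for training in the remaining folds.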
Evaluation metrics



• Recommendation as a ranking problem
• IR measures
  – Normalized discounted cumulative gain (NDCG)
  – Mean average precision (MAP)
  – Mean reciprocal rank (MRR)
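The three measures can be sketched per user as follows; MAP and MRR are then the means of `average_precision` and `reciprocal_rank` over all users. The gain function (the raw rating) and the relevance threshold (a rating of 4 or higher) are assumptions, since the slides do not state them.

```python
import math

def dcg(gains, k):
    """Discounted cumulative gain over the first k ranked items."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))

def ndcg(gains, k):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0

def average_precision(relevant):
    """Mean of precision@rank taken at each relevant position."""
    hits, total = 0, 0.0
    for pos, rel in enumerate(relevant, start=1):
        if rel:
            hits += 1
            total += hits / pos
    return total / hits if hits else 0.0

def reciprocal_rank(relevant):
    """1 / rank of the first relevant item, or 0.0 if there is none."""
    for pos, rel in enumerate(relevant, start=1):
        if rel:
            return 1.0 / pos
    return 0.0

# True ratings of one user's items, in the order the system ranked them.
ranked_ratings = [3, 5, 4, 1]
relevant = [r >= 4 for r in ranked_ratings]  # assumed relevance threshold
```

Here the first relevant item sits at rank 2, so the reciprocal rank for this user is 0.5.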
Results – Experiment I
                          Baseline SVD
                       r = 1,2      r=3            r = 4,5
       p <= 2.5        0.0517      0.0193          0.0106
      2.5<p<=3.5       0.0904      0.1461          0.1391
       p > 3.5         0.0299      0.1012          0.4115



                 SVD with weights where w7>w8>w4
                       r = 1,2      r=3            r = 4,5
       p <= 2.5        0.0759      0.0407          0.0264
      2.5<p<=3.5       0.0837      0.1676          0.2381
       p > 3.5         0.0125      0.0583          0.2966
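The entries in each table above sum to roughly 1, which suggests they are the proportions of (prediction, rating) pairs falling in each sector. Under that assumption, such a table could be computed as sketched below; the function names and sample pairs are hypothetical.

```python
def band(value, low=2.5, high=3.5):
    """Index 0, 1 or 2 for values <= low, in (low, high], or > high."""
    if value <= low:
        return 0
    return 1 if value <= high else 2

def sector_probabilities(pairs):
    """3x3 table of proportions; rows are prediction bands, columns ratings."""
    if not pairs:
        raise ValueError("need at least one (predicted, observed) pair")
    counts = [[0, 0, 0] for _ in range(3)]
    for predicted, observed in pairs:
        counts[band(predicted)][band(observed)] += 1
    total = len(pairs)
    return [[c / total for c in row] for row in counts]

# Hypothetical (predicted, observed) pairs.
PAIRS = [(4.1, 5), (3.0, 3), (3.9, 1), (2.0, 2)]
table = sector_probabilities(PAIRS)
```

The bottom-left cell of `table` corresponds to sector w7, low-rated items that received a high prediction.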
Results – Experiment II




                    r = 1,2   r=3   r = 4,5
        p <= 2.5      w1      w2      w3
       2.5<p<=3.5     w4      w5      w6
        p > 3.5       w7      w8      w9
Results – Experiment II



• Genetic algorithm used to find optimal weights for sectors
  w7, w8 and w4 (improvements statistically significant)
                     Weighted   Baseline
          MAP           0.450      0.447
          MRR           0.899      0.889
          NDCG@10       0.726      0.720
          NDCG@5        0.574      0.570
          NDCG@3        0.450      0.447
Probability of correct prediction within sectors




Probability of predicting non-relevant items relevant
Improved user experience



• More likely to receive relevant items on their
  recommendation list
• Less likely that lower rated items receive higher
  predictions
• But it is more likely that higher rated items receive
  lower predictions
Conclusion



•   Optimize algorithm from the user point of view
•   Identify directional errors
•   Assign risk to each direction
•   Approach can be changed depending on how
    items are presented
Future work



• Taste boundaries might be user dependent
• Directional error across items or users
• Different recommender goals
Thank you.
References

•   Deshpande, M., Karypis, G.: Item-based top-N recommendation algorithms.
    ACM Trans. Inf. Syst. 22(1) (2004)
•   Herlocker, J.L., Konstan, J.A., Borchers, A., Riedl, J.: An algorithmic
    framework for performing collaborative filtering. In: SIGIR '99. (1999)
•   Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for
    recommender systems. Computer 42(8) (2009)
•   Wang, J., de Vries, A.P., Reinders, M.J.T.: Unifying user-based and item-
    based collaborative filtering approaches by similarity fusion. In: SIGIR '06:
    Proceedings of the 29th annual international ACM SIGIR conference on
    Research and development in information retrieval, New York, NY, ACM Press

Goal driven collaborative filtering (ECIR 2010)


Editor's notes

  1. Brief introduction. Name, where from. I am going to present this work on goal-driven collaborative filtering. This is a simple idea based on the assumption that not all the errors the system makes are equal.
  2. I will start with a brief introduction on collaborative filtering.
  3. This can be applied to a variety of items, for example movie recommendation. Sometimes other content is taken into account, such as the user's gender or geographic location, the type of the item, etc. In this work we use ratings for prediction. Example: Sophie, Peter and Jaden.
  4. Expected value of the squared difference between the predicted rating and the observed value. Used for the Netflix competition. Since the error is squared, we emphasize large errors. Obviously, large errors occur at the ends of the rating scale.
  5. (Add goal-driven design image.) We argue that current algorithms do not always optimize performance based on user needs. They optimize against a performance measure that does not always reflect user needs. In addition, different user needs require different performance measures. Therefore the system should be defined based on user needs, and performance should be measured accordingly. The measure might indicate the qualities that the algorithm should possess, but the algorithm should be designed based on user needs, not only on the measure.
  6. This graph shows the probability that the model over-predicts or under-predicts certain groups of items. For example, items rated 3 are more likely to be over-predicted than under-predicted. Also note that the model works best with items rated 4, where accuracy is highest, because this group has the largest number of training points.
  7. Models... Extract factors. In this work we used SVD as the baseline algorithm.
  8. Uninteresting items. Depending on the way the items are presented: top-n list, exploring. Same error, but the question is whether this error should
  9. Taste boundaries: the interval between liked and disliked items. On a rating scale from 1 to 5 the boundary would be 3. The direction represents whether the predicted rating, with respect to the observed rating, is towards the taste boundary or not. Finally, the magnitude shows whether this directional error crosses the taste boundary or not.
  10. Here we obviously want the prediction to be correct, on the diagonal. If the prediction is not correct, we define the risk of predicting items differently depending on the criteria I just explained. The size of the arrow represents the magnitude of the risk, as we understand it. For example, it is more important to penalize lower-rated items receiving higher predictions than the other way around. Therefore the aim is to minimize error in the sectors identified as more important.
  11. We define a weighting function that is a function of p, the predicted value of the item, and r, the observed rating. 9 sectors. Reduce the probability that an item falls in a sector: red. Increase the probability that an item falls in a sector: green.
  12. The objective is to minimize the weighted squared error, where w is a function of the predicted value and the observed rating, as defined in the previous slide. The second part of the equation is the regularizing term, which avoids overfitting by penalizing the magnitude of the parameters. We solve this using gradient descent optimization, so that we find a number of factors for each item and user in the dataset. To calculate the prediction for an unknown item-user pair, we just take the dot product of the item and user vectors. Our contribution here is the weighting function, which forces the model to reduce error in specific sectors.
  13. We designed a two-level optimization in order to come up with the best set of weights. Weights were optimized on the second set and tested on the third.
  14. Genetic algorithms are search algorithms that work via the process of natural selection. They begin with a sample set of potential solutions, which then evolves toward a set of more optimal solutions. Within the sample set, solutions that are poor tend to die out while better solutions remain in the population, thus introducing more solutions into the set.
  15. Only use rating information
  16. We assess the system performance on the top-k list.
  17. Experiment I. We set the weights manually for the sectors where we wanted to reduce the error the most; these included w7, w8 and w4. The table shows that we reduced the probability that items fall into those sectors, but we also reduced the probability that items are correctly predicted.
  18. Five-fold cross validation; the results were tested and turned out to be statistically significant.
  19. Essentially this approach aims to minimize the error for the predefined sectors, which inevitably increases the error in other sectors. Fig. 5(a) shows the probability that true ratings are correctly predicted within our predefined taste boundary by the optimized versus the baseline approach, using the weights obtained in the second experiment (Table 4). As expected, the baseline approach predicts higher ratings better than our optimized approach, since the optimized approach does not penalize this type of error (high ratings predicted lower), whereas we have some improvement in the lower range where we aimed to reduce the error. This is a low-risk approach: it hurts performance at the higher end of the spectrum, where it is less risky to under-predict, and in exchange it reduces the error for items that are rated low. This means that users are less likely to get items that are not relevant to them (Fig. 5(b)).
  20. Emphasize difficult items and users