Recommender systems evaluation: a 3D benchmark
  Alan Said (1), Domonkos Tikk (2), Yue Shi (3), Martha Larson (3),
  Klára Stumpf (2), Paolo Cremonesi (4)

1: TU Berlin
2: Gravity R&D
3: TU Delft
4: Politecnico di Milano/Moviri
Motivation
• Current recsys evaluation benchmarks are
  insufficient
  – mostly focused on IR measures (RMSE,
    MAP@X, precision/recall)
  – do not consider the needs of all stakeholders
    (users, content provider, recsys vendor)
  – technological and business requirements are
    mostly overlooked
• 3D Recommender System Benchmarking
  Model
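
For reference, the IR-style measures named above can be sketched as follows. This is a minimal illustration, not code from the talk: the function names and the toy data are assumptions, and recall is taken to be 0 when there are no relevant items.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and true ratings."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def precision_recall_at_k(ranked_items, relevant_items, k):
    """Precision and recall over the top-k entries of a ranked list."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    precision = hits / k
    # Assumption: recall is defined as 0 when nothing is relevant.
    recall = hits / len(relevant_items) if relevant_items else 0.0
    return precision, recall

# Toy data (hypothetical): two rating predictions and one ranked list.
print(rmse([1.0, 2.0], [2.0, 3.0]))
print(precision_recall_at_k(["a", "b", "c", "d"], {"a", "c", "e"}, k=2))
```

Both measures describe prediction or ranking accuracy only, which is the limitation this slide points at.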
Stakeholders
• users
• content or service provider
• recommender
The Proposed 3D model
Recent benchmarks (1)

• pros:
  – Large scale
  – very well organized
• cons:
  – quality assessment of recommendations is
    reduced to RMSE
  – rating prediction (not ranking)
  – no focus on direct business and technical
    parameters (scalability, robustness, reactivity)
Recent benchmarks (2)


• pros:
  – constraints on training and response time
  – real traffic (only planned)
  – major driver: revenue increase
• cons:
  – only business goals, but otherwise unclear
    optimization criteria
  – user needs are neglected
  – organization
Recent benchmarks (3)


• pros:
  – availability of additional metadata (compared to
    KDD Cup 2011)
  – not rating based (implicit feedback)
  – ranking based evaluation metric (MAP@500)
• cons:
  – offline evaluation
  – size does not matter anymore (lower interest)
  – no business requirements or technical constraints
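
A ranking-based metric such as MAP@500 can be sketched as below. This is one common definition of average precision at k (normalised by min(k, number of relevant items)), chosen here as an assumption; the exact variant used by the benchmark is not stated on the slide. The function names and toy data are hypothetical, and each ranked list is assumed to contain no duplicates.

```python
def average_precision_at_k(ranked_items, relevant_items, k):
    """Average precision over the top-k entries of one ranked list."""
    if not relevant_items:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item in relevant_items:
            hits += 1
            score += hits / rank  # precision at this rank
    # Assumption: normalise by the best achievable number of hits.
    return score / min(k, len(relevant_items))

def map_at_k(rankings, relevants, k):
    """Mean of average_precision_at_k over all users."""
    scores = [average_precision_at_k(r, rel, k)
              for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)

# Toy data (hypothetical): two users, k = 3 instead of 500.
rankings = [["a", "b", "c"], ["x", "y"]]
relevants = [{"a", "c"}, {"y"}]
print(map_at_k(rankings, relevants, k=3))
```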
3D MODEL
User requirements
• functional (quality-related)
  – relevant, interesting, novel, diverse,
    serendipitous, context-aware, ethical, etc.
• non-functional (technology-related)
  – real-time
  – usability-related
Business requirements
• Business model
  – for-profit: revenue stream
  – non-profit (NP) style: award-driven (reputation,
    community building)
• KPI depends on the application area
  – Revenue increase
  – CTR
  – Raise awareness of content or service
Technical constraints
• data driven
  – availability of user feedback (e.g. satellite TV)
• system driven
  – hardware/software limitations
    (device-dependent)
• scalability
  – typical response time
• robustness
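
A "typical response time" can be estimated by timing a batch of requests and reading off a percentile. The sketch below is an illustration under assumptions: `recommend` stands for any callable that serves one request, the helper names are hypothetical, and the percentile uses the simple nearest-rank rule.

```python
import math
import time

def measure_latencies(recommend, requests):
    """Wall-clock seconds taken by recommend() for each request."""
    latencies = []
    for request in requests:
        start = time.perf_counter()
        recommend(request)
        latencies.append(time.perf_counter() - start)
    return latencies

def percentile(values, fraction):
    """Nearest-rank percentile, e.g. fraction=0.95 for the 95th."""
    ordered = sorted(values)
    index = max(0, math.ceil(fraction * len(ordered)) - 1)
    return ordered[index]

# Hypothetical stand-in for a real recommender call.
def dummy_recommend(user_id):
    return [user_id, user_id + 1]

latencies = measure_latencies(dummy_recommend, range(100))
print("median:", percentile(latencies, 0.5))
print("95th percentile:", percentile(latencies, 0.95))
```

Reporting a high percentile alongside the median shows whether slow outliers would violate a response-time constraint.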
Example
• VoD recommendation scenario (TV)
  – user: easy content exploration,
    context-awareness (time, viewer identification)
  – business: increase VoD sales & awareness
    (user base)
  – technical: middleware, HW/SW of the
    provider, response time
Conclusion
• Recommendation tasks have many aspects
  that are typically overlooked
• Tasks define the important user, business,
  and technical quality measures
  – the fulfilment of all is required at a certain level
  – a trade-off is usually required
• Proposal: with our 3D evaluation concept
  a more comprehensive evaluation can be
  achieved
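
One way to read "fulfilment of all at a certain level, plus a trade-off" is a minimum threshold per axis followed by a weighted combination. The slides do not prescribe how the three axes are combined, so the sketch below is purely an assumption: the class, the function, the [0, 1] score scale and the weights are all hypothetical.

```python
from dataclasses import dataclass

AXES = ("user", "business", "technical")

@dataclass
class AxisValues:
    """One value per axis; used for scores, minimums and weights."""
    user: float
    business: float
    technical: float

def evaluate_3d(scores, minimums, weights):
    """Return a weighted score, or None if any axis is below its minimum."""
    for axis in AXES:
        if getattr(scores, axis) < getattr(minimums, axis):
            return None  # required level not met on this axis
    total_weight = sum(getattr(weights, axis) for axis in AXES)
    weighted = sum(getattr(scores, axis) * getattr(weights, axis)
                   for axis in AXES)
    return weighted / total_weight

# Hypothetical numbers: scores normalised to [0, 1] per axis.
minimums = AxisValues(user=0.5, business=0.5, technical=0.5)
weights = AxisValues(user=1.0, business=1.0, technical=2.0)
print(evaluate_3d(AxisValues(0.8, 0.6, 0.9), minimums, weights))
print(evaluate_3d(AxisValues(0.9, 0.2, 0.9), minimums, weights))
```

The second call returns None because the business axis falls below its minimum, however strong the other two axes are.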

Recommender Systems Evaluation: A 3D Benchmark - presented at RUE 2012 workshop at ACM Recsys 2012
