SlideShare ist ein Scribd-Unternehmen logo
1 von 24
ICML’11 Tutorial:
Recommender Problems for
       Web Applications
Deepak Agarwal and Bee-Chung Chen
                   Yahoo! Research
Other Significant Y! Labs Contributors

• Content Targeting
    –   Pradheep Elango
    –   Rajiv Khanna
    –   Raghu Ramakrishnan
    –   Xuanhui Wang
    –   Liang Zhang


• Ad Targeting
    – Nagaraj Kota




  Deepak Agarwal & Bee-Chung Chen @ ICML’11   2
Agenda

• Topic of Interest
    – Recommender problems for dynamic, time-sensitive applications
                                 dynamic
       • Content Optimization (main focus today), Online Advertising,
         Movie recommendation, shopping,…
• Introduction (20 min, Deepak)
• Offline components (40 min, Deepak)
    – Regression, Collaborative filtering (CF), …
• Online components + initialization (70 min, Bee-Chung)
    – Time-series, online/incremental methods, explore/exploit (bandit)
• Evaluation methods + Multi-Objective (10-15 min, Deepak)
• Challenges (5-10 min, Deepak)


  Deepak Agarwal & Bee-Chung Chen @ ICML’11                           3
Three components we will focus on today

• Defining the problem
    – Formulate objectives whose optimization achieves some long-
      term goals for the recommender system
         • E.g. How to serve content to optimize audience reach and engagement,
           optimize some combination of engagement and revenue ?

• Modeling (to estimate some critical inputs)
    – Predict rates of some positive user interaction(s) with items based
      on data obtained from historical user-item interactions
       • E.g. Click rates, average time-spent on page, etc
       • Could be explicit feedback like ratings
• Experimentation
    – Create experiments to collect data proactively to improve models,
      helps in converging to the best choice(s) cheaply and rapidly.
       • Explore and Exploit (continuous experimentation)
       • DOE (testing hypotheses by avoiding bias inherent in data)
  Deepak Agarwal & Bee-Chung Chen @ ICML’11                                       4
Modern Recommendation Systems

• Goal
    – Serve the right item to a user in a given context to optimize long-
      term business objectives
• A scientific discipline that involves
    – Large scale Machine Learning & Statistics
         • Offline Models (capture global & stable characteristics)
         • Online Models (incorporates dynamic components)
         • Explore/Exploit (active and adaptive experimentation)
    – Multi-Objective Optimization
         • Click-rates (CTR), Engagement, advertising revenue, diversity, etc
    – Inferring user interest
         • Constructing User Profiles
    – Natural Language Processing to understand content
         • Topics, “aboutness”, entities, follow-up of something, breaking news,…


  Deepak Agarwal & Bee-Chung Chen @ ICML’11                                     5
Some examples from content optimization

• Simple version
    – I have a content module on my page, content inventory is obtained
      from a third party source which is further refined through editorial
      oversight. Can I algorithmically recommend content on this
      module? I want to improve overall click-rate (CTR) on this module
• More advanced
    – I got X% lift in CTR. But I have additional information on other
      downstream utilities (e.g. advertising revenue). Can I increase
      downstream utility without losing too many clicks?
• Highly advanced
    – There are multiple modules running on my webpage. How do I
      perform a simultaneous optimization?




  Deepak Agarwal & Bee-Chung Chen @ ICML’11                              6
Recommend search queries

                                              Recommend packages:
                                                Image
                                                Title, summary
                                                Links to other pages

                                              Pick 4 out of a pool of K
                                                K = 20 ~ 40
                                                Dynamic

                                              Routes traffic other pages

                     Recommend applications   Recommend news article
Deepak Agarwal & Bee-Chung Chen @ ICML’11                                  7
Problems in this example

• Optimize CTR on multiple modules
    – Today Module, Trending Now, Personal Assistant, News
    – Simple solution: Treat modules as independent, optimize
      separately. May not be the best when there are strong correlations.



• For any single module
    – Optimize some combination of CTR, downstream engagement,
      and perhaps advertising revenue.




  Deepak Agarwal & Bee-Chung Chen @ ICML’11                           8
Online Advertising

                            Response rates
                            (click, conversion, ad-view)
                                                                            Bids
     conversion
                          ML /Statistical                  Auction
                          model




                                                                                        Advertisers
                                                Select argmax f(bid,response rates)
                                Click
                                              Recommend
                             Ads
                                                Best ad(s)
                                                                Ad Network
                            Page
   User                                                           •Examples:
                                                             Yahoo, Google, MSN, …
                                                            Ad exchanges (RightMedia,
                                                                 DoubleClick, …)

                             Publisher
  Deepak Agarwal & Bee-Chung Chen @ ICML’11                                                  9
Recommender problems in general

  • Example applications
      • Search: Web, Vertical
      • Online Advertising                           Item Inventory
      • Content                                      Articles, web page,
      •…..                                                  ads, …

                   Context
                      query, page, …
                                               Use an automated algorithm
                                                 to select item(s) to show

                                             Get feedback (click, time spent,..)
        USER
                                                    Refine the models

                                              Repeat (large number of times)
                                               Optimize metric(s) of interest
                                              (Total clicks, Total revenue,…)


 Deepak Agarwal & Bee-Chung Chen @ ICML’11                                         10
Important Factors

  • Items: Articles, ads, modules, movies, users, updates, etc.

  • Context: query keywords, pages, mobile, social media, etc.

  • Metric to optimize (e.g., relevance score, CTR, revenue, engagement)
      – Currently, most applications are single-objective
      – Could be multi-objective optimization (maximize X subject to Y, Z,..)


  • Properties of the item pool
      – Size (e.g., all web pages vs. 40 stories)
      – Quality of the pool (e.g., anything vs. editorially selected)
      – Lifetime (e.g., mostly old items vs. mostly new items)




   Deepak Agarwal & Bee-Chung Chen @ ICML’11                                    11
Factors affecting Solution (continued)

• Properties of the context
    – Pull: Specified by explicit, user-driven query (e.g., keywords, a form)
    – Push: Specified by implicit context (e.g., a page, a user, a session)
       • Most applications are somewhere on continuum of pull and push

• Properties of the feedback on the matches made
    – Types and semantics of feedback (e.g., click, vote)
    – Latency (e.g., available in 5 minutes vs. 1 day)
    – Volume (e.g., 100K per day vs. 300M per day)

• Constraints specifying legitimate matches
    – e.g., business rules, diversity rules, editorial Voice
    – Multiple objectives
• Available Metadata (e.g., link graph, various user/item attributes)


  Deepak Agarwal & Bee-Chung Chen @ ICML’11                                     12
Predicting User-Item Interactions (e.g. CTR)

• Myth: We have so much data on the web, if we can only
  process it the problem is solved
    – Number of things to learn increases with sample size
       • Rate of increase is not slow
    – Dynamic nature of systems make things worse
    – We want to learn things quickly and react fast


• Data is sparse in web recommender problems
    – We lack enough data to learn all we want to learn and as quickly
      as we would like to learn
    – Several Power laws interacting with each other
       • E.g. User visits power law, items served power law
               – Bivariate Zipf: Owen & Dyer, 2011


  Deepak Agarwal & Bee-Chung Chen @ ICML’11                              13
Can Machine Learning help?

• Fortunately, there are group behaviors that generalize to
  individuals & they are relatively stable
    – E.g. Users in San Francisco tend to read more baseball news


• Key issue: Estimating such groups
    – Coarse group : more stable but does not generalize that well.
    – Granular group: less stable with few individuals
    – Getting a good grouping structure is to hit the “sweet spot”


• Another big advantage on the web
    – Intervene and run small experiments on a small population to
      collect data that helps rapid convergence to the best choices(s)
         • We don’t need to learn all user-item interactions, only those that are good.


  Deepak Agarwal & Bee-Chung Chen @ ICML’11                                               14
Predicting user-item interaction rates

                                         Feature construction
                               Content: IR, clustering, taxonomy, entity,..
                             User profiles: clicks, views, social, community,..




             Offline                                                      Online
( Captures stable characteristics       Initialize                   (Finer resolution
     at coarse resolutions)                                             Corrections)
    (Logistic, Boosting,….)                                          (item, user level)
                                                                     (Quick updates)



                                        Explore/Exploit
                                     (Adaptive sampling)
                                  (helps rapid convergence
                                       to best choices)

    Deepak Agarwal & Bee-Chung Chen @ ICML’11                                             15
Post-click: An example in Content Optimization

Recommender                •
                           EDITORIAL
                                                AD SERVER

               Clicks on FP links influence       DISPLAY
               downstream supply distribution
content                                         ADVERTISING Revenue




                                                  Downstream
                                                  engagement

                                                  (Time spent)




   Deepak Agarwal & Bee-Chung Chen @ ICML’11                     16
Serving Content on Front Page: Click Shaping

• What do we want to optimize?
•   Current: Maximize clicks (maximize downstream supply from FP)
•   But consider the following
      – Article 1: CTR=5%, utility per click = 5
      – Article 2: CTR=4.9%, utility per click=10
         • By promoting 2, we lose 1 click/100 visits, gain 5 utils
•   If we do this for a large number of visits --- lose some clicks but obtain
    significant gains in utility?
      – E.g. lose 5% relative CTR, gain 40% in utility (revenue, engagement, etc)




    Deepak Agarwal & Bee-Chung Chen @ ICML’11                                   17
Example Application:
      Today Module on Yahoo! Homepage

Currently in production, powered by some methods
                           discussed in this tutorial
Recommend packages:
                                                      Image
                                                      Title, summary
                      1        2            3   4     Links to other pages

                                                    Pick 4 out of a pool of K
                                                      K = 20 ~ 40
                                                      Dynamic

                                                    Routes traffic other pages



Deepak Agarwal & Bee-Chung Chen @ ICML’11                                        19
Problem definition

• Display “best” articles for each user visit
• Best - Maximize User Satisfaction, Engagement
    – BUT Hard to obtain quick feedback to measure these


• Approximation
    – Maximize utility based on immediate feedback (click rate) subject
      to constraints (relevance, freshness, diversity)
• Inventory of articles?
    – Created by human editors
    – Small pool (30-50 articles) but refreshes periodically




  Deepak Agarwal & Bee-Chung Chen @ ICML’11                           20
Where are we today?

• Before this research
    – Articles created and selected for display by editors
• After this research
    – Article placement done through statistical models
• How successful ?
 "Just look at our homepage, for example. Since we began pairing our content
   optimization technology with editorial expertise, we've seen click-through rates
   in the Today module more than double. ----- Carol Bartz, CEO Yahoo! Inc (Q4,
   2009)




  Deepak Agarwal & Bee-Chung Chen @ ICML’11                                     21
Main Goals

• Methods to select most popular articles
    – This was done by editors before


• Provide personalized article selection
    – Based on user covariates
    – Based on per user behavior


• Scalability: Methods to generalize in small traffic scenarios
    – Today module part of most Y! portals around the world
    – Also syndicated to sources like Y! Mail, Y! IM etc




  Deepak Agarwal & Bee-Chung Chen @ ICML’11                   22
Similar applications

•   Goal: Use same methods for selecting most popular, personalization
    across different applications at Y!
•   Good news! Methods generalize, already in use




    Deepak Agarwal & Bee-Chung Chen @ ICML’11                        23
Next few hours



                                   Most Popular          Personalized
                                   Recommendation        Recommendation

  Offline Models                                         Collaborative filtering
                                                         (cold-start problem)

  Online Models                    Time-series models    Incremental CF,
                                                         online regression

  Intelligent Initialization       Prior estimation      Prior estimation,
                                                         dimension reduction

  Explore/Exploit                  Multi-armed bandits   Bandits with covariates




  Deepak Agarwal & Bee-Chung Chen @ ICML’11                                        24

Weitere ähnliche Inhalte

Ähnlich wie Recommender Systems Tutorial (Part 1) -- Introduction

Recommendation engines matching items to users
Recommendation engines matching items to usersRecommendation engines matching items to users
Recommendation engines matching items to usersFlytxt
 
Scalable advertising recommender systems
Scalable advertising recommender systemsScalable advertising recommender systems
Scalable advertising recommender systemsJoaquin Delgado PhD.
 
Aiinpractice2017deepaklongversion
Aiinpractice2017deepaklongversionAiinpractice2017deepaklongversion
Aiinpractice2017deepaklongversionDeepak Agarwal
 
Preference Elicitation Interface
Preference Elicitation InterfacePreference Elicitation Interface
Preference Elicitation Interface晓愚 孟
 
Mini-training: Personalization & Recommendation Demystified
Mini-training: Personalization & Recommendation DemystifiedMini-training: Personalization & Recommendation Demystified
Mini-training: Personalization & Recommendation DemystifiedBetclic Everest Group Tech Team
 
Key Lessons Learned Building Recommender Systems for Large-Scale Social Netw...
 Key Lessons Learned Building Recommender Systems for Large-Scale Social Netw... Key Lessons Learned Building Recommender Systems for Large-Scale Social Netw...
Key Lessons Learned Building Recommender Systems for Large-Scale Social Netw...Christian Posse
 
Behemoth SEO: Search Strategy for Huge Websites
Behemoth SEO: Search Strategy for Huge WebsitesBehemoth SEO: Search Strategy for Huge Websites
Behemoth SEO: Search Strategy for Huge WebsitesPhilipp Klöckner
 
L'Oreal Tech Talk
L'Oreal Tech TalkL'Oreal Tech Talk
L'Oreal Tech TalkDoug Chang
 
Utilizing Marginal Net Utility for Recommendation in E-commerce
Utilizing Marginal Net Utility for Recommendation in E-commerceUtilizing Marginal Net Utility for Recommendation in E-commerce
Utilizing Marginal Net Utility for Recommendation in E-commerceLiangjie Hong
 
Design the Search Experience
Design the Search ExperienceDesign the Search Experience
Design the Search ExperienceMarianne Sweeny
 
Data Science for Digital Commerce
Data Science for Digital CommerceData Science for Digital Commerce
Data Science for Digital CommerceManish Gupta, Ph.D.
 
Sept 20 2012 ona show me the numbers
Sept 20 2012 ona  show me the numbersSept 20 2012 ona  show me the numbers
Sept 20 2012 ona show me the numbersHack the Hood
 
Conversion Rate Optmization
Conversion Rate OptmizationConversion Rate Optmization
Conversion Rate OptmizationEdureka!
 
Narrative Mind Week 6 H4D Stanford 2016
Narrative Mind Week 6 H4D Stanford 2016Narrative Mind Week 6 H4D Stanford 2016
Narrative Mind Week 6 H4D Stanford 2016Stanford University
 
Business model innovation in the cloud v1
Business model innovation in the cloud v1Business model innovation in the cloud v1
Business model innovation in the cloud v1Michael Netzley, Ph.D.
 
Social Media Use Cases - Webinar
Social Media Use Cases - Webinar Social Media Use Cases - Webinar
Social Media Use Cases - Webinar Mechanical Turk
 
Digital analytics: Optimization (Lecture 10)
Digital analytics: Optimization (Lecture 10)Digital analytics: Optimization (Lecture 10)
Digital analytics: Optimization (Lecture 10)Joni Salminen
 
Pubcon Presentation on How to Resolve Search Engine Reputation Problems
Pubcon Presentation on How to Resolve Search Engine Reputation ProblemsPubcon Presentation on How to Resolve Search Engine Reputation Problems
Pubcon Presentation on How to Resolve Search Engine Reputation ProblemsOnline Reputation Management
 

Ähnlich wie Recommender Systems Tutorial (Part 1) -- Introduction (20)

Recommendation engines matching items to users
Recommendation engines matching items to usersRecommendation engines matching items to users
Recommendation engines matching items to users
 
kdd2015
kdd2015kdd2015
kdd2015
 
Scalable advertising recommender systems
Scalable advertising recommender systemsScalable advertising recommender systems
Scalable advertising recommender systems
 
Aiinpractice2017deepaklongversion
Aiinpractice2017deepaklongversionAiinpractice2017deepaklongversion
Aiinpractice2017deepaklongversion
 
Preference Elicitation Interface
Preference Elicitation InterfacePreference Elicitation Interface
Preference Elicitation Interface
 
Mini-training: Personalization & Recommendation Demystified
Mini-training: Personalization & Recommendation DemystifiedMini-training: Personalization & Recommendation Demystified
Mini-training: Personalization & Recommendation Demystified
 
Key Lessons Learned Building Recommender Systems for Large-Scale Social Netw...
 Key Lessons Learned Building Recommender Systems for Large-Scale Social Netw... Key Lessons Learned Building Recommender Systems for Large-Scale Social Netw...
Key Lessons Learned Building Recommender Systems for Large-Scale Social Netw...
 
Behemoth SEO: Search Strategy for Huge Websites
Behemoth SEO: Search Strategy for Huge WebsitesBehemoth SEO: Search Strategy for Huge Websites
Behemoth SEO: Search Strategy for Huge Websites
 
L'Oreal Tech Talk
L'Oreal Tech TalkL'Oreal Tech Talk
L'Oreal Tech Talk
 
Utilizing Marginal Net Utility for Recommendation in E-commerce
Utilizing Marginal Net Utility for Recommendation in E-commerceUtilizing Marginal Net Utility for Recommendation in E-commerce
Utilizing Marginal Net Utility for Recommendation in E-commerce
 
Design the Search Experience
Design the Search ExperienceDesign the Search Experience
Design the Search Experience
 
Data Science for Digital Commerce
Data Science for Digital CommerceData Science for Digital Commerce
Data Science for Digital Commerce
 
Sept 20 2012 ona show me the numbers
Sept 20 2012 ona  show me the numbersSept 20 2012 ona  show me the numbers
Sept 20 2012 ona show me the numbers
 
SEO + MTurk
SEO + MTurk SEO + MTurk
SEO + MTurk
 
Conversion Rate Optmization
Conversion Rate OptmizationConversion Rate Optmization
Conversion Rate Optmization
 
Narrative Mind Week 6 H4D Stanford 2016
Narrative Mind Week 6 H4D Stanford 2016Narrative Mind Week 6 H4D Stanford 2016
Narrative Mind Week 6 H4D Stanford 2016
 
Business model innovation in the cloud v1
Business model innovation in the cloud v1Business model innovation in the cloud v1
Business model innovation in the cloud v1
 
Social Media Use Cases - Webinar
Social Media Use Cases - Webinar Social Media Use Cases - Webinar
Social Media Use Cases - Webinar
 
Digital analytics: Optimization (Lecture 10)
Digital analytics: Optimization (Lecture 10)Digital analytics: Optimization (Lecture 10)
Digital analytics: Optimization (Lecture 10)
 
Pubcon Presentation on How to Resolve Search Engine Reputation Problems
Pubcon Presentation on How to Resolve Search Engine Reputation ProblemsPubcon Presentation on How to Resolve Search Engine Reputation Problems
Pubcon Presentation on How to Resolve Search Engine Reputation Problems
 

Kürzlich hochgeladen

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Kürzlich hochgeladen (20)

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Recommender Systems Tutorial (Part 1) -- Introduction

  • 1. ICML’11 Tutorial: Recommender Problems for Web Applications Deepak Agarwal and Bee-Chung Chen Yahoo! Research
  • 2. Other Significant Y! Labs Contributors • Content Targeting – Pradheep Elango – Rajiv Khanna – Raghu Ramakrishnan – Xuanhui Wang – Liang Zhang • Ad Targeting – Nagaraj Kota Deepak Agarwal & Bee-Chung Chen @ ICML’11 2
  • 3. Agenda • Topic of Interest – Recommender problems for dynamic, time-sensitive applications dynamic • Content Optimization (main focus today), Online Advertising, Movie recommendation, shopping,… • Introduction (20 min, Deepak) • Offline components (40 min, Deepak) – Regression, Collaborative filtering (CF), … • Online components + initialization (70 min, Bee-Chung) – Time-series, online/incremental methods, explore/exploit (bandit) • Evaluation methods + Multi-Objective (10-15 min, Deepak) • Challenges (5-10 min, Deepak) Deepak Agarwal & Bee-Chung Chen @ ICML’11 3
  • 4. Three components we will focus on today • Defining the problem – Formulate objectives whose optimization achieves some long- term goals for the recommender system • E.g. How to serve content to optimize audience reach and engagement, optimize some combination of engagement and revenue ? • Modeling (to estimate some critical inputs) – Predict rates of some positive user interaction(s) with items based on data obtained from historical user-item interactions • E.g. Click rates, average time-spent on page, etc • Could be explicit feedback like ratings • Experimentation – Create experiments to collect data proactively to improve models, helps in converging to the best choice(s) cheaply and rapidly. • Explore and Exploit (continuous experimentation) • DOE (testing hypotheses by avoiding bias inherent in data) Deepak Agarwal & Bee-Chung Chen @ ICML’11 4
  • 5. Modern Recommendation Systems • Goal – Serve the right item to a user in a given context to optimize long- term business objectives • A scientific discipline that involves – Large scale Machine Learning & Statistics • Offline Models (capture global & stable characteristics) • Online Models (incorporates dynamic components) • Explore/Exploit (active and adaptive experimentation) – Multi-Objective Optimization • Click-rates (CTR), Engagement, advertising revenue, diversity, etc – Inferring user interest • Constructing User Profiles – Natural Language Processing to understand content • Topics, “aboutness”, entities, follow-up of something, breaking news,… Deepak Agarwal & Bee-Chung Chen @ ICML’11 5
  • 6. Some examples from content optimization • Simple version – I have a content module on my page, content inventory is obtained from a third party source which is further refined through editorial oversight. Can I algorithmically recommend content on this module? I want to improve overall click-rate (CTR) on this module • More advanced – I got X% lift in CTR. But I have additional information on other downstream utilities (e.g. advertising revenue). Can I increase downstream utility without losing too many clicks? • Highly advanced – There are multiple modules running on my webpage. How do I perform a simultaneous optimization? Deepak Agarwal & Bee-Chung Chen @ ICML’11 6
  • 7. Recommend search queries Recommend packages: Image Title, summary Links to other pages Pick 4 out of a pool of K K = 20 ~ 40 Dynamic Routes traffic other pages Recommend applications Recommend news article Deepak Agarwal & Bee-Chung Chen @ ICML’11 7
  • 8. Problems in this example • Optimize CTR on multiple modules – Today Module, Trending Now, Personal Assistant, News – Simple solution: Treat modules as independent, optimize separately. May not be the best when there are strong correlations. • For any single module – Optimize some combination of CTR, downstream engagement, and perhaps advertising revenue. Deepak Agarwal & Bee-Chung Chen @ ICML’11 8
  • 9. Online Advertising Response rates (click, conversion, ad-view) Bids conversion ML /Statistical Auction model Advertisers Select argmax f(bid,response rates) Click Recommend Ads Best ad(s) Ad Network Page User •Examples: Yahoo, Google, MSN, … Ad exchanges (RightMedia, DoubleClick, …) Publisher Deepak Agarwal & Bee-Chung Chen @ ICML’11 9
  • 10. Recommender problems in general • Example applications • Search: Web, Vertical • Online Advertising Item Inventory • Content Articles, web page, •….. ads, … Context query, page, … Use an automated algorithm to select item(s) to show Get feedback (click, time spent,..) USER Refine the models Repeat (large number of times) Optimize metric(s) of interest (Total clicks, Total revenue,…) Deepak Agarwal & Bee-Chung Chen @ ICML’11 10
  • 11. Important Factors • Items: Articles, ads, modules, movies, users, updates, etc. • Context: query keywords, pages, mobile, social media, etc. • Metric to optimize (e.g., relevance score, CTR, revenue, engagement) – Currently, most applications are single-objective – Could be multi-objective optimization (maximize X subject to Y, Z,..) • Properties of the item pool – Size (e.g., all web pages vs. 40 stories) – Quality of the pool (e.g., anything vs. editorially selected) – Lifetime (e.g., mostly old items vs. mostly new items) Deepak Agarwal & Bee-Chung Chen @ ICML’11 11
  • 12. Factors affecting Solution (continued) • Properties of the context – Pull: Specified by explicit, user-driven query (e.g., keywords, a form) – Push: Specified by implicit context (e.g., a page, a user, a session) • Most applications are somewhere on continuum of pull and push • Properties of the feedback on the matches made – Types and semantics of feedback (e.g., click, vote) – Latency (e.g., available in 5 minutes vs. 1 day) – Volume (e.g., 100K per day vs. 300M per day) • Constraints specifying legitimate matches – e.g., business rules, diversity rules, editorial Voice – Multiple objectives • Available Metadata (e.g., link graph, various user/item attributes) Deepak Agarwal & Bee-Chung Chen @ ICML’11 12
  • 13. Predicting User-Item Interactions (e.g. CTR) • Myth: We have so much data on the web, if we can only process it the problem is solved – Number of things to learn increases with sample size • Rate of increase is not slow – Dynamic nature of systems make things worse – We want to learn things quickly and react fast • Data is sparse in web recommender problems – We lack enough data to learn all we want to learn and as quickly as we would like to learn – Several Power laws interacting with each other • E.g. User visits power law, items served power law – Bivariate Zipf: Owen & Dyer, 2011 Deepak Agarwal & Bee-Chung Chen @ ICML’11 13
  • 14. Can Machine Learning help? • Fortunately, there are group behaviors that generalize to individuals & they are relatively stable – E.g. Users in San Francisco tend to read more baseball news • Key issue: Estimating such groups – Coarse group : more stable but does not generalize that well. – Granular group: less stable with few individuals – Getting a good grouping structure is to hit the “sweet spot” • Another big advantage on the web – Intervene and run small experiments on a small population to collect data that helps rapid convergence to the best choices(s) • We don’t need to learn all user-item interactions, only those that are good. Deepak Agarwal & Bee-Chung Chen @ ICML’11 14
  • 15. Predicting user-item interaction rates Feature construction Content: IR, clustering, taxonomy, entity,.. User profiles: clicks, views, social, community,.. Offline Online ( Captures stable characteristics Initialize (Finer resolution at coarse resolutions) Corrections) (Logistic, Boosting,….) (item, user level) (Quick updates) Explore/Exploit (Adaptive sampling) (helps rapid convergence to best choices) Deepak Agarwal & Bee-Chung Chen @ ICML’11 15
  • 16. Post-click: An example in Content Optimization Recommender • EDITORIAL AD SERVER Clicks on FP links influence DISPLAY downstream supply distribution content ADVERTISING Revenue Downstream engagement (Time spent) Deepak Agarwal & Bee-Chung Chen @ ICML’11 16
  • 17. Serving Content on Front Page: Click Shaping • What do we want to optimize? • Current: Maximize clicks (maximize downstream supply from FP) • But consider the following – Article 1: CTR=5%, utility per click = 5 – Article 2: CTR=4.9%, utility per click=10 • By promoting 2, we lose 1 click/100 visits, gain 5 utils • If we do this for a large number of visits --- lose some clicks but obtain significant gains in utility? – E.g. lose 5% relative CTR, gain 40% in utility (revenue, engagement, etc) Deepak Agarwal & Bee-Chung Chen @ ICML’11 17
  • 18. Example Application: Today Module on Yahoo! Homepage Currently in production, powered by some methods discussed in this tutorial
  • 19. Recommend packages: Image Title, summary 1 2 3 4 Links to other pages Pick 4 out of a pool of K K = 20 ~ 40 Dynamic Routes traffic other pages Deepak Agarwal & Bee-Chung Chen @ ICML’11 19
  • 20. Problem definition • Display “best” articles for each user visit • Best - Maximize User Satisfaction, Engagement – BUT Hard to obtain quick feedback to measure these • Approximation – Maximize utility based on immediate feedback (click rate) subject to constraints (relevance, freshness, diversity) • Inventory of articles? – Created by human editors – Small pool (30-50 articles) but refreshes periodically Deepak Agarwal & Bee-Chung Chen @ ICML’11 20
  • 21. Where are we today? • Before this research – Articles created and selected for display by editors • After this research – Article placement done through statistical models • How successful ? "Just look at our homepage, for example. Since we began pairing our content optimization technology with editorial expertise, we've seen click-through rates in the Today module more than double. ----- Carol Bartz, CEO Yahoo! Inc (Q4, 2009) Deepak Agarwal & Bee-Chung Chen @ ICML’11 21
  • 22. Main Goals • Methods to select most popular articles – This was done by editors before • Provide personalized article selection – Based on user covariates – Based on per user behavior • Scalability: Methods to generalize in small traffic scenarios – Today module part of most Y! portals around the world – Also syndicated to sources like Y! Mail, Y! IM etc Deepak Agarwal & Bee-Chung Chen @ ICML’11 22
  • 23. Similar applications • Goal: Use same methods for selecting most popular, personalization across different applications at Y! • Good news! Methods generalize, already in use Deepak Agarwal & Bee-Chung Chen @ ICML’11 23
  • 24. Next few hours Most Popular Personalized Recommendation Recommendation Offline Models Collaborative filtering (cold-start problem) Online Models Time-series models Incremental CF, online regression Intelligent Initialization Prior estimation Prior estimation, dimension reduction Explore/Exploit Multi-armed bandits Bandits with covariates Deepak Agarwal & Bee-Chung Chen @ ICML’11 24

Hinweis der Redaktion

  1. This only shows one scenario; that of content match. Let’s add Sponsored Search (Replace Content with Query) and Have a new slide for display advertising. This also does not provide info for the revenue model (shall we add it here or later).