Online Parameter Selection for Web-based
Ranking via Bayesian Optimization
Sep 12 2019
Kinjal Basu
Staff Software Engineer,
Flagship AI
Agenda
1 Problem Setup: LinkedIn Feed
2 Reformulation as a Black-Box Optimization
3 Explore-Exploit Algorithm: Thompson Sampling
4 Infrastructure
5 Results
LinkedIn Feed
Mission: Enable members to build an active professional community that advances their career.
Heterogeneous list:
• Shares from a member's connections
• Recommendations such as jobs, articles, courses, etc.
• Sponsored content or ads
[Figure: a member's feed mixing shares, jobs, articles and ads]
Important Metrics
• Viral Actions (VA): Members liked, shared or commented on an item.
• Job Applies (JA): Members applied for a job.
• Engaged Feed Sessions (EFS): Sessions where a member engaged with anything on the feed.
Ranking Function
  $S(m, u) := P_{VA}(m, u) + x_{EFS} \, P_{EFS}(m, u) + x_{JA} \, P_{JA}(m, u)$
• m – Member, u – Item
• The weight vector $x = (x_{EFS}, x_{JA})$ controls the balance between the three business metrics: EFS, VA and JA.
• A sample business strategy is
  Maximize $VA(x)$
  s.t. $EFS(x) > c_{EFS}$
       $JA(x) > c_{JA}$
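The ranking function above can be sketched in a few lines of Python. This is only an illustration: the item names and probability values are made up, and `score` and `rank_feed` are hypothetical helper names, not part of any LinkedIn library.

```python
from typing import Dict, List, Tuple


def score(p_va: float, p_efs: float, p_ja: float, x_efs: float, x_ja: float) -> float:
    """S(m, u) = P_VA + x_EFS * P_EFS + x_JA * P_JA for one (member, item) pair."""
    return p_va + x_efs * p_efs + x_ja * p_ja


def rank_feed(candidates: Dict[str, Tuple[float, float, float]],
              x_efs: float, x_ja: float) -> List[str]:
    """Return item ids sorted by descending score.

    `candidates` maps an item id to its (P_VA, P_EFS, P_JA) predictions.
    """
    return sorted(candidates,
                  key=lambda u: score(*candidates[u], x_efs, x_ja),
                  reverse=True)


# Hypothetical model outputs for one member and three candidate items.
candidates = {
    "share": (0.30, 0.50, 0.00),
    "job": (0.05, 0.20, 0.40),
    "article": (0.20, 0.60, 0.01),
}

va_only = rank_feed(candidates, x_efs=0.0, x_ja=0.0)
job_heavy = rank_feed(candidates, x_efs=0.1, x_ja=1.0)
```

Changing the weight vector reorders the same candidates, which is why the choice of x directly moves the three business metrics.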
Major Challenges
● The optimal value of x (tuning parameters) changes over time
● Example of changes
  • New content types are added
  • Score distribution changes (feature drift, updated models, etc.)
● With every change, engineers would manually find the optimal x
  • Run multiple A/B tests
  • Not the best use of engineering time
Reformulation into a Black-Box Optimization Problem
Modeling The Metrics
● $Y_{i,j}^{k}(x) \in \{0, 1\}$ denotes whether the $i$-th member, during the $j$-th session served with parameter $x$, did action $k$ or not. Here $k$ = VA, EFS or JA.
● We model this data as follows:
  $Y_i^{k}(x) \sim \text{Binomial}(n_i(x), \sigma(f_k(x)))$
● where $n_i(x)$ is the total number of sessions of member $i$ that were served with $x$, and $f_k$ is a latent function for the particular metric.
● Assume a Gaussian process prior on each of the latent functions $f_k$.
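The binomial model above can be simulated to see how session-level data relates to the latent function. This sketch assumes that σ is the logistic sigmoid, which the slides do not state explicitly; the session counts, the latent value and the helper names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid (assumed here to be the link function sigma)."""
    return 1.0 / (1.0 + np.exp(-z))


def simulate_counts(f_k_value, n_sessions, rng):
    """Draw Y_i^k ~ Binomial(n_i(x), sigmoid(f_k(x))) for every member i."""
    return rng.binomial(n_sessions, sigmoid(f_k_value))


def empirical_logit(y, n):
    """Pooled estimate of f_k(x) from counts.

    The +0.5 / +1 smoothing is an illustrative choice that keeps the
    logit finite when no member (or every member) performed the action.
    """
    p_hat = (y.sum() + 0.5) / (n.sum() + 1.0)
    return np.log(p_hat / (1.0 - p_hat))


rng = np.random.default_rng(0)
n_sessions = np.array([3, 10, 1, 7])   # hypothetical n_i(x) for four members
y = simulate_counts(-1.0, n_sessions, rng)
f_hat = empirical_logit(y, n_sessions)
```

With many members the pooled estimate concentrates around the latent value, which is what makes (x, f_k(x)) pairs usable as observations for a Gaussian process.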
Reformulation
We approximate each of the metrics as:
  $VA(x) = \sigma(f_{VA}(x))$
  $EFS(x) = \sigma(f_{EFS}(x))$
  $JA(x) = \sigma(f_{JA}(x))$
The original optimization problem can be written through this parametrization:
  Maximize $VA(x)$
  s.t. $EFS(x) > c_{EFS}$
       $JA(x) > c_{JA}$
becomes
  Maximize $\sigma(f_{VA}(x))$
  s.t. $\sigma(f_{EFS}(x)) > c_{EFS}$
       $\sigma(f_{JA}(x)) > c_{JA}$
which is then cast in the generic form
  Maximize $f(x)$, $x \in X$
Benefit: The last problem can now be solved using techniques from the Bayesian optimization literature.
Explore-Exploit Algorithms
Bayesian Optimization: A Quick Crash Course
  Maximize $f(x)$, $x \in X$
• Explore-exploit scheme to solve the problem above.
• Assume a Gaussian process prior on $f(x)$.
• Start with a uniform sample of points $x$ and observe $(x, f(x))$.
• Estimate the mean function and covariance kernel.
• Draw the next sample $x$ which maximizes an "acquisition function" or predictive posterior.
• Continue the process.
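The crash-course loop can be sketched with a small NumPy-only Gaussian process. Everything specific here is an assumption made for illustration: the one-dimensional toy objective, the RBF kernel with a fixed length scale (the slides estimate the kernel from data), and the upper-confidence-bound acquisition function, which is one common choice and not necessarily the one used in the talk.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale ** 2)


def gp_posterior(x_obs, y_obs, x_grid, noise=1e-4):
    """Posterior mean and covariance of a zero-mean GP on x_grid."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_go = rbf_kernel(x_grid, x_obs)
    mean = k_go @ np.linalg.solve(k_oo, y_obs)
    cov = rbf_kernel(x_grid, x_grid) - k_go @ np.linalg.solve(k_oo, k_go.T)
    return mean, cov


def toy_objective(x):
    """Hypothetical black-box function with its maximum at x = 0.7."""
    return -(x - 0.7) ** 2


rng = np.random.default_rng(0)
x_grid = np.linspace(0.0, 1.0, 101)
x_obs = rng.uniform(0.0, 1.0, size=3)     # start with a uniform sample
y_obs = toy_objective(x_obs)
initial_best = y_obs.max()

for _ in range(10):
    mean, cov = gp_posterior(x_obs, y_obs, x_grid)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    ucb = mean + 2.0 * std                # acquisition: explore where std is high
    x_next = x_grid[np.argmax(ucb)]       # next point maximizes the acquisition
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, toy_objective(x_next))

best_x = x_obs[np.argmax(y_obs)]
```

The acquisition function is where the explore-exploit trade-off lives: a high posterior mean rewards exploitation, a high posterior standard deviation rewards exploration.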
Thompson Sampling
  Maximize $\sigma(f_{VA}(x))$
  s.t. $\sigma(f_{EFS}(x)) > c_{EFS}$
       $\sigma(f_{JA}(x)) > c_{JA}$
● Consider a Gaussian process prior on each $f_k$, where $k$ is VA, EFS or JA.
● Observe the data $(x, f_k(x))$.
● Obtain the posterior of each $f_k$, which is another Gaussian process.
● Sample from the posterior distributions and generate samples of the overall objective function.
● Get the next distribution of hyperparameters by maximizing the sampled objectives (over a grid of QMC points).
● Continue this process until convergence.
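One round of the Thompson sampling steps can be sketched as follows. Several parts are assumptions made for illustration and are not taken from the slides: the Halton sequence as the QMC grid, the fixed RBF kernel, the made-up latent observations, and the way constraints are handled (infeasible grid points are masked out, and a draw with no feasible point falls back to the unconstrained VA maximizer). The three latent functions are sampled independently.

```python
import numpy as np


def van_der_corput(n, base):
    """First n points of the van der Corput sequence in the given base."""
    seq = np.zeros(n)
    for i in range(n):
        k, f, v = i + 1, 1.0, 0.0
        while k > 0:
            f /= base
            v += f * (k % base)
            k //= base
        seq[i] = v
    return seq


def halton_2d(n):
    """2-D Halton points (bases 2 and 3) used as a QMC grid in [0, 1]^2."""
    return np.column_stack([van_der_corput(n, 2), van_der_corput(n, 3)])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def rbf(a, b, length_scale=0.3):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)


def posterior_draws(x_obs, f_obs, grid, n_draws, rng, noise=1e-3):
    """Sample n_draws functions on the grid from a zero-mean GP posterior."""
    k_oo = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_go = rbf(grid, x_obs)
    mean = k_go @ np.linalg.solve(k_oo, f_obs)
    cov = rbf(grid, grid) - k_go @ np.linalg.solve(k_oo, k_go.T)
    w, v = np.linalg.eigh(0.5 * (cov + cov.T))
    root = v * np.sqrt(np.clip(w, 0.0, None))   # cov is approximately root @ root.T
    return mean + rng.standard_normal((n_draws, len(grid))) @ root.T


def thompson_distribution(x_obs, f_obs_by_metric, grid, c_efs, c_ja, n_draws, rng):
    """Estimate, per grid point, the probability of being the constrained maximizer."""
    draws = {k: sigmoid(posterior_draws(x_obs, f, grid, n_draws, rng))
             for k, f in f_obs_by_metric.items()}
    feasible = (draws["EFS"] > c_efs) & (draws["JA"] > c_ja)
    masked = np.where(feasible, draws["VA"], -np.inf)
    no_feasible = ~feasible.any(axis=1)
    masked[no_feasible] = draws["VA"][no_feasible]   # fallback (assumption)
    winners = masked.argmax(axis=1)                  # maximizer of each sampled objective
    counts = np.bincount(winners, minlength=len(grid))
    return counts / counts.sum()


rng = np.random.default_rng(0)
grid = halton_2d(64)               # candidate x = (x_EFS, x_JA)
x_obs = grid[::8]                  # eight hypothetical parameters already served
f_obs = {                          # made-up latent observations per metric
    "VA": -1.0 - x_obs[:, 0] - x_obs[:, 1],
    "EFS": 0.5 + x_obs[:, 0],
    "JA": -0.5 + x_obs[:, 1],
}
probs = thompson_distribution(x_obs, f_obs, grid, c_efs=0.6, c_ja=0.4,
                              n_draws=500, rng=rng)
```

The output is a probability vector over the grid rather than a single point, which matches the idea of serving a distribution of parameters in the next round.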
Infrastructure
Overall System Architecture
Offline System
The heart of the product
Tracking
• All member activities are tracked along with the parameter of interest.
• ETL into HDFS for easy consumption.
Utility Evaluation
• Using the tracking data, we generate $(x, f_k(x))$ for each function $k$.
• The data is kept in an appropriate, problem-agnostic schema.
Bayesian Optimization
• The data and the problem specifications are the inputs to this step.
• Using the data, we first estimate the posterior distribution of each latent function.
• We then sample from those distributions to get the distribution of the parameter $x$ that maximizes the objective.
The Parameter Store and Online Serving
• The Bayesian Optimization library generates
  • A set of potential candidates to try in the next round, $(x_1, x_2, \ldots, x_n)$
  • The probability that each point is the true maximizer, $(p_1, p_2, \ldots, p_n)$, such that $\sum_{i=1}^{n} p_i = 1$
• To serve members with the above distribution, each memberId is mapped to [0,1] using a hashing function $h$. For example, if
  $\sum_{i=1}^{k} p_i < h(\text{Kinjal}) \le \sum_{i=1}^{k+1} p_i$
  then my feed is served with parameter $x_{k+1}$.
• The parameter store (depending on use-case) can contain
  • <parameterValue, probability>, i.e. $(x_i, p_i)$, or
  • <memberId, parameterValue>
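The hash-based assignment can be sketched as below. The slides do not say which hash function is used; SHA-256 is chosen here only because it is in the Python standard library, and `member_hash` and `pick_parameter` are hypothetical names.

```python
import hashlib
from bisect import bisect_left
from itertools import accumulate


def member_hash(member_id: str) -> float:
    """Map a member id to a stable value in [0, 1] (SHA-256 is an assumption)."""
    digest = hashlib.sha256(member_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2 ** 64


def pick_parameter(member_id, params, probs):
    """Return x_{k+1} where sum_{i<=k} p_i < h(member) <= sum_{i<=k+1} p_i."""
    cumulative = list(accumulate(probs))
    h = member_hash(member_id)
    idx = bisect_left(cumulative, h)           # first index with cumulative >= h
    return params[min(idx, len(params) - 1)]   # guard against float rounding


# Hypothetical candidates and maximizer probabilities.
params = [(0.1, 0.5), (0.3, 0.2), (0.8, 0.9)]
probs = [0.2, 0.5, 0.3]
assigned = pick_parameter("Kinjal", params, probs)
```

Because the hash depends only on the member id, a member keeps the same parameter for as long as the distribution is unchanged, which gives member-level rather than session-level randomization.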
Online System
Serving hundreds of millions of users
Parameter Sampling
• For each member $m$ visiting LinkedIn,
  • depending on the parameter store, we either evaluate <m, parameterValue> from the stored <parameterValue, probability> pairs,
  • or we directly call the store to retrieve <m, parameterValue>.
Online Serving
• Depending on the parameter value that is retrieved (say $x$), the member's full feed is scored according to the ranking function and served:
  $S(m, u) := P_{VA}(m, u) + x_{EFS} \, P_{EFS}(m, u) + x_{JA} \, P_{JA}(m, u)$
Practical Design Considerations
● Consistency in user experience
  • Randomize at member level instead of session level.
● Offline flow frequency
  • Batch computation: we collect data for an hour and run the offline flow each hour to update the sampling distribution.
● Assume $(f_{VA}, f_{EFS}, f_{JA})$ to be independent
  • Works well in our setup. Joint modeling might reduce variance.
● Choice of business constraint thresholds
  • Chosen to allow for a small drop.
Results
Simulation Results
Online A/B Testing Results
Online Convergence Plots
Key Takeaways
● Removes the human in the loop: fully automatic process to find the optimal parameters.
● Drastically improves developer productivity.
● Can scale to multiple competing metrics.
● Very easy onboarding infra for multiple vertical teams. Currently used by Ads, Feed, Notifications, PYMK, etc.
● Future Direction
  • Add on other Explore-Exploit algorithms.
  • Move from Black-Box to Grey-Box optimizations.
  • Create a dependent structure on different utilities to better model the variance.
  • Automatically identify the primary metric by understanding the models better.
Thank you
Appendix – Library API
● Problem Specifications
● Objective and Constraints

LinkedIn talk at Netflix ML Platform meetup Sep 2019

  • 1. Online Parameter Selection for Web-based Ranking via Bayesian Optimization Sep 12 2019 Kinjal Basu Staff Software Engineer, Flagship AI
  • 2. Agenda: 1. Problem Setup (LinkedIn Feed); 2. Reformulation as a Black-Box Optimization; 3. Explore-Exploit Algorithm (Thompson Sampling); 4. Infrastructure; 5. Results
  • 3. LinkedIn Feed Mission: Enable Members to build an active professional community that advances their career.
  • 4. LinkedIn Feed Mission: Enable Members to build an active professional community that advances their career. Heterogeneous List: • Shares from a member’s connections • Recommendations such as jobs, articles, courses, etc. • Sponsored content or ads [Figure labels: Member, Shares, Jobs, Articles, Ads]
  • 5. Important Metrics Viral Actions (VA) Members liked, shared or commented on an item Job Applies (JA) Members applied for a job Engaged Feed Sessions (EFS) Sessions where a member engaged with anything on feed.
  • 6. Ranking Function: $S(m, u) := P_{VA}(m, u) + x_{EFS}\, P_{EFS}(m, u) + x_{JA}\, P_{JA}(m, u)$, where $m$ is the member and $u$ is the item. • The weight vector $x = (x_{EFS}, x_{JA})$ controls the balance between the three business metrics: EFS, VA and JA. • A sample business strategy is: Maximize $VA(x)$ subject to $EFS(x) > c_{EFS}$ and $JA(x) > c_{JA}$.
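The ranking function on this slide can be sketched in a few lines of Python. This is only an illustration: the `Predictions` container, the item names and the probability values are made up for the example, and the real system presumably scores items with trained models rather than hard-coded numbers.

```python
from dataclasses import dataclass


@dataclass
class Predictions:
    """Hypothetical per-item model outputs for one member."""
    p_va: float   # predicted probability of a viral action
    p_efs: float  # predicted probability of an engaged feed session
    p_ja: float   # predicted probability of a job apply


def score(pred: Predictions, x_efs: float, x_ja: float) -> float:
    """S(m, u) = P_VA + x_EFS * P_EFS + x_JA * P_JA."""
    return pred.p_va + x_efs * pred.p_efs + x_ja * pred.p_ja


def rank_feed(candidates: dict, x_efs: float, x_ja: float) -> list:
    """Return item ids sorted by descending score."""
    return sorted(candidates,
                  key=lambda u: score(candidates[u], x_efs, x_ja),
                  reverse=True)


# Illustrative values only.
feed = {
    "share": Predictions(p_va=0.30, p_efs=0.50, p_ja=0.00),
    "job": Predictions(p_va=0.05, p_efs=0.40, p_ja=0.20),
    "article": Predictions(p_va=0.20, p_efs=0.60, p_ja=0.01),
}

low_ja_weight = rank_feed(feed, x_efs=0.5, x_ja=0.0)
high_ja_weight = rank_feed(feed, x_efs=0.5, x_ja=2.0)
```

Raising `x_ja` moves the job item up the list, which is the trade-off the weight vector is meant to control.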
  • 7. Major Challenges ● The optimal value of x (the tuning parameters) changes over time ● Examples of changes ● New content types are added ● Score distributions change (feature drift, updated models, etc.) ● With every change, engineers would manually find the optimal x ● Run multiple A/B tests ● Not the best use of engineering time
  • 9. Modeling the Metrics ● $Y_{i,j}^{k}(x) \in \{0, 1\}$ denotes whether the $i$-th member, during the $j$-th session served with parameter $x$, performed action $k$. Here $k$ is VA, EFS or JA. ● We model this data as $Y_i^{k} \sim \text{Binomial}(n_i(x), \sigma(f_k(x)))$, where $n_i(x)$ is the total number of sessions of member $i$ that were served with $x$ and $f_k$ is a latent function for the particular metric. ● Assume a Gaussian process prior on each latent function $f_k$.
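To make the binomial model above concrete, here is a minimal sketch of its log-likelihood for a single member. It assumes that σ is the logistic sigmoid (the slide does not define it), and the function names and the numbers in the example are illustrative.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, assumed here to be the sigma on the slide."""
    return 1.0 / (1.0 + math.exp(-z))


def binomial_log_likelihood(successes: int, sessions: int, f_value: float) -> float:
    """log P(Y = successes) for Y ~ Binomial(sessions, sigmoid(f_value))."""
    p = sigmoid(f_value)
    log_choose = (math.lgamma(sessions + 1)
                  - math.lgamma(successes + 1)
                  - math.lgamma(sessions - successes + 1))
    return (log_choose
            + successes * math.log(p)
            + (sessions - successes) * math.log(1.0 - p))


# A member with 10 sessions served under some x, 3 of which had the action.
logit_03 = math.log(0.3 / 0.7)                       # latent value with sigmoid = 0.3
ll_at_mle = binomial_log_likelihood(3, 10, logit_03)
ll_at_zero = binomial_log_likelihood(3, 10, 0.0)     # latent value with sigmoid = 0.5
```

The likelihood is higher at the latent value whose sigmoid matches the observed rate, which is what a posterior over $f_k$ is pulled toward as data accumulates.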
  • 10. Reformulation: We approximate each of the metrics as $VA(x) = \sigma(f_{VA}(x))$, $EFS(x) = \sigma(f_{EFS}(x))$, $JA(x) = \sigma(f_{JA}(x))$. The original problem, Maximize $VA(x)$ s.t. $EFS(x) > c_{EFS}$, $JA(x) > c_{JA}$, becomes Maximize $\sigma(f_{VA}(x))$ s.t. $\sigma(f_{EFS}(x)) > c_{EFS}$, $\sigma(f_{JA}(x)) > c_{JA}$, which has the general form Maximize $f(x)$ over $x \in X$. The original optimization problem can be written through this parametrization. Benefit: the last problem can now be solved using techniques from the Bayesian optimization literature.
  • 12–17. Bayesian Optimization: A Quick Crash Course. Goal: Maximize $f(x)$ over $x \in X$. • Use an explore-exploit scheme to solve the problem. • Assume a Gaussian process prior on $f(x)$. • Start with a uniform sample of points to obtain pairs $(x, f(x))$. • Estimate the mean function and covariance kernel. • Draw the next sample $x$ as the maximizer of an “acquisition function” or of the predictive posterior. • Continue the process.
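The crash-course loop above can be sketched with NumPy alone. This is a toy under stated assumptions, not the library described in the talk: it uses a one-dimensional search space, a zero-mean Gaussian process with a fixed squared-exponential kernel (no kernel estimation), an upper-confidence-bound acquisition function chosen only for illustration, and a made-up objective.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential covariance between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_grid, noise=1e-4):
    """Posterior mean and covariance of a zero-mean GP on x_grid."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_og = rbf_kernel(x_obs, x_grid)
    solved = np.linalg.solve(k_oo, k_og)      # K_oo^{-1} K_og
    mean = solved.T @ y_obs
    cov = rbf_kernel(x_grid, x_grid) - k_og.T @ solved
    return mean, cov


def bayes_opt(f, n_init=3, n_iter=15, beta=2.0, seed=0):
    """Explore-exploit loop: fit the GP, maximize the acquisition, observe."""
    rng = np.random.default_rng(seed)
    x_grid = np.linspace(0.0, 1.0, 101)
    x_obs = rng.uniform(0.0, 1.0, size=n_init)   # initial uniform sample
    y_obs = f(x_obs)
    for _ in range(n_iter):
        mean, cov = gp_posterior(x_obs, y_obs, x_grid)
        std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
        x_next = x_grid[np.argmax(mean + beta * std)]  # UCB acquisition
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, f(x_next))
    return x_obs[np.argmax(y_obs)]               # best point observed so far


def toy_objective(x):
    """Hypothetical black box with its maximum at x = 0.3."""
    return -(x - 0.3) ** 2


x_best = bayes_opt(toy_objective)
```

The `beta` parameter sets how strongly the loop favors uncertain regions over regions with a high posterior mean; larger values explore more.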
  • 18. Thompson Sampling. Problem: Maximize $\sigma(f_{VA}(x))$ s.t. $\sigma(f_{EFS}(x)) > c_{EFS}$, $\sigma(f_{JA}(x)) > c_{JA}$. ● Consider a Gaussian process prior on each $f_k$, where $k$ is VA, EFS or JA. ● Observe the data $(x, f_k(x))$. ● Obtain the posterior of each $f_k$, which is another Gaussian process. ● Sample from the posterior distributions and generate samples of the overall objective function. ● We get the next distribution of hyperparameters by maximizing the sampled objectives over a grid of quasi-Monte Carlo (QMC) points. ● Continue this process until convergence.
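A minimal NumPy sketch of one Thompson sampling round for the constrained problem above follows. Several pieces are assumptions made for illustration: the latent values $f_k(x)$ are treated as directly observed, the three Gaussian processes are independent with a fixed squared-exponential kernel, the QMC grid is a small Halton set, and the synthetic latent functions and thresholds are invented.

```python
import numpy as np


def van_der_corput(n, base):
    """First n points of the van der Corput sequence in the given base."""
    seq = np.zeros(n)
    for i in range(n):
        frac, k = 1.0, i + 1
        while k > 0:
            frac /= base
            seq[i] += frac * (k % base)
            k //= base
    return seq


def rbf(a, b, length_scale=0.3):
    """Squared-exponential covariance between point sets of shape (n, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)


def posterior(x_obs, y_obs, grid, noise=1e-2):
    """Posterior mean and covariance of a zero-mean GP on the grid."""
    k_oo = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_og = rbf(x_obs, grid)
    solved = np.linalg.solve(k_oo, k_og)
    return solved.T @ y_obs, rbf(grid, grid) - k_og.T @ solved


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def thompson_distribution(x_obs, f_obs, grid, c_efs, c_ja, n_draws=200, seed=0):
    """Probability of each grid point being the sampled constrained maximizer."""
    rng = np.random.default_rng(seed)
    jitter = 1e-6 * np.eye(len(grid))
    draws = {}
    for k in ("VA", "EFS", "JA"):                # independent GP per metric
        mean, cov = posterior(x_obs, f_obs[k], grid)
        latent = rng.multivariate_normal(mean, cov + jitter, size=n_draws,
                                         check_valid="ignore")
        draws[k] = sigmoid(latent)
    feasible = (draws["EFS"] > c_efs) & (draws["JA"] > c_ja)
    objective = np.where(feasible, draws["VA"], -np.inf)
    usable = feasible.any(axis=1)                # skip draws with no feasible point
    if not usable.any():
        return np.full(len(grid), 1.0 / len(grid))
    winners = objective[usable].argmax(axis=1)
    counts = np.bincount(winners, minlength=len(grid))
    return counts / counts.sum()


# Halton grid over x = (x_EFS, x_JA) in the unit square, plus synthetic data.
grid = np.column_stack([van_der_corput(64, 2), van_der_corput(64, 3)])
x_obs = np.random.default_rng(1).uniform(size=(12, 2))
f_obs = {
    "VA": 1.0 - 4.0 * ((x_obs - 0.5) ** 2).sum(axis=1),
    "EFS": x_obs[:, 0],
    "JA": x_obs[:, 1],
}
probs = thompson_distribution(x_obs, f_obs, grid, c_efs=0.5, c_ja=0.5)
```

Each posterior draw votes for one grid point, so `probs` plays the role of the "distribution of hyperparameters" that the next round of traffic would be served with.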
  • 21. Offline System: The heart of the product. Tracking: • All member activities are tracked together with the parameter of interest. • ETL into HDFS for easy consumption. Utility Evaluation: • Using the tracking data, we generate $(x, f_k(x))$ for each function $k$. • The data is kept in a problem-agnostic schema. Bayesian Optimization: • The data and the problem specifications are the inputs to this step. • Using the data, we first estimate the posterior distribution of each latent function. • We then sample from those distributions to get the distribution of the parameter $x$ that maximizes the objective.
  • 22. The Parameter Store and Online Serving • The Bayesian Optimization library generates: • A set of potential candidates to try in the next round, $(x_1, x_2, \ldots, x_n)$. • The probability that each point is the true maximizer, $(p_1, p_2, \ldots, p_n)$, such that $\sum_{i=1}^{n} p_i = 1$. • To serve members with the above distribution, each memberId is mapped to $[0, 1]$ using a hashing function $h$. For example, if $\sum_{i=1}^{k} p_i < h(\text{Kinjal}) \le \sum_{i=1}^{k+1} p_i$, then my feed is served with parameter $x_{k+1}$. • The parameter store (depending on the use case) can contain • <parameterValue, probability>, i.e. $(x_i, p_i)$, or • <memberId, parameterValue>.
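The hash-based assignment described on this slide can be sketched with the Python standard library. The choice of SHA-256, the function names and the candidate values are assumptions for illustration; the talk does not say which hash function is used.

```python
import hashlib
from bisect import bisect_left
from itertools import accumulate


def member_hash(member_id: str) -> float:
    """Map a member id to a reproducible value in [0, 1)."""
    digest = hashlib.sha256(member_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2 ** 64


def assign_parameter(member_id, candidates, probabilities):
    """Pick the candidate whose cumulative-probability bucket contains h(member)."""
    cumulative = list(accumulate(probabilities))
    h = member_hash(member_id)
    # First index whose cumulative probability is >= h, i.e. the bucket with
    # cumulative[idx - 1] < h <= cumulative[idx].
    idx = bisect_left(cumulative, h)
    # Guard against floating-point round-off in the final cumulative sum.
    return candidates[min(idx, len(candidates) - 1)]


# Hypothetical candidates (x_EFS, x_JA) and their probabilities.
candidates = [(0.2, 0.1), (0.5, 0.3), (0.8, 0.6)]
probabilities = [0.2, 0.5, 0.3]
served = assign_parameter("Kinjal", candidates, probabilities)
```

Because the hash depends only on the member id, a member stays in the same position in $[0, 1]$ across sessions, and the served parameter changes only when the bucket boundaries move.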
  • 23. Online System: Serving hundreds of millions of users. Parameter Sampling: • For each member $m$ visiting LinkedIn, depending on the contents of the parameter store, we either compute <m, parameterValue> from the stored <parameterValue, probability> pairs, or directly call the store to retrieve <m, parameterValue>. Online Serving: • Depending on the parameter value that is retrieved (say $x$), the member’s full feed is scored according to the ranking function $S(m, u) := P_{VA}(m, u) + x_{EFS}\, P_{EFS}(m, u) + x_{JA}\, P_{JA}(m, u)$ and served.
  • 24. Practical Design Considerations ● Consistency in user experience: randomize at the member level instead of the session level. ● Offline flow frequency: batch computation, where we collect data for an hour and run the offline flow each hour to update the sampling distribution. ● Assume $(f_{VA}, f_{EFS}, f_{JA})$ to be independent: works well in our setup; joint modeling might reduce variance. ● Choice of business constraint thresholds: chosen to allow for a small drop.
  • 29. Key Takeaways ● Removes the human in the loop: Fully automatic process to find the optimal parameters. ● Drastically improves developer productivity. ● Can scale to multiple competing metrics. ● Very easy onboarding infra for multiple vertical teams. Currently used by Ads, Feed, Notifications, PYMK, etc. ● Future Direction • Add on other Explore-Exploit algorithms. • Move from Black-Box to Grey-Box optimizations • Create a dependent structure on different utilities to better model the variance. • Automatically identify the primary metric by understanding the models better.
  • 31. Appendix – Library API ● Problem Specifications
  • 32. Appendix – Library API ● Objective and Constraints