Bandit Algorithms for Website Optimization
by John Myles White
summary by Kyle (Kwanghee Choi)
Reference
1. Two Characters: Exploration and Exploitation
- Need to balance exploration and exploitation
- or, experimentation and profit-maximization
- or, learning new ideas and taking advantage of the best of old ideas,
- or, gathering data and acting on that data
2. Why Use Multiarmed Bandit Algorithms?
- Examples of measurable achievements
- Traffic, conversions, sales, CTRs
- Definitions
- Reward: Measure of success
- Arms: List of potential changes
- Explaining standard A/B testing as an exploration-exploitation tradeoff
- Short period of pure exploration (assigning equal numbers of users to A and B)
- Long period of pure exploitation (sending all of the users to the successful option)
- Why might A/B testing be a bad strategy?
- Abrupt transition from exploration to exploitation
- Wastes resources exploring inferior options
3. The ϵ-Greedy Algorithm
- Tries to be fair to the two opposite goals of exploration & exploitation
- ϵ=0: Pure exploitation
- ϵ=1: Pure exploration
- Problem of fixed ϵ
- May need more exploration at the start, may need more exploitation after some time.
- Explores arms completely at random without any concern about their merits.
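The ϵ-Greedy rule described above can be sketched in a few lines. This is a minimal illustration, not the book's code: the class name, method names, and the running-average update are assumptions made for the example.

```python
import random


class EpsilonGreedy:
    """Minimal epsilon-greedy bandit (illustrative names, not from the slides)."""

    def __init__(self, n_arms, epsilon):
        self.epsilon = epsilon          # probability of exploring
        self.counts = [0] * n_arms      # number of pulls per arm
        self.values = [0.0] * n_arms    # running average reward per arm

    def select_arm(self, rng=random):
        # Explore with probability epsilon: any arm, uniformly at random,
        # regardless of how good or bad it has looked so far.
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.values))
        # Otherwise exploit the arm with the highest estimated value
        # (ties go to the lowest index).
        return max(range(len(self.values)), key=lambda i: self.values[i])

    def update(self, arm, reward):
        # Incremental form of the arithmetic mean of observed rewards.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

With `epsilon=0.0` the object always exploits, and with `epsilon=1.0` it always explores, which matches the two extremes listed above.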
4. Debugging Bandit Algorithms
- Bandit algorithms are not black-box functions.
- A bandit algorithm has to actively select which data it should acquire (active learning) and analyze that data in real time (online learning).
- Bandit data and bandit analysis are inseparable: a “feedback cycle.”
- Use Monte Carlo simulation to provide simulated data in real-time.
- Analyzing results
- Tracking the probability of choosing the best arm, since both the bandit algorithm and the rewards are probabilistic.
- Tracking the average reward at each point in time.
- Tracking the cumulative reward at each point in time, to see the bigger picture of lifetime performance.
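A Monte Carlo test harness for the three metrics above might look like the following sketch. It assumes Bernoulli arms (reward 1 with probability p, otherwise 0) and an inlined ϵ-Greedy policy; `BernoulliArm`, `run_simulations`, and all parameter names are hypothetical choices for this example.

```python
import random


class BernoulliArm:
    """Simulated arm that pays 1.0 with probability p, else 0.0 (an assumption)."""

    def __init__(self, p):
        self.p = p

    def draw(self, rng):
        return 1.0 if rng.random() < self.p else 0.0


def run_simulations(arm_probs, epsilon, n_sims, horizon, seed=0):
    """Run n_sims independent epsilon-greedy runs of length horizon.

    Returns three per-time-step lists averaged over the runs:
    probability of picking the best arm, average reward, cumulative reward.
    """
    rng = random.Random(seed)
    arms = [BernoulliArm(p) for p in arm_probs]
    best = max(range(len(arms)), key=lambda i: arm_probs[i])
    best_picks = [0] * horizon
    reward_sum = [0.0] * horizon

    for _ in range(n_sims):
        counts = [0] * len(arms)
        values = [0.0] * len(arms)
        for t in range(horizon):
            # Epsilon-greedy choice: explore at random or exploit the best estimate.
            if rng.random() < epsilon:
                arm = rng.randrange(len(arms))
            else:
                arm = max(range(len(arms)), key=lambda i: values[i])
            reward = arms[arm].draw(rng)
            # Feedback cycle: the chosen arm determines which data we learn from.
            counts[arm] += 1
            values[arm] += (reward - values[arm]) / counts[arm]
            best_picks[t] += 1 if arm == best else 0
            reward_sum[t] += reward

    prob_best = [c / n_sims for c in best_picks]
    avg_reward = [s / n_sims for s in reward_sum]
    cumulative = []
    total = 0.0
    for r in avg_reward:
        total += r
        cumulative.append(total)
    return prob_best, avg_reward, cumulative
```

One detail worth noticing in this sketch: with `epsilon=0` and all estimates starting at zero, the policy never leaves the first arm if that arm never pays, which is an easy way to see why some exploration is needed.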
5. The Softmax Algorithm
- Problem of fixed ϵ revisited
- If the difference in rewards between two arms is small, more exploration is needed, and vice versa.
- ϵ-Greedy never gets past the intrinsic errors caused by its purely random exploration strategy.
- Set the probability of choosing arm A, with accumulated reward rA, as …
- Temperature parameter τ shifts the behavior along a continuum between pure exploration (τ = ∞) and pure exploitation (τ = 0).
- Negative rewards are okay thanks to exponential rescaling.
- Annealing: encourage the algorithm to explore less over time by slowly decreasing τ.
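The formula on this slide did not survive extraction. The sketch below uses the standard softmax (Boltzmann) form, in which arm A is chosen with probability exp(rA/τ) divided by the sum of exp(ri/τ) over all arms. It assumes each r is the running average reward of its arm, and the annealing schedule in `annealed_tau` is a hypothetical choice, not one taken from the slides.

```python
import math
import random


def softmax_probs(values, tau):
    """Selection probabilities proportional to exp(value / tau); tau must be > 0."""
    # Subtracting the maximum avoids overflow and leaves the ratios unchanged.
    m = max(values)
    weights = [math.exp((v - m) / tau) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def select_arm(values, tau, rng=random):
    """Sample one arm index according to the softmax probabilities."""
    probs = softmax_probs(values, tau)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


def annealed_tau(t, tau0=1.0):
    """Hypothetical annealing schedule: tau0 at t = 0, slowly shrinking with t."""
    return tau0 / (1.0 + math.log(1.0 + t))
```

A large τ flattens the probabilities toward uniform (exploration), a small τ concentrates them on the best arm (exploitation), and negative values are handled because the exponential is always positive.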
6. UCB: The Upper Confidence Bound Algorithm
- Problems of softmax algorithms
- They only pay attention to how much reward they have gotten from the arms.
- Gullible: easily misled by a few negative experiences, because the algorithm does not keep track of how much it knows about each arm (how confident it is).
- UCB algorithms avoid being gullible by keeping track of their confidence in the estimated values of all the arms.
- The UCB1 variant summarized here uses no randomness and has no free parameters.
- UCB1 (one of the UCB variants) chooses arm i, with accumulated reward ri, bonus bi, and number of pulls ni, as …
- Cold start is prevented by bi = ∞ for any arm that has not been tried yet.
- UCB algorithms are explicitly curious.
- Curiosity is implemented through the bonus bi, which gets bigger when ni is small.
- So the algorithm will occasionally revisit even the worst of the arms.
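The selection rule on this slide was lost in extraction. A common statement of UCB1 picks the arm that maximizes the average reward plus the bonus sqrt(2 ln t / ni), where t is the total number of pulls so far. The sketch below assumes that form and assumes the estimate is a running average; the function names are illustrative.

```python
import math


def ucb1_select(counts, values):
    """Pick an arm by UCB1, given per-arm pull counts and average rewards."""
    # An untried arm has an infinite bonus, so it is always played first.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)

    def score(arm):
        # The bonus shrinks as an arm is pulled more and grows slowly with total pulls.
        bonus = math.sqrt(2.0 * math.log(total) / counts[arm])
        return values[arm] + bonus

    return max(range(len(counts)), key=score)


def ucb1_update(counts, values, arm, reward):
    """Record one observed reward for the chosen arm (in place)."""
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]
```

For example, `ucb1_select([1000, 1], [0.6, 0.5])` returns arm 1: its average is lower, but it has been tried only once, so its bonus dominates.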
- Comparing bandit algorithms side-by-side
- UCB1 is much noisier than ϵ-Greedy or Softmax.
- ϵ-Greedy doesn’t converge as quickly as Softmax.
- UCB1 takes a while to catch up with Softmax.
- UCB1 finds the best arm quickly, but the backpedaling it does causes it to underperform Softmax.
7. Bandits in the Real World: Complexity and Complications
- A/A Testing
- Testing the bandit algorithm itself
- Estimation of the actual variability in real-time data.
- Running concurrent experiments
- May have strange interactions between experiments (e.g., different logos and fonts)
- Continuous experimentation vs. periodic testing
- Bandit algorithms look much better than A/B testing when you are willing to let them run for a very long time.
- Metrics of success
- Optimizing short-term CTR may destroy long-term retention.
- Rescaling metrics into the 0-1 range helps the algorithms work well.
- Moving worlds
- Arms with changing rewards raise serious problems.
- Average (no parameter to tune) vs. weighted average (flexibility in moving worlds)
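The difference between the two update rules in the last bullet can be shown with a small sketch. It assumes the weighted average is an exponentially weighted one with a step size `alpha`; that parameter name and the helper names are assumptions made for this example.

```python
def update_average(value, count, reward):
    """Plain running mean: every past reward counts equally, nothing to tune."""
    count += 1
    value += (reward - value) / count
    return value, count


def update_weighted(value, reward, alpha):
    """Exponentially weighted average: recent rewards count more (0 < alpha <= 1)."""
    return value + alpha * (reward - value)


def track(rewards, alpha):
    """Feed the same reward stream to both estimators and return the final values."""
    mean, count, weighted = 0.0, 0, 0.0
    for r in rewards:
        mean, count = update_average(mean, count, r)
        weighted = update_weighted(weighted, r, alpha)
    return mean, weighted


# An arm whose payoff changes halfway through: 100 rewards of 1.0, then 100 of 0.0.
stream = [1.0] * 100 + [0.0] * 100
final_mean, final_weighted = track(stream, alpha=0.1)
```

On this stream the plain mean ends near 0.5, still reflecting the old regime, while the weighted estimate ends close to 0, the arm's current payoff. The price is one more parameter to choose.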
8. Conclusion
- There is no universal bandit algorithm that will always do the best job.
- Domain expertise and good judgement will always be necessary.
- There is always a trade-off between exploration & exploitation.
- Initialization of an algorithm matters a lot. Biases may either help or hurt.
- Make sure you explore less over time.
