Learn Like a Human – Taking Machine
Learning from Batch to Real-Time
Elad Rosenheim
Who am I
 Architect at Dynamic Yield,
“Predictors” Team Lead
 Previously:
 AlphaCSP
 SAP
 Performance & Scale, DevOps
 Measure All the Things!
 East-Asia & Japan
Who’s Dynamic Yield?
We’ve been optimizing & personalizing websites since 2011
 Start-up in Tel-Aviv
headed by Liad Agmon
 I joined as the 5th employee; we’re 50 now and growing fast
On the Agenda
Our clients’ problem
Old School Solutions
Meet the ML Bandits
Our clients’ problem
Publishers, retailers, SaaS
all share a common problem
They know their domain
but not how to optimize for each user
Screen real-estate is limited
yet everyone sees the same thing
What top videos to show on
NBC News’ site?
What user segments should see
this element at this location?
What’s the best layout for this
element?
Both the layout of this page and
each element in it deserve testing
What’s the best layout?
What types of products to show
whom?
What articles to show on
ynet’s homepage?
What titles and images?
In what order?
What is the best default sort order for products on Adika?
Does it differ significantly between user segments?
The Beginning
 First, there was the educated guess
 Then, there was the A/B test
 “Data Beats Opinion”
 Freedom to experiment (with nice tools)
 Hopefully: less fear of change, less politics
 How does it work?
 Split traffic between baseline and alternative variations
 In theory: sit & wait for significant results
 In practice: peek at the numbers till the nice “95% confidence”
A/B Tests: Already Old School?
While you wait, you're bleeding clicks
clicks == money
What about the really dynamic stuff?
Campaigns, Current Headlines, Products on Sale
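A sketch of the "sit & wait for significant results" step, assuming a two-sided two-proportion z-test at the 95% level. The slides do not say which test is used, and the function names here are illustrative only.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both variations convert equally."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)


def is_significant(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """True when the difference clears the chosen confidence level."""
    _, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
    return p_value < alpha


if __name__ == "__main__":
    # Same 1.7% vs 2.4% conversion rates, two different traffic volumes.
    print(is_significant(17, 1_000, 24, 1_000))
    print(is_significant(170, 10_000, 240, 10_000))
```

With 1.7% vs 2.4% conversion, 1,000 impressions per variation is not enough to call a winner while 10,000 is; that wait is the "bleeding clicks" cost noted above. Peeking repeatedly and stopping at the first p < 0.05 pushes the real false-positive rate above 5%.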
Enter the Multi-Arm Bandits
 A Single-Arm Bandit
 Suppose I have multiple arms in front of me,
each with its unknown mean reward…
 How do I optimize income from multiple machines?
 Caution or Haste?
 Explore vs. Exploit
 In our context:
How do I optimize multiple variations?
Bandits - A Classic Problem
 (Very) Simple Solutions
 ε-greedy, ε-decreasing
 First 100% random explore, then ~90% exploit?
 Magic numbers, built-in revenue loss
 Bayesian-based approaches
 Smoother curve from explore to exploit
 “Winner” is now a less relevant term
Bandits work well when…
 We want to find the variation “best on average”
…but we’re not improving the conversion rate of any single variation
2.4% 1.7% 0.4%
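To make the two bandit families concrete, here is a small simulation sketch. It assumes Bernoulli rewards (click or no click), a fixed ε = 0.1 for ε-greedy, and Thompson sampling with a Beta(1, 1) prior as one example of a Bayesian approach; the slides do not name a specific Bayesian method, and all function names are illustrative.

```python
import random


def epsilon_greedy(successes, pulls, epsilon, rng):
    """Explore uniformly with probability epsilon, else exploit the best observed rate."""
    if rng.random() < epsilon:
        return rng.randrange(len(pulls))
    rates = [s / n if n else 0.0 for s, n in zip(successes, pulls)]
    return max(range(len(rates)), key=rates.__getitem__)


def thompson(successes, pulls, rng):
    """Draw a plausible rate per arm from its Beta posterior and play the highest draw."""
    draws = [rng.betavariate(1 + s, 1 + n - s) for s, n in zip(successes, pulls)]
    return max(range(len(draws)), key=draws.__getitem__)


def simulate(true_rates, choose, rounds, seed=0):
    """Run one policy against fixed true conversion rates; return (successes, pulls)."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        arm = choose(successes, pulls, rng)
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
    return successes, pulls


if __name__ == "__main__":
    rates = [0.024, 0.017, 0.004]  # the three conversion rates shown on the slide
    policies = {
        "epsilon-greedy": lambda s, n, rng: epsilon_greedy(s, n, 0.1, rng),
        "thompson": thompson,
    }
    for name, policy in policies.items():
        successes, pulls = simulate(rates, policy, rounds=50_000)
        print(name, "pulls:", pulls, "conversions:", sum(successes))
```

ε-greedy keeps spending a fixed share of traffic on random exploration (the "built-in revenue loss"), while the Bayesian policy shifts traffic gradually as the posteriors narrow. Either way the result is at best the single variation that is best on average.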
Enter Personalization
 Each of us is a beautiful and unique feature vector!
 By showing the right variation to the right people,
we can improve conversions per variation
and beat the best variation
 ML Challenge Accepted
The Usual Suspects
Collaborative Filtering?
 Very big, very sparse matrix
 Cold Start
 Batch
 Not suitable in this case
Classifiers?
Logistic Regression, Random Forest et al.
 Periodically learn over all converters so far
 More data == more time, bigger model
 Not the classic question
What We Need
 Like a bandit, we need to learn as we go (not in batch),
but this time with “context” - the user’s data
 Incremental Learning over the stream of impressions & rewards
(“Partial Fit”)
 We’re looking to…
 Start learning from the first impression
 Handle the explore-exploit curve
 Run fast (enough)
 In the worst case: converge on the best variation, like a bandit
Meet the Contextual Bandits
 They “eat” the data stream
 They demand fast access to user data
 Historical or immediate
 Their model is always ready for action
 In the Papers
 Linear Bayes, LinUCB
 What we do: Per-Variation Logistic Regression
 A variant supporting updates in “mini-batches”
 Exploration-on-top
 Worst case: “Garbage In → Multi-Arm Bandit Out”
 Light on memory, compact output
How We Do It: Online & Offline
 Online should be fast & scale
 Offline: a testbed for iteratively testing new ideas
 New algorithms
 Tweaked parameters
 Feature transformations
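A minimal sketch of the "per-variation logistic regression" idea from the contextual bandits slide: one small model per variation, updated in mini-batches, with exploration layered on top. This is not Dynamic Yield's implementation. The class names, the plain gradient-descent update, and the ε-greedy exploration are all assumptions made for illustration.

```python
import math
import random


class OnlineLogisticRegression:
    """Logistic regression updated by mini-batch gradient steps (a "partial fit")."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * (n_features + 1)  # last entry is the bias
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.weights[-1] + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, batch):
        """batch: list of (features, reward) pairs, reward in {0, 1}."""
        if not batch:
            return
        grad = [0.0] * len(self.weights)
        for x, y in batch:
            err = self.predict_proba(x) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
            grad[-1] += err
        step = self.learning_rate / len(batch)
        self.weights = [w - step * g for w, g in zip(self.weights, grad)]


class PerVariationBandit:
    """One model per variation; epsilon-greedy exploration on top (an assumption)."""

    def __init__(self, variations, n_features, epsilon=0.1, seed=0):
        self.models = {v: OnlineLogisticRegression(n_features) for v in variations}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self, x):
        names = list(self.models)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(names)  # explore
        return max(names, key=lambda v: self.models[v].predict_proba(x))  # exploit

    def learn(self, events):
        """events: list of (variation, features, reward) from the impression stream."""
        by_variation = {}
        for variation, x, y in events:
            by_variation.setdefault(variation, []).append((x, y))
        for variation, batch in by_variation.items():
            self.models[variation].partial_fit(batch)
```

If the user features carry no signal, each model is left with little more than its bias term, which tracks that variation's average conversion rate. That is one way to read the "worst case: converge on the best variation, like a bandit" requirement.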
The Online Flow
[Diagram labels, in slide order]
DY Web Servers: a. get our script; b. log impressions, conversions
Queue per test
Learn Workers
User DB
Persist Model
Load to Predict Server
Predictions per variation (A, B, C)
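One possible reading of the online flow, sketched in Python: events land on a per-test queue, a learn worker consumes them in mini-batches and persists a compact model, and a predict server loads that model to answer requests. Everything here is a stand-in: `queue.Queue` replaces the real message queue, a plain dict replaces the model store, and per-variation counts replace the real learner.

```python
import json
import queue


class LearnWorker:
    """Drains one test's queue in mini-batches and persists a compact model."""

    def __init__(self, test_id, events, store, batch_size=100):
        self.test_id = test_id
        self.events = events          # queue.Queue of (variation, converted) tuples
        self.store = store            # dict standing in for a key-value store
        self.batch_size = batch_size
        self.stats = {}               # variation -> [impressions, conversions]

    def run_once(self):
        """Process up to batch_size queued events; return how many were consumed."""
        consumed = 0
        while consumed < self.batch_size:
            try:
                variation, converted = self.events.get_nowait()
            except queue.Empty:
                break
            counts = self.stats.setdefault(variation, [0, 0])
            counts[0] += 1
            counts[1] += int(converted)
            consumed += 1
        if consumed:
            self.store[self.test_id] = json.dumps(self.stats)  # persist the model
        return consumed


class PredictServer:
    """Loads persisted models and serves per-variation predictions."""

    def __init__(self, store):
        self.store = store
        self.models = {}

    def load(self, test_id):
        self.models[test_id] = json.loads(self.store[test_id])

    def predict(self, test_id):
        """Return the estimated conversion rate for each variation of a test."""
        return {v: (c / n if n else 0.0) for v, (n, c) in self.models[test_id].items()}


if __name__ == "__main__":
    events, store = queue.Queue(), {}
    for event in [("A", True), ("A", False), ("B", False), ("C", True)]:
        events.put(event)
    LearnWorker("homepage-test", events, store).run_once()
    server = PredictServer(store)
    server.load("homepage-test")
    print(server.predict("homepage-test"))
```

Keeping the persisted model as a small serialized blob is what lets learning and prediction run as separate services.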
The Offline Evaluator
 Test, Improve, Iterate
 Using real-world data
 Using generated data
 From easy to hard
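A sketch of what "using generated data, from easy to hard" could look like. It assumes a planted rule (variation "A" suits mobile users, "B" suits everyone else), a `signal` knob that controls difficulty, and replay-style scoring over uniformly random logged choices. The rule, the names, and the scoring method are hypothetical, not taken from the talk.

```python
import random


def generate_events(n, signal, seed=0):
    """Yield (features, variation, reward) with a planted rule.

    signal in [0, 1] sets the difficulty: 1.0 means the matching variation
    always converts, 0.0 means every impression converts at the base rate.
    """
    rng = random.Random(seed)
    base_rate = 0.05
    for _ in range(n):
        mobile = rng.random() < 0.5
        variation = rng.choice(["A", "B"])  # logged uniformly at random
        matches = (variation == "A") == mobile
        p = signal + (1 - signal) * base_rate if matches else base_rate
        reward = 1 if rng.random() < p else 0
        yield [1.0 if mobile else 0.0], variation, reward


def evaluate(policy, events):
    """Replay scoring: count only events where the policy agrees with the logged choice."""
    shown = rewards = 0
    for features, variation, reward in events:
        if policy(features) == variation:
            shown += 1
            rewards += reward
    return rewards / shown if shown else 0.0


if __name__ == "__main__":
    def oracle(features):
        return "A" if features[0] else "B"

    def always_a(features):
        return "A"

    for signal in (1.0, 0.3, 0.0):  # from easy to hard
        data = list(generate_events(20_000, signal))
        print(signal, round(evaluate(oracle, data), 3), round(evaluate(always_a, data), 3))
```

Because the planted rule is known, a candidate algorithm can be compared against the oracle policy at each difficulty level before it is tried on real-world data.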
Going Global
 Learn at the central site, predict fast in each geo. How?
 Push models via local Redis slaves
 Compressed SSH tunnel
 User data - daily aggregation
 Storage into LMDB (simple, fast memory-mapped K/V DB)
 Sync via S3 (LZ4 compressed), read from SSD
 Learn & Predict services
 Python as ML lingua franca: NumPy, SciPy, scikit-learn
Elad & Idan Say Goodbye
 Better data beats better algorithms
 Reduce aggressively
 Keep It Simple, Smart!
 Elad Rosenheim
 Idan Michaeli
 Read our blog
 Hiring? But of course!
What’s with the Groundhog?

Taking Machine Learning from Batch to Real-Time (big data eXposed 2015)


Editor's Notes

  1. How have people tried to tackle these problems so far?
  2. Let's take our solution one step further
  3. Let's take our solution one step further
  4. Let's take our solution one step further
  5. Let's take our solution one step further
  6. In short, the world is not as wonderful as it is sold to us
  7. Let's take our solution one step further
  8. This problem has no optimal solution, but there are certainly different approaches at different levels of complexity
  9. Now, bandits are actually not that bad...
  10. And this limitation brings us to the next stage of the search: personalization
  11. Good, so we've accepted the challenge. Which algorithms come to mind?
  12. So in fact we're looking for something different: a new family of algorithms
  13. So let's meet our new heroes...
  14. An algorithm on its own is a nice thing, but how do you build everything around it for production?
  15. Let's understand the flow better