From practice to theory
in learning from massive data
Charles Elkan
Amazon Fellow
August 14, 2016
Important
Information here is already public.
Opinions are mine, not Amazon’s.
Outline
Only 30 minutes!
1. Detecting anomalies in streaming data
2. Making Spark usable for real-time predictions
3. Amazon’s most important algorithm for recommendations
4. Uplift: We want causation, not merely correlation
From practice to theory
From theory to practice
Now for everyone!
From practice to practice
Academic versus applied
In theory, researchers favor simplicity. In practice, they don’t.
In industry, simplicity genuinely wins.
Example: Desiderata for recommender systems:
1. Respect the privacy of users; don’t be creepy.
2. Make recommendations understandable.
3. Make them responsive to the user’s most recent interests.
4. Generate them with millisecond latency.
Amazon’s most important recommender system
1. Respect the privacy of users; don’t be creepy.
2. Make recommendations understandable.
3. And responsive to the user’s most recent interests.
4. Generate them with millisecond latency.
What data scientists do every day
Let x be a user and let R = 0 or 1 be a response. For example, R=1
means the user buys shoes in the next month.
Routinely, we train models to predict the probability p(R=1|x).
We send messages and coupons to users with high p(R=1|x).
Is p(R=1|x) actually useful?
In principle, no. "Our goal is not to predict the future; it is to
change the future."
• Merely predicting user behavior is of limited interest.
We want to select treatments that influence users.
• T = t means we choose treatment t.
• For each available t, compute p(R=1|x,T=t).
• Choose the t that gives highest probability.
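The selection rule in the bullets above can be sketched in a few lines. This is only an illustration: the treatment names and probabilities are hypothetical, and in practice each p(R=1|x,T=t) would come from a trained model.

```python
def best_treatment(probs_by_treatment):
    """Return the treatment t with the highest p(R=1 | x, T=t)."""
    return max(probs_by_treatment, key=probs_by_treatment.get)


# Hypothetical predicted probabilities for a single user x (not real data).
user_probs = {"no_email": 0.10, "mens_email": 0.17, "womens_email": 0.15}
chosen = best_treatment(user_probs)
print(chosen)
```

The rule compares treatments for the same user, so a user with a high probability under every treatment gains little from being targeted.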
The risk of ignoring uplift
Users are ranked by p(R=1|x), shown by the brown line.
The blue dashed line shows p(R=1|x,T=t).
The treatment t has a negative effect for users in the top 5%:
p(R=1|x,T=t) < p(R=1|x).
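The inequality above says that uplift, taken here as the difference p(R=1|x,T=t) - p(R=1|x), can be negative for exactly the users a response model ranks highest. A minimal sketch with made-up numbers (the user names and probabilities are hypothetical, not values from the plot):

```python
def uplift(p_treated, p_baseline):
    """Change in response probability attributable to the treatment."""
    return p_treated - p_baseline


# (user, p(R=1|x), p(R=1|x,T=t)), sorted by baseline probability, highest first.
# All values are hypothetical.
users = [
    ("u1", 0.90, 0.80),
    ("u2", 0.50, 0.60),
    ("u3", 0.10, 0.25),
]

# Users for whom the treatment lowers the response probability.
harmed = [name for name, base, treated in users if uplift(treated, base) < 0]
print(harmed)
```

Targeting by p(R=1|x) alone would pick u1 first, the one user in this toy list whom the treatment hurts.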
Politicians know this …
If you are a Republican, don’t target confirmed Democrat voters!
Instead:
• Send persuasive messages to undecided voters.
• Send “get out the vote” messages to confirmed supporters.
• Send “please donate” messages to these people also.
A common scenario for uplift
Many treatments are almost free to apply, such as sending email.
The uplift question is then which treatment is most effective.
For each user x, we want to know which t has highest value
p(R=1|x,T=t).
Keep in mind: The same treatment may be the best for all x.
A public dataset
Published by Kevin Hillstrom, former VP of database marketing
at Nordstrom.
Studied in several published papers on uplift, notably by Nicholas
Radcliffe, professor at the University of Edinburgh.
• 64,000 past customers of an e-commerce site selling clothing.
• Randomized to no email, men’s email, or women’s email.
• Three outcomes: binary visit, binary purchase, and numerical spend.
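Because customers were randomized across the three arms, a first estimate of average uplift is the difference in visit rate between an email arm and the no-email arm. The sketch below assumes records of the form (arm, visited); the arm labels and the twelve toy records are hypothetical and are not taken from the Hillstrom dataset.

```python
from collections import defaultdict


def visit_rate_by_arm(records):
    """Fraction of users who visited, per treatment arm."""
    counts = defaultdict(lambda: [0, 0])  # arm -> [visits, users]
    for arm, visited in records:
        counts[arm][0] += visited
        counts[arm][1] += 1
    return {arm: v / n for arm, (v, n) in counts.items()}


def average_uplift(rates, control="no_email"):
    """Visit rate of each arm minus the control arm's visit rate."""
    return {arm: r - rates[control] for arm, r in rates.items() if arm != control}


# Toy records, invented for illustration only.
toy_records = [
    ("no_email", 0), ("no_email", 0), ("no_email", 1), ("no_email", 0),
    ("mens_email", 1), ("mens_email", 0), ("mens_email", 1), ("mens_email", 0),
    ("womens_email", 0), ("womens_email", 1), ("womens_email", 0), ("womens_email", 0),
]
rates = visit_rate_by_arm(toy_records)
lift = average_uplift(rates)
print(rates, lift)
```

This gives one number per arm. Finding which email works best for which customer requires conditioning on customer features as well.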
Looking at the data
Treatments have a larger effect on “visit” than on “purchase
given visit” or on “spend given purchase.”
We'll analyze uplift (i.e., the causal influence of treatments)
for visits.
Table from Hillstrom’s MineThatData email analytics challenge by Radcliffe.
The linear probability model
Assume the linear function p(R=1|x) = b_0 + Σ_i b_i * x_i.
• Find coefficients b_i to minimize square loss.
Square loss is proper, so predicted probabilities are calibrated.
Avoid overfitting and predictions <0 or >1 by not having too
many predictors.
Commonly used in econometrics, not in ML. In practice, often
quite similar to logistic regression.
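A linear probability model is ordinary least squares applied to a 0/1 response. The sketch below fits one by solving the normal equations in plain Python. It assumes a small number of non-collinear predictors; the function names and the toy data are hypothetical.

```python
def fit_linear_probability(X, y):
    """Least-squares fit of p(R=1|x) = b0 + sum_i b_i * x_i (normal equations)."""
    rows = [[1.0] + [float(v) for v in x] for x in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented system [A^T A | A^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    # Assumes predictors are not collinear, otherwise a pivot is zero.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def predict(coefs, x):
    """Predicted probability; may leave [0, 1] when there are many predictors."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))


# Toy data: one binary predictor (1 = received an email), 0/1 visit outcome.
X = [[0], [0], [0], [0], [1], [1], [1], [1]]
y = [0, 0, 0, 1, 1, 0, 1, 0]
coefs = fit_linear_probability(X, y)
print(coefs, predict(coefs, [1]))
```

With a single binary predictor the intercept is the visit rate when the predictor is 0 and the slope is the difference in visit rates, which is why the coefficients can be read directly as percentage-point effects.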
Including treatment indicators M and W
probability of visit =
  7.5% + …
  + 6.5% IF (men’s past AND men’s email)
  + 6.6% IF (women’s past AND men’s email)
  + 6.1% IF (women’s past AND women’s email)
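To see how the listed interaction terms translate into a choice of email, the sketch below sums only the three treatment terms shown on the slide, in percentage points. Terms hidden behind the slide's "…" are not modeled, so this illustrates how to read the coefficients and does not reproduce the fitted model. The key names are hypothetical.

```python
# Interaction coefficients read off the slide, in percentage points.
# Any terms elided on the slide ("…") are deliberately left out.
EMAIL_TERMS = {
    ("mens_past", "M"): 6.5,
    ("womens_past", "M"): 6.6,
    ("womens_past", "W"): 6.1,
}


def listed_email_effect(past_purchases, email):
    """Sum the listed interaction terms that apply to this customer and email."""
    return sum(EMAIL_TERMS.get((past, email), 0.0) for past in past_purchases)


def preferred_email(past_purchases):
    """Email (M or W) with the larger listed effect; ties go to M."""
    return max(("M", "W"), key=lambda e: listed_email_effect(past_purchases, e))


for segment in (["mens_past"], ["womens_past"], ["mens_past", "womens_past"]):
    print(segment, preferred_email(segment),
          listed_email_effect(segment, "M"), listed_email_effect(segment, "W"))
```

For customers with only women's past purchases the two listed effects differ by half a percentage point, which is in line with the deck treating either email as acceptable for that group.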
The men’s email is effective for customers who have
previously purchased men’s or women’s clothing.
The women’s email is not effective for customers who have
previously purchased only men’s clothing.
Optimal treatment policy:
• If only men’s previous purchases: send men’s email.
• If only women’s purchases: send either email.
• If both: send men’s email.
Hypothesis: Women tend to buy clothing for their families,
but men tend to buy clothing only for themselves.
Validation
How can we confirm that we have found an optimal policy?
Approach:
1. Train models of response for each treatment.
2. For each user x in a test set, plot both predicted probabilities.
3. Three separate test sets: users who previously purchased only
women’s clothing, only men’s, or both.
4. The latter two sets should show p(R=1|x, T=M) > p(R=1|x, T=W)
for most x.
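The four steps above can be sketched as follows. To keep the example self-contained, a per-segment visit rate stands in for the response model (the results in the deck use random forests instead). The segment labels and toy training rows are hypothetical.

```python
from collections import defaultdict


def fit_rate_model(train):
    """Per-segment visit rate; a simple stand-in for any response model."""
    tally = defaultdict(lambda: [0, 0])  # segment -> [visits, users]
    for segment, visited in train:
        tally[segment][0] += visited
        tally[segment][1] += 1
    return {s: v / n for s, (v, n) in tally.items()}


def share_preferring_m(model_m, model_w, test_segments):
    """Fraction of test users with p(R=1|x,T=M) > p(R=1|x,T=W)."""
    wins = sum(model_m[s] > model_w[s] for s in test_segments)
    return wins / len(test_segments)


# Step 1: one model per treatment, trained on users who received that email.
train_m = [("mens_only", 1), ("mens_only", 1), ("mens_only", 0),
           ("womens_only", 1), ("womens_only", 0)]
train_w = [("mens_only", 0), ("mens_only", 0), ("mens_only", 1),
           ("womens_only", 1), ("womens_only", 0)]
model_m, model_w = fit_rate_model(train_m), fit_rate_model(train_w)

# Steps 2 to 4: compare both predictions on separate test sets.
share_mens = share_preferring_m(model_m, model_w, ["mens_only"] * 4)
share_womens = share_preferring_m(model_m, model_w, ["womens_only"] * 4)
print(share_mens, share_womens)
```

A share near 1 on a test set supports sending the men's email to that group. A share near 0 together with equal predictions, as in the women's-only toy segment, means the two emails are indistinguishable there.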
Results using random forests:
Lower two panels: As expected,
p(R=1|x, T=M) > p(R=1|x, T=W).
Top panel: The two treatments
M and W are equally effective.
What comes next?
Conclusion: Indeed, one treatment (the men’s email) can be
optimal for all customers.
The step beyond uplift modeling is reinforcement learning:
Learning a sequence of actions that is best for each user.
• The goal is to maximize total lifetime reward from each
customer.
• Learn simultaneously how customers evolve and how
they respond to actions that we take.
Questions?
1. Detecting anomalies in streaming data
2. Making Spark usable for real-time predictions
3. Amazon’s most important algorithm for recommendations
4. Uplift: We want causation, not merely correlation

From Practice to Theory in Learning from Massive Data by Charles Elkan at BigMine16

  • 1. 1 From practice to theory in learning from massive data Charles Elkan Amazon Fellow August 14, 2016
  • 2. Important Information here is already public. Opinions are mine, not Amazon’s.
  • 3. 3
  • 4. Outline Only 30 minutes! 1. Detecting anomalies in streaming data 2. Making Spark usable for real-time predictions 3. Amazon’s most important algorithm for recommendations 4. Uplift: We want causation, not merely correlation
  • 5. Outline 1. Detecting anomalies in streaming data 2. Making Spark usable for real-time predictions 3. Amazon’s most important algorithm for recommendations 4. Uplift: We want causation, not merely correlation
  • 7. From theory to practice
  • 9. Outline 1. Detecting anomalies in streaming data 2. Making Spark usable for real-time predictions 3. Amazon’s most important algorithm for recommendations 4. Uplift: We want causation, not merely correlation
  • 10. From practice to practice
  • 11.
  • 12. Outline 1. Detecting anomalies in streaming data 2. Making Spark usable for real-time predictions 3. Amazon’s most important algorithm for recommendations 4. Uplift: We want causation, not merely correlation
  • 13. 13 Academic versus applied In theory, researchers favor simplicity. In practice, they don’t. In industry, simplicity genuinely wins. Example: Desiderata for recommender systems: 1. Respect the privacy of users; don’t be creepy. 2. Make recommendations understandable. 3. Make them responsive to the user’s most recent interests. 4. Generate them with millisecond latency.
  • 14. 14 Amazon’s most important recommender system 1. Respect the privacy of users; don’t be creepy. 2. Make recommendations understandable. 3. And responsive to the user’s most recent interests. 4. Generate them with millisecond latency.
  • 16. What data scientists do every day: Let x be a user and let R = 0 or 1 be a response. For example, R=1 means the user buys shoes in the next month. Routinely, we train models to predict the probability p(R=1|x). We send messages and coupons to users with high p(R=1|x).
  • 17. Is p(R=1|x) actually useful? In principle, no. "Our goal is not to predict the future; it is to change the future." Merely predicting user behavior is of limited interest. We want to select treatments that influence users. T = t means we choose treatment t. For each available t, compute p(R=1|x,T=t). Choose the t that gives the highest probability.
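The selection rule on slide 17 can be sketched in a few lines. This is a minimal illustration, not the speaker's implementation: `best_treatment`, `toy_models`, and the feature `recent_visitor` are hypothetical names, and all probabilities are made up.

```python
from typing import Callable, Dict

# A response model maps a user's features to an estimate of p(R=1 | x, T=t).
ResponseModel = Callable[[dict], float]


def best_treatment(x: dict, models: Dict[str, ResponseModel]) -> str:
    """Return the treatment t whose model gives the highest p(R=1 | x, T=t)."""
    return max(models, key=lambda t: models[t](x))


# Hypothetical per-treatment models with made-up numbers, for illustration only.
toy_models: Dict[str, ResponseModel] = {
    "no_email": lambda x: 0.10,
    "coupon": lambda x: 0.18 if x["recent_visitor"] else 0.08,
}

print(best_treatment({"recent_visitor": True}, toy_models))   # coupon
print(best_treatment({"recent_visitor": False}, toy_models))  # no_email
```

Note that the second user is better left alone: the treatment with the highest predicted response can be "do nothing."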
  • 18. The risk of ignoring uplift: Users are ranked by p(R=1|x), shown by the brown line. The blue dashed line shows p(R=1|x,T=t). The treatment t has a negative effect for users in the top 5%: p(R=1|x,T=t) < p(R=1|x).
  • 19. Politicians know this … If you are a Republican, don’t target confirmed Democrat voters! Instead: • Send persuasive messages to undecided voters. • Send “get out the vote” messages to confirmed supporters. • Send “please donate” messages to these people also.
  • 20. A common scenario for uplift: Many treatments are almost free to apply, such as sending email. The uplift question is then which treatment is most effective. For each user x, we want to know which t has the highest value p(R=1|x,T=t). Keep in mind: The same treatment may be the best for all x.
  • 21. A public dataset: Published by Kevin Hillstrom, former VP of database marketing at Nordstrom. Studied in several published papers on uplift, notably by Nicholas Radcliffe, professor at the University of Edinburgh. The data cover 64,000 past customers of an e-commerce site selling clothing, randomized to no email, men’s email, or women’s email. Three outcomes are recorded: binary “visit” and “purchase,” and numerical “spend.”
  • 22. Looking at the data: Treatments have a larger effect on “visit” than on “purchase given visit” or on “spend given purchase.” We'll analyze uplift (i.e., the causal influence of treatments) for visits. Table from Hillstrom’s MineThatData email analytics challenge by Radcliffe.
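Because customers were randomized to treatments, the average uplift of an email on visits can be estimated as a simple difference in visit rates between arms. The sketch below shows that calculation; the function names and the records are hypothetical, and the toy numbers are made up rather than taken from the Hillstrom data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def visit_rates(records: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Return the observed visit rate for each treatment arm."""
    counts = defaultdict(int)
    visits = defaultdict(int)
    for treatment, visited in records:
        counts[treatment] += 1
        visits[treatment] += visited
    return {t: visits[t] / counts[t] for t in counts}


def average_uplift(records: List[Tuple[str, int]],
                   treatment: str,
                   control: str = "no_email") -> float:
    """Difference in visit rate between a treatment arm and the control arm."""
    rates = visit_rates(records)
    return rates[treatment] - rates[control]


# Made-up (treatment, visited) records; NOT the Hillstrom data.
toy = (
    [("no_email", 1)] * 1 + [("no_email", 0)] * 9
    + [("mens_email", 1)] * 2 + [("mens_email", 0)] * 8
)
print(round(average_uplift(toy, "mens_email"), 3))  # 0.1
```

This gives one number per treatment for the whole population; the slides that follow ask the finer question of how the effect varies with customer features x.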
  • 23. The linear probability model: Assume the linear function p(R=1|x) = b0 + ∑i bi * xi. Find coefficients bi to minimize square loss. Square loss is proper, so predicted probabilities are calibrated. Avoid overfitting and predictions <0 or >1 by not having too many predictors. Commonly used in econometrics, not in ML. In practice, often quite similar to logistic regression.
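As a minimal sketch of the linear probability model, the code below fits p(R=1|x) = b0 + b1 * x by least squares for a single predictor, using the closed-form solution. The function name `fit_lpm` and the data are hypothetical; a real analysis would use many predictors and a regression library.

```python
from typing import Sequence, Tuple


def fit_lpm(x: Sequence[float], r: Sequence[int]) -> Tuple[float, float]:
    """Least-squares fit of p(R=1|x) = b0 + b1 * x for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_r = sum(r) / n
    # Closed-form simple regression: slope = S_xr / S_xx.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxr = sum((xi - mean_x) * (ri - mean_r) for xi, ri in zip(x, r))
    b1 = sxr / sxx
    b0 = mean_r - b1 * mean_x
    return b0, b1


# Made-up data: x is a 0/1 indicator, r is the binary response.
x = [0, 0, 0, 0, 1, 1, 1, 1]
r = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_lpm(x, r)
print(b0, b1)  # 0.25 0.5
```

With a binary predictor, b0 is the response rate in the x=0 group and b0 + b1 is the rate in the x=1 group, so the coefficient reads directly as a change in probability.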
  • 24. Including treatment indicators M and W: probability of visit = 7.5% + … + 6.5% IF (men’s past AND men’s email) + 6.6% IF (women’s past AND men’s email) + 6.1% IF (women’s past AND women’s email)
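To make the interaction terms concrete, the sketch below sums the three coefficients listed on slide 24 for a given purchase history and email. It rests on two assumptions that are not in the slides: a (past purchase, email) pair that is not listed contributes zero, and the terms elided by "…" are ignored. The names `INTERACTIONS` and `email_increment` are hypothetical.

```python
# Interaction coefficients in percentage points, copied from slide 24.
# Keys are (past purchase category, email sent).
INTERACTIONS = {
    ("mens", "mens_email"): 6.5,
    ("womens", "mens_email"): 6.6,
    ("womens", "womens_email"): 6.1,
}


def email_increment(past: set, email: str) -> float:
    """Sum the listed interaction terms that apply to this customer and email.

    Assumption: a (past, email) pair not listed on the slide contributes 0.
    Terms elided by "…" on the slide are not modeled here.
    """
    return sum(INTERACTIONS.get((category, email), 0.0) for category in past)


for past in ({"mens"}, {"womens"}, {"mens", "womens"}):
    print(sorted(past),
          round(email_increment(past, "mens_email"), 1),
          round(email_increment(past, "womens_email"), 1))
```

Under these assumptions, the listed terms favor the men’s email for customers with men’s-only or mixed purchase histories, and give nearly equal increments (6.6 versus 6.1 points) for women’s-only customers.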
  • 25. The men’s email is effective for customers who have previously purchased men’s or women’s clothing. The women’s email is not effective for customers who have previously purchased only men’s clothing.
  • 26. Optimal treatment policy: If only men’s previous purchases: send men’s email. If only women’s purchases: send either email. If both: send men’s email. Hypothesis: Women tend to buy clothing for their families, but men tend to buy clothing only for themselves.
  • 27. Validation: How can we confirm that we have found an optimal policy? Approach: 1. Train models of response for each treatment. 2. For each user x in a test set, plot both predicted probabilities. 3. Three separate test sets: users who previously purchased only women’s clothing, only men’s, or both. 4. The latter two sets should show p(R=1|x, T=M) > p(R=1|x, T=W) for most x.
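Step 4 of the validation approach can be summarized numerically as the fraction of test users for whom the men’s-email model predicts a higher response than the women’s-email model. The sketch below assumes the two per-treatment models are already trained; `fraction_preferring_m`, the stand-in models, and the feature `past_mens` are hypothetical, and the numbers are made up.

```python
from typing import Callable, Iterable

# A response model maps a user's features to an estimate of p(R=1 | x, T=t).
ResponseModel = Callable[[dict], float]


def fraction_preferring_m(users: Iterable[dict],
                          p_m: ResponseModel,
                          p_w: ResponseModel) -> float:
    """Fraction of test users x with p(R=1|x,T=M) > p(R=1|x,T=W)."""
    users = list(users)
    wins = sum(1 for x in users if p_m(x) > p_w(x))
    return wins / len(users)


# Hypothetical stand-ins for the two trained models; numbers are made up.
p_m = lambda x: 0.20 if x["past_mens"] else 0.12
p_w = lambda x: 0.14

test_set = [{"past_mens": 1}, {"past_mens": 1}, {"past_mens": 1}, {"past_mens": 0}]
print(fraction_preferring_m(test_set, p_m, p_w))  # 0.75
```

Computing this fraction separately for each of the three test sets gives a compact check alongside the plots of predicted probabilities.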
  • 28. Results using random forests: Lower two panels: As expected, p(R=1|x, T=M) > p(R=1|x, T=W). Top panel: The two treatments M and W are equally effective.
  • 29. What comes next? Conclusion: Indeed, one treatment (the men’s email) can be optimal for all customers. The step beyond uplift modeling is reinforcement learning: learning a sequence of actions that is best for each user. The goal is to maximize total lifetime reward from each customer. Learn simultaneously how customers evolve and how they respond to actions that we take.
  • 30. Questions? 1. Detecting anomalies in streaming data 2. Making Spark usable for real-time predictions 3. Amazon’s most important algorithm for recommendations 4. Uplift: We want causation, not merely correlation