How to correctly estimate the effect of online advertisement(About Double Machine Learning)

•

1 gefällt mir•1,658 views

Yusuke Kaneko

TokyoR 第71回でのLTスライドです．発表中も少し触れましたが、今回使用したCriteoのデータセットは高次元データとは言い難いのでややタイトル詐欺感はあります

Daten & Analysen

How to correctly estimate
the effect of online
advertisement
Introduction of Double Machine Lerning

About me
● Yusuke Kaneko(@coldstart_p)
● CyberAgent, Inc.
● Studying: Econometrics, Causal Inference

Main Theme
How to estimate correctly treatment effects in
high-dimensional data(such as advertising data)

Original
1. Chernozhukov, Victor, et al. "Double/debiased/neyman machine learning of
treatment effects." American Economic Review 107.5 (2017): 261-65.
2. Belloni, A., Chernozhukov, V., Fernández‐Val, I., & Hansen, C. (2017).
Program evaluation and causal inference with high‐dimensional data.
Econometrica, 85(1), 233-298.
3. Chernozhukov, Victor, et al. "Double/debiased machine learning for treatment
and structural parameters." The Econometrics Journal 21.1 (2018): C1-C68.

Problem
● Advertisement data has a lot of variables. (= High-Dimensional Data)
example: Type of Advertisement , Time , History-data
→ Under such circumstances, it is known that low-dimensional parameters such
as treatment effects can not be estimated well by Machine Learning Methods
(The model and figure are on the slide later)
● Motivation: How to estimate treatment effects correctly ?
→ Use Double Machine Learning(Double ML) !
=( FLEXIBLE and SIMPLE Method)
● Double ML Code are on github(https://github.com/VC2015/DMLonGitHub/)
…(Double ML package does not exist yet. )

Data
● Criteo Uplift Modeling Dataset(EC Site AD Large Dataset for causal
inference)
Recently
Published
Dataset!

Data
● Features are as follows
● Averege Visit Rate and Treatment Ratio
is on the right

Model
● Partially Linear Model
Y: Outcome Variable(Whether user visited = visit)
D: Treatment Variable (Whether or not an ad has been delivered = treatment)
Z: Vector of covariates(such as history data = f0 f1 ...)
θ0: “lift” parameter(Treatment Effect parameter)
● D is expressed by the following equation
D is conditionally exogenous

Bad ML Estimation
● Predict Y using D and Z, then obtain
(Example)
1. Run Random Forest and get
2. and get
3. Repeat 1. and 2. until convergence

Bad ML Estimation
● Prediction performance is Good, but the distribution of looks like
below.
(from paper)

Double ML
1. Predict Y and D using Z by
obtained using ML methods(Lasso , Random Forest etc... )
2. Residualize
3. Regress and get

Double ML
● Prediction performance is Good, and the distribution of looks like
below.
(from paper)

Sample Splitting
● In Double ML, Sample Splitting is one key Ingredient.
→ Convergence rate improves with sample splitting(Please refer to the papers
for the theoretical content)
(from paper)

Double ML in R
● All the code for the simulation is on the github

Simulation
● Since Criteo Dataset is too large , I made overall sampling first (about 2%)
● Introduce Sampling Bias( Since we are using data coming from a randomized
experiment)

Naive ATE Method
● Unbiased ATE(Average Treatment Effect) estimation is possible with this
technique if there is no sampling bias

Simulation
● Average treatment effect without sampling bias on the left
● Average treatment effect with sampling bias on the right
Highly Biased

Simulation
● Use approach: Double ML with Lasso(You can use RandomForest instead of
Lasso)
→ Counter approach: Naive ATE, Single equation Lasso , Usual Lasso , Lasso
with Propensity Score Weighting

Simulation
● Result(with Lasso with Propensity Score Weighting) Lasso with PSW‘s
performance is
very bad , so
remove this

Simulation
● Result(without Lasso with Propensity Score Weighting)
Double ML’s
performance is
very good!

Conclusion
● Double ML is Flexible and Simple Method for causal inference
● Double ML’s performance seems very good with real data

Empfohlen

Actor critic algorithmJie-Han Chen

Fake News Detection Using Machine learning algorithm MudasirBashir23

An introduction to reinforcement learningJie-Han Chen

Proximal Policy Optimization (Reinforcement Learning)Thom Lane

バンディットアルゴリズム入門と実践智之村上

Practical tips for handling noisy data and annotaitonRyuichiKanoh

Siamese networksNicholas McClure

Osaka.Stan #3 Chapter 5-2Takayuki Goto

Empfohlen

Actor critic algorithmJie-Han Chen

Fake News Detection Using Machine learning algorithm MudasirBashir23

An introduction to reinforcement learningJie-Han Chen

Proximal Policy Optimization (Reinforcement Learning)Thom Lane

バンディットアルゴリズム入門と実践智之村上

Practical tips for handling noisy data and annotaitonRyuichiKanoh

Siamese networksNicholas McClure

Osaka.Stan #3 Chapter 5-2Takayuki Goto

多腕バンディット問題: 定式化と応用 (第13回ステアラボ人工知能セミナー)STAIR Lab, Chiba Institute of Technology

Randomforestで高次元の変数重要度を見る #japanr LTAkifumi Eguchi

ゼロから始めるレコメンダシステムKazuaki Tanida

1 8.交互作用logics-of-blue

Proximal Policy OptimizationShubhaManikarnike

Rで学ぶロバスト推定Shintaro Fukushima

最近のKaggleに学ぶテーブルデータの特徴量エンジニアリングmlm_kansai

ブートストラップ法とその周辺とRDaisuke Yoneoka

機械学習モデルの判断根拠の説明Satoshi Hara

臨床家が知っておくべき臨床疫学・統計Yasuaki Sagara

機械学習によるデータ分析まわりのお話Ryota Kamoshida

第10章後半「ブースティングと加法的木」T T

Machine Learning for Survival AnalysisChandan Reddy

東京都市大学データ解析入門 8 クラスタリングと分類分析 1hirokazutanaka

第２回DARM勉強会Yoshitake Takebayashi

回帰モデルとして見る信号検出理論Shushi Namba

【博士論文発表会】パラメータ制約付き特異モデルの統計的学習理論Naoki Hayashi

言語モデル入門（第二版）Yoshinari Fujinuma

学位論文「ウェブ情報の信憑性分析に関する研究」Yusuke Yamamoto

SIGNATE オフロードコンペ精度認識部門 3rd Place SolutionYusuke Uchida

Faster and cheaper, smart ab experiments - public ver.Marsan Ma

IntroductioneditedMefratechnologies

Weitere ähnliche Inhalte

Was ist angesagt?

多腕バンディット問題: 定式化と応用 (第13回ステアラボ人工知能セミナー)STAIR Lab, Chiba Institute of Technology

Randomforestで高次元の変数重要度を見る #japanr LTAkifumi Eguchi

ゼロから始めるレコメンダシステムKazuaki Tanida

1 8.交互作用logics-of-blue

Proximal Policy OptimizationShubhaManikarnike

Rで学ぶロバスト推定Shintaro Fukushima

最近のKaggleに学ぶテーブルデータの特徴量エンジニアリングmlm_kansai

ブートストラップ法とその周辺とRDaisuke Yoneoka

機械学習モデルの判断根拠の説明Satoshi Hara

臨床家が知っておくべき臨床疫学・統計Yasuaki Sagara

機械学習によるデータ分析まわりのお話Ryota Kamoshida

第10章後半「ブースティングと加法的木」T T

Machine Learning for Survival AnalysisChandan Reddy

東京都市大学データ解析入門 8 クラスタリングと分類分析 1hirokazutanaka

第２回DARM勉強会Yoshitake Takebayashi

回帰モデルとして見る信号検出理論Shushi Namba

【博士論文発表会】パラメータ制約付き特異モデルの統計的学習理論Naoki Hayashi

言語モデル入門（第二版）Yoshinari Fujinuma

学位論文「ウェブ情報の信憑性分析に関する研究」Yusuke Yamamoto

SIGNATE オフロードコンペ精度認識部門 3rd Place SolutionYusuke Uchida

Was ist angesagt? (20)

多腕バンディット問題: 定式化と応用 (第13回ステアラボ人工知能セミナー)

Randomforestで高次元の変数重要度を見る #japanr LT

ゼロから始めるレコメンダシステム

1 8.交互作用

Proximal Policy Optimization

Rで学ぶロバスト推定

最近のKaggleに学ぶテーブルデータの特徴量エンジニアリング

ブートストラップ法とその周辺とR

機械学習モデルの判断根拠の説明

臨床家が知っておくべき臨床疫学・統計

機械学習によるデータ分析まわりのお話

第10章後半「ブースティングと加法的木」

Machine Learning for Survival Analysis

東京都市大学データ解析入門 8 クラスタリングと分類分析 1

第２回DARM勉強会

回帰モデルとして見る信号検出理論

【博士論文発表会】パラメータ制約付き特異モデルの統計的学習理論

言語モデル入門（第二版）

学位論文「ウェブ情報の信憑性分析に関する研究」

SIGNATE オフロードコンペ精度認識部門 3rd Place Solution

Ähnlich wie How to correctly estimate the effect of online advertisement(About Double Machine Learning)

Faster and cheaper, smart ab experiments - public ver.Marsan Ma

IntroductioneditedMefratechnologies

Nbe rcausalpredictionv111 lecture2NBER

PREDICT 422 - Module 1.pptxVikramKumar790542

DissertationMefratechnologies

IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESVikash Kumar

Adapting neural networks for the estimation of treatment effectsViswanath Gangavaram

AIML_UNIT 2 _PPT_HAND NOTES_MPS.pdfMargiShah29

Housing price predictionAbhimanyu Dwivedi

Experimental designs and data analysis in the field of Agronomy science by ma...Manoj Sharma

Setting up an A/B-testing frameworkAgnes van Belle

Internship project report,Predictive ModellingAmit Kumar

housepriceprediction-180915174356.pdfVinayShekarReddy

Machine Learning.pptxNitinSharma134320

Data Science - Part V - Decision Trees & Random Forests Derek Kane

Probability distribution Function & Decision Trees in machine learningSadia Zafar

Ijaems apr-2016-23 Study of Pruning Techniques to Predict Efficient Business ...INFOGAIN PUBLICATION

Classification of Headache Disorder Using Random Forest Algorithm.pptxAlemu Gudeta

debatrim_report (1)Debatri Mitra

How ml can improve purchase conversionsSudeep Shukla

Ähnlich wie How to correctly estimate the effect of online advertisement(About Double Machine Learning) (20)

Faster and cheaper, smart ab experiments - public ver.

Introductionedited

Nbe rcausalpredictionv111 lecture2

PREDICT 422 - Module 1.pptx

Dissertation

IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES

Adapting neural networks for the estimation of treatment effects

AIML_UNIT 2 _PPT_HAND NOTES_MPS.pdf

Housing price prediction

Experimental designs and data analysis in the field of Agronomy science by ma...

Setting up an A/B-testing framework

Internship project report,Predictive Modelling

housepriceprediction-180915174356.pdf

Machine Learning.pptx

Data Science - Part V - Decision Trees & Random Forests

Probability distribution Function & Decision Trees in machine learning

Ijaems apr-2016-23 Study of Pruning Techniques to Predict Efficient Business ...

Classification of Headache Disorder Using Random Forest Algorithm.pptx

debatrim_report (1)

How ml can improve purchase conversions

Mehr von Yusuke Kaneko

Kdd 2021 読み会(clustering for private interest-based advertising & learning a l...Yusuke Kaneko

DID, Synthetic Control, CausalImpactYusuke Kaneko

企業の中の経済学Yusuke Kaneko

TokyoR_74_RDDYusuke Kaneko

LightGBM: a highly efficient gradient boosting decision treeYusuke Kaneko

Hastie_chapter5Yusuke Kaneko

効果のあるクリエイティブ広告の見つけ方(Contextual Bandit + TS or UCB)Yusuke Kaneko

Mehr von Yusuke Kaneko (7)

Kdd 2021 読み会(clustering for private interest-based advertising & learning a l...

DID, Synthetic Control, CausalImpact

企業の中の経済学

TokyoR_74_RDD

LightGBM: a highly efficient gradient boosting decision tree

Hastie_chapter5

効果のあるクリエイティブ広告の見つけ方(Contextual Bandit + TS or UCB)

Kürzlich hochgeladen

➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...amitlee9823

Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823

Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Standamitlee9823

Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Pooja Nehwal

Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila

Predicting Loan Approval: A Data Science ProjectBoston Institute of Analytics

Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823

Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823

➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...amitlee9823

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823

➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...amitlee9823

Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823

Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...gajnagarg

➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823

Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01

Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums

DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann

Anomaly detection and data imputation within time seriesParis Women in Machine Learning and Data Science

Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Standamitlee9823

Kürzlich hochgeladen (20)

➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...

Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand

Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand

Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -

Aspirational Block Program Block Syaldey District - Almora

Predicting Loan Approval: A Data Science Project

Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore

Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...

➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...

➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...

Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...

Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...

➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...

Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...

Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...

DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK

Anomaly detection and data imputation within time series

Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand

How to correctly estimate the effect of online advertisement(About Double Machine Learning)

1. How to correctly estimate the effect of online advertisement Introduction of Double Machine Lerning

2. About me ● Yusuke Kaneko(@coldstart_p) ● CyberAgent, Inc. ● Studying: Econometrics, Causal Inference

3. Main Theme How to estimate correctly treatment effects in high-dimensional data(such as advertising data)

4. Original 1. Chernozhukov, Victor, et al. "Double/debiased/neyman machine learning of treatment effects." American Economic Review 107.5 (2017): 261-65. 2. Belloni, A., Chernozhukov, V., Fernández‐Val, I., & Hansen, C. (2017). Program evaluation and causal inference with high‐dimensional data. Econometrica, 85(1), 233-298. 3. Chernozhukov, Victor, et al. "Double/debiased machine learning for treatment and structural parameters." The Econometrics Journal 21.1 (2018): C1-C68.

5. Problem ● Advertisement data has a lot of variables. (= High-Dimensional Data) example: Type of Advertisement , Time , History-data → Under such circumstances, it is known that low-dimensional parameters such as treatment effects can not be estimated well by Machine Learning Methods (The model and figure are on the slide later) ● Motivation: How to estimate treatment effects correctly ? → Use Double Machine Learning(Double ML) ! =( FLEXIBLE and SIMPLE Method) ● Double ML Code are on github(https://github.com/VC2015/DMLonGitHub/) …(Double ML package does not exist yet. )

6. Data ● Criteo Uplift Modeling Dataset(EC Site AD Large Dataset for causal inference) Recently Published Dataset!

7. Data ● Features are as follows ● Averege Visit Rate and Treatment Ratio is on the right

8. Model ● Partially Linear Model Y: Outcome Variable(Whether user visited = visit) D: Treatment Variable (Whether or not an ad has been delivered = treatment) Z: Vector of covariates(such as history data = f0 f1 ...) θ0: “lift” parameter(Treatment Effect parameter) ● D is expressed by the following equation D is conditionally exogenous

9. Bad ML Estimation ● Predict Y using D and Z, then obtain (Example) 1. Run Random Forest and get 2. and get 3. Repeat 1. and 2. until convergence

10. Bad ML Estimation ● Prediction performance is Good, but the distribution of looks like below. (from paper)

11. Double ML 1. Predict Y and D using Z by obtained using ML methods(Lasso , Random Forest etc... ) 2. Residualize 3. Regress and get

12. Double ML ● Prediction performance is Good, and the distribution of looks like below. (from paper)

13. Sample Splitting ● In Double ML, Sample Splitting is one key Ingredient. → Convergence rate improves with sample splitting(Please refer to the papers for the theoretical content) (from paper)

14. Double ML in R ● All the code for the simulation is on the github

15. Simulation ● Since Criteo Dataset is too large , I made overall sampling first (about 2%) ● Introduce Sampling Bias( Since we are using data coming from a randomized experiment)

16. Naive ATE Method ● Unbiased ATE(Average Treatment Effect) estimation is possible with this technique if there is no sampling bias

17. Simulation ● Average treatment effect without sampling bias on the left ● Average treatment effect with sampling bias on the right Highly Biased

18. Simulation ● Use approach: Double ML with Lasso(You can use RandomForest instead of Lasso) → Counter approach: Naive ATE, Single equation Lasso , Usual Lasso , Lasso with Propensity Score Weighting

19. Simulation ● Result(with Lasso with Propensity Score Weighting) Lasso with PSW‘s performance is very bad , so remove this

20. Simulation ● Result(without Lasso with Propensity Score Weighting) Double ML’s performance is very good!

21. Conclusion ● Double ML is Flexible and Simple Method for causal inference ● Double ML’s performance seems very good with real data