Maximizing a churn campaign’s profitability with cost-sensitive machine learning
Alejandro Correa Bahnsen, PhD
Chief Data Scientist | Easy Solutions
August 25 and 26, 2017 | Lima, Peru
#BIGDATASUMMIT2017
Agenda
• Churn modeling
• Evaluation Measures
• Offers
• Predictive modeling
• Cost-Sensitive Predictive Modeling
  • Cost Proportionate Sampling
  • Bayes Minimum Risk
  • CS – Decision Trees
• Conclusions
Churn Modeling
• Detect which customers are likely to leave
  • Voluntary churn
  • Involuntary churn
Customer Churn Management Campaign
[Figure: customer flow in a churn management campaign. New customers flow into the customer base of active customers. The churn model splits them into predicted churners (TP: actual churners, FP: actual non-churners) and predicted non-churners (FN: actual churners, TN: actual non-churners). Flows are labeled 1, 1, 1 − γ, γ and 1; the outflow is the effective churners.]
*Verbraken et al. (2013). A novel profit maximizing metric for measuring classification performance of customer churn prediction models.
Evaluation of a Campaign
• Confusion Matrix

Predicted class ($c_i$) | True class: Churner ($y_i = 1$) | True class: Non-churner ($y_i = 0$)
Churner ($c_i = 1$) | TP | FP
Non-churner ($c_i = 0$) | FN | TN

• Accuracy = $\frac{TP + TN}{TP + TN + FP + FN}$
• Recall = $\frac{TP}{TP + FN}$
• Precision = $\frac{TP}{TP + FP}$
• F1-Score = $2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}$
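The measures on this slide can be computed directly from the true and predicted labels. The following is a minimal sketch in plain Python; the function name `classification_metrics` and the toy labels are illustrative assumptions, not part of the slides.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, recall, precision and F1 from 0/1 labels (1 = churner)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, c in pairs if y == 1 and c == 1)
    tn = sum(1 for y, c in pairs if y == 0 and c == 0)
    fp = sum(1 for y, c in pairs if y == 0 and c == 1)
    fn = sum(1 for y, c in pairs if y == 1 and c == 0)

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against empty denominators (no churners, or nobody targeted).
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Toy labels, made up for illustration: 3 churners among 8 customers.
y_true = [1, 1, 0, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

On these toy labels TP = 2, FN = 1, FP = 1 and TN = 4, so accuracy is 0.75 while recall, precision and F1 are all 2/3.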
Evaluation of a Campaign
• However, these measures assign the same weight to every type of error
• This does not hold for a churn model, because:
  • Failing to predict a churner carries a different cost than wrongly flagging a non-churner
  • Individual churners have different financial impact
Financial Evaluation of a Campaign
[Figure: the same campaign flow annotated with costs: 0 for TN, CLV for FN, C_o + C_a for FP, and for TP either C_o + C_a or CLV + C_a depending on whether the offer is accepted.]
*Verbraken et al. (2013). A novel profit maximizing metric for measuring classification performance of customer churn prediction models.
Financial Evaluation of a Campaign
• Cost Matrix

Predicted class ($c_i$) | True class: Churner ($y_i = 1$) | True class: Non-churner ($y_i = 0$)
Churner ($c_i = 1$) | $C_{TP_i} = \gamma_i C_{o_i} + (1 - \gamma_i) CLV_i + C_a$ | $C_{FP_i} = C_{o_i} + C_a$
Non-churner ($c_i = 0$) | $C_{FN_i} = CLV_i$ | $C_{TN_i} = 0$

where:
• $C_a$ = administrative cost
• $CLV_i$ = customer lifetime value of customer $i$
• $C_{o_i}$ = cost of the offer made to customer $i$
• $\gamma_i$ = probability that customer $i$ accepts the offer
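As a rough illustration, the four cost-matrix entries can be computed per customer as below. This sketch reads the true-positive cost as γ_i·C_o + (1 − γ_i)·CLV_i + C_a, which matches the per-branch costs in the campaign figure. The function name `churn_cost_matrix` and the euro amounts are assumptions made for the example.

```python
def churn_cost_matrix(clv, offer_cost, gamma, admin_cost):
    """Per-customer costs for the four outcomes of the cost matrix.

    clv        -- customer lifetime value CLV_i
    offer_cost -- cost of the offer C_o_i
    gamma      -- probability gamma_i that the customer accepts the offer
    admin_cost -- administrative cost C_a of contacting the customer
    """
    return {
        # Contacted churner: pays the offer if accepted, loses the CLV if not.
        "TP": gamma * offer_cost + (1 - gamma) * clv + admin_cost,
        # Contacted non-churner: the offer and the contact are spent anyway.
        "FP": offer_cost + admin_cost,
        # Missed churner: the full lifetime value is lost.
        "FN": clv,
        # Non-churner left alone: no cost.
        "TN": 0.0,
    }


# Hypothetical customer: CLV 1000, offer 50, 75% acceptance, contact cost 2.
costs = churn_cost_matrix(clv=1000.0, offer_cost=50.0, gamma=0.75, admin_cost=2.0)
```

For this hypothetical customer the true-positive cost is 0.75·50 + 0.25·1000 + 2 = 289.5, far below the 1000 lost on a false negative.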
Financial Evaluation of a Campaign
• Using the cost matrix, the total cost is calculated as:
$C = \sum_{i=1}^{N} \left[ y_i \left( c_i \cdot C_{TP_i} + (1 - c_i) C_{FN_i} \right) + (1 - y_i) \left( c_i \cdot C_{FP_i} + (1 - c_i) C_{TN_i} \right) \right]$
• Additionally, the savings are defined as:
$C_s = \frac{C_0 - C}{C_0}$
where $C_0$ is the cost when all the customers are predicted as non-churners
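The total cost and the savings can be sketched in a few lines of plain Python. The helper names `total_cost` and `savings`, and the four toy customers with their costs, are assumptions for illustration only.

```python
def total_cost(y_true, y_pred, cost_rows):
    """Sum the example-dependent cost over all customers.

    cost_rows[i] is a dict with keys "TP", "FP", "FN", "TN" for customer i.
    """
    total = 0.0
    for y, c, cost in zip(y_true, y_pred, cost_rows):
        if y == 1:
            total += cost["TP"] if c == 1 else cost["FN"]
        else:
            total += cost["FP"] if c == 1 else cost["TN"]
    return total


def savings(y_true, y_pred, cost_rows):
    """Savings relative to C_0, the cost of predicting nobody as a churner.

    Assumes C_0 > 0, i.e. at least one churner with a positive cost.
    """
    c0 = total_cost(y_true, [0] * len(y_true), cost_rows)
    return (c0 - total_cost(y_true, y_pred, cost_rows)) / c0


# Toy data, made up for illustration.
cost_rows = [
    {"TP": 300.0, "FP": 50.0, "FN": 1000.0, "TN": 0.0},
    {"TP": 200.0, "FP": 50.0, "FN": 600.0, "TN": 0.0},
    {"TP": 100.0, "FP": 50.0, "FN": 400.0, "TN": 0.0},
    {"TP": 100.0, "FP": 50.0, "FN": 400.0, "TN": 0.0},
]
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
campaign_cost = total_cost(y_true, y_pred, cost_rows)
campaign_savings = savings(y_true, y_pred, cost_rows)
```

Here C_0 = 1000 + 600 = 1600 and C = 300 + 600 + 50 + 0 = 950, so the savings are 650 / 1600 ≈ 40.6%.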
Financial Evaluation of a Campaign
• Customer Lifetime Value
*Glady et al. (2009). Modeling churn using customer lifetime value.
Offers
• The same offer may not apply to all customers (e.g., a customer who already has premium channels)
• The offer made to each customer should be the one that maximizes the probability of acceptance ($\gamma$)
Offers Analysis
[Figure: candidate offers (Improve to HD DVR, Monthly Discount, Premium Channels), followed by an "Evaluate Offers Performance" step.]
Offers Analysis
[Figure: churn rate and gamma (right axis) for Cluster 1 to Cluster 4.]
$\gamma$ = probability that a customer accepts the offer
Predictive Modeling

Dataset | N | Churn | $C_0$ (Euros)
Total | 9410 | 4.83% | 580,884
Training | 3758 | 5.05% | 244,542
Validation | 2824 | 4.77% | 174,171
Testing | 2825 | 4.42% | 162,171
SMOTE | 6988 | 48.94% | 4,273,083
Under-Sampling | 374 | 50.80% | 244,542
Predictive Modeling
• Algorithms
  • Decision Trees
  • Logistic Regression
  • Random Forest
Predictive Modeling - Results
[Figure: two bar charts, F1-Score and Savings, for Decision Trees, Logistic Regression and Random Forest, each trained on the Training, Under-Sampling and SMOTE sets.]
Predictive Modeling
• Sampling techniques help improve the models’ predictive power, but not necessarily the savings
• Methods that directly aim to increase savings are needed
Cost-Sensitive Predictive Modeling
• Traditional methods assume the same cost for every type of error
  • This is not the case in churn modeling
• Some cost-sensitive methods assume a constant cost difference between errors
  • In churn modeling the costs vary per customer, which calls for example-dependent cost-sensitive predictive modeling
Cost-Sensitive Predictive Modeling
• Changing class distribution
  • Cost Proportionate Rejection Sampling
  • Cost Proportionate Over Sampling
• Direct Cost
  • Bayes Minimum Risk
• Modifying a learning algorithm
  • CS – Decision Tree
Cost Proportionate Sampling
• Cost Proportionate Over Sampling

Example | $y_i$ | $w_i$
1 | 0 | 1
2 | 1 | 10
3 | 0 | 2
4 | 1 | 20
5 | 0 | 1

Each example is replicated $w_i$ times and its weight is reset to 1.

Initial Dataset (example, $y_i$, $w_i$):
(1,0,1)
(2,1,10)
(3,0,2)
(4,1,20)
(5,0,1)

Cost Proportionate Dataset:
(1,0,1)
(2,1,1), (2,1,1), …, (2,1,1)
(3,0,1), (3,0,1)
(4,1,1), (4,1,1), (4,1,1), …, (4,1,1), (4,1,1)
(5,0,1)

*Elkan, C. (2001). The Foundations of Cost-Sensitive Learning.
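The replication step in this toy example can be sketched as follows. The function name `cost_proportionate_oversample` is hypothetical, and the sketch assumes integer weights as in the table; the slides do not say how real-valued costs (for instance, amounts in euros) are turned into replication counts.

```python
def cost_proportionate_oversample(examples):
    """Replicate each (example_id, y, w) tuple w times with weight reset to 1.

    Assumes every weight w is a positive integer, as in the toy table.
    """
    resampled = []
    for example_id, y, w in examples:
        resampled.extend([(example_id, y, 1)] * w)
    return resampled


initial = [(1, 0, 1), (2, 1, 10), (3, 0, 2), (4, 1, 20), (5, 0, 1)]
oversampled = cost_proportionate_oversample(initial)
churn_share = sum(y for _, y, _ in oversampled) / len(oversampled)
```

The resulting dataset has 1 + 10 + 2 + 20 + 1 = 34 rows, 30 of which are churners, so the costly examples dominate what the classifier sees.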
Cost Proportionate Sampling
• Cost Proportionate Rejection Sampling

Example | $y_i$ | $w_i$ | $w_i / \max(w_i)$
1 | 0 | 1 | 0.05
2 | 1 | 10 | 0.5
3 | 0 | 2 | 0.1
4 | 1 | 20 | 1
5 | 0 | 1 | 0.05

Each example is kept with probability $w_i / \max(w_i)$.

Initial Dataset (example, $y_i$, $w_i$):
(1,0,1)
(2,1,10)
(3,0,2)
(4,1,20)
(5,0,1)

Cost Proportionate Dataset:
(2,1,1)
(4,1,1)
(4,1,1)
(5,0,1)

*Zadrozny et al. (2003). Cost-sensitive learning by cost-proportionate example weighting.
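A minimal sketch of the rejection step, using only the standard library. The function name and the fixed seed are assumptions. This variant draws each example once, so unlike the slide's illustration it never returns the same example twice.

```python
import random


def cost_proportionate_rejection_sample(examples, seed=0):
    """Keep each (example_id, y, w) tuple with probability w / max(w).

    Kept examples get weight 1, since the cost is now encoded in the
    sampling itself. The seed only makes the sketch reproducible.
    """
    rng = random.Random(seed)
    w_max = max(w for _, _, w in examples)
    kept = []
    for example_id, y, w in examples:
        # rng.random() is in [0, 1), so the costliest example is always kept.
        if rng.random() < w / w_max:
            kept.append((example_id, y, 1))
    return kept


initial = [(1, 0, 1), (2, 1, 10), (3, 0, 2), (4, 1, 20), (5, 0, 1)]
sample = cost_proportionate_rejection_sample(initial)
```

Because acceptance is random, the sample differs between seeds; only example 4, with acceptance probability 1, is guaranteed to be present.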
Cost Proportionate Sampling

Dataset | N | Churn | $C_0$ (Euros)
Total | 9410 | 4.83% | 580,884
Training | 3758 | 5.05% | 244,542
Validation | 2824 | 4.77% | 174,171
Testing | 2825 | 4.42% | 162,171
Under-Sampling | 374 | 50.80% | 244,542
SMOTE | 6988 | 48.94% | 4,273,083
CS – Rejection-Sampling | 428 | 41.35% | 231,428
CS – Over-Sampling | 5767 | 31.24% | 2,350,285
Cost Proportionate Sampling
[Figure: two bar charts, Savings and F1-Score, for Decision Trees, Logistic Regression and Random Forest, each trained on the Training, Under-Sampling, SMOTE, CS-Rejection and CS-Over sets.]
Bayes Minimum Risk
• Decision model based on quantifying the tradeoffs between decisions, using probabilities and the costs that accompany those decisions
• Risk of classification, where $p_i$ is the estimated probability that customer $i$ churns:
$R(c_i = 0 \mid x_i) = C_{TN_i} (1 - p_i) + C_{FN_i} \cdot p_i$
$R(c_i = 1 \mid x_i) = C_{FP_i} (1 - p_i) + C_{TP_i} \cdot p_i$
• Using the two risks, the prediction is made according to the following condition:
$c_i = \begin{cases} 0 & \text{if } R(c_i = 0 \mid x_i) \le R(c_i = 1 \mid x_i) \\ 1 & \text{otherwise} \end{cases}$
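The decision rule can be sketched for a single customer as below. The function name `bayes_minimum_risk` and the two hypothetical customers are assumptions; the churn probability would come from any probabilistic classifier.

```python
def bayes_minimum_risk(p_churn, cost):
    """Return 1 (treat as churner) or 0 by comparing the two risks.

    p_churn -- estimated churn probability p_i
    cost    -- dict with the customer's "TP", "FP", "FN", "TN" costs
    """
    risk_0 = cost["TN"] * (1 - p_churn) + cost["FN"] * p_churn
    risk_1 = cost["FP"] * (1 - p_churn) + cost["TP"] * p_churn
    return 0 if risk_0 <= risk_1 else 1


# Two hypothetical customers with the same 10% churn probability.
high_value = {"TP": 289.5, "FP": 52.0, "FN": 1000.0, "TN": 0.0}
low_value = {"TP": 54.5, "FP": 52.0, "FN": 60.0, "TN": 0.0}
decision_high = bayes_minimum_risk(0.10, high_value)
decision_low = bayes_minimum_risk(0.10, low_value)
```

For the high-value customer the risk of doing nothing (100) exceeds the risk of making the offer (75.75), so the customer is targeted even at a 10% churn probability. For the low-value customer the risks are 6 versus 52.25, so no offer is made. A fixed probability threshold could not separate the two.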
Bayes Minimum Risk
[Figure: Savings for Decision Trees, Logistic Regression and Random Forest, each without (-) and with BMR, on the Training, Under-Sampling, SMOTE, CS-Rejection and CS-Over sets.]
Bayes Minimum Risk
[Figure: F1-Score for the same models, each without (-) and with BMR, on the Training, Under-Sampling, SMOTE, CS-Rejection and CS-Over sets.]
Bayes Minimum Risk
• Bayes Minimum Risk increases the savings by using a cost-insensitive method and then introducing the costs
• Why not introduce the costs while the model itself is being estimated?
CS – Decision Trees
• Decision trees
  • Classification model that iteratively creates binary decision rules $(x^j, l^j_m)$ that maximize a given criterion
  • Here $(x^j, l^j_m)$ refers to making a rule using feature $j$ on value $m$
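To make the rule notation concrete, the sketch below scores every candidate rule "feature j <= value" and keeps the best one. The criterion is an assumption that the slide does not spell out: a node's cost is taken as the cheaper of labelling all its customers as non-churners or as churners, and a rule is scored by how much it reduces that cost. All function names and the toy data are hypothetical.

```python
def node_cost(rows):
    """Cost of a leaf that gives one label to every row (assumed criterion)."""
    cost_all_0 = sum(r["cost"]["FN"] if r["y"] == 1 else r["cost"]["TN"]
                     for r in rows)
    cost_all_1 = sum(r["cost"]["TP"] if r["y"] == 1 else r["cost"]["FP"]
                     for r in rows)
    return min(cost_all_0, cost_all_1)


def split_gain(rows, feature, threshold):
    """Cost reduction obtained by the rule row["x"][feature] <= threshold."""
    left = [r for r in rows if r["x"][feature] <= threshold]
    right = [r for r in rows if r["x"][feature] > threshold]
    return node_cost(rows) - (node_cost(left) + node_cost(right))


def best_rule(rows, n_features):
    """Try every (feature j, observed value) pair; return (gain, j, value)."""
    candidates = [
        (split_gain(rows, j, r["x"][j]), j, r["x"][j])
        for j in range(n_features)
        for r in rows
    ]
    return max(candidates)


# Toy data: two features, identical made-up costs for every customer.
unit_cost = {"TP": 300.0, "FP": 50.0, "FN": 1000.0, "TN": 0.0}
rows = [
    {"x": [1.0, 5.0], "y": 0, "cost": unit_cost},
    {"x": [2.0, 3.0], "y": 0, "cost": unit_cost},
    {"x": [8.0, 4.0], "y": 1, "cost": unit_cost},
    {"x": [9.0, 6.0], "y": 1, "cost": unit_cost},
]
gain, feature, value = best_rule(rows, n_features=2)
```

On this toy data the best rule is feature 0 at value 2.0: it isolates the two non-churners and lowers the node cost from 700 to 600.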
Comparison of Models
[Figure: bar chart of Savings and F1-Score for Random Forest (Train), Logistic Regression (CSRejection), Logistic Regression BMR (Train), Decision Tree CostPruning (CSRejection) and CS-Decision Tree (Train).]
Conclusions
• Selecting models based on traditional statistics does not give the best results when measured by savings
• Incorporating the costs into the modeling helps to achieve higher savings
Thank you!
Alejandro Correa Bahnsen
acorrea@easysol.net
