Copyright © 2014 SAS Institute Inc. All rights reserved. #analytics2014
Maximizing a Churn Campaign’s
Profitability With Cost-Sensitive
Predictive Analytics
Alejandro Correa Bahnsen, Luxembourg University
Andres Felipe Gonzalez Montoya, DIRECTV
Agenda
• Churn modeling
• Evaluation Measures
• Offers
• Predictive modeling
• Cost-Sensitive Predictive Modeling
  - Cost Proportionate Sampling
  - Bayes Minimum Risk
  - CS – Decision Trees
• Conclusions
Churn Modeling
• Detect which customers are likely to abandon
Voluntary churn
Involuntary churn
Customer Churn Management Campaign
[Figure: customer flow in a churn management campaign. New customers (inflow) join the customer base of active customers. The churn model prediction splits the base into predicted churners (TP: actual churners; FP: actual non-churners) and predicted non-churners (FN: actual churners; TN: actual non-churners). Effective churners leave the base (outflow). Flow labels in the original figure: 1, 1, 1 − γ, 1.]
*Verbraken et al. (2013). A novel profit maximizing metric for measuring classification performance of customer churn prediction models.
Evaluation of a Campaign
• Confusion Matrix

                          | True class ($y_i$)
Predicted class ($c_i$)   | Churner ($y_i=1$) | Non-Churner ($y_i=0$)
Churner ($c_i=1$)         | TP                | FP
Non-Churner ($c_i=0$)     | FN                | TN

• Accuracy = $\frac{TP+TN}{TP+TN+FP+FN}$
• Recall = $\frac{TP}{TP+FN}$
• Precision = $\frac{TP}{TP+FP}$
• F1-Score = $2\,\frac{Precision \cdot Recall}{Precision + Recall}$
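The four measures can be computed directly from the confusion-matrix counts. The following is a minimal sketch in plain Python; the function names and the convention of returning 0 when a denominator is zero are assumptions made for illustration, not part of the original slides.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = churner, 0 = non-churner)."""
    tp = sum(1 for y, c in zip(y_true, y_pred) if y == 1 and c == 1)
    fp = sum(1 for y, c in zip(y_true, y_pred) if y == 0 and c == 1)
    fn = sum(1 for y, c in zip(y_true, y_pred) if y == 1 and c == 0)
    tn = sum(1 for y, c in zip(y_true, y_pred) if y == 0 and c == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Return accuracy, recall, precision and F1-Score as a dict."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Assumption: undefined ratios (zero denominator) are reported as 0.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


if __name__ == "__main__":
    print(classification_metrics([1, 1, 0, 0, 0, 1], [1, 0, 1, 0, 0, 1]))
```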
Evaluation of a Campaign
• However, these measures assign the same weight to every type of error
• This does not hold for a churn model, since:
  - Failing to predict a churner carries a different cost than wrongly flagging a non-churner as a churner
  - Churners have different financial impacts
Financial Evaluation of a Campaign
[Figure: customer flow diagram (inflow of new customers, customer base of active customers, churn model prediction into TP, FP, FN and TN, outflow of effective churners) with each outcome annotated by its cost. Cost labels in the original figure: 0, CLV, CLV + C_a, C_o + C_a.]
*Verbraken et al. (2013). A novel profit maximizing metric for measuring classification performance of customer churn prediction models.
Financial Evaluation of a Campaign
• Cost Matrix

                          | True class ($y_i$)
Predicted class ($c_i$)   | Churner ($y_i=1$)                                          | Non-Churner ($y_i=0$)
Churner ($c_i=1$)         | $C_{TP_i} = \gamma_i C_{o_i} + (1-\gamma_i)(CLV_i + C_a)$   | $C_{FP_i} = C_{o_i} + C_a$
Non-Churner ($c_i=0$)     | $C_{FN_i} = CLV_i$                                          | $C_{TN_i} = 0$

where:
  - $C_a$ = administrative cost
  - $C_{o_i}$ = cost of the offer made to customer $i$
  - $CLV_i$ = client lifetime value of customer $i$
  - $\gamma_i$ = probability that customer $i$ accepts the offer
Financial Evaluation of a Campaign
• Using the cost matrix, the total cost is calculated as:
$C = \sum_i \Big[ y_i \big( c_i C_{TP_i} + (1-c_i) C_{FN_i} \big) + (1-y_i) \big( c_i C_{FP_i} + (1-c_i) C_{TN_i} \big) \Big]$
• Additionally, the savings are defined as:
$C_s = \frac{C_0 - C}{C_0}$
where $C_0$ is the cost when all the customers are predicted as non-churners
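A minimal sketch of the cost and savings calculation, assuming the cost matrix defined on the previous slide ($C_{TP_i} = \gamma_i C_{o_i} + (1-\gamma_i)(CLV_i + C_a)$, $C_{FP_i} = C_{o_i} + C_a$, $C_{FN_i} = CLV_i$, $C_{TN_i} = 0$). The function names and the numbers in the usage example are hypothetical and only illustrate the formulas.

```python
def cost_matrix(clv, offer_cost, gamma, admin_cost):
    """Return (C_TP, C_FP, C_FN, C_TN) for a single customer."""
    c_tp = gamma * offer_cost + (1 - gamma) * (clv + admin_cost)
    c_fp = offer_cost + admin_cost
    c_fn = clv
    c_tn = 0.0
    return c_tp, c_fp, c_fn, c_tn


def total_cost(y_true, y_pred, costs):
    """Sum the example-dependent cost of every prediction."""
    total = 0.0
    for y, c, (c_tp, c_fp, c_fn, c_tn) in zip(y_true, y_pred, costs):
        total += y * (c * c_tp + (1 - c) * c_fn)
        total += (1 - y) * (c * c_fp + (1 - c) * c_tn)
    return total


def savings(y_true, y_pred, costs):
    """Savings relative to C_0, the cost of predicting everyone as a non-churner.

    Assumes at least one actual churner, otherwise C_0 is zero.
    """
    c0 = total_cost(y_true, [0] * len(y_true), costs)
    return (c0 - total_cost(y_true, y_pred, costs)) / c0


if __name__ == "__main__":
    # Hypothetical customers: CLV of 100, 200, 150; offer cost 10; gamma 0.5; admin cost 1.
    costs = [cost_matrix(clv, 10.0, 0.5, 1.0) for clv in (100.0, 200.0, 150.0)]
    print(savings([1, 0, 1], [1, 1, 0], costs))
```

Because every cost is attached to an individual customer, two models with the same confusion matrix can still produce different savings.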
Financial Evaluation of a Campaign
• Customer Lifetime Value
*Glady et al. (2009). Modeling churn using customer lifetime value.
Offers
• The same offer may not apply to all customers (e.g., those who already have premium channels)
• An offer should be made such that it maximizes the
probability of acceptance (𝛾) and CLV
Offers clusters
Offers Analysis
[Figure: the offers considered (Improve to HD DVR, Monthly Discount, Premium Channels), followed by "Evaluate Offers Performance".]
Offers Analysis
[Figure: Churn Rate and Gamma (right axis) for Cluster 1, Cluster 2, Cluster 3 and Cluster 4; axis ticks run from 0.0% to 6.0% and from 88% to 100%.]
γ = probability that a customer accepts the offer
Predictive Modeling
• Use predictive analytics to detect the behavioral patterns of customers who have defected in the past
Predictive Modeling
• Then check which of the current customers share the same
patterns
Predictive Modeling
• Dataset

Dataset          | N     | Churn   | $C_0$ (Euros)
Total            | 9410  | 4.83%   | 580,884
Training         | 3758  | 5.05%   | 244,542
Validation       | 2824  | 4.77%   | 174,171
Testing          | 2825  | 4.42%   | 162,171
Under-Sampling   | 374   | 50.80%  | 244,542
Predictive Modeling
• Algorithms
  - Decision Trees
  - Logistic Regression
  - Random Forest
Predictive Modeling - Results
[Figure: bar charts of F1-Score (axis 0 to 0.14) and Savings (axis 0% to 8%) for Decision Trees, Logistic Regression and Random Forest; series: Training, Under-Sampling.]
Predictive Modeling - SMOTE
• Synthetic Minority Over-sampling Technique
[Figure: scatter plot over Dim 1 and Dim 2 showing the synthetic samples.]
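The idea behind SMOTE can be sketched in a few lines: a synthetic sample is placed at a random point on the segment between a minority example and one of its nearest minority neighbours. This is a simplified illustration in plain Python, not the implementation used in the study; the function name, the default `k` and the random point selection are assumptions.

```python
import random


def smote_sketch(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority samples by interpolating between neighbours.

    minority: list of equal-length numeric tuples (at least two points).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority points by squared Euclidean distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position between x and its neighbour.
        u = rng.random()
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


if __name__ == "__main__":
    churners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote_sketch(churners, n_synthetic=3):
        print(point)
```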
Predictive Modeling - SMOTE
• Dataset

Dataset          | N     | Churn   | $C_0$ (Euros)
Total            | 9410  | 4.83%   | 580,884
Training         | 3758  | 5.05%   | 244,542
Validation       | 2824  | 4.77%   | 174,171
Testing          | 2825  | 4.42%   | 162,171
Under-Sampling   | 374   | 50.80%  | 244,542
SMOTE            | 6988  | 48.94%  | 4,273,083
Predictive Modeling - SMOTE
[Figure: bar charts of F1-Score (axis 0 to 0.14) and Savings (axis 0% to 8%) for Decision Trees, Logistic Regression and Random Forest; series: Training, Under-Sampling, SMOTE.]
Predictive Modeling - SMOTE
• Sampling techniques help to improve the models' predictive power, but not necessarily the savings
• There is a need for methods that aim to increase savings
Cost-Sensitive Predictive Modeling
• Traditional methods assume the same cost for different errors
• Not the case in Churn modeling
• Some cost-sensitive methods assume a constant cost difference between
errors
• Example-Dependent Cost-Sensitive Predictive Modeling
Cost-Sensitive Predictive Modeling
• Changing class distribution
  - Cost Proportionate Rejection Sampling
  - Cost Proportionate Over Sampling
• Direct Cost
  - Bayes Minimum Risk
• Modifying a learning algorithm
  - CS – Decision Tree
Cost Proportionate Sampling
• Normalized cost weight
$w_i = \begin{cases} C_{FP_i} & \text{if } y_i = 0 \\ C_{FN_i} & \text{if } y_i = 1 \end{cases}$
$\bar{w}_i = \frac{w_i}{\max_j w_j}$ (normalized weight)
Cost Proportionate Sampling
• Cost Proportionate Over Sampling

Initial dataset, also written as tuples (example, $y_i$, $w_i$):
Example | $y_i$ | $w_i$
1       | 0     | 1
2       | 1     | 10
3       | 0     | 2
4       | 1     | 20
5       | 0     | 1

Cost proportionate dataset:
(1,0,1)
(2,1,1), (2,1,1), …, (2,1,1)
(3,0,2), (3,0,2)
(4,1,1), (4,1,1), (4,1,1), …, (4,1,1), (4,1,1)
(5,0,1)
*Elkan, C. (2001). The Foundations of Cost-Sensitive Learning.
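A minimal sketch of cost-proportionate over-sampling, assuming each example is simply replicated as many times as its (integer) cost weight, as the example above suggests. The function name and the rounding of non-integer weights are assumptions for illustration.

```python
def cost_proportionate_oversample(examples, weights):
    """Replicate each example in proportion to its cost weight."""
    resampled = []
    for example, weight in zip(examples, weights):
        # Assumption: non-integer weights are rounded to the nearest integer.
        resampled.extend([example] * int(round(weight)))
    return resampled


if __name__ == "__main__":
    # (example id, y_i) pairs and weights w_i from the table above.
    examples = [(1, 0), (2, 1), (3, 0), (4, 1), (5, 0)]
    weights = [1, 10, 2, 20, 1]
    resampled = cost_proportionate_oversample(examples, weights)
    print(len(resampled), resampled.count((4, 1)))
```

The resampled set grows with the size of the weights, which is why the over-sampled training sets in this deck are much larger than the original one.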
Cost Proportionate Sampling
• Cost Proportionate Rejection Sampling

Initial dataset, also written as tuples (example, $y_i$, $w_i$):
Example | $y_i$ | $w_i$ | Normalized weight $w_i / \max_j w_j$
1       | 0     | 1     | 0.05
2       | 1     | 10    | 0.5
3       | 0     | 2     | 0.1
4       | 1     | 20    | 1
5       | 0     | 1     | 0.05

Cost proportionate dataset:
(2,1,1)
(4,1,1)
(4,1,1)
(5,0,1)
*Zadrozny et al. (2003). Cost-sensitive learning by cost-proportionate example weighting.
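A minimal sketch of cost-proportionate rejection sampling, assuming each example is kept independently with probability equal to its normalized weight $w_i / \max_j w_j$. The function name and the fixed seed are assumptions for illustration; the draw is random, so different seeds keep different subsets.

```python
import random


def cost_proportionate_rejection_sample(examples, weights, seed=0):
    """Keep each example with probability w_i / max_j w_j."""
    rng = random.Random(seed)
    w_max = max(weights)
    kept = []
    for example, weight in zip(examples, weights):
        # rng.random() is in [0, 1), so the max-weight example is always kept.
        if rng.random() < weight / w_max:
            kept.append(example)
    return kept


if __name__ == "__main__":
    examples = [(1, 0), (2, 1), (3, 0), (4, 1), (5, 0)]
    weights = [1, 10, 2, 20, 1]
    print(cost_proportionate_rejection_sample(examples, weights))
```

Unlike over-sampling, this produces a dataset no larger than the original, with costly examples over-represented.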
Cost Proportionate Sampling
• Dataset

Dataset                   | N     | Churn   | $C_0$ (Euros)
Total                     | 9410  | 4.83%   | 580,884
Training                  | 3758  | 5.05%   | 244,542
Validation                | 2824  | 4.77%   | 174,171
Testing                   | 2825  | 4.42%   | 162,171
Under-Sampling            | 374   | 50.80%  | 244,542
SMOTE                     | 6988  | 48.94%  | 4,273,083
CS – Rejection-Sampling   | 428   | 41.35%  | 231,428
CS – Over-Sampling        | 5767  | 31.24%  | 2,350,285
Cost Proportionate Sampling
[Figure: bar charts of Savings (axis 0% to 25%) and F1-Score (axis 0 to 0.14) for Decision Trees, Logistic Regression and Random Forest; series: Training, Under, SMOTE, CS-Rejection, CS-Over.]
Bayes Minimum Risk
• Decision model based on quantifying tradeoffs between various decisions using probabilities and the costs that accompany such decisions
• Risk of classification
$R(c_i = 0 \mid x_i) = C_{TN_i}(1 - p_i) + C_{FN_i} \cdot p_i$
$R(c_i = 1 \mid x_i) = C_{FP_i}(1 - p_i) + C_{TP_i} \cdot p_i$
where $p_i$ is the estimated probability that customer $i$ is a churner
Bayes Minimum Risk
• Using the different risks, the prediction is made based on the following condition:
$c_i = \begin{cases} 0 & \text{if } R(c_i = 0 \mid x_i) \le R(c_i = 1 \mid x_i) \\ 1 & \text{otherwise} \end{cases}$
• Example-dependent threshold
$t_{BMR_i} = \frac{C_{FP_i} - C_{TN_i}}{C_{FN_i} - C_{TN_i} - C_{TP_i} + C_{FP_i}}$
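The decision rule and the threshold can be sketched as follows. This is an illustrative implementation of the two formulas above; the function names and the cost values in the usage example are hypothetical, and `p` is assumed to be a churn probability produced by any probabilistic classifier.

```python
def bmr_predict(p, c_tp, c_fp, c_fn, c_tn):
    """Predict 0 (non-churner) unless predicting 1 has strictly lower risk."""
    risk_0 = c_tn * (1 - p) + c_fn * p
    risk_1 = c_fp * (1 - p) + c_tp * p
    return 0 if risk_0 <= risk_1 else 1


def bmr_threshold(c_tp, c_fp, c_fn, c_tn):
    """Example-dependent probability threshold t_BMR.

    Assumes the denominator is positive, i.e. acting on a true churner
    is cheaper than missing one.
    """
    return (c_fp - c_tn) / (c_fn - c_tn - c_tp + c_fp)


if __name__ == "__main__":
    # Hypothetical costs for one customer: C_TP, C_FP, C_FN, C_TN.
    costs = (55.5, 11.0, 100.0, 0.0)
    print(bmr_threshold(*costs))
    print(bmr_predict(0.10, *costs), bmr_predict(0.30, *costs))
```

When the denominator is positive, predicting a churner whenever `p` exceeds `bmr_threshold(...)` gives the same decision as comparing the two risks directly.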
Bayes Minimum Risk
[Figure: bar chart of Savings (axis 0% to 35%) for Decision Trees, Logistic Regression and Random Forest, each without ("-") and with BMR; series: Training, Under-Sampling, SMOTE, CS-Rejection, CS-Over.]
Bayes Minimum Risk
[Figure: bar chart of F1-Score (axis 0 to 0.14) for Decision Trees, Logistic Regression and Random Forest, each without ("-") and with BMR; series: Training, Under-Sampling, SMOTE, CS-Rejection, CS-Over.]
Bayes Minimum Risk
• Bayes Minimum Risk increases the savings by first fitting a cost-insensitive method and then introducing the costs at prediction time
• Why not introduce the costs while the models are being estimated?
CS – Decision Trees
• Decision trees
  - Classification model that iteratively creates binary decision rules $(x^j, l^j_m)$ that maximize certain criteria
  - Where $(x^j, l^j_m)$ refers to making a rule using feature $j$ on value $m$
CS – Decision Trees
• Decision trees – Construction
• A rule $(x^j, l^j_m)$ splits the set $S$ into two subsets:
$S^l = \{ X_i \mid X_i \in S \wedge x_i^j \le l^j_m \}$, $S^r = \{ X_i \mid X_i \in S \wedge x_i^j > l^j_m \}$
• Then the impurity of each leaf is calculated using:
  - Misclassification: $I_m(\pi_1) = 1 - \max(\pi_1, 1 - \pi_1)$
  - Entropy: $I_e(\pi_1) = -\pi_1 \log \pi_1 - (1 - \pi_1) \log(1 - \pi_1)$
  - Gini: $I_g(\pi_1) = 2\pi_1(1 - \pi_1)$
where $\pi_1$ is the percentage of positives.
CS – Decision Trees
• Decision trees – Construction
• Afterwards the gain of applying a given rule to the set $S$ is:
$Gain(x^j, l^j_m) = I(\pi_1) - \frac{|S^l|}{|S|} I(\pi_1^l) - \frac{|S^r|}{|S|} I(\pi_1^r)$
where $S^l = \{ X_i \mid X_i \in S \wedge x_i^j \le l^j_m \}$ and $S^r = \{ X_i \mid X_i \in S \wedge x_i^j > l^j_m \}$ are the subsets produced by the rule $(x^j, l^j_m)$
CS – Decision Trees
• Decision trees – Construction
• The rule that maximizes the gain is selected:
$(best_x, best_l) = \operatorname{argmax}_{(j,m)} Gain(x^j, l^j_m)$
• The process is repeated until a stopping criterion is met
[Figure: binary tree grown by recursively splitting the set S.]
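The construction steps (impurity, gain, and selection of the best rule) can be sketched as below. This is a simplified illustration for a single split on small in-memory data, not a full tree learner; the function names are hypothetical, candidate thresholds are taken from the observed feature values, and the entropy uses base-2 logarithms, which is an assumption since the slides do not state the base.

```python
import math


def misclassification(pi1):
    return 1 - max(pi1, 1 - pi1)


def entropy(pi1):
    if pi1 in (0.0, 1.0):
        return 0.0  # convention: 0 * log(0) = 0
    return -pi1 * math.log2(pi1) - (1 - pi1) * math.log2(1 - pi1)


def gini(pi1):
    return 2 * pi1 * (1 - pi1)


def positive_fraction(labels):
    return sum(labels) / len(labels) if labels else 0.0


def gain(rows, labels, feature, threshold, impurity=gini):
    """Impurity reduction from splitting on rows[i][feature] <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] > threshold]
    n = len(labels)
    return (impurity(positive_fraction(labels))
            - len(left) / n * impurity(positive_fraction(left))
            - len(right) / n * impurity(positive_fraction(right)))


def best_split(rows, labels, impurity=gini):
    """Return the (feature, threshold) pair with the highest gain."""
    candidates = [(j, x[j]) for x in rows for j in range(len(x))]
    return max(candidates, key=lambda r: gain(rows, labels, r[0], r[1], impurity))


if __name__ == "__main__":
    rows = [(1.0,), (2.0,), (3.0,), (4.0,)]
    labels = [0, 0, 1, 1]
    print(best_split(rows, labels))
```

A full tree would apply `best_split` recursively to each resulting subset until a stopping criterion is met.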
CS – Decision Trees
• Decision trees - Pruning
• The error of the tree, $\epsilon(Tree)$, and of the pruned tree, $\epsilon(EB(Tree, branch))$, are calculated, where $EB(Tree, branch)$ is the tree after eliminating a given branch
• The pruning criterion is calculated for all possible pruned trees:
$\frac{\epsilon(EB(Tree, branch)) - \epsilon(Tree)}{|Tree| - |EB(Tree, branch)|}$
• The branch with the maximum improvement is selected and the tree is pruned
• The process is then repeated until there is no further improvement
[Figure: a tree being pruned step by step.]
CS – Decision Trees
• Maximizing accuracy is different from minimizing cost
• To address this, some studies have proposed methods that introduce cost-sensitivity into the algorithms
• However, research has focused on class-dependent methods. Instead, we used:
  - An example-dependent cost-based impurity measure
  - An example-dependent cost-based pruning criterion
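CS – Decision Trees
• Cost based impurity measure
• The impurity of each leaf is calculated using:
$I_c(S) = \min(C_0, C_1)$
$f(S) = \begin{cases} 0 & \text{if } C_0 \le C_1 \\ 1 & \text{otherwise} \end{cases}$
where $C_0$ and $C_1$ are the costs of predicting every example in $S$ as a non-churner or as a churner, respectively, and $f(S)$ is the class assigned to the leaf
• The subsets $S^l = \{ X_i \mid X_i \in S \wedge x_i^j \le l^j_m \}$ and $S^r = \{ X_i \mid X_i \in S \wedge x_i^j > l^j_m \}$ produced by a rule $(x^j, l^j_m)$ are evaluated with this measure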
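A minimal sketch of the cost-based impurity, assuming each example carries its own tuple of costs (C_TP, C_FP, C_FN, C_TN) and that $C_0$ and $C_1$ are the total costs of labelling every example in the leaf as 0 or as 1. The function names and the cost values in the usage example are hypothetical.

```python
def leaf_costs(labels, costs):
    """Return (C_0, C_1): cost of predicting all examples as 0, and as 1."""
    c0 = 0.0
    c1 = 0.0
    for y, (c_tp, c_fp, c_fn, c_tn) in zip(labels, costs):
        c0 += c_fn if y == 1 else c_tn  # predicted non-churner
        c1 += c_tp if y == 1 else c_fp  # predicted churner
    return c0, c1


def cost_impurity(labels, costs):
    """I_c(S) = min(C_0, C_1)."""
    return min(leaf_costs(labels, costs))


def leaf_prediction(labels, costs):
    """f(S): the cheaper of the two constant predictions (ties go to 0)."""
    c0, c1 = leaf_costs(labels, costs)
    return 0 if c0 <= c1 else 1


if __name__ == "__main__":
    costs = [(55.5, 11.0, 100.0, 0.0), (55.5, 11.0, 100.0, 0.0)]
    print(cost_impurity([1, 0], costs), leaf_prediction([1, 0], costs))
```

With this measure a leaf is "pure" when one constant prediction is cheap, which may differ from the majority class.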
CS – Decision Trees
• Cost sensitive pruning
$PC_c = \frac{C(EB(Tree, branch)) - C(Tree)}{|Tree| - |EB(Tree, branch)|}$
• New pruning criterion that evaluates the improvement in cost of eliminating a particular branch
CS – Decision Trees
[Figure: bar chart of Savings (axis 0% to 50%) for Decision Trees and Cost-Sensitive Decision Trees, with Error Pruning and Cost Pruning; series: Training, Under-Sampling, SMOTE, CS-Rejection, CS-Over.]
CS – Decision Trees
[Figure: bar chart of F1-Score (axis 0 to 0.3); series: Training, Under-Sampling, SMOTE, CS-Rejection, CS-Over.]
Comparison of Models
[Figure: bar chart of Savings and F1-Score (axis 0% to 50%) for: Random Forest (Train); Logistic Regression (CSRejection); Logistic Regression BMR (Train); Decision Tree CostPruning (CSRejection); CS-Decision Tree (Train).]
Conclusions
• Selecting models based on traditional statistics does not give the best results measured by savings
• Incorporating the costs into the modeling helps to achieve
higher savings
Other Applications
• Fraud Detection
  - Correa Bahnsen et al. (2013). Cost Sensitive Credit Card Fraud Detection using Bayes Minimum Risk.
  - Correa Bahnsen et al. (2014). Improving Credit Card Fraud Detection with Calibrated Probabilities.
• Credit Scoring
  - Correa Bahnsen et al. (2014). Example-Dependent Cost-Sensitive Credit Scoring using Bayes Minimum Risk.
• Direct Marketing
  - Correa Bahnsen et al. (2014). Example-Dependent Cost-Sensitive Decision Trees.
Contact Information
Alejandro Correa Bahnsen
University of Luxembourg
Luxembourg
al.bahnsen@gmail.com
http://www.linkedin.com/in/albahnsen
http://www.slideshare.net/albahnsen
Andres Gonzalez Montoya
DIRECTV
Colombia
andrezfg@gmail.com
Thank you!
Alejandro Correa Bahnsen, Luxembourg University
Andres Felipe Gonzalez Montoya, DIRECTV

 
"Paltr Packaging: Streamlined Order Process for Seamless Deliveries"
"Paltr Packaging: Streamlined Order Process for Seamless Deliveries""Paltr Packaging: Streamlined Order Process for Seamless Deliveries"
"Paltr Packaging: Streamlined Order Process for Seamless Deliveries"
 
Tumkur Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tumkur
Tumkur Escorts Service Girl ^ 9332606886, WhatsApp Anytime TumkurTumkur Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tumkur
Tumkur Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tumkur
 
💚Call Girl In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girls No💰Advance Cash...
💚Call Girl In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girls No💰Advance Cash...💚Call Girl In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girls No💰Advance Cash...
💚Call Girl In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girls No💰Advance Cash...
 
Top 20: Best & Hottest Russian Pornstars Right Now (2024) Russian Porn Stars ...
Top 20: Best & Hottest Russian Pornstars Right Now (2024) Russian Porn Stars ...Top 20: Best & Hottest Russian Pornstars Right Now (2024) Russian Porn Stars ...
Top 20: Best & Hottest Russian Pornstars Right Now (2024) Russian Porn Stars ...
 
Kharar Call Girls Service✔️ 9915851334 ✔️Call Now Ranveer📲 Zirakpur Escort Se...
Kharar Call Girls Service✔️ 9915851334 ✔️Call Now Ranveer📲 Zirakpur Escort Se...Kharar Call Girls Service✔️ 9915851334 ✔️Call Now Ranveer📲 Zirakpur Escort Se...
Kharar Call Girls Service✔️ 9915851334 ✔️Call Now Ranveer📲 Zirakpur Escort Se...
 

Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

  • 1. Copyright © 2014 SAS Institute Inc. All rights reserved. #analytics2014 Maximizing a Churn Campaign’s Profitability With Cost-Sensitive Predictive Analytics Alejandro Correa Bahnsen, Luxembourg University Andres Felipe Gonzalez Montoya, DIRECTV
  • 2. Agenda • Churn modeling • Evaluation Measures • Offers • Predictive modeling • Cost-Sensitive Predictive Modeling (Cost Proportionate Sampling, Bayes Minimum Risk, CS – Decision Trees) • Conclusions
  • 3. Churn Modeling • Detect which customers are likely to abandon • Voluntary churn • Involuntary churn
  • 4. Customer Churn Management Campaign [Figure: campaign flow diagram. New customers flow into the customer base of active customers; the churn model splits them into predicted churners (TP: actual churners, FP: actual non-churners) and predicted non-churners (FN: actual churners, TN: actual non-churners); effective churners flow out. Arrow labels: 1, 𝛾, 1 − 𝛾.] *Verbraken et al. (2013). A novel profit maximizing metric for measuring classification performance of customer churn prediction models.
  • 5. Evaluation of a Campaign
      • Confusion matrix (predicted class $c_i$ against true class $y_i$):

                                             Churner ($y_i=1$)   Non-Churner ($y_i=0$)
            Predicted churner ($c_i=1$)            TP                   FP
            Predicted non-churner ($c_i=0$)        FN                   TN

      • Accuracy = $\frac{TP+TN}{TP+TN+FP+FN}$
      • Recall = $\frac{TP}{TP+FN}$
      • Precision = $\frac{TP}{TP+FP}$
      • F1-Score = $2\,\frac{Precision \cdot Recall}{Precision+Recall}$
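The four measures on this slide can be computed directly from the confusion-matrix counts. The following is a minimal sketch; the function name and the example counts are illustrative assumptions, not values from the presentation.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, recall, precision and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=30, fp=20, fn=10, tn=940)
```

Note that accuracy is high in this hypothetical case mostly because non-churners dominate, which is why the later slides move to cost-based measures.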
  • 6. Evaluation of a Campaign • However, these measures assign the same weight to different errors • This is not the case in a churn model, since: failing to predict a churner carries a different cost than wrongly predicting a non-churner; and churners have different financial impact
  • 7. Financial Evaluation of a Campaign [Figure: the campaign flow diagram of slide 4 annotated with costs: 0, $CLV$, $CLV + C_a$, $C_o + C_a$.] *Verbraken et al. (2013). A novel profit maximizing metric for measuring classification performance of customer churn prediction models.
  • 8. Financial Evaluation of a Campaign
      • Cost matrix:

                                             Churner ($y_i=1$)   Non-Churner ($y_i=0$)
            Predicted churner ($c_i=1$)          $C_{TP_i}$           $C_{FP_i}$
            Predicted non-churner ($c_i=0$)      $C_{FN_i}$           $C_{TN_i}$

        $C_{TP_i} = \gamma_i C_{o_i} + (1 - \gamma_i)(CLV_i + C_a)$
        $C_{FP_i} = C_{o_i} + C_a$
        $C_{FN_i} = CLV_i$
        $C_{TN_i} = 0$

        where:
        $C_a$ = Administrative cost
        $CLV_i$ = Client Lifetime Value of customer $i$
        $C_{o_i}$ = Cost of the offer made to customer $i$
        $\gamma_i$ = Probability that customer $i$ accepts the offer
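As a sketch of how the cost matrix could be evaluated for a single customer, the function below maps the slide's symbols to arguments. The function name, the dictionary layout and the numeric inputs are assumptions made for illustration.

```python
def churn_costs(clv, offer_cost, gamma, admin_cost):
    """Example-dependent costs for one customer.

    clv        -> CLV_i, customer lifetime value
    offer_cost -> C_o_i, cost of the offer made to the customer
    gamma      -> gamma_i, probability that the customer accepts the offer
    admin_cost -> C_a, administrative cost of contacting the customer
    """
    return {
        # Contacted churner: accepts the offer with probability gamma,
        # otherwise leaves anyway after the contact cost was spent.
        "TP": gamma * offer_cost + (1 - gamma) * (clv + admin_cost),
        # Contacted non-churner: offer and contact cost are spent needlessly.
        "FP": offer_cost + admin_cost,
        # Missed churner: the whole lifetime value is lost.
        "FN": clv,
        # Correctly ignored non-churner: no cost.
        "TN": 0.0,
    }


# Hypothetical customer, for illustration only.
costs = churn_costs(clv=1000.0, offer_cost=100.0, gamma=0.5, admin_cost=10.0)
```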
  • 9. Financial Evaluation of a Campaign
      • Using the cost matrix, the total cost is calculated as:
        $C = \sum_{i} \left[ y_i \left( c_i C_{TP_i} + (1 - c_i) C_{FN_i} \right) + (1 - y_i) \left( c_i C_{FP_i} + (1 - c_i) C_{TN_i} \right) \right]$
      • Additionally, the savings are defined as:
        $C_s = \frac{C_0 - C}{C_0}$
        where $C_0$ is the cost when all the customers are predicted as non-churners
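The total cost and the savings can be sketched as follows. The per-customer cost dictionaries use the keys "TP", "FP", "FN" and "TN"; that layout, the function names and the three example customers are assumptions for illustration.

```python
def total_cost(y_true, y_pred, costs):
    """Sum the example-dependent cost of each prediction."""
    total = 0.0
    for y, c, k in zip(y_true, y_pred, costs):
        total += (y * (c * k["TP"] + (1 - c) * k["FN"])
                  + (1 - y) * (c * k["FP"] + (1 - c) * k["TN"]))
    return total


def savings(y_true, y_pred, costs):
    """Relative cost reduction against predicting everyone as non-churner."""
    base = total_cost(y_true, [0] * len(y_true), costs)  # C_0
    if base == 0:
        return 0.0
    return (base - total_cost(y_true, y_pred, costs)) / base


# Hypothetical data: two churners, one non-churner; one churner is missed.
y_true = [1, 0, 1]
y_pred = [1, 0, 0]
costs = [
    {"TP": 555.0, "FP": 110.0, "FN": 1000.0, "TN": 0.0},
    {"TP": 300.0, "FP": 110.0, "FN": 500.0, "TN": 0.0},
    {"TP": 450.0, "FP": 110.0, "FN": 800.0, "TN": 0.0},
]
campaign_cost = total_cost(y_true, y_pred, costs)
campaign_savings = savings(y_true, y_pred, costs)
```

A model that predicts no churners at all has savings of exactly zero under this definition, which gives the measure a natural baseline.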
  • 10. Financial Evaluation of a Campaign • Customer Lifetime Value *Glady et al. (2009). Modeling churn using customer lifetime value.
  • 11. Agenda • Churn modeling • Evaluation Measures • Offers • Predictive modeling • Cost-Sensitive Predictive Modeling (Cost Proportionate Sampling, Bayes Minimum Risk, CS – Decision Trees) • Conclusions
  • 12. Offers • The same offer may not apply to all customers (e.g., some already have premium channels) • An offer should be made such that it maximizes the probability of acceptance (𝛾) and the CLV
  • 13. Offers clusters
  • 14. Offers Analysis • Upgrade to HD DVR • Monthly Discount • Premium Channels • Evaluate Offers Performance
  • 15. Offers Analysis [Figure: churn rate and gamma (right axis) for Cluster 1 to Cluster 4.] 𝛾 = Probability that a customer accepts the offer
  • 16. Predictive Modeling • Using predictive analytics to detect the behavioral patterns of those customers who have defected in the past
  • 17. Predictive Modeling • Then check which of the current customers share the same patterns
  • 18. Predictive Modeling • Dataset

        Dataset           N      Churn    $C_0$ (Euros)
        Total             9410   4.83%    580,884
        Training          3758   5.05%    244,542
        Validation        2824   4.77%    174,171
        Testing           2825   4.42%    162,171
        Under-Sampling    374    50.80%   244,542
  • 19. Predictive Modeling • Algorithms: Decision Trees, Logistic Regression, Random Forest
  • 20. Predictive Modeling - Results [Figure: two bar charts, F1-Score and Savings, for Decision Trees, Logistic Regression and Random Forest; series: Training, Under-Sampling.]
  • 21. Predictive Modeling - SMOTE • Synthetic Minority Over-sampling Technique [Figure: synthetic samples in a two-dimensional feature space (Dim 1, Dim 2).]
  • 22. Predictive Modeling - SMOTE • Dataset

        Dataset           N      Churn    $C_0$ (Euros)
        Total             9410   4.83%    580,884
        Training          3758   5.05%    244,542
        Validation        2824   4.77%    174,171
        Testing           2825   4.42%    162,171
        Under-Sampling    374    50.80%   244,542
        SMOTE             6988   48.94%   4,273,083
  • 23. Predictive Modeling - SMOTE [Figure: two bar charts, F1-Score and Savings, for Decision Trees, Logistic Regression and Random Forest; series: Training, Under-Sampling, SMOTE.]
  • 24. Predictive Modeling - SMOTE • Sampling techniques help to improve the models' predictive power, but not necessarily the savings • There is a need for methods that aim to increase savings
  • 25. Agenda • Churn modeling • Evaluation Measures • Offers • Predictive modeling • Cost-Sensitive Predictive Modeling (Cost Proportionate Sampling, Bayes Minimum Risk, CS – Decision Trees) • Conclusions
  • 26. Cost-Sensitive Predictive Modeling • Traditional methods assume the same cost for different errors • This is not the case in churn modeling • Some cost-sensitive methods assume a constant cost difference between errors • Example-Dependent Cost-Sensitive Predictive Modeling
  • 27. Cost-Sensitive Predictive Modeling • Changing class distribution: Cost Proportionate Rejection Sampling, Cost Proportionate Over Sampling • Direct Cost: Bayes Minimum Risk • Modifying a learning algorithm: CS – Decision Tree
  • 28. Cost Proportionate Sampling • Normalized cost weight
        $w_i = \begin{cases} C_{FP_i} & \text{if } y_i = 0 \\ C_{FN_i} & \text{if } y_i = 1 \end{cases}$
        $\bar{w}_i = \frac{w_i}{\max_j w_j}$
  • 29. Cost Proportionate Sampling • Cost Proportionate Over Sampling

        Initial dataset, as (example, $y_i$, $w_i$):

        Example   $y_i$   $w_i$
        1         0       1
        2         1       10
        3         0       2
        4         1       20
        5         0       1

        Cost proportionate dataset:
        (1,0,1)
        (2,1,1), (2,1,1), …, (2,1,1)
        (3,0,2), (3,0,2)
        (4,1,1), (4,1,1), (4,1,1), …, (4,1,1), (4,1,1)
        (5,0,1)

        *Elkan, C. (2001). The Foundations of Cost-Sensitive Learning.
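A minimal sketch of cost-proportionate over-sampling on the slide's five-example dataset follows. It assumes the weights are positive integers so that each example can simply be repeated $w_i$ times; real-valued costs would first need scaling and rounding, which is not covered here. The function name and the (example, label) output format are illustrative.

```python
def cost_proportionate_oversample(examples):
    """Repeat each (example_id, y, w) record w times, dropping the weight.

    Assumes w is a positive integer (an assumption of this sketch).
    """
    resampled = []
    for example_id, y, w in examples:
        resampled.extend([(example_id, y)] * int(w))
    return resampled


# Initial dataset from the slide: (example, y_i, w_i).
initial = [(1, 0, 1), (2, 1, 10), (3, 0, 2), (4, 1, 20), (5, 0, 1)]
oversampled = cost_proportionate_oversample(initial)
churn_share = sum(y for _, y in oversampled) / len(oversampled)
```

Because the costly examples here are the churners, the churn share of the resampled set rises sharply, which shows how this technique also changes the class distribution.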
  • 30. Cost Proportionate Sampling • Cost Proportionate Rejection Sampling

        Initial dataset, as (example, $y_i$, $w_i$), with the normalized weight $\bar{w}_i$:

        Example   $y_i$   $w_i$   $\bar{w}_i$
        1         0       1       0.05
        2         1       10      0.5
        3         0       2       0.1
        4         1       20      1
        5         0       1       0.05

        Cost proportionate dataset:
        (2,1,1) (4,1,1) (4,1,1) (5,0,1)

        *Zadrozny et al. (2003). Cost-sensitive learning by cost-proportionate example weighting.
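Rejection sampling can be sketched as keeping each example with probability equal to its normalized weight. The version below is an assumption-laden illustration: the function names, the fixed random seed and the single pass over the data are choices of this sketch, and the sampled result will generally differ from the dataset shown on the slide.

```python
import random


def normalized_weights(examples):
    """Divide each weight by the largest weight in the dataset."""
    w_max = max(w for _, _, w in examples)
    return [w / w_max for _, _, w in examples]


def cost_proportionate_rejection_sample(examples, rng=None):
    """Keep each (example_id, y, w) record with probability w / max(w)."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    kept = []
    for (example_id, y, _), w_bar in zip(examples, normalized_weights(examples)):
        # rng.random() lies in [0, 1), so an example with w_bar == 1 is always kept.
        if rng.random() < w_bar:
            kept.append((example_id, y))
    return kept


# Initial dataset from the slide: (example, y_i, w_i).
initial = [(1, 0, 1), (2, 1, 10), (3, 0, 2), (4, 1, 20), (5, 0, 1)]
weights = normalized_weights(initial)
sample = cost_proportionate_rejection_sample(initial)
```

Unlike over-sampling, this produces a dataset no larger than the original, which matches the small size of the CS – Rejection-Sampling set reported in the dataset table.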
  • 31. Cost Proportionate Sampling • Dataset

        Dataset                    N      Churn    $C_0$ (Euros)
        Total                      9410   4.83%    580,884
        Training                   3758   5.05%    244,542
        Validation                 2824   4.77%    174,171
        Testing                    2825   4.42%    162,171
        Under-Sampling             374    50.80%   244,542
        SMOTE                      6988   48.94%   4,273,083
        CS – Rejection-Sampling    428    41.35%   231,428
        CS – Over-Sampling         5767   31.24%   2,350,285
  • 32. Cost Proportionate Sampling [Figure: two bar charts, Savings and F1-Score, for Decision Trees, Logistic Regression and Random Forest; series: Training, Under, SMOTE, CS-Rejection, CS-Over.]
  • 33. Bayes Minimum Risk
      • Decision model based on quantifying tradeoffs between various decisions using probabilities and the costs that accompany such decisions
      • Risk of classification:
        $R(c_i = 0 \mid x_i) = C_{TN_i}(1 - p_i) + C_{FN_i} \cdot p_i$
        $R(c_i = 1 \mid x_i) = C_{FP_i}(1 - p_i) + C_{TP_i} \cdot p_i$
  • 34. Bayes Minimum Risk
      • Using the different risks, the prediction is made based on the following condition:
        $c_i = \begin{cases} 0 & \text{if } R(c_i = 0 \mid x_i) \le R(c_i = 1 \mid x_i) \\ 1 & \text{otherwise} \end{cases}$
      • Example-dependent threshold:
        $t_{BMR_i} = \frac{C_{FP_i} - C_{TN_i}}{C_{FN_i} - C_{TN_i} - C_{TP_i} + C_{FP_i}}$
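The decision rule and the example-dependent threshold can be sketched as below. Here `p` stands for the estimated churn probability $p_i$ from any probabilistic classifier; the function names, the cost dictionary layout and the numbers are assumptions for illustration. The threshold form additionally assumes the denominator is positive.

```python
def bmr_predict(p, costs):
    """Predict 0 (non-churner) or 1 (churner) by comparing the two risks."""
    risk_0 = costs["TN"] * (1 - p) + costs["FN"] * p
    risk_1 = costs["FP"] * (1 - p) + costs["TP"] * p
    return 0 if risk_0 <= risk_1 else 1


def bmr_threshold(costs):
    """Probability above which predicting churn has the lower risk."""
    return (costs["FP"] - costs["TN"]) / (
        costs["FN"] - costs["TN"] - costs["TP"] + costs["FP"]
    )


# Hypothetical customer costs, for illustration only.
costs = {"TP": 555.0, "FP": 110.0, "FN": 1000.0, "TN": 0.0}
threshold = bmr_threshold(costs)
low_risk_customer = bmr_predict(0.1, costs)
high_risk_customer = bmr_predict(0.3, costs)
```

With these hypothetical costs the threshold is about 0.2, far below the usual 0.5, because missing a churner is much more expensive than making an unnecessary offer.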
  • 35. Bayes Minimum Risk [Figure: bar chart of Savings for Decision Trees, Logistic Regression and Random Forest, each without and with BMR; series: Training, Under-Sampling, SMOTE, CS-Rejection, CS-Over.]
  • 36. Bayes Minimum Risk [Figure: bar chart of F1-Score for Decision Trees, Logistic Regression and Random Forest, each without and with BMR; series: Training, Under-Sampling, SMOTE, CS-Rejection, CS-Over.]
  • 37. Bayes Minimum Risk • Bayes Minimum Risk increases the savings by using a cost-insensitive method and then introducing the costs • Why not introduce the costs during the estimation of the methods?
  • 38. CS – Decision Trees • Decision trees: classification model that iteratively creates binary decision rules $(x^j, l_m^j)$ that maximize certain criteria, where $(x^j, l_m^j)$ refers to making a rule using feature $j$ on value $m$
  • 39. CS – Decision Trees • Decision trees – Construction
      • A rule $(x^j, l_m^j)$ splits the set $S$ into two leaves:
        $S^l = \{ X_i \mid X_i \in S \wedge x_i^j \le l_m^j \}$
        $S^r = \{ X_i \mid X_i \in S \wedge x_i^j > l_m^j \}$
      • Then the impurity of each leaf is calculated using:
        Misclassification: $I_m(\pi_1) = 1 - \max(\pi_1, 1 - \pi_1)$
        Entropy: $I_e(\pi_1) = -\pi_1 \log \pi_1 - (1 - \pi_1) \log(1 - \pi_1)$
        Gini: $I_g(\pi_1) = 2\pi_1(1 - \pi_1)$
        where $\pi_1$ is the percentage of positives.
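The three impurity measures are short enough to write out directly. In this sketch the natural logarithm is used for the entropy, which is an assumption since the slide does not state the base, and the pure-leaf cases are handled explicitly because $\log 0$ is undefined.

```python
import math


def misclassification(pi1):
    """I_m: share of examples not in the majority class."""
    return 1 - max(pi1, 1 - pi1)


def entropy(pi1):
    """I_e with natural logarithm; a pure leaf has zero entropy."""
    if pi1 <= 0.0 or pi1 >= 1.0:
        return 0.0
    return -pi1 * math.log(pi1) - (1 - pi1) * math.log(1 - pi1)


def gini(pi1):
    """I_g for the two-class case."""
    return 2 * pi1 * (1 - pi1)
```

All three are zero for a pure leaf and reach their maximum at $\pi_1 = 0.5$.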
  • 40. CS – Decision Trees • Decision trees – Construction
      • Afterwards, the gain of applying a given rule to the set $S$ is:
        $Gain(x^j, l_m^j) = I(\pi_1) - \frac{|S^l|}{|S|} I(\pi_1^l) - \frac{|S^r|}{|S|} I(\pi_1^r)$
        with $S^l = \{ X_i \mid X_i \in S \wedge x_i^j \le l_m^j \}$ and $S^r = \{ X_i \mid X_i \in S \wedge x_i^j > l_m^j \}$
  • 41. CS – Decision Trees • Decision trees – Construction
      • The rule that maximizes the gain is selected:
        $(best_x, best_l) = \arg\max_{(j,m)} Gain(x^j, l_m^j)$
      • The process is repeated until a stopping criterion is met. [Figure: a tree grown by successive splits of $S$.]
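For a single feature, the gain of a split and the choice of the best split value can be sketched as follows. The Gini impurity is used as the default, the candidate split values are taken to be the observed feature values, and all names and the toy data are assumptions of this sketch rather than the presenters' implementation.

```python
def gini(pi1):
    """Two-class Gini impurity."""
    return 2 * pi1 * (1 - pi1)


def positive_share(labels):
    """Share of positives (pi_1); zero for an empty leaf."""
    return sum(labels) / len(labels) if labels else 0.0


def gain(values, labels, threshold, impurity=gini):
    """Impurity reduction from splitting on value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return (impurity(positive_share(labels))
            - len(left) / n * impurity(positive_share(left))
            - len(right) / n * impurity(positive_share(right)))


def best_threshold(values, labels):
    """Split value with the largest gain among the observed values."""
    return max(sorted(set(values)), key=lambda t: gain(values, labels, t))


# Toy feature and labels, for illustration only.
values = [1, 2, 3, 4]
labels = [0, 0, 1, 1]
best = best_threshold(values, labels)
```

A full tree builder would repeat this search over every feature and recurse into both leaves until a stopping criterion is met.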
  • 42. CS – Decision Trees • Decision trees - Pruning
      • The error of the tree and of each pruned tree is calculated, giving the pruning criterion:
        $PC = \frac{\epsilon(EB(Tree, branch)) - \epsilon(Tree)}{|Tree| - |EB(Tree, branch)|}$
      • After calculating the pruning criterion for all possible pruned trees, the one with the maximum improvement is selected and the tree is pruned.
      • The process is then repeated until there is no further improvement.
  • 43. CS – Decision Trees • Maximizing the accuracy is not the same as minimizing the cost • To address this, some studies have proposed methods that introduce cost-sensitivity into the algorithms • However, research has focused on class-dependent methods • Instead we used: an example-dependent cost-based impurity measure, and an example-dependent cost-based pruning criterion
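  • 44. CS – Decision Trees • Cost based impurity measure
      • The impurity of each leaf is calculated using:
        $I_c(S) = \min\{C_0, C_1\}$
        $f(S) = \begin{cases} 0 & \text{if } C_0 \le C_1 \\ 1 & \text{otherwise} \end{cases}$
        where $C_0$ and $C_1$ are read here as the costs of predicting every example in $S$ as non-churner and as churner, respectively
      • A rule $(x^j, l_m^j)$ splits $S$ into $S^l = \{ X_i \mid X_i \in S \wedge x_i^j \le l_m^j \}$ and $S^r = \{ X_i \mid X_i \in S \wedge x_i^j > l_m^j \}$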
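A sketch of the cost-based impurity of a leaf follows, under the reading that $C_0$ and $C_1$ are the total costs of labeling every example in the leaf as non-churner or as churner. That reading, the cost dictionary layout and the example values are assumptions of this sketch.

```python
def cost_impurity(labels, costs):
    """Return (I_c(S), f(S)): the cheaper all-same-label cost and that label."""
    # Cost of predicting every example in the leaf as non-churner (C_0).
    cost_all_0 = sum(k["FN"] if y == 1 else k["TN"]
                     for y, k in zip(labels, costs))
    # Cost of predicting every example in the leaf as churner (C_1).
    cost_all_1 = sum(k["TP"] if y == 1 else k["FP"]
                     for y, k in zip(labels, costs))
    leaf_label = 0 if cost_all_0 <= cost_all_1 else 1
    return min(cost_all_0, cost_all_1), leaf_label


# Hypothetical leaf with one churner and two non-churners.
k = {"TP": 555.0, "FP": 110.0, "FN": 1000.0, "TN": 0.0}
mixed_leaf = cost_impurity([1, 0, 0], [k, k, k])
pure_leaf = cost_impurity([0, 0, 0], [k, k, k])
```

In the mixed leaf the single churner is expensive enough that labeling the whole leaf as churner is cheaper, even though churners are the minority there.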
  • 45. CS – Decision Trees • Cost sensitive pruning
        $PC_c = \frac{C(EB(Tree, branch)) - C(Tree)}{|Tree| - |EB(Tree, branch)|}$
      • New pruning criterion that evaluates the improvement in cost of eliminating a particular branch
  • 46. CS – Decision Trees [Figure: bar chart of Savings; x-axis labels: Error Pruning, Cost Pruning, Decision Trees, Cost-Sensitive Decision Trees; series: Training, Under-Sampling, SMOTE, CS-Rejection, CS-Over.]
  • 47. CS – Decision Trees [Figure: bar chart of F1-Score; series: Training, Under-Sampling, SMOTE, CS-Rejection, CS-Over.]
  • 48. Comparison of Models [Figure: bar chart of Savings and F1-Score for Random Forest (Train), Logistic Regression (CS-Rejection), Logistic Regression BMR (Train), Decision Tree Cost Pruning (CS-Rejection) and CS-Decision Tree (Train).]
  • 49. Conclusions • Selecting models based on traditional statistics does not give the best results as measured by savings • Incorporating the costs into the modeling helps to achieve higher savings
  • 50. Other Applications
      • Fraud Detection
          • Correa Bahnsen et al. (2013). Cost Sensitive Credit Card Fraud Detection using Bayes Minimum Risk.
          • Correa Bahnsen et al. (2014). Improving Credit Card Fraud Detection with Calibrated Probabilities.
      • Credit Scoring
          • Correa Bahnsen et al. (2014). Example-Dependent Cost-Sensitive Credit Scoring using Bayes Minimum Risk.
      • Direct Marketing
          • Correa Bahnsen et al. (2014). Example-Dependent Cost-Sensitive Decision Trees.
  • 51. Contact Information
      • Alejandro Correa Bahnsen, University of Luxembourg, Luxembourg. al.bahnsen@gmail.com http://www.linkedin.com/in/albahnsen http://www.slideshare.net/albahnsen
      • Andres Gonzalez Montoya, DIRECTV, Colombia. andrezfg@gmail.com
  • 52. Thank you! Alejandro Correa Bahnsen, Luxembourg University. Andres Felipe Gonzalez Montoya, DIRECTV