When Model Interpretation Matters:
Understanding Complex Predictive Models
Dean Abbott
Co-Founder and Chief Data Scientist, SmarterHQ
Twitter: @deanabb
Your Boss
Simple Models…Simple Story
Variable Importance in Linear Regression
Variable Importance in Decision Trees
• Decision Trees
– You think they are easy to explain?
Then We Do This
Variable Importance in Neural Networks
• Huh?
Neural Networks:
Interpretation via Sensitivities
Variable Importance in Neural Networks: Weights
Other Ways to Compute Neural Network Sensitivities
Such as… http://www.palisade.com/downloads/pdf/academic/DTSpaper110915.pdf
And ftp://ftp.sas.com/pub/neural/importance.html#mlp_parder_interp
• Weight tracing – sum of product of weights (and variants)
• Partial derivatives – avg, avg absolute, squared, etc.
• Remove variable, compute change in accuracy
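As a rough illustration of the partial-derivative approach listed above, the sketch below estimates the average absolute partial derivative of a model's output with respect to each input, using central finite differences. The function name `avg_abs_partial_derivatives`, the toy `tanh` model, and its weights are assumptions made for illustration; they do not come from the slides or the linked references.

```python
import math

def avg_abs_partial_derivatives(predict, records, eps=1e-4):
    """Average |d prediction / d input_j| over records, via central differences."""
    n_inputs = len(records[0])
    sums = [0.0] * n_inputs
    for rec in records:
        for j in range(n_inputs):
            up, down = list(rec), list(rec)
            up[j] += eps
            down[j] -= eps
            sums[j] += abs(predict(up) - predict(down)) / (2 * eps)
    return [s / len(records) for s in sums]

# Hypothetical one-neuron "network": input 0 carries 8x the weight of input 1.
def toy_model(x):
    return math.tanh(0.8 * x[0] - 0.1 * x[1])

records = [[0.1, 0.5], [-0.3, 0.2], [0.4, -0.6], [0.0, 0.9]]
sensitivities = avg_abs_partial_derivatives(toy_model, records)
print(sensitivities)
```

For this toy model the two sensitivities differ by the ratio of the weights, which is the kind of ranking a sensitivity analysis is meant to expose.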
Naïve Bayes Model Outputs
Essentially a series of cross-tabs for every variable!
Remember, the final probability is the product of the individual variable probabilities.
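A minimal sketch of the "cross-tabs, then multiply" idea, assuming categorical inputs and no smoothing. The toy records, the field names (`recent`, `big`, `respond`), and both function names are hypothetical and only serve the illustration.

```python
from collections import Counter, defaultdict

def fit_naive_bayes(rows, target_key):
    """Build one cross-tab (value counts per class) for every input variable."""
    class_counts = Counter(r[target_key] for r in rows)
    crosstabs = defaultdict(Counter)  # (variable, class) -> counts of each value
    for r in rows:
        for var, val in r.items():
            if var != target_key:
                crosstabs[(var, r[target_key])][val] += 1
    return class_counts, crosstabs

def predict_proba(record, class_counts, crosstabs):
    """Prior times the product of per-variable conditional probabilities."""
    total = sum(class_counts.values())
    scores = {}
    for cls, n_cls in class_counts.items():
        score = n_cls / total  # class prior
        for var, val in record.items():
            score *= crosstabs[(var, cls)][val] / n_cls  # P(value | class)
        scores[cls] = score
    norm = sum(scores.values())  # assumed non-zero (no smoothing in this sketch)
    return {cls: s / norm for cls, s in scores.items()}

rows = [
    {"recent": "yes", "big": "yes", "respond": 1},
    {"recent": "yes", "big": "no", "respond": 1},
    {"recent": "no", "big": "yes", "respond": 1},
    {"recent": "no", "big": "yes", "respond": 0},
    {"recent": "no", "big": "no", "respond": 0},
    {"recent": "yes", "big": "no", "respond": 0},
]
class_counts, crosstabs = fit_naive_bayes(rows, "respond")
probs = predict_proba({"recent": "yes", "big": "yes"}, class_counts, crosstabs)
print(probs)
```

Because each variable contributes its own factor, explaining a single prediction means reading one cell from every cross-tab, which is why the output is large even though each piece is simple.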
SVM Output
What About Model Ensembles?
Decision Logic
Ensemble Prediction
10s to 100s of trees…
Outline
• Classical variable importance: linear regression
• Hack #1: using linear regression model statistics to
infer variable importance
The Data: Easiest Possible!
• 3 inputs: each is a random Normal: mean = 20, std = 5
• Target variable: 0.5*var1 + 0.2*var2 + 0.3*var3
• 95,412 records (same size as cup98lrn)
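The idealized data set can be reproduced roughly as follows. This is a sketch using only the recipe stated on the slide; the function name `make_ideal_data`, the seed, and the smaller default record count are assumptions chosen to keep the example quick.

```python
import random

def make_ideal_data(n_records=1000, seed=0):
    """Return rows of (var1, var2, var3, target) following the slide's recipe."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_records):
        # Three independent Normal inputs: mean 20, standard deviation 5.
        var1, var2, var3 = (rng.gauss(20, 5) for _ in range(3))
        target = 0.5 * var1 + 0.2 * var2 + 0.3 * var3
        rows.append((var1, var2, var3, target))
    return rows

rows = make_ideal_data()
print(rows[0])
```

Passing `n_records=95412` would match the record count quoted on the slide.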
Linear Regression Coefficient
For Each Variable to Assess Influence
• Coefficients match (by definition) the proportions used to
build the target variable
• This is the average influence of each input on the
predictions for all records
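To make the claim concrete, the sketch below fits an ordinary least squares model to data built with the same recipe and recovers the coefficients. It is written with the standard library only, solving the normal equations directly; the helper names `solve` and `fit_ols`, the seed, and the sample size are assumptions for illustration, and a numerical library would normally be used instead.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_ols(inputs, target):
    """Least squares with an intercept, via the normal equations X'X b = X'y."""
    X = [[1.0] + list(row) for row in inputs]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, target)) for i in range(k)]
    return solve(xtx, xty)

rng = random.Random(0)
inputs = [[rng.gauss(20, 5) for _ in range(3)] for _ in range(2000)]
target = [0.5 * a + 0.2 * b + 0.3 * c for a, b, c in inputs]
intercept, b1, b2, b3 = fit_ols(inputs, target)
print(intercept, b1, b2, b3)
```

Since the target contains no noise, the fitted coefficients agree with 0.5, 0.2, and 0.3 up to floating-point error, and the intercept is essentially zero.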
Assess Influence with t-proportion
For Each Variable
• I know I’m breaking rules here. Bear with me….
Assess Influence with t-proportion
For Each Variable
• T-value measures the significance of the relationship.
• It turns out that the proportions of the t-values for the exact model
match the coefficients
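One way to see why this can happen is to write out the t-value using the standard OLS formulas. This is a sketch: the symbols $s_j$, $R_j^2$, and $\hat\sigma$ are notation introduced here, and it assumes the target carries some noise so that $\hat\sigma > 0$ and the t-values are finite.

```latex
t_j = \frac{\hat\beta_j}{\operatorname{SE}(\hat\beta_j)},
\qquad
\operatorname{SE}(\hat\beta_j) = \frac{\hat\sigma}{s_j \sqrt{(n-1)\,(1 - R_j^2)}}
```

Here $s_j$ is the sample standard deviation of input $j$, $R_j^2$ is the $R^2$ from regressing input $j$ on the other inputs, and $\hat\sigma$ is the residual standard error. When the inputs are independent and share the same standard deviation, as in this data set, $R_j^2 \approx 0$ and $s_j \approx s$ for every $j$, so the standard errors are approximately equal and

```latex
\frac{|t_j|}{\sum_k |t_k|} \approx \frac{|\hat\beta_j|}{\sum_k |\hat\beta_k|}
```

With correlated inputs or inputs on different scales the standard errors differ, and the two proportions need not agree.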
Assess Influence using Direct Measure of
Influence Proportion
• Compute the contribution of each term in the linear regression model
separately (each record).
– Var1_influence = $var1coef$ * $var1$, etc.
• Compute each contribution as a proportion of the predicted
target variable value
• Average these proportions over all records to compute the
average influence of each variable
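The direct measure can be sketched as below. Two assumptions are made here: absolute values are taken (in line with the "Use abs()" note later in the deck), and the denominator is the sum of the absolute contributions, which equals the prediction when all terms are positive and there is no intercept. The function name `direct_influence` and the dictionary layout are illustrative only.

```python
import random

# Coefficients taken from the slide's target formula.
coefs = {"var1": 0.5, "var2": 0.2, "var3": 0.3}

def direct_influence(records, coefs):
    """Average, over records, each term's share of the summed |contributions|."""
    totals = {name: 0.0 for name in coefs}
    for rec in records:
        contrib = {name: abs(coefs[name] * rec[name]) for name in coefs}
        denom = sum(contrib.values())
        for name in coefs:
            totals[name] += contrib[name] / denom
    return {name: totals[name] / len(records) for name in coefs}

rng = random.Random(0)
records = [{name: rng.gauss(20, 5) for name in coefs} for _ in range(5000)]
influence = direct_influence(records, coefs)
print(influence)
```

On this data the averages land close to 0.5, 0.2, and 0.3; they are not exact because an average of ratios is not the ratio of averages.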
So Far So Good
• Now let’s do the same thing for
– Neural Networks
– Support Vector Machines
Motivation for Input Shuffling
http://www.elderresearch.com/company/target-shuffling
Why “Input Shuffling”?
• We don’t always have nice metrics to assess inputs of predictive
models: Neural Networks, SVM, ensembles
– Contrast with statistical methods like Regression
• Even with regression, we don’t always have the right input
distributions for these metrics to be good indicators of variable
influence
Input Distributions Are Not Always Ideal
What does “Shuffled” mean?
• Scramble (randomly) a single input
variable
– Input Shuffling Node doesn’t have to be
in a loop; it can scramble a column while
leaving the others in their natural order
• Captures the actual distribution of
the data
Principles of Input Shuffling
• Key: randomly re-populate values of a single input variable while leaving
all other variables with their original values
• Compute the standard deviation (or some other measure of perturbation)
for each record
– Of the Predicted Target Variable – posterior probability
– NOT the actual target variable value
• This perturbation is a measure of how influential the variable is in the
model
– High standard deviation -> lots of influence
– Low standard deviation -> not much influence
– ~0 standard deviation -> no influence
Shuffled Inputs Meta Node
Two Loops: (1) loop on input variables and (2) shuffle input variable (50x or so)
The Input Shuffling Process
1. Build the predictive model
2. Select a data subset of N records (can use training, or some suitably sized set)
3. Loop over every variable
   1. Loop M times (50 by default)
      1. Shuffle the variable (keeping all other inputs for that row fixed)
      2. Score the model
      3. Save the scores for the entire data set (you will end up with M times the number of records)
   2. Compute the standard deviation of the predictions for each row (or some other measure of “spread”), i.e., group by Row ID, computing stdev. Now we have N records again
   3. Compute the average spread of an input over all N records, such as the mean of these standard deviations, i.e., group by entire data set. Now we have 1 number, the variable influence
4. Compare all results. Sort descending by variable influence.
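The process above can be sketched in plain Python. This is an illustrative reading of the steps, not the KNIME workflow from the talk: the function name `input_shuffling_influence`, the callable `predict` interface, and the toy linear model are assumptions.

```python
import random
import statistics

def input_shuffling_influence(predict, rows, n_shuffles=50, seed=0):
    """Return one influence number per input column.

    predict: callable mapping a list of input values to a score.
    rows: list of records, each a list of input values (the N records).
    """
    rng = random.Random(seed)
    influence = []
    for j in range(len(rows[0])):              # step 3: loop over variables
        scores_per_row = [[] for _ in rows]
        for _ in range(n_shuffles):            # step 3.1: loop M times
            column = [r[j] for r in rows]
            rng.shuffle(column)                # shuffle only column j
            for i, (row, value) in enumerate(zip(rows, column)):
                perturbed = list(row)          # other inputs keep their values
                perturbed[j] = value
                scores_per_row[i].append(predict(perturbed))
        # step 3.2: spread of the M predictions for each row
        spreads = [statistics.stdev(s) for s in scores_per_row]
        # step 3.3: average spread over the N rows -> one number per variable
        influence.append(statistics.fmean(spreads))
    return influence

# Toy model matching the idealized target from earlier in the deck.
def linear_model(x):
    return 0.5 * x[0] + 0.2 * x[1] + 0.3 * x[2]

rng = random.Random(1)
rows = [[rng.gauss(20, 5) for _ in range(3)] for _ in range(300)]
influence = input_shuffling_influence(linear_model, rows)
total = sum(influence)
proportions = [v / total for v in influence]
# step 4: sort descending by variable influence
ranking = sorted(range(len(influence)), key=lambda j: influence[j], reverse=True)
print(proportions, ranking)
```

Only the model's predictions are used, never the actual target, so the same function works for any model that can be wrapped as `predict`.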
Single Record: what it looks like
• After 50 “input shuffles”: Row0
Average for All Records in data
• Measures the spread of the predictions when randomly perturbing the single
input variable
Input Shuffling Result:
Idealized Linear Regression Data
• Compute proportion of the average standard deviation from shuffling the
input (keeping others with the original values)
• (yes, I know I’m averaging standard deviations!)
Target variable: 0.5*var1 + 0.2*var2 + 0.3*var3
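A short derivation shows why input shuffling is expected to reproduce the coefficient proportions on this data. The symbols $b_j$ (coefficient) and $s_j$ (standard deviation of input $j$) are notation introduced here. For a linear model, shuffling input $j$ changes only the term $b_j x_j$, and the shuffled values are draws from the observed column, so for any record

```latex
\hat{y} = b_0 + \sum_k b_k x_k
\quad\Longrightarrow\quad
\operatorname{sd}_{\text{shuffle } j}(\hat{y}) \approx |b_j|\, s_j
```

The influence proportion is therefore

```latex
\frac{|b_j|\, s_j}{\sum_k |b_k|\, s_k}
\;\xrightarrow{\; s_k = 5 \text{ for all } k \;}\;
\frac{|b_j|}{\sum_k |b_k|} = 0.5,\ 0.2,\ 0.3
```

When the inputs have different spreads, the measure weights each coefficient by how much its input actually varies in the data.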
Realistic Data: KDD Cup 1998
• 95,412 records: cup98lrn from KDD Cup 1998 Competition
– Use only the responders (4,843) in linear regression models
• Hundreds of fields in data, but only use 4 for our purposes here
– LASTGIFT, NGIFTALL, RFA_2F, D_RFA_2A
• Continuous target
• Two continuous inputs
• One ordinal input (RFA_2F)
• One dummy input (D_RFA_2A)
Realistic Data: KDD Cup 1998
• Heavy skew of LASTGIFT, NGIFTALL, TARGET_D
– Makes visualization difficult
– Biases regression coefficients (if one cares)
– So, do the usual “best practices”
Normalized Data
• To remove influence of skew and scale
– Log10 transform LASTGIFT, NGIFTALL, TARGET_D
– Scale all variables (post log10) to [0, 1]
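The two normalization steps can be sketched as follows, assuming strictly positive values and at least two distinct values per column. The function name `log10_then_unit_scale` and the example gift amounts are hypothetical.

```python
import math

def log10_then_unit_scale(values):
    """Apply log10, then min-max scale the logged values to [0, 1]."""
    logged = [math.log10(v) for v in values]  # requires v > 0
    lo, hi = min(logged), max(logged)
    return [(x - lo) / (hi - lo) for x in logged]

# Hypothetical skewed gift amounts, in dollars.
gifts = [5.0, 10.0, 25.0, 200.0]
scaled = log10_then_unit_scale(gifts)
print(scaled)
```

Columns that are not log-transformed (here RFA_2F and D_RFA_2A) would only need the min-max step.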
Normalized Data
• Relationships clearer
– LASTGIFT strong positive correlation with TARGET_D
– NGIFTALL, RFA_2F, D_RFA_2A all have apparently slight negative correlation with TARGET_D
The Basic Model: Linear Regression
Coefficient
Use abs() for influence calculations
Linear Regression:
Compare Influence Using Different Methods
[Figure panels: Coefficient; t-Proportion; Direct Proportion; Input Shuffling Proportion]
Use abs() for t-proportion and influence calculations
Linear Regression, Neural Network: Input Shuffling Influence
[Figure panels: Input Shuffling - LR; Input Shuffling - MLP]
Applying Input Shuffling to Classification: Logistic Regression
Start simple: just 4 variables (like the regression example)
[Figure panels: Influence Based on Proportion of z-score; Influence Based on Input Shuffling]
Ranking Larger Numbers of Variables
Conclusion
• Input shuffling can generate model sensitivity scores for
any model, no matter how complex or nonlinear
• Input shuffling can be applied to any algorithm, no
matter how linear or nonlinear the algorithm is
• Matches linear regression variable influence (t-value)
• Similar to logistic regression variable influence (z-score)
Future Work
• If model predictions (scores) are not normally distributed, and if the influence
is not uniform, average overall influence doesn’t tell the full story (or may even
tell a misleading story) about how valuable the variable is in predicting the
target
– Breaking predictions into bins (deciles or some other number of bins) allows us to compute
an influence score for every part of the predicted range
– Answers the question: for high predicted values, which variables are most
influential?
• Build score influence rather than prediction influence
– Use ROC AUC statistics for each shuffled input, and determine the influence of each variable
on the model score rather than the predicted value
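The score-influence idea could be prototyped along these lines. This is only a sketch of one possible reading: it measures the average drop in ROC AUC when a single input is shuffled. The function names `roc_auc` and `auc_drop_by_shuffling`, the number of shuffles, and the toy scoring function are assumptions, not part of the talk.

```python
import random

def roc_auc(labels, scores):
    """ROC AUC via the rank-sum formulation, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def auc_drop_by_shuffling(predict, rows, labels, n_shuffles=20, seed=0):
    """Average AUC lost when each input column is shuffled on its own."""
    rng = random.Random(seed)
    base = roc_auc(labels, [predict(r) for r in rows])
    drops = []
    for j in range(len(rows[0])):
        total = 0.0
        for _ in range(n_shuffles):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            scores = [predict(r[:j] + [v] + r[j + 1:]) for r, v in zip(rows, column)]
            total += base - roc_auc(labels, scores)
        drops.append(total / n_shuffles)
    return base, drops

# Toy classifier score: driven mostly by input 0, barely by input 1.
def toy_score(r):
    return r[0] + 0.05 * r[1]

rng = random.Random(0)
rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
labels = [1 if r[0] > 0 else 0 for r in rows]
base_auc, drops = auc_drop_by_shuffling(toy_score, rows, labels)
print(base_auc, drops)
```

Unlike the spread-of-predictions measure, this variant needs the actual target values, because AUC compares the scores against the labels.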
