These slides were presented at a meetup in Kansas City by Bahador Khaleghi of H2O.ai.
More details can be viewed here: https://www.meetup.com/Kansas-City-Artificial-Intelligence-Deep-Learning/events/265662978/
1. Towards human-centered machine learning (ML)
Oct 2019
Kansas City
Bahador Khaleghi
Customer Data Scientist
bahador.khaleghi@h2o.ai
2. Outline
The promises and perils of ML
The need for human-centered ML
The case for ML explainability (why and what)
The how of ML explainability
ML explainability at H2O.ai
3. The promises of ML
Recognize complex patterns in big structured data
- Customer churn prediction, fraud detection
Basic perception of unstructured data
- Computer vision: object detection, object recognition
- Natural language processing: sentiment analysis, document classification, automatic translation
- Speech recognition: voice command, automated answering system
See into the future
- Time series forecasting: predictive maintenance
4. The perils of ML
ML technology has been evolving very fast
Unethical or irresponsible use of ML can threaten our basic human rights
Regulatory bodies have been left behind, leaving many regulatory gaps
Until recently, the ML community has been focused mainly on performance
https://blog-sap.com/analytics/files/2017/07/7.27.ageofacclerations.png
6. Human-centered ML to the rescue?*
Key idea: humans should ultimately be in control of ML technology
Develop ML technology with humans in mind so that it is
- Useful
- Trustworthy
- Congruent with our social values
* https://www.linkedin.com/pulse/human-centered-ai-building-humans-mind-rachel-samson/
(Diagram: Ethics, ML, and HCI)
7. Challenges of trustworthy ML
https://medium.com/element-ai-research-lab/a-taxonomy-of-ai-trustability-challenges-1c68f160d027
8. Importance of ML explainability
ML explanations can facilitate other aspects of trust:
- Flag potential model bias
- e.g.: are protected features acting as the main predictors?
- Accountability
- e.g.: what caused an autonomous car’s pedestrian detector to fail?
- Robustness
- e.g.: models that rely on explainable features tend to be more resilient to adversarial attacks
9. Why ML explainability?
Regulatory compliance
- Customers in regulated industries like banking and insurance need/want machine learning interpretability (MLI)
- Equal Credit Opportunity Act (ECOA) in US
- GDPR in EU and the “right to explanation” argument
- Security audit
Can help identify issues with an otherwise (seemingly) performant model
- The case of a husky misclassified as a wolf
https://arxiv.org/abs/1602.04938
10. What does it mean to explain a model?*
Main question: what drove a model's prediction(s)?
Answer: it depends!
- Who is asking: e.g. model creator vs. model examiner
- Many explanation families
- Importance scores
- Decision trees/rules
- Dependency plots
- Counterfactual explanations
- Verbal explanations
- ...
*https://www.elementai.com/news/2019/the-what-of-explainable-ai
11. The How of ML explainability*
*https://towardsdatascience.com/the-how-of-explainable-ai-pre-modelling-explainability-699150495fe4
12. Pre-modelling explanations*
Exploratory data analysis and visualization
Dataset description standardization
Dataset summarization
Explainable feature engineering
*https://towardsdatascience.com/the-how-of-explainable-ai-pre-modelling-explainability-699150495fe4
13. Explainable modelling*
The alleged explainability vs. performance tradeoff
Many potential ways to overcome it
- Joint prediction and explanation
- Hybrid models
- Explainability through regularization
- ...
*https://towardsdatascience.com/the-how-of-explainable-ai-explainable-modelling-55c8c43d7bed
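As a tiny illustration of the "explainability through regularization" idea, here is a sketch using scikit-learn (X_train, y_train, and feature_names are placeholders, and this is a generic example rather than any of the specific methods referenced above):

    from sklearn.linear_model import LogisticRegression

    # An L1 penalty drives most coefficients to zero, so the handful of
    # surviving features double as a built-in explanation of the model
    sparse_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    sparse_model.fit(X_train, y_train)
    print(dict(zip(feature_names, sparse_model.coef_[0])))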
14. Extracting (post-hoc) explanations*
The main focus of the research community
Many methods target explaining deep neural networks
Estimator mechanisms
- Input perturbation
- Backward propagation
- Proxy
- Activation maximization
*https://towardsdatascience.com/the-how-of-explainable-ai-post-modelling-explainability-8b4cbc7adf5f
15. ML explainability at H2O.ai
We are one of the pioneers of ML explainability in industry
We work closely with our clients and regulatory bodies to establish best practices
We incorporated explainability capabilities into our products back in 2016
- Machine learning interpretability (MLI)
18. What is MLI?
Stands for Machine Learning Interpretability
A (separate) pipeline within Driverless AI (DAI) that provides a set of features aimed at explaining the output of DAI models
Can also be applied to models developed outside DAI
19. MLI’s local explanations
How did a row's prediction come about?
             | Feature importance | Decision logic | Prediction behavior when varying a feature
Exact        | SHAP               | NA             | ICE on DAI model
Approximate  | LIME, LOCO         | DT surrogate   | ICE on RF surrogate
20. LIME: Local Interpretable Model-agnostic Explanations
Key idea & approach: approximate the response function of a complex model locally with a (weighted) linear model
Pros:
- Easy to implement
- Versatile
Cons:
- Rather costly to compute
- May not work for highly non-linear models => use SHAP instead
https://arxiv.org/abs/1602.04938
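A minimal sketch of the LIME idea (perturb around a row, weight by proximity, fit a weighted linear model), assuming model is any fitted binary classifier exposing predict_proba and x is a 1-D numpy feature row; this illustrates the concept, not the DAI implementation:

    import numpy as np
    from sklearn.linear_model import Ridge

    def lime_explain(model, x, n_samples=1000, width=0.75):
        # Perturb the instance of interest with Gaussian noise
        Z = x + np.random.normal(scale=0.1, size=(n_samples, x.shape[0]))
        y = model.predict_proba(Z)[:, 1]                 # black-box predictions
        # Weight perturbed samples by proximity to x (RBF kernel)
        w = np.exp(-np.sum((Z - x) ** 2, axis=1) / width ** 2)
        # Fit a weighted linear surrogate; its coefficients are the explanation
        return Ridge(alpha=1.0).fit(Z, y, sample_weight=w).coef_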
21. SHAP
Key idea: explain prediction as a game played by feature values
Approach: compute the expected contribution of each feature value across all possible feature coalitions
Pros:
- Based on solid math (Shapley values theory)
- Gives exact marginal contribution of each feature to model prediction
Cons:
- Costly to compute in general (there is a fast implementation for tree-based models)
https://arxiv.org/abs/1705.07874
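A minimal usage sketch with the open-source shap package (assuming model is a fitted tree-ensemble model and X its feature matrix); TreeExplainer is the fast tree-based implementation mentioned above:

    import shap

    explainer = shap.TreeExplainer(model)    # fast, exact path for tree ensembles
    shap_values = explainer.shap_values(X)   # one contribution per feature, per row
    # For each row, the contributions plus the expected value sum to the prediction
    shap.summary_plot(shap_values, X)        # aggregate local values for a global view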
22. Surrogate Decision Tree (DT)
A single decision tree is trained on the original inputs and the predictions of the DAI model
- Meant to capture the decision-making logic of the DAI model to some extent
- Useful for identifying potential feature interactions
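A surrogate tree of this kind can be sketched in a few lines of scikit-learn; complex_model, X, and feature_names are placeholders for the fitted black-box model, its inputs, and its feature names:

    from sklearn.tree import DecisionTreeRegressor, export_text

    # Train a shallow tree on the black-box model's predictions, not the true labels
    y_hat = complex_model.predict(X)
    surrogate_dt = DecisionTreeRegressor(max_depth=3).fit(X, y_hat)
    print(export_text(surrogate_dt, feature_names=list(feature_names)))  # readable rules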
23. Surrogate Random Forest (RF)
Trained similarly to the surrogate DT
- Provides the global feature importance scores (also used by K-LIME clustering)
- Used by LOCO to provide an alternative local feature importance to LIME
- Used by approximate PDP and ICE plots
24. LOCO: Leave One Covariate Out
Key idea: feature importance as the difference in model prediction with and without a given feature
Approach: approximate the prediction without a feature using the RF surrogate, where contributions of rules involving that feature are removed
Pros:
- Nonlinear and considers feature interactions => an alternative to LIME's approximate local explanations
Cons:
- Difficult to generate a mathematical error rate (unlike LIME)
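DAI approximates LOCO via the RF surrogate as described above; a simplified, model-agnostic sketch of the same idea (X is a numpy feature matrix and model any fitted estimator) neutralizes the feature instead of removing rules:

    import numpy as np

    def loco_importance(model, X, row_idx, feature_idx):
        # Compare the prediction with the feature present vs. "left out"
        # (here the feature is neutralized by replacing it with its mean)
        x = X[row_idx].copy()
        base = model.predict(x.reshape(1, -1))[0]
        x[feature_idx] = X[:, feature_idx].mean()
        return base - model.predict(x.reshape(1, -1))[0]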
25. ICE: Individual Conditional Expectation
How does a row's prediction vary if ONLY a desired feature varies within its domain?
Helps explore whether the treatment of a specific row is valid in comparison to
- Average model behavior (PDP) => discrepancies could reveal possible feature interactions => examine the surrogate DT
- Known standards
- Domain knowledge and reasonable expectations
(Plot: an ICE curve alongside the PDP)
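A minimal ICE sketch (model and X are placeholders for a fitted estimator and its numpy feature matrix):

    import numpy as np
    import matplotlib.pyplot as plt

    def ice_curve(model, x, feature_idx, grid):
        # Copy a single row and vary ONLY one of its features across a grid
        rows = np.tile(x, (len(grid), 1))
        rows[:, feature_idx] = grid
        return model.predict(rows)

    grid = np.linspace(X[:, 0].min(), X[:, 0].max(), 50)
    plt.plot(grid, ice_curve(model, X[0], 0, grid))  # ICE for row 0, feature 0
    plt.show()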
26. MLI’s global explanations
How did all (or a set of) model predictions come about? (What drives model predictions in general?)
             | Feature importance                     | Decision logic | Prediction behavior when varying a feature
Exact        | Aggregated SHAP                        | NA             | PDP on DAI model
Approximate  | K-LIME, RF surrogate, aggregated LOCO? | DT surrogate   | PDP on RF surrogate
27. K-LIME
Key idea: the response function of a complex model may not be linear globally, but it could be piecewise linear
Approach: use GLMs to approximate the global response of a complex model within local regions (clusters) obtained by:
- K-means applied to the globally most important features provided by the RF surrogate model
- Decision tree surrogate leaves (LIME-SUP)
- An (optional) clustering column provided by the customer (based on their domain knowledge)
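A rough sketch of the K-LIME recipe using its first clustering option (complex_model, X, and top_features, the indices of the globally most important features, are placeholders):

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LinearRegression

    # Cluster rows in the space of the globally most important features,
    # then fit one GLM per cluster on the black-box model's predictions
    y_hat = complex_model.predict(X)
    clusters = KMeans(n_clusters=8, n_init=10).fit_predict(X[:, top_features])
    local_glms = {}
    for c in np.unique(clusters):
        mask = clusters == c
        # Each cluster's GLM coefficients serve as local-region explanations
        local_glms[c] = LinearRegression().fit(X[mask], y_hat[mask])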
28. PDP: Partial Dependence Plot
How do model predictions vary on average if ONLY a desired feature varies within its domain?
Basically an aggregate of the ICE plots for a given feature
Helps explore whether the overall treatment of a specific feature is valid in comparison to
- Known standards
- Domain knowledge and reasonable expectations
Feature interactions might be averaged out by PDP
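Since a PDP is essentially the average of ICE curves, it can be sketched by reusing the ice_curve helper from the ICE slide above:

    import numpy as np

    def pdp_curve(model, X, feature_idx, grid):
        # Average prediction as one feature sweeps its domain, marginalized
        # over all rows, i.e., the mean of the individual ICE curves
        curves = np.stack([ice_curve(model, x, feature_idx, grid) for x in X])
        return curves.mean(axis=0)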
29. Reason codes
A plain English “translation” of K-LIME explanations
Three different scopes: global, cluster, and local
- Local reason codes come with standard deviations
Also generated for the feature contributions provided by SHAP (not shown in the UI but downloadable as a CSV)
30. MLI’s time series explanations
Based on SHAP
DAI might split time series data into multiple groups when modelling
SHAP explanations are obtained for all forecasts in each group and aggregated up
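A sketch of that aggregation step (shap_values, feature_names, and group_ids, the per-forecast group assignments, are placeholders):

    import pandas as pd

    # Aggregate per-forecast SHAP contributions up to each time series group
    shap_df = pd.DataFrame(shap_values, columns=feature_names)
    shap_df["group"] = group_ids
    group_explanations = shap_df.groupby("group").mean()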
31. Evaluating explanations
SHAP explanations should be accurate and consistent, at least in theory
Goodness of fit in case of approximate (surrogate) models
- R2 and RMSE of training and validation data
- Surrogate prediction accuracy
- Ranked predictions plot for K-LIME
Standard deviations in the case of PDP, ICE plots, and reason codes
Consistency between different explanation techniques (use MLI dashboard)
http://docs.h2o.ai/driverless-ai/latest-stable/docs/userguide/interpret-non-ts.html#expectations-for-consistency-between-explanatory-techniques
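A sketch of the surrogate goodness-of-fit check (complex_model, surrogate_dt, and X_valid are placeholders): note the surrogate is scored against the black-box model's predictions, not the ground truth:

    import numpy as np
    from sklearn.metrics import r2_score, mean_squared_error

    y_hat = complex_model.predict(X_valid)  # black-box predictions
    y_sur = surrogate_dt.predict(X_valid)   # surrogate predictions
    print("R2:  ", r2_score(y_hat, y_sur))
    print("RMSE:", np.sqrt(mean_squared_error(y_hat, y_sur)))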
32. MLI's fairness assessment
Assess a model's group fairness through Disparate Impact Analysis (DIA)
- For a given feature, e.g. sex, compute average per-group performance metrics, e.g. accuracy
- Compute group disparities as the ratio of each group's metric to a given reference metric
- Flag cases where group disparities are beyond the preset thresholds as biased
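A sketch of the DIA computation with pandas (the column names, the choice of accuracy as the metric, and the 0.8 threshold, i.e. the four-fifths rule, are illustrative assumptions; DAI's actual metrics and thresholds are configurable):

    import pandas as pd
    from sklearn.metrics import accuracy_score

    def disparate_impact(df, group_col, y_true, y_pred, reference, threshold=0.8):
        # Per-group metric, disparity as the ratio to the reference group's
        # metric, and a flag for groups that fall outside the threshold
        metrics = df.groupby(group_col).apply(
            lambda g: accuracy_score(g[y_true], g[y_pred]))
        disparity = metrics / metrics[reference]
        return pd.DataFrame({"metric": metrics,
                             "disparity": disparity,
                             "flagged": disparity < threshold})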
33. MLI's fairness assessment (cont.)
Available for binary classification and regression models
Could be used for model debugging too, e.g. examining the confusion matrix and group metrics of non-protected features
Best suited for constrained models (linear, constrained GBM, RuleFit), as the average group metrics reported by DIA are less likely to miss cases of local discrimination
36. Learn more
The what of explainable AI: https://www.elementai.com/news/2019/the-what-of-explainable-ai
The why of explainable AI: https://www.elementai.com/news/2019/the-why-of-explainable-ai
The how of explainable AI - pre-modelling explainability: https://towardsdatascience.com/the-how-of-explainable-ai-pre-modelling-explainability-699150495fe4
The how of explainable AI - explainable modelling: https://towardsdatascience.com/the-how-of-explainable-ai-explainable-modelling-55c8c43d7bed
The how of explainable AI - post-modelling explainability: https://towardsdatascience.com/the-how-of-explainable-ai-post-modelling-explainability-8b4cbc7adf5f