Over time, our AI predictions degrade. Full stop.
Whether it's concept drift, where the relationship between our data and what we're trying to predict has changed, or data drift, where our production data no longer resembles the historical training data, distinguishing meaningful ML drift from spurious or acceptable drift is tedious. Not to mention the difficulty of uncovering which ML features are the source of poorer accuracy.
This session looked at the key types of machine learning drift and how to catch them before they become a problem.
2. The fundamental assumption in any machine learning model is that the data and logic used actually mimic the real world.
4. Machine Learning Model Drift
The fundamental assumption in any machine learning model is that the data and logic used actually mimic the real world. Over time, our ML models make worse predictions. Also called “decay”.
6. Building Trust into AI
(Logos: Best Paper; key partnerships; Technology Pioneer 2020; CB Insights Most Promising AI Companies 2021; Enterprise AI Governance and Ethical Response 2019)
7. “We had a model drift over the weekend that cost $500,000.”
— Chief Data Scientist
“When something goes wrong, it takes our data scientist 2 weeks to troubleshoot the problem.”
— Data Science Director
8. Don’t Get Too Caught Up In Terminology
ML Drift, Model Drift, Model Decay, Prediction Drift = your predictions are getting worse.
Experiences, types, causes, and indicators of drift are sometimes used together, overlap, and don’t have direct mappings to each other.
Multiple types of drift can happen at the same time.
9. How We Experience ML Drift
Not really drift, but it can appear to be.
Image: KDnuggets, “The ravages of concept drift”
10. Key Types of Drift
Concept Drift
● Reality/behavioral change
● Relationships change, not the input
● P(Y|X) changes (the probability of output Y given input X)
Data Drift*
● Data changes
● Fundamental relationships do not change
● Feature Drift: input data shifts, P(X) changes
● Label Drift: output data shifts, P(Y) changes
Virtual Drift
● Data changes but the decision boundary still works
(Diagram: training data with a decision boundary. Image: Don’t let your model’s quality drift away by Michał Oleszak)
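To make the P(X)-versus-P(Y|X) distinction concrete, here is a minimal sketch on synthetic, hypothetical data: a model frozen on the old decision boundary survives pure feature drift, but loses accuracy when the concept (the boundary itself) moves.

```python
import numpy as np

rng = np.random.default_rng(0)

# Reference window: income feature X; label Y = creditworthy if income > 50.
x_ref = rng.normal(50, 10, 5000)
y_ref = (x_ref > 50).astype(int)

# Feature drift: P(X) shifts (incomes rise) but the rule P(Y|X) is unchanged.
x_feat = rng.normal(60, 10, 5000)
y_feat = (x_feat > 50).astype(int)

# Concept drift: P(X) is unchanged, but the true boundary moves to 55.
x_con = rng.normal(50, 10, 5000)
y_con = (x_con > 55).astype(int)

# A "model" frozen on the old boundary still fits the feature-drifted data,
# but misclassifies everything between the old and new boundaries.
pred = lambda x: (x > 50).astype(int)
print("accuracy under feature drift:", (pred(x_feat) == y_feat).mean())
print("accuracy under concept drift:", (pred(x_con) == y_con).mean())
```

With this toy setup, feature drift alone leaves accuracy intact, while concept drift silently costs roughly the probability mass between the two boundaries.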
14. Drift Examples for a Loan Application Model
● Concept Drift: an income level that was earlier considered creditworthy is now considered riskier.
● Label Drift: a larger proportion of creditworthy applications starts showing up.
● Feature Drift: incomes of most applicants increase or decrease, or you suddenly get more applications from one region.
*Note that if label and data drift happen together and cancel each other out, there is no concept drift. Otherwise, concept drift will be caused by one of the two, since they are linked by the Bayes equation.
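The Bayes link in that note can be checked numerically. In this toy sketch (hypothetical probabilities), shifting the feature marginal P(X) also shifts the label marginal P(Y), while the conditional P(Y|X) stays fixed: feature and label drift together, with no concept drift.

```python
import numpy as np

# Two income buckets (low/high) and two labels (deny/approve).
# Fix the conditional P(Y|X): rows are income buckets, columns are labels.
p_y_given_x = np.array([[0.8, 0.2],   # low income: mostly denied
                        [0.3, 0.7]])  # high income: mostly approved

for p_x in (np.array([0.6, 0.4]),     # training-time income mix
            np.array([0.2, 0.8])):    # production: richer applicants
    joint = p_x[:, None] * p_y_given_x          # P(X, Y) = P(X) P(Y|X)
    p_y = joint.sum(axis=0)                     # induced label marginal P(Y)
    cond = joint / joint.sum(axis=1, keepdims=True)  # recovered P(Y|X)
    print("P(Y) =", p_y, "| P(Y|X) unchanged:", np.allclose(cond, p_y_given_x))
```

The label distribution moves from 60/40 deny/approve to 40/60, yet the recovered conditional is identical both times: the model's learned relationship still holds.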
17. Triggers of ML Model Drift
Real Data Distribution Change (may require a new model)
● Label or feature distribution changes, e.g. a product launch in a new market
● The concept can change, e.g. a competitor launching a new service
Data Integrity Issues
● Correct data enters at the source, but data engineering is faulty, e.g. debt-to-income values and age values are swapped in the input
● Incorrect data enters at the source, e.g. due to a front-end issue, a website form accepts leaving a field blank
19. But Really... What’s Really Important?
(Meme: “Not sure if data changed or reality changed”)
22. Performance Monitoring & Supervised Learning
Works well if you have ground truth/labels!
Monitor performance metrics:
● Statistical measures
● Accuracy, precision, FPR, AUC, etc.
Supervised learning methods (ref: “A Survey of Concept Drift Adaptation”):
● Sequential analysis (SPRT, CUSUM & Page-Hinkley): tune alarms on false positives
● Statistical process control (SPC): rate of change
● Monitoring 2 distributions (ADWIN): more precise, more overhead
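As a simpler stand-in for the CUSUM/Page-Hinkley family, the sketch below applies an SPC-style rate-of-change check (hypothetical window size and threshold): alarm when the error rate in the most recent window exceeds the reference window's rate by k standard errors.

```python
import random

def drift_alarm(errors, window=100, k=4.0):
    """SPC-style check: alarm when the error rate in the most recent
    window exceeds the reference window's rate by k standard errors."""
    ref = errors[:window]
    p = sum(ref) / window                    # reference error rate
    se = (p * (1 - p) / window) ** 0.5       # std. error of a window mean
    for start in range(window, len(errors) - window + 1, window):
        rate = sum(errors[start:start + window]) / window
        if rate > p + k * se:
            return start + window            # index where the alarm fires
    return None

random.seed(1)
# Per-prediction error indicator: ~10% errors, then a jump to ~40% (drift).
errors = [random.random() < 0.1 for _ in range(300)] + \
         [random.random() < 0.4 for _ in range(300)]
print("alarm at index:", drift_alarm(errors))
```

This only works when labels arrive in time to compute an error rate; proper CUSUM/PH or ADWIN implementations handle gradual drift and false-positive tuning far better than this fixed-window sketch.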
23. Data Drift Monitoring & Unsupervised Learning
Monitor statistical distribution metrics:
● Population Stability Index (PSI): compares the distribution of the current scoring variable to its distribution in the training data
● Kullback–Leibler (KL) divergence: measures the difference of one probability distribution from a reference probability distribution
● Jensen–Shannon (JS) divergence: based on KL, measuring the similarity between two probability distributions, but notably symmetric and finite
● Kolmogorov–Smirnov test (KS test): quantifies the distance between the empirical distribution of the sample and the cumulative distribution of the reference (non-parametric)
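Of these, PSI is straightforward to hand-roll. A minimal numpy-only sketch on hypothetical income data, using the common rules of thumb (< 0.1 stable, > 0.25 major shift):

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index: bin by quantiles of the reference
    sample, then compare bin frequencies between the two samples."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf       # catch out-of-range values
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(0)
train = rng.normal(50, 10, 10_000)      # incomes seen at training time
stable = rng.normal(50, 10, 10_000)     # production, no drift
shifted = rng.normal(58, 10, 10_000)    # production, incomes rose

print(f"PSI, no drift: {psi(train, stable):.3f}")
print(f"PSI, drifted:  {psi(train, shifted):.3f}")
```

Quantile binning on the reference sample (rather than equal-width bins) keeps every reference bin populated, which makes the index less sensitive to binning choices.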
24. Data Drift Monitoring & Unsupervised Learning
Un-/semi-supervised learning (see “An overview of unsupervised drift detection methods”):
● Can be more accurate
● Online methods look at each instance (batch methods are more efficient)
● Most are globally oriented
● May miss gradual drift / have sensitivity issues
● Not intuitive for explaining
25. Data Integrity & Outlier Monitoring
Data errors slowly degrade performance and can look like real drift:
● Missing values
● Range and type mismatches
● Schema mismatches
● Changes in the business (newly cataloged products, revoked pricing, etc.)
● Broken data pipelines due to bugs or API updates
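A minimal sketch of such integrity checks (hypothetical schema and field names), catching missing, mistyped, and out-of-range values before a record reaches the model:

```python
def integrity_checks(row, schema):
    """Flag missing values, type mismatches, and out-of-range values.
    schema maps field name -> (expected type, min, max)."""
    issues = []
    for field, (ftype, lo, hi) in schema.items():
        value = row.get(field)
        if value is None:
            issues.append(f"{field}: missing")
        elif not isinstance(value, ftype):
            issues.append(f"{field}: expected {ftype.__name__}")
        elif not (lo <= value <= hi):
            issues.append(f"{field}: {value} outside [{lo}, {hi}]")
    return issues

# Hypothetical loan-application schema.
schema = {"age": (int, 18, 120), "debt_to_income": (float, 0.0, 5.0)}

# Swapped fields (the data-engineering bug above) surface as type/range errors.
print(integrity_checks({"age": 0.42, "debt_to_income": 35.0}, schema))
print(integrity_checks({"age": 35, "debt_to_income": None}, schema))
```

Note how the swapped debt-to-income/age bug, invisible to schema-only validation, is caught by the range checks.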
26. Getting to the Root Cause
● Attribute to drift in features
● Account for feature importance
● Analyze affected traffic
(Image: root-causing drift in Fiddler)
27. Fix It!
● Retrain: new data and/or relabel old data
● Model management: archive/schedule models, ensemble balancing
● Adapt/augment: model behavior, weighting, business logic, data collection
29. A Few Resources
● Fiddler: ML model performance monitoring platform
● XAI Summit
● AI Infrastructure Alliance: nonprofit, independent information
● “An overview of unsupervised drift detection methods”
● “A Survey of Concept Drift Adaptation”
30. Let’s build trust into AI
amy.hodler@fiddler.ai
@amyhodler
www.fiddler.ai
31. Fiddler MPM Stack: Deep & Versatile
(Runs on-prem or cloud, with app and custom-app layers)
MONITOR: performance; model drift & bias; data integrity & outliers
ANALYZE: local & global explanations; bias detection; auto-slicing for performance
CONTROL: model inventory; change & policy control; model reports
BUILT-IN EXPLAINABILITY
PLUGGABLE MODEL & DATA INGESTION: ingest any data source; plug into any model framework; connect via the Fiddler API