From Data to Artificial Intelligence with the Machine Learning Canvas — ODSC version

1. From Data to AI with the
MACHINE LEARNING
CANVAS
@louisdorard
#odsc - 2017/10/13
2. (Big?) Data analysis
1. Descriptive analysis — "big data", reporting, old-school BI…
2. Predictive analysis — now we're talking!
3. Prescriptive analysis
4. Automated decisions — "Artificial Intelligence"!!
9. "DataRobot automatically searches through millions of combinations of algorithms, data preprocessing steps, transformations, features, and tuning parameters for the best machine learning model for your data. Each model is unique — fine-tuned for the specific dataset and prediction target." https://www.datarobot.com/product/
12. "Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing." —Jeremy Howard (Designing great data products)
20. Main barriers to integrating ML in real-world products
• Technical:
  • Getting data in ML-ready format
  • Creating the best model for this data
  • Deploying models
• Semi-technical:
  • Trusting models
  • Formalizing ML problems
21. Formalizing ML problems
• What are the inputs and outputs? Which features?
• Anticipate how you'll use predictive models:
  • When/how often you'll need to…
    • make predictions (to deliver value) ❤
    • learn/update models from (new) data
  • How much time you'll have for that
  • Any other technical constraints? (e.g. model memory footprint)
• How will you inspect and evaluate predictive models? (so you can trust them)
30. Churn prediction
• Who: SaaS company selling monthly subscriptions
• Question asked: "Is this customer going to leave within 1 month?"
• Input: customer
• Output: no-churn or churn
• Data collection: customer snapshots from 1 month ago; now, 1 month later, we know who left
• How predictions are used: target customers classified as 'churn' in retention efforts/campaigns
31. Churn prevention
Assume we know who's going to churn. What do we do?
• Contact all/some of them? Which ones first?
• Switch to a different plan?
• Give a special offer?
• Etc.
• No action?
32. Phases of churn analysis
1. Descriptive: show churn rate against time
2. Predictive: show which customers will churn next
3. Prescriptive: suggest which customers to target for prevention efforts
4. Automated: campaigns sent automatically
33. Quiz: churn prevention ROI
• Targeting a customer has a cost
• For each TP we "gain": (success rate of targeting) × (customer revenue/month)
• Imagine…
  • We make perfect predictions and target all Positives
  • Revenue/month = 10€ for all customers
  • Success rate of targeting = 20%
  • Cost of targeting = 2€
• What is the Return On Investment?
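One way to work through the quiz, assuming we count one month of retained revenue per successfully targeted customer (the slide doesn't fix the horizon):

```python
# Expected value of targeting one true positive, per month:
success_rate = 0.20   # probability that targeting actually prevents the churn
revenue = 10.0        # € of monthly revenue saved if it works
cost = 2.0            # € spent on targeting one customer

expected_gain = success_rate * revenue   # 0.2 * 10€ = 2€
roi = (expected_gain - cost) / cost      # (2€ - 2€) / 2€ = 0.0
```

Even with perfect predictions, targeting every Positive only breaks even — which is exactly what motivates prioritizing whom to target (next slide).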
34. Prescriptions to prevent churn
3. Prescriptive: prioritize customers to target, based on…
• Customer representations (i.e. feature values for each)
• Churn predictions
• Uncertainty in predictions
• Revenue brought by each customer
• Constraints on targeting frequency
35. Software components for automated decisions
• Compute feature values for given input (a.k.a. "featurize"; involves merging data sources, aggregating data…)
• Collect training data (inputs and outputs)
• Provide predictive model from given training set (i.e. learn)
• Provide prediction against model for given input (context)
• Provide optimal decision from given contextual data, predictions, uncertainties, constraints, objectives, costs
• Apply given decision
36.–39. The same list is repeated, with each slide highlighting the group a component belongs to:
• Data Engineering components: compute feature values for given input (featurize); collect training data (inputs and outputs)
• Machine Learning components: provide predictive model from given training set (learn); provide prediction against model for given input (context)
• Optimization / Operations Research component: provide optimal decision from given contextual data, predictions, uncertainties, constraints, objectives, costs
• Application-specific component: apply given decision
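To make the division of labor concrete, here is a minimal sketch of how these components might chain together in an automated-decision loop; all names (featurize, learn, predict, decide, apply_decision) are illustrative placeholders, not APIs from the deck:

```python
# Illustrative skeleton of an automated-decision pipeline.
# Each function stands for one of the components listed above.

def featurize(raw_input, sources):             # Data Engineering
    """Merge/aggregate raw data sources into a feature vector."""
    ...

def learn(training_inputs, training_outputs):  # Machine Learning
    """Return a predictive model fitted on the training set."""
    ...

def predict(model, features):                  # Machine Learning
    """Return a prediction (and, ideally, its uncertainty)."""
    ...

def decide(features, prediction, uncertainty,  # Optimization / OR
           constraints, objectives, costs):
    """Return the optimal decision given context and predictions."""
    ...

def apply_decision(decision):                  # Application-specific
    """Execute the decision in the product (send email, flag review…)."""
    ...
```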
42. The Machine Learning Canvas
The Machine Learning Canvas (v0.4) — Designed for: · Designed by: · Date: · Iteration:
• Decisions: How are predictions used to make decisions that provide the proposed value to the end-user?
• ML task: Input, output to predict, type of problem.
• Value Propositions: What are we trying to do for the end-user(s) of the predictive system? What objectives are we serving?
• Data Sources: Which raw data sources can we use (internal and external)?
• Collecting Data: How do we get new data to learn from (inputs and outputs)?
• Making Predictions: When do we make predictions on new inputs? How long do we have to featurize a new input and make a prediction?
• Offline Evaluation: Methods and metrics to evaluate the system before deployment.
• Features: Input representations extracted from raw data sources.
• Building Models: When do we create/update models with new training data? How long do we have to featurize training inputs and create a model?
• Live Evaluation and Monitoring: Methods and metrics to evaluate the system after deployment, and to quantify value creation.
machinelearningcanvas.com by Louis Dorard, Ph.D. Licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
43. Origins of the ML Canvas
• Started as a mini framework: Who, Question asked, Input, Output, Features, Data collection, How predictions are used
• Used it and refined it when consulting
• Made it into a visual chart and iterated on the design
• Used at Konica Minolta, BlaBlaCar, La Poste, Dassault Systemes, Data Science Academy, UCL
44. The Machine Learning Canvas
• (Not an adaptation of the Business Model Canvas)
• Describes the Learning part of an AI system:
  • What data are we learning from?
  • How are we using predictions powered by that learning?
  • How are we making sure that the whole thing "works" through time?
49. The Machine Learning Canvas (v0.4) — Designed for: Priority Inbox (PI) · Designed by: Louis Dorard · Date: Jan. 2017 · Iteration: 1
• Decisions: Move important incoming emails to a dedicated section at the top of the inbox.
• ML task: We want to be able to answer the question "Is this email important?" before the user gets a chance to see the email. Input: email. Output: "Important" (Positive class) or "Regular" → Binary Classification.
• Value Propositions: Make it easier for users of an email client to identify new important emails in their inbox, by automatically detecting them and making them more visible in the inbox (this detection must happen before the user sees the email). The objective is that users spend less time in their inbox and reply to important emails more quickly.
• Data Sources: Previous email messages (as mbox files or in another type of database); address book; calendar.
• Collecting Data: Explicit labelling: users can manually label emails as important or not, by clicking on an icon next to each email's subject. Implicit labelling: heuristics based on user behavior after getting the email (e.g. replying fast, deleting without reading, etc.).
• Making Predictions: Every time we receive an email addressed to our user which starts a new thread (otherwise the importance is just the same as that of the thread). We aim to put the email in the right section of the inbox within a 2-second period.
• Offline Evaluation: FP costs 1, FN costs 3. For each user: take the last 3 months of emails for test and the 12 months before for training. We make the P.I. feature available to a user if the model's cost is below that of a baseline heuristic (e.g. "if sender in address book then important") and it makes no more than 1 error per X emails.
• Features: Content features: subject, body, attachments, size. Social features: based on info about the sender (e.g. in address book?), previous interactions, contextual (e.g. upcoming meeting with sender). Email labels (typically assigned via manual rules defined by the user).
• Building Models: One model per user, initially built on the last 12 months of email data, that we update when an error is signaled by the user via manual labelling, and every 5 minutes by adding new data from implicit labelling, if any.
• Live Evaluation and Monitoring: Per week: ratio of #errors explicitly reported by the user to #emails received; the same with errors seen via implicit labelling; average time taken to reply to important emails; total time spent in the inbox.
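A sketch of the offline evaluation above — a cost-weighted error where a false positive costs 1 and a false negative costs 3 (the slide fixes only the costs and the train/test split; everything else here is an assumption):

```python
def priority_inbox_cost(y_true, y_pred, fp_cost=1.0, fn_cost=3.0):
    """Cost-weighted error for one user's test emails.

    y_true, y_pred: sequences of labels, "Important" (Positive) or "Regular".
    """
    cost = 0.0
    for truth, pred in zip(y_true, y_pred):
        if pred == "Important" and truth == "Regular":
            cost += fp_cost   # FP: regular email surfaced as important
        elif pred == "Regular" and truth == "Important":
            cost += fn_cost   # FN: important email missed (3x worse)
    return cost

# Ship the P.I. feature to a user only if the model beats the baseline
# heuristic ("if sender in address book then important") on this cost.
```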
50. The Machine Learning Canvas (v0.4) — Designed for: Fake review detection · Designed by: Louis Dorard · Date: Jan. 2017 · Iteration: 1
• Decisions: If the predicted probability of the Positive class (fake) is > M, reject; if < m, approve; otherwise request a human decision. Thresholds m and M are chosen to maximize the offline evaluation score (performed right after each model update).
• ML task: "Is this review legit or fake?" Input: review. Output: "legit" or "fake" (Positive class) → Binary classification. Note: the distribution of outputs is typically 70–30 (legit vs fake).
• Value Propositions: Reject fake incoming reviews and approve legit reviews automatically. Flag fake reviews in the database to stop displaying them / using them to compute average ratings. Have ratings which are closer to the truth. Improve customer experience and satisfaction (fewer surprises).
• Data Sources: User database; reviews database; social networks; crowdsourcing platform (e.g. Mechanical Turk).
• Collecting Data: Initially: active learning using a crowdsourcing platform. Internal, manual labelling: when explicitly requested (complaint, or model's probability in between thresholds), plus randomly selected reviews every day (as many as allowed for a budget of $X/day).
• Making Predictions: We receive X reviews/minute on average. We can allow a delay of 1 day per review, which must include half a day for manual review when the probability falls between the thresholds.
• Offline Evaluation: Train the model with data up until 1 week ago. Compute total cost on the last week's data, for different values of m and M (starting at m=0 and M=1), taking into account: gain of a correct, automated decision = − cost of a manual decision; cost of FN (depending on whether the review's sentiment is positive or negative); cost of FP (smaller).
• Features: Content of review: rating, text, length, # capitals… Other predictions: sentiment, emotion, etc. User: basic info, # previous bookings, # approved reviews, # rejected reviews. Metadata (e.g. IP). Product being reviewed (e.g. hotel chain). Similarity with previous reviews (total score).
• Building Models: One model per language/country. Somewhat adversarial setting ⇒ keep on learning ⇒ every week we update our models by adding all the data from last week. We allow a day for this.
• Live Evaluation and Monitoring: Every week: average customer satisfaction; # customer complaints; # hotel complaints; # manual reviews.
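A minimal sketch of the two-threshold decision rule, assuming the Positive class is "fake" as stated in the ML task (the threshold values here are placeholders — the canvas retunes them weekly against the cost-based offline evaluation):

```python
def review_decision(p_fake, m=0.05, M=0.90):
    """Route a review based on its predicted probability of being fake."""
    if p_fake > M:
        return "reject"         # confidently fake: auto-reject
    if p_fake < m:
        return "approve"        # confidently legit: auto-approve
    return "manual_review"      # in between: request human decision

# Example: a review scored p_fake=0.4 falls between the thresholds,
# so it is queued for (half-a-day) manual review.
```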
51. The Machine Learning Canvas (v0.4) — Designed for: Real-estate deals · Designed by: Louis Dorard · Date: Jan. 2017 · Iteration: 1
• Decisions: Every week: compute predictions for all houses currently on the market; filter out 50% randomly (hold-out set); filter out properties whose asking price is higher than the prediction; prioritize the best deals first and schedule visits; review manually and buy at asking price or lower.
• ML task: "How much is this property worth?" Input: property. Output: value → regression task. OR: "Is this a good deal?" → classification task.
• Value Propositions: Make better real-estate investments: compare price predictions with the actual asking price of properties on the market, to find the best deals.
• Data Sources: Redfin; open data: public transport, schools, etc.; Google Maps.
• Collecting Data: Every week, request Redfin data on: new properties on the market (should contain property characteristics + asking price); sale records (initially: records for the past year) — properties previously seen, but this time with the actual sale price.
• Making Predictions: Every week we make predictions for new properties for sale (using all property info available except the asking price).
• Offline Evaluation: Test on the last month of labelled data, manually review errors and compute: average percentage error; cost: for bad deals (sale price < asking) that were seen as good deals (asking < prediction), we would have incurred a cost of (asking − sale price) had we gone through with the investment.
• Features: Property basic info. Extracted from the text description: has swimming pool, … Location: latitude, longitude; address; distance to closest transport and shops; average rating of schools in a 5-mile radius.
• Building Models: Only keep data up until a year in the past. Update the model every month (with the new data available).
• Live Evaluation and Monitoring: Investment return (should go up); time spent visiting properties (should go down as we're smarter about which we want to visit); sale price compared to prediction, on the hold-out set.
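A sketch of the two offline-evaluation numbers above, computed from aligned lists of predictions, asking prices, and realized sale prices (the record pairing is assumed):

```python
def real_estate_offline_eval(predicted, asking, sale):
    """Average percentage error + hypothetical cost of bad 'good deals'."""
    n = len(sale)
    avg_pct_error = sum(abs(p - s) / s for p, s in zip(predicted, sale)) / n

    # A property looked like a good deal if asking < prediction,
    # but was actually a bad deal if sale price < asking.
    cost = sum(a - s
               for p, a, s in zip(predicted, asking, sale)
               if a < p and s < a)
    return avg_pct_error, cost
```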
52. The Machine Learning Canvas (v0.4) — Designed for: Customer retention · Designed by: Louis Dorard · Date: Sept. 2016 · Iteration: 1
• Decisions: On the 1st day of every month: filter out 'no-churn'; sort the remaining customers by descending (churn prob.) × (monthly revenue) and show the prediction path for each; target customers.
• ML task: Predict the answer to "Is this customer going to churn in the coming month?" Input: customer. Output: 'churn' or 'no-churn' class ('churn' is the Positive class) → Binary Classification.
• Value Propositions: Context: the company sells SaaS with a monthly subscription; the end-user of the predictive system is the CRM team. We want to help them: identify important clients who may churn, so appropriate action can be taken; reduce churn rate among high-revenue customers; improve the success rate of retention efforts by understanding why customers may churn.
• Data Sources: CRM tool; payments database; website analytics; customer support; emailing to customers.
• Collecting Data: Every month, we see which of last month's customers churned or not, by looking through the payments database. Associated inputs are customer "snapshots" taken last month.
• Making Predictions: Every month we (re-)featurize all current customers and make predictions for them. We do this overnight.
• Offline Evaluation: (left blank at this iteration)
• Features: Basic customer info at time t (age, city, etc.). Events between (t − 1 month) and t: usage of product (# times logged in, functionalities used, etc.); customer support interactions; other contextual, e.g. devices used.
• Building Models: Every month we create a new model from the previous month's customers.
• Live Evaluation and Monitoring: Monitor churn rate; monitor (#non-churn among targeted) / #targets.
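The Decisions box amounts to a simple prioritization by expected revenue at risk; a minimal sketch (the field names and the 0.5 cutoff for "predicted churn" are illustrative assumptions):

```python
def retention_targets(customers):
    """Rank predicted churners by revenue at risk, highest first."""
    churners = [c for c in customers if c["churn_prob"] > 0.5]  # assumed cutoff
    return sorted(churners,
                  key=lambda c: c["churn_prob"] * c["monthly_revenue"],
                  reverse=True)
```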
54. • We predicted a customer would churn, but in the end they didn't…
• Great! Prevention works!
• Sh*t! Data inconsistent…
• Imagine that:
  • client1 and client2 are very similar & both predicted to churn
  • only client2 was targeted, and we made them stay

Input → Output
client1 → Churn
client2 → No-churn
56. The customer retention canvas, revised — identical to slide 52 except the Decisions box:
• Decisions: On the 1st day of every month: randomly filter out 50% of customers (hold-out set); filter out 'no-churn'; sort the remaining by descending (churn prob.) × (monthly revenue) and show the prediction path for each; target customers.
57. Revised again — the Building Models box becomes:
• Building Models: Every month we create a new model from the previous month's hold-out set (or the whole set, when initializing this system). We do this overnight (along with making predictions).
58. Next revision — the Live Evaluation and Monitoring box becomes:
• Live Evaluation and Monitoring: Accuracy of last month's predictions on the hold-out set; compare churn rate & lost revenue between last month's hold-out set and the remaining set; monitor (#non-churn among targeted) / #targets; monitor ROI (based on the difference in lost revenue & the cost of the retention campaign).
59. Next revision — the Offline Evaluation box is filled in, and Making Predictions and Building Models are synchronized with it:
• Offline Evaluation: Before targeting customers: evaluate the new model's accuracy on pre-defined customer profiles; simulate the decisions that would have been taken on last month's customers (using the model learnt from the customers of 2 months ago), and compute ROI with different numbers of customers to target & hypotheses on the retention success rate (is it > 0?).
• Making Predictions: Every month we (re-)featurize all current customers and make predictions for them. We do this overnight (along with building the model that powers these predictions and evaluating it).
• Building Models: Every month we create a new model from the previous month's hold-out set (or the whole set, when initializing this system). We do this overnight (along with offline evaluation and making predictions).
60. Final revision — the last step of the Decisions box changes:
• Decisions: On the 1st day of every month: randomly filter out 50% of customers (hold-out set); filter out 'no-churn'; sort the remaining by descending (churn prob.) × (monthly revenue) and show the prediction path for each; target as many customers as suggested by the simulation.
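A sketch of the ROI simulation from the Offline Evaluation box — replaying last month's decisions under a hypothesized retention success rate (the field names, parameters, and one-month revenue horizon are all assumptions):

```python
def simulate_retention_roi(customers, n_target, success_rate, cost_per_target):
    """Estimate ROI of targeting the top-n predicted churners from last month.

    customers: last month's customers with known outcomes, each a dict with
    'churn_prob', 'monthly_revenue' and 'churned' (the actual outcome).
    """
    ranked = sorted(customers,
                    key=lambda c: c["churn_prob"] * c["monthly_revenue"],
                    reverse=True)
    targeted = ranked[:n_target]
    # Revenue we would expect to save: actual churners we reach, times the
    # hypothesized probability that the retention action works.
    saved = success_rate * sum(c["monthly_revenue"]
                               for c in targeted if c["churned"])
    spent = cost_per_target * n_target
    return (saved - spent) / spent   # only target if this is > 0

# Sweep n_target and success_rate hypotheses to pick how many to contact.
```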
62. Coming up with a good ML use case
• Adapt use cases from other industries/companies?
• Start from the value proposition?
  • Can you formalize a classification or regression problem?
• Start from a classification or regression problem?
  • How do you go from predictions to value creation?
• Start from data sources: what if we could predict this?
63. Coming up with a good ML use case
• For each use case idea:
  • Evaluate how difficult data collection and extraction will be
  • Evaluate the potential for the business
• Start with the low-hanging fruit: easy and high potential
• Fill in the MLC
64. From MLC to Data Preparation
• Fill in the MLC
• Choose technologies to use
• Implement data collection ASAP
65. From Data Preparation to PoC
• Feature extraction from sources of raw data
• Exploratory Data Analysis (with visualization and statistics)
  • Spot problems early… reality check!
  • Discover things you don't already know
  • Test hypotheses
• Data cleansing
• Modeling, offline evaluation and inspection
66. From PoC to deployment
• Pipeline: extraction + cleansing + modeling + evaluation
• Live evaluation and monitoring (e.g. A/B test)
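One way to wire the PoC steps into a single deployable pipeline — a minimal sketch using scikit-learn (the deck doesn't prescribe a library; the choice of imputer, scaler, and model here is illustrative):

```python
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# cleansing + feature preparation + modeling in one object, so offline and
# live evaluation exercise exactly the same code path
pipeline = Pipeline([
    ("clean", SimpleImputer(strategy="median")),   # data cleansing
    ("scale", StandardScaler()),                   # feature preparation
    ("model", LogisticRegression()),               # modeling
])

# pipeline.fit(X_train, y_train); pipeline.predict_proba(X_new)
```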
67. "The Machine Learning Canvas is providing our clients real business value by supplying the first critical entry point for their implementation of predictive applications." —Ingolf Mollat, Principal Consultant at Blue Yonder
68. Why fill in the ML Canvas early
• Assist data scientists, software engineers, product and business managers in aligning their activities
• Make sure all efforts are directed at solving the right problem!
• Guide project management
69. Learn more
• Download the ML Starter Kit (includes canvas + PDF guide) from louisdorard.com
• UCL Engineering's ML Academy
  • Evening course, once a week, over 6 weeks
  • Starts on Monday at IDEALondon
  • Email l.dorard@ucl.ac.uk to apply