This week’s CDx Connection Summit covers AI in the enterprise, providing practical, empirical and farsighted advise for those working on AI in large organizations from Pete Skomoroch and Tim O’Reilly.
4. • It’s hard to know which models will work
• ML needs lots of labelled training data
• ML is prone to fail in unexpected ways
• Development progress is often episodic
and unpredictable
• Many research & production systems still
held together by duct tape, notebooks
• Input data & dependencies change, so
past model runs are difficult to reproduce
Machine Learning is Still Difficult
5. If you only do things where you know
the answer in advance, your company
goes away.
Jeff Bezos
Founder, Chairman & CEO of Amazon.com
• Machine Learning shifts
engineering from a deterministic
process to a probabilistic one
• Take intelligent risks
• Most successful ML products are
experiments at massive scale
• Companies driven by analytics
and experimental insights are
more likely to succeed
ML Adds Uncertainty
6. How Does Uncertainty Alter Product Roadmaps?
• PMs are often uncomfortable with expensive ideas that have an
uncertain probability of success
• Many organizations will struggle to justify the expense of projects that
require significant research investment upfront
• Some ML products may need to be split into time boxed projects that
get to market in a shorter time frame
• What can you productize now vs. much later on?
• Keep track of dependencies on other teams and have a “Plan B”
7. Machine Learning Development Process
1. Problem definition and framing
2. Model design
3. Data collection, labelling, and cleaning
4. Feature engineering
5. Model training and offline validation
6. Model deployment, monitoring &
online training
Problem
Definition
Model
Design
Data
Collection &
Cleaning
Feature
Engineering
Model
Training
Deployment
& Monitoring
9. How do you manage expectations and
deliver on promises?
10. Put Company Strategy and Goals First
10@peteskomoroch
• Identify the right business problems
• Pair ML applications directly to company
objectives & key metrics
• Example: LinkedIn’s People You May
Know
• Get executive sponsorship and support
• Work with business units where data is
readily accessible
11. Learn Stakeholder OKRs
11@peteskomoroch
• Objectives and Key Results (OKRs) are a common goal
management framework used in the Enterprise
• Learn and track business unit KPIs, understand how your team can
help drive them
• Translate your team’s objectives to match stakeholder OKRs
instead of only focusing on model accuracy improvements
• Build credibility by improving key metrics over time
12. Define Objectives and Benchmarks, Track Everything
• Instrument your platform from
training through deployment
• Version everything: code, data,
labels, models, hyper-parameters…
• Data Scientists should be able to
make changes to models and kick
off new training runs quickly
• Track accuracy against benchmark
datasets regularly
• Communicate progress towards
goals regularly to stakeholders
12@peteskomoroch
Source: @paperswithcode
13. Anticipate & Monitor Errors, Report Externally
• Diversity of ever-changing data sources
• Data issues cascade through the ML stack
• Features also change, drift, break over time
• Model accuracy can decay over time
• Need data monitoring to spot statistical
changes in data inputs, anomalies
• Unlike traditional applications, model
performance is non-deterministic
• Need ability to adjust for Black Swan events
like COVID-19 Source: @fiddlerlabs
14. Flywheels: Communicate a Vision of ML Value
• Users generate data as a side effect of
using most software products
• That data in turn, can improve the
product’s algorithms and enable new
types of recommendations, leading to
more data
• These “Flywheels” get better the more
customers use them leading to unique
competitive moats
• This works well in platforms, networks or
marketplaces where value compounds
* https://medium.freecodecamp.org/the-business-implications-of-machine-learning-11480b99184d
15. Bridging the Gap: Key Points
15@peteskomoroch
• Focus on real business problems and priorities
• Learn the OKRs for partner teams, ensure they have data
• Communicate early and often in language stakeholders understand
• Manage expectations on time frames, impact, and quality of ML
• Build credibility by improving key metrics over time