CD4ML and the challenges
of testing and quality in ML
systems
TensorFlow London Meetup, May 2020
Danilo Sato
@dtsato
©ThoughtWorks 2020 - @dtsato
TensorFlow London Meetup - May 28, 2020
7000+ technologists with 43 offices in 14 countries
We help clients become Modern Digital Businesses
DELIVER VALUE, MOVE FAST, THINK BIG
#1 in Agile and Continuous Delivery
100+ books written
Techniques
Continuous delivery for machine learning (CD4ML)
TRIAL
7
https://www.thoughtworks.com/radar
PRODUCTIONIZING ML IS HARD
CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible.
But machine learning systems have unique challenges: unlike deterministic software, the behavior of data-driven intelligent systems is difficult, or even impossible, to fully understand. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles.
Production systems should be:
● Reproducible
● Testable
● Auditable
● Continuously Improving
Machine Learning is:
● Non-deterministic
● Hard to test
● Hard to explain
● Hard to improve
HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO
INTELLIGENT SYSTEMS?
MANY SOURCES OF CHANGE
Data + Model + Code
● Data: Schema, Sampling over Time, Volume
● Model: Algorithms, More Training, Experiments
● Code: Business Needs, Bug Fixes, Configuration
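One way to make these three sources of change visible is to fingerprint them together, so that a change in any one of them produces a new version identifier. The sketch below is only an illustration: the function name `fingerprint` and its inputs (`data_rows`, `model_config`, `code_version`) are hypothetical and do not come from the talk.

```python
import hashlib
import json


def fingerprint(data_rows, model_config, code_version):
    """Hash data, model configuration, and code version into one identifier.

    All three arguments are hypothetical stand-ins: a real pipeline would
    more likely hash dataset files and use a source control commit id.
    """
    payload = json.dumps(
        {"data": data_rows, "model": model_config, "code": code_version},
        sort_keys=True,  # stable key order so equal inputs hash equally
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


base = fingerprint([[1, 2], [3, 4]], {"algorithm": "linear", "epochs": 10}, "abc123")
new_data = fingerprint([[1, 2], [3, 5]], {"algorithm": "linear", "epochs": 10}, "abc123")
new_model = fingerprint([[1, 2], [3, 4]], {"algorithm": "linear", "epochs": 20}, "abc123")
new_code = fingerprint([[1, 2], [3, 4]], {"algorithm": "linear", "epochs": 10}, "def456")
```

Here `base`, `new_data`, `new_model`, and `new_code` all differ, while calling `fingerprint` twice with identical inputs returns the same value.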
“Continuous Delivery is the ability to get changes of
all types — including new features, configuration
changes, bug fixes and experiments — into
production, or into the hands of users, safely and
quickly in a sustainable way.”
- Jez Humble & Dave Farley
PRINCIPLES OF CONTINUOUS DELIVERY
→ Create a Repeatable, Reliable Process for Releasing Software
→ Automate Almost Everything
→ Build Quality In
→ Work in Small Batches
→ Keep Everything in Source Control
→ Done Means “Released”
→ Improve Continuously
TECHNICAL COMPONENTS OF CD4ML
Implementation requires lots of tools, technologies, and architecture decisions to fully automate the end-to-end process.
This presentation will focus on the testing and quality aspects of CD4ML.
DOING CD4ML IS STILL A HARD PROBLEM
● DISCOVERABLE AND ACCESSIBLE DATA
● REPRODUCIBLE MODEL TRAINING
● EXPERIMENTS TRACKING
● ELASTIC INFRASTRUCTURE
● VERSION CONTROL & ARTIFACTS REPOS
● MODEL SERVING
● MODEL DEPLOYMENT
● TESTING & QUALITY
● MONITORING & OBSERVABILITY
● CD ORCHESTRATION
https://martinfowler.com/articles/cd4ml.html
“CLASSIC” SOFTWARE TEST PYRAMID
● UI Tests
● Service Tests
● Unit Tests
(Pyramid axes: Speed, Cost)
https://martinfowler.com/bliki/TestPyramid.html
AS SOFTWARE BECAME MORE COMPLEX
https://martinfowler.com/articles/microservice-testing
TESTING IN PRODUCTION
https://sookocheff.com/post/architecture/testing-in-production/
Data + Model + Code: ??
TESTS FOR DATA
Test layers:
● Data Pipeline
● Data/Feature Validation
● Unit Tests (Transformations, Engineered Features)
Checks:
- Adherence to schemas
- Features can be used
- Schema versioning and compatibility
- Integration tests against (small) sample input
- Adherence to privacy controls
- On-demand quality checks
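As an illustration of "adherence to schemas" checked against a small sample input, here is a minimal sketch in plain Python. The schema, the column names, and the helper `validate_row` are assumptions made up for this example; a real pipeline would more likely rely on a dedicated data validation library.

```python
# Hypothetical schema: column name -> (expected type, nullable?)
SCHEMA = {
    "customer_id": (int, False),
    "age": (int, True),
    "country": (str, False),
}


def validate_row(row, schema):
    """Return a list of human-readable schema violations for one record."""
    errors = []
    for column, (expected_type, nullable) in schema.items():
        if column not in row:
            errors.append(f"missing column: {column}")
            continue
        value = row[column]
        if value is None:
            if not nullable:
                errors.append(f"null in non-nullable column: {column}")
        elif not isinstance(value, expected_type):
            errors.append(
                f"{column}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    # Columns that the schema does not know about are reported as well.
    for column in row:
        if column not in schema:
            errors.append(f"unexpected column: {column}")
    return errors


def test_sample_input_adheres_to_schema():
    # Small, hand-written sample standing in for real pipeline input.
    sample = [
        {"customer_id": 1, "age": 34, "country": "UK"},
        {"customer_id": 2, "age": None, "country": "BR"},
    ]
    for row in sample:
        assert validate_row(row, SCHEMA) == []
```

Keeping the schema as data (rather than scattering checks through the code) also gives schema versioning something concrete to version.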
TESTS FOR MODEL
Test layers:
● ML Training Pipeline
● Model Quality
● Unit Tests (Model Specification)
Checks:
- Compare against a simple model
- Numerical stability (behaviour when NaN or infinite values appear)
- Training is reproducible (watch out for sources of non-determinism, e.g. RNG seeds, initialization order)
- Integration test
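The checks above can be sketched with a toy model. The example below is not TensorFlow code: `train`, `predict`, and `mse` are hypothetical helpers for a one-feature linear model fitted with SGD, chosen so the tests run with the standard library only. It shows a reproducibility test (same seed, same parameters), a comparison against a simple mean baseline, and a numerical stability test for NaN and infinite inputs.

```python
import math
import random


def train(xs, ys, seed, epochs=200, lr=0.05):
    """Fit y ~ w*x + b with SGD; all randomness comes from one seeded RNG."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)  # random initialization
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # sample order is a source of non-determinism
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


def predict(params, x):
    """Reject NaN and infinite inputs instead of silently propagating them."""
    if not math.isfinite(x):
        raise ValueError("non-finite feature value")
    w, b = params
    return w * x + b


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1


def test_training_is_reproducible():
    assert train(XS, YS, seed=42) == train(XS, YS, seed=42)


def test_model_beats_simple_baseline():
    params = train(XS, YS, seed=42)
    model_error = mse([predict(params, x) for x in XS], YS)
    mean_y = sum(YS) / len(YS)  # simple model: always predict the mean
    baseline_error = mse([mean_y] * len(YS), YS)
    assert model_error < baseline_error


def test_non_finite_inputs_are_rejected():
    params = train(XS, YS, seed=42)
    for bad in (float("nan"), float("inf")):
        try:
            predict(params, bad)
        except ValueError:
            continue
        raise AssertionError("expected ValueError for non-finite input")
```

With a real framework, bit-for-bit reproducibility may also depend on hardware and library settings, so a tolerance-based comparison can be a more realistic assertion than strict equality.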
Data + Model + Code
[Diagram: test pyramids for Data, Model, and Code.
Data: Data Pipeline, Data/Feature Validation, Unit Tests (Transformations, Engineered Features).
Model: ML Training Pipeline, Model Quality, Unit Tests (Model Specification).
Code: UI Tests, Service Tests, Unit Tests.
Where they overlap: Model Performance, Contract Tests, Model Bias and Fairness.]
- Model evaluation against different validation datasets
- Thresholds for model metrics and execution performance
- Different data slices
- Feature generation is the same for training and serving
- Model contract is adhered to in production
- When the model is exported, test that it still works
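A threshold test over data slices could look like the sketch below. The record layout (`prediction`, `label`, and a `country` slice column), the sample data, and the 0.7 threshold are all assumptions for illustration; the point is that an acceptable overall metric can hide a slice that falls below the bar.

```python
from collections import defaultdict


def accuracy_by_slice(records, slice_key):
    """Group labelled predictions by a slice column and compute accuracy per group."""
    hits, totals = defaultdict(int), defaultdict(int)
    for record in records:
        totals[record[slice_key]] += 1
        hits[record[slice_key]] += int(record["prediction"] == record["label"])
    return {key: hits[key] / totals[key] for key in totals}


def slices_below_threshold(records, slice_key, min_accuracy):
    """Names of slices whose accuracy falls below the agreed threshold."""
    per_slice = accuracy_by_slice(records, slice_key)
    return sorted(key for key, acc in per_slice.items() if acc < min_accuracy)


# Hypothetical validation results: 3 of 4 correct for UK, 1 of 2 for BR.
RECORDS = [
    {"country": "UK", "prediction": 1, "label": 1},
    {"country": "UK", "prediction": 0, "label": 0},
    {"country": "UK", "prediction": 1, "label": 1},
    {"country": "UK", "prediction": 1, "label": 0},
    {"country": "BR", "prediction": 0, "label": 0},
    {"country": "BR", "prediction": 1, "label": 0},
]

failing = slices_below_threshold(RECORDS, "country", min_accuracy=0.7)
```

In a deployment pipeline, a non-empty `failing` list would fail the build and stop the model from being promoted.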
TESTING WHERE THEY OVERLAP
[Diagram: the combined Data, Model, and Code test pyramid with its overlap tests (Model Performance, Contract Tests, Model Bias and Fairness), extended with End-to-End Tests, Exploratory Tests, and Production Monitoring.]
- Model degradation
- Training/serving skew
- Operational metrics (latency, throughput, resource usage)
- Real impact! (KPIs)
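Training/serving skew can be monitored in many ways; the sketch below uses one deliberately simple heuristic, which is an assumption of this example rather than a method from the talk: it flags a feature when the mean of its serving values drifts from the training mean by more than a chosen number of training standard deviations. The function names and the 0.5 default are hypothetical.

```python
import statistics


def mean_shift_in_std(training_values, serving_values):
    """Shift of the serving mean from the training mean, in training standard deviations."""
    mu = statistics.mean(training_values)
    sigma = statistics.pstdev(training_values)
    serving_mu = statistics.mean(serving_values)
    if sigma == 0:
        # Constant training feature: any change at all counts as unbounded drift.
        return 0.0 if serving_mu == mu else float("inf")
    return abs(serving_mu - mu) / sigma


def skewed_features(training, serving, max_shift=0.5):
    """Names of features whose serving mean drifted more than max_shift."""
    return sorted(
        name
        for name in training
        if mean_shift_in_std(training[name], serving[name]) > max_shift
    )


# Hypothetical feature samples collected at training time and in production.
TRAINING = {"age": [20, 30, 40, 50], "income": [1, 2, 3, 4]}
SERVING = {"age": [21, 29, 41, 49], "income": [5, 6, 7, 8]}

drifted = skewed_features(TRAINING, SERVING)
```

Comparing means only catches location shifts; a production monitor would usually compare whole distributions as well.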
“Inspection does not improve the
quality, nor guarantee quality.
Inspection is too late. The quality,
good or bad, is already in the
product.”
- W. Edwards Deming
QUESTIONS?
WORKSHOPS, PRESENTATIONS & ARTICLES
Workshops:
https://github.com/ThoughtWorksInc/cd4ml-workshop
https://github.com/ThoughtWorksInc/CD4ML-Scenarios
Articles:
https://martinfowler.com/articles/cd4ml.html
https://www.thoughtworks.com/insights/articles/intelligent-enterprise-series-cd4ml
Paper:
“The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction”, Breck et al. (Google)
THANK YOU!
Danilo Sato (dsato@thoughtworks.com)
@dtsato
©ThoughtWorks 2020 - @dtsato
TensorFlow London Meetup - May 28, 2020

Weitere ähnliche Inhalte

Was ist angesagt?

ONNX and MLflow
ONNX and MLflowONNX and MLflow
ONNX and MLflowamesar0
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLJordan Birdsell
 
Introduction to MLflow
Introduction to MLflowIntroduction to MLflow
Introduction to MLflowDatabricks
 
Apply MLOps at Scale
Apply MLOps at ScaleApply MLOps at Scale
Apply MLOps at ScaleDatabricks
 
Best Practices in Qt Quick/QML - Part IV
Best Practices in Qt Quick/QML - Part IVBest Practices in Qt Quick/QML - Part IV
Best Practices in Qt Quick/QML - Part IVICS
 
MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.Knoldus Inc.
 
OOD Principles and Patterns
OOD Principles and PatternsOOD Principles and Patterns
OOD Principles and PatternsNguyen Tung
 
ENEL Electricity Topology Network on Neo4j Graph DB
ENEL Electricity Topology Network on Neo4j Graph DBENEL Electricity Topology Network on Neo4j Graph DB
ENEL Electricity Topology Network on Neo4j Graph DBNeo4j
 
Gocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentGocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentLeandro Totino Pereira
 
Implications of GPT-3
Implications of GPT-3Implications of GPT-3
Implications of GPT-3Raven Jiang
 
Building Large-scale Real-world Recommender Systems - Recsys2012 tutorial
Building Large-scale Real-world Recommender Systems - Recsys2012 tutorialBuilding Large-scale Real-world Recommender Systems - Recsys2012 tutorial
Building Large-scale Real-world Recommender Systems - Recsys2012 tutorialXavier Amatriain
 
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full LifecycleMLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full LifecycleDatabricks
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleDatabricks
 
Python tools to deploy your machine learning models faster
Python tools to deploy your machine learning models fasterPython tools to deploy your machine learning models faster
Python tools to deploy your machine learning models fasterJeff Hale
 
Introduction to Git and Github
Introduction to Git and GithubIntroduction to Git and Github
Introduction to Git and GithubHouari ZEGAI
 
An introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging FaceAn introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging FaceJulien SIMON
 
Apply MLOps at Scale by H&M
Apply MLOps at Scale by H&MApply MLOps at Scale by H&M
Apply MLOps at Scale by H&MDatabricks
 

Was ist angesagt? (20)

ONNX and MLflow
ONNX and MLflowONNX and MLflow
ONNX and MLflow
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of ML
 
Introduction to MLflow
Introduction to MLflowIntroduction to MLflow
Introduction to MLflow
 
What is MLOps
What is MLOpsWhat is MLOps
What is MLOps
 
Apply MLOps at Scale
Apply MLOps at ScaleApply MLOps at Scale
Apply MLOps at Scale
 
Chat GPTs
Chat GPTsChat GPTs
Chat GPTs
 
Best Practices in Qt Quick/QML - Part IV
Best Practices in Qt Quick/QML - Part IVBest Practices in Qt Quick/QML - Part IV
Best Practices in Qt Quick/QML - Part IV
 
MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.
 
Git and git hub basics
Git and git hub basicsGit and git hub basics
Git and git hub basics
 
OOD Principles and Patterns
OOD Principles and PatternsOOD Principles and Patterns
OOD Principles and Patterns
 
ENEL Electricity Topology Network on Neo4j Graph DB
ENEL Electricity Topology Network on Neo4j Graph DBENEL Electricity Topology Network on Neo4j Graph DB
ENEL Electricity Topology Network on Neo4j Graph DB
 
Gocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentGocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous Deployment
 
Implications of GPT-3
Implications of GPT-3Implications of GPT-3
Implications of GPT-3
 
Building Large-scale Real-world Recommender Systems - Recsys2012 tutorial
Building Large-scale Real-world Recommender Systems - Recsys2012 tutorialBuilding Large-scale Real-world Recommender Systems - Recsys2012 tutorial
Building Large-scale Real-world Recommender Systems - Recsys2012 tutorial
 
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full LifecycleMLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
 
Python tools to deploy your machine learning models faster
Python tools to deploy your machine learning models fasterPython tools to deploy your machine learning models faster
Python tools to deploy your machine learning models faster
 
Introduction to Git and Github
Introduction to Git and GithubIntroduction to Git and Github
Introduction to Git and Github
 
An introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging FaceAn introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging Face
 
Apply MLOps at Scale by H&M
Apply MLOps at Scale by H&MApply MLOps at Scale by H&M
Apply MLOps at Scale by H&M
 

Ähnlich wie CD4ML and the challenges of testing and quality in ML systems

Continuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in ProductionContinuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in ProductionDr. Arif Wider
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsDataPhoenix
 
Continuous Intelligence: Moving Machine Learning into Production Reliably
Continuous Intelligence: Moving Machine Learning into Production ReliablyContinuous Intelligence: Moving Machine Learning into Production Reliably
Continuous Intelligence: Moving Machine Learning into Production ReliablyDr. Arif Wider
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Itai Yaffe
 
Performance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsPerformance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsMartin Gutenbrunner
 
DevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the AnswerDevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the AnswerDevOpsGroup
 
Rsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AIRsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AISanjana Chowdhury
 
Continuous Delivery for Machine Learning
Continuous Delivery for Machine LearningContinuous Delivery for Machine Learning
Continuous Delivery for Machine LearningThoughtworks
 
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019Christoph Windheuser
 
Data Science Meets DevOps: GitOps with OpenShift (1).pdf
Data Science Meets DevOps: GitOps with OpenShift (1).pdfData Science Meets DevOps: GitOps with OpenShift (1).pdf
Data Science Meets DevOps: GitOps with OpenShift (1).pdfHemaVeeradhi1
 
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018 Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018 Open Data Group
 
Our research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software EngineeringOur research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software EngineeringJordi Cabot
 
Model Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model AnalysisModel Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model AnalysisVivek Raja P S
 
Continuous delivery practices and real experiences
Continuous delivery   practices and real experiencesContinuous delivery   practices and real experiences
Continuous delivery practices and real experiencesEduardo Ferro Aldama
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionProvectus
 
Understand your data dependencies – Key enabler to efficient modernisation
 Understand your data dependencies – Key enabler to efficient modernisation  Understand your data dependencies – Key enabler to efficient modernisation
Understand your data dependencies – Key enabler to efficient modernisation Profinit
 
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...Rik Marselis
 
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Lionel Briand
 

Ähnlich wie CD4ML and the challenges of testing and quality in ML systems (20)

Continuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in ProductionContinuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in Production
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOps
 
IBM Think Milano
IBM Think MilanoIBM Think Milano
IBM Think Milano
 
Continuous Intelligence: Moving Machine Learning into Production Reliably
Continuous Intelligence: Moving Machine Learning into Production ReliablyContinuous Intelligence: Moving Machine Learning into Production Reliably
Continuous Intelligence: Moving Machine Learning into Production Reliably
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?
 
Performance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsPerformance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environments
 
Eliminate 7 Mudas
Eliminate 7 MudasEliminate 7 Mudas
Eliminate 7 Mudas
 
DevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the AnswerDevOpsGuys FutureDecoded 2016 - is DevOps the Answer
DevOpsGuys FutureDecoded 2016 - is DevOps the Answer
 
Rsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AIRsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AI
 
Continuous Delivery for Machine Learning
Continuous Delivery for Machine LearningContinuous Delivery for Machine Learning
Continuous Delivery for Machine Learning
 
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
 
Data Science Meets DevOps: GitOps with OpenShift (1).pdf
Data Science Meets DevOps: GitOps with OpenShift (1).pdfData Science Meets DevOps: GitOps with OpenShift (1).pdf
Data Science Meets DevOps: GitOps with OpenShift (1).pdf
 
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018 Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
 
Our research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software EngineeringOur research lines on Model-Driven Engineering and Software Engineering
Our research lines on Model-Driven Engineering and Software Engineering
 
Model Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model AnalysisModel Drift Monitoring using Tensorflow Model Analysis
Model Drift Monitoring using Tensorflow Model Analysis
 
Continuous delivery practices and real experiences
Continuous delivery   practices and real experiencesContinuous delivery   practices and real experiences
Continuous delivery practices and real experiences
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in Production
 
Understand your data dependencies – Key enabler to efficient modernisation
 Understand your data dependencies – Key enabler to efficient modernisation  Understand your data dependencies – Key enabler to efficient modernisation
Understand your data dependencies – Key enabler to efficient modernisation
 
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...
Introduction of TMAP to representatives of ISTQB boards in the GA week in Mar...
 
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
 

Mehr von Seldon

TensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsTensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsSeldon
 
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz Santissi
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz SantissiTensorflow London: Tensorflow and Graph Recommender Networks by Yaz Santissi
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz SantissiSeldon
 
TensorFlow London: Progressive Growing of GANs for increased stability, quali...
TensorFlow London: Progressive Growing of GANs for increased stability, quali...TensorFlow London: Progressive Growing of GANs for increased stability, quali...
TensorFlow London: Progressive Growing of GANs for increased stability, quali...Seldon
 
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...Seldon
 
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...Seldon
 
Seldon: Deploying Models at Scale
Seldon: Deploying Models at ScaleSeldon: Deploying Models at Scale
Seldon: Deploying Models at ScaleSeldon
 
TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...
TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...
TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...Seldon
 
TensorFlow London 17: Practical Reinforcement Learning with OpenAI
TensorFlow London 17: Practical Reinforcement Learning with OpenAITensorFlow London 17: Practical Reinforcement Learning with OpenAI
TensorFlow London 17: Practical Reinforcement Learning with OpenAISeldon
 
TensorFlow 16: Multimodal Sentiment Analysis with TensorFlow
TensorFlow 16: Multimodal Sentiment Analysis with TensorFlow TensorFlow 16: Multimodal Sentiment Analysis with TensorFlow
TensorFlow 16: Multimodal Sentiment Analysis with TensorFlow Seldon
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform Seldon
 
Ai in financial services
Ai in financial servicesAi in financial services
Ai in financial servicesSeldon
 
TensorFlow London 15: Find bugs in the herd with debuggable TensorFlow code
TensorFlow London 15: Find bugs in the herd with debuggable TensorFlow code TensorFlow London 15: Find bugs in the herd with debuggable TensorFlow code
TensorFlow London 15: Find bugs in the herd with debuggable TensorFlow code Seldon
 
TensorFlow London 14: Ben Hall 'Machine Learning Workloads with Kubernetes an...
TensorFlow London 14: Ben Hall 'Machine Learning Workloads with Kubernetes an...TensorFlow London 14: Ben Hall 'Machine Learning Workloads with Kubernetes an...
TensorFlow London 14: Ben Hall 'Machine Learning Workloads with Kubernetes an...Seldon
 
Tensorflow London 13: Barbara Fusinska 'Hassle Free, Scalable, Machine Learni...
Tensorflow London 13: Barbara Fusinska 'Hassle Free, Scalable, Machine Learni...Tensorflow London 13: Barbara Fusinska 'Hassle Free, Scalable, Machine Learni...
Tensorflow London 13: Barbara Fusinska 'Hassle Free, Scalable, Machine Learni...Seldon
 
Tensorflow London 13: Zbigniew Wojna 'Deep Learning for Big Scale 2D Imagery'
Tensorflow London 13: Zbigniew Wojna 'Deep Learning for Big Scale 2D Imagery'Tensorflow London 13: Zbigniew Wojna 'Deep Learning for Big Scale 2D Imagery'
Tensorflow London 13: Zbigniew Wojna 'Deep Learning for Big Scale 2D Imagery'Seldon
 
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...Seldon
 
TensorFlow London 11: Gema Parreno 'Use Cases of TensorFlow'
TensorFlow London 11: Gema Parreno 'Use Cases of TensorFlow'TensorFlow London 11: Gema Parreno 'Use Cases of TensorFlow'
TensorFlow London 11: Gema Parreno 'Use Cases of TensorFlow'Seldon
 
Tensorflow London 12: Marcel Horstmann and Laurent Decamp 'Using TensorFlow t...
Tensorflow London 12: Marcel Horstmann and Laurent Decamp 'Using TensorFlow t...Tensorflow London 12: Marcel Horstmann and Laurent Decamp 'Using TensorFlow t...
Tensorflow London 12: Marcel Horstmann and Laurent Decamp 'Using TensorFlow t...Seldon
 
TensorFlow London 12: Oliver Gindele 'Recommender systems in Tensorflow'
TensorFlow London 12: Oliver Gindele 'Recommender systems in Tensorflow'TensorFlow London 12: Oliver Gindele 'Recommender systems in Tensorflow'
TensorFlow London 12: Oliver Gindele 'Recommender systems in Tensorflow'Seldon
 
TensorFlow London 13.09.17 Ilya Dmitrichenko
TensorFlow London 13.09.17 Ilya DmitrichenkoTensorFlow London 13.09.17 Ilya Dmitrichenko
TensorFlow London 13.09.17 Ilya DmitrichenkoSeldon
 

Mehr von Seldon (20)

TensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsTensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative models
 
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz Santissi
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz SantissiTensorflow London: Tensorflow and Graph Recommender Networks by Yaz Santissi
Tensorflow London: Tensorflow and Graph Recommender Networks by Yaz Santissi
 
TensorFlow London: Progressive Growing of GANs for increased stability, quali...
TensorFlow London: Progressive Growing of GANs for increased stability, quali...TensorFlow London: Progressive Growing of GANs for increased stability, quali...
TensorFlow London: Progressive Growing of GANs for increased stability, quali...
 
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...
TensorFlow London 18: Dr Daniel Martinho-Corbishley, From science to startups...
 
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...
TensorFlow London 18: Dr Alastair Moore, Towards the use of Graphical Models ...
 
Seldon: Deploying Models at Scale
Seldon: Deploying Models at ScaleSeldon: Deploying Models at Scale
Seldon: Deploying Models at Scale
 
TensorFlow London 17: How NASA Frontier Development Lab scientists use AI to ...
CD4ML and the challenges of testing and quality in ML systems

  • 1. CD4ML and the challenges of testing and quality in ML systems. TensorFlow London Meetup, May 2020. Danilo Sato, @dtsato. ©ThoughtWorks 2020 - @dtsato, TensorFlow London Meetup - May 28, 2020
  • 2. 7000+ technologists with 43 offices in 14 countries. We help clients become Modern Digital Businesses. DELIVER VALUE, MOVE FAST, THINK BIG
  • 3. #1 in Agile and Continuous Delivery. 100+ books written.
  • 4.
  • 5. Techniques: Continuous delivery for machine learning (CD4ML). TRIAL 7. https://www.thoughtworks.com/radar
  • 6. PRODUCTIONIZING ML IS HARD. CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible. But machine learning systems have unique challenges; unlike deterministic software, it is difficult—or impossible—to understand the behavior of data-driven intelligent systems. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles. Production systems should be: ● Reproducible ● Testable ● Auditable ● Continuously Improving. HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
  • 7. PRODUCTIONIZING ML IS HARD. Production systems should be: ● Reproducible ● Testable ● Auditable ● Continuously Improving. Machine Learning is: ● Non-deterministic ● Hard to test ● Hard to explain ● Hard to improve. HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
  • 8. MANY SOURCES OF CHANGE. Data (Schema, Sampling over Time, Volume) + Model (Algorithms, More Training, Experiments) + Code (Business Needs, Bug Fixes, Configuration)
  • 9. “Continuous Delivery is the ability to get changes of all types — including new features, configuration changes, bug fixes and experiments — into production, or into the hands of users, safely and quickly in a sustainable way.” - Jez Humble & Dave Farley
  • 10. PRINCIPLES OF CONTINUOUS DELIVERY → Create a Repeatable, Reliable Process for Releasing Software → Automate Almost Everything → Build Quality In → Work in Small Batches → Keep Everything in Source Control → Done Means “Released” → Improve Continuously
  • 11. TECHNICAL COMPONENTS OF CD4ML. DOING CD4ML IS STILL A HARD PROBLEM. Implementation requires lots of tools, technologies, and architecture decisions to fully automate the end-to-end process. This presentation will focus on the testing and quality aspects of CD4ML. Components: DISCOVERABLE AND ACCESSIBLE DATA; REPRODUCIBLE MODEL TRAINING; EXPERIMENTS TRACKING; ELASTIC INFRASTRUCTURE; VERSION CONTROL & ARTIFACTS REPOS; MODEL SERVING; MODEL DEPLOYMENT; TESTING & QUALITY; MONITORING & OBSERVABILITY; CD ORCHESTRATION. https://martinfowler.com/articles/cd4ml.html
  • 12. “CLASSIC” SOFTWARE TEST PYRAMID. Layers: UI Tests; Service Tests; Unit Tests. Axes: Speed, Cost. https://martinfowler.com/bliki/TestPyramid.html
  • 13. AS SOFTWARE BECAME MORE COMPLEX. https://martinfowler.com/articles/microservice-testing
  • 15. Data + Model + Code: ??
  • 16. TESTS FOR DATA. Layers: Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features). Checks: - Adherence to schemas - Features can be used - Schema versioning and compatibility - Integration tests against (small) sample input - Adherence to privacy controls - On-demand quality checks
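
Two of the checks on slide 16, adherence to schemas and unit tests for engineered features, might be sketched roughly as below. This is only an illustration in plain Python: the `SCHEMA` dictionary, the column names, and the `age_bucket` feature are hypothetical and do not come from the talk.

```python
# Hypothetical schema (an assumption for illustration):
# column name -> (expected type, nullable)
SCHEMA = {
    "user_id": (int, False),
    "age": (int, True),
    "country": (str, False),
}


def validate_row(row, schema=SCHEMA):
    """Return a list of human-readable schema violations for one record."""
    errors = []
    for name, (expected_type, nullable) in schema.items():
        if name not in row:
            errors.append(f"missing column: {name}")
            continue
        value = row[name]
        if value is None:
            if not nullable:
                errors.append(f"null in non-nullable column: {name}")
        elif not isinstance(value, expected_type):
            errors.append(
                f"{name}: expected {expected_type.__name__}, got {type(value).__name__}"
            )
    for name in row:
        if name not in schema:
            errors.append(f"unexpected column: {name}")
    return errors


def age_bucket(age):
    """Hypothetical engineered feature: map an age to a coarse category."""
    if age is None:
        return "unknown"
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "senior"


# Unit-test style checks that could run in a pipeline stage.
assert validate_row({"user_id": 1, "age": None, "country": "UK"}) == []
assert age_bucket(17) == "minor" and age_bucket(65) == "senior"
```

Returning a list of violations, rather than raising on the first one, lets a pipeline stage report every problem in a batch before deciding whether to fail the build.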
  • 17. TESTS FOR MODEL. Layers: ML Training Pipeline; Model Quality; Unit Tests (Model Specification). Checks: - Compare against a simple model - Numerical stability (behaviour when NaN or infinite values appear) - Training is reproducible (watch out for sources of non-determinism, e.g. RNG seeds, initialization order) - Integration test
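
As a rough sketch of three checks from slide 17 (reproducible training, comparison against a simple model, and behaviour on NaN or infinite values), the toy example below trains a one-variable linear model with seeded SGD. The model, the data, and the function names are assumptions made for illustration; a real TensorFlow pipeline would need its own seeding and determinism settings.

```python
import math
import random


def train_linear(xs, ys, seed, epochs=500, lr=0.01):
    """Fit y ~ w*x + b by SGD; all randomness comes from one seeded RNG."""
    if any(not math.isfinite(v) for v in list(xs) + list(ys)):
        raise ValueError("non-finite value in training data")
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # shuffling is also driven by the seeded RNG
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


def mse(preds, ys):
    """Mean squared error between predictions and targets."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

# Reproducibility: the same seed must give exactly the same parameters.
assert train_linear(xs, ys, seed=42) == train_linear(xs, ys, seed=42)

# Baseline comparison: the model should beat "always predict the mean".
w, b = train_linear(xs, ys, seed=42)
model_mse = mse([w * x + b for x in xs], ys)
baseline_mse = mse([sum(ys) / len(ys)] * len(ys), ys)
assert model_mse < baseline_mse
```

Rejecting non-finite inputs up front is only one possible policy; the point of the test is that the behaviour is defined and checked rather than discovered in production.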
  • 18. Data + Model + Code
  • 19. TESTING WHERE THEY OVERLAP. Layers: Model Performance; Contract Tests; Model Bias and Fairness; Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features); Unit Tests (Model Specification); Model Quality; UI Tests; Service Tests; Unit Tests; ML Training Pipeline. Checks: - Model evaluation against different validation datasets - Thresholds for model metrics and execution performance - Different data slices - Feature generation is the same for training/serving - Model contract is adhered to in production - When the model is exported, test that it still works
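
The "thresholds for model metrics" and "different data slices" checks on slide 19 could look something like the sketch below, which shows how an acceptable overall metric can hide a failing slice. The record layout, the `region` slice key, and the 0.7 threshold are hypothetical.

```python
def accuracy(pairs):
    """Fraction of (prediction, label) pairs that match."""
    return sum(1 for pred, label in pairs if pred == label) / len(pairs)


def failing_slices(records, slice_key, min_accuracy):
    """Return {slice_value: accuracy} for every slice scoring below min_accuracy."""
    by_slice = {}
    for rec in records:
        by_slice.setdefault(rec[slice_key], []).append((rec["pred"], rec["label"]))
    scores = {name: accuracy(pairs) for name, pairs in by_slice.items()}
    return {name: score for name, score in scores.items() if score < min_accuracy}


# Hypothetical validation records: 4 correct for UK, 2 correct and 2 wrong for BR.
records = (
    [{"region": "UK", "pred": 1, "label": 1}] * 4
    + [{"region": "BR", "pred": 1, "label": 1}] * 2
    + [{"region": "BR", "pred": 0, "label": 1}] * 2
)

overall = accuracy([(r["pred"], r["label"]) for r in records])
print(overall, failing_slices(records, "region", min_accuracy=0.7))
# prints: 0.75 {'BR': 0.5}
```

Here the overall accuracy of 0.75 clears the threshold while the BR slice does not, which is the kind of gap a per-slice gate in the pipeline is meant to surface.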
  • 20. Layers: Model Performance; Contract Tests; Model Bias and Fairness; Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features); Unit Tests (Model Specification); Model Quality; UI Tests; Service Tests; Unit Tests; End-to-End Tests; Production Monitoring; Exploratory Tests; ML Training Pipeline. Production monitoring covers: - Model degradation - Training/serving skew - Operational metrics (latency, throughput, resource usage) - Real impact! (KPIs)
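
One simple way to watch for the training/serving skew mentioned on slide 20 is to compare a feature's serving-time mean with its training-time mean, measured in training standard deviations. This heuristic, the sample values, and the alert threshold of 3 are assumptions chosen for illustration, not a method described in the talk.

```python
import statistics


def mean_shift(training_values, serving_values):
    """Shift of the serving mean from the training mean, in training standard deviations."""
    mu = statistics.fmean(training_values)
    sigma = statistics.pstdev(training_values)
    serving_mu = statistics.fmean(serving_values)
    if sigma == 0:
        # A constant training feature: any change at all counts as unbounded shift.
        return 0.0 if serving_mu == mu else float("inf")
    return abs(serving_mu - mu) / sigma


# Hypothetical feature values seen at training time and in production.
training = [10, 12, 14, 16, 18]
serving_ok = [11, 13, 14, 15, 17]
serving_shifted = [20, 22, 24, 26, 28]

ALERT_THRESHOLD = 3.0  # assumed value; tune per feature
for name, values in [("ok", serving_ok), ("shifted", serving_shifted)]:
    shift = mean_shift(training, values)
    print(name, "ALERT" if shift > ALERT_THRESHOLD else "fine")
```

A mean comparison misses changes in shape or variance, so in practice it would sit alongside richer distribution checks and the operational and KPI metrics the slide lists.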
  • 21. “Inspection does not improve the quality, nor guarantee quality. Inspection is too late. The quality, good or bad, is already in the product.” - W. Edwards Deming
  • 22. QUESTIONS?
  • 24. THANK YOU! Danilo Sato (dsato@thoughtworks.com) @dtsato