CD4ML and the challenges of testing and quality in ML systems

Speaker: Danilo Sato, principal consultant at ThoughtWorks.

Bio: Danilo Sato (@dtsato) is a principal consultant at ThoughtWorks with experience in many areas of architecture and engineering: software, data, infrastructure, and machine learning. He is the author of "DevOps in Practice: Reliable and Automated Software Delivery" and a member of both the ThoughtWorks Technology Advisory Board and the ThoughtWorks Office of the CTO.

Title: CD4ML and the challenges of testing and quality in ML systems

Abstract: Continuous Delivery for Machine Learning (CD4ML) deals with the challenges of applying Continuous Delivery principles to ML systems to make the end-to-end process of developing and deploying them more repeatable and reliable. These systems are generally more complex than traditional software applications, and ML models are non-deterministic and hard to explain. In this talk we will discuss the challenges of testing and quality in ML systems, and share some practices for applying different types of tests to help overcome those issues.

www.devopsinpractice.com
www.devopsnapratica.com.br

Published in: Technology

CD4ML and the challenges of testing and quality in ML systems

  1. CD4ML and the challenges of testing and quality in ML systems. TensorFlow London Meetup, May 2020. Danilo Sato (@dtsato). ©ThoughtWorks 2020 - @dtsato TensorFlow London Meetup - May 28, 2020
  2. 7000+ technologists with 43 offices in 14 countries. We help clients become Modern Digital Businesses. DELIVER VALUE. MOVE FAST. THINK BIG.
  3. #1 in Agile and Continuous Delivery. 100+ books written.
  4. Techniques: Continuous delivery for machine learning (CD4ML). TRIAL 7. https://www.thoughtworks.com/radar
  5. PRODUCTIONIZING ML IS HARD. CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible. But machine learning systems have unique challenges; unlike deterministic software, it is difficult (or impossible) to understand the behavior of data-driven intelligent systems. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles. Production systems should be: ● Reproducible ● Testable ● Auditable ● Continuously Improving. HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
  6. PRODUCTIONIZING ML IS HARD. Production systems should be: ● Reproducible ● Testable ● Auditable ● Continuously Improving. Machine Learning is: ● Non-deterministic ● Hard to test ● Hard to explain ● Hard to improve. HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
  7. MANY SOURCES OF CHANGE. Data (Schema, Sampling over Time, Volume) + Model (Algorithms, More Training, Experiments) + Code (Business Needs, Bug Fixes, Configuration).
  8. “Continuous Delivery is the ability to get changes of all types, including new features, configuration changes, bug fixes and experiments, into production, or into the hands of users, safely and quickly in a sustainable way.” - Jez Humble & Dave Farley
  9. PRINCIPLES OF CONTINUOUS DELIVERY → Create a Repeatable, Reliable Process for Releasing Software → Automate Almost Everything → Build Quality In → Work in Small Batches → Keep Everything in Source Control → Done Means “Released” → Improve Continuously
  10. TECHNICAL COMPONENTS OF CD4ML. DOING CD4ML IS STILL A HARD PROBLEM. Implementation requires lots of tools, technologies, and architecture decisions to fully automate the end-to-end process. This presentation will focus on the testing and quality aspects of CD4ML. Components: Discoverable and Accessible Data; Reproducible Model Training; Experiments Tracking; Elastic Infrastructure; Version Control & Artifacts Repos; Model Serving; Model Deployment; Testing & Quality; Monitoring & Observability; CD Orchestration. https://martinfowler.com/articles/cd4ml.html
  11. “CLASSIC” SOFTWARE TEST PYRAMID. [Figure: UI Tests, Service Tests, Unit Tests; axes labelled Speed and Cost.] https://martinfowler.com/bliki/TestPyramid.html
  12. AS SOFTWARE BECAME MORE COMPLEX. https://martinfowler.com/articles/microservice-testing
  13. TESTING IN PRODUCTION. https://sookocheff.com/post/architecture/testing-in-production/
  14. Data + Model + Code: ??
  15. TESTS FOR DATA. Test layers: Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features). Checks: adherence to schemas; features can be used; schema versioning and compatibility; integration tests against (small) sample input; adherence to privacy controls; on-demand quality checks.
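
Two of the data checks on this slide, adherence to schemas and unit tests for engineered features, could look roughly like the sketch below. It is not taken from the talk: the schema, the column names (`store_id`, `unit_price`, `promo_code`) and both helper functions are hypothetical, and it uses only the Python standard library.

```python
from math import isfinite

# Hypothetical schema for illustration: column name -> (expected type, nullable)
SCHEMA = {
    "store_id": (int, False),
    "unit_price": (float, False),
    "promo_code": (str, True),
}


def validate_row(row, schema=SCHEMA):
    """Return a list of human-readable schema violations for one record."""
    errors = []
    for column, (expected_type, nullable) in schema.items():
        if column not in row:
            errors.append(f"missing column: {column}")
            continue
        value = row[column]
        if value is None:
            if not nullable:
                errors.append(f"null not allowed: {column}")
            continue
        if not isinstance(value, expected_type):
            errors.append(f"wrong type for {column}: {type(value).__name__}")
        elif expected_type is float and not isfinite(value):
            errors.append(f"non-finite value: {column}")
    # Columns the schema does not know about are reported too.
    unexpected = sorted(set(row) - set(schema))
    errors.extend(f"unexpected column: {name}" for name in unexpected)
    return errors


def add_discounted_price(row, discount=0.1):
    """Hypothetical engineered feature: price after a flat discount."""
    out = dict(row)  # copy so the input record is left untouched
    out["discounted_price"] = round(row["unit_price"] * (1 - discount), 2)
    return out
```

A validation step in a data pipeline could call `validate_row` on a small sample of input and fail the build when any violation is returned, while `add_discounted_price` is the kind of small, pure transformation that is easy to cover with an ordinary unit test.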
  16. TESTS FOR MODEL. Test layers: Unit Tests (Model Specification); Model Quality; ML Training Pipeline. Checks: compare against a simple model; numerical stability (behaviour when NaN or infinite values appear); training is reproducible (watch out for sources of non-determinism, e.g. RNG seeds, initialization order); integration test.
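
As a rough illustration of three of the model checks on this slide (reproducible training, comparison against a simple model, and behaviour on NaN or infinite input), here is a minimal sketch using a toy linear model trained with SGD in plain Python. The data generator, hyperparameters, and function names are all assumptions made for the example; a real project would apply the same ideas to its own training code and framework.

```python
import math
import random


def make_data(n, seed):
    """Synthetic data: y = 3x + 2 plus Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


def train_linear(xs, ys, seed, epochs=200, lr=0.005):
    """SGD for y = w*x + b. All randomness comes from one local, seeded RNG."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), 0.0  # seeded initialization
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # seeded sample order
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


def predict(model, x):
    """Reject non-finite input instead of silently returning NaN."""
    if not math.isfinite(x):
        raise ValueError(f"non-finite input: {x!r}")
    w, b = model
    return w * x + b


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def baseline_mse(train_ys, ys):
    """Simple model to beat: always predict the mean of the training labels."""
    mean = sum(train_ys) / len(train_ys)
    return sum((mean - y) ** 2 for y in ys) / len(ys)


def test_training_is_reproducible():
    xs, ys = make_data(200, seed=1)
    assert train_linear(xs, ys, seed=7) == train_linear(xs, ys, seed=7)


def test_model_beats_simple_baseline():
    xs, ys = make_data(200, seed=1)
    test_xs, test_ys = make_data(50, seed=2)
    model = train_linear(xs, ys, seed=7)
    assert mse(model, test_xs, test_ys) < 0.5 * baseline_mse(ys, test_ys)


def test_non_finite_input_is_rejected():
    for bad in (float("nan"), float("inf")):
        try:
            predict((1.0, 0.0), bad)
        except ValueError:
            continue
        raise AssertionError("expected ValueError for non-finite input")
```

Passing the seed into a local `random.Random` rather than relying on global state is one way to keep the sources of non-determinism mentioned on the slide under the test's control; frameworks with their own random number generators need their seeds fixed as well.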
  17. Data + Model + Code
  18. TESTING WHERE THEY OVERLAP. Test layers: Model Performance; Contract Tests; Model Bias and Fairness; Data Pipeline; Data/Feature Validation; Unit Tests (Transformations, Engineered Features); Unit Tests (Model Specification); Model Quality; ML Training Pipeline; UI Tests; Service Tests; Unit Tests. Checks: model evaluation against different validation datasets; thresholds for model metrics and execution performance; different data slices; feature generation is the same for training and serving; the model contract is adhered to in production; when the model is exported, test that it still works.
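
The "thresholds for model metrics" and "different data slices" checks can be combined: an aggregate metric can clear a threshold while one slice of the data falls well below it. The sketch below shows one possible shape for such a check. The record layout `(features, label, prediction)`, the `region` feature, and the threshold value are hypothetical.

```python
from collections import defaultdict


def accuracy(records):
    """Share of (features, label, prediction) records predicted correctly."""
    return sum(label == pred for _, label, pred in records) / len(records)


def accuracy_by_slice(records, slice_key):
    """Group records by the value of one feature and score each group."""
    groups = defaultdict(list)
    for record in records:
        features = record[0]
        groups[features[slice_key]].append(record)
    return {value: accuracy(group) for value, group in groups.items()}


def failing_slices(records, slice_key, min_accuracy):
    """Names of the slices whose accuracy is below the threshold."""
    scores = accuracy_by_slice(records, slice_key)
    return sorted(value for value, score in scores.items() if score < min_accuracy)


# Hypothetical evaluation set: overall accuracy is 0.75,
# but every error falls in the "south" slice.
EVAL_RECORDS = [
    ({"region": "north"}, 1, 1), ({"region": "north"}, 0, 0),
    ({"region": "north"}, 1, 1), ({"region": "north"}, 0, 0),
    ({"region": "south"}, 1, 1), ({"region": "south"}, 0, 1),
    ({"region": "south"}, 1, 0), ({"region": "south"}, 0, 0),
]
```

With a threshold of 0.7, `accuracy(EVAL_RECORDS)` passes while `failing_slices(EVAL_RECORDS, "region", 0.7)` reports the `south` slice, which is the situation a per-slice test is meant to catch. The same grouping can be reused for bias and fairness checks by slicing on the relevant attributes.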
  19. Test layers as on slide 18, plus End-to-End Tests, Production Monitoring, and Exploratory Tests. Checks: model degradation; training/serving skew; operational metrics (latency, throughput, resource usage); real impact (KPIs).
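
The slide does not say how training/serving skew should be measured. One common option, swapped in here purely as an illustration, is the population stability index (PSI), which compares the binned distribution of a feature at training time with the distribution seen in production. The bin edges, the `eps` smoothing for empty bins, and the 0.25 alert threshold (a widely quoted rule of thumb, not a value from the talk) are assumptions.

```python
import math


def histogram(values, edges):
    """Fraction of values per bin; `edges` must be sorted in ascending order.

    With k edges there are k + 1 bins: below the first edge, between
    consecutive edges, and at or above the last edge.
    """
    counts = [0] * (len(edges) + 1)
    for v in values:
        index = sum(v >= edge for edge in edges)  # edges at or below v
        counts[index] += 1
    return [count / len(values) for count in counts]


def psi(expected, actual, edges, eps=1e-6):
    """Population stability index between two samples of one feature."""
    total = 0.0
    for e, a in zip(histogram(expected, edges), histogram(actual, edges)):
        e, a = max(e, eps), max(a, eps)  # avoid log(0) for empty bins
        total += (a - e) * math.log(a / e)
    return total


def skew_detected(expected, actual, edges, threshold=0.25):
    """True when the serving distribution has drifted past the threshold."""
    return psi(expected, actual, edges) > threshold
```

A monitoring job could run `skew_detected` per feature on a recent window of serving data against a stored training sample and raise an alert when it returns `True`. Operational metrics and business KPIs need their own instrumentation and are not covered by this sketch.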
  20. “Inspection does not improve the quality, nor guarantee quality. Inspection is too late. The quality, good or bad, is already in the product.” - W. Edwards Deming
  21. QUESTIONS?
  22. WORKSHOPS, PRESENTATIONS & ARTICLES. Workshops: https://github.com/ThoughtWorksInc/cd4ml-workshop and https://github.com/ThoughtWorksInc/CD4ML-Scenarios. Articles: https://martinfowler.com/articles/cd4ml.html and https://www.thoughtworks.com/insights/articles/intelligent-enterprise-series-cd4ml. Paper: “The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction”, Breck et al. (Google).
  23. THANK YOU! Danilo Sato (dsato@thoughtworks.com), @dtsato
