Challenges for machine learning systems toward continuous improvement

Presented at IBIS 2019 @ Nagoya
Organizes nine challenges for machine learning systems in production, together with the MLOps manifest.

  1. Challenges for Machine Learning Systems toward Continuous Improvement. IBIS 2019, Machine Learning Systems Engineering (MLSE) session @ Nagoya, 2019.11.22. Arm Treasure Data, Aki Ariga
  2. Translated and quoted from [Sculley, 2015].
  3. To build machine learning systems fit for continuous improvement as they move from trial and error into production, the working group wants to collect lessons learnt that rarely make it into papers and to systematize architecture patterns (beyond Prof. Washizaki's work [Washizaki 2019], this area is not yet well organized). https://sites.google.com/view/sig-mlse/wg
  4. ● Project process, organization structure/management ● Machine learning systems for distributed training ● Hardware acceleration, e.g. GPU, TPU, FPGA, etc.
  5. Typical steps for a machine learning project: 1. Define the business problem 2. Survey similar problems, mainly in the literature 3. Consider methods that do not use machine learning 4. Design the system 5. Design the training data (features + labels) 6. Collect and preprocess real data 7. Exploratory data analysis and algorithm selection 8. Training and parameter tuning 9. Integrate into the system 10. Monitor prediction accuracy and business metrics. Experiment loop: repeat steps 5-8. Production loop: repeat steps 8-10 (sometimes going back to 4).
  6. Machine learning project steps: 1. Define the business problem 2. Survey similar problems, mainly in the literature 3. Consider methods that do not use machine learning 4. Design the system 5. Design the training data (features + labels) 6. Collect and preprocess real data 7. Exploratory data analysis and algorithm selection 8. Training and parameter tuning 9. Integrate into the system 10. Monitor prediction accuracy and business metrics. Traditional software development steps: 1. Write (…) 2. Write code 3. (…) via Pull Request/CI 4. Code review and merge 5. Build and deploy code/binaries 6. Monitor (…) together with (…).
  7. (The same two step lists as slide 6, now contrasted.) Traditional software: ● behavior is determined deterministically by the code ● the specification can be covered by tests ● errors can be defined unambiguously, and the code logic can be fixed. Machine learning: ● behavior is determined probabilistically by the input data ● the specification cannot be covered, so only statistical testing is possible ● it is hard to detect whether something is an error, so only indirect fixes such as updating the model are possible.
  8. ● "Data defines everything" ● "Change Anything Changes Everything" [Sculley, 2015] ● Because data determines the behavior, errors can only be suppressed probabilistically ● Because the behavior is probabilistic, the impact of a change cannot be foreseen in advance.
  9. ● A speech recognition system that (…) ○ after release, (…) → prediction performance degraded ● A factory visual-inspection system trained for over a month on on-site images ○ (…) changed → prediction performance degraded ● A website ranking system ○ search-term trends shifted due to (…) → performance degraded. Translated and quoted from [Batch].
  10. Quoted from [Zliobaite, 2010]. Production models should be retrained continuously to avoid concept drift.
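
A production-side sketch of this idea (not from the slides): watch a quality signal over a sliding window of delayed labels and trigger retraining when it drops well below the accuracy measured at deployment time. The class name, window size, and tolerance below are illustrative assumptions.

    from collections import deque

    class RetrainTrigger:
        """Flags when recent accuracy falls well below the deploy-time baseline."""

        def __init__(self, baseline_accuracy, window_size=1000, tolerance=0.05):
            self.baseline = baseline_accuracy
            self.window = deque(maxlen=window_size)
            self.tolerance = tolerance

        def observe(self, prediction, label):
            # Delayed feedback: record whether each prediction turned out correct.
            self.window.append(prediction == label)

        def should_retrain(self):
            if len(self.window) < self.window.maxlen:
                return False  # not enough feedback collected yet
            recent_accuracy = sum(self.window) / len(self.window)
            return recent_accuracy < self.baseline - self.tolerance
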
  11. Quoted from [CD4ML]. Continuously training and deploying models to production requires collaboration across teams.
  12. 🤔 ● DevOps: "Infrastructure as Code" made Ops part of Dev ○ which improves (…) ● MLOps: isolate the DS/researcher's code from the production system ○ which aims to improve (…) ○ (…) researchers' code by Dev is (…) ■ those Devs can be DS/researchers and get a high salary!
  13. From [Ariga].
  14. Phases of an ML workflow: Preprocessing, EDA, Training, Deploy/Serving, Audit. From [Ariga].
  15. The MLOps manifest, translated and quoted from [Marsden, 2019]: ● Reproducible: a model trained nine months ago should be retrainable in exactly the same environment, with the same data, and reach roughly the same accuracy (within a few percent). ● Accountable: for any model running in production, it should be possible to trace back to the parameters and training data it was built with, and even to the raw data. ● Collaborative: it should be possible to improve a colleague's model without asking them, and to merge improvements to code and data asynchronously. ● Continuous: models should be deployable with zero manual work and should be monitored statistically.
  16. 1. Data/Model management 2. Experiment Tracking 3. Reproducible experiment 4. Pipeline management 5. ML framework abstraction 6. Model Serving/Deployment 7. Testing and quality check 8. Explaining model/data 9. Monitoring/Observability
  17. Challenge 1: Data/Model management. How do you manage 10,000+ models in production? ● Phase: EDA, Training, Audit ● Related concepts: data/model versioning, data/model lineage, metadata management, feature store ● OSS: ModelDB, Pachyderm, DVC, ML Metadata, Feast. Example of data lineage quoted from [Uber].
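
As an illustration of the lineage idea (a hand-rolled sketch, not how ModelDB, DVC, or the other listed tools actually work): store each model artifact next to a metadata record that points back to the exact training data and parameters. The file layout and function names are hypothetical.

    import hashlib
    import json
    import time
    from pathlib import Path

    import joblib  # assumes a scikit-learn-style model object

    def sha256_of(path: Path) -> str:
        """Content hash of the training data, used as its version identifier."""
        return hashlib.sha256(path.read_bytes()).hexdigest()

    def register_model(model, data_path: Path, params: dict, registry: Path) -> dict:
        """Save the model plus a lineage record linking it to its data and params."""
        registry.mkdir(parents=True, exist_ok=True)
        version = time.strftime("%Y%m%d-%H%M%S")
        model_file = registry / f"model-{version}.joblib"
        joblib.dump(model, model_file)
        record = {
            "model_file": model_file.name,
            "training_data": str(data_path),
            "training_data_sha256": sha256_of(data_path),
            "params": params,
            "created_at": version,
        }
        (registry / f"model-{version}.json").write_text(json.dumps(record, indent=2))
        return record
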
  18. Challenge 2: Experiment tracking. Which parameters/conditions were chosen for the production model? ● Phase: Training, Audit ● Related concepts: parameter management, artifact management ● OSS: Kubeflow Pipelines, MLflow Tracking, Polyaxon, Comet.ml. Model tracking example quoted from [MLFlow tracking].
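
With MLflow Tracking, for example, each run's parameters, metrics, and model artifact can be logged so the conditions behind the production model can be looked up later. A minimal sketch; the dataset, parameter values, and metric name are placeholders.

    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_iris(return_X_y=True)
    params = {"n_estimators": 100, "max_depth": 5}

    with mlflow.start_run(run_name="rf-baseline"):
        model = RandomForestClassifier(**params).fit(X, y)
        mlflow.log_params(params)                        # conditions of this run
        mlflow.log_metric("train_accuracy", model.score(X, y))
        mlflow.sklearn.log_model(model, "model")         # artifact for later lookup/serving
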
  19. Challenge 3: Reproducible experiments. Can your notebook be reproduced in a peer's environment? ● Phase: EDA, Training, Audit ● Related concepts: reproducible notebooks, dependency management ● OSS: Jupyter Notebook, Polynote, Docker. https://twitter.com/keigohtr/status/1197321232800071680 (of course, it depends on the implementation...)
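
One concrete tactic is to re-execute the notebook headlessly with pinned parameters (ideally inside a fixed Docker image) so a peer can reproduce it end to end. A minimal sketch using papermill, which is not on the slide's OSS list and is only an assumed example; the notebook name and parameters are hypothetical.

    import papermill as pm

    # Re-execute the analysis notebook from top to bottom with explicit parameters,
    # producing a fresh output notebook that a peer can compare against yours.
    pm.execute_notebook(
        "analysis.ipynb",        # hypothetical input notebook
        "analysis-rerun.ipynb",  # executed copy with all outputs regenerated
        parameters={"data_path": "data/train.csv", "random_seed": 42},
    )
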
  20. Challenge 4: Pipeline management. How do you manage a consistent end-to-end ML pipeline? ● Phase: Preprocessing, Training, Deploy/Serving, Audit ● Related concepts: pipeline versioning, workflow, CI ● OSS: Kubeflow, Argo, Argo CD, MLflow Projects, TensorFlow Transform, [General workflow engines]. Figure quoted from [Kubeflow pipeline].
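
Framework aside, the underlying idea is that preprocessing, training, and deployment form one rerunnable, versionable chain of steps instead of ad-hoc scripts. A framework-agnostic sketch (the input path and the "label" column are hypothetical; a workflow engine such as Argo or Kubeflow would schedule and version these same steps).

    from pathlib import Path

    import joblib
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    ARTIFACTS = Path("artifacts")

    def preprocess(raw_csv: str) -> Path:
        """Drop incomplete rows and persist the processed dataset as an artifact."""
        ARTIFACTS.mkdir(exist_ok=True)
        df = pd.read_csv(raw_csv).dropna()
        out = ARTIFACTS / "processed.csv"
        df.to_csv(out, index=False)
        return out

    def train(processed_csv: Path) -> Path:
        """Fit a model on the processed data and persist it as an artifact."""
        df = pd.read_csv(processed_csv)
        model = LogisticRegression(max_iter=1000)
        model.fit(df.drop(columns=["label"]), df["label"])
        out = ARTIFACTS / "model.joblib"
        joblib.dump(model, out)
        return out

    def deploy(model_path: Path) -> None:
        """Hand the artifact over to the serving system (stubbed here)."""
        print(f"would deploy {model_path}")

    if __name__ == "__main__":
        # One entry point re-runs the whole chain from raw data to deployment.
        deploy(train(preprocess("data/raw/train.csv")))
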
  21. Challenge 5: ML framework abstraction. How can Dev make the DS's code deployable without understanding the code? ● Phase: Preprocessing, Training, Deploy ● Related concepts: deploy with a function/decorator, configuration-based training, AutoML ● OSS & known frameworks: TensorFlow Transform, Metaflow (FBLearner Flow, Overton, Bighead, Boson, Metaflow). "Metaflow enables deploying first models within a week for most projects" [Metaflow].
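
For reference, a Metaflow flow looks roughly like the following: the data scientist writes plain Python steps, and Metaflow handles packaging, artifact versioning, and execution (e.g. "python train_flow.py run"). A minimal sketch; the flow contents are illustrative.

    from metaflow import FlowSpec, Parameter, step

    class TrainFlow(FlowSpec):
        """Toy flow: load data, train, and report, expressed as Metaflow steps."""

        alpha = Parameter("alpha", help="Regularization strength", default=1.0)

        @step
        def start(self):
            from sklearn.datasets import load_iris
            self.X, self.y = load_iris(return_X_y=True)
            self.next(self.train)

        @step
        def train(self):
            from sklearn.linear_model import RidgeClassifier
            self.model = RidgeClassifier(alpha=self.alpha).fit(self.X, self.y)
            self.score = self.model.score(self.X, self.y)
            self.next(self.end)

        @step
        def end(self):
            print(f"training accuracy: {self.score:.3f}")

    if __name__ == "__main__":
        TrainFlow()
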
  22. Overton: A Data System for Monitoring and Improving Machine-Learned Products [Overton]. "domain engineers should not be forced to write traditional deep learning modeling code"; "Engineers are Comfortable with Automatic Hyperparameter Tuning". Apple implemented an ML system with weak supervision and slicing. Engineers are required to 1) create/select the schema and input payload, and 2) add slices, labeling functions, or synthetic examples.
  23. Challenge 6: Testing and quality check. What metrics should we track for quality? ● Phase: Training, Deploy ● Related concepts: data validation, component integration validation, model quality check, adversarial example detection ● OSS: TensorFlow Data Validation, Deequ. [TFDV]: understand data from statistics.
  24. Data validation: testing for train-serve skew. A supervised prediction model assumes that the data it sees is close to the distribution of its training data, so it is enough to validate that assumption about the training data. ● Check against a schema: ○ categorical variables ○ numerical value ranges ○ similarity of distributions. Quoted from [Baylor, 2017].
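
With TensorFlow Data Validation, for instance, a schema inferred from training statistics can be used to check serving-time data for skew. A minimal, self-contained sketch on toy DataFrames.

    import pandas as pd
    import tensorflow_data_validation as tfdv

    train_df = pd.DataFrame({"country": ["JP", "US", "JP"], "age": [31, 45, 28]})
    serving_df = pd.DataFrame({"country": ["JP", "XX"], "age": [27, 29]})  # "XX" is skew

    # Infer expected types, ranges, and categories from the training data...
    train_stats = tfdv.generate_statistics_from_dataframe(train_df)
    schema = tfdv.infer_schema(train_stats)

    # ...then validate serving-time statistics against that schema.
    serving_stats = tfdv.generate_statistics_from_dataframe(serving_df)
    anomalies = tfdv.validate_statistics(serving_stats, schema)
    tfdv.display_anomalies(anomalies)  # e.g. flags the unexpected category "XX"
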
  25. Challenge 7: Explaining model/data. Can you convince your boss why this model should be chosen? ● Phase: Training, Audit ● Related concepts: model explainability, bias/fairness and ethics checks ● OSS: TensorFlow Model Analysis, Facets, LIME, ELI5, SHAP, Manifold. "Manifold can compare two sliced populations with multiple models/features" [Manifold].
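
As one concrete option among the listed tools, SHAP attributes each prediction to its input features, which helps answer the "convince your boss" question. A minimal sketch on a toy model.

    import shap
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import GradientBoostingClassifier

    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
    model = GradientBoostingClassifier().fit(X, y)

    # TreeExplainer computes per-feature contributions (SHAP values) for tree models.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)

    # Global view: which features drive the model's predictions overall.
    shap.summary_plot(shap_values, X)
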
  26. "Explain what's happening and why. There is still significant fear, uncertainty and doubt (FUD) about AI. I have seen that providing a basic education — along the lines of the AI for Everyone curriculum — eases these conversations. Other tactics including explainability, visualization, rigorous testing, and auditing also help build trust in an AI system and convince our customers (and ourselves!) that it really works." Quoted from [Batch 2]; underlining added by the presenter.
  27. Challenge 8: Model Serving/Deployment. How rapidly can you deploy a model to production? ● Phase: Deploy/Serving ● Related concepts: ○ rollout (canary rollouts, A/B testing, shadowing) ○ model serialization/export ● OSS: ○ TF Serving, Seldon, KFServing, MLflow Model Registry, Clipper ○ PMML, PFA, ONNX, Menoh.
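
On the serialization/export side, for example, a PyTorch model can be exported to ONNX and then served by any ONNX-compatible runtime, decoupling serving from the training framework. A minimal sketch with a toy model; the input/output names are arbitrary.

    import numpy as np
    import onnxruntime as ort
    import torch
    import torch.nn as nn

    # Toy model standing in for the trained one.
    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1)).eval()

    # Export to a framework-neutral format that serving systems can load.
    dummy = torch.randn(1, 4)
    torch.onnx.export(model, dummy, "model.onnx",
                      input_names=["features"], output_names=["score"])

    # Serving side: load and run the exported model without PyTorch.
    session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    batch = np.random.randn(1, 4).astype(np.float32)
    (score,) = session.run(["score"], {"features": batch})
    print(score)
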
  28. "Time is money": the impact of serving latency [Bernardi 2019]. A 30% increase in latency reduces CVR by 0.5% at Booking.com. Techniques to reduce latency: ● Model redundancy ● In-house developed linear prediction engine ● Sparse models ● Precomputation and caching ● Bulking ● Minimum feature transformation.
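
The precomputation/caching point can be as simple as memoizing expensive feature lookups on the serving path. A minimal sketch; the feature function and scorer are placeholders, not Booking.com's implementation.

    from functools import lru_cache

    @lru_cache(maxsize=100_000)
    def user_features(user_id: int) -> tuple:
        """Expensive feature fetch/transform, computed once per user and then cached."""
        # In reality this would hit a feature store or run feature transformations.
        return (user_id % 7, user_id % 3)  # placeholder features

    def predict(user_id: int) -> float:
        features = user_features(user_id)              # cache hit after the first request
        return 0.1 * features[0] + 0.2 * features[1]   # placeholder linear scorer
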
  29. Challenge 9: Monitoring/Observability. How do you notice that the production model or its data is corrupted? ● Phase: Deploy/Serving ● Related concepts: data validation, outlier detection, concept drift detection, delayed feedback ● OSS: [General monitoring tools], [General workflow engines], TFDV. "Smooth bimodal distributions with one clear stable point are signs of a model that successfully distinguishes two classes" [Bernardi 2019].
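
A simple way to notice data drift, for example, is to compare the serving-time distribution of each feature against its training distribution with a two-sample test. A minimal sketch using scipy's ks_2samp; the alert threshold is an arbitrary assumption.

    import numpy as np
    from scipy.stats import ks_2samp

    def drifted(train_values, serving_values, p_threshold: float = 0.01) -> bool:
        """Flag a feature whose serving distribution differs significantly from training."""
        statistic, p_value = ks_2samp(train_values, serving_values)
        return p_value < p_threshold

    # Example: a shifted serving window should trigger an alert.
    rng = np.random.default_rng(0)
    train = rng.normal(0.0, 1.0, size=10_000)
    serving = rng.normal(0.5, 1.0, size=1_000)
    print(drifted(train, serving))  # True -> investigate, consider retraining
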
  30. Summary table (columns: Challenge | Phase | Reproducible | Accountable | Collaborative | Continuous):
      Data/Model management | EDA/Training/Audit | ✅ ✅ ✅
      Experiment Tracking | Training/Audit | ✅ ✅ ✅
      Reproducible experiment | EDA/Training/Audit | ✅ ✅ ✅
      Pipeline management | Preprocessing/Training/Serving/Deploy/Audit | ✅ ✅ ✅ ✅
      ML framework abstraction | Preprocessing/Training/Deploy | ✅ ✅ ✅
      Testing and quality check | Training/Deploy | ✅ ✅
      Explaining model/data | Training/Audit | ✅
      Model Serving/Deployment | Serving/Deploy | ✅
      Monitoring/Observability | Serving/Deploy | ✅ ✅ ✅
  31. ● Introduced the importance of continuous improvement for machine learning systems ● Introduced 9 challenges for ML systems in production ● Let's join the MLSE Slack and collect/share knowledge on the GitHub wiki! https://github.com/chezou/ml_in_production/wiki ○ Planning a workshop on machine learning systems in production.
  32. ● https://github.com/EthicalML/awesome-production-machine-learning ● https://hackernoon.com/why-is-devops-for-machine-learning-so-different-384z32f1 ● https://martinfowler.com/articles/cd4ml.html
  33. ● [Sculley, 2015] Sculley, David, et al. "Hidden technical debt in machine learning systems." Advances in Neural Information Processing Systems. 2015. ● [Washizaki 2019] Washizaki, Hironori, et al. "Studying Software Engineering Patterns for Designing Machine Learning Systems." arXiv preprint arXiv:1910.04736 (2019). ● [Batch] The Batch, Nov. 6, 2019. https://info.deeplearning.ai/the-batch-deepmind-masters-starcraft-2-ai-attacks-on-amazon-a-career-in-robot-management-banks-embrace-bots-1 ● [Batch 2] The Batch, Nov. 20, 2019. https://info.deeplearning.ai/the-batch-artificial-noses-surveillance-on-wheels-unwelcome-researchers-privacy-problems-beyond-bounding-boxes ● [Kubeflow pipeline] https://www.kubeflow.org/docs/pipelines/overview/pipelines-overview/ ● [TFDV] https://www.tensorflow.org/tfx/data_validation/get_started ● [Zliobaite, 2010] Žliobaitė, Indrė. "Learning under concept drift: an overview." arXiv preprint arXiv:1010.4784 (2010). ● [Marsden, 2019] Marsden, Luke. "The Future of MLOps." https://docs.google.com/presentation/d/17RWqPH8nIpwG-jID_UeZBCaQKoz4LVk1MLULrZdyNCs/edit#slide=id.p ● [Baylor, 2017] Baylor, D., et al. "TFX: A TensorFlow-based production-scale machine learning platform." Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 1387–1395. ● [CD4ML] Sato, D., et al. "Continuous Delivery for Machine Learning." https://martinfowler.com/articles/cd4ml.html ● [Uber] "Databook: Turning big data into knowledge with metadata at Uber." https://eng.uber.com/databook/ ● [Manifold] "Manifold: A Model-Agnostic Visual Debugging Tool for Machine Learning at Uber." https://eng.uber.com/manifold/ ● [Overton] Ré, Christopher, Niu, Feng, Gudipati, Pallavi, and Srisuwananukorn, Charles. "Overton: A Data System for Monitoring and Improving Machine-Learned Products." 2019. ● [Bernardi 2019] Bernardi, Lucas, et al. "150 Successful Machine Learning Models: 6 Lessons Learned at Booking.com." Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019. ● [Metaflow] Julie Pitt, et al. "A Human-Friendly Approach to MLOps." MLOps NYC19. https://youtu.be/fOSZuONmLbA ● [Ariga] "MLOpsの歩き方", n月刊ラムダノート vol.1 no.1, 2019.
  34. Thank You! Danke! Merci! 谢谢! ありがとう! Gracias! Kiitos!
