[DSC Europe 22] Reproducibility and Versioning of ML Systems - Spela Poklukar

Reproducibility and Versioning
of ML Systems
ŠPELA POKLUKAR | MACHINE LEARNING CONSULTANT

DSC 2022 // © COPYRIGHT 2022 ENDAVA
"Špela is an experienced machine learning consultant with experience mostly
in SW engineering services and the energy sector. She has successfully led
projects in various domains such as manufacturing, finance, robotics,
energy, and IT services. She is currently employed as a data discipline lead
at Endava Slovenia and is an active member of innovation and gender
balance communities. Her background is in mathematics, philosophy, and
theology."
Spela.poklukar@endava.com
+386 40 545 898
Špela Poklukar
MACHINE LEARNING CONSULTANT

Agenda
1. MOTIVATION
2. MODULARITY
3. VERSIONING
4. DOCUMENTATION

1
Motivation
WHY WE NEED REPRODUCIBILITY ANYWAY

Reproducibility:
Two Sides of the Same Coin
REPRODUCIBILITY OF ML RESEARCH RESULTS
Reproducibility of ML research results means being able to recreate someone else's ML workflow and reach the same or similar conclusions as the original work.
REPRODUCIBILITY OF ML SYSTEMS
Reproducibility of an ML system means being able to run an ML workflow repeatedly and reach the same or similar results on each run.
Reproducibility and Versioning of ML Systems - 1. Motivation

Why Reproducibility?
EVIDENCE OF SIGNIFICANCE
To ensure the obtained results are accurate and significant.
ABLATION
To ensure that the claimed gain really comes from the intended change and is not random.
COST ESTIMATION
To inform potential consumers about computational complexity.
SCALING
To be able to scale the machine learning system by replicating its parts.
INFERENCE
To ensure the selected model is the same one used for inference.
FAULT TOLERANCE
To reduce the risk of errors by consistently obtaining the same results.
MODEL ROLLBACK
To allow for model rollback in case the new model is not performing as expected.
TRUST
To create trust in and credibility of the machine learning product.
REGULATION
To comply with increasing regulatory constraints.

2
Modularity
ADOPTION OF PIPELINE MENTALITY

Development Pipeline: Data Preprocessing → Feature Engineering → Model Training → Prediction Service → Model Evaluation
Training Pipeline: Data Preprocessing → Feature Engineering → Model Training
Inference Pipeline: Data Preprocessing → Feature Engineering → Prediction Service
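A minimal Python sketch of the pipeline mentality, assuming each stage is a plain function. All function names and the toy transformations are hypothetical; the point is only that the training and inference pipelines reuse the same preprocessing and feature engineering stages instead of reimplementing them.

```python
def preprocess(rows):
    # Hypothetical preprocessing stage: drop missing values.
    return [r for r in rows if r is not None]

def engineer_features(rows):
    # Hypothetical feature engineering stage: pair each value with its square.
    return [(r, r * r) for r in rows]

def build_features(raw):
    # Shared by every pipeline, so features are computed identically everywhere.
    return engineer_features(preprocess(raw))

def train_model(features):
    # Toy "model": a threshold equal to the mean of the squared feature.
    threshold = sum(sq for _, sq in features) / len(features)
    return {"threshold": threshold}

def predict(model, features):
    # Toy prediction: is the squared feature above the learned threshold?
    return [sq > model["threshold"] for _, sq in features]

def training_pipeline(raw):
    return train_model(build_features(raw))

def inference_pipeline(model, raw):
    return predict(model, build_features(raw))

model = training_pipeline([1, None, 2, 3])
predictions = inference_pipeline(model, [2, None, 3])
```

Because both pipelines call `build_features`, a change to preprocessing or feature engineering is made once and picked up by training and serving alike.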
Reproducibility and Versioning of ML Systems - 2. Modularity

3
Versioning
TRACKING THE CHANGES IN ML SYSTEM

Reproducibility can be achieved by tracking and versioning every change in the ML system.
Reproducibility and Versioning of ML Systems - 3. Versioning

Changes to Track
Data: Dataset
‣ Dataset version
‣ Data availability timestamp
‣ Dataset split
‣ Dataset shuffling
Data: Preprocessing
‣ Preprocessing parameters
‣ Target variable transformation
Data: Features
‣ Feature computation parameters
‣ Feature selection
Model: Model Parameters
‣ Model type
‣ Model hyperparameters
‣ Weights initialization
‣ Evaluation parameters
‣ Dropout
System: Source Code
‣ Components source code
‣ Pipeline definition
System: Environment
‣ Dependencies
‣ Environment variables
‣ Infrastructure
‣ Floating point calculation
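As an illustration of tracking a few of the items listed above (dataset version, split, shuffling, model type, hyperparameters), the sketch below records them in one configuration object and derives a fingerprint from it. The `RunConfig` fields and helper names are hypothetical, and only the Python standard library is assumed.

```python
import hashlib
import json
import random
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class RunConfig:
    # Hypothetical record of settings that influence a training run.
    dataset_version: str
    shuffle_seed: int
    test_fraction: float
    model_type: str
    hyperparameters: dict

def config_fingerprint(config):
    # Sorted keys make the serialization, and therefore the hash, deterministic.
    payload = json.dumps(asdict(config), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def split_dataset(rows, config):
    # A local Random instance keeps the shuffle independent of global state.
    rng = random.Random(config.shuffle_seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * config.test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

config = RunConfig("v1", 42, 0.2, "gbm", {"learning_rate": 0.1})
train_a, test_a = split_dataset(range(10), config)
train_b, test_b = split_dataset(range(10), config)
```

Running the split twice with the same configuration yields the same train and test sets, and any change to a tracked setting changes the fingerprint, which makes untracked differences between runs easier to spot.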

Experiment Tracking

Dataset Versioning

Feature Versioning – Feature Store
A feature store is a central location where features are stored and organized for the explicit purpose of being used either to train models or to make predictions. Features are computed when new data becomes available and are stored in the feature store, as opposed to being computed on the fly by the training and serving services.
A feature store should provide:
‣ Updated list of feature consumers
‣ Point-in-time lookup for training datasets
Benefits of using a feature store:
‣ Consistent feature engineering for model development, training, and serving
‣ Bridging the gap between data scientists and data & ML engineers
‣ Discovery and reuse of available feature sets, avoiding similar features with different definitions
‣ Point-in-time lookup to prevent data leakage
‣ Accelerated ML innovation
‣ Reproducibility of ML experiments
‣ Empowering legal and compliance teams to ensure compliant use of data
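The point-in-time lookup mentioned above can be sketched as follows. This is a simplified illustration, not the API of any particular feature store: it assumes the history of one feature is a list of (timestamp, value) pairs sorted by timestamp, and the function name and sample values are made up.

```python
from bisect import bisect_right
from datetime import date

def point_in_time_lookup(feature_history, as_of):
    """Return the latest value computed at or before `as_of`, or None.

    `feature_history` is a list of (timestamp, value) pairs sorted by timestamp.
    Values computed after `as_of` are never returned, which is what prevents
    information from the future leaking into a training example.
    """
    timestamps = [ts for ts, _ in feature_history]
    idx = bisect_right(timestamps, as_of)
    if idx == 0:
        return None  # the feature did not exist yet at that time
    return feature_history[idx - 1][1]

# Hypothetical history of one feature for one entity.
history = [
    (date(2022, 1, 1), 10.0),
    (date(2022, 2, 1), 12.5),
    (date(2022, 3, 1), 9.0),
]
value_mid_february = point_in_time_lookup(history, date(2022, 2, 15))
```

A training example labelled on 15 February gets the value from 1 February, even though a newer value exists by the time the dataset is assembled.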

Model Versioning – Model Registry
A model registry is a service that manages multiple model artifacts and tracks and governs models at different stages of the ML lifecycle.
The model registry provides:
‣ Centralized storage for all types of models
‣ Collaborative unit for model lifecycle management
‣ Basis for assessing model risks and model governance
‣ Fast and seamless model roll-out and roll-back
The model registry should keep track of:
‣ Model name
‣ Model architecture
‣ Model hyperparameters
‣ Trained model/model weights
‣ Model metrics
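To make the tracked fields and the roll-out and roll-back idea concrete, here is a toy in-memory registry. It is a sketch under simplifying assumptions (no persistence, no access control); the class names, method names, and file paths are hypothetical and do not correspond to a specific registry product.

```python
from dataclasses import dataclass

@dataclass
class ModelVersion:
    # The fields mirror the items a registry should keep track of.
    name: str
    version: int
    architecture: str
    hyperparameters: dict
    weights_uri: str
    metrics: dict

class ModelRegistry:
    def __init__(self):
        self._versions = {}  # model name -> list of ModelVersion
        self._rollouts = {}  # model name -> history of rolled-out version numbers

    def register(self, name, architecture, hyperparameters, weights_uri, metrics):
        versions = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(versions) + 1, architecture,
                             dict(hyperparameters), weights_uri, dict(metrics))
        versions.append(entry)
        return entry

    def roll_out(self, name, version):
        if not any(v.version == version for v in self._versions.get(name, [])):
            raise KeyError(f"{name} has no version {version}")
        self._rollouts.setdefault(name, []).append(version)

    def roll_back(self, name):
        history = self._rollouts.get(name, [])
        if len(history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        history.pop()  # discard the current roll-out
        return history[-1]

    def current_version(self, name):
        history = self._rollouts.get(name, [])
        return history[-1] if history else None

registry = ModelRegistry()
v1 = registry.register("churn", "logistic_regression", {"C": 1.0}, "models/churn/1.bin", {"f1": 0.71})
v2 = registry.register("churn", "gradient_boosting", {"n_estimators": 200}, "models/churn/2.bin", {"f1": 0.74})
registry.roll_out("churn", v1.version)
registry.roll_out("churn", v2.version)
restored = registry.roll_back("churn")
```

Keeping the roll-out history next to the versions is what makes the roll-back a lookup rather than a retraining job.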

Environment Versioning – Container Registry

Pipeline Versioning – Workflow Orchestration

Infrastructure Versioning – IaC
Infrastructure as Code (IaC) means provisioning, configuring, and managing infrastructure with machine-readable definition files.
Benefits:
‣ Ensures infrastructure consistency and eliminates configuration drift.
‣ Cost reduction.
‣ Increase in speed of deployments.
‣ Scalability and availability.
‣ Fosters collaboration.
‣ Standardizes deployment workflow.
‣ Error risk reduction.
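Configuration drift, mentioned in the first benefit, is the gap between what the definition files declare and what is actually running. The sketch below illustrates the idea only; it is not how any specific IaC tool works, and the function name, setting names, and values are hypothetical.

```python
def find_drift(declared, observed):
    """Return {setting: (declared_value, observed_value)} for every mismatch."""
    drift = {}
    for key in sorted(set(declared) | set(observed)):
        want = declared.get(key)
        have = observed.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift

# Hypothetical declared state (from a versioned definition file)
# and observed state (from the running environment).
declared = {"instance_type": "gpu-small", "replicas": 2, "region": "eu"}
observed = {"instance_type": "gpu-small", "replicas": 3, "region": "eu", "debug": True}
drift = find_drift(declared, observed)
```

Since the declared state lives in version control, an empty drift report means the environment can be traced back to a specific commit.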

Version Versioning – Metadata Store
A metadata store is a central place that holds and connects all parameters of the ML system.
It may hold, for example:
‣ Data version: Reference to the dataset, md5 hash, dataset sample to know which data was used to train the model
‣ Environment configuration: Docker image ID, requirements.txt, conda.yml, Dockerfile, Makefile to know how to recreate the environment where the model was trained
‣ Code version: Git SHA of a commit or an actual snapshot of code to know what code was used to build a model
‣ Model version: Model ID, configuration of the feature preprocessing steps of the pipeline, model training, and inference to reproduce the process if needed
‣ Model performance metrics: Experiment ID, F1, accuracy, ROC on test and validation set to know how your model performs
‣ Hardware metrics: CPU, GPU, TPU, memory to see how much your model consumes during training/inference
‣ Performance visualizations: ROC curve, confusion matrix, PR curve to understand the errors in depth
‣ Model predictions: to see the actual predictions and understand model performance beyond metrics
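A metadata record of this kind could look roughly like the sketch below, which covers a few of the listed items. The function name, record layout, and sample values (dataset reference, Git SHA, model ID, metrics) are hypothetical placeholders, and only the Python standard library is used.

```python
import hashlib
import json
import platform
import sys

def build_run_metadata(dataset_bytes, dataset_ref, git_sha, model_id, metrics):
    # Collect what is needed to trace a trained model back to its inputs.
    return {
        "data_version": {
            "reference": dataset_ref,
            "md5": hashlib.md5(dataset_bytes).hexdigest(),
        },
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "code_version": {"git_sha": git_sha},
        "model_version": {"model_id": model_id},
        "metrics": dict(metrics),
    }

record = build_run_metadata(
    dataset_bytes=b"hello",            # stand-in for the real dataset contents
    dataset_ref="datasets/example.csv",  # hypothetical dataset reference
    git_sha="0000000",                 # placeholder, not a real commit
    model_id="example-model-1",        # hypothetical model ID
    metrics={"f1": 0.74, "accuracy": 0.81},
)
serialized = json.dumps(record, sort_keys=True)
```

The record is plain JSON, so it can be stored next to the model artifact or sent to whichever metadata store is in use.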

[Diagram: data engineering, experimenting and model development, ML pipeline CI/CD (build, test, package, deploy), continuous model training, model CD, prediction service, and continuous monitoring, together with source code, experiment tracking, feature store, model registry, and metadata store.]

4
Documentation
THE ONLY DIFFERENCE BETWEEN SCIENCE AND FOOLING AROUND IS WRITING IT DOWN

Reproducibility and Versioning of ML Systems - 4. Documentation

Document as you go.
Start from day 1.

MLOps – New Kid on the Block - Thank You!
