SlideShare a Scribd company logo
1 of 23
Download to read offline
Flock: E2E platform
to democratize Data
Science
Agenda
Chapter 1:
• GSL’s vantage point
• Why are we building (yet) another Data Science platform?
• Flock platform
• A technology showcase demo
Chapter 2:
• EGML applications in Microsoft: MLflow + ONNX + SQL Server
• Capturing provenance
• Future Work
GSL’s vantage point
640Patents GAed or Public Preview
features just this year
LoC in OSS
0.5MLoC in OSS
130+Publications in top tier
conferences/journals
1.1MLoC in products
600kServers running our code
in Azure/Hydra
Applied research lab part of Office of the CTO, Azure Data
Systems considered thus far
Cloud Providers Private Services OSS
Systems comparison
Training
Experiment Tracking
Managed Notebooks
Pipelines / Projects
Multi-Framework
Proprietary Algos
Distributed Training
Auto ML
Serving
Batch prediction
On-prem deployment
Model Monitoring
Model Validation
Data Management
Data Provenance
Data testing
Feature Store
Featurization DSL
Labelling
Good Support OK Support No Support Unknown
Insights
– Data Science is all about data J
– There’s an emerging class of applications:
• Enterprise Grade Machine Learning (EGML – CIDR’20)
– Dichotomy of “smarts” with rudimentary process
» Challenge on account of dual nature of models – software & data
– Couple of key pillars to enable EGML are:
• Tools for automating DS lifecycle
– Only O(100) ipython notebooks in GitHub import mlflow over 1M+ analyzed
– O(1000) for sklearn pipelines
• Data Governance
• (Unified data access)
Flock: Data-driven development
offline
online
Data-driven development
Solution Deployment
NN
Model transform
ONNX
ONNX’ Optimization
Close/Update Incidents
Job-id
Job telemetry
telemetry
application
tracking
model
training
LightGBM
policies
deployment
ONNX’
policies
Dhalion
DEMO Python code
import pandas as pd
import lightgbm as lgb
from sklearn import metrics
data_train = pd.read_csv("global_train_x_label_with_mapping.csv")
data_test = pd.read_csv("global_test_x_label_with_mapping.csv")
train_x = data_train.iloc[:,:-1].values
train_y = data_train.iloc[:,-1].values
test_x = data_test.iloc[:,:-1].values
test_y = data_test.iloc[:,-1].values
n_leaves = 8
n_trees = 100
clf = lgb.LGBMClassifier(num_leaves=n_leaves, n_estimators=n_trees)
clf.fit(train_x,train_y)
score = metrics.precision_score(test_y, clf.predict(test_x), average='macro’)
print("Precision Score on Test Data: " + str(score))
import mlflow
import mlflow.onnx
import multiprocessing
import torch
import onnx
from onnx import optimizer
from functools import partial
from flock import get_tree_parameters, LightGBMBinaryClassifier_Batched
import mlflow.sklearn
import mlflow
import pandas as pd
import lightgbm as lgb
from sklearn import metrics
data_train = pd.read_csv('global_train_x_label_with_mapping.csv')
data_test = pd.read_csv('global_test_x_label_with_mapping.csv')
train_x = data_train.iloc[:, :-1].values
train_y = data_train.iloc[:, (-1)].values
test_x = data_test.iloc[:, :-1].values
test_y = data_test.iloc[:, (-1)].values
n_leaves = 8
n_trees = 100
clf = lgb.LGBMClassifier(num_leaves=n_leaves, n_estimators=n_trees)
mlflow.log_param('clf_init_n_estimators', n_trees)
mlflow.log_param('clf_init_num_leaves', n_leaves)
clf.fit(train_x, train_y)
mlflow.sklearn.log_model(clf, 'clf_model')
score = metrics.precision_score(test_y, clf.predict(test_x), average='macro')
mlflow.log_param('precision_score_average', ' macro')
mlflow.log_param('score', score)
print('Precision Score on Test Data: ' + str(score))
n_features = 100
activation = 'sigmoid'
torch.set_num_threads(1)
device = torch.device('cpu')
model_name = 'griffon'
model = clf.booster_.dump_model()
n_features = clf.n_features_
tree_infos = model['tree_info']
pool = multiprocessing.Pool(8)
parameters = pool.map(partial(get_tree_parameters, n_features=n_features),
tree_infos)
lgb_nn = LightGBMBinaryClassifier_Batched(parameters, n_features, activation
).to(device)
torch.onnx.export(lgb_nn, torch.randn(1, n_features).to(device), model_name +
'_nn.onnx', export_params=True, operator_export_type=torch.onnx.
OperatorExportTypes.ONNX_ATEN_FALLBACK)
passes = ['eliminate_deadend', 'eliminate_identity',
'eliminate_nop_monotone_argmax', 'eliminate_nop_transpose',
'eliminate_unused_initializer', 'extract_constant_to_initializer',
'fuse_consecutive_concats', 'fuse_consecutive_reduce_unsqueeze',
'fuse_consecutive_squeezes', 'fuse_consecutive_transposes',
'fuse_matmul_add_bias_into_gemm', 'fuse_transpose_into_gemm',
'lift_lexical_references']
model = onnx.load(model_name + '_nn.onnx')
opt_model = optimizer.optimize(model, passes)
mlflow.onnx.log_model(opt_model, 'opt_model')
pyfunc_loaded = mlflow.pyfunc.load_pyfunc('opt_model', run_id=mlflow.
active_run().info.run_uuid)
scoring = pyfunc_loaded.predict(pd.DataFrame(test_x[:1].astype('float32'))
).values
print('Scoring through mlflow pyfunc: ', scoring)
mlflow.log_param('pyfunc_scoring', scoring[0][0])
User code Instrumented
code
Flock
Griffon: why is my job slow today?
Current OnCall Workflow
Revised OnCall Workflow with Griffon
A support engineer (SE) spends hours
of manual labor looking through
hundreds of metrics
After 5-6 hours of investigation, the
reason for job slow down is found.
A job goes out of SLA and
Support is alerted
A job goes out of SLA and
the SE is alerted The Job ID is fed through Griffon and
the top reasons for job slowdown are
generated automatically
The reason is found in
the top five generated
by Griffon.
All the metrics Griffon
has looked at can be
ruled out and the SE can
direct their efforts to a
smaller set of metrics.
ACM SoCC 2019
EGML applications in Microsoft
Model Training
TensorFlow
Spark
PyTorch
…
1
2
Model Generation
Conversion to
ONNX
Mlflow
Model.v1
3
Serving
SQL Server
Model.vn
H2O
Keras
…
Scikit-learn
Run
--Tracks the runs (parameters, code versions, metrics, output files)
-- Visualizes the output
4
ML Flow Model
(ONNX flavor)
SQL Server as
artifact/backend store
ONNX: Interoperability across ML frameworks
Open format to represent ML models
Backed by Microsoft, Amazon, Facebook, and several hardware vendors
ONNX exchange format
• Open format
• Enables interoperability across frameworks
• Many supported frameworks to import/export
– Caffe2, PyTorch, CNTK, MXNet, TensorFlow, CoreML
ONNX Runtime
• Cross-platform, high-performance scoring engine for ONNX models
• Open-sourced at the end of 2018
• ONNX Runtime is used in millions of Windows devices and powers core
models across Office, Bing, and Azure
Train a model using a
popular framework
such as TensorFlow
Convert the model to
ONNX format
Perform inference
efficiently across
multiple platforms
and hardware using
ONNX runtime
ONNX Runtime and optimizations
Key design points:
Graph IR
Support for multiple backends (e.g., CPU, GPU, FPGA)
Graph optimizations
Rule-based optimizer inspired by DB optimizers
Improved inference time and memory consumption
Examples: 117msec à 34msec; 250MB à 200MB
~40 ONNX
models
in production
>10 orgs
are migrating models
to ONNX Runtime
Average Speedup
2.7x
ONNX Runtime in production
ONNX Runtime in production
Office – Grammar Checking Model
14.6x reduction in latency
MLflow + ONNX
• MLflow (1.0.0) has now built-in support for ONNX models
• ONNX model flavor for saving, loading and evaluating ONNX
models
Train a sklearn
model
Serving the ONNX model
mlflow models serve -m /artifacts/model -p 1234
[6.379428821398614]
Deploy the server
Perform Inference
ONNX Runtime is
automatically invoked
curl -XPOST-H"Content-Type:application/json; format=pandas-split"--data'{"columns":["alcohol", "chlorides", "citricacid", "density", "fixedacidity", "free
sulfurdioxide","pH","residualsugar","sulphates","totalsulfurdioxide","volatileacidity"],"data":[[12.8,0.029,0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]]}'
http://127.0.0.1:1234/invocations
MLflow + SQL Server
• MLflow can use SQL Server as an artifact store (and other
RDBMSs as well) (PR)
• The models are stored in binary format in the database along with
other metadata such as model name, size, run_id, etc.
client = MlflowClient()
exp_name = “test"
client.create_experiment(exp_name, artifact_location="mssql+pyodbc://sa:password@ipAddress:port/dbName?driver=ODBC+Driver+17+for+SQL+Server")
mlflow.set_experiment(exp_name)
mlflow.onnx.log_model(onnx, “model")
Provenance in EGML applications
• Need for end-to-end provenance tracking
• Multiple systems involved in each pipeline
SQL
Data pre-processing
Python
Script
Model Training
• Compliance
• Keeping ML models up-to-date
Tracking provenance in Python scripts
Python
Script
Python
AST
generation
Dependencies
between variables
and functions
Semantic annotation
through a knowledge
base of common ML
libraries
• Automatically identify models, metrics, hyperparameters in python scripts
• Answer questions such as: “Which columns in a dataset were used for model
training?”
Dataset #Scripts %ML models
covered
%Training
Datasets Covered
Kaggle 49 95% 61%
Microsoft 37 100% 100%
Future work
• MLflow:
– Integration with metadata management systems such as Apache Atlas
• Flock:
– Data Governance
– Generalize and extend coverage of auto-tracking and ML à NN
conversion.
– Provenance of end-to-end pipelines
• Combine with other systems (e.g., SQL, Spark)

More Related Content

What's hot

MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...Databricks
 
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...Databricks
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleDatabricks
 
MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.Knoldus Inc.
 
From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOpsCarl W. Handlin
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformDatabricks
 
Introduction to MLflow
Introduction to MLflowIntroduction to MLflow
Introduction to MLflowDatabricks
 
Apply MLOps at Scale by H&M
Apply MLOps at Scale by H&MApply MLOps at Scale by H&M
Apply MLOps at Scale by H&MDatabricks
 
Using MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOpsUsing MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOpsWeaveworks
 
Apply MLOps at Scale
Apply MLOps at ScaleApply MLOps at Scale
Apply MLOps at ScaleDatabricks
 
"Managing the Complete Machine Learning Lifecycle with MLflow"
"Managing the Complete Machine Learning Lifecycle with MLflow""Managing the Complete Machine Learning Lifecycle with MLflow"
"Managing the Complete Machine Learning Lifecycle with MLflow"Databricks
 
Seldon: Deploying Models at Scale
Seldon: Deploying Models at ScaleSeldon: Deploying Models at Scale
Seldon: Deploying Models at ScaleSeldon
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsDataPhoenix
 
MLflow with Databricks
MLflow with DatabricksMLflow with Databricks
MLflow with DatabricksLiangjun Jiang
 
Productionzing ML Model Using MLflow Model Serving
Productionzing ML Model Using MLflow Model ServingProductionzing ML Model Using MLflow Model Serving
Productionzing ML Model Using MLflow Model ServingDatabricks
 

What's hot (20)

MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 
MLOps for production-level machine learning
MLOps for production-level machine learningMLOps for production-level machine learning
MLOps for production-level machine learning
 
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating ...
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
 
MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.
 
From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOps
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
 
Introduction to MLflow
Introduction to MLflowIntroduction to MLflow
Introduction to MLflow
 
Introducing MLOps.pdf
Introducing MLOps.pdfIntroducing MLOps.pdf
Introducing MLOps.pdf
 
Apply MLOps at Scale by H&M
Apply MLOps at Scale by H&MApply MLOps at Scale by H&M
Apply MLOps at Scale by H&M
 
Using MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOpsUsing MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOps
 
MLOps.pptx
MLOps.pptxMLOps.pptx
MLOps.pptx
 
Apply MLOps at Scale
Apply MLOps at ScaleApply MLOps at Scale
Apply MLOps at Scale
 
"Managing the Complete Machine Learning Lifecycle with MLflow"
"Managing the Complete Machine Learning Lifecycle with MLflow""Managing the Complete Machine Learning Lifecycle with MLflow"
"Managing the Complete Machine Learning Lifecycle with MLflow"
 
Seldon: Deploying Models at Scale
Seldon: Deploying Models at ScaleSeldon: Deploying Models at Scale
Seldon: Deploying Models at Scale
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOps
 
What is MLOps
What is MLOpsWhat is MLOps
What is MLOps
 
MLflow with Databricks
MLflow with DatabricksMLflow with Databricks
MLflow with Databricks
 
Productionzing ML Model Using MLflow Model Serving
Productionzing ML Model Using MLflow Model ServingProductionzing ML Model Using MLflow Model Serving
Productionzing ML Model Using MLflow Model Serving
 
MLOps with Kubeflow
MLOps with Kubeflow MLOps with Kubeflow
MLOps with Kubeflow
 

Similar to Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow

MLFlow 1.0 Meetup
MLFlow 1.0 Meetup MLFlow 1.0 Meetup
MLFlow 1.0 Meetup Databricks
 
Flock: Data Science Platform @ CISL
Flock: Data Science Platform @ CISLFlock: Data Science Platform @ CISL
Flock: Data Science Platform @ CISLDatabricks
 
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and KubeflowKostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and KubeflowIT Arena
 
Utilisation de MLflow pour le cycle de vie des projet Machine learning
Utilisation de MLflow pour le cycle de vie des projet Machine learningUtilisation de MLflow pour le cycle de vie des projet Machine learning
Utilisation de MLflow pour le cycle de vie des projet Machine learningParis Data Engineers !
 
Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Nisha Talagala
 
Legion - AI Runtime Platform
Legion -  AI Runtime PlatformLegion -  AI Runtime Platform
Legion - AI Runtime PlatformAlexey Kharlamov
 
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAdvanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAnimesh Singh
 
MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle Databricks
 
I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)AZUG FR
 
Azure machine learning service
Azure machine learning serviceAzure machine learning service
Azure machine learning serviceRuth Yakubu
 
Pitfalls of machine learning in production
Pitfalls of machine learning in productionPitfalls of machine learning in production
Pitfalls of machine learning in productionAntoine Sauray
 
EPAM ML/AI Accelerator - ODAHU
EPAM ML/AI Accelerator - ODAHUEPAM ML/AI Accelerator - ODAHU
EPAM ML/AI Accelerator - ODAHUDmitrii Suslov
 
Building an ML Platform with Ray and MLflow
Building an ML Platform with Ray and MLflowBuilding an ML Platform with Ray and MLflow
Building an ML Platform with Ray and MLflowDatabricks
 
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]Animesh Singh
 
MLOps pipelines using MLFlow - From training to production
MLOps pipelines using MLFlow - From training to productionMLOps pipelines using MLFlow - From training to production
MLOps pipelines using MLFlow - From training to productionFabian Hadiji
 
Whats new in_mlflow
Whats new in_mlflowWhats new in_mlflow
Whats new in_mlflowDatabricks
 
Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data StreamsMachine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data StreamsLightbend
 
Kognitio - an overview
Kognitio - an overviewKognitio - an overview
Kognitio - an overviewKognitio
 
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...James Anderson
 

Similar to Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow (20)

MLFlow 1.0 Meetup
MLFlow 1.0 Meetup MLFlow 1.0 Meetup
MLFlow 1.0 Meetup
 
Flock: Data Science Platform @ CISL
Flock: Data Science Platform @ CISLFlock: Data Science Platform @ CISL
Flock: Data Science Platform @ CISL
 
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and KubeflowKostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
 
Utilisation de MLflow pour le cycle de vie des projet Machine learning
Utilisation de MLflow pour le cycle de vie des projet Machine learningUtilisation de MLflow pour le cycle de vie des projet Machine learning
Utilisation de MLflow pour le cycle de vie des projet Machine learning
 
Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017
 
Legion - AI Runtime Platform
Legion -  AI Runtime PlatformLegion -  AI Runtime Platform
Legion - AI Runtime Platform
 
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAdvanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
 
MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle
 
NextGenML
NextGenML NextGenML
NextGenML
 
I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)
 
Azure machine learning service
Azure machine learning serviceAzure machine learning service
Azure machine learning service
 
Pitfalls of machine learning in production
Pitfalls of machine learning in productionPitfalls of machine learning in production
Pitfalls of machine learning in production
 
EPAM ML/AI Accelerator - ODAHU
EPAM ML/AI Accelerator - ODAHUEPAM ML/AI Accelerator - ODAHU
EPAM ML/AI Accelerator - ODAHU
 
Building an ML Platform with Ray and MLflow
Building an ML Platform with Ray and MLflowBuilding an ML Platform with Ray and MLflow
Building an ML Platform with Ray and MLflow
 
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
 
MLOps pipelines using MLFlow - From training to production
MLOps pipelines using MLFlow - From training to productionMLOps pipelines using MLFlow - From training to production
MLOps pipelines using MLFlow - From training to production
 
Whats new in_mlflow
Whats new in_mlflowWhats new in_mlflow
Whats new in_mlflow
 
Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data StreamsMachine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams
 
Kognitio - an overview
Kognitio - an overviewKognitio - an overview
Kognitio - an overview
 
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 

Recently uploaded (20)

Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 

Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow

  • 1. Flock: E2E platform to democratize Data Science
  • 2. Agenda Chapter 1: • GSL’s vantage point • Why are we building (yet) another Data Science platform? • Flock platform • A technology showcase demo Chapter 2: • EGML applications in Microsoft: MLflow + ONNX + SQL Server • Capturing provenance • Future Work
  • 3. GSL’s vantage point 640Patents GAed or Public Preview features just this year LoC in OSS 0.5MLoC in OSS 130+Publications in top tier conferences/journals 1.1MLoC in products 600kServers running our code in Azure/Hydra Applied research lab part of Office of the CTO, Azure Data
  • 4. Systems considered thus far Cloud Providers Private Services OSS
  • 5. Systems comparison Training Experiment Tracking Managed Notebooks Pipelines / Projects Multi-Framework Proprietary Algos Distributed Training Auto ML Serving Batch prediction On-prem deployment Model Monitoring Model Validation Data Management Data Provenance Data testing Feature Store Featurization DSL Labelling Good Support OK Support No Support Unknown
  • 6. Insights – Data Science is all about data J – There’s an emerging class of applications: • Enterprise Grade Machine Learning (EGML – CIDR’20) – Dichotomy of “smarts” with rudimentary process » Challenge on account of dual nature of models – software & data – Couple of key pillars to enable EGML are: • Tools for automating DS lifecycle – Only O(100) ipython notebooks in GitHub import mlflow over 1M+ analyzed – O(1000) for sklearn pipelines • Data Governance • (Unified data access)
  • 7. Flock: Data-driven development offline online Data-driven development Solution Deployment NN Model transform ONNX ONNX’ Optimization Close/Update Incidents Job-id Job telemetry telemetry application tracking model training LightGBM policies deployment ONNX’ policies Dhalion
  • 8. DEMO Python code import pandas as pd import lightgbm as lgb from sklearn import metrics data_train = pd.read_csv("global_train_x_label_with_mapping.csv") data_test = pd.read_csv("global_test_x_label_with_mapping.csv") train_x = data_train.iloc[:,:-1].values train_y = data_train.iloc[:,-1].values test_x = data_test.iloc[:,:-1].values test_y = data_test.iloc[:,-1].values n_leaves = 8 n_trees = 100 clf = lgb.LGBMClassifier(num_leaves=n_leaves, n_estimators=n_trees) clf.fit(train_x,train_y) score = metrics.precision_score(test_y, clf.predict(test_x), average='macro’) print("Precision Score on Test Data: " + str(score)) import mlflow import mlflow.onnx import multiprocessing import torch import onnx from onnx import optimizer from functools import partial from flock import get_tree_parameters, LightGBMBinaryClassifier_Batched import mlflow.sklearn import mlflow import pandas as pd import lightgbm as lgb from sklearn import metrics data_train = pd.read_csv('global_train_x_label_with_mapping.csv') data_test = pd.read_csv('global_test_x_label_with_mapping.csv') train_x = data_train.iloc[:, :-1].values train_y = data_train.iloc[:, (-1)].values test_x = data_test.iloc[:, :-1].values test_y = data_test.iloc[:, (-1)].values n_leaves = 8 n_trees = 100 clf = lgb.LGBMClassifier(num_leaves=n_leaves, n_estimators=n_trees) mlflow.log_param('clf_init_n_estimators', n_trees) mlflow.log_param('clf_init_num_leaves', n_leaves) clf.fit(train_x, train_y) mlflow.sklearn.log_model(clf, 'clf_model') score = metrics.precision_score(test_y, clf.predict(test_x), average='macro') mlflow.log_param('precision_score_average', ' macro') mlflow.log_param('score', score) print('Precision Score on Test Data: ' + str(score)) n_features = 100 activation = 'sigmoid' torch.set_num_threads(1) device = torch.device('cpu') model_name = 'griffon' model = clf.booster_.dump_model() n_features = clf.n_features_ tree_infos = model['tree_info'] pool = multiprocessing.Pool(8) parameters = pool.map(partial(get_tree_parameters, n_features=n_features), tree_infos) lgb_nn = LightGBMBinaryClassifier_Batched(parameters, n_features, activation ).to(device) torch.onnx.export(lgb_nn, torch.randn(1, n_features).to(device), model_name + '_nn.onnx', export_params=True, operator_export_type=torch.onnx. OperatorExportTypes.ONNX_ATEN_FALLBACK) passes = ['eliminate_deadend', 'eliminate_identity', 'eliminate_nop_monotone_argmax', 'eliminate_nop_transpose', 'eliminate_unused_initializer', 'extract_constant_to_initializer', 'fuse_consecutive_concats', 'fuse_consecutive_reduce_unsqueeze', 'fuse_consecutive_squeezes', 'fuse_consecutive_transposes', 'fuse_matmul_add_bias_into_gemm', 'fuse_transpose_into_gemm', 'lift_lexical_references'] model = onnx.load(model_name + '_nn.onnx') opt_model = optimizer.optimize(model, passes) mlflow.onnx.log_model(opt_model, 'opt_model') pyfunc_loaded = mlflow.pyfunc.load_pyfunc('opt_model', run_id=mlflow. active_run().info.run_uuid) scoring = pyfunc_loaded.predict(pd.DataFrame(test_x[:1].astype('float32')) ).values print('Scoring through mlflow pyfunc: ', scoring) mlflow.log_param('pyfunc_scoring', scoring[0][0]) User code Instrumented code Flock
  • 9. Griffon: why is my job slow today? Current OnCall Workflow Revised OnCall Workflow with Griffon A support engineer (SE) spends hours of manual labor looking through hundreds of metrics After 5-6 hours of investigation, the reason for job slow down is found. A job goes out of SLA and Support is alerted A job goes out of SLA and the SE is alerted The Job ID is fed through Griffon and the top reasons for job slowdown are generated automatically The reason is found in the top five generated by Griffon. All the metrics Griffon has looked at can be ruled out and the SE can direct their efforts to a smaller set of metrics. ACM SoCC 2019
  • 10.
  • 11. EGML applications in Microsoft Model Training TensorFlow Spark PyTorch … 1 2 Model Generation Conversion to ONNX Mlflow Model.v1 3 Serving SQL Server Model.vn H2O Keras … Scikit-learn Run --Tracks the runs (parameters, code versions, metrics, output files) -- Visualizes the output 4 ML Flow Model (ONNX flavor) SQL Server as artifact/backend store
  • 12. ONNX: Interoperability across ML frameworks Open format to represent ML models Backed by Microsoft, Amazon, Facebook, and several hardware vendors
  • 13. ONNX exchange format • Open format • Enables interoperability across frameworks • Many supported frameworks to import/export – Caffe2, PyTorch, CNTK, MXNet, TensorFlow, CoreML
  • 14. ONNX Runtime • Cross-platform, high-performance scoring engine for ONNX models • Open-sourced at the end of 2018 • ONNX Runtime is used in millions of Windows devices and powers core models across Office, Bing, and Azure Train a model using a popular framework such as TensorFlow Convert the model to ONNX format Perform inference efficiently across multiple platforms and hardware using ONNX runtime
  • 15. ONNX Runtime and optimizations Key design points: Graph IR Support for multiple backends (e.g., CPU, GPU, FPGA) Graph optimizations Rule-based optimizer inspired by DB optimizers Improved inference time and memory consumption Examples: 117msec à 34msec; 250MB à 200MB
  • 16. ~40 ONNX models in production >10 orgs are migrating models to ONNX Runtime Average Speedup 2.7x ONNX Runtime in production
  • 17. ONNX Runtime in production Office – Grammar Checking Model 14.6x reduction in latency
  • 18. MLflow + ONNX • MLflow (1.0.0) has now built-in support for ONNX models • ONNX model flavor for saving, loading and evaluating ONNX models Train a sklearn model
  • 19. Serving the ONNX model mlflow models serve -m /artifacts/model -p 1234 [6.379428821398614] Deploy the server Perform Inference ONNX Runtime is automatically invoked curl -XPOST-H"Content-Type:application/json; format=pandas-split"--data'{"columns":["alcohol", "chlorides", "citricacid", "density", "fixedacidity", "free sulfurdioxide","pH","residualsugar","sulphates","totalsulfurdioxide","volatileacidity"],"data":[[12.8,0.029,0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]]}' http://127.0.0.1:1234/invocations
  • 20. MLflow + SQL Server • MLflow can use SQL Server as an artifact store (and other RDBMSs as well) (PR) • The models are stored in binary format in the database along with other metadata such as model name, size, run_id, etc. client = MlflowClient() exp_name = “test" client.create_experiment(exp_name, artifact_location="mssql+pyodbc://sa:password@ipAddress:port/dbName?driver=ODBC+Driver+17+for+SQL+Server") mlflow.set_experiment(exp_name) mlflow.onnx.log_model(onnx, “model")
  • 21. Provenance in EGML applications • Need for end-to-end provenance tracking • Multiple systems involved in each pipeline SQL Data pre-processing Python Script Model Training • Compliance • Keeping ML models up-to-date
  • 22. Tracking provenance in Python scripts Python Script Python AST generation Dependencies between variables and functions Semantic annotation through a knowledge base of common ML libraries • Automatically identify models, metrics, hyperparameters in python scripts • Answer questions such as: “Which columns in a dataset were used for model training?” Dataset #Scripts %ML models covered %Training Datasets Covered Kaggle 49 95% 61% Microsoft 37 100% 100%
  • 23. Future work • MLflow: – Integration with metadata management systems such as Apache Atlas • Flock: – Data Governance – Generalize and extend coverage of auto-tracking and ML à NN conversion. – Provenance of end-to-end pipelines • Combine with other systems (e.g., SQL, Spark)