Enterprise MLOps In Practice
Updated: July 2021
About MavenCode
MavenCode Confidential and Proprietary
MavenCode is an Artificial Intelligence solutions company located in Southlake, Texas. We provide training, product
development and consulting services, specializing in:
● Provisioning Scalable AI and ML platforms - OnPrem and in the Cloud
● Deployment & Development of Machine Learning Platforms - OnPrem and in the Cloud
● Enterprise Feature Store Development and Management
● Model Management and Governance
● Streaming Data Analytics and Edge IoT Model Deployments
● Document Understanding and Natural Language Processing with Artificial Intelligence
Industry Verticals We Serve
Retail Industry
● Recommendation Engines
● Customer Management
● Demand Analysis and Planning
● Logistics and Supply Management
Insurance Industry
● AI Infrastructure Tooling
● Claims Analysis and Processing
● Document Processing
● Damage Detection and Identification
Automotive Industry
● AI Infrastructure Tooling
● Near Real Time Car Telemetry Analysis
● Preemptive Maintenance Recommendation
Healthcare Industry
● Medical insurance claim analysis
● X-ray image analysis and diagnostics
● Data Driven decision making enablement
Energy Industry
● Capacity Planning and Demand Forecasting
● Preemptive Equipment Maintenance
Travel & Hospitality Industry
● Planning and Logistics
● Customer Recommendations
● Logistics, Planning and Forecasting
Telecom Industry
● Utilization Forecasting
● Churn Rate Analysis
● Preemptive Maintenance of Equipment
Agriculture Industry
● Precision Farming
● Mechanical Utilization Rate and Planning
● Capacity Planning
Agenda
1 Overview of Machine Learning Ops
2 MLOps Roles
3 MLOps Landscape
4 Discuss a Use Case
5 Questions and Answers
01
Overview of MLOps
Background of MLOps
As far back as 2014, a group of Google researchers published a paper on this subject...
Interest in MLOps
MLOps is not easy!
Launching a rocket is easy, but the ongoing operations of guiding it successfully into space afterward are hard
“It took me 3 weeks to develop the model. It’s been > 11 months, and it’s still
not deployed”
@ginablaber
“On average, 40% of companies said it takes more than a month to deploy
ML models into production”
thenewstack.io
Machine Learning Operations, or MLOps, simplifies the processes involved in deploying machine learning
models, in particular the handoff between the operations team and the machine learning researchers or
data scientists in the organization
What is Machine Learning Operations?
● The goal is to standardize and streamline Machine Learning life cycle management
● It is a critical component of any successful Machine Learning project in the enterprise
● Organizations generate long-term value and mitigate the risk associated with Machine Learning projects
So we can say with MLOps ...
Challenges In Enterprise ML
Reproducibility
● Not Easy to Reproduce ML Model Output on Each Iterative Run
● Constantly Changing Training Data
● Environment Configuration Consistency Issues
Reusability
● Training Pipelines are not
Componentized for Reusability
● No well defined way of doing Model
versioning and tagging
● Collaboration and sharing of source
code is not well defined
Manageability
● Managing model deployment and serving
between environments is difficult
● Versioning and Tracking model artifacts is
very difficult and complex
● No defined way to visually track updates
and changes
Automation
● Much of the deployment process is still manual
● Steps needed to update model parameters are not automated
● Most data science teams are not equipped with the right knowledge to take models to production
02
MLOps Roles
What People Think about Machine Learning
Machine Learning Code
Hidden Technical Debt of ML Deployment
[Figure: Machine Learning Code shown as one small box among Data Verification, Configuration, Feature Extraction, Data Validation, Machine Resource Management, Serving Infrastructure, Monitoring, and Analysis Tools]
MLOps Roles and Responsibilities
ML Architects
● Ensure a scalable and flexible environment for ML model pipelines
● Introduce new technologies that improve ML model performance in production
● Identify bottlenecks in the production system and pinpoint solutions for long-term improvements
Model Risk Managers/Auditors
● Analyze initial business goals and model outcomes
● Minimize the overall risk arising from ML models in production
● Ensure compliance with internal and external requirements before pushing ML models to production
DevOps
● Conduct and build operational systems
● Test systems for security, performance and availability
● CI/CD pipeline management
ML Engineers
● Integrate ML models in the company's applications
● Ensure seamless working of ML models with non-ML based applications
● Maintain functional ML models in production
Data Engineers
● Identify the right data for a project
● Optimize the retrieval and use of data to power ML models
● Resolve underlying issues in data pipelines
Data Scientists
● Build models that address business needs
● Deliver operationalizable models for the production environment
● Assess model quality
Subject Matter Experts
● Provide business questions for framing ML models
● Define business KPIs to be achieved
● Evaluate model performance
ML Team Workflow
1. Develop Models: Business Questions, Data Acquisition, Feature Engineering, Data Preparation, Model Training/Experimentation, Model Evaluation and Comparison
2. Prepare for Production: Runtime Environment, Risk Evaluation, QA, Scalability, Containerization, Continuous Integration
3. Development to Production
4. Monitoring & Feedback: Logging/Alerting, Input Drift Tracking, Online Evaluation, Performance Drift
Roles shown across the stages: Data Scientists, Model Risk Managers/Auditors, Subject Matter Experts, DevOps, Data Engineers, Software Engineers, ML Architects
03
MLOps Landscape
Machine Learning Pipeline
● Data Extraction: streaming sources (MQTT, Kafka, Pub/Sub, etc.) and batch job operations on databases, file storage, distributed storage, etc.
● Data Preparation & Analysis: Data QA and Validation, Feature Engineering
● Model Training/Validation: Model Training, Model Versioning
● Deployment / Inferencing: Model Serving, Prediction Service, Monitoring, Logging, App Integration
Typical ML Engineer or Data Scientist Workflow
Data Sourcing -> Pre Processing -> Feature Engineering -> Model Training / Evaluation -> Model Scoring / Management -> Model Inferencing
● Storage (Raw Data Processing and Transformation Pipeline): Azure Storage, Google Storage, AWS S3 Storage; Raw Data -> Transformation -> Processed Data
● Compute (Cloud Training Platforms): GCP Vertex, AWS SageMaker, Azure ML, on-prem KF
Data Scientists / ML Engineers pull and process data first, before starting ML training on a Managed Cloud Service
Running ML workflows across the enterprise, with multiple teams using different Cloud Provider technology stacks
● Teams: Team A, Team B, Team C, Team D
● Training platforms in use: Google Cloud AI, AWS SageMaker, KF on prem, Azure ML
● Shared upstream stages: Data Sourcing -> Pre Processing -> Feature Engineering
● Storage: Azure Storage, Google Storage, AWS S3 Storage; Raw Data -> Transformation -> Processed Data
At scale, it gets complex ...
To simplify the Complexities can we abstract our ML Pipeline...
Data Sourcing -> Pre Processing -> Feature Engineering -> Model Training / Evaluation -> Model Scoring / Management -> Model Inferencing
1. Storage: Feature Store
2. Compute: Kubernetes
Options for each abstraction:
1. Storage: Feature Store
- Vertex AI Feature Store (Managed Service)
- Feast
- Databricks Feature Store
2. Compute: Kubeflow on Kubernetes, Vertex AI
1. Feature Store In MLOps
What’s Feature Store All About
A feature is a measurable, observable attribute that is part of the input to a Machine Learning model.
[Feature Vector] X1, X2, X3, ... Xn -> Model Training -> Model
Features are derived from
● Raw Datastore
● Streaming Datasource
● Aggregates of Raw Inputs
● Windows (mins, hourly, daily, weekly)
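As a rough illustration of the last two bullets, here is a minimal sketch that derives an hourly aggregate feature from raw events. The event fields (`customer_id`, `ts`, `amount`) and the function name `hourly_spend` are hypothetical and not taken from the deck.

```python
from collections import defaultdict
from datetime import datetime


def hourly_spend(events):
    """Sum `amount` per (customer_id, hour window) from raw event dicts."""
    totals = defaultdict(float)
    for event in events:
        # Truncate the timestamp to the start of its hour to form the window key.
        window = event["ts"].replace(minute=0, second=0, microsecond=0)
        totals[(event["customer_id"], window)] += event["amount"]
    return dict(totals)


# Hypothetical raw events for one customer.
raw_events = [
    {"customer_id": "c1", "ts": datetime(2021, 7, 1, 9, 5), "amount": 20.0},
    {"customer_id": "c1", "ts": datetime(2021, 7, 1, 9, 40), "amount": 5.0},
    {"customer_id": "c1", "ts": datetime(2021, 7, 1, 10, 2), "amount": 7.5},
]
features = hourly_spend(raw_events)
```

The same pattern extends to daily or weekly windows by changing how the timestamp is truncated.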
Features Change Over time!
[Diagram: the feature vector X1, X2, X3, ... Xn feeding Model Training, repeated at successive points along a Time axis]
Feature Stores In MLOps
● Makes it easy to operationalize our ML workload, most importantly data management and storage for model training
● Features can be shared easily among teams running different model training pipelines
● We can version datasets and track changes easily
● Feature input attributes stay consistent between model training and serving
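To make the versioning and consistency points concrete, here is a minimal in-memory sketch, not any real feature store's API. The class `TinyFeatureStore`, its methods, and the feature name `flights_90d` are all hypothetical. Every write is kept with its timestamp, and both training and serving read through the same lookup.

```python
import bisect


class TinyFeatureStore:
    """In-memory store that keeps every timestamped version of a feature value."""

    def __init__(self):
        # (entity_id, feature_name) -> list of (timestamp, value), kept sorted
        self._history = {}

    def write(self, entity_id, feature_name, ts, value):
        versions = self._history.setdefault((entity_id, feature_name), [])
        bisect.insort(versions, (ts, value))

    def read(self, entity_id, feature_name, as_of):
        """Return the latest value written at or before `as_of`, or None."""
        versions = self._history.get((entity_id, feature_name), [])
        timestamps = [ts for ts, _ in versions]
        idx = bisect.bisect_right(timestamps, as_of)
        return versions[idx - 1][1] if idx else None


store = TinyFeatureStore()
store.write("c1", "flights_90d", ts=100, value=3)
store.write("c1", "flights_90d", ts=200, value=5)

# Training reads the value as it was at the time of the labelled event;
# serving reads the latest value through the same code path.
training_value = store.read("c1", "flights_90d", as_of=150)
serving_value = store.read("c1", "flights_90d", as_of=250)
```

Reading by timestamp is what lets a training set be rebuilt later with the feature values that were valid at the time.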
Getting Data into a Feature Store
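import kfp
from kfp import components

# KafkaDatastreamer, ValidatorOnSchema, PreProcessor and FeatureStoreWriter are
# plain Python functions assumed to be defined earlier in the notebook.
# Wrap each one as a reusable Kubeflow Pipelines (KFP SDK v1) component.
KafkaDatastreamer_op = components.create_component_from_func(
    KafkaDatastreamer, base_image="python:3.7.1")
ValidatorOnSchema_op = components.create_component_from_func(
    ValidatorOnSchema, base_image="python:3.7.1")
PreProcessor_op = components.create_component_from_func(
    PreProcessor, base_image="python:3.7.1")
# The feature store writer step runs on a Spark image.
FeatureStoreWriter_op = components.create_component_from_func(
    FeatureStoreWriter, base_image="mavencode.io/spark:v3.1.1")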
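The component names suggest a four-stage flow: stream, validate, preprocess, write. The deck does not show the function bodies, so the sketch below uses hypothetical stand-ins (plain Python, no Kafka, Spark, or KFP) only to show how the stages could hand records to one another.

```python
def stream_records():
    """Stand-in for the streaming source; a real step would consume from Kafka."""
    yield {"customer_id": "c1", "flight_distance": "1200"}
    yield {"customer_id": None, "flight_distance": "300"}  # fails validation
    yield {"customer_id": "c2", "flight_distance": "870"}


def is_valid(record, required=("customer_id", "flight_distance")):
    """Stand-in schema check: every required field must be present and non-null."""
    return all(record.get(field) is not None for field in required)


def preprocess(record):
    """Stand-in preprocessing: cast the distance from string to int."""
    return {**record, "flight_distance": int(record["flight_distance"])}


def run_ingestion(source, sink):
    """Chain the stages and write surviving rows into `sink`, keyed by customer."""
    for record in source:
        if is_valid(record):
            row = preprocess(record)
            sink[row["customer_id"]] = row
    return sink


# A dict plays the role of the feature store here.
feature_table = run_ingestion(stream_records(), {})
```

In a pipeline each of these stand-ins would become its own containerized step, which is what wrapping functions as components enables.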
2. Kubeflow for MLOps
Why Machine Learning with Kubeflow?
With Kubeflow out of the box on Kubernetes, we can easily have:
● Composability
● Portability
● Scalability
What is Kubeflow
● Machine learning toolkit for Kubernetes.
● Platform to productionize ML models, making them simple, scalable and
reliable.
● Collection of Cloud native tools for all the stages of a model development
life cycle.
● Build integrated end-to-end pipelines which connect all the stages of a
model development life cycle.
Simply Put ...
Kubeflow Simplifies your Model Development Life Cycle (MDLC)
Kubeflow Overview
● ML Tools: Chainer, Jupyter, MPI, Scikit-Learn, PyTorch, TensorFlow, MXNet, XGBoost
● Kubeflow Applications: Jupyter Notebook, Chainer Operator, MPI Operator, MXNet Operator, PyTorch Operator, TFJob Operator, XGBoost Operator, Hyperparameter Tuning (Katib), Fairing, Metadata, Pipelines, Kubeflow UI
● Serving: KFServing, TensorFlow Batch Prediction, PyTorch Serving, TensorFlow Serving, Seldon Core Serving
● Supporting components: Knative Serving, Istio, Argo, Prometheus
● Platform: Kubernetes
Enterprise Machine Learning with Kubeflow
MLOps Training and Deployment Platform
● In-Cluster Traffic Control by Istio: RBAC, UI access with an SSO identity-compatible proxy
● Authentication and Authorization
● Data Science Team: Data Scientist 1, Data Scientist 2, Data Scientist 3
● One Kubeflow Jupyter Notebook per namespace: Bob, Dav, Chuck, Team
● Kubeflow Managed Model Infrastructure
● Auto-Scalable CPU Node Pool and Auto-Scalable GPU Node Pool
Vertex AI
https://codelabs.developers.google.com/vertex-pipelines-intro#6
04
Let’s go through a Scenario
Airline Customer Prediction
● The dataset is from Kaggle.
● The data comes from an airline whose actual name is withheld for various reasons; it is given the pseudonym Invistico Airlines.
● The dataset (23 columns and 129,880 entries) holds details of customers who have already flown with the airline.
Roles: Data Scientists, Subject Matter Experts
Problem Statement
Customer satisfaction is a priority in the airline industry.
Unhappy or disengaged customers naturally mean fewer passengers and less revenue.
Satisfaction is rarely about the flight alone; it also reflects the whole experience from booking to landing. This scenario
therefore aims to build a machine learning model that uses all salient features in the data to predict customer satisfaction.
Data Analysis
Roles: Data Scientists, Subject Matter Experts
Customers on business class seats were the most satisfied.
The dataset showed more satisfied customers than otherwise, with 54.7% of
the surveyed customers reporting satisfaction with their experiences
Exploratory Data Analysis
There were more female travelers than males and more females
reported satisfaction with their experiences.
Most customers travelled for business purposes and satisfaction was
higher in business travelers.
Heatmap showing Feature Correlation
Roles: Data Scientists, Subject Matter Experts
Feature Engineering
Roles: Data Scientists, Data Engineers
To make the data fit for our machine learning model, we performed the
following feature engineering steps:
1. Removing outliers
2. Dropping rows with null values
3. Dropping and combining columns with little or no correlation with our
target variable
4. Converting Categorical features to numbers
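A minimal sketch of steps 1, 2 and 4 on toy rows. The deck does not say which outlier rule or encoding was used, so the IQR rule and the integer label encoding below are assumptions, as are the column names `delay_min` and `seat_class`. Nulls are dropped first here so that the quartiles can be computed.

```python
import statistics


def drop_null_rows(rows):
    """Step 2: keep only rows where every column has a value."""
    return [row for row in rows if all(v is not None for v in row.values())]


def remove_outliers_iqr(rows, column, k=1.5):
    """Step 1 (assumed rule): drop rows outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [row[column] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [row for row in rows if low <= row[column] <= high]


def encode_categorical(rows, column):
    """Step 4 (assumed encoding): map each category to an integer code."""
    codes = {cat: i for i, cat in enumerate(sorted({row[column] for row in rows}))}
    return [{**row, column: codes[row[column]]} for row in rows], codes


rows = [
    {"delay_min": 0, "seat_class": "Eco"},
    {"delay_min": 5, "seat_class": "Business"},
    {"delay_min": 10, "seat_class": "Eco"},
    {"delay_min": 12, "seat_class": "Business"},
    {"delay_min": 15, "seat_class": "Eco"},
    {"delay_min": 20, "seat_class": "Eco"},
    {"delay_min": 600, "seat_class": "Business"},  # extreme delay
    {"delay_min": None, "seat_class": "Eco"},      # missing value
]

clean = remove_outliers_iqr(drop_null_rows(rows), "delay_min")
encoded, codes = encode_categorical(clean, "seat_class")
```

Step 3 (dropping weakly correlated columns) is left out because it depends on the correlation analysis of the full dataset.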
Roles: Data Scientists, Data Engineers
[Figure: Before Outlier Removal vs. After Outlier Removal]
Feature Engineering: Outlier Removal
Feature Engineering Data Pipeline
● Load data: reads data from source.
● Dataset Statistics: displays summary statistics of the data.
● Dataset Schema: automatically generates a schema by
inferring types, categories, and ranges from the data.
● Dataset Validation: uses the inferred schema to detect
anomalies in the data.
● Feature Engineering: performs necessary preprocessing
and feature engineering steps on the dataset.
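The schema and validation steps above can be sketched in plain Python as follows. This is a simplified stand-in, not the tooling used in the scenario: the function names and the example columns (`age`, `travel_type`) are hypothetical. A schema is inferred from a reference batch, then a new batch is checked against it.

```python
def infer_schema(rows):
    """Infer, per column, either a numeric range or a set of allowed categories."""
    schema = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        if all(isinstance(v, (int, float)) for v in values):
            schema[column] = {"type": "number", "min": min(values), "max": max(values)}
        else:
            schema[column] = {"type": "category", "values": set(values)}
    return schema


def find_anomalies(rows, schema):
    """Return (row_index, column, value) for every value that violates the schema."""
    anomalies = []
    for i, row in enumerate(rows):
        for column, rule in schema.items():
            value = row.get(column)
            if rule["type"] == "number":
                ok = isinstance(value, (int, float)) and rule["min"] <= value <= rule["max"]
            else:
                ok = value in rule["values"]
            if not ok:
                anomalies.append((i, column, value))
    return anomalies


reference = [
    {"age": 25, "travel_type": "Business"},
    {"age": 60, "travel_type": "Personal"},
]
new_batch = [
    {"age": 41, "travel_type": "Business"},
    {"age": 430, "travel_type": "Cargo"},  # out-of-range age, unseen category
]

schema = infer_schema(reference)
anomalies = find_anomalies(new_batch, schema)
```

Running validation before feature engineering keeps malformed rows from silently shifting the training distribution.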
Model Training with ML Operators on Kubeflow
● An ML operator helps to deploy, monitor and manage the lifecycle of a training job.
● Kubeflow operators include:
○ tf-operator
○ pytorch-operator
○ xgboost-operator
○ mpi-operator, and many more, which can be found on the official Kubeflow account.
ML Operators - Overview
Model Training with Tensorflow Operator
● TensorFlow Operator (tf-operator) is one of the operators offered by Kubeflow. It makes it easy to run and
monitor both distributed and non-distributed TensorFlow jobs on Kubernetes.
● Training TensorFlow models with tf-operator relies on centralized parameter servers for
coordination between workers. It supports the TensorFlow framework only.
● After preprocessing our data, we built a TensorFlow neural network model.
● Our TensorFlow model had an accuracy of approximately 88%.
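For distributed jobs, the operator tells each replica about its peers through the `TF_CONFIG` environment variable. The sketch below parses an example value to show how a replica can find its own role and the parameter servers. The host names are hypothetical, and the helper `describe_replica` is ours, not part of any library.

```python
import json
import os

# Example value only; on a real TFJob the operator sets TF_CONFIG for each pod.
example_tf_config = json.dumps({
    "cluster": {
        "ps": ["demo-ps-0:2222"],
        "worker": ["demo-worker-0:2222", "demo-worker-1:2222"],
    },
    "task": {"type": "worker", "index": 1},
})


def describe_replica(raw_config):
    """Summarize this replica's role and the size of the training cluster."""
    config = json.loads(raw_config)
    task, cluster = config["task"], config["cluster"]
    return {
        "role": task["type"],
        "index": task["index"],
        "address": cluster[task["type"]][task["index"]],
        "num_workers": len(cluster.get("worker", [])),
        "num_ps": len(cluster.get("ps", [])),
    }


# Inside a pod we would read the real variable; fall back to the example elsewhere.
replica_in_pod = describe_replica(os.environ.get("TF_CONFIG", example_tf_config))
replica = describe_replica(example_tf_config)
```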
Hyperparameter Tuning
Roles: Model Risk Managers/Auditors, ML Engineers, Data Scientists
Hyperparameters: configuration values that are external to the model and are always set before the model
training process begins
Selecting the right hyperparameters can significantly improve model performance in production
Hyperparameter Tuning: finding the hyperparameter input values that optimize the objective
function of the model training
What is Hyperparameter Tuning?
(a1, b1, c1,.....zN)
(a2, b2, c2,.....zN)
(a3, b3, c3,.....zN)
ml.trainModel(layers=10, batch=20, learning_rate=0.2)

Hyperparameters                        | Parameters          | Score
layers=13, batch=12, learning_rate=0.2 | weight optimization | 85
layers=14, batch=14, learning_rate=0.1 | weight optimization | 89
layers=15, batch=11, learning_rate=0.5 | weight optimization | 94
layers=5, batch=10, learning_rate=0.4  | weight optimization | 91
layers=4, batch=20, learning_rate=0.3  | weight optimization | 81
● Tuning by hand is very inefficient, error-prone and difficult to track
● Capturing metrics across multiple jobs and comparing them is difficult
● Efficiently allocating resources and infrastructure on the cluster to handle all the job runs is not an easy task
● As more hyperparameters are added, the combinatorial search space of possible inputs to maximize the
training objective function grows exponentially
Hyperparameter Tuning is Hard!
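A quick way to see the exponential growth: with a handful of candidate values per hyperparameter, the number of grid points is the product of the list lengths. The candidate values below reuse numbers from the table on the earlier slide; the added `dropout` hyperparameter is hypothetical.

```python
import itertools
import math

search_space = {
    "layers": [4, 5, 13, 14, 15],
    "batch": [10, 11, 12, 14, 20],
    "learning_rate": [0.1, 0.2, 0.3, 0.4, 0.5],
}


def grid_size(space):
    """Number of combinations in a full grid over the search space."""
    return math.prod(len(values) for values in space.values())


def grid(space):
    """Yield every combination as a dict of hyperparameter name -> value."""
    names = list(space)
    for combo in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


three_params = grid_size(search_space)

# Adding one more hyperparameter with five values multiplies the grid by five.
bigger_space = {**search_space, "dropout": [0.0, 0.1, 0.2, 0.3, 0.4]}
four_params = grid_size(bigger_space)
```

With five values each, three hyperparameters give 5^3 combinations and four give 5^4, so each extra hyperparameter multiplies the number of training jobs.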
Hyperparameter Tuning with Katib on Kubeflow
Katib is the Hyperparameter tuning component of Kubeflow
It is Language and Framework Agnostic
- Tensorflow
- Pytorch
- MxNet
- XGBoost
Customizable hyperparameter search algorithms
- Random Search
- Grid search
- Bayesian Optimization
- Hyperband
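As an illustration of the simplest of these algorithms, here is a minimal random search in plain Python. It is a sketch of the idea, not Katib's implementation. `toy_objective` is a hypothetical stand-in for a real training run, and the ranges are made up.

```python
import random


def toy_objective(params):
    """Hypothetical stand-in for a training run; best at learning_rate=0.2, layers=10."""
    return 1.0 - abs(params["learning_rate"] - 0.2) - 0.01 * abs(params["layers"] - 10)


def random_search(objective, n_trials, seed=0):
    """Sample hyperparameters uniformly at random and keep the best-scoring set."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.5),
            "layers": rng.randint(4, 15),
        }
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(toy_objective, n_trials=50)
```

Grid search would instead enumerate every combination, while Bayesian optimization and Hyperband use earlier results to decide what to try or stop next.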
1. Experiment: An experiment is a single tuning run, also called an optimization run. You specify configuration
settings to define the experiment. The following are the main configurations:
● Objective: What you intend to optimize. This is the objective metric, also called the target variable.
● Search Space: The set of all possible hyperparameter values that the hyperparameter tuning job
should consider for optimization, and the constraints for each hyperparameter.
● Search Algorithm: The algorithm to use when searching for the optimal hyperparameter values.
Katib Concepts
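The three settings above map onto fields of a Katib Experiment resource. The sketch below writes such a spec as a Python dict. Field names follow the Katib v1beta1 API as we understand it, so verify them against your installed version. The experiment name, namespace, goal and ranges are hypothetical, and the trial template is omitted, so this is not ready to submit as-is.

```python
experiment = {
    "apiVersion": "kubeflow.org/v1beta1",
    "kind": "Experiment",
    "metadata": {"name": "airline-satisfaction-tuning", "namespace": "kubeflow"},
    "spec": {
        # Objective: what to optimize.
        "objective": {
            "type": "maximize",
            "goal": 0.95,
            "objectiveMetricName": "accuracy",
        },
        # Search algorithm.
        "algorithm": {"algorithmName": "random"},
        "parallelTrialCount": 3,
        "maxTrialCount": 12,
        "maxFailedTrialCount": 3,
        # Search space: one entry per hyperparameter, with its constraints.
        "parameters": [
            {"name": "learning_rate", "parameterType": "double",
             "feasibleSpace": {"min": "0.01", "max": "0.5"}},
            {"name": "layers", "parameterType": "int",
             "feasibleSpace": {"min": "4", "max": "15"}},
        ],
        # trialTemplate (the training job to run per trial) is omitted here.
    },
}


def search_space_bounds(spec):
    """Return {parameter name: (min, max)} as floats, checking that min < max."""
    bounds = {}
    for parameter in spec["parameters"]:
        low = float(parameter["feasibleSpace"]["min"])
        high = float(parameter["feasibleSpace"]["max"])
        if low >= high:
            raise ValueError(f"empty range for {parameter['name']}")
        bounds[parameter["name"]] = (low, high)
    return bounds


bounds = search_space_bounds(experiment["spec"])
```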
Hyperparameter Tuning with Katib
Katib automates the Hyperparameter Tuning
process by running a pre-configured number of
training jobs (known as trials) in parallel.
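The idea of running a fixed number of trials in parallel can be sketched locally with a thread pool. Katib runs trials as jobs on the cluster rather than as threads, so this is only an analogy. `run_trial` is a hypothetical stand-in for a training job.

```python
from concurrent.futures import ThreadPoolExecutor


def run_trial(params):
    """Hypothetical stand-in for one training job; returns (params, score)."""
    score = 1.0 - abs(params["learning_rate"] - 0.2)
    return params, score


def run_experiment(trial_params, parallel_trials):
    """Run all trials with at most `parallel_trials` at a time; return the best."""
    with ThreadPoolExecutor(max_workers=parallel_trials) as pool:
        results = list(pool.map(run_trial, trial_params))
    return max(results, key=lambda result: result[1])


candidates = [{"learning_rate": lr} for lr in (0.1, 0.2, 0.3, 0.4, 0.5)]
best_params, best_score = run_experiment(candidates, parallel_trials=3)
```

The parallel trial count trades cluster resources for wall-clock time; the total number of trials stays the same.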
Result of Katib Experiment
With Katib hyperparameter tuning, accuracy increased from 88% to 92.1%
Model Serving with KFServing
● KFServing is Kubeflow's model deployment and serving toolkit
● To efficiently serve our model using KFServing, we built a Kubeflow pipeline to
load data, preprocess, train the model, make predictions, export and serve the model.
KFServing Prediction Request
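A prediction request against the KFServing V1 data plane is an HTTP POST to `/v1/models/<model-name>:predict` with a JSON body of the form `{"instances": [...]}`. The sketch below only builds such a request. The ingress URL, model name, `Host` header value and feature values are hypothetical, and the request is not sent.

```python
import json
import urllib.request


def build_predict_request(base_url, model_name, instances, host_header=None):
    """Build (but do not send) a V1-protocol predict request."""
    url = f"{base_url}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if host_header:
        # Routing through the cluster ingress is typically based on the Host header.
        headers["Host"] = host_header
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_predict_request(
    "http://ingress.example.com",
    "airline-satisfaction",
    instances=[[0.0, 1.0, 35.0, 1200.0]],  # one preprocessed feature row
    host_header="airline-satisfaction.default.example.com",
)

# Sending it would look like: urllib.request.urlopen(request)
```

The instance rows must match the feature order and encoding the model was trained on, which is one reason to share preprocessing between training and serving.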
Enterprise ML Operationalization Goal
End to End ML Operationalization Process
Thank you
Model Development Life Cycle (Data Scientist View)
Data Information Knowledge Insight
Data Scientist workflow essentially follows this path ...
Machine Learning Development Life Cycle (Production Deployment)
[Diagram labels: Model Training, Training Data, ETL, Tuning, Inferencing, Serving, Monitoring, Update]

Source talk: GDG Cloud Southlake #3, Charles Adetiloye: Enterprise MLOps in Practice

MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflowDatabricks
 
Infrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload DeploymentInfrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload DeploymentDatabricks
 
Navigating the ML Pipeline Jungle with MLflow: Notes from the Field with Thun...
Navigating the ML Pipeline Jungle with MLflow: Notes from the Field with Thun...Navigating the ML Pipeline Jungle with MLflow: Notes from the Field with Thun...
Navigating the ML Pipeline Jungle with MLflow: Notes from the Field with Thun...Databricks
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionProvectus
 
From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOpsCarl W. Handlin
 
BigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLMárton Kodok
 
Applying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analyticsApplying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analyticsMárton Kodok
 
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...
Deployment Design Patterns - Deploying Machine Learning and Deep Learning Mod...All Things Open
 
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)
AllThingsOpen 2018 - Deployment Design Patterns (Dan Zaratsian)dtz001
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabszekeLabs Technologies
 
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...






GDG Cloud Southlake #3 Charles Adetiloye: Enterprise MLOps in Practice

  • 1. Enterprise MLOps In Practice Updated: July 2021
  • 2. About MavenCode
    MavenCode Confidential and Proprietary
    MavenCode is an Artificial Intelligence solutions company located in Southlake, Texas. We provide training, product development, and consulting services, specializing in:
    ● Provisioning Scalable AI and ML platforms - OnPrem and in the Cloud
    ● Deployment & Development of Machine Learning Platforms - OnPrem and in the Cloud
    ● Enterprise Feature Store Development and Management
    ● Model Management and Governance
    ● Streaming Data Analytics and Edge IoT Model Deployments
    ● Document Understanding and Natural Language Processing with Artificial Intelligence
  • 3. Industry Verticals We Serve
    Retail Industry
    ● Recommendation Engines
    ● Customer Management
    ● Demand Analysis and Planning
    ● Logistics and Supply Management
    Insurance Industry
    ● AI Infrastructure Tooling
    ● Claims Analysis and Processing
    ● Document Processing
    ● Damage Detection and Identification
    Automotive Industry
    ● AI Infrastructure Tooling
    ● Near Real Time Car Telemetry Analysis
    ● Preemptive Maintenance Recommendation
    Healthcare Industry
    ● Medical Insurance Claim Analysis
    ● X-ray Image Analysis and Diagnostics
    ● Data Driven Decision Making Enablement
    Energy Industry
    ● Capacity Planning and Demand Forecasting
    ● Preemptive Equipment Maintenance
    Travel & Hospitality Industry
    ● Planning and Logistics
    ● Customer Recommendations
    ● Logistics, Planning and Forecasting
    Telecom Industry
    ● Utilization Forecasting
    ● Churn Rate Analysis
    ● Preemptive Maintenance of Equipment
    Agriculture Industry
    ● Precision Farming
    ● Mechanical Utilization Rate and Planning
    ● Capacity Planning
  • 4. Let’s Watch this Quick Video
  • 6. Agenda
    1 Overview of Machine Learning Ops
    2 MLOps Roles
    3 MLOps Landscape
    4 Discuss a Use Case
    5 Questions and Answers
  • 7. 01 Overview of MLOps
  • 8. Background of MLOps
    As far back as 2014, a group of Google researchers published a paper on this subject...
  • 9. Interest in MLOps
  • 10. MLOps is not easy!
    Launching a rocket is easy, but the ongoing operation of guiding it successfully into space afterward is hard.
  • 11. “It took me 3 weeks to develop the model. It’s been > 11 months, and it’s still not deployed” (@ginablaber)
    “On average, 40% of companies said it takes more than a month to deploy ML models into production” (thenewstack.io)
  • 12. What is Machine Learning Operations?
    Machine Learning Operations, or MLOps, helps simplify the processes involved in deploying machine learning models, bridging the operations team and the machine learning researchers or data scientists in the organization.
  • 13. So we can say with MLOps ...
    ● The goal is to standardize and streamline Machine Learning life cycle management
    ● It is a critical component of any successful Machine Learning project in the enterprise
    ● Organizations generate long-term value and mitigate the risk associated with Machine Learning projects
  • 14. Challenges In Enterprise ML
    Reproducibility
    ● It is not easy to reproduce ML model output on each iterative run
    ● Training data changes constantly
    ● Environment configuration is hard to keep consistent
    Reusability
    ● Training pipelines are not componentized for reuse
    ● There is no well-defined way of doing model versioning and tagging
    ● Collaboration and sharing of source code is not well defined
    Manageability
    ● Managing model deployment and serving between environments is difficult
    ● Versioning and tracking model artifacts is difficult and complex
    ● There is no defined way to visually track updates and changes
    Automation
    ● Much of the deployment process is still manual
    ● The steps needed to update model parameters are not automated
    ● Most data science teams are not equipped with the right knowledge to take models to production
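The reproducibility and versioning challenges above can be illustrated with a minimal sketch. It assumes a toy "model" (a seeded subsample mean) and hypothetical helper names (`fingerprint`, `train`, `run_experiment`) that are not from the slides; the point is only that recording the seed and a hash of the training data alongside the model makes a run repeatable and traceable.

```python
import hashlib
import json
import random


def fingerprint(rows):
    """Return a stable SHA-256 hash of the training rows (hypothetical helper)."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def train(rows, seed):
    """Toy 'training': the mean of a seeded random subsample stands in for a model."""
    rng = random.Random(seed)  # a local, seeded generator keeps the run deterministic
    sample = rng.sample(rows, k=max(1, len(rows) // 2))
    return sum(r["y"] for r in sample) / len(sample)


def run_experiment(rows, seed=42):
    """Train and return the model together with the metadata needed to rerun it."""
    return {
        "model": train(rows, seed),
        "seed": seed,
        "data_sha256": fingerprint(rows),
    }


rows = [{"x": i, "y": 2 * i} for i in range(10)]
first = run_experiment(rows)
second = run_experiment(rows)

# Same seed and same data fingerprint give an identical result.
print(first == second)
```

If the training data changes, the fingerprint changes with it, so a stored record shows which data version produced which model.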
  • 16. What People Think about Machine Learning
    ● Machine Learning Code
  • 17. Hidden Technical Debt of ML Deployment
    ● Data Verification
    ● Configuration
    ● Feature Extraction
    ● Data Validation
    ● Machine Resource Management
    ● Serving Infrastructure
    ● Monitoring
    ● Analysis Tool
    ● Machine Learning Code
  • 18. MLOps Roles and Responsibilities
    ML Architects
    ● Ensure a scalable and flexible environment for ML model pipelines
    ● Introduce new technologies that improve ML model performance in production
    ● Identify bottlenecks in the production system and pinpoint solutions for long-term improvements
    Model Risk Managers/Auditors
    ● Analyze initial business goals and model outcomes
    ● Minimize the overall risk posed by ML models in production
    ● Ensure compliance with internal and external requirements before pushing ML models to production
    DevOps
    ● Conduct and build operational systems
    ● Test systems for security, performance and availability
    ● CI/CD pipeline management
    ML Engineers
    ● Integrate ML models into the company’s applications
    ● Ensure ML models work seamlessly with non-ML applications
    ● Maintain functional ML models in production
    Data Engineers
    ● Identify the right data for a project
    ● Optimize the retrieval and use of data to power ML models
    ● Resolve underlying issues in data pipelines
    Data Scientists
    ● Build models that address business needs
    ● Deliver operationalizable models for the production environment
    ● Assess model quality
    Subject Matter Experts
    ● Provide business questions for framing ML models
    ● Define business KPIs to be achieved
    ● Evaluate model performance
  • 19. ML Team Workflow
    1. Develop Models: Business Questions, Data Acquisition, Feature Engineering, Data Preparation, Model Training/Experimentation, Model Evaluation and Comparison
    2. Prepare for Production: Runtime Environment, Risk Evaluation, QA, Scalability, Containerization, Continuous Integration
    3. Development to Production
    4. Monitoring & Feedback: Logging/Alerting, Input Drift Tracking, Online Evaluation, Performance Drift
    Roles shown across the stages: Data Scientists, Data Engineers, Software Engineers, ML Architects, DevOps, Model Risk Managers/Auditors, Subject Matter Experts
  • 21. Machine Learning Pipeline
    Data Extraction
    ● Streaming Source: data sources with streaming inputs such as MQTT, Kafka, Pub/Sub, etc.
    ● Batch Job Operations: batch operations on databases, file storage, distributed storage, etc.
    Data Preparation & Analysis: Data QA and Validation, Feature Engineering
    Model Training/Validation: Model Training, Model Serving, Model Versioning
    Deployment / Inferencing: Prediction Service, Monitoring, Logging, App Integration
  • 22. Typical ML Engineer or Data Scientist Workflow
    Stages: Data Sourcing, Pre Processing, Feature Engineering, Model Training / Evaluation, Model Scoring / Management, Model Inferencing
    Storage (raw data processing and transformation pipeline): Azure Storage, Google Storage, AWS S3 Storage; Raw Data, Transformation, Processed Data
    Compute (cloud training platforms): GCP Vertex, AWS SageMaker, Azure ML, on-prem KF
    Data scientists and ML engineers pull and process data first, before starting ML training on a managed cloud service.
  • 23. At scale, it gets complex ...
    Running ML workflows across the enterprise, with multiple teams using different cloud provider technology stacks
    Teams A, B, C and D: Google Cloud AI, AWS SageMaker, KF on prem, Azure ML
    Stages: Data Sourcing, Pre Processing, Feature Engineering
    Storage: Azure Storage, Google Storage, AWS S3 Storage; Raw Data, Transformation, Processed Data
    Compute
  • 24. To simplify the complexity, can we abstract our ML pipeline?
    Stages: Data Sourcing, Pre Processing, Feature Engineering, Model Training / Evaluation, Model Scoring / Management, Model Inferencing
    1. Storage: Feature Store
    2. Compute: Kubernetes
  • 25. To simplify the complexity, can we abstract our ML pipeline?
    Stages: Data Sourcing, Pre Processing, Feature Engineering, Model Training / Evaluation, Model Scoring / Management, Model Inferencing
    1. Storage: Feature Store
       - Vertex AI Feature Store (managed service)
       - Feast
       - Databricks Feature Store
    2. Compute: Kubeflow on Kubernetes, Vertex AI
  • 26. 1. Feature Store In MLOps
  • 27. What's a Feature Store All About?
    A feature is a measurable, observable attribute that is part of the input to a machine learning model.
    Diagram: feature vector [X1, X2, X3, ..., Xn] -> Model Training -> Model
  • 28. What's a Feature Store All About?
    Diagram: feature vector [X1, X2, X3, ..., Xn] -> Model Training -> Model
    Features are derived from
    ● Raw datastore
    ● Streaming data source
    ● Aggregates of raw inputs
    ● Windows (minutes, hourly, daily, weekly)
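To make "aggregates of raw inputs" over time windows concrete, here is a minimal sketch in plain Python. The event layout `(entity_id, timestamp, value)`, the entity names and the chosen aggregates are assumptions made for illustration; the slides do not specify how the windowed features were computed.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def window_start(ts: datetime, window: timedelta) -> datetime:
    """Floor a timestamp to the start of its fixed-size window."""
    epoch = datetime(1970, 1, 1)
    buckets = (ts - epoch) // window  # timedelta // timedelta yields an int
    return epoch + buckets * window


def windowed_features(events, window):
    """Aggregate (entity_id, timestamp, value) events into per-window features."""
    groups = defaultdict(list)
    for entity_id, ts, value in events:
        groups[(entity_id, window_start(ts, window))].append(value)
    return {
        key: {"count": len(vals), "sum": sum(vals), "mean": sum(vals) / len(vals)}
        for key, vals in groups.items()
    }


# Hypothetical raw events: (customer id, event time, amount).
events = [
    ("cust-1", datetime(2021, 7, 1, 9, 5), 120.0),
    ("cust-1", datetime(2021, 7, 1, 9, 40), 80.0),
    ("cust-1", datetime(2021, 7, 1, 10, 15), 50.0),
    ("cust-2", datetime(2021, 7, 1, 9, 30), 10.0),
]

hourly = windowed_features(events, timedelta(hours=1))
daily = windowed_features(events, timedelta(days=1))
```

The same raw events yield different feature values depending on the window, which is one reason a feature store records the window definition together with the feature.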
  • 29. Features Change Over Time!
    Diagram: the feature vector [X1, X2, X3, ..., Xn] feeding Model Training, repeated at successive points along a time axis
  • 30. Feature Stores In MLOps
    ● They make it easy to operationalize ML workloads, most importantly data management and storage for model training
    ● Features can be shared easily among teams running different model training pipelines
    ● Datasets can be versioned and their changes tracked easily
    ● Feature input attributes stay consistent between model training and serving
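The versioning and training/serving consistency points can be sketched with a toy, in-memory store. This is not the API of Feast, Vertex AI Feature Store or Databricks Feature Store; the class, method names, feature names and integer timestamps below are hypothetical and only illustrate the idea of reading features "as of" a point in time.

```python
import bisect
from collections import defaultdict


class ToyFeatureStore:
    """In-memory, timestamp-versioned feature store (illustrative only)."""

    def __init__(self):
        # (entity_id, feature_name) -> list of (timestamp, value), kept sorted
        self._history = defaultdict(list)

    def write(self, entity_id, feature_name, timestamp, value):
        """Record a new version of a feature value at `timestamp`."""
        bisect.insort(self._history[(entity_id, feature_name)], (timestamp, value))

    def read_as_of(self, entity_id, feature_names, timestamp):
        """Return each feature's latest value at or before `timestamp`."""
        vector = []
        for name in feature_names:
            history = self._history.get((entity_id, name), [])
            timestamps = [ts for ts, _ in history]
            idx = bisect.bisect_right(timestamps, timestamp)
            vector.append(history[idx - 1][1] if idx else None)
        return vector


store = ToyFeatureStore()
store.write("cust-1", "avg_delay_7d", 1, 12.0)
store.write("cust-1", "avg_delay_7d", 5, 20.0)  # the feature changes over time
store.write("cust-1", "flights_30d", 2, 3)

names = ["avg_delay_7d", "flights_30d"]
training_vector = store.read_as_of("cust-1", names, timestamp=3)
serving_vector = store.read_as_of("cust-1", names, timestamp=6)
```

Because training and serving both go through the same `read_as_of` call with the same feature names, they receive vectors with identical layout, and a past training set can be rebuilt by replaying the earlier timestamp.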
  • 31. Getting Data into a Feature Store
    import kfp
    from kfp import components

    # Wrap each Python function as a reusable Kubeflow Pipelines component.
    # KafkaDatastreamer, ValidatorOnSchema, PreProcessor and FeatureStoreWriter
    # must be defined before this point (their bodies are not shown on the slide).
    KafkaDatastreamer_op = components.create_component_from_func(
        KafkaDatastreamer, base_image="python:3.7.1")
    ValidatorOnSchema_op = components.create_component_from_func(
        ValidatorOnSchema, base_image="python:3.7.1")
    PreProcessor_op = components.create_component_from_func(
        PreProcessor, base_image="python:3.7.1")
    # The writer step runs on a Spark image instead of the plain Python image.
    FeatureStoreWriter_op = components.create_component_from_func(
        FeatureStoreWriter, base_image="mavencode.io/spark:v3.1.1")
  • 32. 2. Kubeflow for MLOps
  • 33. Challenges In Enterprise ML (recap of slide 14): Reproducibility, Reusability, Manageability, Automation
  • 34. Why Machine Learning with Kubeflow?
    With Kubeflow out of the box on Kubernetes, we can easily have
    ● Composability
    ● Portability
    ● Scalability
  • 35. What is Kubeflow?
    ● Machine learning toolkit for Kubernetes.
    ● Platform to productionize ML models, making them simple, scalable and reliable.
    ● Collection of cloud native tools for all the stages of a model development life cycle.
    ● Build integrated end-to-end pipelines which connect all the stages of a model development life cycle.
  • 36. Simply put ... Kubeflow simplifies your Model Development Life Cycle (MDLC)
  • 37. Kubeflow Overview
    ML Tools: Chainer, Jupyter, MPI, Scikit-Learn, PyTorch, TensorFlow, MXNet, XGBoost
    Kubeflow Applications: Jupyter Notebook, Chainer Operator, MPI Operator, MXNet Operator, PyTorch Operator, TFJob Operator, XGBoost Operator, Hyperparameter Tuning (Katib), Fairing, Metadata, Pipelines, Kubeflow UI, KFServing, TensorFlow Batch Prediction, PyTorch Serving, TensorFlow Serving, Seldon Core Serving, Knative Serving, Istio, Argo, Prometheus
    Kubernetes
  • 39. Enterprise Machine Learning with Kubeflow: MLOps Training and Deployment Platform
    ● Authentication and authorization
    ● UI access with SSO through an Identity Compatible Proxy
    ● In-cluster traffic control by Istio (RBAC)
    ● One namespace per user or team (Bob, Dav, Chuck, Team), each with its own Kubeflow Jupyter Notebook, for Data Scientist 1, Data Scientist 2, Data Scientist 3 and the Data Science Team
    ● Kubeflow-managed model infrastructure
    ● Auto-scalable CPU node pool and auto-scalable GPU node pool
  • 40. Vertex AI: https://codelabs.developers.google.com/vertex-pipelines-intro#6
  • 41. 04 Let’s go through a Scenario
  • 42. Airline Customer Prediction (Data Scientists, Subject Matter Experts)
    ● The dataset is from Kaggle.
    ● The data comes from an airline whose actual name is not disclosed; it is given the pseudonym Invistico airlines.
    ● The dataset (23 columns and 129,880 entries) holds details of customers who have already flown with the airline.
  • 43. Problem Statement
    Customer satisfaction is a priority in the airline industry. Unhappy or disengaged customers naturally mean fewer passengers and less revenue. Because satisfaction rarely depends on the flight alone but on the whole experience from booking to landing, this scenario aims to build a machine learning model that uses all salient features in the data to predict customer satisfaction.
  • 44. Data Analysis (Data Scientists, Subject Matter Experts)
  • 45. Exploratory Data Analysis
    ● Customers in business class seats were the most satisfied.
    ● The dataset showed more satisfied customers than dissatisfied ones: 54.7% of the surveyed customers reported satisfaction with their experience.
    ● There were more female travelers than male travelers, and more females reported satisfaction with their experience.
    ● Most customers travelled for business purposes, and satisfaction was higher among business travelers.
  • 46. Heatmap showing Feature Correlation (Data Scientists, Subject Matter Experts)
  • 47. Feature Engineering (Data Scientists, Data Engineers)
  • 48. Feature Engineering (Data Scientists, Data Engineers)
    To make the data fit for our machine learning model, we performed the following feature engineering steps:
    1. Removing outliers
    2. Dropping rows with null values
    3. Dropping or combining columns with little or no correlation with the target variable
    4. Converting categorical features to numbers
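The four steps above can be sketched in plain Python as follows. The slides do not say which outlier rule, columns or encoding were used, so the 1.5 x IQR rule, the column names (`class`, `delay`, `seat_id`) and the integer encoding are assumptions for illustration. Null rows are dropped first here only so that the quantile computation never sees missing values.

```python
import statistics


def drop_null_rows(rows):
    """Step 2: keep only rows in which every value is present."""
    return [r for r in rows if all(v is not None for v in r.values())]


def remove_outliers_iqr(rows, column, k=1.5):
    """Step 1: drop rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles([r[column] for r in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if low <= r[column] <= high]


def drop_columns(rows, columns):
    """Step 3: remove columns judged uninformative for the target."""
    return [{k: v for k, v in r.items() if k not in columns} for r in rows]


def encode_categorical(rows, column):
    """Step 4: map each category to an integer code."""
    mapping = {c: i for i, c in enumerate(sorted({r[column] for r in rows}))}
    return [{**r, column: mapping[r[column]]} for r in rows], mapping


# Hypothetical sample rows, not taken from the Kaggle dataset.
raw = [
    {"class": "Business", "delay": 5, "seat_id": "1A"},
    {"class": "Eco", "delay": 7, "seat_id": "22C"},
    {"class": "Eco", "delay": 6, "seat_id": "30F"},
    {"class": "Business", "delay": 8, "seat_id": "2B"},
    {"class": "Eco", "delay": 9, "seat_id": "18D"},
    {"class": "Business", "delay": 7, "seat_id": "3A"},
    {"class": "Eco", "delay": 6, "seat_id": "25E"},
    {"class": "Eco", "delay": 500, "seat_id": "27A"},   # outlier
    {"class": "Eco", "delay": None, "seat_id": "19B"},  # null value
]

rows = drop_null_rows(raw)
rows = remove_outliers_iqr(rows, "delay")
rows = drop_columns(rows, {"seat_id"})
rows, class_codes = encode_categorical(rows, "class")
```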
  • 49. Feature Engineering: Outlier Removal
    Figure panels: Before Outlier Removal, After Outlier Removal
  • 50. Feature Engineering Data Pipeline
    ● Load data: reads data from source.
    ● Dataset Statistics: displays summary statistics of the data.
    ● Dataset Schema: automatically generates a schema by inferring types, categories, and ranges from the data.
    ● Dataset Validation: uses the inferred schema to detect anomalies in the data.
    ● Feature Engineering: performs necessary preprocessing and feature engineering steps on the dataset.
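The Dataset Schema and Dataset Validation steps can be illustrated with a small plain-Python sketch. The slides do not name the tool used for these steps, so the functions, the schema layout and the sample columns below are hypothetical; libraries such as TensorFlow Data Validation provide this kind of schema inference and anomaly detection in practice.

```python
def infer_schema(rows):
    """Infer a type plus a numeric range or a category set for each column."""
    schema = {}
    for column in rows[0]:
        values = [r[column] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            schema[column] = {"type": "numeric", "min": min(values), "max": max(values)}
        else:
            schema[column] = {"type": "categorical", "domain": set(values)}
    return schema


def validate(rows, schema):
    """Return (row_index, column, problem) for every value violating the schema."""
    anomalies = []
    for i, row in enumerate(rows):
        for column, spec in schema.items():
            if column not in row:
                anomalies.append((i, column, "missing column"))
                continue
            value = row[column]
            if spec["type"] == "numeric":
                if not isinstance(value, (int, float)):
                    anomalies.append((i, column, "not numeric"))
                elif not spec["min"] <= value <= spec["max"]:
                    anomalies.append((i, column, "out of range"))
            elif value not in spec["domain"]:
                anomalies.append((i, column, "unexpected category"))
    return anomalies


# Schema inferred from (hypothetical) training rows, then applied to new data.
training_rows = [{"age": 30, "class": "Eco"}, {"age": 52, "class": "Business"}]
schema = infer_schema(training_rows)

new_rows = [
    {"age": 41, "class": "Eco"},
    {"age": 140, "class": "Eco"},
    {"age": 35, "class": "First"},
    {"class": "Eco"},
]
anomalies = validate(new_rows, schema)
```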
  • 51. Model Training with ML Operators on Kubeflow
  • 52. ML Operators - Overview
    ● An ML operator helps to deploy, monitor and manage the lifecycle of a training job.
    ● Kubeflow operators include
      ○ tf-operator
      ○ pytorch-operator
      ○ xgboost-operator
      ○ mpi-operator
      and many more, which can be found on the official Kubeflow account.
  • 53. Model Training with TensorFlow Operator
    ● TensorFlow Operator is one of the operators offered by Kubeflow to make it easy to run and monitor both distributed and non-distributed TensorFlow jobs on Kubernetes.
    ● Training TensorFlow models with tf-operator relies on centralized parameter servers for coordination between workers. It supports the TensorFlow framework only.
    ● After preprocessing our data, we built a TensorFlow neural network model.
    ● Our TensorFlow model had an accuracy of approximately 88%.
  • 54. Hyperparameter Tuning (Model Risk Managers/Auditors, ML Engineers, Data Scientists)
  • 55. What is Hyperparameter Tuning?
    ● Hyperparameters: configuration and variable values that are external to the model; their values are always set before the model training process begins.
    ● Selecting the right hyperparameters can significantly improve model performance in production.
    ● Hyperparameter tuning: finding the hyperparameter input values that optimize the objective function of the model training.
    Candidate hyperparameter sets: (a1, b1, c1, ..., zN), (a2, b2, c2, ..., zN), (a3, b3, c3, ..., zN)
  • 56. What is Hyperparameter Tuning?
    ml.trainModel(layers=10, batch=20, learning_rate=0.2)
    Hyperparameters | Parameters | Score
    layers=13, batch=12, learning_rate=0.2 | weight optimization | 85
    layers=14, batch=14, learning_rate=0.1 | weight optimization | 89
    layers=15, batch=11, learning_rate=0.5 | weight optimization | 94
    layers=5, batch=10, learning_rate=0.4 | weight optimization | 91
    layers=4, batch=20, learning_rate=0.3 | weight optimization | 81
  • 57. Hyperparameter Tuning is Hard!
    ● Tuning manually is very inefficient, error-prone and difficult to track
    ● Capturing metrics across multiple jobs and comparing them is difficult
    ● Efficiently allocating resources and infrastructure on the cluster to handle all the job runs is not an easy task
    ● As more hyperparameters are added, the combinatorial search space of possible inputs to maximize the training objective function grows exponentially
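A short calculation shows the exponential growth in the last bullet. The candidate values for `layers`, `batch` and `learning_rate` are borrowed from the example on slide 56; the fourth hyperparameter, `dropout`, and its values are hypothetical and added only to show the effect of one more dimension.

```python
import itertools
import math

search_space = {
    "layers": [4, 5, 13, 14, 15],
    "batch": [10, 11, 12, 14, 20],
    "learning_rate": [0.1, 0.2, 0.3, 0.4, 0.5],
}


def grid_size(space):
    """Number of combinations an exhaustive grid search would have to train."""
    return math.prod(len(values) for values in space.values())


size_with_three = grid_size(search_space)

# Adding one more hyperparameter with 5 candidate values multiplies the grid by 5.
search_space["dropout"] = [0.0, 0.1, 0.2, 0.3, 0.5]
size_with_four = grid_size(search_space)

# Enumerating the full grid confirms the count.
grid = list(itertools.product(*search_space.values()))
```

With k candidate values per hyperparameter and n hyperparameters, an exhaustive grid needs k^n training runs, which is why sampling-based strategies are usually preferred as n grows.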
  • 58. Hyperparameter Tuning with Katib on Kubeflow
    Katib is the hyperparameter tuning component of Kubeflow
    ● It is language and framework agnostic
      - TensorFlow
      - PyTorch
      - MXNet
      - XGBoost
    ● Customizable hyperparameter search algorithm
      - Random search
      - Grid search
      - Bayesian optimization
      - Hyperband
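Of the search algorithms listed, random search is the simplest to sketch. The code below is a generic illustration in plain Python, not Katib's implementation; `toy_objective` is a made-up stand-in for a real training run, and the search-space bounds are assumptions.

```python
import random


def toy_objective(layers, batch, learning_rate):
    """Made-up score standing in for the metric of one training run."""
    return (100.0 - (layers - 12) ** 2
            - 0.01 * (batch - 16) ** 2
            - 100.0 * (learning_rate - 0.3) ** 2)


def sample(space, rng):
    """Draw one hyperparameter set uniformly from the search space."""
    params = {}
    for name, (kind, low, high) in space.items():
        params[name] = rng.randint(low, high) if kind == "int" else rng.uniform(low, high)
    return params


def random_search(objective, space, n_trials, seed=0):
    """Evaluate n_trials random samples and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = sample(space, rng)
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


space = {
    "layers": ("int", 4, 15),
    "batch": ("int", 10, 20),
    "learning_rate": ("double", 0.1, 0.5),
}
best_params, best_score = random_search(toy_objective, space, n_trials=50)
```

Unlike grid search, the number of training runs here is fixed by `n_trials` rather than by the size of the search space.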
  • 59. Katib Concepts
    1. Experiment: An experiment is a single tuning run, also called an optimization run. You specify configuration settings to define the experiment. The following are the main configurations:
    ● Objective: What you intend to optimize. This is the objective metric, also called the target variable.
    ● Search Space: The set of all possible hyperparameter values that the hyperparameter tuning job should consider for optimization, and the constraints for each hyperparameter.
    ● Search Algorithm: The algorithm to use when searching for the optimal hyperparameter values.
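The three settings map onto the fields of a Katib Experiment resource roughly as sketched below, written as a Python dictionary. The field names are believed to follow the `kubeflow.org/v1beta1` Experiment schema and should be checked against the Katib version in use. The experiment name, namespace, metric name, goal, trial counts and parameter ranges are hypothetical, and the `trialTemplate` section that describes the training job is left out.

```python
import json

experiment = {
    "apiVersion": "kubeflow.org/v1beta1",
    "kind": "Experiment",
    "metadata": {"name": "airline-satisfaction-tuning", "namespace": "kubeflow"},
    "spec": {
        # Objective: the metric to optimize and the direction.
        "objective": {"type": "maximize", "goal": 0.95, "objectiveMetricName": "accuracy"},
        # Search algorithm: how new hyperparameter sets are proposed.
        "algorithm": {"algorithmName": "random"},
        "parallelTrialCount": 3,
        "maxTrialCount": 12,
        "maxFailedTrialCount": 3,
        # Search space: one entry per hyperparameter, with its constraints.
        "parameters": [
            {"name": "learning_rate", "parameterType": "double",
             "feasibleSpace": {"min": "0.01", "max": "0.5"}},
            {"name": "layers", "parameterType": "int",
             "feasibleSpace": {"min": "4", "max": "15"}},
            {"name": "batch", "parameterType": "categorical",
             "feasibleSpace": {"list": ["10", "20"]}},
        ],
        # trialTemplate (the training job to run per trial) is omitted in this sketch.
    },
}


def summarize(exp):
    """Pull out the three main settings named on the slide."""
    spec = exp["spec"]
    objective = spec["objective"]
    return {
        "objective": f'{objective["type"]} {objective["objectiveMetricName"]}',
        "search_space": [p["name"] for p in spec["parameters"]],
        "search_algorithm": spec["algorithm"]["algorithmName"],
    }


summary = summarize(experiment)
manifest_json = json.dumps(experiment, indent=2)
```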
  • 60. Hyperparameter Tuning with Katib
    Katib automates the hyperparameter tuning process by running a pre-configured number of training jobs (known as trials) in parallel.
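The idea of running a fixed number of trials in parallel can be imitated locally with a thread pool. This is only an analogy: Katib schedules trials as Kubernetes jobs, whereas the sketch below runs a made-up scoring function (`run_trial`, hypothetical) in threads.

```python
from concurrent.futures import ThreadPoolExecutor


def run_trial(params):
    """Stand-in for one training job: returns the params with a made-up score."""
    score = 100 - (params["layers"] - 12) ** 2 - 100 * (params["learning_rate"] - 0.3) ** 2
    return params, score


def run_experiment(trials, parallel_trial_count):
    """Run all trials, at most `parallel_trial_count` at a time, and keep the best."""
    with ThreadPoolExecutor(max_workers=parallel_trial_count) as pool:
        results = list(pool.map(run_trial, trials))
    return max(results, key=lambda result: result[1])


trials = [
    {"layers": 10, "learning_rate": 0.2},
    {"layers": 12, "learning_rate": 0.3},
    {"layers": 15, "learning_rate": 0.5},
]
best_params, best_score = run_experiment(trials, parallel_trial_count=2)
```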
  • 61. Result of Katib Experiment
    With Katib hyperparameter tuning, accuracy increased from 88% to 92.1%
  • 62. Model Serving with KFServing
    ● KFServing is Kubeflow’s model deployment and serving toolkit
    ● To serve our model efficiently with KFServing, we built a Kubeflow pipeline that loads the data, preprocesses it, trains the model, makes predictions, and exports and serves the model.
  • 64. Enterprise ML Operationalization Goal
  • 65. End to End ML Operationalization Process
  • 68. Model Development Life Cycle (Data Scientist View)
    The data scientist workflow essentially follows this path: Data -> Information -> Knowledge -> Insight
  • 69. Machine Learning Development Life Cycle (Production Deployment)
    Diagram labels: Model Training, Training Data, ETL, Tuning, Inferencing, Serving, Monitoring, Update