SlideShare ist ein Scribd-Unternehmen logo
1 von 32
Operationalizing Machine Learning
Models in production
—
What many
people
think ML is
“Given inputs (x) and answer (y) . . .
. . . learn a function f(x)
y”
Model Development
What real-world ML
actually looks like:
Operationalizing ML is a challenge
Data science pipelines
> act on real-world data
> impact business activities as they happen
Transactions + Decisions in a closed-loop
Goes beyond traditional BI
5 tenets for operationalizing ML
Analytics-Ready Data
Managed Trusted
Quality, Provenance and
Explainability
Resilient Measurable
Monitor + Measure
Evolution
Deliver & ImproveAt Scale & Always On
Where’s my data ?
Analytics-Ready Data
Managed
• access to data with techniques to track & deal with
sensitive content
• data virtualization
• automate-able pipelines for data preparation,
transformations
How can I convince you to use my model ?
• provenance of data used to train & test
• lineage of the model - transforms , features & labels
• model explainability - algorithm, size of data, compute
resources used to train & test, evaluation thresholds,
repeatable
Trusted
Quality, Provenance and
Explainability
How was the model built ?
Dependable for your business
Resilient
At Scale & Always On
• reliable & performant for (re-)training
• highly available, low latency model serving at real time
even with sophisticated data prep
• outage free model /version upgrades in production
ML infused in real-time,
critical business processes
Is my model still good enough ?
Measurable
Monitor + Measure
• latency metrics for real-time scoring
• frequent accuracy evaluations with thresholds
• health monitoring for model decay
Growth & Maturity
Evolution
Deliver & Improve
• versioning: champion/challenger, experimentation and
hyper-parameterization
• process efficiencies: automated re-training auto
deployments (with curation & approvals)
Governed Data Science - is also a team sport
Data Engineer
CDO (Data Steward)
Data Scientist
Organizes
• Data & Analytics Asset
Enterprise Catalog
• Lineage
• Governance of Data &
Models
• Audits & Policies
• Enables Model Explainability
Collects
Builds data lakes
and warehouses
Gets Data Ready for Analytics
Analyzes
Explores, Shapes data
& trains models
Exec
App. Developer Problem
Statement
or target
opportunity
Finds Data
Explores &
Understands
Data
Collects,
Preps &
Persists Data
Extracts
features for
ML
Train Models
Deploy &
monitor
accuracy
Trusted
Predictions
Experiments
Sets goals &
measures results
Real-time Apps
& business processes
Infuses ML in
apps &
processes
PRODUCTION
• Secure & Governed
• Monitor & Instrument
• High Throughput
-Load Balance & Scale
• Reliable Deployments
- outage free upgrades
Auto-retrain & upgrade
Refine
Features, Lineage/Relationships recorded
in Governance Catalog
Development
& prototyping
Production
Admin/Ops
Operationalizing ML for the Enterprise
- Requirements & implications
16
On demand or leased compute
Scale out for users & workloads
Serve Models in Real-time
APIs
Load Balancing Resiliency
Service Discovery
Role Based ACL & tenancy
Network PoliciesSecure
Trusted Predictions
a Platform
Notebooks & IDEs
Model Evaluations
an Integrated
Experience
Continuous Delivery
(automatic) Rolling upgrades
Teaming Collaborations
Extensible
Get Data Ready for Analytics Data Prep
Plug-n-Play extensibility
Kubernetes
IBM DSX Local
Tools Packages & Algorithms
Explore at scale
• Scale out on-demand
• No Dev-ops/engineering setup
Reproducibility
• Process of tracking
• Reproduce results easily
Secure
• Governed Access
• Administration capabilities
Collaborate
• Understand what’s been done
• Share and accelerate learning
Publish Efforts
• Models as APIs out of the box
• Avoid Engineering re-work
Discovery to Production
• Minimal efforts
• Seamless scale
• Integration with business process
Open
• Use desired tool of choice
• Interoperability across tools
Review Results
• Stakeholder review
• Via Dashboards/Static reports
Monitoring
• QA/QC on-demand
• Retrain
Value of a Data Science Platform
Data Science
Experience
DSX is an Open Platform
 Add your favorite libraries
 Publish Open APIs for secure ML applications
Organize Models, scripts & other assets
 A “Project”
• just a folder of assets grouped together
• “models” (say serialized pickle or R objects) aren’t everything
o you still need scripts for data wrangling or for scheduled evaluations and interactive notebooks and
apps (R Shiny etc.)
o scripts used for training & testing need to be recorded (remember repeatable ?)
o as well as sample data sets & references to remote data sources
o perhaps even your own home-grown libraries & modules..
• also a git repository !
• why ? Easy to share, version & publish – across different teams /personas
• track history of commit changes & truly co-develop
 Easily enable a CICD pipeline for Machine Learning !
• git tag a project and deploy to production the “release”. (Reproducible snapshots of assets !)
Build & Collaborate
Collaborate within git-backed
projects
Essential tools for Data Scientists
Jupyter notebook Environment
Python 2.7/3.5 with Anaconda
Scala, R
R Studio Environment with >
300 packages, R Markdown, R
Shiny
Zeppelin notebook
with Python 2.7 with Anaconda
Self service Compute Environments
Servers/IDEs - lifecyle easily
controlled by each Data
Scientist
Self-serve reservations of
compute resources
Worker compute resources –
for batch jobs run on-demand
or on schedule
Environments are essentially Kubernetes
pods – with High Availability & Compute
scale-out baked in
(load-balancing/auto-scaling is being planned
for a future spring)
On demand or leased compute
Extend ..
– Roll your own Environments
Add libs/packages to the existing Jupyter, Rstudio , Zeppelin IDE
Environments or introduce new Job “Worker” environments
https://content-dsxlocal.mybluemix.net/docs/content/local/images.html
DSX Local provides a Docker Registry
(and replicated for HA) as well.
These images get managed by DSX and is
used to help build out custom
Environments
Plug-n-Play extensibility
Reproducibility, courtesy of Docker images
Automate ..
Jobs – trigger on-demand or by a
schedule.
such as for Model Evaluations, Batch
scoring or even continuous (re-) training
Monitor models through a dashboard
Model versioning, evaluation history
Publish versions of models, supporting
dev/stage/production paradigm
Monitor scalability through cluster dashboard
Adapt scalability by redistributing compute/memory/disk
resources
Deploy, monitor and manage
Deployment manager - Project Releases
Project releases
Deployed & (delta)
updatable
Current git tag
Bring in a new “release” to production
New Releases
- from a “Source”
Project in the
same cluster
New Releases
- from a “Source”
Project pulled
from
github/bitbucket
New Releases
- from a “Source”
Project created
from a .tar.gz
package
Expose a ML model via a REST API
replicas for load
balancing
pick a version to
expose
(multiple
deployments are
possible too..)
Optionally
reserve compute
scoring end-point Model pre-loaded into memory
inside scoring containers
Expose Python and R scripts as a Web Service
Custom scripts can
be externalized as a
REST service
- say for custom
prediction functions
DSX Local Architecture overview
Customer
Systems
DSX Local Cluster
Relational
Data Stores
(DB2,Oracle,
Teradata, etc.)
LDAP/AD
(Authentication)
Spark
Cluster
(via Livy)
Hadoop
(HDFS, Hive, IBM Big
SQL)
Admin interfaces
Metering
Monitoring
Kubernetes
Master
Components
Platform Control
& Management
Redis
(session store)
Shared user
volume
Cloudant
(metadata)
Storage
Management
Services
registry vol
(dockeri mages)
KubernetesCluster
csv files,
projects,
models
Kubernetes cluster spread
across multiple servers
Data Scientist –
web browser & API
• Start with 3 nodes with HA enabled
• Expand as needed
GitHub
& GitHub Enterprise
(projects & assets)
Kubernetes
Persistent Volumes
Integrations & Connections
Project services, Data
Sources & .git
Spark, Spark-ML
Environments: Resource mgmt
& Jobs
Core Services
User Management
& tokens
DSX Hadoop
Integration Service
Custom
Scorers
Model
Evaluations
Project Releases: Deployment/Scale-out load
balancing & Monitoring
Model Mgmt & Deployment
+ operations
Scorers
Package & Image mgmt
Jupyter
(anaconda-python, scala, R)
Tensorflow-GPU
Zeppelin
Scripts
(.py, .R) & Jobs
Data Scientist Dev environment
RStudio
(and Shiny Apps)
Projects, Publish & collaborations
Data Refinery
SPSS Modeler
(add-on)
Decision
Opt/CPLEX
(add-on)
H20 Flows
(add-on)
Jobs &
Scheduling
Hosted Apps Webservices
(Python,R)
Batch
Scoring
Jupyter
notebook apps
R.Shiny
apps
Operationalizing Machine Learning
Models in production
—
Summary
5 tenets for operationalizing ML
Analytics-Ready Data
Managed Trusted
Quality, Provenance and
Explainability
Resilient Measurable
Monitor + Measure
Evolution
Deliver & ImproveAt Scale & Always On
IBM DSX LocalKubernetes

Weitere ähnliche Inhalte

Was ist angesagt?

Machine Learning & Amazon SageMaker
Machine Learning & Amazon SageMakerMachine Learning & Amazon SageMaker
Machine Learning & Amazon SageMaker
Amazon Web Services
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
Provectus
 

Was ist angesagt? (20)

MLOps in action
MLOps in actionMLOps in action
MLOps in action
 
Introduction to Azure Data Lake
Introduction to Azure Data LakeIntroduction to Azure Data Lake
Introduction to Azure Data Lake
 
Apply MLOps at Scale by H&M
Apply MLOps at Scale by H&MApply MLOps at Scale by H&M
Apply MLOps at Scale by H&M
 
Machine Learning & Amazon SageMaker
Machine Learning & Amazon SageMakerMachine Learning & Amazon SageMaker
Machine Learning & Amazon SageMaker
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
Building-a-Data-Lake-on-AWS
Building-a-Data-Lake-on-AWSBuilding-a-Data-Lake-on-AWS
Building-a-Data-Lake-on-AWS
 
Introduction to Sagemaker
Introduction to SagemakerIntroduction to Sagemaker
Introduction to Sagemaker
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
 
BDA311 Introduction to AWS Glue
BDA311 Introduction to AWS GlueBDA311 Introduction to AWS Glue
BDA311 Introduction to AWS Glue
 
MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflow
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
 
Learn to Use Databricks for the Full ML Lifecycle
Learn to Use Databricks for the Full ML LifecycleLearn to Use Databricks for the Full ML Lifecycle
Learn to Use Databricks for the Full ML Lifecycle
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOps
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
 
Machine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to ImplementationMachine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to Implementation
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architecture
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Data Engineering Basics
Data Engineering BasicsData Engineering Basics
Data Engineering Basics
 

Ähnlich wie Machine Learning Models in Production

Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...
DataWorks Summit
 
AnalysisServices
AnalysisServicesAnalysisServices
AnalysisServices
webuploader
 

Ähnlich wie Machine Learning Models in Production (20)

Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...
 
DevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-usDevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-us
 
Continuous delivery for machine learning
Continuous delivery for machine learningContinuous delivery for machine learning
Continuous delivery for machine learning
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
 
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
 
Microsoft DevOps for AI with GoDataDriven
Microsoft DevOps for AI with GoDataDrivenMicrosoft DevOps for AI with GoDataDriven
Microsoft DevOps for AI with GoDataDriven
 
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
 
03_aiops-1.pptx
03_aiops-1.pptx03_aiops-1.pptx
03_aiops-1.pptx
 
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
Jonathon Wright - Intelligent Performance Cognitive Learning (AIOps)
 
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
AnalyticOps: Lessons Learned Moving Machine-Learning Algorithms to Production...
 
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full LifecycleMLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle
 
Lviv Data Science Club (Sergiy Lunyakin)
Lviv Data Science Club (Sergiy Lunyakin)Lviv Data Science Club (Sergiy Lunyakin)
Lviv Data Science Club (Sergiy Lunyakin)
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
 
Developing and deploying AI solutions on the cloud using Team Data Science Pr...
Developing and deploying AI solutions on the cloud using Team Data Science Pr...Developing and deploying AI solutions on the cloud using Team Data Science Pr...
Developing and deploying AI solutions on the cloud using Team Data Science Pr...
 
AnalysisServices
AnalysisServicesAnalysisServices
AnalysisServices
 
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
Trenowanie i wdrażanie modeli uczenia maszynowego z wykorzystaniem Google Clo...
 
Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...
Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...
Simplifying the Creation of Machine Learning Workflow Pipelines for IoT Appli...
 
DevOps for DataScience
DevOps for DataScienceDevOps for DataScience
DevOps for DataScience
 
Microsoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the CloudMicrosoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the Cloud
 
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdfPyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
 

Mehr von DataWorks Summit

HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 

Mehr von DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Kürzlich hochgeladen

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Kürzlich hochgeladen (20)

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

Machine Learning Models in Production

  • 2. What many people think ML is “Given inputs (x) and answer (y) . . . . . . learn a function f(x) y” Model Development
  • 4.
  • 5.
  • 6.
  • 7.
  • 8. Operationalizing ML is a challenge Data science pipelines > act on real-world data > impact business activities as they happen Transactions + Decisions in a closed-loop Goes beyond traditional BI
  • 9. 5 tenets for operationalizing ML Analytics-Ready Data Managed Trusted Quality, Provenance and Explainability Resilient Measurable Monitor + Measure Evolution Deliver & ImproveAt Scale & Always On
  • 10. Where’s my data ? Analytics-Ready Data Managed • access to data with techniques to track & deal with sensitive content • data virtualization • automate-able pipelines for data preparation, transformations
  • 11. How can I convince you to use my model ? • provenance of data used to train & test • lineage of the model - transforms , features & labels • model explainability - algorithm, size of data, compute resources used to train & test, evaluation thresholds, repeatable Trusted Quality, Provenance and Explainability How was the model built ?
  • 12. Dependable for your business Resilient At Scale & Always On • reliable & performant for (re-)training • highly available, low latency model serving at real time even with sophisticated data prep • outage free model /version upgrades in production ML infused in real-time, critical business processes
  • 13. Is my model still good enough ? Measurable Monitor + Measure • latency metrics for real-time scoring • frequent accuracy evaluations with thresholds • health monitoring for model decay
  • 14. Growth & Maturity Evolution Deliver & Improve • versioning: champion/challenger, experimentation and hyper-parameterization • process efficiencies: automated re-training auto deployments (with curation & approvals)
  • 15. Governed Data Science - is also a team sport Data Engineer CDO (Data Steward) Data Scientist Organizes • Data & Analytics Asset Enterprise Catalog • Lineage • Governance of Data & Models • Audits & Policies • Enables Model Explainability Collects Builds data lakes and warehouses Gets Data Ready for Analytics Analyzes Explores, Shapes data & trains models Exec App. Developer Problem Statement or target opportunity Finds Data Explores & Understands Data Collects, Preps & Persists Data Extracts features for ML Train Models Deploy & monitor accuracy Trusted Predictions Experiments Sets goals & measures results Real-time Apps & business processes Infuses ML in apps & processes PRODUCTION • Secure & Governed • Monitor & Instrument • High Throughput -Load Balance & Scale • Reliable Deployments - outage free upgrades Auto-retrain & upgrade Refine Features, Lineage/Relationships recorded in Governance Catalog Development & prototyping Production Admin/Ops
  • 16. Operationalizing ML for the Enterprise - Requirements & implications 16 On demand or leased compute Scale out for users & workloads Serve Models in Real-time APIs Load Balancing Resiliency Service Discovery Role Based ACL & tenancy Network PoliciesSecure Trusted Predictions a Platform Notebooks & IDEs Model Evaluations an Integrated Experience Continuous Delivery (automatic) Rolling upgrades Teaming Collaborations Extensible Get Data Ready for Analytics Data Prep Plug-n-Play extensibility Kubernetes IBM DSX Local Tools Packages & Algorithms
  • 17. Explore at scale • Scale out on-demand • No Dev-ops/engineering setup Reproducibility • Process of tracking • Reproduce results easily Secure • Governed Access • Administration capabilities Collaborate • Understand what’s been done • Share and accelerate learning Publish Efforts • Models as APIs out of the box • Avoid Engineering re-work Discovery to Production • Minimal efforts • Seamless scale • Integration with business process Open • Use desired tool of choice • Interoperability across tools Review Results • Stakeholder review • Via Dashboards/Static reports Monitoring • QA/QC on-demand • Retrain Value of a Data Science Platform
  • 18. Data Science Experience DSX is an Open Platform  Add your favorite libraries  Publish Open APIs for secure ML applications
  • 19. Organize Models, scripts & other assets  A “Project” • just a folder of assets grouped together • “models” (say serialized pickle or R objects) aren’t everything o you still need scripts for data wrangling or for scheduled evaluations and interactive notebooks and apps (R Shiny etc.) o scripts used for training & testing need to be recorded (remember repeatable ?) o as well as sample data sets & references to remote data sources o perhaps even your own home-grown libraries & modules.. • also a git repository ! • why ? Easy to share, version & publish – across different teams /personas • track history of commit changes & truly co-develop  Easily enable a CICD pipeline for Machine Learning ! • git tag a project and deploy to production the “release”. (Reproducible snapshots of assets !)
  • 20. Build & Collaborate Collaborate within git-backed projects
  • 21. Essential tools for Data Scientists Jupyter notebook Environment Python 2.7/3.5 with Anaconda Scala, R R Studio Environment with > 300 packages, R Markdown, R Shiny Zeppelin notebook with Python 2.7 with Anaconda
  • 22. Self service Compute Environments Servers/IDEs - lifecyle easily controlled by each Data Scientist Self-serve reservations of compute resources Worker compute resources – for batch jobs run on-demand or on schedule Environments are essentially Kubernetes pods – with High Availability & Compute scale-out baked in (load-balancing/auto-scaling is being planned for a future spring) On demand or leased compute
  • 23. Extend .. – Roll your own Environments Add libs/packages to the existing Jupyter, Rstudio , Zeppelin IDE Environments or introduce new Job “Worker” environments https://content-dsxlocal.mybluemix.net/docs/content/local/images.html DSX Local provides a Docker Registry (and replicated for HA) as well. These images get managed by DSX and is used to help build out custom Environments Plug-n-Play extensibility Reproducibility, courtesy of Docker images
  • 24. Automate .. Jobs – trigger on-demand or by a schedule. such as for Model Evaluations, Batch scoring or even continuous (re-) training
  • 25. Monitor models through a dashboard Model versioning, evaluation history Publish versions of models, supporting dev/stage/production paradigm Monitor scalability through cluster dashboard Adapt scalability by redistributing compute/memory/disk resources Deploy, monitor and manage
  • 26. Deployment manager - Project Releases Project releases Deployed & (delta) updatable Current git tag
  • 27. Bring in a new “release” to production New Releases - from a “Source” Project in the same cluster New Releases - from a “Source” Project pulled from github/bitbucket New Releases - from a “Source” Project created from a .tar.gz package
  • 28. Expose a ML model via a REST API replicas for load balancing pick a version to expose (multiple deployments are possible too..) Optionally reserve compute scoring end-point Model pre-loaded into memory inside scoring containers
  • 29. Expose Python and R scripts as a Web Service Custom scripts can be externalized as a REST service - say for custom prediction functions
  • 30. DSX Local Architecture overview Customer Systems DSX Local Cluster Relational Data Stores (DB2,Oracle, Teradata, etc.) LDAP/AD (Authentication) Spark Cluster (via Livy) Hadoop (HDFS, Hive, IBM Big SQL) Admin interfaces Metering Monitoring Kubernetes Master Components Platform Control & Management Redis (session store) Shared user volume Cloudant (metadata) Storage Management Services registry vol (dockeri mages) KubernetesCluster csv files, projects, models Kubernetes cluster spread across multiple servers Data Scientist – web browser & API • Start with 3 nodes with HA enabled • Expand as needed GitHub & GitHub Enterprise (projects & assets) Kubernetes Persistent Volumes Integrations & Connections Project services, Data Sources & .git Spark, Spark-ML Environments: Resource mgmt & Jobs Core Services User Management & tokens DSX Hadoop Integration Service Custom Scorers Model Evaluations Project Releases: Deployment/Scale-out load balancing & Monitoring Model Mgmt & Deployment + operations Scorers Package & Image mgmt Jupyter (anaconda-python, scala, R) Tensorflow-GPU Zeppelin Scripts (.py, .R) & Jobs Data Scientist Dev environment RStudio (and Shiny Apps) Projects, Publish & collaborations Data Refinery SPSS Modeler (add-on) Decision Opt/CPLEX (add-on) H20 Flows (add-on) Jobs & Scheduling Hosted Apps Webservices (Python,R) Batch Scoring Jupyter notebook apps R.Shiny apps
  • 31. Operationalizing Machine Learning Models in production — Summary
  • 32. 5 tenets for operationalizing ML Analytics-Ready Data Managed Trusted Quality, Provenance and Explainability Resilient Measurable Monitor + Measure Evolution Deliver & ImproveAt Scale & Always On IBM DSX LocalKubernetes