Scaling Interactive Workloads
across Kubernetes Cluster
Luciano Resende
Codemotion Milan - 2018
© 2018 IBM Corporation
About me - Luciano Resende
Open Source AI Platform Architect – IBM – CODAIT
• Senior Technical Staff Member at IBM, contributing to open source for over 10 years
• Currently contributing to the Jupyter Notebook ecosystem, Apache Bahir, Apache Toree, and Apache Spark, among other projects related to AI/ML platforms
lresende@us.ibm.com
https://www.linkedin.com/in/lresende
@lresende1975
https://github.com/lresende
IBM Open Source Participation
Learn
• Open Source @ IBM program touches 78,000 IBMers annually
Consume
• Virtually all IBM products contain some open source
• 40,363 packages per year
Contribute
• >62K OS certs per year
• ~10K IBM commits per month
Connect
• >1,000 active IBM contributors working in key OS projects
IBM Open Source Participation
IBM generated open source innovation
• 137 Code Open (dWO) projects with 1,000+ GitHub projects
• 4 projects graduated to full open governance in the last year: Node-RED, OpenWhisk, SystemML, and Blockchain Fabric
• developer.ibm.com/code/open/code/
Community
• IBM focused on 18 strategic communities
• Drive open governance in “Centers of Gravity”
• IBM Leaders drive key technologies and assure freedom
of action
The IBM OS Way is now open sourced
• Training, Recognition, Tooling
• Organization, Consuming, Contributing
Center for Open Source
Data and AI Technologies
CODAIT
codait.org
codait (French)
= coder/coded
https://m.interglot.com/fr/en/codait
CODAIT aims to make AI solutions
dramatically easier to create, deploy,
and manage in the enterprise
Relaunch of the Spark Technology
Center (STC) to reflect expanded
mission
Interactive
Development with
Jupyter Notebooks
Jupyter Notebooks
Notebooks are interactive computational environments in which you can combine code execution, rich text, mathematics, plots, and rich media.
Jupyter Notebooks
• The Notebook UI runs in the browser
• The Notebook Server serves the notebooks
• Kernels interpret/execute cell contents
– Are responsible for code execution
– Abstract different languages
– Have a 1:1 relationship with a notebook
– Run and consume resources as long as the notebook is running
Analytics and Deep Learning
Workloads
Analytics Workloads
Large amount of data
Shared across organization in Data Lakes
Multiple workload types
- Data cleansing
- Data Warehouse
- ML and Insights
Deep Learning Workloads
Resource-intensive workloads
Require expensive hardware (GPU, TPU)
Long-running training jobs
- Simple MNIST training takes over one hour WITHOUT a decent GPU
- Training other non-complex deep learning models can easily take over a day WITH GPUs
Local Development Environment
Analytic and AI Platforms
Large pool of shared computing
resources
• Enterprise Cloud, Public Cloud or Hybrid
• Shared Data (Data Lakes/Object Storage)
Distributed Consumers
• Notebooks running locally (user's laptop) or as a service (e.g. JupyterHub)
Different Resource Utilization Patterns
• High number of idle resources
Limitations of Jupyter Notebook Stack
[Diagram: Gather Data, Analyze Data, Machine Learning, Deep Learning, Deploy Model, Maintain Model; tools shown: Python Data Science Stack (Pandas, Scikit-Learn), Apache Spark, Jupyter, Keras + TensorFlow, Fabric for Deep Learning (FfDL), MLeap + PFA, Model Asset eXchange]
[Chart: Maximum number of simultaneous kernels (4GB heap) by cluster size (32GB nodes): 4 nodes: 8; 8 nodes: 8; 12 nodes: 8; 16 nodes: 8]
• Scalability
– Jupyter kernels run as local processes
– Resources are limited to what is available on the single node that runs all kernels and associated Spark drivers
• Security
– All kernels run under a single user, sharing the same privileges
– Users can see and control each other's processes using Jupyter administrative utilities
Jupyter Enterprise Gateway
Jupyter Enterprise
Gateway
Jupyter Enterprise Gateway at IBM Code
https://developer.ibm.com/code/openprojects/jupyter-enterprise-gateway/
Jupyter Enterprise Gateway source code at GitHub
https://github.com/jupyter-incubator/enterprise_gateway
Jupyter Enterprise Gateway Documentation
http://jupyter-enterprise-gateway.readthedocs.io/en/latest/
A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across an Apache Spark or Kubernetes cluster for Enterprise/Cloud use cases
[Figure: logos of supported kernels and supported platforms, including Spectrum Conductor]
Jupyter Enterprise Gateway Features
[Chart: Maximum number of simultaneous kernels (4GB heap) by cluster size (32GB nodes): 4 nodes: 16; 8 nodes: 32; 12 nodes: 48; 16 nodes: 64]
Optimized Resource Allocation
– Utilize resources on all cluster nodes by running kernels as Spark
applications in YARN Cluster Mode.
– Pluggable architecture to enable support for additional Resource Managers
Enhanced Security
– End-to-End secure communications
• Secure socket communications
• Encrypted HTTP communication using SSL
Multiuser support with user impersonation
– Enhance security and sandboxing by enabling user impersonation when
running kernels (using Kerberos).
– Individual HDFS home folder for each notebook user.
– Use the same user ID for notebook and batch jobs.
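The capacity charts on this slide and on the earlier limitations slide can be summarized with a little arithmetic. The sketch below is only a back-of-the-envelope illustration: the flat ceiling of 8 kernels matches 32 GB divided by 4 GB on a single node, and the Enterprise Gateway series (16, 32, 48, 64) works out to 4 kernels per node. That per-node figure is read off the chart rather than taken from a documented formula, and the function names are hypothetical.

```python
NODE_MEMORY_GB = 32  # from the chart axis: 32GB nodes
KERNEL_HEAP_GB = 4   # from the chart axis: 4GB heap per kernel
# Assumption: inferred from the chart values (16 kernels on 4 nodes), not stated on the slide.
KERNELS_PER_NODE_WITH_GATEWAY = 4


def max_kernels_local(cluster_nodes: int) -> int:
    """All kernels run as local processes on one node, so cluster size does not matter."""
    return NODE_MEMORY_GB // KERNEL_HEAP_GB


def max_kernels_gateway(cluster_nodes: int) -> int:
    """Kernels are spread over the cluster, so capacity grows with the node count."""
    return cluster_nodes * KERNELS_PER_NODE_WITH_GATEWAY


def capacity_table(sizes=(4, 8, 12, 16)):
    """Return (nodes, local capacity, gateway capacity) for each cluster size."""
    return [(n, max_kernels_local(n), max_kernels_gateway(n)) for n in sizes]


if __name__ == "__main__":
    for nodes, local, gateway in capacity_table():
        print(f"{nodes:>2} nodes: local={local:>2} gateway={gateway:>2}")
```

Adding nodes changes nothing in the local setup, while the gateway setup scales linearly with the cluster.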
Jupyter Notebooks
and Kubernetes
Jupyter & Kubernetes
Kubernetes Platform
- Containers provide a flexible way to deploy applications and are here to stay
- Containers simplify management of complicated and heterogeneous AI/Deep Learning infrastructure
- Kubernetes enables easy management of containerized applications and resources, with the benefits of elasticity and quality of service
Source: https://github.com/Langhalsdino/Kubernetes-GPU-Guide
Enterprise Gateway & Kubernetes
[Diagram: Kubernetes deployment before Enterprise Gateway and after Enterprise Gateway]
Before Jupyter Enterprise Gateway …
• Resources required for all kernels need to be allocated during Notebook Server pod creation
• Resources are limited to what is physically available on the host node that runs all kernels and associated Spark drivers
After Jupyter Enterprise Gateway …
• The gateway pod is very lightweight
• Each kernel runs in its own pod, providing isolation
• Kernel pods built from community images:
Spark-on-K8s, TensorFlow, Keras, etc.
Jupyter Enterprise Gateway - Kubernetes
[Diagram labels: Gateway; nb2kg; vanilla kernels (community image); Spark-based kernels (Spark on K8s); distributed file system; container images defined in kernelspec]
• Multi-user Enterprise Gateway pod
• Each kernel is launched in its own pod
• Kernel pod namespace is configurable
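To make "one pod per kernel" concrete, here is a minimal sketch that builds a Kubernetes Pod manifest for a single kernel as a plain Python dict. It is an illustration under stated assumptions, not the manifest Enterprise Gateway actually generates: the function name, label keys, pod naming scheme, and resource values are all hypothetical. Only the image name is taken from the kernelspec shown in this deck.

```python
import json


def build_kernel_pod(kernel_id, username, image, namespace="default",
                     memory="4Gi", cpu="1"):
    """Return a Pod manifest (as a dict) that runs exactly one kernel.

    Assumes username and kernel_id are already lowercase and valid in a pod name.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"{username}-{kernel_id}",
            "namespace": namespace,  # the kernel pod namespace is configurable
            # Hypothetical labels, useful for finding and cleaning up kernel pods.
            "labels": {"component": "kernel", "kernel_id": kernel_id},
        },
        "spec": {
            "restartPolicy": "Never",  # a finished kernel should not be restarted
            "containers": [{
                "name": "kernel",
                "image": image,
                # Each kernel gets its own resource envelope instead of sharing
                # whatever the Notebook Server pod was given at creation time.
                "resources": {
                    "requests": {"memory": memory, "cpu": cpu},
                    "limits": {"memory": memory, "cpu": cpu},
                },
            }],
        },
    }


if __name__ == "__main__":
    pod = build_kernel_pod("1234abcd", "alice", "elyra/kubernetes-kernel-py:dev",
                           namespace="notebooks")
    print(json.dumps(pod, indent=2))
```

Because resources are declared per kernel pod, the scheduler can place each kernel on any node with room, which is what lets capacity grow with the cluster.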
Jupyter & Kubernetes
Jupyter Kernels are configured by kernelspecs
• Each kernel has a corresponding kernelspec
• Stored in one of the Jupyter data paths
• $ jupyter kernelspec list
Enabling remote kernels
/…/anaconda3/share/jupyter/kernels/python2/kernel.json
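As a rough illustration of what a kernelspec listing involves, the sketch below scans one kernels directory for `<name>/kernel.json` files and loads them. It is a simplified stand-in for `jupyter kernelspec list`, which searches several Jupyter data paths. The function name and the sample spec written to a temporary directory are assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def find_kernelspecs(kernels_dir):
    """Map kernel name -> parsed kernel.json for every kernel under kernels_dir."""
    specs = {}
    for spec_file in sorted(Path(kernels_dir).glob("*/kernel.json")):
        with spec_file.open(encoding="utf-8") as handle:
            # The directory name is the kernel name (e.g. "python2").
            specs[spec_file.parent.name] = json.load(handle)
    return specs


if __name__ == "__main__":
    # Build a throwaway kernels directory so the example runs anywhere.
    with tempfile.TemporaryDirectory() as root:
        spec_dir = Path(root) / "python2"
        spec_dir.mkdir()
        sample = {
            "display_name": "Python 2",
            "language": "python",
            "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        }
        (spec_dir / "kernel.json").write_text(json.dumps(sample), encoding="utf-8")
        for name, spec in find_kernelspecs(root).items():
            print(name, "->", spec["display_name"])
```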
Process Proxy:
• Abstracts kernel process represented by Jupyter
framework
• Pluggable class definition identified in kernelspec
(kernel.json)
• Manages kernel lifecycle
Kernel Launcher:
• Embeds target kernel
• Listens on gateway communication port
• Conveys interrupt requests (via local signal)
• Could be extended for additional communications
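The process proxy idea can be sketched as a small lifecycle interface plus a loader that resolves a class from a dotted name, the way a `class_name` entry in a kernelspec might be resolved. This is a hypothetical illustration of the pattern, not Enterprise Gateway's actual base class or loader. `ProcessProxy`, `FakeProcessProxy`, and `load_class` are names invented here, and the fake proxy only records what it is asked to do.

```python
import importlib
import signal
from abc import ABC, abstractmethod


class ProcessProxy(ABC):
    """Hypothetical lifecycle interface that hides where the kernel really runs."""

    @abstractmethod
    def launch(self, argv):
        """Start the kernel described by argv."""

    @abstractmethod
    def poll(self):
        """Return None while the kernel is running, otherwise its exit status."""

    @abstractmethod
    def send_signal(self, signum):
        """Deliver a signal (e.g. SIGINT for an interrupt) to the kernel."""


class FakeProcessProxy(ProcessProxy):
    """Stand-in that tracks state in memory instead of starting a process."""

    def __init__(self):
        self.argv = None
        self.running = False
        self.signals = []

    def launch(self, argv):
        self.argv = list(argv)
        self.running = True

    def poll(self):
        return None if self.running else 0

    def send_signal(self, signum):
        self.signals.append(signum)
        if signum == signal.SIGTERM:  # treat SIGTERM as a shutdown request
            self.running = False


def load_class(dotted_name):
    """Resolve 'package.module.ClassName' to the class object (the pluggable part)."""
    module_name, _, class_name = dotted_name.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


if __name__ == "__main__":
    proxy = FakeProcessProxy()
    proxy.launch(["run.sh", "connection.json"])
    proxy.send_signal(signal.SIGINT)   # interrupt: the kernel keeps running
    print(proxy.poll())                # None
    proxy.send_signal(signal.SIGTERM)  # shutdown
    print(proxy.poll())                # 0
```

Keeping the lifecycle behind one interface is what allows a different resource manager to be supported by adding a new proxy class rather than changing the gateway itself.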
{
  "language": "python",
  "display_name": "Spark - Python (Kubernetes Mode)",
  "process_proxy": {
    "class_name": "enterprise_gateway.services.processproxies.k8s.KubernetesProcessProxy",
    "config": {
      "image_name": "elyra/kubernetes-kernel-py:dev",
      "executor_image_name": "elyra/kubernetes-kernel-py:dev",
      "port_range": "40000..42000"
    }
  },
  "env": {
    "SPARK_HOME": "/opt/spark",
    "SPARK_OPTS": "--master k8s://https://${KUBERNETES_SERVICE_HOST --deploy-mode cluster --name …",
    …
  },
  "argv": [
    "/usr/local/share/jupyter/kernels/spark_python_kubernetes/bin/run.sh",
    "{connection_file}",
    "--RemoteProcessProxy.response-address",
    "{response_address}",
    "--RemoteProcessProxy.spark-context-initialization-mode",
    "lazy"
  ]
}
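Two details of this kernelspec lend themselves to a small sketch: the `{connection_file}` and `{response_address}` placeholders in `argv`, and the `port_range` value written as `LOW..HIGH`. The code below shows one plausible way to fill the placeholders and to pick a port from the range. It is an assumption-laden illustration that uses plain `str.format` and a random choice, not the substitution or port-selection logic Jupyter or Enterprise Gateway actually implement. The helper names and the sample values are hypothetical.

```python
import random


def parse_port_range(spec):
    """Parse a 'LOW..HIGH' string into an inclusive (low, high) tuple."""
    low_text, separator, high_text = spec.partition("..")
    if not separator:
        raise ValueError(f"expected 'LOW..HIGH', got {spec!r}")
    low, high = int(low_text), int(high_text)
    if not 0 < low <= high <= 65535:
        raise ValueError(f"invalid port range: {spec!r}")
    return low, high


def pick_port(spec, rng=random):
    """Choose a port inside the configured range (selection strategy is an assumption)."""
    low, high = parse_port_range(spec)
    return rng.randint(low, high)  # randint includes both endpoints


def render_argv(argv, **values):
    """Fill {name} placeholders in each argv entry; entries without braces pass through."""
    return [arg.format(**values) for arg in argv]


if __name__ == "__main__":
    argv = [
        "/usr/local/share/jupyter/kernels/spark_python_kubernetes/bin/run.sh",
        "{connection_file}",
        "--RemoteProcessProxy.response-address",
        "{response_address}",
    ]
    # Hypothetical values; the real ones are supplied by the gateway at launch time.
    print(render_argv(argv, connection_file="/tmp/kernel-1234.json",
                      response_address="10.0.0.5:8877"))
    print(parse_port_range("40000..42000"))
```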
Enabling remote kernels
Process Proxies mixed with Kernel Launchers
Jupyter Enterprise Gateway Components
[Diagram labels: Jupyter Notebook (NB2KG Extension, Lab Extension); Jupyter Enterprise Gateway; Jupyter Kernel Gateway; Jupyter Notebook; Remote Kernel Manager; Distributed Process Proxy; YARN Cluster Process Proxy; Kubernetes Process Proxy; Conductor Cluster Process Proxy; supported runtime platforms, including Spectrum Conductor and FfDL]
Jupyter Notebooks
and Deep Learning Platforms
Deep Learning Platforms
Prohibitive costs
- Deep Learning resources are too expensive to be left locked/idle during interactive development
Deep Learning Platforms
- We have seen the rise of Deep Learning platforms that leverage containers and Kubernetes as the basis of their infrastructure
- Kubernetes enables Deep Learning platforms to easily share and restrict accelerated hardware
Fabric for Deep Learning
IBM Watson Studio
Deep Learning as a service
Batch-oriented development
Deep Learning Workspace
Streamline the data science user experience when coming from notebook/interactive development interfaces
• The current process includes multiple steps, one of which is decomposing the notebook into an application that must be submitted as a zip to the deep learning runtime. This becomes a show stopper for data scientists looking to adopt FfDL and DLaaS
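The manual step described here, turning a notebook into a zipped application, can be sketched with the standard library alone. The example below pulls the code cells out of an nbformat-4 notebook (a JSON document with a `cells` list) and writes them into an in-memory zip. The function names and the `train.py` entry point are assumptions for illustration. Whatever manifest or layout FfDL or DLaaS expects is not shown, and IPython magics in the cells would still need manual cleanup.

```python
import io
import json
import zipfile


def notebook_to_script(notebook):
    """Concatenate the code cells of an nbformat-4 notebook dict into one script."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat allows a list of lines or one string
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"


def package_script(script, entry_point="train.py"):
    """Return the bytes of a zip archive containing the script (hypothetical layout)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(entry_point, script)
    return buffer.getvalue()


if __name__ == "__main__":
    notebook = json.loads("""
    {"nbformat": 4, "nbformat_minor": 2, "metadata": {},
     "cells": [
       {"cell_type": "markdown", "metadata": {}, "source": ["# Experiment"]},
       {"cell_type": "code", "metadata": {}, "outputs": [], "execution_count": null,
        "source": ["epochs = 3\\n", "print(epochs)"]}
     ]}
    """)
    script = notebook_to_script(notebook)
    print(script)
    print(len(package_script(script)), "bytes in archive")
```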
Streamline the Deep Learning application
lifecycle
• Run local notebook experiments, with small data
samples and seamlessly validate experiments on
Deep Learning environments
• IBM Cloud DLaaS, FfDL (open source), KubeFlow
(open source)
Simplify productionization of model training and serving from notebooks
• Enable running/scheduling notebooks on production environments as batch jobs
• Results can be made available via an updated notebook, or exported to HTML, PDF, and a few other formats.
Interactive development lifecycle done on commodity hardware with sampled data
Training on full dataset gets scheduled as batch jobs on deep learning infrastructure
Deep Learning Workspace
• The user selects where to run the experiment
• The job is packaged and submitted on behalf of the user
• The user has access to the Job Console to monitor the experiment
Deep Learning Workspace
Summary
Interactive Workloads
across Kubernetes Cluster
Jupyter Enterprise Gateway
• Enables support for remote kernels in order to scale notebooks across the entire cluster
• Multitenant, with support for user impersonation leveraging Kerberos
• Base container image becomes a choice (e.g. Python with TensorFlow)
Supported platforms
• Kernels
• Runtimes
Deep Learning Workspace
• Seamlessly integrates interactive development with Deep Learning frameworks for model training
• Schedules notebooks to run remotely
Supported platforms
• FfDL
Jupyter Enterprise Gateway
Jupyter Enterprise Gateway at IBM Code
https://developer.ibm.com/code/openprojects/jupyter-enterprise-gateway/
Jupyter Enterprise Gateway source code at GitHub
https://github.com/jupyter/enterprise_gateway
Jupyter Enterprise Gateway Documentation
http://jupyter-enterprise-gateway.readthedocs.io/en/latest/
Jupyter Blog
https://blog.jupyter.org/
Other Resources
Thank you!
@lresende1975

Weitere ähnliche Inhalte

Mehr von Luciano Resende

Inteligencia artificial, open source e IBM Call for Code
Inteligencia artificial, open source e IBM Call for CodeInteligencia artificial, open source e IBM Call for Code
Inteligencia artificial, open source e IBM Call for CodeLuciano Resende
 
IoT Applications and Patterns using Apache Spark & Apache Bahir
IoT Applications and Patterns using Apache Spark & Apache BahirIoT Applications and Patterns using Apache Spark & Apache Bahir
IoT Applications and Patterns using Apache Spark & Apache BahirLuciano Resende
 
Getting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache BahirGetting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache BahirLuciano Resende
 
Open Source AI - News and examples
Open Source AI - News and examplesOpen Source AI - News and examples
Open Source AI - News and examplesLuciano Resende
 
Building analytical microservices powered by jupyter kernels
Building analytical microservices powered by jupyter kernelsBuilding analytical microservices powered by jupyter kernels
Building analytical microservices powered by jupyter kernelsLuciano Resende
 
Building iot applications with Apache Spark and Apache Bahir
Building iot applications with Apache Spark and Apache BahirBuilding iot applications with Apache Spark and Apache Bahir
Building iot applications with Apache Spark and Apache BahirLuciano Resende
 
An Enterprise Analytics Platform with Jupyter Notebooks and Apache Spark
An Enterprise Analytics Platform with Jupyter Notebooks and Apache SparkAn Enterprise Analytics Platform with Jupyter Notebooks and Apache Spark
An Enterprise Analytics Platform with Jupyter Notebooks and Apache SparkLuciano Resende
 
The Analytic Platform behind IBM’s Watson Data Platform - Big Data Spain 2017
The Analytic Platform behind IBM’s Watson Data Platform - Big Data Spain 2017The Analytic Platform behind IBM’s Watson Data Platform - Big Data Spain 2017
The Analytic Platform behind IBM’s Watson Data Platform - Big Data Spain 2017Luciano Resende
 
What's new in Apache SystemML - Declarative Machine Learning
What's new in Apache SystemML  - Declarative Machine LearningWhat's new in Apache SystemML  - Declarative Machine Learning
What's new in Apache SystemML - Declarative Machine LearningLuciano Resende
 
Big analytics meetup - Extended Jupyter Kernel Gateway
Big analytics meetup - Extended Jupyter Kernel GatewayBig analytics meetup - Extended Jupyter Kernel Gateway
Big analytics meetup - Extended Jupyter Kernel GatewayLuciano Resende
 
Jupyter con meetup extended jupyter kernel gateway
Jupyter con meetup   extended jupyter kernel gatewayJupyter con meetup   extended jupyter kernel gateway
Jupyter con meetup extended jupyter kernel gatewayLuciano Resende
 
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirWriting Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirLuciano Resende
 
How mentoring can help you start contributing to open source
How mentoring can help you start contributing to open sourceHow mentoring can help you start contributing to open source
How mentoring can help you start contributing to open sourceLuciano Resende
 
SystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningSystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningLuciano Resende
 
Luciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende
 
Open Source tools overview
Open Source tools overviewOpen Source tools overview
Open Source tools overviewLuciano Resende
 
Data access layer and schema definitions
Data access layer and schema definitionsData access layer and schema definitions
Data access layer and schema definitionsLuciano Resende
 
How mentoring programs can help newcomers get started with open source
How mentoring programs can help newcomers get started with open sourceHow mentoring programs can help newcomers get started with open source
How mentoring programs can help newcomers get started with open sourceLuciano Resende
 
Building RESTful services using SCA and JAX-RS
Building RESTful services using SCA and JAX-RSBuilding RESTful services using SCA and JAX-RS
Building RESTful services using SCA and JAX-RSLuciano Resende
 

Mehr von Luciano Resende (20)

Inteligencia artificial, open source e IBM Call for Code
Inteligencia artificial, open source e IBM Call for CodeInteligencia artificial, open source e IBM Call for Code
Inteligencia artificial, open source e IBM Call for Code
 
IoT Applications and Patterns using Apache Spark & Apache Bahir
IoT Applications and Patterns using Apache Spark & Apache BahirIoT Applications and Patterns using Apache Spark & Apache Bahir
IoT Applications and Patterns using Apache Spark & Apache Bahir
 
Getting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache BahirGetting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache Bahir
 
Open Source AI - News and examples
Open Source AI - News and examplesOpen Source AI - News and examples
Open Source AI - News and examples
 
Building analytical microservices powered by jupyter kernels
Building analytical microservices powered by jupyter kernelsBuilding analytical microservices powered by jupyter kernels
Building analytical microservices powered by jupyter kernels
 
Building iot applications with Apache Spark and Apache Bahir
Building iot applications with Apache Spark and Apache BahirBuilding iot applications with Apache Spark and Apache Bahir
Building iot applications with Apache Spark and Apache Bahir
 
An Enterprise Analytics Platform with Jupyter Notebooks and Apache Spark
An Enterprise Analytics Platform with Jupyter Notebooks and Apache SparkAn Enterprise Analytics Platform with Jupyter Notebooks and Apache Spark
An Enterprise Analytics Platform with Jupyter Notebooks and Apache Spark
 
The Analytic Platform behind IBM’s Watson Data Platform - Big Data Spain 2017
The Analytic Platform behind IBM’s Watson Data Platform - Big Data Spain 2017The Analytic Platform behind IBM’s Watson Data Platform - Big Data Spain 2017
The Analytic Platform behind IBM’s Watson Data Platform - Big Data Spain 2017
 
What's new in Apache SystemML - Declarative Machine Learning
What's new in Apache SystemML  - Declarative Machine LearningWhat's new in Apache SystemML  - Declarative Machine Learning
What's new in Apache SystemML - Declarative Machine Learning
 
Big analytics meetup - Extended Jupyter Kernel Gateway
Big analytics meetup - Extended Jupyter Kernel GatewayBig analytics meetup - Extended Jupyter Kernel Gateway
Big analytics meetup - Extended Jupyter Kernel Gateway
 
Jupyter con meetup extended jupyter kernel gateway
Jupyter con meetup   extended jupyter kernel gatewayJupyter con meetup   extended jupyter kernel gateway
Jupyter con meetup extended jupyter kernel gateway
 
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirWriting Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
 
How mentoring can help you start contributing to open source
How mentoring can help you start contributing to open sourceHow mentoring can help you start contributing to open source
How mentoring can help you start contributing to open source
 
SystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningSystemML - Declarative Machine Learning
SystemML - Declarative Machine Learning
 
Luciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conference
 
Asf icfoss-mentoring
Asf icfoss-mentoringAsf icfoss-mentoring
Asf icfoss-mentoring
 
Open Source tools overview
Open Source tools overviewOpen Source tools overview
Open Source tools overview
 
Data access layer and schema definitions
Data access layer and schema definitionsData access layer and schema definitions
Data access layer and schema definitions
 
How mentoring programs can help newcomers get started with open source
How mentoring programs can help newcomers get started with open sourceHow mentoring programs can help newcomers get started with open source
How mentoring programs can help newcomers get started with open source
 
Building RESTful services using SCA and JAX-RS
Building RESTful services using SCA and JAX-RSBuilding RESTful services using SCA and JAX-RS
Building RESTful services using SCA and JAX-RS
 

Kürzlich hochgeladen

Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 

Kürzlich hochgeladen (20)

Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 

Scaling interactive workloads across kubernetes cluster

  • 1. Scaling Interactive Workloads across Kubernetes Cluster. Luciano Resende, Codemotion Milan 2018. © 2018 IBM Corporation
  • 2. About me: Luciano Resende. Open Source AI Platform Architect, IBM CODAIT. Senior Technical Staff Member at IBM, contributing to open source for over 10 years. Currently contributing to the Jupyter Notebook ecosystem, Apache Bahir, Apache Toree, and Apache Spark, among other projects related to AI/ML platforms. lresende@us.ibm.com, https://www.linkedin.com/in/lresende, @lresende1975, https://github.com/lresende
  • 3. IBM Open Source Participation. Learn: the Open Source @ IBM Program touches 78,000 IBMers annually. Consume: virtually all IBM products contain some open source (40,363 pkgs per year). Contribute: >62K OS certs per year; ~10K IBM commits per month. Connect: >1000 active IBM contributors working in key OS projects.
  • 4. IBM Open Source Participation. IBM-generated open source innovation: 137 Code Open (dWO) projects w/1000+ GitHub projects; 4 graduates to full open governance in the last year (Node-RED, OpenWhisk, SystemML, Blockchain fabric); developer.ibm.com/code/open/code/. Community: IBM focused on 18 strategic communities; drive open governance in “Centers of Gravity”; IBM leaders drive key technologies and assure freedom of action. The IBM OS Way is now open sourced: training, recognition, tooling; organization, consuming, contributing.
  • 5. CODAIT: Center for Open Source Data and AI Technologies (codait.org). codait (French) = coder/coded (https://m.interglot.com/fr/en/codait). CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise. Relaunch of the Spark Technology Center (STC) to reflect expanded mission.
  • 6. Interactive Development with Jupyter Notebooks
  • 7. Jupyter Notebooks. Notebooks are interactive computational environments in which you can combine code execution, rich text, mathematics, plots, and rich media.
  • 8. Jupyter Notebooks. The Notebook UI runs in the browser. The Notebook Server serves the notebooks. Kernels interpret and execute cell contents: they are responsible for code execution, abstract different languages, have a 1:1 relationship with a notebook, and run and consume resources for as long as the notebook is running.
  • 9. Analytics and Deep Learning Workloads
  • 10. Analytics Workloads. Large amounts of data, shared across the organization in data lakes. Multiple workload types: data cleansing, data warehouse, ML and insights.
  • 11. Deep Learning Workloads. Resource-intensive workloads that require expensive hardware (GPU, TPU). Long-running training jobs: a simple MNIST model takes over one hour WITHOUT a decent GPU; training other non-complex deep learning models can easily take over a day WITH GPUs.
  • 12. Local Development Environment
  • 13. Analytic and AI Platforms. Large pool of shared computing resources: enterprise cloud, public cloud, or hybrid; shared data (data lakes/object storage). Distributed consumers: notebooks running locally (on the user's laptop) or as a service (e.g. Jupyter Hub). Different resource utilization patterns: high number of idle resources.
  • 14. Limitations of Jupyter Notebook Stack. [Figure: pipeline stages Gather Data, Analyze Data, Machine Learning, Deep Learning, Deploy Model, Maintain Model, with tools Python Data Science Stack, Pandas, Scikit-Learn, Apache Spark, Jupyter, Keras + Tensorflow, Fabric for Deep Learning (FfDL), Mleap + PFA, Model Asset eXchange.] [Chart: maximum number of simultaneous kernels (4GB heap) by cluster size (32GB nodes): 8 kernels at 4, 8, 12, and 16 nodes.] Scalability: Jupyter kernels run as local processes, so resources are limited to what is available on the single node that runs all kernels and associated Spark drivers. Security: kernels run as a single user sharing the same privileges, so users can see and control each other's processes using Jupyter administrative utilities.
  • 15. Jupyter Enterprise Gateway
  • 16. Jupyter Enterprise Gateway. A lightweight, multi-tenant, scalable, and secure gateway that enables Jupyter Notebooks to share resources across an Apache Spark or Kubernetes cluster for Enterprise/Cloud use cases. Jupyter Enterprise Gateway at IBM Code: https://developer.ibm.com/code/openprojects/jupyter-enterprise-gateway/ Source code at GitHub: https://github.com/jupyter-incubator/enterprise_gateway Documentation: http://jupyter-enterprise-gateway.readthedocs.io/en/latest/ [Figure: supported kernels and supported platforms, including Spectrum Conductor.]
  • 17. Jupyter Enterprise Gateway Features. [Figure: same pipeline diagram as on slide 14.] [Chart: maximum number of simultaneous kernels (4GB heap) by cluster size (32GB nodes): 16 kernels at 4 nodes, 32 at 8 nodes, 48 at 12 nodes, 64 at 16 nodes.] Optimized resource allocation: utilize resources on all cluster nodes by running kernels as Spark applications in YARN cluster mode; pluggable architecture to enable support for additional resource managers. Enhanced security: end-to-end secure communications (secure socket communications; encrypted HTTP communication using SSL). Multiuser support with user impersonation: enhance security and sandboxing by enabling user impersonation when running kernels (using Kerberos); individual HDFS home folder for each notebook user; use the same user ID for notebook and batch jobs.
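The two kernel-capacity charts (slides 14 and 17) can be reproduced with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the figure of 4 kernels per node in the distributed case is inferred from the chart values (16 kernels on 4 nodes), not stated in the slides.

```python
NODE_MEMORY_GB = 32   # node size given on the chart axis
KERNEL_HEAP_GB = 4    # per-kernel heap given on the chart axis


def max_kernels_local(cluster_nodes: int) -> int:
    """Kernels run as local processes on one node, so cluster size is irrelevant."""
    return NODE_MEMORY_GB // KERNEL_HEAP_GB


def max_kernels_distributed(cluster_nodes: int, kernels_per_node: int = 4) -> int:
    """Kernels are spread over all nodes; kernels_per_node is an assumption
    inferred from the slide 17 chart, not a documented constant."""
    return cluster_nodes * kernels_per_node


for nodes in (4, 8, 12, 16):
    print(nodes, max_kernels_local(nodes), max_kernels_distributed(nodes))
```

The point of the comparison is the shape of the two curves: the local-process model stays flat as nodes are added, while the distributed model grows linearly with cluster size.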
  • 18. Jupyter Notebooks and Kubernetes
  • 19. Deep Learning Workloads. Resource-intensive workloads that require expensive hardware (GPU, TPU). Long-running training jobs: a simple MNIST model takes over one hour WITHOUT a decent GPU; training other non-complex deep learning models can easily take over a day WITH GPUs.
  • 20. Jupyter & Kubernetes. Kubernetes platform: containers provide a flexible way to deploy applications and are here to stay; containers simplify management of complicated and heterogeneous AI/deep learning infrastructure; Kubernetes enables easy management of containerized applications and resources, with the benefits of elasticity and quality of service. Source: https://github.com/Langhalsdino/Kubernetes-GPU-Guide
  • 21. Enterprise Gateway & Kubernetes. Before Jupyter Enterprise Gateway: resources required for all kernels need to be allocated during Notebook Server pod creation; resources are limited to what is physically available on the host node that runs all kernels and associated Spark drivers. After Jupyter Enterprise Gateway: the gateway pod is very lightweight; kernels run in their own pods, providing isolation; kernel pods are built from community images (Spark-on-K8s, TensorFlow, Keras, etc.). [Figure: before and after Enterprise Gateway; supported platforms include FfDL.]
  • 22. Jupyter Enterprise Gateway - Kubernetes. [Diagram labels: nb2kg, Gateway, container images defined in kernelspec, community image, vanilla kernels, Spark-based kernels (Spark on K8s), distributed file system.]
  • 23. March 30 2018 / © 2018 IBM Corporation
  • 24. Jupyter & Kubernetes. Multi-user Enterprise Gateway pod. Each kernel is launched in its own pod. The kernel pod namespace is configurable.
  • 25. Enabling remote kernels. Jupyter kernels are configured by kernelspecs. Each kernel has a corresponding kernelspec, stored in one of the Jupyter data paths, for example /…/anaconda3/share/jupyter/kernels/python2/kernel.json. List them with: $ jupyter kernelspec list
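To make the kernelspec layout concrete, here is a minimal standard-library sketch that scans a kernels directory for `<name>/kernel.json` files and reports each kernel's display name. The helper name `list_kernelspecs` is hypothetical and is not part of Jupyter; in practice `jupyter kernelspec list` does this job across all Jupyter data paths.

```python
import json
from pathlib import Path


def list_kernelspecs(kernels_dir):
    """Return {kernel_name: display_name} for every <kernels_dir>/<name>/kernel.json.

    Hypothetical helper for illustration; it only looks at a single directory.
    """
    specs = {}
    for spec_file in sorted(Path(kernels_dir).glob("*/kernel.json")):
        with spec_file.open(encoding="utf-8") as handle:
            spec = json.load(handle)
        kernel_name = spec_file.parent.name
        # Fall back to the directory name when display_name is missing.
        specs[kernel_name] = spec.get("display_name", kernel_name)
    return specs
```

Calling it on a directory such as the `share/jupyter/kernels` folder of an Anaconda install would return one entry per installed kernel.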
  • 26. Enabling remote kernels: Process Proxies mixed with Kernel Launchers. Process Proxy: abstracts the kernel process represented by the Jupyter framework; is a pluggable class definition identified in the kernelspec (kernel.json); manages the kernel lifecycle. Kernel Launcher: embeds the target kernel; listens on the gateway communication port; conveys interrupt requests (via local signal); could be extended for additional communications. Example kernelspec:
      {
        "language": "python",
        "display_name": "Spark - Python (Kubernetes Mode)",
        "process_proxy": {
          "class_name": "enterprise_gateway.services.processproxies.k8s.KubernetesProcessProxy",
          "config": {
            "image_name": "elyra/kubernetes-kernel-py:dev",
            "executor_image_name": "elyra/kubernetes-kernel-py:dev",
            "port_range": "40000..42000"
          }
        },
        "env": {
          "SPARK_HOME": "/opt/spark",
          "SPARK_OPTS": "--master k8s://https://${KUBERNETES_SERVICE_HOST --deploy-mode cluster --name …",
          …
        },
        "argv": [
          "/usr/local/share/jupyter/kernels/spark_python_kubernetes/bin/run.sh",
          "{connection_file}",
          "--RemoteProcessProxy.response-address",
          "{response_address}",
          "--RemoteProcessProxy.spark-context-initialization-mode",
          "lazy"
        ]
      }
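As a rough illustration of how a gateway might read the `process_proxy` stanza from slide 26, the sketch below splits the dotted `class_name` into module and class, and parses the `port_range` string. Both helper names are hypothetical, and treating `"40000..42000"` as inclusive lower and upper port bounds is an assumption; this is not Enterprise Gateway's actual code.

```python
def split_class_name(dotted_name):
    """Split 'package.module.ClassName' into ('package.module', 'ClassName')."""
    module_name, _, class_name = dotted_name.rpartition(".")
    return module_name, class_name


def parse_port_range(value):
    """Parse 'LOW..HIGH' into a (low, high) tuple of ints (assumed inclusive)."""
    low_text, separator, high_text = value.partition("..")
    if not separator:
        raise ValueError(f"expected LOW..HIGH, got {value!r}")
    low, high = int(low_text), int(high_text)
    if not 0 < low <= high <= 65535:
        raise ValueError(f"invalid port range: {value!r}")
    return low, high


# Values copied from the kernelspec shown on the slide.
process_proxy = {
    "class_name": "enterprise_gateway.services.processproxies.k8s.KubernetesProcessProxy",
    "config": {"port_range": "40000..42000"},
}

module_name, class_name = split_class_name(process_proxy["class_name"])
low, high = parse_port_range(process_proxy["config"]["port_range"])
print(module_name, class_name, low, high)
```

A real implementation would go on to import the module and instantiate the class (for example with `importlib.import_module`), which is what makes the process proxy pluggable per kernelspec.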
  • 27. Jupyter Enterprise Gateway Components. [Diagram labels: Jupyter Notebook (NB2KG Extension, Lab Extension); Jupyter Kernel Gateway; Jupyter Enterprise Gateway (Remote Kernel Manager, Distributed Process Proxy, YARN Cluster Process Proxy, Kubernetes Process Proxy, Conductor Cluster Process Proxy); Supported Runtime Platforms (including Spectrum Conductor and FfDL).]
  • 28. Jupyter Notebooks and Deep Learning Platforms
  • 29. Deep Learning Platforms. Prohibitive costs: deep learning resources are too expensive to be locked or idle during interactive development. Deep learning platforms: we have seen the rise of deep learning platforms that leverage containers and Kubernetes as the basis of their infrastructure; Kubernetes enables deep learning platforms to easily share and restrict accelerated hardware. Examples: Fabric for Deep Learning, IBM Watson Studio Deep Learning as a service. Batch-oriented development.
  • 30. Deep Learning Workspace. Streamline the data science user experience when coming from notebook/interactive development interfaces. The current process includes multiple steps, one of which is decomposing the notebook into an application that must be submitted as a zip to the deep learning runtime; this becomes a showstopper for data scientists adopting FfDL and DLaaS.
  • 31. Deep Learning Workspace. Streamline the deep learning application lifecycle: run local notebook experiments with small data samples and seamlessly validate experiments on deep learning environments (IBM Cloud DLaaS, FfDL (open source), KubeFlow (open source)). Simplify productionization of model training and serving from notebooks: enable running/scheduling notebooks on production environments as batch jobs; results can be made available via an updated notebook, or exported to HTML, PDF, and a few other formats. The interactive development lifecycle is done on commodity hardware with sampled data; training on the full dataset gets scheduled as batch jobs on deep learning infrastructure.
  • 33. Deep Learning Workspace. The user selects where to run the experiment. The job is packaged and submitted on behalf of the user. The user has access to the Job Console to monitor the experiment.
  • 34. Summary
  • 35. Interactive Workloads across Kubernetes Cluster. Jupyter Enterprise Gateway: enable support for remote kernels in order to scale notebooks across the entire cluster; multitenant, with support for user impersonation leveraging Kerberos; base container image becomes a choice (e.g. Python with Tensorflow). Deep Learning Workspace: seamlessly integrate interactive development with deep learning frameworks for model training; schedule notebooks to run remotely. [Figure: supported platforms for each (kernels, runtimes, FfDL).]
  • 36. Other Resources. Jupyter Enterprise Gateway at IBM Code: https://developer.ibm.com/code/openprojects/jupyter-enterprise-gateway/ Jupyter Enterprise Gateway source code at GitHub: https://github.com/jupyter/enterprise_gateway Jupyter Enterprise Gateway Documentation: http://jupyter-enterprise-gateway.readthedocs.io/en/latest/ Jupyter Blog: https://blog.jupyter.org/
  • 37. Thank you! @lresende1975