#ibmedge © 2016 IBM Corporation
Scalable TensorFlow Deep Learning as a Service with Docker, OpenPOWER, and GPUs
Andrei Yurkevich, Altoros
Indrajit Poddar, IBM
Sep 23, 2016
Please Note:
• IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice
and at IBM’s sole discretion.
• Information regarding potential future products is intended to outline our general product direction and it
should not be relied on in making a purchasing decision.
• The information mentioned regarding potential future products is not a commitment, promise, or legal
obligation to deliver any material, code or functionality. Information about potential future products may not be
incorporated into any contract.
• The development, release, and timing of any future features or functionality described for our products
remains at our sole discretion.
• Performance is based on measurements and projections using standard IBM benchmarks in a controlled
environment. The actual throughput or performance that any user will experience will vary depending upon
many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the
I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be
given that an individual user will achieve results similar to those stated here.
About Indrajit (a.k.a. I.P)
Expertise:
• Accelerated Cloud Data Services, Machine
Learning and Deep Learning
• Apache Spark, TensorFlow… with GPUs
• Distributed Computing (scale out and up)
• Cloud Foundry, Spectrum Conductor, Mesos,
Kubernetes, Docker, OpenStack, WebSphere
• Cloud computing on High Performance Systems
• OpenPOWER, IBM POWER
Indrajit Poddar
Senior Technical Staff Member,
Master Inventor, IBM Systems
ipoddar@us.ibm.com
Twitter: @ipoddar
Sunnyvale, CA
(HQ)
We will talk about
- Current state of Deep Learning
- Deep Learning for cancer diagnosis (Digital Pathology)
- TensorFlow framework for Deep Learning
- Distributing TensorFlow with Docker
- Faster training with TensorFlow on OpenPOWER and GPUs
- Infrastructure for TensorFlow as a Service
A picture is worth a thousand words…
http://www.wordclouds.com/
What is Deep Learning?
Machine Learning in layers and hierarchies
Face classification example
Medical Data Analysis Example: Image classification
Comparing classification by humans and by machines
Detected by a doctor visually
Caught by a trained model
Time Scale: Before, Digital Pathology, Deep Learning
Timeline axis: 1980, 1990, 1997, 2005
Digital pathology milestones: video cameras; progress in functional telemedicine; robotic microscopy; first fully functional WSI scanner
Deep learning milestones: ANN intro; Yann LeCun et al., backpropagation algorithm; “Deep Learning” for speech recognition
Machines are now learning the way we learn
From "Texture of the Nervous System
of Man and the Vertebrates" by
Santiago Ramón y Cajal.
Artificial Neural Networks
Deep Learning is improving in accuracy
Time Scale: Advances in Deep Learning
Timeline axis: 2005, 2010, 2015, 2016
Milestones shown: Whole Slide Image (WSI) scanner; GPUs; 12 cores/socket, 8 threads/core
Open Source Deep Learning Libraries
IBM Machine Learning and Deep Learning distribution for Ubuntu on OpenPOWER:
http://openpowerfoundation.org/blogs/openpower-deep-learning-distribution/
(does not include TensorFlow and DL4J in the current release)
Why TensorFlow?
• Authored by Google
• Open source
• TensorFlow has a Python API
• Use Jupyter notebooks and examples to learn
• Distributed training
https://www.tensorflow.org/
Why distribute in clusters and why use GPUs?
• Input data sets are becoming larger
• High resolution images
• Video feeds
• Large number of training features
• Training times are very long (hours, days and weeks)
• Moore’s law is dying
• CPUs are not getting any faster
• Even the largest machine has limited capacity
Distributed Deep Learning using TensorFlow
• TensorFlow (version 0.8.0 and later) can distribute compute-intensive tasks across multiple nodes
• Parameter Server for storing parameters (weight matrix)
• Performing computations in Clients (Workers)
• Once computed, gradients are sent to Parameter Server to update stored parameters
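The parameter-server flow in the bullets above can be illustrated with a toy, framework-free sketch. This is not TensorFlow's API: `ParameterServer`, `worker_gradient`, the learning rate, the one-parameter model, and the data are all assumptions made for illustration.

```python
# Toy illustration of the parameter-server pattern (not TensorFlow code).
# All names and numbers here are hypothetical.


class ParameterServer:
    """Stores the parameter and applies gradients sent by workers."""

    def __init__(self, weight, learning_rate):
        self.weight = weight
        self.learning_rate = learning_rate

    def pull(self):
        # Workers read the current parameter value before computing.
        return self.weight

    def push(self, gradient):
        # Workers send a gradient; the server updates the stored parameter.
        self.weight -= self.learning_rate * gradient


def worker_gradient(weight, shard):
    """Mean-squared-error gradient for the model y = weight * x on one shard."""
    return sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)


# Data generated from y = 3x, split into two shards (one per worker).
shards = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
]

ps = ParameterServer(weight=0.0, learning_rate=0.01)
for step in range(200):
    for shard in shards:                      # each worker takes a turn
        w = ps.pull()                         # 1. fetch the parameter
        g = worker_gradient(w, shard)         # 2. compute the gradient locally
        ps.push(g)                            # 3. send it back for the update

print(round(ps.weight, 3))  # approaches the true slope of 3
```

In a real cluster the workers run concurrently on separate nodes; the sequential loop here only shows the order of the pull, compute, and push steps.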
Diagram: nodes #1, #2, ... #10 on the SuperVessel private network, running Worker Tasks and a Parameter Server Task
import tensorflow as tf
# taskID: index of this task within its job
# define Parameter Server jobs:
with tf.device('/job:ps/task:%d' % taskID):
    ...
# define Worker jobs:
with tf.device('/job:worker/task:%d' % taskID):
    ...
TensorFlow cluster
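The device strings in the snippet above follow a `/job:<name>/task:<index>` pattern. As a rough sketch, the helper below builds such strings together with a job-to-address mapping for a 10-node cluster. The host names, ports, and the choice of node 1 as the parameter server are assumptions, since the slides do not give them. The dictionary shape mirrors the job-name-to-address-list mapping that `tf.train.ClusterSpec` accepts, but the code does not import TensorFlow.

```python
# Hypothetical host names and ports; the slides do not list real addresses.
NUM_NODES = 10
PS_PORT = 2222
WORKER_PORT = 2223


def build_cluster(num_nodes):
    """Return a {job_name: ["host:port", ...]} mapping for the ps and worker jobs."""
    return {
        # Assumption: a single parameter server task hosted on node 1.
        "ps": ["node1:%d" % PS_PORT],
        # One worker task per node, as in the cluster diagram.
        "worker": ["node%d:%d" % (i, WORKER_PORT) for i in range(1, num_nodes + 1)],
    }


def device_for(job, task_id):
    """Format the device string used with tf.device for a given job and task."""
    return "/job:%s/task:%d" % (job, task_id)


cluster = build_cluster(NUM_NODES)
worker_devices = [device_for("worker", i) for i in range(len(cluster["worker"]))]
print(cluster["ps"], worker_devices[0], worker_devices[-1])
```

Task indices start at 0, so the tenth worker is addressed as `/job:worker/task:9`.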
Medical Data Analysis Example
The Problem: automated detection of metastases in whole-slide images of lymph node sections (source: Camelyon16)
The Solution: train a Deep Learning model and classify whole-slide histology images at “Level 0”
Questions to address
- How long does it take to train a model?
- How will performance scale with the cluster size?
- How will scaling the cluster out affect accuracy?
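One common way to quantify the scaling question above is speedup and parallel efficiency. The sketch below assumes that training time is measured for the same workload on one node and on the cluster. The timings are hypothetical placeholders, not results from this talk.

```python
def speedup(t_single, t_cluster):
    """How many times faster the cluster finishes than a single node."""
    return t_single / t_cluster


def efficiency(t_single, t_cluster, num_nodes):
    """Fraction of ideal linear speedup actually achieved (1.0 = perfect scaling)."""
    return speedup(t_single, t_cluster) / num_nodes


# Hypothetical timings in hours, for illustration only.
t_one_node = 10.0
t_eight_nodes = 2.5

print(speedup(t_one_node, t_eight_nodes))        # 4.0
print(efficiency(t_one_node, t_eight_nodes, 8))  # 0.5
```

An efficiency well below 1.0 usually means that communication with the parameter server, or some other serial cost, takes a growing share of each step.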
Deep Learning in a TensorFlow cluster
Goal: improve the training time for Camelyon16 without losing accuracy significantly.
100K images, ~2GB
4 training epochs
(~5.5k iterations at batch size 72)
VGG model
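The iteration count quoted above follows from the dataset size, batch size, and epoch count. A quick check, assuming "100K" means exactly 100,000 images and that a final partial batch counts as one iteration:

```python
import math

num_images = 100_000   # "100K images" (assumed to be exactly 100,000)
batch_size = 72
epochs = 4

# A last, partially filled batch still costs one iteration.
iterations_per_epoch = math.ceil(num_images / batch_size)
total_iterations = iterations_per_epoch * epochs

print(iterations_per_epoch, total_iterations)  # 1389 5556
```

The result, about 5,556 iterations, matches the "~5.5k" figure on the slide.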
Medical Data Analysis Example: applying Deep Learning
Accuracy metrics: ROC
100K images, ~2GB
4 training epochs
(~5.5k iterations at batch size 72)
VGG model
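For readers unfamiliar with the ROC metric named above, here is a minimal pure-Python sketch of how an ROC curve and its area (AUC) can be computed from classifier scores. The labels and scores are made-up examples, not Camelyon16 results, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points for a threshold sweep.

    Assumes labels are 0/1 and that both classes are present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: 1 = tumor patch, 0 = normal patch.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auc(roc_points(labels, scores)), 4))  # 0.8889
```

An AUC of 1.0 means every positive sample is scored above every negative one; 0.5 corresponds to random ranking.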
Infrastructure Components for Deep Learning as a Service
Deep Learning Cluster as a Service
Example Dockerfile to create Deep Learning images
FROM ppc64le/ubuntu:14.04
MAINTAINER Mike Hollinger <mchollin@us.ibm.com>
#bring in some base utils
#enable apt-add-repository and wget for the next line and for the cuda installer to work correctly
RUN apt-get -y update && apt-get -y install software-properties-common wget build-essential bash-completion
#inexplicably, this needs to be first before vnc-related things will install successfully
RUN apt-get -y install dictionaries-common
#install VNC and VNC-related items
RUN apt-get -y install x11vnc xfce4 xvfb xfce4-artwork xubuntu-icon-theme
#install advanced toolchain and Linux SDK
RUN wget ftp://public.dhe.ibm.com/software/server/iplsdk/v1.9.0/packages/deb/repo/dists/trusty/B346CA20.gpg.key -O /tmp/B346CA20.gpg.key
RUN wget ftp://ftp.unicamp.br/pub/linuxpatch/toolchain/at/ubuntu/dists/precise/6976a827.gpg.key -O /tmp/6976a827.gpg.key
RUN wget http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/ubuntu/public.gpg -O /tmp/xl_public.gpg
RUN apt-key add /tmp/B346CA20.gpg.key
RUN apt-key add /tmp/6976a827.gpg.key
RUN apt-key add /tmp/xl_public.gpg
RUN add-apt-repository "deb ftp://ftp.unicamp.br/pub/linuxpatch/toolchain/at/ubuntu trusty at9.0"
RUN apt-get -y update
RUN apt-get -y install advance-toolchain-at9.0-runtime \
    advance-toolchain-at9.0-devel \
    advance-toolchain-at9.0-perf \
    advance-toolchain-at9.0-mcore-libs
#install XL C/C++ Community Edition, auto-accepting the license (from Ke Wen Lin)
RUN apt-get -y install xlc.13.1.4 xlc-license-community.13.1.4
RUN mkdir -p /opt/ibm/xlC/13.1.4/lap/license/ && chmod a+rx /opt/ibm/xlC/13.1.4/lap/license
RUN echo "Status=9" >/opt/ibm/xlC/13.1.4/lap/license/status.dat
RUN /opt/ibm/xlC/13.1.4/bin/xlc_configure
RUN apt-get -y install ibm-sdk-lop
#bring in the ibm mldl PPA
RUN apt-add-repository -y ppa:ibmpackages/mldl
#bring local cuda repo with GPU driver 352.39 and CUDA 7.5
RUN wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1404-7-5-local_7.5-18_ppc64el.deb && \
    dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_ppc64el.deb && apt-get update && \
    apt-get install -y --no-install-recommends --force-yes cuda gpu-deployment-kit && \
    ln -s /usr/lib/nvidia-352/libnvidia-ml.so /usr/lib/libnvidia-ml.so && \
    rm cuda-repo-ubuntu1404-7-5-local_7.5-18_ppc64el.deb
#bring in and install cudnn
#copy then untar to handle ownership problems vs "add"
COPY rootfs/cudnn-7.0-linux-ppc64le-v3.0-prod.tgz /tmp/cudnn-7.0-linux-ppc64le-v3.0-prod.tgz
RUN tar --no-same-owner -xvf /tmp/cudnn-7.0-linux-ppc64le-v3.0-prod.tgz -C /usr/local
#install the MLDL frameworks
RUN apt-get update && apt-get -y install torch caffe theano
(continued ...)
Slide callouts: install GPU drivers and libraries; install deep learning software
Cluster components to manage compute resources
Docker containers and images
OR
An OpenStack- and Docker-based research cloud
SuperVessel
https://ny1.ptopenlab.com/bigdata_cluster/
Mesos with Marathon with Docker and GPUs
OpenPOWER: GPU support
Mesos supports GPUs
Credit: Kevin Klues, Mesosphere
IBM Spectrum Conductor includes enhanced support for fine-grained GPU and CPU scheduling with Apache Spark and Docker
POWER8 Core: Backbone of big data computing systems
• Enhanced Micro-Architecture
• Increased Execution Bandwidth
• SMT 8
• Transactional Memory
• Vector/Scalar Unit
• High-performance Integer & FP Vector Processor
• Optimized for Data Rich Applications
Core units shown in the diagram: VSU, FXU, IFU, DFU, ISU, LSU, PC
Combined I/O Bandwidth = 7.6Tb/s
Diagram: POWER8 processors connected to memory buffers over DMI links, to PCI, and to each other over on-node and node-to-node SMP links
Putting it all together: the memory links, the on- and off-node SMP links, and PCIe add up to 7.6Tb/s of chip I/O bandwidth
New OpenPOWER Systems with NVLink
S822LC-hpc “Minsky”: 2 POWER8 CPUs with 4 NVIDIA® Tesla® P100 GPUs hooked directly to the CPUs using NVIDIA’s NVLink high-speed interconnect
http://www-03.ibm.com/systems/power/hardware/s822lc-hpc/index.html
OpenPOWER: Open Hardware for High Performance
Machine Learning and Deep Learning analytics on OpenPOWER
No code changes needed!!
ATLAS (Automatically Tuned Linear Algebra Software)
Challenges and what’s next
● Infrastructure issues:
○ Advanced resource scheduling with Platform Conductor and Kubernetes or Mesos
○ More GPUs per system (up to 4-16 cards) for improved power consumption and better density
● TensorFlow issues:
○ Resolve problems with TF-Slim and model convergence
○ Integrate HDFS or another Distributed FS with TensorFlow
○ Try Synchronous training and compare results with Asynchronous
● Improve model training for better accuracy:
○ Train on a 300K-sample dataset
○ Increase the number of training iterations to 30 epochs
○ Run 2 iterations of updating the dataset with false-positive samples
○ Use another model (change from VGG16 to Inception-v3)
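The synchronous versus asynchronous comparison listed above can be pictured with a toy sketch. It is a simplified model under stated assumptions, not TensorFlow code: in the synchronous variant all workers compute from the same parameter value and the server applies one averaged update per step; in the asynchronous variant each gradient is applied as it arrives, even though it was computed from a value that may already be stale. The function names, data, and learning rate are hypothetical.

```python
def grad(w, shard):
    """Mean-squared-error gradient for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def train_sync(shards, w, lr, steps):
    """Synchronous: wait for all workers, average their gradients, update once."""
    for _ in range(steps):
        grads = [grad(w, shard) for shard in shards]  # all workers read the same w
        w -= lr * sum(grads) / len(grads)
    return w


def train_async(shards, w, lr, steps):
    """Asynchronous (simplified): apply each gradient on arrival.

    Every worker pulled the parameter at the start of the step, so later
    gradients in the same step were computed from a stale value.
    """
    for _ in range(steps):
        stale_w = w
        for shard in shards:
            w -= lr * grad(stale_w, shard)
    return w


# Data generated from y = 3x, one shard per worker (illustrative only).
shards = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
]

w_sync = train_sync(shards, w=0.0, lr=0.01, steps=200)
w_async = train_async(shards, w=0.0, lr=0.01, steps=200)
print(round(w_sync, 3), round(w_async, 3))
```

On this tiny convex problem both variants reach the same answer. On a deep network the stale gradients of the asynchronous variant can affect convergence, which is why the comparison is worth running.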
More related sessions at Edge
• Expo Center Demo
• Tue, Sept 20, 1:00-2:00PM, RM 312: Docker on IBM Power Systems: Build, Ship and Run
• Tue, Sept 20, 1:00-2:00PM, RM 313: Docker Containers for High Performance Computing
• Tue, Sept 20, 1:00-2:00PM, RM 317C: Lab: FPGA Virtualization and Operations Environment for Accelerator Application Development on Cloud
• Tue, Sept 20, 2:15-3:15PM, RM 320: Bringing the Deep Learning Revolution into the Enterprise
• Tue, Sept 20, 5:00-6:00PM, RM 308, and Thu, Sept 22, 9:45-10:45AM: Enabling Cognitive Workloads on the Cloud: GPU Enablement with Mesos, Docker and Marathon on POWER
• Wed, Sept 21, 9:45-10:45AM, RM 317C: Lab: Fast, Scalable, Easy Machine Learning in the Cloud with OpenPOWER, GPUs and Docker
Notices and Disclaimers
Copyright © 2016 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission
from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of
initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS
DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE
USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY.
IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided.
IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our
warranty terms apply.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers
have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.
References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in
which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials
and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or
their specific situation.
It is the customer’s responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and
interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such
laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.
Notices and Disclaimers (cont’d)
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not
tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products.
Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the
ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT
NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual
property right.
IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®,
FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG,
Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®,
PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®,
StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business
Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM
trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.
Thank You

More Related Content

What's hot

Delivering Container-based Apps to IoT Edge devices
Delivering Container-based Apps to IoT Edge devicesDelivering Container-based Apps to IoT Edge devices
Delivering Container-based Apps to IoT Edge devicesAjeet Singh Raina
 
Using Docker for GPU Accelerated Applications
Using Docker for GPU Accelerated ApplicationsUsing Docker for GPU Accelerated Applications
Using Docker for GPU Accelerated ApplicationsNVIDIA
 
Evolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server SolutionEvolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server SolutionNVIDIA Taiwan
 
Profiling deep learning network using NVIDIA nsight systems
Profiling deep learning network using NVIDIA nsight systemsProfiling deep learning network using NVIDIA nsight systems
Profiling deep learning network using NVIDIA nsight systemsJack (Jaegeun) Han
 
Deep Learning on the SaturnV Cluster
Deep Learning on the SaturnV ClusterDeep Learning on the SaturnV Cluster
Deep Learning on the SaturnV Clusterinside-BigData.com
 
NVIDIA 深度學習教育機構 (DLI): Approaches to object detection
NVIDIA 深度學習教育機構 (DLI): Approaches to object detectionNVIDIA 深度學習教育機構 (DLI): Approaches to object detection
NVIDIA 深度學習教育機構 (DLI): Approaches to object detectionNVIDIA Taiwan
 
Affordable AI Connects To A Better Life
Affordable AI Connects To A Better LifeAffordable AI Connects To A Better Life
Affordable AI Connects To A Better LifeNVIDIA Taiwan
 
Classification of aerial photographs using DIGITS 2 - Mike Wang
Classification of aerial photographs using DIGITS 2 - Mike WangClassification of aerial photographs using DIGITS 2 - Mike Wang
Classification of aerial photographs using DIGITS 2 - Mike WangPAPIs.io
 
AI OpenPOWER Academia Discussion Group
AI OpenPOWER Academia Discussion Group AI OpenPOWER Academia Discussion Group
AI OpenPOWER Academia Discussion Group Ganesan Narayanasamy
 
Cloud Deep Learning Chips Training & Inference
Cloud Deep Learning Chips Training & InferenceCloud Deep Learning Chips Training & Inference
Cloud Deep Learning Chips Training & InferenceMr. Vengineer
 
TFLite NNAPI and GPU Delegates
TFLite NNAPI and GPU DelegatesTFLite NNAPI and GPU Delegates
TFLite NNAPI and GPU DelegatesKoan-Sin Tan
 
DockerとKubernetesをかけめぐる
DockerとKubernetesをかけめぐるDockerとKubernetesをかけめぐる
DockerとKubernetesをかけめぐるKohei Tokunaga
 
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」PC Cluster Consortium
 
Daneyon Hansen - Intro to OpenStack - Feb13 OpenStack Denver Meetup
Daneyon Hansen - Intro to OpenStack - Feb13 OpenStack Denver MeetupDaneyon Hansen - Intro to OpenStack - Feb13 OpenStack Denver Meetup
Daneyon Hansen - Intro to OpenStack - Feb13 OpenStack Denver MeetupShannon McFarland
 
Recent Progress in SCCS on GPU Simulation of Biomedical and Hydrodynamic Prob...
Recent Progress in SCCS on GPU Simulation of Biomedical and Hydrodynamic Prob...Recent Progress in SCCS on GPU Simulation of Biomedical and Hydrodynamic Prob...
Recent Progress in SCCS on GPU Simulation of Biomedical and Hydrodynamic Prob...NVIDIA Taiwan
 
Introduction to multi gpu deep learning with DIGITS 2 - Mike Wang
Introduction to multi gpu deep learning with DIGITS 2 - Mike WangIntroduction to multi gpu deep learning with DIGITS 2 - Mike Wang
Introduction to multi gpu deep learning with DIGITS 2 - Mike WangPAPIs.io
 
AI Hardware Landscape 2021
AI Hardware Landscape 2021AI Hardware Landscape 2021
AI Hardware Landscape 2021Grigory Sapunov
 

What's hot (20)

Delivering Container-based Apps to IoT Edge devices
Delivering Container-based Apps to IoT Edge devicesDelivering Container-based Apps to IoT Edge devices
Delivering Container-based Apps to IoT Edge devices
 
Using Docker for GPU Accelerated Applications
Using Docker for GPU Accelerated ApplicationsUsing Docker for GPU Accelerated Applications
Using Docker for GPU Accelerated Applications
 
Evolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server SolutionEvolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server Solution
 
Profiling deep learning network using NVIDIA nsight systems
Profiling deep learning network using NVIDIA nsight systemsProfiling deep learning network using NVIDIA nsight systems
Profiling deep learning network using NVIDIA nsight systems
 
Deep Learning on the SaturnV Cluster
Deep Learning on the SaturnV ClusterDeep Learning on the SaturnV Cluster
Deep Learning on the SaturnV Cluster
 
NVIDIA 深度學習教育機構 (DLI): Approaches to object detection
NVIDIA 深度學習教育機構 (DLI): Approaches to object detectionNVIDIA 深度學習教育機構 (DLI): Approaches to object detection
NVIDIA 深度學習教育機構 (DLI): Approaches to object detection
 
BSC LMS DDL
BSC LMS DDL BSC LMS DDL
BSC LMS DDL
 
Affordable AI Connects To A Better Life
Affordable AI Connects To A Better LifeAffordable AI Connects To A Better Life
Affordable AI Connects To A Better Life
 
Classification of aerial photographs using DIGITS 2 - Mike Wang
Classification of aerial photographs using DIGITS 2 - Mike WangClassification of aerial photographs using DIGITS 2 - Mike Wang
Classification of aerial photographs using DIGITS 2 - Mike Wang
 
AI OpenPOWER Academia Discussion Group
AI OpenPOWER Academia Discussion Group AI OpenPOWER Academia Discussion Group
AI OpenPOWER Academia Discussion Group
 
Cloud Deep Learning Chips Training & Inference
Cloud Deep Learning Chips Training & InferenceCloud Deep Learning Chips Training & Inference
Cloud Deep Learning Chips Training & Inference
 
TFLite NNAPI and GPU Delegates
TFLite NNAPI and GPU DelegatesTFLite NNAPI and GPU Delegates
TFLite NNAPI and GPU Delegates
 
DockerとKubernetesをかけめぐる
DockerとKubernetesをかけめぐるDockerとKubernetesをかけめぐる
DockerとKubernetesをかけめぐる
 
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
 
Daneyon Hansen - Intro to OpenStack - Feb13 OpenStack Denver Meetup
Daneyon Hansen - Intro to OpenStack - Feb13 OpenStack Denver MeetupDaneyon Hansen - Intro to OpenStack - Feb13 OpenStack Denver Meetup
Daneyon Hansen - Intro to OpenStack - Feb13 OpenStack Denver Meetup
 
Recent Progress in SCCS on GPU Simulation of Biomedical and Hydrodynamic Prob...
Recent Progress in SCCS on GPU Simulation of Biomedical and Hydrodynamic Prob...Recent Progress in SCCS on GPU Simulation of Biomedical and Hydrodynamic Prob...
Recent Progress in SCCS on GPU Simulation of Biomedical and Hydrodynamic Prob...
 
CFD on Power
CFD on Power CFD on Power
CFD on Power
 
Introduction to multi gpu deep learning with DIGITS 2 - Mike Wang
Introduction to multi gpu deep learning with DIGITS 2 - Mike WangIntroduction to multi gpu deep learning with DIGITS 2 - Mike Wang
Introduction to multi gpu deep learning with DIGITS 2 - Mike Wang
 
AI Hardware Landscape 2021
AI Hardware Landscape 2021AI Hardware Landscape 2021
AI Hardware Landscape 2021
 
2018 bsc power9 and power ai
2018   bsc power9 and power ai 2018   bsc power9 and power ai
2018 bsc power9 and power ai
 

Viewers also liked

Tensorflow in Docker
Tensorflow in DockerTensorflow in Docker
Tensorflow in DockerEric Ahn
 
Deploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesDeploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesPetteriTeikariPhD
 
Ultrasound nerve segmentation, kaggle review
Ultrasound nerve segmentation, kaggle reviewUltrasound nerve segmentation, kaggle review
Ultrasound nerve segmentation, kaggle reviewEduard Tyantov
 
L gordon slideshare assign assitive technology
L gordon slideshare assign assitive technologyL gordon slideshare assign assitive technology
L gordon slideshare assign assitive technologyLa Shelia Gordon
 
climate-leadership-report-to-minister-executive-summary
climate-leadership-report-to-minister-executive-summaryclimate-leadership-report-to-minister-executive-summary
climate-leadership-report-to-minister-executive-summaryKimberly Harback
 
Why Work With Us
Why Work With Us Why Work With Us
Why Work With Us housejive
 
WebAnalyticsAssignment4 (2) copy
WebAnalyticsAssignment4 (2) copyWebAnalyticsAssignment4 (2) copy
WebAnalyticsAssignment4 (2) copyAlyssa Sybilrud
 
About Laser Scanning
About Laser ScanningAbout Laser Scanning
About Laser ScanningLewis Boxer
 
หัวข้อ
หัวข้อหัวข้อ
หัวข้อfernfnnn
 
Agile - A failure story
Agile - A failure storyAgile - A failure story
Agile - A failure storyMiki Lior
 
Mapa de riesgo de una empresa de farmacos
Mapa de riesgo de una empresa de farmacosMapa de riesgo de una empresa de farmacos
Mapa de riesgo de una empresa de farmacossaidriana
 
Open stack + Cloud Foundry: Palo Alto Meetup February 2015
Open stack + Cloud Foundry: Palo Alto Meetup February 2015Open stack + Cloud Foundry: Palo Alto Meetup February 2015
Open stack + Cloud Foundry: Palo Alto Meetup February 2015Joshua McKenty
 
Tfk 6618 tensor_flow로얼굴인식구현_r10_mariocho
Tfk 6618 tensor_flow로얼굴인식구현_r10_mariochoTfk 6618 tensor_flow로얼굴인식구현_r10_mariocho
Tfk 6618 tensor_flow로얼굴인식구현_r10_mariochoMario Cho
 
ECS for Amazon Deep Learning and Amazon Machine Learning
ECS for Amazon Deep Learning and Amazon Machine LearningECS for Amazon Deep Learning and Amazon Machine Learning
ECS for Amazon Deep Learning and Amazon Machine LearningAmanda Mackay (she/her)
 

Viewers also liked (20)

Tensorflow in Docker
Tensorflow in DockerTensorflow in Docker
Tensorflow in Docker
 
Google TensorFlow Tutorial
Google TensorFlow TutorialGoogle TensorFlow Tutorial
Google TensorFlow Tutorial
 
Deploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and KubernetesDeploying deep learning models with Docker and Kubernetes
Deploying deep learning models with Docker and Kubernetes
 
Ultrasound nerve segmentation, kaggle review
Ultrasound nerve segmentation, kaggle reviewUltrasound nerve segmentation, kaggle review
Ultrasound nerve segmentation, kaggle review
 
L gordon slideshare assign assitive technology
L gordon slideshare assign assitive technologyL gordon slideshare assign assitive technology
L gordon slideshare assign assitive technology
 
climate-leadership-report-to-minister-executive-summary
climate-leadership-report-to-minister-executive-summaryclimate-leadership-report-to-minister-executive-summary
climate-leadership-report-to-minister-executive-summary
 
Red House Advertising-The Idea Incubator
Red House Advertising-The Idea IncubatorRed House Advertising-The Idea Incubator
Red House Advertising-The Idea Incubator
 
Why Work With Us
Why Work With Us Why Work With Us
Why Work With Us
 
WebAnalyticsAssignment4 (2) copy
WebAnalyticsAssignment4 (2) copyWebAnalyticsAssignment4 (2) copy
WebAnalyticsAssignment4 (2) copy
 
test upload
test uploadtest upload
test upload
 
About Laser Scanning
About Laser ScanningAbout Laser Scanning
About Laser Scanning
 
หัวข้อ
หัวข้อหัวข้อ
หัวข้อ
 
Independencia Dominicana (Pequeña practica)
Independencia Dominicana (Pequeña practica)Independencia Dominicana (Pequeña practica)
Independencia Dominicana (Pequeña practica)
 
Agile - A failure story
Agile - A failure storyAgile - A failure story
Agile - A failure story
 
Mapa de riesgo de una empresa de farmacos
Mapa de riesgo de una empresa de farmacosMapa de riesgo de una empresa de farmacos
Mapa de riesgo de una empresa de farmacos
 
Open stack + Cloud Foundry: Palo Alto Meetup February 2015
Open stack + Cloud Foundry: Palo Alto Meetup February 2015Open stack + Cloud Foundry: Palo Alto Meetup February 2015
Open stack + Cloud Foundry: Palo Alto Meetup February 2015
 
TensorFlow
TensorFlowTensorFlow
TensorFlow
 
Tfk 6618 tensor_flow로얼굴인식구현_r10_mariocho
Tfk 6618 tensor_flow로얼굴인식구현_r10_mariochoTfk 6618 tensor_flow로얼굴인식구현_r10_mariocho
Tfk 6618 tensor_flow로얼굴인식구현_r10_mariocho
 
ECS for Amazon Deep Learning and Amazon Machine Learning
ECS for Amazon Deep Learning and Amazon Machine LearningECS for Amazon Deep Learning and Amazon Machine Learning
ECS for Amazon Deep Learning and Amazon Machine Learning
 
Docker で Deep Learning
Docker で Deep LearningDocker で Deep Learning
Docker で Deep Learning
 

Similar to Scalable TensorFlow Deep Learning as a Service with Docker, OpenPOWER, and GPUs

DevOps Fest 2020. immutable infrastructure as code. True story.
DevOps Fest 2020. immutable infrastructure as code. True story.DevOps Fest 2020. immutable infrastructure as code. True story.
DevOps Fest 2020. immutable infrastructure as code. True story.Vlad Fedosov
 
Using Docker EE to Scale Operational Intelligence at Splunk
Scalable TensorFlow Deep Learning as a Service with Docker, OpenPOWER, and GPUs

  • 1. #ibmedge © 2016 IBM Corporation. Scalable TensorFlow Deep Learning as a Service with Docker, OpenPOWER, and GPUs. Andrei Yurkevich, Altoros; Indrajit Poddar, IBM. Sep 23, 2016
  • 2. #ibmedge Please Note: • IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice and at IBM’s sole discretion. • Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. • The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. • The development, release, and timing of any future features or functionality described for our products remains at our sole discretion. • Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here. 1
  • 3. About Indrajit (a.k.a. I.P). Indrajit Poddar, Senior Technical Staff Member, Master Inventor, IBM Systems; ipoddar@us.ibm.com; Twitter: @ipoddar. Expertise: • Accelerated Cloud Data Services, Machine Learning and Deep Learning • Apache Spark, TensorFlow… with GPUs • Distributed Computing (scale out and up) • Cloud Foundry, Spectrum Conductor, Mesos, Kubernetes, Docker, OpenStack, WebSphere • Cloud computing on High Performance Systems • OpenPOWER, IBM POWER
  • 12. We will talk about: - Current state of Deep Learning - Deep Learning for cancer diagnosis (Digital Pathology) - TensorFlow framework for Deep Learning - Distributing TensorFlow with Docker - Faster training with TensorFlow on OpenPOWER and GPUs - Infrastructure for TensorFlow as a Service
  • 13. A picture is worth a thousand words… [Word cloud image; source: http://www.wordclouds.com/]
  • 14. What is Deep Learning? Machine Learning in layers and hierarchies
  • 16. Medical Data Analysis Example: Image classification. Comparing classification by humans and by machines: detected by a doctor visually vs. caught by a trained model
  • 17. Time Scale: Before, Digital Pathology, Deep Learning. [Timeline with marks at 1980, 1990, 1997, 2005. Entries: video cameras; progress in functional telemedicine; robotic microscopy; first fully functional WSI scanner; ANN intro; Yann LeCun et al., backpropagation algorithm; “Deep Learning” for speech recognition]
  • 18. Machines are now learning the way we learn. Artificial Neural Networks. [Illustration from "Texture of the Nervous System of Man and the Vertebrates" by Santiago Ramón y Cajal]
  • 19. Deep Learning is improving in accuracy
  • 20. Time Scale: Advances in Deep Learning. [Timeline with marks at 2005, 2010, 2015, 2016. Entries: Whole Slide Image (WSI) scanner; GPUs; 12 core/socket, 8 thread/core]
  • 21. Open Source Deep Learning Libraries. IBM Machine Learning and Deep Learning distribution for Ubuntu on OpenPOWER: http://openpowerfoundation.org/blogs/openpower-deep-learning-distribution/ (does not include TensorFlow and DL4J in the current release)
  • 22. Why TensorFlow? • Authored by Google • Open source • Has a Python API • Use Jupyter notebooks and examples to learn • Distributed training. https://www.tensorflow.org/
  • 23. Why distribute in clusters and why use GPUs? • Input data sets are becoming larger • High resolution images • Video feeds • Large number of training features • Training times are very long (hours, days and weeks) • Moore’s law is dying • CPUs are not getting any faster • Even the largest machine has limited capacity
  • 24. Distributed Deep Learning using TensorFlow • TensorFlow (version 0.8.0 and later) can distribute compute-intensive tasks across multiple nodes • A Parameter Server stores the parameters (weight matrix) • Clients (Workers) perform the computations • Once computed, gradients are sent to the Parameter Server to update the stored parameters. [Diagram: SuperVessel private network, node #1 through node #10, running Worker tasks and a Parameter Server task]
      # define Parameter Server jobs
      with tf.device('/job:ps/task:%d' % taskID):
          ...
      # define Worker jobs
      with tf.device('/job:worker/task:%d' % taskID):
          ...
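
The pull/compute/push cycle described on slide 24 can be illustrated without TensorFlow. The following is only a minimal single-process sketch, not the deck's implementation: the `ParameterServer` class, the `worker_gradient` and `train` helpers, the toy linear data, and the learning rate are all hypothetical names and values chosen for illustration. It has no networking or real concurrency; a random choice of worker stands in for workers reporting in no fixed order.

```python
import random


class ParameterServer:
    """Holds the shared weights and applies gradients sent by workers."""

    def __init__(self, n_weights, learning_rate):
        self.weights = [0.0] * n_weights
        self.learning_rate = learning_rate

    def pull(self):
        # Workers fetch a copy of the current parameters.
        return list(self.weights)

    def push(self, gradient):
        # Workers send gradients; the server updates the stored parameters.
        for i, g in enumerate(gradient):
            self.weights[i] -= self.learning_rate * g


def worker_gradient(weights, batch):
    """Mean-squared-error gradient for the linear model y = w0 + w1 * x."""
    g0 = g1 = 0.0
    for x, y in batch:
        err = (weights[0] + weights[1] * x) - y
        g0 += 2 * err
        g1 += 2 * err * x
    n = len(batch)
    return [g0 / n, g1 / n]


def train(num_workers=4, steps=2000, seed=0):
    rng = random.Random(seed)
    # Toy data drawn from y = 1 + 3x (made up); each worker owns one shard.
    data = [(x / 100.0, 1.0 + 3.0 * (x / 100.0)) for x in range(100)]
    shards = [data[i::num_workers] for i in range(num_workers)]
    ps = ParameterServer(n_weights=2, learning_rate=0.1)
    for _ in range(steps):
        # A randomly chosen worker pulls weights, computes, and pushes.
        shard = shards[rng.randrange(num_workers)]
        batch = rng.sample(shard, 5)
        ps.push(worker_gradient(ps.pull(), batch))
    return ps.weights


weights = train()
print(weights)  # should approach [1.0, 3.0] for this toy data
```

In a real cluster the `pull` and `push` calls would cross the network, which is why the number of workers per parameter server affects both throughput and how stale a worker's copy of the weights can be.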
  • 26. Medical Data Analysis Example. The Problem: automated detection of metastases in whole-slide images of lymph node sections (source: Camelyon16). The Solution: train a Deep Learning model and classify the whole-slide histology image at “Level 0”
  • 27. Questions to address: - How long does it take to train a model? - How will performance scale with the cluster size? - How will scaling out the cluster affect accuracy?
  • 28. Deep Learning in a TensorFlow cluster. Goal: improve the training time for Camelyon16 without losing accuracy significantly. Setup: 100K images (~2GB); 4 training epochs (~5.5k iterations at batch size 72); VGG model
  • 29. Medical Data Analysis Example: applying Deep Learning. Goal: improve the training time for Camelyon16 without losing accuracy significantly. Setup: 100K images (~2GB); 4 training epochs (~5.5k iterations at batch size 72); VGG model
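
The "~5.5k iterations" figure follows from the other numbers on the slide. As a quick check, here is a small sketch that assumes "100K" means exactly 100,000 images and that a final partial batch counts as one iteration; the function name is hypothetical.

```python
import math


def training_iterations(num_images, batch_size, epochs):
    """Number of parameter updates when each epoch covers the data set once."""
    iterations_per_epoch = math.ceil(num_images / batch_size)
    return iterations_per_epoch * epochs


total = training_iterations(num_images=100_000, batch_size=72, epochs=4)
print(total)  # 5556
```

That is 1,389 iterations per epoch, or roughly 5.5k over 4 epochs, which matches the slide.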
  • 30. Medical Data Analysis Example: applying Deep Learning. Accuracy metrics: ROC. Setup: 100K images (~2GB); 4 training epochs (~5.5k iterations at batch size 72); VGG model
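
Slide 30 names ROC as the accuracy metric but the transcript does not show how it was computed. As a hedged illustration only, the sketch below builds an ROC curve and its area (AUC) in plain Python. The labels and scores are made-up toy values, not results from the talk, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Lower the threshold past every sample that shares this score.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example: 1 = tumor patch, 0 = normal patch (made-up values).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(labels, scores))
print(round(area, 4))  # 0.8889
```

An AUC of 1.0 means every positive sample is scored above every negative one; 0.5 is what random scoring would give. Here 8 of the 9 positive/negative pairs are ordered correctly, hence 8/9.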
  • 31. Infrastructure Components for Deep Learning as a Service
  • 33. Example Dockerfile to create Deep Learning images. Slide callouts: “install GPU drivers and libraries”, “install deep learning software”.
      FROM ppc64le/ubuntu:14.04
      MAINTAINER Mike Hollinger <mchollin@us.ibm.com>
      #bring in some base utils
      RUN apt-get -y update && apt-get -y install software-properties-common wget build-essential bash-completion #enable apt-add-repository and wget for the next line and for the cuda installer to work correctly
      RUN apt-get -y install dictionaries-common #inexplicably, this needs to be first before vnc-related things will install successfully
      #install VNC and VNC-related items
      RUN apt-get -y install x11vnc xfce4 xvfb xfce4-artwork xubuntu-icon-theme
      #install advanced toolchain and Linux SDK
      RUN wget ftp://public.dhe.ibm.com/software/server/iplsdk/v1.9.0/packages/deb/repo/dists/trusty/B346CA20.gpg.key -O /tmp/B346CA20.gpg.key
      RUN wget ftp://ftp.unicamp.br/pub/linuxpatch/toolchain/at/ubuntu/dists/precise/6976a827.gpg.key -O /tmp/6976a827.gpg.key
      RUN wget http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/ubuntu/public.gpg -O /tmp/xl_public.gpg
      RUN apt-key add /tmp/B346CA20.gpg.key
      RUN apt-key add /tmp/6976a827.gpg.key
      RUN apt-key add /tmp/xl_public.gpg
      RUN add-apt-repository "deb ftp://ftp.unicamp.br/pub/linuxpatch/toolchain/at/ubuntu trusty at9.0"
      RUN apt-get -y update
      RUN apt-get -y install advance-toolchain-at9.0-runtime advance-toolchain-at9.0-devel advance-toolchain-at9.0-perf advance-toolchain-at9.0-mcore-libs
      #install XL C/C++ Community Edition, auto-accepting the license (from Ke Wen Lin)
      RUN apt-get -y install xlc.13.1.4 xlc-license-community.13.1.4
      RUN mkdir -p /opt/ibm/xlC/13.1.4/lap/license/ && chmod a+rx /opt/ibm/xlC/13.1.4/lap/license
      RUN echo "Status=9" >/opt/ibm/xlC/13.1.4/lap/license/status.dat
      RUN /opt/ibm/xlC/13.1.4/bin/xlc_configure
      RUN apt-get -y install ibm-sdk-lop
      #bring in the ibm mldl PPA
      RUN apt-add-repository -y ppa:ibmpackages/mldl
      #bring local cuda repo with GPU driver 352.39 and CUDA 7.5
      RUN wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1404-7-5-local_7.5-18_ppc64el.deb && \
          dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_ppc64el.deb && \
          apt-get update && \
          apt-get install -y --no-install-recommends --force-yes cuda gpu-deployment-kit && \
          ln -s /usr/lib/nvidia-352/libnvidia-ml.so /usr/lib/libnvidia-ml.so && \
          rm cuda-repo-ubuntu1404-7-5-local_7.5-18_ppc64el.deb
      #bring in and install cudnn
      COPY rootfs/cudnn-7.0-linux-ppc64le-v3.0-prod.tgz /tmp/cudnn-7.0-linux-ppc64le-v3.0-prod.tgz
      RUN tar --no-same-owner -xvf /tmp/cudnn-7.0-linux-ppc64le-v3.0-prod.tgz -C /usr/local #copy then untar to handle ownership problems vs "add"
      #install the MLDL frameworks
      RUN apt-get update && apt-get -y install torch caffe theano
      # continued ..
  • 34. Cluster components to manage compute resources. Docker containers and images. [Diagram of alternatives, joined by “OR”]
  • 35. SuperVessel: an OpenStack- and Docker-based research cloud. https://ny1.ptopenlab.com/bigdata_cluster/
  • 36. Mesos with Marathon with Docker and GPUs
  • 37. OpenPOWER: GPU support. Mesos supports GPUs (credit: Kevin Klaues, Mesosphere). IBM Spectrum Conductor includes enhanced support for fine-grained GPU and CPU scheduling with Apache Spark and Docker
  • 38. POWER8 Core: Backbone of big data computing system • Enhanced Micro-Architecture • Increased Execution Bandwidth • SMT 8 • Transactional Memory • Vector/Scalar Unit • High-performance Integer & FP Vector Processor • Optimized for Data Rich Applications. [Core diagram labels: VSU, FXU, IFU, DFU, ISU, LSU, PC]
  • 39. Combined I/O Bandwidth = 7.6Tb/s. [Diagram: POWER8 processors with memory buffers, DMI memory links, PCI, and on-node and node-to-node SMP links] Putting it all together, the memory links, the on- and off-node SMP links, and PCIe add up to 7.6Tb/s of chip I/O bandwidth
  • 40. New OpenPOWER Systems with NVLink. S822LC-hpc “Minsky”: 2 POWER8 CPUs with 4 NVIDIA® Tesla® P100 GPUs. GPUs hooked directly to CPUs using NVIDIA’s NVLink high-speed interconnect. http://www-03.ibm.com/systems/power/hardware/s822lc-hpc/index.html
  • 41. OpenPOWER: Open Hardware for High Performance
  • 42. Machine Learning and Deep Learning analytics on OpenPOWER. No code changes needed! ATLAS (Automatically Tuned Linear Algebra Software)
  • 43. Challenges and what’s next ● Infrastructure issues: ○ Advanced resource scheduling with Platform Conductor and Kubernetes or Mesos ○ More GPUs per system (up to 4-16 cards) for improved power consumption and better density ● TensorFlow issues: ○ Resolve problems with TF-Slim and model convergence ○ Integrate HDFS or another distributed file system with TensorFlow ○ Try synchronous training and compare the results with asynchronous training ● Improve model training for better accuracy: ○ Train on a 300K-sample dataset ○ Increase the number of training iterations to 30 epochs ○ Run 2 iterations of updating the false-positive samples in the dataset ○ Use another model (change from VGG16 to Inception-v3)
  • 44. More related sessions at Edge • Expo Center Demo • Tue, Sept 20, 1:00-2:00PM, RM 312: Docker on IBM Power Systems: Build, Ship and Run • Tue, Sept 20, 1:00-2:00PM, RM 313: Docker Containers for High Performance Computing • Tue, Sept 20, 1:00-2:00PM, RM 317C: Lab: FPGA Virtualization and Operations Environment for Accelerator Application Development on Cloud • Tue, Sept 20, 2:15-3:15PM, RM 320: Bringing the Deep Learning Revolution into the Enterprise • Tue, Sept 20, 5:00-6:00PM, RM 308, and Thu, Sept 22, 9:45-10:45AM: Enabling Cognitive Workloads on the Cloud: GPU Enablement with Mesos, Docker and Marathon on POWER • Wed, Sept 21, 9:45-10:45AM, RM 317C: Lab: Fast, Scalable, Easy Machine Learning in the Cloud with OpenPOWER, GPUs and Docker
  • 45. #ibmedge Notices and Disclaimers Copyright © 2016 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM. Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided. IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply.” Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice. Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary. References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. 
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation. It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law
  • 46. #ibmedge Notices and Disclaimers Con’t. 45 Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. The provision of the information contained h erein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right. IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. 
A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.
  • 47. Thank You