Innovating with
AI at Scale:
Tools and Tips for
Training and Inference
Presenter: Clarisse Taaffe-Hedglin
clarisse@us.ibm.com
Executive AI Architect
IBM Systems
1. Drivers of the AI explosion
2. Implementing use cases at scale
3. Deploying models to the edge
Why AI models now?
USE CASES
ARE EVERYWHERE
IBM Skills Academy / © Copyright 2018 IBM Corporation
Artificial Intelligence brings
new Cognitive Capabilities
• Computers can be trained to "See"
Example: Airport security inspecting luggage
• Computers can be trained to "Hear"
Example: Maintenance crew listening to railcars
• Computers can be trained to "Do": mimic an expert
Example: Mobile phone provider predicting customer churn
Data + Algorithms + Compute
CPU
GPU
FPGA
The key triggers rapidly advancing AI
Open Source Software
MEDIA/ENTERTAINMENT
RETAIL
Reco. Engines,
Precision Mktg
OTHERS
Agriculture,
Remote
Sensing
LIFE SCIENCES
Sequence
Analysis,
Radiology
UTILITIES
Smart Meter
analysis, Capacity
planning
$
FINANCIAL SERVICES
Risk analysis
Fraud detection
CUSTOMER SERVICE
Chatbots, Helpdesk
Automated
Expenses
LAW & DEFENSE
Threat analysis -
social media
monitoring
RESEARCH
Physics
Modeling
HEALTH CARE
Patient sensors,
monitoring, EHRs
TRANSPORTATION
Optimal traffic
flows, Route
planning
CONSUMER GOODS
Sentiment
analysis
Advertising
effectiveness
OIL & GAS
Exploration,
sensor
analysis
AUTOMOTIVE
ADAS,
Maintenance
MANUFACTURING
Line
inspection,
Defect analysis
Addressable market
Cognitive Systems / February 26 / © 2019 IBM Corporation
BIG, COMPLEX SYSTEMS
PERSONALIZATION
AUTOMATION
SIMULATING
RELATIONSHIPS
VISUAL RECOGNITION
PATTERNS
The scenarios AI can
best solve for today
ML Framework Landscape
Which ML frameworks have you used
the most over the last 5 years?
Source: Kaggle Data Science Survey 2018
scikit-learn is, by far, the most widely-used
ML framework
Why?
• Wide variety of ML models
• Good documentation
• Standardized API
Some downsides of scikit-learn are:
1. Lack of support for deep learning (DL)
2. Slow performance for large datasets
Problem (1) is addressed by the DL frameworks in
PowerAI (TensorFlow, PyTorch), recently rebranded
as Watson Machine Learning Accelerator
Problem (2) is addressed by Snap ML
Watson Machine Learning Community Edition
TensorFlow
TensorFlow Probability
TensorBoard
TensorFlow-Keras
BVLC Caffe
IBM Enhanced Caffe
Caffe2
OpenBLAS
HDF5
Curated, tested and pre-compiled binary
software distribution that enables enterprises
to quickly and easily deploy deep learning for
their data science and analytics development
Including all of the following frameworks:
Nvidia RAPIDS
Distributed Deep Learning
Simplifies the process of training
deep learning models across a
cluster for faster time to results.
Software Libraries
WML CE software and the
accelerated Power servers
support a host of accelerator
libraries like SnapML, Nvidia
RAPIDS
Large Model Support
Use system memory with GPUs
to support more complex models
and higher resolution data.
IBM adds value to curated, tested, and
pre-compiled frameworks with
Watson Machine Learning Community Edition
GPU
CPU
Evolving from compute systems to Cognitive Systems
P8 P9 P10
Open Frameworks
Partnerships
Industry Alignment
DevEcosystem
Accelerator Roadmaps
Open Accelerator
Interfaces
Not Just About Hardware Design
hardware
software
+
It's about co-optimization and open
innovation
which just work for ML, DL, and AI
IBM Software
How to get to AI at scale?
Top 5 Error Rate
Distributed Deep
Learning (DDL)
Deep learning training
takes days to weeks
Limited scaling to
multiple x86 servers
PowerAI with DDL
enables scaling to 100s
of GPUs
1 system vs. 64 systems: 16 days down to 7 hours (58x faster)
ResNet-101, ImageNet-22K
Near-ideal scaling to 256 GPUs: 95% scaling with 256 GPUs
ResNet-50, ImageNet-1K
Caffe with PowerAI DDL, running on Minsky (S822LC) Power System
[Figure: speedup vs. number of GPUs (4, 16, 64, 256), compared with ideal scaling]
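To make the scaling figure concrete, here is a small sketch of how scaling efficiency relates to speedup. It assumes that "95% scaling" means measured speedup divided by the number of GPUs; the function names are illustrative and not part of PowerAI DDL.

```python
def scaling_efficiency(speedup: float, n_gpus: int) -> float:
    """Fraction of ideal (linear) speedup actually achieved."""
    return speedup / n_gpus


def achieved_speedup(efficiency: float, n_gpus: int) -> float:
    """Speedup over a single GPU implied by a given scaling efficiency."""
    return efficiency * n_gpus


# Figure from the slide: 95% scaling at 256 GPUs.
speedup_256 = achieved_speedup(0.95, 256)
print(f"Implied speedup at 256 GPUs: {speedup_256:.1f}x (ideal: 256x)")
```

Under that assumption, 256 GPUs deliver roughly the throughput of 243 perfectly scaled GPUs.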
Train larger more complex models
Traditional Model Support
Limited memory on GPU forces tradeoff in model size / data resolution
[Diagram: CPU with DDR4 system memory connected to GPU graphics memory over PCIe; system bottleneck at the PCIe link]
Large Model Support
Use system memory and GPU to support more complex models and higher resolution data
[Diagram: POWER CPU with DDR4 system memory connected to GPU graphics memory over NVLink; POWER NVLink data pipe]
Large AI Models Train
~4 Times Faster
POWER9 Servers with
NVLink to GPUs
vs
x86 Servers with PCIe to
GPUs
[Figure: Caffe with LMS (Large Model Support), runtime of 1000 iterations, time in seconds]
Xeon x86 2640v4 w/ 4x V100 GPUs: 3.1 hours
Power AC922 w/ 4x V100 GPUs: 49 minutes (3.8x faster)
GoogleNet model on enlarged ImageNet dataset (2240x2240)
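The "3.8x faster" figure can be checked from the two runtimes on the chart. This is only the arithmetic, using the rounded values shown on the slide.

```python
# Runtimes read from the chart (1000 iterations, Caffe with LMS).
x86_seconds = 3.1 * 3600   # Xeon x86 2640v4 with 4x V100 GPUs: 3.1 hours
power_seconds = 49 * 60    # Power AC922 with 4x V100 GPUs: 49 minutes

speedup = x86_seconds / power_seconds
print(f"AC922 speedup over x86: {speedup:.1f}x")
```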
TensorFlow Large Model Support NVLINK2 Advantage
3D U-Net segmentation models with
higher resolution images allow for
learning and labeling finer details
and structures of brain tumors.
https://developer.ibm.com/linuxonpower/2018/07/27/tensorflow-large-model-support-case-study-3d-image-segmentation/
Accelerating Machine Learning
Why Fast?
Speed is crucial in many cases:
• online re-training of models
• model selection and hyper-parameter tuning
• fast adaptability to changes
Why Large-Scale?
Large datasets arise in numerous business-critical
applications: recommendation, credit fraud, advertising,
space exploration, weather, etc.
Why Resource-Savvy?
Not everyone can afford on-prem computing.
Renting computing in the cloud is billed by usage.
Less usage means savings, higher profit margin.
Snap ML is a framework for training
Machine Learning (ML) Models
It is characterized by:
• high performance
• scalability to very large datasets
• high resource efficiency
Artificial
Intelligence
Machine
Learning
Deep Learning
(Neural Networks)
Which models are supported?
Snap ML (PowerAI 1.6.0) currently supports:
• Generalized Linear Models:
- Logistic Regression
- Ridge Regression
- Lasso Regression
- Support Vector Machines (SVMs)
• Tree-based models:
- Decision Trees
- Random Forest
With more to come
Source: Kaggle Data Science Survey 2017
Which data science methods are used at work?
Supported
by Snap
ML
Decision Tree Performance Results: on average 6.5x faster than sklearn (CPU-only)
Random Forest Performance Results: on average 3.8x faster than sklearn (CPU-only)
[Chart callouts: 5.2x, 4.5x]
Project www: https://www.zurich.ibm.com/snapml/
Core publication: https://arxiv.org/abs/1803.06333
Nvidia RAPIDS
RAPIDS is a set of open source libraries for GPU accelerating data preparation and machine
learning.
OSS website: rapids.ai
Nvidia RAPIDS cuDF - GPU DataFrames
cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data.
It provides a pandas-like API that will be familiar to data engineers and data scientists.
Current version is 0.6
• PowerAI 1.6.0 includes a back-level tech preview version of cuDF (0.2)
• Work is in progress to get the latest version into Conda; alternatively, build it yourself (open source)
Examples of data manipulation in cuDF, such as object creation, viewing, selection, merge, and concat, can be
found here:
https://rapidsai.github.io/projects/cudf/en/latest/10min.html
Simple cuDF example
Downloads a CSV, then uses the GPU to parse it into rows and columns and run calculations.
[Code and output shown as images on the original slide]
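A minimal sketch of that kind of workflow follows. It is not the code from the original slide. It assumes an inline CSV instead of a download (the column names and values are made up for illustration), and it falls back to pandas when cuDF is not installed, relying on the pandas-like API mentioned above.

```python
import io

try:
    import cudf as xdf    # GPU DataFrame library from RAPIDS
except ImportError:
    import pandas as xdf  # CPU fallback (assumption: the same calls work in both)

# Hypothetical sample data standing in for a downloaded CSV file.
CSV_TEXT = """total_bill,tip,size
16.99,1.01,2
10.34,1.66,3
21.01,3.50,3
23.68,3.31,2
24.59,3.61,4
"""

# Parse the CSV into a DataFrame (on the GPU when cuDF is available).
tips = xdf.read_csv(io.StringIO(CSV_TEXT))

# Derive a new column, then aggregate it per party size.
tips["tip_percentage"] = tips["tip"] / tips["total_bill"] * 100
avg_tip_by_size = tips.groupby("size")["tip_percentage"].mean()
print(avg_tip_by_size)
```

The only line that differs between the GPU and CPU paths is the import, which is the practical meaning of "pandas-like API".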
Nvidia RAPIDS cuML - GPU Machine Learning
cuML is a suite of libraries that implement machine learning algorithms and mathematical primitive functions.
It enables data scientists, researchers, and software engineers to run traditional tabular ML tasks on GPUs.
Current version is 0.6
• PowerAI 1.6.0 includes a back-level tech preview version of cuML (0.2)
• Work is in progress to get the latest version into Conda; alternatively, build it yourself (open source)
Documentation on supported algorithms such as K-means, tSVD, PCA, and DBSCAN can be found here:
https://docs.rapids.ai/api/cuml/stable/
Simple cuML example
Loads input and computes DBSCAN clusters, all on the GPU.
[Code and output shown as images on the original slide]
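A minimal sketch of DBSCAN clustering in this style follows. It is not the code from the original slide. The sample points and the `eps` / `min_samples` values are made up for illustration, the scikit-learn fallback is added so the sketch also runs without a GPU, and it assumes a cuML version that accepts NumPy arrays as input.

```python
import numpy as np

try:
    from cuml import DBSCAN             # GPU implementation from RAPIDS cuML
except ImportError:
    from sklearn.cluster import DBSCAN  # CPU fallback with the same estimator-style API

# Hypothetical input: two tight pairs of points and one isolated point.
points = np.array(
    [[1.0, 1.0], [1.1, 1.0], [5.0, 5.0], [5.1, 5.0], [9.0, 9.0]],
    dtype=np.float32,
)

# Points within eps of each other join a cluster; a cluster needs min_samples points.
dbscan = DBSCAN(eps=0.5, min_samples=2)
dbscan.fit(points)

# Cluster label per input point; -1 marks noise.
labels = [int(label) for label in np.asarray(dbscan.labels_)]
print(labels)
```

The two close pairs should each form a cluster, and the isolated point should be labeled as noise.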
How to deploy at the edge?
COLLECT - Make data simple and accessible
ORGANIZE - Create a trusted analytics foundation
ANALYZE - Scale AI everywhere with trust & transparency
Data of every type, regardless of
where it lives
MODERNIZE
your data estate for an
AI and multicloud world
INFUSE – Operationalize AI across business processes
The AI Ladder
A prescriptive approach to accelerating the journey to AI
AI
AI-optimized systems
infrastructure
AI Open
Source
Frameworks
Introduction to Nvidia TensorRT
NVIDIA TensorRT™ is a platform for high-performance deep learning inference. It includes a deep
learning inference optimizer and runtime that delivers low latency and high throughput for deep
learning inference applications.
Nvidia website: https://developer.nvidia.com/tensorrt
TensorFlow and TensorRT inference
TensorFlow™ integration
with TensorRT™ (TF-TRT)
optimizes and executes
compatible subgraphs,
allowing TensorFlow to execute
the remaining graph. While you
can still use TensorFlow's wide
and flexible feature
set, TensorRT will parse the
model and apply optimizations
to the portions of the graph
wherever possible.
Note: TensorRT engines are optimized for the
currently available GPUs, so conversions should
take place on the machine that will be running
inference.
Calibrating for lower precision with a minimal loss of accuracy
reduces bandwidth requirements and allows for faster
computation. It also allows for the use of Tensor Cores,
which multiply 4×4 FP16 matrices and add
a 4×4 FP16 or FP32 matrix.
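The Tensor Core operation described above has the form D = A × B + C. The sketch below emulates it on the CPU using only the standard library, to show where FP16 rounding enters. It is an illustration and not how a GPU executes the operation: the function names are made up, and a Python float (double precision) stands in for the FP32 accumulator.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def tensor_core_fma(a, b, c):
    """Compute D = A x B + C for 4x4 matrices with FP16-rounded A and B."""
    n = 4
    a16 = [[to_fp16(v) for v in row] for row in a]
    b16 = [[to_fp16(v) for v in row] for row in b]
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = c[i][j]  # accumulator starts from C (wider precision)
            for k in range(n):
                acc += a16[i][k] * b16[k][j]
            d[i][j] = acc
    return d


# Small integers are exactly representable in FP16, so this result is exact.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
b = [[float(i * 4 + j) for j in range(4)] for i in range(4)]
c = [[1.0] * 4 for _ in range(4)]
d = tensor_core_fma(identity, b, c)
print(d[0])

# A value such as 0.1 is not exactly representable and picks up a rounding error.
print(to_fp16(0.1) - 0.1)
```

The rounding error on the inputs is the reason calibration is needed before dropping to lower precision.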
https://devblogs.nvidia.com/tensorrt-integration-speeds-tensorflow-inference/
Nvidia TensorRT Current Version
Version 6 Announced on September 16th (current)
https://news.developer.nvidia.com/tensorrt6-breaks-bert-record/
Version 5.1.3.6 added as a tech preview to WML CE 1.6.1
https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/
Resources
https://developer.ibm.com/linuxonpower/deep-learning-powerai#tab_education
Nvidia TensorRT: https://developer.nvidia.com/tensorrt
WML CE 1.6.1: https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/
TF-TRT Documentation: https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/
IBM TensorRT introduction blog: https://developer.ibm.com/linuxonpower/2019/07/29/introducing-tensorflow-with-tensorrt-tf-trt/
IBM TensorFlow Serving blog (includes TensorRT example): https://developer.ibm.com/linuxonpower/2019/08/05/using-tensorrt-models-with-tensorflow-serving-on-wml-ce/
Image classification and object detection: github.com/tensorflow/tensorrt
Nvidia forum: https://devtalk.nvidia.com/default/board/301/deep-learning-training-and-inference-/
Mixed precision and accuracy: https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9143-mixed-precision-training-of-deep-neural-networks.pdf
Demo: https://github.com/cheeyauk/tf_to_tensorrt
IBM Systems WW Client Experience Centers
IBM Internal Use Only
Search Center Offerings in ISCEP:
https://ibm.biz/client-experience-portal
Contact Center via
IBM Systems Worldwide Client Experience
Centers maximize IBM Systems competitive
advantage in the Cloud and Cognitive era by
providing access to world class technical
experts and infrastructure services to assist
Clients with the transformation of their IT
implementations. Center offerings enable IBM
Sellers and Business Partners to progress and
expedite System Sales opportunities.
9 Worldwide Locations (* also Infrastructure
Hubs):
Austin TX , *Poughkeepsie NY, Rochester MN,
Tucson AZ, *Beijing CHINA, Boeblingen
GERMANY, Guadalajara MEXICO,*Montpellier
FRANCE, Tokyo JAPAN
Client Experience
Tailored, in-depth
technology
Innovation Exchange
Events
Relationship building
Demonstrations
Meetups
Solution workshops
Remote options
(Inbound & Outbound)
Infrastructure
Solutions
Benchmarks, MVP & Proof
of Technology
"Test Drives"
Demonstrations
Infrastructure Services
Certify ISV solutions
Hosting
Cloud Environment
(Inbound to Centers)
Architecture &
Design
Advise clients, Enable
Sellers, "Art of the
Possible"
Discovery & Design
Workshops, Consulting,
Showcases, Reference
Architectures, Co-
Creation of assets
Included CSSC
(Inbound & Outbound)
Content
Content Development
IBM Redbooks
Training Courses
Video courses
"Test Drives"
Demonstrations
NEW: Co-Creation Lab; CEC Cloud; IBM Systems Center of Competency for Red
Hat
Please note
IBM's statements regarding its plans, directions, and intent are subject to change
or withdrawal without notice and at IBM's sole discretion.
Information regarding potential future products is intended to outline our general
product direction and it should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment, promise,
or legal obligation to deliver any material, code or functionality. Information about potential
future products may not be incorporated into any contract.
The development, release, and timing of any future features or functionality described for our
products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in
a controlled environment. The actual throughput or performance that any user will
experience will vary depending upon many factors, including considerations such as the
amount of multiprogramming in the user's job stream, the I/O configuration, the storage
configuration, and the workload processed. Therefore, no assurance can be given that an
individual user will achieve results similar to those stated here.
Notices and disclaimers
© 2018 International Business Machines Corporation. No part of this
document may be reproduced or transmitted in any form without
written permission from IBM.
U.S. Government Users Restricted Rights - use, duplication or
disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to
products that have not yet been announced by IBM) has been reviewed
for accuracy as of the date of initial publication and could include
unintentional technical or typographical errors. IBM shall have no
responsibility to update this information. This document is distributed
"as is" without any warranty, either express or implied. In no event,
shall IBM be liable for any damage arising from the use of this
information, including but not limited to, loss of data, business
interruption, loss of profit or loss of opportunity. IBM products and
services are warranted per the terms and conditions of the agreements
under which they are provided.
IBM products are manufactured from new parts or new and used parts.
In some cases, a product may not be new and may have been previously
installed. Regardless, our warranty terms apply.
Any statements regarding IBM's future direction, intent or product
plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in controlled,
isolated environments. Customer examples are presented as illustrations of
how those customers have used IBM products and the results they may have
achieved. Actual performance, cost, savings or other results in other
operating environments may vary.
References in this document to IBM products, programs, or services do
not imply that IBM intends to make such products, programs or services
available in all countries in which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by
independent session speakers, and do not necessarily reflect the views of
IBM. All materials and discussions are provided for informational purposes
only, and are neither intended to, nor shall constitute legal or other guidance
or advice to any individual participant or their specific situation.
It is the customer's responsibility to ensure its own compliance with legal
requirements and to obtain advice of competent legal counsel as to
the identification and interpretation of any relevant laws and regulatory
requirements that may affect the customer's business and any actions the
customer may need to take to comply with such laws. IBM does not provide
legal advice or represent or warrant that its services or products will ensure
that the customer follows any law.
Notices and disclaimers
continued
Information concerning non-IBM products was obtained from the
suppliers of those products, their published announcements or other
publicly available sources. IBM has not tested those products in connection with this
publication and cannot confirm the accuracy of performance, compatibility
or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of
those products. IBM does not warrant the quality of any third-party
products, or the ability of any such third-party products to
interoperate with IBM's products. IBM expressly disclaims all
warranties, expressed or implied, including but not limited to, the
implied warranties of merchantability and fitness for a purpose.
The provision of the information contained herein is not intended to, and
does not, grant any right or license under any IBM patents, copyrights,
trademarks or other intellectual property right.
IBM, the IBM logo, ibm.com and [names of other referenced IBM
products and services used in the presentation] are trademarks of
International Business Machines Corporation, registered in many
jurisdictions worldwide. Other product and service names might
be trademarks of IBM or other companies. A current list of IBM
trademarks is available on the Web at "Copyright and trademark
information" at: www.ibm.com/legal/copytrade.shtml.
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 

Innovation with ai at scale on the edge vt sept 2019 v0

  • 1. Innovating with AI at Scale: Tools and Tips for Training and Inference Presenter: Clarisse Taaffe-Hedglin clarisse@us.ibm.com Executive AI Architect IBM Systems
  • 2. 1. Drivers of the AI explosion 2. Implementing use cases at scale 3. Deploying models to the edge
  • 3. Why AI models now?
  • 4. USE CASES ARE EVERYWHERE IBM Skills Academy / © Copyright 2018 IBM Corporation
  • 5. Artificial Intelligence brings new Cognitive Capabilities • Computers can be trained to "See". Example: Airport security inspecting luggage • Computers can be trained to "Hear". Example: Maintenance crew listening to railcars • Computers can be trained to "do": mimic an expert. Example: Mobile phone provider predicting customer churn
  • 6. Data + Algorithms + Compute CPU GPU FPGA The key triggers rapidly advancing AI Open Source Software
  • 7. MEDIA/ENTERTAINMENT RETAIL Reco. Engines, Precision Mktg OTHERS Agriculture, Remote Sensing LIFE SCIENCES Sequence Analysis, Radiology UTILITIES Smart Meter analysis, Capacity planning $ FINANCIAL SERVICES Risk analysis Fraud detection CUSTOMER SERVICE Chatbots, Helpdesk Automated Expenses LAW & DEFENSE Threat analysis - social media monitoring RESEARCH Physics Modeling HEALTH CARE Patient sensors, monitoring, EHRs TRANSPORTATION Optimal traffic flows, Route planning CONSUMER GOODS Sentiment analysis Advertising effectiveness OIL & GAS Exploration, sensor analysis AUTOMOTIVE ADAS, Maintenance MANUFACTURING Line inspection, Defect analysis Addressable market Cognitive Systems / February 26 / © 2019 IBM Corporation
  • 8. BIG, COMPLEX SYSTEMS PERSONALIZATION AUTOMATION SIMULATING RELATIONSHIPS VISUAL RECOGNITION PATTERNS The scenarios AI can best solve for today
  • 9. ML Framework Landscape. Which ML frameworks have you used the most over the last 5 years? Source: Kaggle Data Science Survey 2018. scikit-learn is, by far, the most widely used ML framework. Why? • Wide variety of ML models • Good documentation • Standardized API. Some downsides of scikit-learn are: 1. Lack of support for deep learning (DL) 2. Slow performance on large datasets. Problem (1) is addressed by the DL frameworks in PowerAI (TensorFlow, PyTorch), recently rebranded as Watson Machine Learning Accelerator. Problem (2) is addressed by Snap ML.
  • 10. Watson Machine Learning Community Edition TensorFlow TensorFlow Probability TensorBoard TensorFlow-Keras BVLC Caffe IBM Enhanced Caffe Caffe2 OpenBLAS HDF5 Curated, tested and pre-compiled binary software distribution that enables enterprises to quickly and easily deploy deep learning for their data science and analytics development Including all of the following frameworks: Nvidia RAPIDS
  • 11. Distributed Deep Learning Simplifies the process of training deep learning models across a cluster for faster time to results. Software Libraries WML CE software and the accelerated Power servers support a host of accelerator libraries like SnapML, Nvidia RAPIDS Large Model Support Use system memory with GPUs to support more complex models and higher resolution data. IBM adds value to curated, tested, and pre-compiled frameworks with Watson Machine Learning Community Edition GPU CPU
  • 12. Evolving from compute systems to Cognitive Systems P8 P9 P10 Open Frameworks Partnerships Industry Alignment DevEcosystem Accelerator Roadmaps Open Accelerator Interfaces Not Just About Hardware Design hardware + software. It's about co-optimization and open innovation which just work for ML, DL, and AI IBM Software
  • 13. How to get to AI at scale?
  • 14. 14
  • 15. Top 5 Error Rate
  • 16. Distributed Deep Learning (DDL). Deep learning training takes days to weeks. Limited scaling to multiple x86 servers. PowerAI with DDL enables scaling to 100s of GPUs. 1 system vs. 64 systems: 16 days down to 7 hours, 58x faster (ResNet-101, ImageNet-22K). Near-ideal scaling to 256 GPUs: 95% scaling with 256 GPUs (ResNet-50, ImageNet-1K; Caffe with PowerAI DDL, running on Minsky (S822LC) Power System). [Chart: speedup vs. number of GPUs, plotted against ideal scaling]
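The scaling figures on this slide can be sanity-checked with a few lines of arithmetic. The sketch below is not from the deck: the function names are hypothetical, and it assumes scaling efficiency is defined as observed speedup divided by the increase in resources. Note that the rounded durations (16 days, 7 hours) imply roughly 55x, so the quoted 58x presumably comes from unrounded timings.

```python
def speedup(baseline_hours: float, scaled_hours: float) -> float:
    """Ratio of the single-system runtime to the scaled-out runtime."""
    return baseline_hours / scaled_hours


def scaling_efficiency(observed_speedup: float, resource_ratio: float) -> float:
    """Fraction of ideal (linear) scaling actually achieved."""
    return observed_speedup / resource_ratio


# Figures quoted on the slide: 16 days on 1 system, 7 hours on 64 systems.
implied = speedup(16 * 24, 7)             # about 54.9x from the rounded durations
quoted_eff = scaling_efficiency(58, 64)   # about 0.91 using the quoted 58x

print(f"implied speedup: {implied:.1f}x")
print(f"efficiency at the quoted 58x on 64 systems: {quoted_eff:.1%}")
```

The 95% figure on the slide refers to a different benchmark (ResNet-50 on ImageNet-1K), so the two efficiency numbers are not directly comparable.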
  • 17. 17
  • 18. Train larger, more complex models. Traditional Model Support: limited memory on GPU forces tradeoff in model size / data resolution. [Diagram: CPU, DDR4, GPU, PCIe, Graphics Memory; "System Bottleneck Here"] Large Model Support: use system memory and GPU to support more complex models and higher resolution data. [Diagram: POWER CPU, DDR4, GPU, NVLink, Graphics Memory; POWER NVLink Data Pipe]
  • 19. Large AI Models Train ~4 Times Faster. POWER9 Servers with NVLink to GPUs vs x86 Servers with PCIe to GPUs. Caffe with LMS (Large Model Support), runtime of 1000 iterations: 3.1 hours on Xeon x86 2640v4 w/ 4x V100 GPUs vs. 49 mins on Power AC922 w/ 4x V100 GPUs (3.8x faster). GoogleNet model on Enlarged ImageNet Dataset (2240x2240). [Bar chart: time in seconds]
  • 20. TensorFlow Large Model Support NVLINK2 Advantage s. 3DUnet segmentation models with higher resolution images allows for learning and labeling finer details and structures of brain tumors. https://developer.ibm.com/linuxonpower/2018/07/27/tensorflow-large-model-support-case-study-3d-image-segmentation/
  • 21. Accelerating Machine Learning. Why Fast? Speed is crucial in many cases: • online re-training of models • model selection and hyper-parameter tuning • fast adaptability to changes. Why Large-Scale? Large datasets arise in numerous business-critical applications: recommendation, credit fraud, advertising, space exploration, weather, etc. Why Resource-Savvy? Not everyone can afford on-prem computing. Renting computing in the cloud is billed by usage. Less usage means savings and a higher profit margin. Snap ML is a framework for training Machine Learning (ML) models. It is characterized by: • high performance • scalability to very large datasets • high resource efficiency. [Diagram: Artificial Intelligence, Machine Learning, Deep Learning (Neural Networks)]
  • 22. Which models are supported? Snap ML (PowerAI 1.6.0) currently supports: • Generalized Linear Models: Logistic Regression, Ridge Regression, Lasso Regression, Support Vector Machines (SVMs) • Tree-based models: Decision Trees, Random Forest. With more to come. [Chart: Which data science methods are used at work? Source: Kaggle Data Science Survey 2017. Legend: Supported by Snap ML]
  • 23. Decision Tree Performance Results; Random Forest Performance Results (chart labels: 5.2x, 4.5x). On average 6.5x faster than sklearn (CPU-only); on average 3.8x faster than sklearn (CPU-only). Project www: https://www.zurich.ibm.com/snapml/ Core publication: https://arxiv.org/abs/1803.06333
  • 24. Nvidia RAPIDS RAPIDS is a set of open source libraries for GPU accelerating data preparation and machine learning. OSS website: rapids.ai
  • 25. Nvidia RAPIDS cuDF (GPU DataFrames) is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data. It provides a pandas-like API that will be familiar to data engineers and data scientists. • Current version is 0.6 • PowerAI 1.6.0: the included cuDF tech preview version is backlevel (0.2) • WIP to get the latest into Conda, or build it yourself (open source). Examples of data manipulation in cuDF such as object creation, viewing, selection, merge, and concat can be found here: https://rapidsai.github.io/projects/cudf/en/latest/10min.html
  • 26. Simple cuDF example: downloads a CSV, then uses the GPU to parse it into rows and columns and run calculations.
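The code shown on this slide did not survive in the transcript. The following is a minimal sketch of the kind of workflow it describes, not the presenter's original: it assumes a recent RAPIDS release, replaces the download with a small hypothetical inline CSV (`CSV_TEXT`), and uses a hypothetical helper name. `cudf` is imported inside the function because it requires an NVIDIA GPU and a RAPIDS installation.

```python
import io

# Hypothetical inline data standing in for the downloaded CSV file.
CSV_TEXT = """total_bill,tip,size
16.99,1.01,2
10.34,1.66,3
21.01,3.50,3
23.68,3.31,2
"""


def mean_tip_percentage_by_size(csv_text: str):
    """Parse CSV text into a GPU DataFrame and average the tip percentage per party size."""
    import cudf  # lazy import: needs an NVIDIA GPU and a RAPIDS install

    tips = cudf.read_csv(io.StringIO(csv_text))                       # parsed on the GPU
    tips["tip_percentage"] = tips["tip"] / tips["total_bill"] * 100   # column arithmetic on the GPU
    return tips.groupby("size")["tip_percentage"].mean()
```

On a machine with RAPIDS installed, `print(mean_tip_percentage_by_size(CSV_TEXT))` would show one mean per party size. The calls mirror their pandas equivalents, which is the point the previous slide makes about the pandas-like API.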
  • 27. Nvidia RAPIDS cuML (GPU Machine Learning) is a suite of libraries that implement machine learning algorithms and mathematical primitive functions. It enables data scientists, researchers, and software engineers to run traditional tabular ML tasks on GPUs. • Current version is 0.6 • PowerAI 1.6.0: the included cuML tech preview version is backlevel (0.2) • WIP to get the latest into Conda, or build it yourself (open source). Documentation on supported algorithms such as K-means, tSVD, PCA, and DBSCAN can be found here: https://docs.rapids.ai/api/cuml/stable/
  • 28. Simple cuML example: loads input and computes DBSCAN clusters, all on GPU.
  • 29. How to deploy at the edge?
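The slide's code image is missing from the transcript. As a hedged illustration only, a DBSCAN run with cuML might look like the sketch below. It assumes a recent RAPIDS release; the sample points, the column names, and the helper name are hypothetical, and the imports are deferred because they require a GPU.

```python
# Hypothetical sample input: three points with three features each.
POINTS = {
    "x": [1.0, 2.0, 5.0],
    "y": [4.0, 2.0, 1.0],
    "z": [4.0, 2.0, 1.0],
}


def dbscan_labels(columns: dict, eps: float = 1.0, min_samples: int = 1):
    """Load the columns into a GPU DataFrame and return DBSCAN cluster labels."""
    import cudf                      # lazy imports: both need an NVIDIA GPU
    from cuml.cluster import DBSCAN  # and a RAPIDS installation

    gdf = cudf.DataFrame(columns)                      # data lives in GPU memory
    model = DBSCAN(eps=eps, min_samples=min_samples)   # scikit-learn style estimator
    model.fit(gdf)                                     # clustering runs on the GPU
    return model.labels_
```

Calling `dbscan_labels(POINTS)` on a RAPIDS-enabled machine returns one cluster label per input row. The estimator interface follows the scikit-learn `fit` convention.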
  • 30. The AI Ladder: a prescriptive approach to accelerating the journey to AI. COLLECT - Make data simple and accessible. ORGANIZE - Create a trusted analytics foundation. ANALYZE - Scale AI everywhere with trust & transparency. INFUSE - Operationalize AI across business processes. MODERNIZE your data estate for an AI and multicloud world. Data of every type, regardless of where it lives. AI-optimized systems infrastructure.
  • 32. Introduction to Nvidia TensorRT. NVIDIA TensorRTℱ is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. Nvidia website: https://developer.nvidia.com/tensorrt
  • 33. TensorFlow and TensorRT inference. TensorFlowℱ integration with TensorRTℱ (TF-TRT) optimizes and executes compatible subgraphs, allowing TensorFlow to execute the remaining graph. While you can still use TensorFlow's wide and flexible feature set, TensorRT will parse the model and apply optimizations to the portions of the graph wherever possible.
  • 34.
  • 35.
  • 36. Note: TensorRT engines are optimized for the currently available GPUs, so conversions should take place on the machine that will be running inference.
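Slides 34 to 36 carried code that is absent from the transcript. The sketch below shows what a TF-TRT conversion of a SavedModel can look like. It is not the presenter's code: it assumes the TensorFlow 1.14/1.15 `TrtGraphConverter` API (TensorFlow 2.x uses a different converter class), and the function name and directory arguments are hypothetical. In line with the note above, it should be run on the machine that will serve inference.

```python
def convert_saved_model(input_dir: str, output_dir: str, precision: str = "FP16") -> None:
    """Rewrite a SavedModel so TensorRT-compatible subgraphs run as TensorRT ops."""
    # Lazy import: requires a TensorFlow 1.14/1.15 build with TensorRT support.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    converter = trt.TrtGraphConverter(
        input_saved_model_dir=input_dir,  # hypothetical path to the trained model
        precision_mode=precision,         # "FP32", "FP16", or "INT8"
    )
    converter.convert()         # unsupported ops stay as regular TensorFlow ops
    converter.save(output_dir)  # write the optimized SavedModel
```

A call such as `convert_saved_model("resnet_saved_model", "resnet_trt_fp16")` would produce a model directory that loads like any other SavedModel. `"INT8"` additionally needs a calibration step, which this sketch omits.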
  • 37. Calibrating for lower precision with a minimal loss of accuracy reduces the requirements on bandwidth and allows for faster computation speed. It also allows for the use of Tensor Cores, which multiply two 4×4 FP16 matrices and add a 4×4 FP16 or FP32 matrix to the result.
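Written out, the Tensor Core operation described on this slide is a fused multiply-add on small matrix tiles. The symbols A, B, C, and D are labels introduced here for illustration, not notation from the deck:

```latex
D = A \times B + C,
\qquad A,\, B \in \mathrm{FP16}^{4 \times 4},
\qquad C,\, D \in \mathrm{FP16}^{4 \times 4} \ \text{or}\ \mathrm{FP32}^{4 \times 4}
```

Accumulating into an FP32 matrix is what lets the FP16 inputs be used with limited loss of accuracy.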
  • 39.
  • 40. Nvidia TensorRT Current Version Version 6 Announced on September 16th (current) https://news.developer.nvidia.com/tensorrt6-breaks-bert-record/ Version 5.1.3.6 added as a tech preview to WML CE 1.6.1 https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/
  • 41. Resources https://developer.ibm.com/linuxonpower/deep-learning-powerai#tab_education Nvidia TensorRT: https://developer.nvidia.com/tensorrt WML CE 1.6.1: https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/ TF-TRT Documentation: https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/ IBM TensorRT introduction blog: https://developer.ibm.com/linuxonpower/2019/07/29/introducing-tensorflow-with-tensorrt-tf-trt/ IBM Tensorflow Serving blog (includes TensorRT example): https://developer.ibm.com/linuxonpower/2019/08/05/using-tensorrt-models-with-tensorflow-serving-on-wml-ce/ Image classification and object detection: github.com/tensorflow/tensorrt Nvidia forum: https://devtalk.nvidia.com/default/board/301/deep-learning-training-and-inference-/ Mixed precision and accuracy: https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9143-mixed-precision-training-of-deep-neural-networks.pdf Demo: https://github.com/cheeyauk/tf_to_tensorrt
  • 42.
  • 43. IBM Systems WW Client Experience Centers IBM Internal Use Only Search Center Offerings in ISCEP: https://ibm.biz/client-experience-portal Contact Center via IBM Systems Worldwide Client Experience Centers maximize IBM Systems competitive advantage in the Cloud and Cognitive era by providing access to world class technical experts and infrastructure services to assist Clients with the transformation of their IT implementations. Center offerings enable IBM Sellers and Business Partners to progress and expedite System Sales opportunities. 9 Worldwide Locations (* also Infrastructure Hubs): Austin TX , *Poughkeepsie NY, Rochester MN, Tucson AZ, *Beijing CHINA, Boeblingen GERMANY, Guadalajara MEXICO,*Montpellier FRANCE, Tokyo JAPAN Client Experience Tailored, in-depth technology Innovation Exchange Events Relationship building Demonstrations Meetups Solution workshops Remote options (Inbound & Outbound) Infrastructure Solutions Benchmarks, MVP & Proof of Technology “Test Drives” Demonstrations Infrastructure Services Certify ISV solutions Hosting Cloud Environment (Inbound to Centers) Architecture & Design Advise clients, Enable Sellers, “Art of the Possible” Discovery & Design Workshops, Consulting, Showcases, Reference Architectures, Co- Creation of assets Included CSSC (Inbound & Outbound) Content Content Development IBM Redbooks Training Courses Video courses “Test Drives” Demonstrations NEW: Co-Creation Lab; CEC Cloud; IBM Systems Center of Competency for Red Hat
  • 44. Please note: IBM's statements regarding its plans, directions, and intent are subject to change or withdrawal without notice and at IBM's sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion. Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
  • 45. Notices and disclaimers. © 2018 International Business Machines Corporation. No part of this document may be reproduced or transmitted in any form without written permission from IBM. U.S. Government Users Restricted Rights - use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM. Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. This document is distributed "as is" without any warranty, either express or implied. In no event shall IBM be liable for any damage arising from the use of this information, including but not limited to, loss of data, business interruption, loss of profit or loss of opportunity. IBM products and services are warranted per the terms and conditions of the agreements under which they are provided. IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply. Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice. Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary. References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation. It is the customer's responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer's business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer follows any law.
  • 46. Notices and disclaimers continued. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products about this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM's products. IBM expressly disclaims all warranties, expressed or implied, including but not limited to, the implied warranties of merchantability and fitness for a purpose. The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right. IBM, the IBM logo, ibm.com and [names of other referenced IBM products and services used in the presentation] are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.

Editor's notes

  1. So what is triggering the rapid advancements in AI? It comes from major innovation in three critical categories: 1) Digitization of society is creating an abundance of interesting datasets, inside and outside the enterprise, and that continues to grow about 40% per year. 2) Algorithm innovation in supervised & unsupervised learning techniques, especially Deep Learning, most of which is advancing in open source. 3) Ability to run those algorithms on distributed compute and especially on GPUs. Together, these developments have allowed us to employ AI on any problem where a human can get a task done in less than 1 second of thought. It's in this scope of problems where AI is being applied, and it's being wielded to create a flywheel: Data -> Products -> Users. Which is why competing on algorithms alone is not a defensible model. REFERENCE NOTES: Top trends: 99% of commercial value associated with A->B: 0s or 1s. This is called supervised learning. Speech Recognition: Audio -> Text. Image Recognition. Types of Deep Learning: Supervised Learning: Learn from labeled datasets. Most economic value is here and drops off quickly through below. Transfer Learning: Learn about one topic. Apply to another domain. Unsupervised Learning: Learning without labeled data. Reinforcement Learning. The rise of the internet via analogy: Shopping mall + internet doesn't make an internet/ecommerce company. What defines whether you are truly an internet company? A) architect the organizational design to take advantage of the internet. For instance, A/B tests, short cycle times, push decision making down to PM/dev. The rise of the AI era: Traditional tech company + deep learning doesn't make it an AI company. Although only some patterns exist, Google & Baidu are good examples. Other patterns: a) strategic data acquisition, b) unified data 'warehouse', c) persuasive automation, d) new job descriptions. Building an AI company, centrally build an AI group and matrix them into your AI.
  2. When working with clients, these are the top AI scenarios to look for as you explore their potential AI use cases.
  3. The genesis of IBM PowerAI (now known as Watson Machine Learning Community Edition - WML CE) was to make it simple for data scientists to be more productive, more quickly, by greatly simplifying the tasks necessary to get up and running. WML CE is an enterprise software distribution that combines popular open source deep learning frameworks, efficient AI development tools, and accelerated IBM Power Systems servers to take your deep learning projects to the next level. For a fee, IBM offers formal support for WML CE components as long as their versions are consistent with the release configuration (NOTE that WML CE is a no charge offering but we do offer support for a fee). If you choose to use a different version of any of the components, no formal support will be available. However, in keeping with industry norms, specific questions can be posted on the WML CE space on DeveloperWorks Answers: https://developer.ibm.com/answers/topics/powerai/. This forum is monitored by the IBM technical team and technical support is provided on a best effort basis. There are several ways for you to get WML CE. Order it: WML CE is available as a no charge orderable part number from IBM (called PowerAI until 2H2019). Download it from here: http://ibm.biz/download-powerai Get the Docker container from here: https://hub.docker.com/r/ibmcom/powerai/ As of WML CE (PowerAI) 1.5.4, the following frameworks are included in WML CE (make sure to check the Knowledge Center for the latest versions as they change rapidly: https://www.ibm.com/support/knowledgecenter/SS5SF7_1.5.4/navigation/pai_software_pkgs.html): DDL 1.2.0 - Distributed Deep Learning (with support for up to 4 nodes in WML CE); TensorFlow 1.12.0; TensorFlow Probability 0.5.0 - a library for probabilistic reasoning and statistical analysis;
TensorBoard 1.12.0 - a suite of visualization tools for TensorFlow; TensorFlow Keras - NOTE that Keras is supported as part of the TensorFlow core library and as such we can support Keras through TensorFlow; IBM enhanced Caffe 1.0.0; BVLC Caffe 1.0.0 - The Berkeley Vision and Learning Center (BVLC); Caffe2 1.0rc1 - in technology preview; PyTorch 1.0rc1; Snap ML 1.0.0; Spectrum MPI 10.2; Bazel 0.15.0; OpenBLAS 0.3.3; HDF5 1.10.1; Protobuf 3.6.1; ONNX 1.3.0 - in technology preview
  4. There are three additional capabilities on top of the open source frameworks (and in addition to the performance advantage that Power brings to the table): Large Model Support (LMS), Distributed Deep Learning (DDL), and support by IBM. Large Model Support: WML CE addresses a fundamental limitation for deep learning, the size of memory available within GPUs. When training complex models or training with high definition images, the memory available on a GPU can be prohibitively restrictive. Instead of being forced into less complex, shallower deep learning models, customers can develop more accurate models with Large Model Support. With Large Model Support, enabled by IBM's unique NVLink connection between CPU (memory) and GPU, the entire model and dataset can be loaded into system memory and cached down to the GPU for action. Customers can now address bigger challenges and get much more work done within a cluster of WML CE servers, increasing organizational efficiency. We will cover more details on LMS later in this deck. Distributed Deep Learning: To accelerate the time dedicated to training a model, the WML CE stack includes a function for distributing a single training job across a cluster of servers. IBM's Distributed Deep Learning brings intelligence about the structure and layout of the underlying hardware cluster (topology). The impact of this is significant! WML CE and WML-A with Distributed Deep Learning can scale jobs across large numbers of cluster resources with very little loss due to communications overhead. There will be more details later in the presentation. WML CE allows for the use of DDL with up to a 4 node cluster. If a client wants to scale beyond 4 nodes, they must purchase WML-A. Supported by IBM: Although WML CE is available free to download and use, IBM also provides a "for fee" support offering for those clients that want enterprise level support for the features and capabilities within the base offering.
  5. We normally would focus on the HW optimization starting with the processor, the IO interfaces enabled by this processor and then what accelerators we would align to those interfaces for the optimal performance. And we are doing that today, however, it is not just about the HW. As I mentioned on the previous slide, we co-optimized the SW. We took the opensource deep learning frameworks and optimized them around this advanced design, added enhancements such as spark conductor for DDL and large model support while supporting everything from the HW to the SW in the solution. Not only do we have differentiated HW in AC922 with many industry only innovation, but we have a full SW offering on top of it that is equally rich of differentiated innovation and innovations only found with Power Systems.
  6. It's estimated that 1.2 trillion photos will be taken in 2017. Even if each photo only took someone 1 second to organize, tag and annotate, it would still take over 38,000 years to classify them all! There is a competition every year, known as ImageNet. Roughly 500,000 images (low resolution) and 200 categories for which to classify them.
  7. We talked about this earlier: it's all about maximizing accuracy (or minimizing error/loss). One way to get more accurate models is to simply add more layers. The more layers, the more complex the model and the more difficult (computationally) it becomes to train.
  8. Distributed Deep Learning (DDL) is IBM's high-performance approach to training single models across an entire cluster of compute nodes. Unlike native model parallelism (such as Google's gRPC method for TensorFlow) or Spark-based approaches, the DDL library distributes the model, training data set, and parameter serving across the defined cluster, and it uses a novel algorithm to improve communication over a very low latency fabric. The result is extremely efficient performance scaling, losing less than 5% of ideal efficiency when moving from 4 GPUs to 64 GPUs. This was available as a technology preview within PowerAI, but is now supported in PowerAI Enterprise. The outcome of this capability is that data science teams can run larger, more complex models while still reducing training time, allowing more iterations and a faster time to accurate results.
  9. https://www.olcf.ornl.gov/wp-content/uploads/2018/12/summit_training_mldl.pdf https://vimeo.com/307071617 Junqi Yin Advanced Data and Workflows Group
  10. Watson Machine Learning Accelerator addresses memory constraints within Deep Learning Large Model Support Watson Machine Learning Accelerator (WML-A) addresses a very big deep learning scaling challenge: the size of memory available within GPUs. When data scientists develop a deep learning workload, the structure of matrices in the neural model, and the data elements which train the model (in a batch), must sit within the memory on the GPUs. As models grow in complexity and data sets increase in size, data scientists are forced to make tradeoffs to stay within the constrained 32GB (or even 16GB on older GPU cards) memory limits. Instead of training on web-scale images, WML-A users can train on high definition video. Instead of being forced in to less complex, shallower deep learning models, customers can develop more accurate models for better inference capability. With Large Model Support, enabled by WML-A’s unique NVLink connection between CPU (memory) and GPU, the entire model and dataset can be loaded in to system memory and cached down to the GPU for action. IBM’s capabilities, with the co-optimized WML-A software on the Power Systems servers, have enabled increased model size (more layers, larger matrices), increased data element sizes (higher definition images), and larger batch sizes (for faster time to convergence). With Large Model Support, data scientists can load models which span nearly an entire terabyte of system memory across the GPUs. The final impact? Customers can now address bigger challenges and get much more work done within a cluster of WML-A servers increasing organizational efficiency.
  11. Not only do large models allow data scientists to work with more complex data, it turns out that for certain models because they rely on pulling significantly larger number of data elements to the training cycle that large models will allow training jobs to actually complete faster. By using the entire system memory resource that is available, Data scientists are able to operate much more efficiently within each single server. The outcome of being able to use larger data and train faster is a significant advantage for power AI enterprise, and is only available operate at this scale because of the architectural choices IBM and Nvidia have made in developing this accelerated architecture.
  12. When you need to retrain models frequently, multiple times per day: cybersecurity threats on your critical infrastructure (e.g. energy grid), credit card fraud detection models. Online retraining: e.g. anomaly detection on your compute or storage infrastructure, where you want to learn constantly from new events to improve the model.
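As an illustration of what online retraining means in practice, here is a minimal sketch of a model that is updated one event at a time instead of being retrained from scratch. It uses plain logistic regression with stochastic gradient descent in pure Python. This is a generic technique chosen for illustration, not the method or library used for the results in this deck, and the class name, learning rate, and toy event stream are all hypothetical.

```python
import math


class OnlineLogisticRegression:
    """Logistic regression updated incrementally, one labeled event at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        """Estimated probability that event x is an anomaly (label 1)."""
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        """One SGD step on the log-loss for a single new event (x, y)."""
        error = self.predict_proba(x) - y
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


# Toy event stream (hypothetical): one feature, high values are anomalies.
model = OnlineLogisticRegression(n_features=1)
stream = [([2.0], 1), ([-2.0], 0)] * 50
for features, label in stream:
    model.update(features, label)  # the model learns as each event arrives
```

The design point is that each new event costs one cheap update, so the model can keep up with a continuous stream. Fast full retraining, as discussed on this slide, is the alternative when the whole model must be rebuilt several times a day.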
  13. These are all Power-9 results, CPU-only. Datasets: Epsilon: 300K x 2000; Higgs: 8M x 28; Creditcard: 200K x 28; Susy: 3.75M x 18.
  14. This is our prescriptive approach to helping clients accelerate their journey to AI, which connects their data and AI capabilities within a unified data and AI lifecycle (or platform). It is also a way to help our clients identify where they are and where to focus, based on their maturity on the journey to AI. Furthermore, it is an organizing construct for the Data and AI products and services offered by IBM and our business partners, and it is the technology foundation that unifies how those products and services work together. What we have learned from AI pioneers is that every step of the ladder is critical. AI is not magic and requires a thoughtful and well-architected approach. For example, the vast majority of AI failures are due to data preparation and organization, not the AI models themselves. Success with AI models depends on first achieving success with how you COLLECT and ORGANIZE data. Therefore, we believe clients must:
COLLECT – Establish a strong foundation of data, making it simple and accessible, regardless of where that data resides. Since data used in AI is often very dynamic and fluid with ever-expanding sources, virtualizing how data is collected is critical for clients.
ORGANIZE – Create a trusted, business-ready analytics foundation that ensures your data is ready for AI. Just because you can access your data doesn’t mean that it’s prepared for AI use cases. Bad data is paralyzing to AI. So clients must integrate, cleanse, catalog, and govern the full lifecycle of their AI data.
ANALYZE – Once your data is accessible and AI-ready, you are better prepared to apply advanced analytics and AI models. This rung provides the business and planning analytics capabilities that are key for success with AI. It also provides the capabilities needed to build, deploy, and manage AI models within an integrated portfolio of technology.
INFUSE – Many businesses create highly useful AI models but then encounter challenges in operationalizing them to attain broader business value. This rung of the ladder infuses AI to achieve trust and transparency in model-recommended decisions, decision explainability, bias detection, decision audits, etc. For clients with common use cases, the INFUSE rung operationalizes those AI use cases with pre-built application services, speeding time to value.
MODERNIZE – Given the dynamic nature of AI, your data estate needs a highly elastic and extensible multi-cloud infrastructure to unify the aforementioned capabilities within a fully governed team platform. Clients are also looking to automate their AI lifecycles across an array of contributors through collaborative workflows. Essentially, MODERNIZE means building an information architecture for AI that provides choice and flexibility across your enterprise. As clients modernize their data estates for an AI and multicloud world, they will find that there is less "assembly required" in expanding the impact of AI across the organization.
  15. This is the IBM Cloud Architecture Center high-level reference architecture. A data-centric and AI reference architecture needs to support capabilities that address the Collect, Organize, Analyze and Infuse activities. This architecture diagram illustrates the need for strong data management capabilities inside a 'multi cloud data platform' (dark blue area), into which AI capabilities are plugged to support the analysis done by data scientists (machine learning workbench and business analytics). The data platform addresses the data collection and transformation needed to move data to a local, highly scalable store. Sometimes it is better to avoid moving data, for example when no transformations are needed or when adding readers has no performance impact on the original data sources, so a virtualization capability is necessary to open a view on remote data sources without moving data. On the AI side, data scientists need to perform data analysis, which includes making sense of the data using data visualization. To build a model they need to define features, and the AI environment supports feature engineering. Then, to build the model, the development environment helps to select and combine the different algorithms and to tune the hyperparameters. Execution can run on a local cluster or, at big-data scale, on a machine learning cluster. Once the model provides an acceptable accuracy level, it can be published as a service. The model management capability supports the metadata definition and the life cycle management of the model. When the model is deployed, a monitoring capability ensures the model is still accurate and not biased. The intelligent application, represented as a combination of capabilities at the top of the diagram: business process, core application, CRM... can run on cloud, fog, or mist.
It accesses the deployed model, accesses data using APIs, and even consumes pre-built models and cognitive services, such as speech-to-text and text-to-speech services, image recognition, tone analysis, Natural Language Understanding (NLU), and chatbots.
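The monitoring step described above, checking that a deployed model is still accurate, can be sketched as a rolling accuracy check. This is a minimal illustration in pure Python, not an API of any IBM product. The class name, window size, and accuracy threshold are hypothetical, and it assumes the true outcome for each prediction eventually becomes available.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent predictions of a deployed model."""

    def __init__(self, window_size=100, min_accuracy=0.9):
        # deque(maxlen=...) silently drops the oldest outcome once full.
        self.outcomes = deque(maxlen=window_size)
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        """Store whether one prediction matched the observed outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once the window is full and accuracy is below the threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy


# Usage with a deliberately tiny window so the behavior is easy to follow.
monitor = AccuracyMonitor(window_size=4, min_accuracy=0.75)
for predicted, actual in [(1, 1), (0, 0), (1, 1), (0, 0)]:
    monitor.record(predicted, actual)
healthy_flag = monitor.needs_retraining()   # all four predictions were correct

for predicted, actual in [(1, 0), (0, 1)]:
    monitor.record(predicted, actual)       # two misses push out two hits
degraded_flag = monitor.needs_retraining()
```

A bias check would follow the same pattern, tracking the metric separately per group of interest instead of over all predictions.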
  16. ISCEP Link