SlideShare ist ein Scribd-Unternehmen logo
1 von 49
Downloaden Sie, um offline zu lesen
1
ALISON B LOWNDES
AI DevRel | EMEA
@alisonblowndes
November 2018
www.FrontierDevelopmentLab.org
3
4
CUDA DEVELOPMENT ECOSYSTEM
CUDA: Programming Model, GPU Architecture, System Architecture
SpecializedPerformanceEase of use
FrameworksApplications Libraries
Directives and
Standard Languages
Extended Standard
Languages
CUDA-C++
CUDA Fortran
GPU Users Domain
Specialists
Problem
Specialists
New Algorithm Developersand
OptimizationExperts
5
NVIDIA RESEARCH
6
EMBED SUPERNOVA VIDEO
7
TESLA V100 32GB
THE MOST ADVANCED DATA CENTER GPU EVER BUILT
5,120 CUDA cores
640 NEW Tensor cores
7.5 FP64 TFLOPS | 15 FP32 TFLOPS
120 Tensor TFLOP
20MB SM RF | 16MB Cache | 32GB HBM2 @ 900 GB/s
300 GB/s NVLink
https://devblogs.nvidia.com/parallelforall/inside-volta/
8
Tensor Cores: More resources
Further information
Mixed-precision blog: https://devblogs.nvidia.com/mixed-precision-training-deep-neural-networks/
Mixed-precision best practices: https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
Mixed-precision arVix paper: https://arxiv.org/abs/1710.03740
GTC 2018 Sessions: Training with Mixed Precision: Theory and Practice and Training with Mixed Precision: Real Examples
Call to action:
1) Share these resources
2) Provide feedback to help us prioritize new examples and improve examples
DGX/NGC deep learning framework containers: Latest versions of software stack and Tensor Core optimized examples - DGX/NGC Registry
Tools
TensorFlow OpenSeq2seq: https://nvidia.github.io/OpenSeq2Seq/html/mixed-precision.html & arVix paper
PyTorch Apex: https://nvidia.github.io/apex/fp16_utils.html & NVIDIA developer news article
Examples
New mixed-precision model examples: https://developer.nvidia.com/deep-learning-examples
GitHub: https://github.com/NVIDIA/DeepLearningExamples
9
Li et al, University of Maryland & US Naval Academy
https://arxiv.org/pdf/1712.09913.pdf
10
REFACTORING FP16https://arxiv.org/pdf/1811.01721.pdf
11
TRAINING VS INFERENCE
12
NVIDIA TensorRT
Optimize and Deploy neural networks in
production environments
Maximize throughput for latency-critical apps
with optimizer and runtime
Deploy responsive and memory efficient apps
with INT8 & FP16 optimizations
Accelerate every framework with TensorFlow
integration and ONNX support
Run multiple models on a node with
containerized inference server
Platform for High-PerformanceDeep LearningInference
developer.nvidia.com/tensorrt
TensorRT
Optimizer
TensorRT
Runtime
Engine
Trained
Neural
Network
Embedded Automotive Data center
Jetson Drive PX Tesla
13
320 Turing Tensor Cores
2,560 CUDACores
65 FP16 TFLOPS | 130 INT8 TOPS | 260 INT4 TOPS
16GB | 320GB/s
70 W
Now Available Early Access
TESLA T4 ON GCP
The Cloud GPU
14
WORLD’S MOST ADVANCED
INFERENCE GPU
INTEGRATED INTO TENSORFLOW&
ONNX SUPPORT
TENSORRT HYPERSCALE INFERENCE PLATFORM
TENSORRT
INFERENCE SERVER
15NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.
NVIDIA DEEP LEARNING SDK UPDATE
GPU-accelerated
DL Primitives
Support for Turing
Optimizations for RNNs
Extended support for NHWC
cuDNN
7.4
Multi-node distributed
training (multiple machines)
Leading frameworks support
Multi-GPU &
Multi-node
NCCL
2.3
TensorFlow model reader
Turing & Xavier support
INT8 RNNs support
High-performance
Inference Engine
TensorRT 5
DOWNLOAD NOW
16
NEW PLATFORM SUPPORT
Deploy deep learning inference on
Xavier platforms through NVIDIA
DRIVE AI Platforms
Available for development and
host environments
NVIDIA XAVIER
Use AI from hospitals to trading floors
to factory floors
Support for C++ API and Python API
Available as an RPM file
CENTOS OPERATING SYSTEM
Deploy deep learning inference
on DLA platforms
Support for FP16 precision in
current version
WW.NVDLA.ORG
Use AI for games and medical apps
Support for C++ API
Available as a ZIP file
WINDOWS OPERATING SYSTEM
17
18
19
DEEPMIND ALPHA* 1.0 => 2.0 => 0.0
* ..a deeply structuredhybrid (Gary Marcus,Jan2018)
20
ROBOTS
THINK /
REASON
SENSE /
PERCEIVE
ACT
21
22
23
24
ISAAC
Simulation Engine
Photo-realistic Graphics ∙Physics ∙Soft bodies ∙
∙ Procedural Generation ∙ Massive parallelism
∙ Unreal Engine 4 / Unity 3D
Isaac Framework
Codelets ∙Behaviors ∙ 3D Poses ∙Distributed
∙ Messaging ∙Synchronization ∙ Record & Replay
∙ Configuration ∙Visualization
World model
Warehouse ∙Office
∙ Store ∙ Home
Robot model
Carter ∙ URDF loader
Virtual Sensors Virtual Actuators Sensor Processing Actuator Control HW Sensor HW Actuator
ML
TensorRT∙ CUDA
∙ Tensorflow ∙...
Gems
Optimizers ∙ Algebra
∙ EKFs ∙ Depth ∙ ...
Navigation
Behaviors
Interactions
withhumans
Manipulation
Perception
Drivers
Lidar ∙ Camera ∙IMU ∙
RobotBase ∙ ...
Unified Message API
Use the same messages for simulation,
actual hardware and across all apps
Simulate DeployDevelop
Jetson
Fully integrated with
X2 and Xavier
NVIDIA JETSON AGX
ISAAC GEMS JETSONAGX XAVIERDEVELOPERKIT
Available Now
ISAAC SIM
26NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.
27
TensorRT SUPPORTS
Xavier
Deploy deep learning inference on Xavier platforms
through NVIDIA DRIVE AI Platforms
Import models in any framework (including TensorFlow,
Caffe and Torch) through ONNX, Universal Framework
Format or custom C/C++ API
Optimize CNN, RNN and novel neural network layers and
deploy reduced precision on Tensor Cores
Download for development or host environment today
Optimized Inference on the
World’s Most Powerful SoC
developer.nvidia.com/tensorrt
Lidar
Localization
Surround Perception
RADAR LIDAR
Egomotion Path
Planning
Camera
Localization
Lanes Signs Lights
Path
Perception
LIDAR
Localization
28
XAVIER DLA
NOW OPEN SOURCE
Command Interface
Tensor Execution Micro-controller
Memory Interface
Input DMA
(Activations
and Weights)
Unified
512KB
Input
Buffer
Activations
and
Weights
Sparse Weight
Decompression
Native
Winograd
Input
Transform
MAC
Array
2048 Int8
or
1024 Int16
or
1024 FP16
Output
Accumulators
Output
Postprocess
or
(Activation
Function,
Pooling
etc.)
Output
DMA
WWW.NVDLA.ORG
29
NVIDIA AGX
Family of Systems for
Embedded AI HPC
Self-driving cars
Robotics
Smart Cities
Healthcare
30
END TO END SYSTEM FOR AV
Sources: NVIDIA, RAND Corporation
SIMULATETRAIN MODELSCOLLECT DATA RE-SIMULATE
Lanes Lights
Path
Signs
PedestriansCars
Up to 2 petabytes per
car per year
20 DNNs
1+ million images
per DNN
10B+ miles to
ensure safe driving
Validate and verify
for self-driving cars
by 2020
MAPPING
5 + DNNs
Create and update
HD maps
31
COMPUTATIONAL SCALE REQUIRED
3 million labeled images
1 DGX-1 trains 300k labeled images on 1 DNN in 1 day
10 DNNs required for self-driving
10 parallel experimentsat all times
100 DGX-1 per car
TURING — A GIANT LEAP
TENSOR CORE RT CORE
SHADER | COMPUTE
0
20
40
60
80
100
120
TURING 9X PEAK FLOPS
PASCAL
TITAN Xp
TURING
2080 Ti
114
14.214.212.9
FP32 INT32 TC FP32 INT32 TC
33
TURING MULTI-PROCESS SERVICE
Improved Small Batch Inference Throughput
Hardware
Accelerated
Work Submission
Hardware
Isolation
TURING MULTI-PROCESS SERVICE
Turing
A B C
CUDA MULTI-PROCESS SERVICE CONTROL
CPU Processes
GPU Execution
Turing MPS:
Inherits Volta’s enhanced MPS
architecture
• Reduced launch latency
• Improved launch throughput
• Improved quality of service with
scheduler partitioning
• 3x more clients than Pascal
A B C
34
35
TENSOR CORES FOR SCIENCE
Multi-precision computing
AI-POWERED WEATHER
PREDICTION
PLASMA FUSION
APPLICATION EARTHQUAKE SIMULATION
7.8
15.7
125
0
20
40
60
80
100
120
140
V100
TFLOPS
FP64+ MULTI-PRECISION
FP16 Solver
3.5x times faster
FP16/FP32
1.15x ExaOPS
FP16-FP21-FP32-FP64
25x times faster
Source: J. Zhang, KLA-Tencor, “Physics-BasedAI forSemiconductorInspectionUsing a GPU BasedOpticalNeural
Network,” GTC2018
Physics-based AI for Semiconductor Wafer Inspection
KLA-Tencor patent pending
37
AI-BUILD AI TO
FABRICATE
SUBATOMIC MATERIALS
To expand the benefits of deep learning for science,
researchers need new tools to build high-performing
neural networks that don’t require specialized
knowledge. Scientists at Oak Ridge National
Laboratory used the MENNDL algorithm on
Summit to develop a neural network that
analyzes electron microscopy data at the
atomic level. The team achieved a
speed of 152.5 petaflops across
3,000 nodes.
38NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.
39
GPU ACCELERATED COMPUTING
HPC AI ML VISUALIZATION
CLARA NVIDIA AI RAPIDS ISAAC DRIVE METROPOLIS
WORKSTATIONS SERVERS CLOUD
CUDA & GPU COMPUTING ARCHITECTURE
NVIDIA GPU CLOUD
TESLA DGXTEGRA
40
RE-IMAGINING DATA SCIENCE WORKFLOW
Open Source, End-to-end GPU-accelerated Workflow Built On CUDA
Data
preparation /
wrangling
cuDF
Optimized ML
model
training
cuML Visualization
Data
visualization
libraries
data insights
WWW.RAPIDS.AI
41
Cloud
DOWNLOAD AND DEPLOY
On-premises
Source code, libraries, packages
Source available on Github | Container available on NGC and Dockerhub | PIP available at a later date
NGC
42
ENTERPRISE DATA MODEL SCALABILITY
DGX-2
DGX-1
DGX STATION
2*RTX 8000
TESLA V100
256GB
128GB
96GB
32GB
512 GB
10240 TENSORCORES
5120 TENSORCORES
2560 TENSORCORES
1152 TENSORCORES
640 TENSORCORES
DGX POD
ARCHITECTURE
a single data center
rack containing up
to 9x NVIDIA DGX-1
servers, storage,
networking &
NVIDIA AI software
Nine DGX-1 servers
12 storage servers
10 GbE (min) storage
& management
switch
Mellanox 100 Gpps
intra-rack high speed
network switches.
44
WHY CONTAINERS?
Benefits of Containers:
Simplify deployment of
GPU-accelerated software, eliminating time-
consuming software integration work
Isolate individual deep learning frameworks
and applications
Share, collaborate,
and test applications across
different environments
44
To learn more:
nvidia.com/ngc
To sign up:
ngc.nvidia.com
45
RAPID USER ADOPTION
HPC APPS CONTAINERS ON NVIDIA GPU CLOUD
RAPID CONTAINER ADDITION
GAMESS
CHROMA*CANDLE
GROMACS LAMMPS NAMD RELION
LATTICE MICROBESMILC*
*Coming soon
46
POWERING THE WORLD’S FASTEST
SUPERCOMPUTERS
48% More Systems | 22 of Top 25 Greenest
Piz Daint
Europe’s Fastest
5,704 GPUs| 21 PF
ORNL Summit
World’s Fastest
27,648 GPUs| 144 PF
ABCI
Japan’s Fastest
4,352 GPUs| 20 PF
ENI HPC4
Fastest Industrial
3,200 GPUs| 12 PF
LLNL Sierra
World’s 2nd Fastest
17,280 GPUs| 95 PF
48
NEXT STEPS
NVIDIA Deep Learning Institute
www.nvidia.com/en-us/deep-learning-ai/education
GTC San Jose | March 26-29 2018
www.nvidia.com/en-us/gtc
NGC
www.nvidia.com/en-us/gpu-cloud
49
ALowndes@NVIDIA.com

Weitere ähnliche Inhalte

Was ist angesagt?

組み込みから HPC まで ARM コアで実現するエコシステム
組み込みから HPC まで ARM コアで実現するエコシステム組み込みから HPC まで ARM コアで実現するエコシステム
組み込みから HPC まで ARM コアで実現するエコシステムShinnosuke Furuya
 
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mãoWebinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mãoEmbarcados
 
Programming Models for Exascale Systems
Programming Models for Exascale SystemsProgramming Models for Exascale Systems
Programming Models for Exascale Systemsinside-BigData.com
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PXNVIDIA Japan
 
AI, A New Computing Model
AI, A New Computing ModelAI, A New Computing Model
AI, A New Computing ModelNVIDIA Taiwan
 
NVIDIA 深度學習教育機構 (DLI): Neural network deployment
NVIDIA 深度學習教育機構 (DLI): Neural network deploymentNVIDIA 深度學習教育機構 (DLI): Neural network deployment
NVIDIA 深度學習教育機構 (DLI): Neural network deploymentNVIDIA Taiwan
 
Jetson AGX Xavier and the New Era of Autonomous Machines
Jetson AGX Xavier and the New Era of Autonomous MachinesJetson AGX Xavier and the New Era of Autonomous Machines
Jetson AGX Xavier and the New Era of Autonomous MachinesDustin Franklin
 
Lenovo HPC: Energy Efficiency and Water-Cool-Technology Innovations
Lenovo HPC: Energy Efficiency and Water-Cool-Technology InnovationsLenovo HPC: Energy Efficiency and Water-Cool-Technology Innovations
Lenovo HPC: Energy Efficiency and Water-Cool-Technology Innovationsinside-BigData.com
 
GTC Taiwan 2017 主題演說
GTC Taiwan 2017 主題演說GTC Taiwan 2017 主題演說
GTC Taiwan 2017 主題演說NVIDIA Taiwan
 
1030: NVIDIA GRID 2.0
1030: NVIDIA GRID 2.01030: NVIDIA GRID 2.0
1030: NVIDIA GRID 2.0NVIDIA Japan
 
NVIDIA Deep Learning Institute 2017 基調講演
NVIDIA Deep Learning Institute 2017 基調講演NVIDIA Deep Learning Institute 2017 基調講演
NVIDIA Deep Learning Institute 2017 基調講演NVIDIA Japan
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...inside-BigData.com
 
最新の HPC 技術を生かした AI・ビッグデータインフラの東工大 TSUBAME3.0 及び産総研 ABCI
最新の HPC 技術を生かした AI・ビッグデータインフラの東工大 TSUBAME3.0 及び産総研 ABCI最新の HPC 技術を生かした AI・ビッグデータインフラの東工大 TSUBAME3.0 及び産総研 ABCI
最新の HPC 技術を生かした AI・ビッグデータインフラの東工大 TSUBAME3.0 及び産総研 ABCINVIDIA Japan
 
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」PC Cluster Consortium
 
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Jason Dai
 

Was ist angesagt? (20)

組み込みから HPC まで ARM コアで実現するエコシステム
組み込みから HPC まで ARM コアで実現するエコシステム組み込みから HPC まで ARM コアで実現するエコシステム
組み込みから HPC まで ARM コアで実現するエコシステム
 
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mãoWebinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
 
AI + E-commerce
AI + E-commerceAI + E-commerce
AI + E-commerce
 
Programming Models for Exascale Systems
Programming Models for Exascale SystemsProgramming Models for Exascale Systems
Programming Models for Exascale Systems
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX
 
AI, A New Computing Model
AI, A New Computing ModelAI, A New Computing Model
AI, A New Computing Model
 
NVIDIA 深度學習教育機構 (DLI): Neural network deployment
NVIDIA 深度學習教育機構 (DLI): Neural network deploymentNVIDIA 深度學習教育機構 (DLI): Neural network deployment
NVIDIA 深度學習教育機構 (DLI): Neural network deployment
 
Jetson AGX Xavier and the New Era of Autonomous Machines
Jetson AGX Xavier and the New Era of Autonomous MachinesJetson AGX Xavier and the New Era of Autonomous Machines
Jetson AGX Xavier and the New Era of Autonomous Machines
 
Lenovo HPC: Energy Efficiency and Water-Cool-Technology Innovations
Lenovo HPC: Energy Efficiency and Water-Cool-Technology InnovationsLenovo HPC: Energy Efficiency and Water-Cool-Technology Innovations
Lenovo HPC: Energy Efficiency and Water-Cool-Technology Innovations
 
Lenovo HPC Strategy Update
Lenovo HPC Strategy UpdateLenovo HPC Strategy Update
Lenovo HPC Strategy Update
 
GTC Taiwan 2017 主題演說
GTC Taiwan 2017 主題演說GTC Taiwan 2017 主題演說
GTC Taiwan 2017 主題演說
 
1030: NVIDIA GRID 2.0
1030: NVIDIA GRID 2.01030: NVIDIA GRID 2.0
1030: NVIDIA GRID 2.0
 
NVIDIA Deep Learning Institute 2017 基調講演
NVIDIA Deep Learning Institute 2017 基調講演NVIDIA Deep Learning Institute 2017 基調講演
NVIDIA Deep Learning Institute 2017 基調講演
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...
 
最新の HPC 技術を生かした AI・ビッグデータインフラの東工大 TSUBAME3.0 及び産総研 ABCI
最新の HPC 技術を生かした AI・ビッグデータインフラの東工大 TSUBAME3.0 及び産総研 ABCI最新の HPC 技術を生かした AI・ビッグデータインフラの東工大 TSUBAME3.0 及び産総研 ABCI
最新の HPC 技術を生かした AI・ビッグデータインフラの東工大 TSUBAME3.0 及び産総研 ABCI
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
GTC 2022 Keynote
GTC 2022 KeynoteGTC 2022 Keynote
GTC 2022 Keynote
 
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
PCCC21:日本電気株式会社「一台何役?SX-Aurora TSUBASA最新情報」
 
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
Automated ML Workflow for Distributed Big Data Using Analytics Zoo (CVPR2020 ...
 

Ähnlich wie Nvidia at SEMICon, Munich

Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platforminside-BigData.com
 
Harnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligenceHarnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligenceAlison B. Lowndes
 
NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習NVIDIA Taiwan
 
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...E-Commerce Brasil
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Lablup Inc.
 
Innovation with ai at scale on the edge vt sept 2019 v0
Innovation with ai at scale  on the edge vt sept 2019 v0Innovation with ai at scale  on the edge vt sept 2019 v0
Innovation with ai at scale on the edge vt sept 2019 v0Ganesan Narayanasamy
 
Ac922 watson 180208 v1
Ac922 watson 180208 v1Ac922 watson 180208 v1
Ac922 watson 180208 v1IBM Sverige
 
Dell and NVIDIA for Your AI workloads in the Data Center
Dell and NVIDIA for Your AI workloads in the Data CenterDell and NVIDIA for Your AI workloads in the Data Center
Dell and NVIDIA for Your AI workloads in the Data CenterRenee Yao
 
Introduction to PowerAI - The Enterprise AI Platform
Introduction to PowerAI - The Enterprise AI PlatformIntroduction to PowerAI - The Enterprise AI Platform
Introduction to PowerAI - The Enterprise AI PlatformIndrajit Poddar
 
Using-NVIDIA-GPU-Cloud-Containers-on-the-Nimbix-Cloud-NVIDIA.pdf
Using-NVIDIA-GPU-Cloud-Containers-on-the-Nimbix-Cloud-NVIDIA.pdfUsing-NVIDIA-GPU-Cloud-Containers-on-the-Nimbix-Cloud-NVIDIA.pdf
Using-NVIDIA-GPU-Cloud-Containers-on-the-Nimbix-Cloud-NVIDIA.pdfjn7887
 
RAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceRAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceData Works MD
 
NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019
NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019
NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019NVIDIA
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AITyrone Systems
 
infoShare AI Roadshow 2018 - Tomasz Kopacz (Microsoft) - jakie możliwości daj...
infoShare AI Roadshow 2018 - Tomasz Kopacz (Microsoft) - jakie możliwości daj...infoShare AI Roadshow 2018 - Tomasz Kopacz (Microsoft) - jakie możliwości daj...
infoShare AI Roadshow 2018 - Tomasz Kopacz (Microsoft) - jakie możliwości daj...Infoshare
 
Gömülü Sistemlerde Derin Öğrenme Uygulamaları
Gömülü Sistemlerde Derin Öğrenme UygulamalarıGömülü Sistemlerde Derin Öğrenme Uygulamaları
Gömülü Sistemlerde Derin Öğrenme UygulamalarıFerhat Kurt
 
GTC 2019 Keynote in Silicon Valley
GTC 2019 Keynote in Silicon ValleyGTC 2019 Keynote in Silicon Valley
GTC 2019 Keynote in Silicon ValleyNVIDIA
 

Ähnlich wie Nvidia at SEMICon, Munich (20)

Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platform
 
Harnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligenceHarnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligence
 
NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習
 
AI talk at CogX 2018
AI talk at CogX 2018AI talk at CogX 2018
AI talk at CogX 2018
 
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
Innovation with ai at scale on the edge vt sept 2019 v0
Innovation with ai at scale  on the edge vt sept 2019 v0Innovation with ai at scale  on the edge vt sept 2019 v0
Innovation with ai at scale on the edge vt sept 2019 v0
 
Ac922 watson 180208 v1
Ac922 watson 180208 v1Ac922 watson 180208 v1
Ac922 watson 180208 v1
 
Hardware in Space
Hardware in SpaceHardware in Space
Hardware in Space
 
Dell and NVIDIA for Your AI workloads in the Data Center
Dell and NVIDIA for Your AI workloads in the Data CenterDell and NVIDIA for Your AI workloads in the Data Center
Dell and NVIDIA for Your AI workloads in the Data Center
 
Introduction to PowerAI - The Enterprise AI Platform
Introduction to PowerAI - The Enterprise AI PlatformIntroduction to PowerAI - The Enterprise AI Platform
Introduction to PowerAI - The Enterprise AI Platform
 
Using-NVIDIA-GPU-Cloud-Containers-on-the-Nimbix-Cloud-NVIDIA.pdf
Using-NVIDIA-GPU-Cloud-Containers-on-the-Nimbix-Cloud-NVIDIA.pdfUsing-NVIDIA-GPU-Cloud-Containers-on-the-Nimbix-Cloud-NVIDIA.pdf
Using-NVIDIA-GPU-Cloud-Containers-on-the-Nimbix-Cloud-NVIDIA.pdf
 
RAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data ScienceRAPIDS – Open GPU-accelerated Data Science
RAPIDS – Open GPU-accelerated Data Science
 
NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019
NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019
NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AI
 
infoShare AI Roadshow 2018 - Tomasz Kopacz (Microsoft) - jakie możliwości daj...
infoShare AI Roadshow 2018 - Tomasz Kopacz (Microsoft) - jakie możliwości daj...infoShare AI Roadshow 2018 - Tomasz Kopacz (Microsoft) - jakie możliwości daj...
infoShare AI Roadshow 2018 - Tomasz Kopacz (Microsoft) - jakie możliwości daj...
 
DRIVE PX 2
DRIVE PX 2DRIVE PX 2
DRIVE PX 2
 
Gömülü Sistemlerde Derin Öğrenme Uygulamaları
Gömülü Sistemlerde Derin Öğrenme UygulamalarıGömülü Sistemlerde Derin Öğrenme Uygulamaları
Gömülü Sistemlerde Derin Öğrenme Uygulamaları
 
GTC 2019 Keynote in Silicon Valley
GTC 2019 Keynote in Silicon ValleyGTC 2019 Keynote in Silicon Valley
GTC 2019 Keynote in Silicon Valley
 
Breaking RSA & the internet
Breaking RSA & the internetBreaking RSA & the internet
Breaking RSA & the internet
 

Mehr von Alison B. Lowndes

Mehr von Alison B. Lowndes (20)

Exploring solutions for humanity's greatest challenges
Exploring solutions for humanity's greatest challengesExploring solutions for humanity's greatest challenges
Exploring solutions for humanity's greatest challenges
 
Future of Skills
Future of SkillsFuture of Skills
Future of Skills
 
NVIDIA @ AI FEST
NVIDIA @ AI FESTNVIDIA @ AI FEST
NVIDIA @ AI FEST
 
MAXSS & NVIDIA
MAXSS & NVIDIAMAXSS & NVIDIA
MAXSS & NVIDIA
 
Omniverse for the Metaverse
Omniverse for the MetaverseOmniverse for the Metaverse
Omniverse for the Metaverse
 
DataArt
DataArtDataArt
DataArt
 
From gaming to the metaverse
From gaming to the metaverseFrom gaming to the metaverse
From gaming to the metaverse
 
Decision-ready climate data
Decision-ready climate dataDecision-ready climate data
Decision-ready climate data
 
Talk on commercialising space data
Talk on commercialising space data Talk on commercialising space data
Talk on commercialising space data
 
NVIDIA DataArt IT
NVIDIA DataArt ITNVIDIA DataArt IT
NVIDIA DataArt IT
 
NVIDIA Keynote #GTC21
NVIDIA Keynote #GTC21 NVIDIA Keynote #GTC21
NVIDIA Keynote #GTC21
 
Tales of AI agents saving the human race!
Tales of AI agents saving the human race!Tales of AI agents saving the human race!
Tales of AI agents saving the human race!
 
Harnessing the virtual realm
Harnessing the virtual realmHarnessing the virtual realm
Harnessing the virtual realm
 
EPSRC CDT Conference
EPSRC CDT ConferenceEPSRC CDT Conference
EPSRC CDT Conference
 
Talk on using AI to address some of humanities problems
Talk on using AI to address some of humanities problemsTalk on using AI to address some of humanities problems
Talk on using AI to address some of humanities problems
 
GTC Fall 2020 Keynote
GTC Fall 2020 KeynoteGTC Fall 2020 Keynote
GTC Fall 2020 Keynote
 
Innovation Roundtable
Innovation RoundtableInnovation Roundtable
Innovation Roundtable
 
Fuelling the AI Revolution with Gaming
Fuelling the AI Revolution with GamingFuelling the AI Revolution with Gaming
Fuelling the AI Revolution with Gaming
 
Possibilities of generative models
Possibilities of generative modelsPossibilities of generative models
Possibilities of generative models
 
Harnessing AI for the Benefit of All.
Harnessing AI for the Benefit of All.Harnessing AI for the Benefit of All.
Harnessing AI for the Benefit of All.
 

Kürzlich hochgeladen

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Kürzlich hochgeladen (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Nvidia at SEMICon, Munich

  • 1. 1 ALISON B LOWNDES AI DevRel | EMEA @alisonblowndes November 2018
  • 3. 3
  • 4. 4 CUDA DEVELOPMENT ECOSYSTEM CUDA: Programming Model, GPU Architecture, System Architecture SpecializedPerformanceEase of use FrameworksApplications Libraries Directives and Standard Languages Extended Standard Languages CUDA-C++ CUDA Fortran GPU Users Domain Specialists Problem Specialists New Algorithm Developersand OptimizationExperts
  • 7. 7 TESLA V100 32GB THE MOST ADVANCED DATA CENTER GPU EVER BUILT 5,120 CUDA cores 640 NEW Tensor cores 7.5 FP64 TFLOPS | 15 FP32 TFLOPS 120 Tensor TFLOP 20MB SM RF | 16MB Cache | 32GB HBM2 @ 900 GB/s 300 GB/s NVLink https://devblogs.nvidia.com/parallelforall/inside-volta/
  • 8. 8 Tensor Cores: More resources Further information Mixed-precision blog: https://devblogs.nvidia.com/mixed-precision-training-deep-neural-networks/ Mixed-precision best practices: https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html Mixed-precision arVix paper: https://arxiv.org/abs/1710.03740 GTC 2018 Sessions: Training with Mixed Precision: Theory and Practice and Training with Mixed Precision: Real Examples Call to action: 1) Share these resources 2) Provide feedback to help us prioritize new examples and improve examples DGX/NGC deep learning framework containers: Latest versions of software stack and Tensor Core optimized examples - DGX/NGC Registry Tools TensorFlow OpenSeq2seq: https://nvidia.github.io/OpenSeq2Seq/html/mixed-precision.html & arVix paper PyTorch Apex: https://nvidia.github.io/apex/fp16_utils.html & NVIDIA developer news article Examples New mixed-precision model examples: https://developer.nvidia.com/deep-learning-examples GitHub: https://github.com/NVIDIA/DeepLearningExamples
  • 9. 9 Li et al, University of Maryland & US Naval Academy https://arxiv.org/pdf/1712.09913.pdf
  • 12. 12 NVIDIA TensorRT Optimize and Deploy neural networks in production environments Maximize throughput for latency-critical apps with optimizer and runtime Deploy responsive and memory efficient apps with INT8 & FP16 optimizations Accelerate every framework with TensorFlow integration and ONNX support Run multiple models on a node with containerized inference server Platform for High-PerformanceDeep LearningInference developer.nvidia.com/tensorrt TensorRT Optimizer TensorRT Runtime Engine Trained Neural Network Embedded Automotive Data center Jetson Drive PX Tesla
  • 13. 13 320 Turing Tensor Cores 2,560 CUDACores 65 FP16 TFLOPS | 130 INT8 TOPS | 260 INT4 TOPS 16GB | 320GB/s 70 W Now Available Early Access TESLA T4 ON GCP The Cloud GPU
  • 14. 14 WORLD’S MOST ADVANCED INFERENCE GPU INTEGRATED INTO TENSORFLOW& ONNX SUPPORT TENSORRT HYPERSCALE INFERENCE PLATFORM TENSORRT INFERENCE SERVER
  • 15. 15NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE. NVIDIA DEEP LEARNING SDK UPDATE GPU-accelerated DL Primitives Support for Turing Optimizations for RNNs Extended support for NHWC cuDNN 7.4 Multi-node distributed training (multiple machines) Leading frameworks support Multi-GPU & Multi-node NCCL 2.3 TensorFlow model reader Turing & Xavier support INT8 RNNs support High-performance Inference Engine TensorRT 5 DOWNLOAD NOW
  • 16. 16 NEW PLATFORM SUPPORT Deploy deep learning inference on Xavier platforms through NVIDIA DRIVE AI Platforms Available for development and host environments NVIDIA XAVIER Use AI from hospitals to trading floors to factory floors Support for C++ API and Python API Available as an RPM file CENTOS OPERATING SYSTEM Deploy deep learning inference on DLA platforms Support for FP16 precision in current version WW.NVDLA.ORG Use AI for games and medical apps Support for C++ API Available as a ZIP file WINDOWS OPERATING SYSTEM
  • 17. 17
  • 18. 18
  • 19. 19 DEEPMIND ALPHA* 1.0 => 2.0 => 0.0 * ..a deeply structuredhybrid (Gary Marcus,Jan2018)
  • 21. 21
  • 22. 22
  • 23. 23
  • 24. 24 ISAAC Simulation Engine Photo-realistic Graphics ∙Physics ∙Soft bodies ∙ ∙ Procedural Generation ∙ Massive parallelism ∙ Unreal Engine 4 / Unity 3D Isaac Framework Codelets ∙Behaviors ∙ 3D Poses ∙Distributed ∙ Messaging ∙Synchronization ∙ Record & Replay ∙ Configuration ∙Visualization World model Warehouse ∙Office ∙ Store ∙ Home Robot model Carter ∙ URDF loader Virtual Sensors Virtual Actuators Sensor Processing Actuator Control HW Sensor HW Actuator ML TensorRT∙ CUDA ∙ Tensorflow ∙... Gems Optimizers ∙ Algebra ∙ EKFs ∙ Depth ∙ ... Navigation Behaviors Interactions withhumans Manipulation Perception Drivers Lidar ∙ Camera ∙IMU ∙ RobotBase ∙ ... Unified Message API Use the same messages for simulation, actual hardware and across all apps Simulate DeployDevelop Jetson Fully integrated with X2 and Xavier
  • 25. NVIDIA JETSON AGX ISAAC GEMS JETSONAGX XAVIERDEVELOPERKIT Available Now ISAAC SIM
  • 26. 26NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.
  • 27. 27 TensorRT SUPPORTS Xavier Deploy deep learning inference on Xavier platforms through NVIDIA DRIVE AI Platforms Import models in any framework (including TensorFlow, Caffe and Torch) through ONNX, Universal Framework Format or custom C/C++ API Optimize CNN, RNN and novel neural network layers and deploy reduced precision on Tensor Cores Download for development or host environment today Optimized Inference on the World’s Most Powerful SoC developer.nvidia.com/tensorrt Lidar Localization Surround Perception RADAR LIDAR Egomotion Path Planning Camera Localization Lanes Signs Lights Path Perception LIDAR Localization
  • 28. 28 XAVIER DLA NOW OPEN SOURCE Command Interface Tensor Execution Micro-controller Memory Interface Input DMA (Activations and Weights) Unified 512KB Input Buffer Activations and Weights Sparse Weight Decompression Native Winograd Input Transform MAC Array 2048 Int8 or 1024 Int16 or 1024 FP16 Output Accumulators Output Postprocess or (Activation Function, Pooling etc.) Output DMA WWW.NVDLA.ORG
  • 29. 29 NVIDIA AGX Family of Systems for Embedded AI HPC Self-driving cars Robotics Smart Cities Healthcare
  • 30. 30 END TO END SYSTEM FOR AV Sources: NVIDIA, RAND Corporation SIMULATETRAIN MODELSCOLLECT DATA RE-SIMULATE Lanes Lights Path Signs PedestriansCars Up to 2 petabytes per car per year 20 DNNs 1+ million images per DNN 10B+ miles to ensure safe driving Validate and verify for self-driving cars by 2020 MAPPING 5 + DNNs Create and update HD maps
  • 31. 31 COMPUTATIONAL SCALE REQUIRED 3 million labeled images 1 DGX-1 trains 300k labeled images on 1 DNN in 1 day 10 DNNs required for self-driving 10 parallel experimentsat all times 100 DGX-1 per car
  • 32. TURING — A GIANT LEAP TENSOR CORE RT CORE SHADER | COMPUTE 0 20 40 60 80 100 120 TURING 9X PEAK FLOPS PASCAL TITAN Xp TURING 2080 Ti 114 14.214.212.9 FP32 INT32 TC FP32 INT32 TC
  • 33. 33 TURING MULTI-PROCESS SERVICE Improved Small Batch Inference Throughput Hardware Accelerated Work Submission Hardware Isolation TURING MULTI-PROCESS SERVICE Turing A B C CUDA MULTI-PROCESS SERVICE CONTROL CPU Processes GPU Execution Turing MPS: Inherits Volta’s enhanced MPS architecture • Reduced launch latency • Improved launch throughput • Improved quality of service with scheduler partitioning • 3x more clients than Pascal A B C
  • 34. 34
  • 35. 35 TENSOR CORES FOR SCIENCE Multi-precision computing AI-POWERED WEATHER PREDICTION PLASMA FUSION APPLICATION EARTHQUAKE SIMULATION 7.8 15.7 125 0 20 40 60 80 100 120 140 V100 TFLOPS FP64+ MULTI-PRECISION FP16 Solver 3.5x times faster FP16/FP32 1.15x ExaOPS FP16-FP21-FP32-FP64 25x times faster
  • 36. Source: J. Zhang, KLA-Tencor, “Physics-BasedAI forSemiconductorInspectionUsing a GPU BasedOpticalNeural Network,” GTC2018 Physics-based AI for Semiconductor Wafer Inspection KLA-Tencor patent pending
  • 37. 37 AI-BUILD AI TO FABRICATE SUBATOMIC MATERIALS To expand the benefits of deep learning for science, researchers need new tools to build high-performing neural networks that don’t require specialized knowledge. Scientists at Oak Ridge National Laboratory used the MENNDL algorithm on Summit to develop a neural network that analyzes electron microscopy data at the atomic level. The team achieved a speed of 152.5 petaflops across 3,000 nodes.
  • 38. 38NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.
  • 39. 39 GPU ACCELERATED COMPUTING HPC AI ML VISUALIZATION CLARA NVIDIA AI RAPIDS ISAAC DRIVE METROPOLIS WORKSTATIONS SERVERS CLOUD CUDA & GPU COMPUTING ARCHITECTURE NVIDIA GPU CLOUD TESLA DGXTEGRA
  • 40. 40 RE-IMAGINING DATA SCIENCE WORKFLOW Open Source, End-to-end GPU-accelerated Workflow Built On CUDA Data preparation / wrangling cuDF Optimized ML model training cuML Visualization Data visualization libraries data insights WWW.RAPIDS.AI
  • 41. 41 Cloud DOWNLOAD AND DEPLOY On-premises Source code, libraries, packages Source available on Github | Container available on NGC and Dockerhub | PIP available at a later date NGC
  • 42. 42 ENTERPRISE DATA MODEL SCALABILITY DGX-2 DGX-1 DGX STATION 2*RTX 8000 TESLA V100 256GB 128GB 96GB 32GB 512 GB 10240 TENSORCORES 5120 TENSORCORES 2560 TENSORCORES 1152 TENSORCORES 640 TENSORCORES
  • 43. DGX POD ARCHITECTURE a single data center rack containing up to 9x NVIDIA DGX-1 servers, storage, networking & NVIDIA AI software Nine DGX-1 servers 12 storage servers 10 GbE (min) storage & management switch Mellanox 100 Gpps intra-rack high speed network switches.
  • 44. 44 WHY CONTAINERS? Benefits of Containers: Simplify deployment of GPU-accelerated software, eliminating time- consuming software integration work Isolate individual deep learning frameworks and applications Share, collaborate, and test applications across different environments 44 To learn more: nvidia.com/ngc To sign up: ngc.nvidia.com
  • 45. 45 RAPID USER ADOPTION HPC APPS CONTAINERS ON NVIDIA GPU CLOUD RAPID CONTAINER ADDITION GAMESS CHROMA*CANDLE GROMACS LAMMPS NAMD RELION LATTICE MICROBESMILC* *Coming soon
  • 46. 46 POWERING THE WORLD’S FASTEST SUPERCOMPUTERS 48% More Systems | 22 of Top 25 Greenest Piz Daint Europe’s Fastest 5,704 GPUs| 21 PF ORNL Summit World’s Fastest 27,648 GPUs| 144 PF ABCI Japan’s Fastest 4,352 GPUs| 20 PF ENI HPC4 Fastest Industrial 3,200 GPUs| 12 PF LLNL Sierra World’s 2nd Fastest 17,280 GPUs| 95 PF
  • 47.
  • 48. 48 NEXT STEPS NVIDIA Deep Learning Institute www.nvidia.com/en-us/deep-learning-ai/education GTC San Jose | March 26-29 2018 www.nvidia.com/en-us/gtc NGC www.nvidia.com/en-us/gpu-cloud