Kicking off the first in a series of global GPU Technology Conferences, NVIDIA co-founder and CEO Jen-Hsun Huang today at GTC China unveiled technology that will accelerate the deep learning revolution sweeping across industries. Huang spoke to a crowd of more than 2,500 scientists, engineers, entrepreneurs and press, gathered in Beijing for a day devoted to deep learning and AI. On stage he announced the Tesla P4 and P40 GPU accelerators for inferencing production workloads for AI services, and a small, energy-efficient AI supercomputer for highway driving: the NVIDIA DRIVE PX 2 for AutoCruise.
1. GTC 2016 — China
THE DEEP LEARNING AI REVOLUTION
2. GPU DEEP LEARNING BIG BANG
Deep Learning NVIDIA GPU
NIPS (2012): ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton (University of Toronto)
3. GPU DEEP LEARNING ACHIEVES “SUPERHUMAN” RESULTS
2012: Deep Learning researchers worldwide discover GPUs
2015: DNN achieves superhuman image recognition
2015: Deep Speech 2 achieves superhuman voice recognition
Chart: ImageNet accuracy (%), 2010 to 2015. Labels: 74%, 96%, Human, Hand-coded CV, DL. Microsoft, Google: 3.5% error rate.
4. NVIDIA: “THE AI COMPUTING COMPANY”
GPU Computing | Computer Graphics | Artificial Intelligence
5. ANNOUNCING NEW GRAPHICS SDKS
Funhouse VR: Open Source
360 Video 1.0: Real-Time Panoramic VR
Iray VR: Photorealistic VR Ray Tracing
GVDB: Sparse Volumes for Special Effects
Remote Rendering, Video Compositing
Ansel: In-game Photography
Volumetric: Physical Light Models
OptiX 4.0: Multi-GPU Ray-Tracing
MDL 1.0: Physically Based Materials
Mental Ray: Now GPU-Accelerated!
9. GTC: 25X GROWTH IN GPU DL DEVELOPERS
Metric | 2014 | 2016 | Growth
Attendees | 3,700 | 16,000 | 4X
GPU Developers | 120,000 | 400,000 | 3X
Deep Learning Developers | 2,200 | 55,000 | 25X
2014:
• Japan
• United States
2016:
• Australia
• China
• Europe
• India
• Japan
• Korea
• United States (Silicon Valley, D.C.)
Breakdown by sector:
• Higher Ed 35%
• Software 19%
• Internet 15%
• Auto 10%
• Government 5%
• Medical 4%
• Finance 4%
• Manufacturing 4%
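The growth multiples on this slide can be checked against the 2014 and 2016 figures. The sketch below assumes the pairing 3,700 to 16,000 attendees, 120,000 to 400,000 GPU developers, and 2,200 to 55,000 deep learning developers, which is the only pairing consistent with the stated 4X, 3X and 25X; the variable and function names are illustrative.

```python
# 2014 and 2016 figures from the slide; the pairing is an assumption
# inferred from the stated 4X / 3X / 25X multiples.
gtc_figures = {
    "Attendees": (3_700, 16_000),
    "GPU Developers": (120_000, 400_000),
    "Deep Learning Developers": (2_200, 55_000),
}


def growth_multiple(before, after):
    """Return how many times larger `after` is than `before`."""
    return after / before


multiples = {name: growth_multiple(*pair) for name, pair in gtc_figures.items()}

for name, value in multiples.items():
    print(f"{name}: {value:.1f}x")
```

The computed values are roughly 4.3x, 3.3x and 25x, which the slide rounds down to 4X, 3X and 25X.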
10. WHY DID AI RESEARCHERS ADOPT GPUs FOR DEEP LEARNING?
11. BRAIN IS LIKE A GPU
BRAIN CREATES MENTAL IMAGES WHEN WE THINK
14. GPU DEEP LEARNING IS A NEW COMPUTING MODEL
Training | Device | Datacenter
TRAINING
Billions of Trillions of Operations
GPUs train larger models, accelerate time to market
15. GPU DEEP LEARNING IS A NEW COMPUTING MODEL
Training | Device | Datacenter
DATACENTER INFERENCING
10s of billions of image, voice, video queries per day
GPU inference for fast response, maximize datacenter throughput
16. GPU DEEP LEARNING IS A NEW COMPUTING MODEL
Training | Device | Datacenter
DEVICE INFERENCING
Billions of intelligent devices
GPU for real-time accurate response
17. AI: THE ULTIMATE COMPUTING CHALLENGE
Important Property of Neural Networks
Results get better with more data + bigger models + more computation
(Better algorithms, new insights and improved techniques always help, too!)
IMAGE RECOGNITION (16X Model)
2012 AlexNet: 8 layers | 1.4 GFLOP/image | ~16% Error
2015 ResNet: 152 layers | 22.6 GFLOP/image | ~3.5% Error
SPEECH RECOGNITION (10X Training Ops)
2014 Deep Speech 1: 2 ExaFLOPS | 25M | 7,000 Hours | ~8% Error
2015 Deep Speech 2: 20 ExaFLOPS | 100M | 12,000 Hours | ~5% Error
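The "16X Model" and "10X Training Ops" labels follow from the per-model figures on this slide. A minimal sketch of that arithmetic, assuming 1.4 GFLOP/image belongs to AlexNet, 22.6 GFLOP/image to ResNet, 2 ExaFLOPS to Deep Speech 1 and 20 ExaFLOPS to Deep Speech 2 (variable names are illustrative):

```python
# Per-image compute for the two image models (GFLOP/image, from the slide).
alexnet_gflop_per_image = 1.4
resnet_gflop_per_image = 22.6

# Total training compute for the two speech models (ExaFLOPS, from the slide).
deep_speech_1_exaflops = 2
deep_speech_2_exaflops = 20

# Ratio of per-image compute: the slide's "16X Model".
model_ratio = resnet_gflop_per_image / alexnet_gflop_per_image

# Ratio of training compute: the slide's "10X Training Ops".
training_ratio = deep_speech_2_exaflops / deep_speech_1_exaflops

print(f"ResNet vs AlexNet compute per image: {model_ratio:.1f}x")
print(f"Deep Speech 2 vs Deep Speech 1 training ops: {training_ratio:.0f}x")
```

Under these assumptions the per-image compute ratio comes out slightly above 16, and the training compute ratio is exactly 10.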
18. PASCAL “5 MIRACLES” BOOST DEEP LEARNING 65X
Pascal, 5 Miracles | NVIDIA DGX-1 Supercomputer | 65X in 4 yrs | Accelerate Every Framework
5 Miracles: Pascal | 16nm FinFET | CoWoS HBM2 | NVLink | cuDNN
PaddlePaddle: Baidu Deep Learning
Chart: Relative speed-up of images/sec vs K40 in 2013. AlexNet training throughput based on 20 iterations. CPU: 1x E5-2680v3 12 Core 2.5GHz, 128GB System Memory, Ubuntu 14.04. M40 datapoint: 8x M40 GPUs in a node. P100: 8x P100 NVLink-enabled.
Chart axes: speed-up up to 70X, years 2013 to 2016. GPU generations plotted: Kepler, Maxwell, Pascal.
19. ANNOUNCING NEW IBM SERVER
POWER8 + NVIDIA TESLA P100 FOR THE AI ENTERPRISE
“Putting NVIDIA’s technology into the IBM system will speed up performance for such emerging workloads as AI, deep learning and data analytics.” (eWeek)
22. ANNOUNCING TESLA P4 & P40 INFERENCING ACCELERATORS
Pascal Architecture | INT8
P4: 50W | 40X Energy Efficient versus CPU
P40: 250W | 40X Performance versus CPU
27. >1,500 AI STARTUPS AROUND THE WORLD
Deep Learning for Cybersecurity
Deep Learning for Genomics
Deep Learning for Self-Driving Cars
Deep Learning for Art
28. AI STARTUPS IN CHINA
Weather & Environment Forecast
Eye-tracking for Human-machine Interaction
Medical Imaging
Face Recognition
Product Recognition, Detection, Search
Personal Concierge App
30. “BILLIONS OF INTELLIGENT DEVICES”
“Billions of intelligent devices will take advantage of DNNs to provide personalization and localization as GPUs become faster and faster over the next several years.” (Tractica)
31. AI CITY: 1B CAMERAS BY 2020
~1 billion cameras worldwide by 2020
30 billion inferences/sec
Tesla P40: 2,500 inferences/sec @ 720P
AI City needs ~10M P40 servers
DATA: 1B cameras, IHS “Video Surveillance Intelligence Service, Aug. 2016”
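The sizing on this slide can be reproduced as a back-of-envelope calculation. The sketch below uses only the three figures quoted above; it counts P40 accelerators rather than servers, and reading the implied 30 inferences per camera per second as one inference per video frame is an assumption, as are the variable names.

```python
# Figures quoted on the slide.
cameras = 1_000_000_000                     # ~1 billion cameras by 2020
total_inferences_per_sec = 30_000_000_000   # 30 billion inferences/sec
p40_inferences_per_sec = 2_500              # Tesla P40 at 720p

# Implied load per camera (assumed to mean one inference per video frame).
per_camera_rate = total_inferences_per_sec // cameras

# Camera streams a single P40 could serve at that rate.
streams_per_p40 = p40_inferences_per_sec / per_camera_rate

# Accelerators needed to cover the whole load.
p40s_needed = total_inferences_per_sec // p40_inferences_per_sec

print(f"Per-camera rate: {per_camera_rate} inferences/sec")
print(f"Streams per P40: {streams_per_p40:.1f}")
print(f"P40s needed: {p40s_needed:,}")
```

This gives 12 million accelerators, the same order of magnitude as the slide's rounded figure of ~10M.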
32. 1/20TH THE SPACE, 1/10TH THE POWER
Hikvision Blade: 16 Jetson TX1s
NVIDIA DGX-1 | Traditional Server | Hikvision Blade
Traditional Server: ~21 1U Servers | 42 CPUs | ~4,000 W
Hikvision Blade: 1 Hikvision Blade | 16 TX1 + 1 CPU | >8 1080 streams | ~300 W
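The headline fractions can be compared with the server figures on this slide. This is a rough sketch: it assumes the blade occupies about the same rack space as a single 1U server, which the slide does not state, and the dictionary keys are illustrative.

```python
# Traditional deployment, from the slide: ~21 1U servers drawing ~4,000 W.
traditional = {"rack_units": 21, "watts": 4_000}

# Hikvision blade, from the slide: 1 blade drawing ~300 W.
# Treating the blade as 1 rack unit is an assumption.
blade = {"rack_units": 1, "watts": 300}

space_fraction = blade["rack_units"] / traditional["rack_units"]
power_fraction = blade["watts"] / traditional["watts"]

print(f"Space: about 1/{1 / space_fraction:.0f} of the traditional setup")
print(f"Power: about 1/{1 / power_fraction:.0f} of the traditional setup")
```

Under that assumption the ratios come out near 1/21 and 1/13, which the slide rounds to 1/20th the space and 1/10th the power.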
37. NVIDIA DRIVE PX 2
AutoCruise to Full Autonomy — One Architecture
Full Autonomy
AutoChauffeur
AutoCruise
AUTONOMOUS DRIVING
Perception, Reasoning, Driving
AI Supercomputing, AI Algorithms, Software
Scalable Architecture
38. NVIDIA DRIVE PX 2 AUTOCRUISE
10W AI Car Computer | Passive Cooling | Automotive IO
AI Highway Driving | Localization & Mapping