Women in Big Data Event Deep Learning Workflows and Use Cases

10/20/2017 Women in Big Data Event Hashtags: #IamAI, #WiBD
Oct 18th AI Connect Speakers
• “WiBD Introduction & DL Use Cases”: Renee Yao, Product Marketing Manager, Deep Learning and Analytics, NVIDIA
• “Deep Learning Workflows (w/ a demo)”: Kari Briski, Director of Deep Learning Software Product, NVIDIA
• “Deep Learning in Enterprise”: Nazanin Zaker, Data Scientist, SAP Innovation Center Network

Renee Yao
Product Marketing Manager, NVIDIA
AI CONNECT

Agenda
AI Connect
• 6:00-7:00pm – Registration and Networking
• 7:00-7:15pm – “WiBD Introduction & DL Use Cases”, Renee Yao, Product Marketing Manager, Deep Learning and Analytics, NVIDIA
• 7:15-7:45pm – “Deep Learning Workflows (with a live demo)”, Kari Briski, Director of Deep Learning Software Product, NVIDIA
• 7:45-8:15pm – “Deep Learning in Enterprise”, Nazanin Zaker, Data Scientist, SAP Innovation Center Network
• 8:15-8:30pm – Wrap-up & Giveaways
February: Apache Hadoop Training @ Cloudera
March: @ Strata+Hadoop World SJ
May: Apache Drill and Apache Spark @ MapR
June: Career Empowerment @ Andreessen Horowitz
June: @ Spark Summit
June: @ Hadoop Summit

10/20/2017 Women in Big Data Forum
Be Part of The Solution
Become a member or a sponsor
• Website: womeninbigdata.org
• LinkedIn: “Women in Big Data Forum”
• Meetup: meetup.com/Women-in-Big-Data-Meetup/
• Twitter: @DataWomen
• Video: https://www.youtube.com/channel/UCOaMT7A9SVkeBdvYNxiITVA
Join us
Event Hashtags: #IamAI, #WiBD

Kari Briski, 10-18-17
DEEP LEARNING WORKFLOWS:
DEEP LEARNING TRAINING AND INFERENCE

7
AI APPLICATIONS
Domains: COMPUTER VISION, SPEECH & AUDIO, NATURAL LANGUAGE PROCESSING
Applications: Image Classification, Object Detection, Voice Recognition, Language Translation, Recommendation Engines, Sentiment Analysis

8
AI APPLICATIONS
COMPUTER VISION: Object Detection, Classification, Segmentation, Visual Q&A
NATURAL LANGUAGE PROCESSING: Neural Machine Translation, Question & Answer, Sentiment Analysis, Search and recommendation engines
SPEECH & AUDIO: ASR (automatic speech recognition), Generation, Processing, Audio classification, Denoising
Recommendation

9
ACCELERATED DEEP LEARNING TRAINING STACK
AI Applications are Built on NVIDIA Hardware and Software End-to-End
COMPUTER VISION, SPEECH AND AUDIO, NATURAL LANGUAGE PROCESSING, Recommendation

10
NVIDIA TOOLS FOR DEEP LEARNING WORKFLOW
Accelerated Deep Learning Training Software Stack
DATA: GATHER AND LABEL
• Gather Data
• Curate data sets
• Rapidly label data, guide training, get insights
NVIDIA DEEP LEARNING SDK: TRAINING
• TRAINING DATA
• DATA MANAGEMENT
• TRAINING
• MODEL ASSESSMENT
• TRAINED NETWORK
DEPLOY WITH TENSORRT
• EMBEDDED: Jetson TX
• AUTOMOTIVE: Drive PX (XAVIER)
• DATA CENTER: Tesla (Pascal, Volta)

11
DL FLOW
• Source Dataset
• IMPORT: Format…
• PREPROCESS: clean, clip, label, normalize, ..
• VISUALIZATION
• Curated Dataset
• TRAIN
• MODEL ZOO
• SCORE + OPTIMIZE, VISUALIZATION
• DEPLOY: tune, compile + runtime
• INFERENCE & MICROSERVICES: REST API
• RESULT *: inference, prediction
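The PREPROCESS stage above (clean, clip, normalize) can be illustrated with a minimal Python sketch. The helper name `preprocess`, the 0 to 255 value range, and the sample values are assumptions made for illustration; they do not come from the slides, and labeling is left out because it is a human-in-the-loop step.

```python
from typing import List, Optional


def preprocess(values: List[Optional[float]], low: float, high: float) -> List[float]:
    """Clean, clip, and normalize one column of raw values (hypothetical helper)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    # Clean: drop missing entries.
    cleaned = [v for v in values if v is not None]
    # Clip: bound every value to the range [low, high].
    clipped = [min(max(v, low), high) for v in cleaned]
    # Normalize: rescale [low, high] to [0.0, 1.0].
    return [(v - low) / (high - low) for v in clipped]


# Made-up source data: one missing value and two out-of-range values.
source_dataset = [12.0, None, 300.0, -5.0, 128.0]
curated_dataset = preprocess(source_dataset, low=0.0, high=255.0)
print(curated_dataset)
```

The missing entry is dropped, 300.0 saturates at the upper bound, and -5.0 saturates at the lower bound before rescaling.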

14
Crowd Source Tools
VATIC
Free Labeled Data
ViPER
Computer Vision
Translation
Speech & Audio
Home-grown

15
STEP 1: Project Setup (Project Manager)
• Project named
• Classifier types defined
• Labeling task settings defined
• Sequences added
STEP 2: Curation (Curator)
• Which pieces of data make the most sense to us
STEP 3: Labeling (Data Labeler)
• Labels created
• Attributes of labels selected
• Frames committed for QA
STEP 4: QA (Data Labeler)
• Frames accepted or rejected
• Rejection reason specified
STEP 5: Export (Data Labeler)
• Data set sent to training
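The labeling, QA, and export steps above can be read as a small state machine per frame. The sketch below is one possible way to model it in Python; the class and function names (`Frame`, `commit_for_qa`, `review`, `export_for_training`) are hypothetical and are not taken from any NVIDIA tool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class FrameStatus(Enum):
    LABELED = "labeled"
    COMMITTED_FOR_QA = "committed_for_qa"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class Frame:
    frame_id: int
    label: str
    status: FrameStatus = FrameStatus.LABELED
    rejection_reason: Optional[str] = None


def commit_for_qa(frame: Frame) -> None:
    """Labeling step: a labeled frame is committed for QA."""
    if frame.status is not FrameStatus.LABELED:
        raise ValueError("only labeled frames can be committed for QA")
    frame.status = FrameStatus.COMMITTED_FOR_QA


def review(frame: Frame, accept: bool, reason: Optional[str] = None) -> None:
    """QA step: accept the frame, or reject it with a stated reason."""
    if frame.status is not FrameStatus.COMMITTED_FOR_QA:
        raise ValueError("frame has not been committed for QA")
    if accept:
        frame.status = FrameStatus.ACCEPTED
        return
    if not reason:
        raise ValueError("a rejection reason must be specified")
    frame.status = FrameStatus.REJECTED
    frame.rejection_reason = reason


def export_for_training(frames: List[Frame]) -> List[Frame]:
    """Export step: only accepted frames are sent to training."""
    return [f for f in frames if f.status is FrameStatus.ACCEPTED]


frames = [Frame(1, "car"), Frame(2, "pedestrian")]
for f in frames:
    commit_for_qa(f)
review(frames[0], accept=True)
review(frames[1], accept=False, reason="bounding box too loose")
training_set = export_for_training(frames)
print([f.frame_id for f in training_set])
```

Requiring a reason on rejection mirrors the "Rejection reason specified" item in the QA step.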

17
NVIDIA DEEP LEARNING SOFTWARE TRAINING STACK
UI / JOB MANAGEMENT / DATASET VERSIONING / VISUALIZATION
DIGITS, NVIDIA GPU Cloud, HumanLoop, MagLev, Keras
Recommendation
At Your Desk, On-Prem, In-the-Cloud

18
Productivity: Workflow, Data and Job Management, Experiments
DIGITS, NVIDIA GPU Cloud, HumanLoop, MagLev, Keras
Deep Learning Software Libraries (AKA Frameworks)
DEEP LEARNING FRAMEWORKS
Architecture Specific Libraries
• DEEP LEARNING: cuDNN
• MATH LIBRARIES: cuBLAS, cuSPARSE, cuFFT
• COMMUNICATION: NCCL
Recommendation

19
Image Classification, Recommendation Engines, Sentiment Analysis
DIGITS, NVIDIA GPU Cloud, NVDocker, Keras, Kubernetes
NV OPTIMIZED, NV ACCELERATED
Paddle
• DEEP LEARNING: cuDNN
• MATH LIBRARIES: cuBLAS, cuSPARSE, cuFFT
• COMMUNICATION: NCCL

20
GENERATIONAL GPU PERFORMANCE & TENSOR CORES
Chart: Single GPU Generational Training Scaling, ResNet-50; 1, 4, 8 GPU training on DGX-1 Volta. Bars: K80, P100, V100, V100 TC; axis 0 to 8.

21
GENERATIONAL GPU PERFORMANCE & TENSOR CORES
Chart: Single GPU Generational Training Scaling, ResNet-50; 1, 4, 8 GPU training on DGX-1 Volta. Bars: K80, P100, V100, V100 TC; axis 0 to 8.
3-3.5X CNN training over Pascal with Volta Tensor Core math

22
TIME TO SOLUTION (HOURS)
Convolutional Neural Networks: Training ImageNet to accuracy (90 epochs) with ResNet-50
Chart: bars for 8x K80, 8x P100, 8x V100; axis 0 to 50 hours
Recurrent Neural Networks: Training OpenNMT to accuracy (13 epochs)
Chart: bars for K80, P100, V100; axis 0 to 40 hours
Annotations: 1 weekend, 1 day, 1 afternoon
23
WHERE TO TRAIN

24
INFERENCE
DEPLOY YOUR TRAINED NETWORK
TO INFER IN APPLICATIONS

25
TRAINED NETWORK MODEL
NOW WHAT?
Chart: Throughput (Images/sec) for CPU, K80 TF, P100 TF, P100 TRT; axis 0 to 2500

26
TRAINED NETWORK MODEL
OPTIMIZE
Chart: Throughput (Images/sec) for CPU, K80 TF, P100 TF, P100 TRT; axis 0 to 2500

27
NVIDIA TENSORRT
Maximize inference throughput for latency critical services
High performance neural network inference optimizer and runtime engine for production deployment
TRAINED NETWORK MODEL -> TensorRT Optimizer -> OPTIMIZED NETWORK -> TensorRT Runtime Engine
• EMBEDDED: Jetson TX
• AUTOMOTIVE: Drive PX (XAVIER)
• DATA CENTER

28
NVIDIA TENSORRT PROGRAMMABLE INFERENCING PLATFORM
TensorRT
TESLA V100, DRIVE PX 2, TESLA P4, JETSON TX2, NVIDIA DLA

29
NVIDIA TensorRT
Programmable Inference Accelerator
• Maximize throughput and minimize latency
• Deploy reduced precision without retraining and without accuracy loss
• Train in any framework, deploy in TensorRT without overhead
Embedded: Jetson; Automotive: Drive PX; Data center: Tesla
developer.nvidia.com/tensorrt

30
VOLTA ON A BUDGET
LATENCY BENCHMARKS
Throughput on a 200 ms latency budget
Charts: Throughput (image/s) vs Latency (ms) for CPU-Only, V100 + TensorFlow, and V100 + TensorRT; workloads ResNet-50 (ImageNet) and OpenNMT (English to German); throughput axis 0 to 6000
Chart annotations: 3X, 6X; data labels 19, 6, 7
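"Throughput on a latency budget" means picking the operating point with the highest throughput among those whose latency stays within the budget. The sketch below shows that selection in Python under stated assumptions: the function name `best_batch_under_budget` is hypothetical, and the latency measurements are made up for illustration, not read from the charts.

```python
from typing import Dict, Optional, Tuple


def best_batch_under_budget(
    latency_ms_by_batch: Dict[int, float], budget_ms: float
) -> Optional[Tuple[int, float]]:
    """Return (batch_size, items_per_second) with the highest throughput
    whose latency fits the budget, or None if no batch size fits."""
    best: Optional[Tuple[int, float]] = None
    for batch_size, latency_ms in latency_ms_by_batch.items():
        if latency_ms > budget_ms:
            continue  # this batch size misses the latency budget
        throughput = batch_size * 1000.0 / latency_ms
        if best is None or throughput > best[1]:
            best = (batch_size, throughput)
    return best


# Made-up measurements: batch size -> latency per batch in milliseconds.
measured = {1: 20.0, 8: 80.0, 32: 200.0, 64: 380.0}
choice = best_batch_under_budget(measured, budget_ms=200.0)
print(choice)
```

With these invented numbers, batch size 64 would give the most items per second but exceeds the 200 ms budget, so the largest batch that still fits is chosen.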

31
ENABLE INT8 INFERENCE
TensorRT is ENABLER for entropy quantization
Training Framework (fp32) -> TensorRT: Calibrate & Quantize (100’s of samples of training data) -> int8 Inference
Maintain accuracy without retraining

Network      FP32 TOP 1   INT8 TOP 1   DIFFERENCE
Alexnet      57.22%       56.96%       0.26%
Googlenet    68.87%       68.49%       0.38%
VGG          68.56%       68.45%       0.11%
Resnet-50    73.11%       72.54%       0.57%
Resnet-101   74.58%       74.14%       0.44%
Resnet-152   75.18%       74.56%       0.61%
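As a rough illustration of the "Calibrate & Quantize" step, the sketch below applies symmetric linear quantization from fp32 to int8 with a saturation threshold. It is a simplified sketch and not TensorRT's implementation: the function names are hypothetical, and the threshold is hand-picked here, whereas the slide describes choosing it through entropy-based calibration on a few hundred samples.

```python
from typing import List


def quantize_int8(values: List[float], threshold: float) -> List[int]:
    """Map fp32 values to int8 codes in [-127, 127], saturating at +/- threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    scale = threshold / 127.0
    quantized = []
    for v in values:
        q = round(v / scale)
        # Saturate values that fall outside the calibrated range.
        quantized.append(max(-127, min(127, q)))
    return quantized


def dequantize_int8(quantized: List[int], threshold: float) -> List[float]:
    """Map int8 codes back to approximate fp32 values."""
    scale = threshold / 127.0
    return [q * scale for q in quantized]


activations = [0.0, 0.5, -1.2, 2.0, 7.5]  # made-up fp32 activations
threshold = 2.0  # assumed saturation threshold, normally found by calibration
codes = quantize_int8(activations, threshold)
restored = dequantize_int8(codes, threshold)
print(codes)
print(restored)
```

The outlier 7.5 saturates at the threshold, which is the trade-off calibration tunes: a smaller threshold gives finer resolution for common values at the cost of clipping rare large ones.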

32
NVIDIA TENSORRT
Maximize inference throughput for latency critical services
• EMBEDDED (Jetson TX): Low power, small footprint, multi-inference
• AUTOMOTIVE (Drive PX (XAVIER)): Real-time execution, high resolution, high throughput, small footprint
• DATA CENTER: Large Batch, Low Latency, Production-ready

33
“On average TensorRT has doubled the speed of our inference which is pretty amazing!”
Source: Paul Kruszewski, CEO, WRNCH
“On average we see around 10x speedup, with between 3-70x speedups depending on the scenarios”
Source: Matthew Zeiler, CEO, Clarifai
“Self-driving cars having real-time execution is obviously very important. With our ResNet101 network, TensorRT brought our inference time down from 250ms to 89ms.”

35
FAST IMPLEMENTATION OF TENSORFLOW

37
DL DATACENTER WORKFLOW
TensorRT increases productivity and shortens time to results
• TRAIN
• MODEL ZOO
• SCORE + OPTIMIZE, VISUALIZATION
• DEPLOY: tune, compile + runtime (Automated with TensorRT)
• INFERENCE & MICROSERVICES: REST API
• RESULT: inference, prediction
• A/B Testing, Use data

38
DL EDGE / IVA WORKFLOW
Transfer Learning: Train and deploy to edge in less than a minute
NVIDIA DIGITS: >10k pulls, >2.5k stars

39
DEMO DEEP LEARNING WORKFLOW
Transfer Learning: Train and deploy to edge in less than a minute
A special THANK YOU!
Zheng Liu & Varun Praveen

41
WHO, WHAT, WHERE
RESEARCHERS: Explore the “next big thing” opportunity to fuel business
APPLIED DL / DATA SCIENTISTS: Retrain w/ data, productize models for consistency, focus on quality
APPLICATION DEVELOPER: Scale and deploy successful applications w/ great user experience

42
WHO, WHAT, WHERE
RESEARCHERS: Explore the “next big thing” opportunity to fuel business, and find ways to productize it
APPLIED DL / DATA SCIENTISTS: Retrain, productize models for consistency, quality, tuning with right data
Applications: Object Detection, Voice Recognition, Language Translation, Recommendation Engines, Sentiment Analysis, Image Classification
Paddle

43
WHO, WHAT, WHERE
RESEARCHERS: Explore the “next big thing” opportunity to fuel business, and find ways to productize it
DATA SCIENTISTS: Retrain, productize models for consistency, quality, tuning with right data
Applications: Object Detection, Voice Recognition, Language Translation, Recommendation Engines, Sentiment Analysis, Image Classification
Training or Deploying
TensorRT
Paddle
