Harnessing the virtual realm for successful real world artificial intelligence

Alison B. Lowndes
AI DevRel | NVIDIA

ANNOUNCING
NVIDIA MAXINE
Early Access at developer.nvidia.com/maxine

https://arxiv.org/pdf/1912.04958.pdf

25 YEARS OF ACCELERATED COMPUTING
X-FACTOR SPEED UP FULL STACK ONE ARCHITECTURE
SYSTEMS
GPU
CPU

25 YEARS OF ACCELERATED COMPUTING
X-FACTOR SPEED UP FULL STACK DATA-CENTER SCALE
GPU
CPU
DPU
ONE ARCHITECTURE

NVIDIA SELENE
Featuring NVIDIA DGX A100 640GB
4,480 A100 GPUs
560 DGX A100 systems
850 Mellanox 200G HDR switches
14 PB of high-performance storage
2.8 EFLOPS of AI peak performance
63 PFLOPS HPL @ 24GF/W
https://blogs.nvidia.com/blog/2020/12/18/nvidia-selene-busy/
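The Selene figures above can be cross-checked with a little arithmetic. The sketch below uses only numbers from the slide; the variable names are illustrative, and the implied power draw assumes the 24 GF/W efficiency applies to the 63 PFLOPS HPL run.

```python
# Figures copied from the Selene slide; names are illustrative only.
gpus = 4480
systems = 560
hpl_flops = 63 * 10**15          # 63 PFLOPS HPL
efficiency_flops_per_w = 24 * 10**9  # 24 GF/W

# GPUs per DGX A100 system implied by the two counts.
gpus_per_system = gpus // systems

# Power implied by dividing HPL performance by efficiency (assumption:
# both figures describe the same run).
implied_power_mw = hpl_flops / efficiency_flops_per_w / 1e6

print(f"GPUs per system: {gpus_per_system}")
print(f"Implied HPL power: {implied_power_mw:.3f} MW")
```

The first value matches the eight-GPU DGX A100 configuration; the second is only an estimate derived from the slide, not a published power figure.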

SINGLE A100 WITH MIG RUNS ALL MLPERF TESTS…
AT THE SAME TIME
Delivers 98% of Performance of a Single MIG Instance Running Alone
MLPerf v1.0 Inference Closed; Per-accelerator performance derived from the best MLPerf results for respective submissions using reported accelerator count in Data Center Offline and Server. 3D U-Net 99%, ResNet-50, SSD-Large, DLRM 99%, RNN-T, BERT 99%: 1.0-26. MLPerf name and logo are trademarks. See www.mlperf.org for more information.
Chart: single A100 with 7 MIG instances enabled (ResNet-50 v1.5, 3D-UNet 99%, RNN-T, BERT-Large, SSD-Large, DLRM, ResNet-50 v1.5): 98% performance vs. a MIG instance running alone

TODAY’S AI DATA CENTER
50 DGX-1 systems for AI training
600 CPU systems for AI inference
$11M | 25 racks | 630 kW

DGX A100 DATA CENTER
5 DGX A100 systems for AI training and inference
$1M | 1 rack | 28 kW
1/10th the COST | 1/20th the POWER
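As a quick check on the headline ratios, the sketch below recomputes the reduction factors from the slide's own figures. The dictionary keys and function name are illustrative, not part of any NVIDIA tool.

```python
# Figures copied from the slide; keys are illustrative names.
legacy = {"cost_usd_m": 11, "racks": 25, "power_kw": 630}
dgx_a100 = {"cost_usd_m": 1, "racks": 1, "power_kw": 28}

def reduction_factors(before, after):
    """Return how many times smaller each figure is after the change."""
    return {key: before[key] / after[key] for key in before}

factors = reduction_factors(legacy, dgx_a100)
for key, factor in factors.items():
    print(f"{key}: {factor:.1f}x lower")
```

The computed factors (11x cost, 22.5x power) are slightly better than the rounded "1/10th" and "1/20th" shown on the slide.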

EXPANDING NGC
NEW CONTAINERS FOR A100 & ARM: Now
HPC Simulation & Visualization | AI Frameworks (A100)
Chroma | AutoDock 4 | VMD
NGC-READY SYSTEMS FOR A100: Starting Q3
NEW FEATURES: Now
NGC Private Registry
NGC Container Environment Modules
Higher HPC app performance w/ NVTAGS
Multi-arch support for x86, Arm and Power
* Available week of June 22 ** Available starting with v20.06
Learn More – ngc.nvidia.com | NGC Private Registry | NVTAGS | NGC Container Environment Modules

ENABLING ENTERPRISE TRANSFORMATION WITH AI
End to End Application Frameworks
Desktop Development Data Center Solutions Accelerated Edge Supercomputers GPU-Accelerated Cloud
Jarvis: Conversational AI
Merlin: Recommender Systems
Metropolis: Smart Cities
Clara: Healthcare
Isaac: Robotics
Drive: Autonomous Vehicles
Aerial: Telecom

BUILDING AN AI PRODUCT
Diagram labels: SENSORS | PERCEIVE | REASON | PLAN | ACTUATORS | DATA | DATA ANALYTICS | MACHINE LEARNING | AI MODEL | VALIDATION

BIG DATA PIPELINE
1. INGESTION: Obtaining and importing data
2. STORAGE: Organizing & storing data for future use
3. PROCESSING: Manipulating and analyzing the data
4. SERVING: Operationalizing the solution
Ingredients:
• Lots of data
• Lots of compute
• Software tools
• Time and patience
Method:
1. Collect raw, massive sets of data.
2. Put the data in a Data Lake.
3. Grab the data that you need and sort through.
4. Find patterns in the data.
5. Solve the problem.
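The four pipeline stages named on this slide can be sketched as four small functions. This is a toy illustration under stated assumptions, not a production design: the record format (JSON lines with hypothetical `sensor` and `value` fields) and the in-memory "data lake" are invented for the example.

```python
import json
import statistics
from collections import defaultdict

def ingest(raw_lines):
    """Ingestion: parse raw JSON lines, skipping malformed records."""
    records = []
    for line in raw_lines:
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # drop lines that are not valid JSON
    return records

def store(records, lake):
    """Storage: append readings to an in-memory 'data lake' keyed by sensor."""
    for rec in records:
        lake[rec["sensor"]].append(rec["value"])
    return lake

def process(lake):
    """Processing: find a simple pattern, here the mean reading per sensor."""
    return {sensor: statistics.mean(values) for sensor, values in lake.items()}

def serve(summary, sensor):
    """Serving: answer a query from the processed summary (None if unknown)."""
    return summary.get(sensor)

raw = [
    '{"sensor": "a", "value": 1.0}',
    'not json',
    '{"sensor": "a", "value": 3.0}',
    '{"sensor": "b", "value": 5.0}',
]
lake = store(ingest(raw), defaultdict(list))
summary = process(lake)
print(serve(summary, "a"))
```

In a real deployment each stage would typically be a separate system (message queue, object store, batch or streaming engine, model server); the point here is only the hand-off between stages.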

HARNESSING AI
Step I: Build data fabric for your organization
Step II: Define your objective
Step III: Hire the right talent
Step IV: Identify key processes to augment with AI
Step V: Create a sandbox lab environment
Step VI: Operationalize successful pilots
Step VII: Scale up for enterprise-wide adoption
Step VIII: Drive cultural change

ARTIFICIAL INTELLIGENCE IS DOMAIN SPECIFIC
World | Sense | See, Understand | Automation
Diagram (built up over three slides): an AI Program and Computer pair for each domain: Self-Driving, Manufacturing, Radiology

Image “Volvo XC90”
Image source: “Unsupervised Learning of Hierarchical Representations with Convolutional Deep Belief Networks” ICML 2009 & Comm. ACM 2011.
Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Ng.
CONVOLUTIONAL NEURAL NETWORKS

FULLY CONVOLUTIONAL NETWORK
https://github.com/NVIDIA/MinkowskiEngine

THE MAGIC OF DEEP LEARNING
RT DENOISING | VIDEO TO 3D | CHARACTER LOCOMOTION | CHARACTER CONCEPTING | AUDIO TO FACIAL ANIMATION | PHYSICS SIMULATION
Clothing models from UC Berkeley Garment Library

ADVANCED TOOLS AND TECHNOLOGIES
Foundational Platform Components

LEARN MORE ABOUT OMNIVERSE
DEVELOPER TOOLS EARLY ACCESS CLOSED BETA TUTORIAL COLLECTION
OPEN BETA DOWNLOAD
WEBSITE

Video: ANYmal (RL training in simulation)

https://arxiv.org/pdf/1810.05762.pdf

END-TO-END GPU RL
Grasping Robot Use Case

THE IMPORTANCE OF SYNTHETIC DATA
https://blogs.nvidia.com/blog/2021/06/08/what-is-synthetic-data/

Retail & supply chain
GTC talk S31538 with Kinetic Vision

PURPOSE BUILT PRE-TRAINED NETWORKS
Number of classes: 3
Dataset: 750k frames
Accuracy: 84%
Accuracy: 84%
Dataset: 56k frames
Accuracy: 88%
Dataset: 60k Frames
Accuracy: 92%
Number of Classes: 4
Accuracy: 84%
Dataset: 600k images
Accuracy: 95%
PeopleNet
TrafficCamNet
VehicleTypeNet
DashCamNet FaceDetect-IR
VehicleMakeNet
Highly Accurate | Re-Trainable | Out of Box Deployment

ANNOUNCING
JARVIS OPEN BETA
Integrated AI Skills with Pre-Trained Models
Fully Customizable Application Pipeline
Human Voice with Neural TTS
Superhuman NLU with Megatron-BERT
<300 ms Latency | 7X Throughput | 1/3rd Cost
Sign Up at developer.nvidia.com/nvidia-jarvis
State-of-the-Art Conversational AI

LEARN MORE
Conversational AI
Developer Overview
NVIDIA Jarvis
Product Page
Conversational AI Demo Videos
"Misty" | "Mark" | In-car
Conversational AI Explainer Videos
YouTube Playlist
Jarvis Intro Blog Conversational AI Corp Blogs Intro to building Conversational AI
Apps for Enterprise (Webinar)

RECOMMENDERS —
THE PERSONALIZATION ENGINE OF THE INTERNET
DIGITAL CONTENT
2.7 Billion
Monthly Active Users
E-COMMERCE
2 Billion
Digital Shoppers
SOCIAL MEDIA
3.8 Billion
Active Users
DIGITAL ADVERTISING
4.7 Billion
Internet Users
Diagram labels: User | User Embedding | Item | Item Embedding
Items, O(10^9) → Candidate Generation → O(10^2) → Ranking → Recommended Items, O(10)
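The funnel in this diagram (a very large catalog narrowed by candidate generation, then ranked down to a handful of recommendations) can be sketched in a few lines. This is a minimal illustration, not NVIDIA Merlin's API: the catalog is scaled down to 10,000 random items, the embeddings are random, and the ranking score with its `popularity` term is a hypothetical stand-in for a heavier ranking model.

```python
import heapq
import random

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def candidate_generation(user_vec, item_vecs, k):
    """Cheap retrieval: top-k item ids by user/item embedding dot product."""
    return heapq.nlargest(
        k, item_vecs, key=lambda item_id: dot(user_vec, item_vecs[item_id])
    )

def rank(user_vec, item_vecs, candidates, popularity, n):
    """Costlier scoring applied only to the candidates (hypothetical score)."""
    def score(item_id):
        return dot(user_vec, item_vecs[item_id]) + 0.1 * popularity[item_id]
    return sorted(candidates, key=score, reverse=True)[:n]

random.seed(0)
DIM = 8
# Toy catalog: 10**4 items instead of the 10**9 on the slide.
item_vecs = {i: [random.gauss(0, 1) for _ in range(DIM)] for i in range(10_000)}
popularity = {i: random.random() for i in item_vecs}
user_vec = [random.gauss(0, 1) for _ in range(DIM)]

candidates = candidate_generation(user_vec, item_vecs, k=100)   # O(10^2)
recommended = rank(user_vec, item_vecs, candidates, popularity, n=10)  # O(10)
print(len(item_vecs), len(candidates), len(recommended))
```

The design point is that the expensive scorer never sees the full catalog; only the inexpensive embedding lookup does.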

TRANSFER LEARNING TOOLKIT (TLT)
Zero Code Approach | Domain Adaptability
Purpose-Built Pretrained Models | Quantization Aware Training with TLT | Automatic Mixed Precision with TLT
2X Inference Speedup | 1.5X Training time Speedup | 10X Overall Development Time Speedup
Highly Accurate
SmartCow is building turnkey AIoT solutions to optimize turnaround time at ports and dry docks. “By using TLT, we were able to reduce the training iterations by 9x and reduce the data collection and labeling effort by 5x which significantly reduces our training cost by 2x”
“Using NVIDIA’S TLT made training a real time car detector and license plate detector easy. It eliminated our need to build models from the ground up, resulting in faster development of models and ability to explore options”

DGX STATION A100 320G
Workgroup Appliance for the Age of AI
First and only workstation with 4-way NVIDIA A100 GPUs, NVLink, and MIG
Four A100 Tensor Core GPUs, 320 GB total HBM2E
Multi-Instance GPU (MIG) for up to 28 GPU instances in a single DGX Station A100
3rd generation NVLink: 200 GB/s bi-directional bandwidth between any GPU pair, almost 3x compared to PCIe Gen4
New maintenance-free refrigerant cooling system
CPU and Memory: 64-core AMD® EPYC® CPU, PCIe Gen4; 512 GB system memory
Internal Storage: 1.92 TB NVMe M.2 SSD for OS; 7.68 TB NVMe U.2 SSD for data cache
Connectivity: 2x 10GbE (RJ45); 4x Mini DisplayPort for display out; remote management 1GbE LAN port (RJ45)

NEW DGX A100 640GB SYSTEM
For the Largest AI Workloads
640 GB of GPU memory per system to increase model accuracy and reduce time-to-solution
Up to 3X higher throughput for large-scale workloads
Double the GPU memory for MIG for more flexible AI development, analytics, and inference
Available individually, or as part of DGX SuperPOD Solution for Enterprise
Upgrade option for current DGX A100 customers
Speedups Normalized to Number of GPUs | Comparisons to A100 40GB | Measurements performed on DGX A100 servers. AI Training: DLRM (Huge CTR) | DGX A100: 16x A100 40GB vs 8x A100 80GB | speedup = 1.4X. Speedup normalized to number of GPUs = 2.8X. AI Inference: RNN-T (MLPerf 0.7 Single stream latency) | DGX A100: A100 40GB vs A100 80GB on 1MIG@10GB when configured for 7MIGs | Data Analytics: big data benchmark with RAPIDS(0.16), BlazingSQL(0.16), DASK(2.2.0) | 30 analytical retail queries, ETL, ML, NLP | 96x A100 40GB vs 48x A100 80GB | HPC: Quantum Espresso - CNT10POR8 40x A100 40GB vs 24x A100 80GB | Speedup normalized to number of GPUs = 1.8X
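The "speedup normalized to number of GPUs" in the footnote follows from scaling the measured speedup by the ratio of GPU counts. The sketch below reproduces the DLRM figure from the slide; the function name is illustrative, and the formula is inferred from the slide's own numbers (1.4X on half as many GPUs gives 2.8X).

```python
def normalized_speedup(raw_speedup, baseline_gpus, new_gpus):
    """Scale a measured speedup by the ratio of baseline to new GPU counts."""
    return raw_speedup * baseline_gpus / new_gpus

# DLRM training: 16x A100 40GB vs 8x A100 80GB, measured speedup 1.4X.
dlrm = normalized_speedup(1.4, baseline_gpus=16, new_gpus=8)
print(f"DLRM normalized speedup: {dlrm:.1f}X")
```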

NVIDIA GPUs IN THE CLOUD
AVAILABLE ON-DEMAND FROM THE TOP CLOUD SERVICE PROVIDERS
• Immediate access to NVIDIA GPU infrastructure for data science in the cloud
• Wide variety of deployment and management options using containers, Kubernetes, Kubeflow, support for cloud native services, and more

RICH CONTENT PORTFOLIO
Fundamentals and advanced hands-on training in key technologies and application domains
AI for Digital Content Creation
Deep Learning Fundamentals
AI for Healthcare
AI for Autonomous Vehicles
AI for Intelligent Video Analytics
Accelerated Computing Fundamentals
AI for Robotics
AI for Predictive Maintenance
Accelerated Data Science Fundamentals
Intro to AI in the Data Center
AI for Anomaly Detection
AI for Industrial Inspection
NVIDIA.com/dli

PROFESSIONAL SERVICES
NVIDIA works with a large network of service delivery partners to provide services on NVIDIA-accelerated platforms.
AI Service Delivery Partners
Contact us directly to start a dialogue about your specific needs:
professionalservices@nvidia.com

NVIDIA INCEPTION
ACCELERATING 6K STARTUPS WORLDWIDE
EXPERTISE
NVIDIA Deep Learning Institute
Training in AI, accelerated computing, and
accelerated data science
TECHNOLOGY ASSISTANCE
Developer resources, preferred pricing on on-prem
GPUs, and cloud credits through our global partners
GO-TO-MARKET SUPPORT
Networking events and exposure opportunities
through NVIDIA
VENTURE CAPITAL FUNDING & ECOSYSTEM
NVIDIA Inception GPU Ventures
Investing in breakthrough startups and facilitating
engagements with the VC community
www.nvidia.com/inception
