HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD

HPC AT-SCALE ENABLED BY
DDN A3I AND NVIDIA SUPERPOD
William Beaudin
Sr Director, Engineering
wbeaudin@ddn.com
DDN ©2020 DataDirect Networks, Inc.
THE LEADER AT SCALE,
PROVEN IN PRODUCTION
20 years of HPC experience
The largest environments,
the most exacting requirements
DDN A3I Solutions – The World’s Fastest HPC Storage – Made Simple
FASTEST PERFORMANCE,
EFFORTLESS GROWTH
Faster and deeper insight
Complexity eliminated,
seamless end-to-end integration
RELIABLE, RESILIENT
AND FLEXIBLE
24x7 Productivity
The universal AI platform
for all stages of the data cycle
Optimized AI Platforms
For Every Use Case
Accelerate applications by achieving full GPU saturation on DGX
Streamline concurrent and continuous deep learning workflows
Flexible configuration with best technology and economics
Seamless scaling to match evolving workflow needs
Optimized for DGX platforms and NGC containers for DL and HPC
Easy to deploy and manage with turnkey support from DDN and partners
DDN Confidential
DDN A3I X-APPLIANCES – THE BUILDING BLOCKS FOR DATA AT-SCALE
ALL-NVME APPLIANCE FULLY-OPTIMIZED
FOR THE MOST INTENSIVE WORKLOADS
FAST, FLEXIBLE HYBRID APPLIANCE
SCALES EFFORTLESSLY WITH BEST DENSITY
FULLY VALIDATED AT-SCALE WITH NVIDIA DGX-2 SUPERPOD!
AI7990XX AI400XX
DDN
THE SUPERPOD ACCELERATOR
AI400XX
DDN A3I APPLIANCES MAKE SUPERPOD FAST AND EASY TO DEPLOY
► Simplified design with predictable performance
and capacity with future scaling
► Validated reference architectures for easier
planning with at-scale workflows
► Fully-configured appliances arrive ready to
deploy and install in minutes
► Seamless integration with NVIDIA DGX for
moving rapidly to production
► Comprehensive expert services from DDN and
partners delivered globally
COMMON IO PATH: HBA, STORAGE, SAN SWITCH, HBA, SERVER, HCA/NIC, SWITCH, HCA/NIC, CLIENT
DDN A3I IO PATH: SWITCH, HCA/NIC, CLIENT
SIMPLIFIED STACK WITH
DDN A3I APPLIANCES
FILESYSTEM
DDN A3I APPLIANCES REDUCE COST AND COMPLEXITY
SIMPLE TO DEPLOY, MANAGE AND SCALE!
DDN A3I SHARED PARALLEL ARCHITECTURE
LEGACY NAS ARCHITECTURE
NAS FILE SERVER, NAS CLIENT
NAS IS A BOTTLENECK
CRIPPLES AS YOU GROW
DDN A3I ARCHITECTURE - TRUE END-TO-END PARALLELISM
EMBEDDED A3I SERVER (x4), A3I CLIENT (x2)
DDN IS FAST AND RELIABLE
SCALES SEAMLESSLY AS YOU GROW
INFINITE DGX POD PERFORMANCE
[Chart: images per second (0 to 140,000) versus number of GPUs (8, 16, 32, 64, 128), NFS Storage compared with DDN Storage]
At-scale multi-node distributed training
application using 8 NVIDIA V100 GPUs per
server engaged simultaneously.
Results demonstrate linear scaling, with full
application performance up to 128 GPUs.
Storage built on the NFS architecture and protocol
stalls at 4 nodes, even with all-flash disks and
a high-speed InfiniBand network.
DDN A3I WITH DGX POD
SCALES SEAMLESSLY WITHOUT BOUNDS
NFS MAX
DDN LIMITLESS
SCALING
DDN A3I SOLUTIONS – NVIDIA SUPERPOD REFERENCE ARCHITECTURE
► AI400 + DGX-2 SuperPOD at-scale testing and
validation published by NVIDIA:
• The AI400 All-flash appliance delivers incredible
sequential and random read performance, as
required by the heaviest DL workloads.
• Metadata performance scales well from 1 to 96
nodes, with no degradation as the number of
nodes and threads increases.
• The AI400 is a fully-integrated platform that’s easy
to deploy. DDN provides excellent technical
deployment and support services.
► RA document available from NVIDIA website
NVIDIA DGX-2 SUPERPOD
REFERENCE ARCHITECTURE
DDN A3I SCALES FLEXIBLY TO MATCH YOUR SUPERPOD ENVIRONMENT
ALL-NVME: ALL-FLASH SOLUTION FOR MAXIMUM PERFORMANCE
MULTI-TIER: HYBRID OPTIMIZED FOR MIXED WORKFLOWS
MULTI-SITE: DISTRIBUTED SYSTEMS PER DATACENTER CAPACITY
MULTI-CLOUD: FULL-CLOUD INTEGRATION AND BETWEEN CLOUDS
DDN A3I MULTIRAIL ENABLES PLUG-AND-PLAY HPC NETWORKING
Fast, secure, resilient
networking made easy
Enhanced algorithm enables grouping of
multiple network interfaces to achieve the full
aggregate throughput capability of a node.
Intelligent interface selection and traffic
management deliver unprecedented node
performance and dynamic load-balancing.
Active link health monitoring ensures rapid
failure detection and automatic recovery.
Multi-Rail Networking Discrete Networking
DDN A3I Multirail greatly simplifies DGX deployments.
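The behaviour described on this slide (grouping interfaces, balancing load, and steering around failed links) can be illustrated with a toy model. This is not DDN's algorithm, which the slide does not detail. The `Rail` and `MultiRail` classes, the interface names, and the least-bytes-sent selection policy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rail:
    """One network interface in the group (hypothetical model)."""
    name: str
    healthy: bool = True
    bytes_sent: int = 0


class MultiRail:
    """Toy multi-rail group: sends each transfer on the least-loaded healthy rail."""

    def __init__(self, rails):
        self.rails = list(rails)

    def pick(self):
        # Only rails whose link is currently marked healthy are candidates.
        healthy = [r for r in self.rails if r.healthy]
        if not healthy:
            raise RuntimeError("no healthy rails available")
        # Assumed policy: choose the rail that has carried the fewest bytes.
        return min(healthy, key=lambda r: r.bytes_sent)

    def send(self, nbytes):
        rail = self.pick()
        rail.bytes_sent += nbytes
        return rail.name


if __name__ == "__main__":
    group = MultiRail([Rail("ib0"), Rail("ib1"), Rail("ib2"), Rail("ib3")])
    for _ in range(8):
        group.send(1_000_000)
    group.rails[0].healthy = False  # simulate a link failure on ib0
    print(group.send(1_000_000))    # traffic continues on a remaining rail
```

With all rails healthy, traffic spreads evenly across the group. Once a rail is marked unhealthy it simply stops being selected, which is the failover behaviour the slide attributes to link health monitoring.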
DDN A3I - GPUDIRECT TO STORAGE DOUBLES DGX-2 THROUGHPUT!
80 GB/s per client,
2.0X performance gain
Native client integration with A3I, fully-
transparent to users and applications
Enables a direct path to transfer data
between GPU memory and data storage
Eliminates unnecessary memory copies,
lowers CPU overhead, reduces latency,
bypasses hardware architecture limitations
Improves AI, DL, HPC application performance
[Chart: GPU read throughput with two DDN AI400. No GPUDirect (CPU IO): 39.8 GB/s. With GPUDirect: 80 GB/s.]
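As a quick check of the figures on this slide, the sketch below computes the ratio between the two quoted read throughputs. The function name is illustrative and not part of any DDN or NVIDIA tooling. The two values are the ones quoted on the chart.

```python
def speedup(baseline_gbps: float, accelerated_gbps: float) -> float:
    """Ratio of accelerated throughput to baseline throughput."""
    if baseline_gbps <= 0:
        raise ValueError("baseline throughput must be positive")
    return accelerated_gbps / baseline_gbps


# Values quoted on the slide, in GB/s.
CPU_IO_GBPS = 39.8      # read path staged through CPU memory
GPUDIRECT_GBPS = 80.0   # direct path between storage and GPU memory

ratio = speedup(CPU_IO_GBPS, GPUDIRECT_GBPS)
print(f"GPUDirect speedup: {ratio:.2f}x")
```

The ratio comes out at roughly two, which matches the slide title's claim that GPUDirect doubles DGX-2 throughput.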
5 DGX-2, 10 AI400, 400 GB/s
[Chart: GPU read throughput scaling with ten AI400s, 0 to 400 GB/s across 1 to 5 DGX-2 systems]
DDN DELIVERS LINEAR PERFORMANCE SCALING
DDN A3I GPUDirect integration delivers
up to 80 GB/s of throughput per DGX-2
Enables a direct path to transfer data
between GPU memory and data storage
Performance scales linearly and provides
maximum at-scale application acceleration
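A minimal sketch of what linear scaling means for the numbers quoted here, assuming each DGX-2 contributes the stated 80 GB/s. The helper names are hypothetical. The efficiency function shows how one might compare a measured aggregate against the ideal linear figure.

```python
PER_DGX2_GBPS = 80.0  # per-client throughput quoted on the slide


def ideal_aggregate(nodes: int, per_node_gbps: float = PER_DGX2_GBPS) -> float:
    """Aggregate throughput if every node adds the same per-node rate."""
    return nodes * per_node_gbps


def scaling_efficiency(measured_gbps: float, nodes: int,
                       per_node_gbps: float = PER_DGX2_GBPS) -> float:
    """Measured aggregate as a fraction of the ideal linear aggregate."""
    return measured_gbps / ideal_aggregate(nodes, per_node_gbps)


for n in range(1, 6):
    print(f"{n} DGX-2: {ideal_aggregate(n):.0f} GB/s ideal")
```

Under this assumption five DGX-2 systems reach 400 GB/s, the figure shown on the chart, so the quoted result corresponds to a scaling efficiency of 1.0.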
Preprocess
Supercomputing Multi-Physics Workflows AMR Checkpoint
Classify Manage Train Tune Infer
User & Service Management
Data Management
Multi-Tenanted
Security
More Value from All
your Data In One Place
► Your data in the right place at the right time
► Multicloud Ready
► The right user and service management to aid your
environment’s efficiencies
► Comprehensive Security for Containers, Cloud and On
Prem
Transparent Tiering
NFS SMB HDFS S3
REAL-TIME
EXASCALE
ANALYTICS
DDN A3I INTEGRATED MONITORING
PLATFORM DELIVERS CURRENT AND
HISTORICAL METRICS TO UNDERSTAND
SYSTEM PERFORMANCE, OPTIMIZE
OPERATIONS, AND ACHIEVE THE FULL
POTENTIAL OF YOUR APPLICATIONS.
DISCOVER INSIGHTFUL TRENDS TO
OPTIMIZE YOUR SUPERPOD
DDN A3I - PARALLEL DATA PATHS TO CONTAINERS
Network Infrastructure
Host Operating System
Containerized Apps
DDN A3I Container-Optimized Client
DDN A3I Appliances
NGC
Full NVMe Performance,
In-Container
DDN A3I enables seamless, fast file-level
access to shared storage directly from
containerized applications at runtime
Compartmentalized data access and multi-
tenancy with trusted levels of segregation
The capability is inserted at runtime with a
universal wrapper and does not require any
modification to the application or container
DDN A3I: SCALE SECURELY WITH SUPERIOR VISIBILITY AND CONTROL
AUTHENTICATION: Establish user and node identity with full confidence.
ACCESS CONTROL: Enforce policy and multiple levels of classification.
MULTITENANCY: Share infrastructure to enable limitless at-scale flexibility.
ENCRYPTION: Secure all your data end-to-end, live and at rest.
AUDITING: Record and retain activity for review and compliance.
SECURED
INNOVATION
WITH MULTITENANCY
Shared access to high-performance
infrastructure enables the most productive
and efficient collaborations at-scale.
Container-based authentication allows
your business to securely and seamlessly
deliver the right data to the right teams.
DDN LIMITLESS FLEXIBILITY AND SCALABILITY FOR ALL WORKLOADS
DDN A3I
MASSIVELY
ACCELERATES
LIFE SCIENCES
RESEARCH
100X
FASTER GENOMICS
1600X
FASTER MICROSCOPY
40X FASTER
GENOMICS
WITH DDN A3I SOLUTIONS
AND NVIDIA DGX AT SCALE!
• Robust national biobank improved through
wider access to data and better performance
through an expanding storage infrastructure
• HPC computational resources: 300 compute
node cluster, 3 shared memory compute
nodes, and NVIDIA DGX systems
• 29 PB of data and growing
• 40X speedup of genomic analysis with DDN,
NVIDIA and Parabricks
DDN A3I ACCELERATED LIFE SCIENCES RESEARCH DATA SOLUTIONS
STORE, PROCESS, ANALYZE, VISUALIZE
FROM A SINGLE DATA PLATFORM
CAPTURE IN REAL TIME
FROM MULTIPLE INSTRUMENTS
THE FASTEST
TIME TO RESULTS
TURNKEY SUPERPOD SOLUTIONS FROM DDN AND GLOBAL PARTNERS
NVIDIA SuperPOD approved partners
delivering completely integrated solutions
with DDN A3I end-to-end workflow
enablement and acceleration.
Jade -- 1st SuperPOD Worldwide
Oxford University and The Hartree Centre
SuperPOD solution delivered by ATOS
ACCELERATE YOUR
SUPERPOD WITH AI400XX
ACHIEVE FULL GPU PERFORMANCE
GROW SEAMLESSLY TO EXASCALE
BOOST WORKLOADS WITH GPUDIRECT
ENABLE END-TO-END AI WORKFLOWS
RELY ON THE AT-SCALE EXPERTS
DDN.COM/A3I
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD

  • 1. HPC AT-SCALE ENABLED BY DDN A3I AND NVIDIA SUPERPOD William Beaudin Sr Director, Engineering wbeaudin@ddn.com
  • 2. DDN ©2020 DataDirect Networks, Inc. THE LEADER AT SCALE, PROVEN IN PRODUCTION 20 years of HPC experience The largest environments, the most exacting requirements DDN A3I Solutions – The World’s Fastest HPC Storage – Made Simple FASTEST PERFORMANCE, EFFORTLESS GROWTH Faster and deeper insight Complexity eliminated, seamless end-to-end integration RELIABLE, RESILIENT AND FLEXIBLE 24x7 Productivity The universal AI platform for all stages of the data cycle
  • 3. DDN ©2020 DataDirect Networks, Inc. Optimized AI Platforms For Every Use Case Accelerate applications by achieving full GPU saturation on DGX Streamline concurrent and continuous deep learning workflows Flexible configuration with best technology and economics Seamless scaling to match evolving workflow needs Optimized for DGX platforms and NGC containers for DL and HPC Easy to deploy and manage with turnkey support from DDN and partners ®
  • 4. DDN Confidential DDN ©2020 DataDirect Networks, Inc. DDN A3I X-APPLIANCES – THE BUILDING BLOCKS FOR DATA AT-SCALE ALL-NVME APPLIANCE FULLY-OPTIMIZED FOR THE MOST INTENSIVE WORKLOADS FAST, FLEXIBLE HYBRID APPLIANCE SCALES EFFORTLESSLY WITH BEST DENSITY FULLY VALIDATED AT-SCALE WITH NVIDIA DGX-2 SUPERPOD! AI7990XXAI400XX
  • 5. DDN ©2020 DataDirect Networks, Inc. DDN THE SUPERPOD ACCELERATOR AI400XX
  • 6. DDN ©2020 DataDirect Networks, Inc. DDN A3I APPLIANCES MAKE SUPERPOD FAST AND EASY TO DEPLOY ► Simplified design with predictable performance and capacity with future scaling ► Validated reference architectures for easier planning with at-scale workflows ► Fully-configured appliances arrive ready to deploy and install in minutes ► Seamless integration with NVIDIA DGX for moving rapidly to production ► Comprehensive expert services from DDN and partners delivered globally
  • 7. DDN ©2020 DataDirect Networks, Inc. HBA STORAGESAN SWITCHHBASERVERHCA/NICSWITCHHCA/NICCLIENT DDN A3I IO PATH SWITCHHCA/NICCLIENT SIMPLIFIED STACK WITH DDN A3I APPLIANCES FILESYSTEM COMMON IO PATH DDN A3I APPLIANCES REDUCE COST AND COMPLEXITY SIMPLE TO DEPLOY, MANAGE AND SCALE!
  • 8. DDN A3I SHARED PARALLEL ARCHITECTURE. LEGACY NAS ARCHITECTURE [diagram labels: NAS FILE SERVER, NAS CLIENT]: NAS IS A BOTTLENECK, CRIPPLES AS YOU GROW. DDN A3I ARCHITECTURE - TRUE END-TO-END PARALLELISM [diagram labels: A3I CLIENT (x2), EMBEDDED A3I SERVER (x4)]: DDN IS FAST AND RELIABLE, SCALES SEAMLESSLY AS YOU GROW.
  • 9. INFINITE DGX POD PERFORMANCE. [Chart: images per second (0 to 140,000) versus number of GPUs (8, 16, 32, 64, 128), comparing NFS Storage with DDN Storage; annotations: NFS MAX, DDN LIMITLESS SCALING.] At-scale multi-node distributed training application using 8 NVIDIA V100 GPUs per server engaged simultaneously. Results demonstrate linear scaling, with full application performance up to 128 GPUs. Storage based on the NFS architecture and protocol stalls at 4 nodes, even with all-flash disks and a high-speed InfiniBand network. DDN A3I WITH DGX POD SCALES SEAMLESSLY WITHOUT BOUNDS.
  • 10. DDN A3I SOLUTIONS – NVIDIA SUPERPOD REFERENCE ARCHITECTURE ► AI400 + DGX-2 SuperPOD at-scale testing and validation published by NVIDIA: • The AI400 all-flash appliance delivers incredible sequential and random read performance, as required by the heaviest DL workloads. • Metadata performance scales well from 1 to 96 nodes, with no degradation as the number of nodes and threads increases. • The AI400 is a fully-integrated platform that’s easy to deploy. DDN provides excellent technical deployment and support services. ► RA document available from the NVIDIA website: NVIDIA DGX-2 SUPERPOD REFERENCE ARCHITECTURE
  • 11. DDN A3I SCALES FLEXIBLY TO MATCH YOUR SUPERPOD ENVIRONMENT. ALL-NVME: ALL-FLASH SOLUTION FOR MAXIMUM PERFORMANCE. MULTI-TIER: HYBRID OPTIMIZED FOR MIXED WORKFLOWS. MULTI-SITE: DISTRIBUTED SYSTEMS PER DATACENTER CAPACITY. MULTI-CLOUD: FULL-CLOUD INTEGRATION AND BETWEEN CLOUDS.
  • 12. DDN A3I MULTIRAIL ENABLES PLUG-AND-PLAY HPC NETWORKING. Fast, secure, resilient networking made easy. An enhanced algorithm groups multiple network interfaces to achieve the full aggregate throughput of a node. Intelligent interface selection and traffic management deliver unprecedented node performance and dynamic load-balancing. Active link health monitoring ensures rapid failure detection and automatic recovery. [Diagram: Multi-Rail Networking compared with Discrete Networking.] DDN A3I Multirail greatly simplifies DGX deployments.
  • 13. DDN A3I - GPUDIRECT TO STORAGE DOUBLES DGX-2 THROUGHPUT! 80 GB/s per client, 20 X performance gains. Native client integration with A3I, fully transparent to users and applications. Enables a direct path to transfer data between GPU memory and data storage. Eliminates unnecessary memory copies, lowers CPU overhead, reduces latency, bypasses hardware architecture limitations. Improves AI, DL, HPC application performance. [Chart: GPU READ THROUGHPUT WITH TWO DDN AI400, axis 0 to 80 GB/s. NO GPUDirect: 39.8 GB/s. WITH GPUDirect: 80 GB/s. Legend: CPU IO, GPUDirect.]
  • 14. DDN DELIVERS LINEAR PERFORMANCE SCALING. [Chart: GPU READ THROUGHPUT SCALING WITH TEN AI400s, axis 0 to 400 GB/s, for 1 to 5 DGX-2 systems; callout: 5 DGX-2, 10 AI400, 400 GB/s.] DDN A3I GPUDirect integration delivers up to 80 GB/s of throughput per DGX-2. Enables a direct path to transfer data between GPU memory and data storage. Performance scales linearly and provides maximum at-scale application acceleration.
  • 15. More Value from All your Data In One Place ► Your data in the right place at the right time ► Multicloud Ready ► The right user and service management to aid your environment’s efficiencies ► Comprehensive Security for Containers, Cloud and On Prem. [Diagram labels: Preprocess, Supercomputing, Multi-Physics Workflows, AMR, Checkpoint, Classify, Manage, Train, Tune, Infer; User & Service Management; Data Management; Multi-Tenanted Security; Transparent Tiering; NFS, SMB, HDFS, S3.]
  • 16. REAL-TIME EXASCALE ANALYTICS. DDN A3I INTEGRATED MONITORING PLATFORM DELIVERS CURRENT AND HISTORICAL METRICS TO UNDERSTAND SYSTEM PERFORMANCE, OPTIMIZE OPERATIONS, AND ACHIEVE THE FULL POTENTIAL OF YOUR APPLICATIONS. DISCOVER INSIGHTFUL TRENDS TO OPTIMIZE YOUR SUPERPOD.
  • 17. DDN A3I - PARALLEL DATA PATHS TO CONTAINERS. [Diagram labels: Network Infrastructure, Host Operating System, Containerized Apps, DDN A3I Container-Optimized Client, DDN A3I Appliances, NGC.] Full NVMe Performance, In-Container. DDN A3I enables seamless, fastest file-level access to shared storage directly from containerized applications at runtime. Compartmentalized data access and multi-tenancy with trusted levels of segregation. The capability is inserted at runtime with a universal wrapper and does not require any modification to the application or container.
  • 18. DDN A3I: SCALE SECURELY WITH SUPERIOR VISIBILITY AND CONTROL. AUTHENTICATION: Establish user and node identity with full confidence. ACCESS CONTROL: Enforce policy and multiple levels of classification. MULTITENANCY: Share infrastructure to enable limitless at-scale flexibility. ENCRYPTION: Secure all your data end-to-end, live and at rest. AUDITING: Record and retain activity for review and compliance.
  • 19. SECURED INNOVATION WITH MULTITENANCY. Shared access to high-performance infrastructure enables the most productive and efficient collaborations at-scale. Container-based authentication allows your business to securely and seamlessly deliver the right data to the right teams. DDN: LIMITLESS FLEXIBILITY AND SCALABILITY FOR ALL WORKLOADS.
  • 20. DDN A3I MASSIVELY ACCELERATES LIFE SCIENCES RESEARCH: 100X FASTER GENOMICS, 1600X FASTER MICROSCOPY.
  • 21. 40X FASTER GENOMICS WITH DDN A3I SOLUTIONS AND NVIDIA DGX AT SCALE! • Robust national biobank improved through wider access to data and better performance through an expanding storage infrastructure • HPC computational resources: 300 compute node cluster, 3 shared memory compute nodes, and NVIDIA DGX systems • 29 PB of data and growing • 40X speedup of genomic analysis with DDN, NVIDIA and Parabricks
  • 22. DDN A3I ACCELERATED LIFE SCIENCES RESEARCH DATA SOLUTIONS. STORE, PROCESS, ANALYZE, VISUALIZE FROM A SINGLE DATA PLATFORM. CAPTURE IN REAL TIME FROM MULTIPLE INSTRUMENTS. THE FASTEST TIME TO RESULTS.
  • 23. TURNKEY SUPERPOD SOLUTIONS FROM DDN AND GLOBAL PARTNERS. NVIDIA SuperPOD approved partners delivering completely integrated solutions with DDN A3I end-to-end workflow enablement and acceleration. Jade -- 1st SuperPOD Worldwide. Oxford University and The Hartree Centre. SuperPOD solution delivered by ATOS.
  • 24. ACCELERATE YOUR SUPERPOD WITH AI400XX: ACHIEVE FULL GPU PERFORMANCE. GROW SEAMLESSLY TO EXASCALE. BOOST WORKLOADS WITH GPUDIRECT. ENABLE END-TO-END AI WORKFLOWS. RELY ON THE AT-SCALE EXPERTS.
  • 25. DDN.COM/A3I
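The throughput figures quoted on slides 13 and 14 can be cross-checked with a few lines of arithmetic. The sketch below is not part of the deck. It uses only the quoted numbers (39.8 GB/s, 80 GB/s, 400 GB/s, five DGX-2 systems, ten AI400 appliances). The helper names `speedup` and `linear_aggregate` are hypothetical, and the even split of aggregate throughput across appliances is an assumption made for illustration.

```python
# Figures quoted on slides 13 and 14, in GB/s.
CPU_IO_GBPS = 39.8       # GPU read throughput without GPUDirect (two AI400s)
GPUDIRECT_GBPS = 80.0    # GPU read throughput with GPUDirect (two AI400s)
AGGREGATE_GBPS = 400.0   # quoted for five DGX-2 systems and ten AI400s
NUM_CLIENTS = 5
NUM_APPLIANCES = 10


def speedup(baseline: float, improved: float) -> float:
    """Ratio of improved throughput to baseline throughput."""
    return improved / baseline


def linear_aggregate(per_client: float, clients: int) -> float:
    """Aggregate throughput if every client sustains the same rate."""
    return per_client * clients


# Gain from GPUDirect on a single DGX-2 client.
gpudirect_gain = speedup(CPU_IO_GBPS, GPUDIRECT_GBPS)

# Assumption: the aggregate load is spread evenly over the appliances.
per_appliance = AGGREGATE_GBPS / NUM_APPLIANCES

# Ideal linear scaling from one to five clients.
scaling_table = [
    (n, linear_aggregate(GPUDIRECT_GBPS, n)) for n in range(1, NUM_CLIENTS + 1)
]

if __name__ == "__main__":
    print(f"GPUDirect gain per client: {gpudirect_gain:.2f}x")
    print(f"Per-appliance share at full scale: {per_appliance:.1f} GB/s")
    for clients, gbps in scaling_table:
        print(f"{clients} DGX-2 -> {gbps:.0f} GB/s")
```

With the quoted figures, the GPUDirect ratio comes to about 2.0, consistent with the "doubles" headline on slide 13, and five clients at 80 GB/s each reach the 400 GB/s shown on slide 14.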
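Slide 12 describes Multirail as grouping several network interfaces, balancing load across them, and recovering automatically from link failures. The deck does not describe the algorithm itself, so the following is only a minimal sketch of the general idea, using a least-loaded selection policy chosen for illustration. It is not DDN's implementation. The class `MultiRail`, its methods, and the interface names `ib0` to `ib3` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interface:
    """One network interface in the group (hypothetical model)."""
    name: str
    healthy: bool = True
    bytes_sent: int = 0


class MultiRail:
    """Toy model: send each transfer on the least-loaded healthy interface."""

    def __init__(self, names: List[str]) -> None:
        self.interfaces = [Interface(n) for n in names]

    def mark(self, name: str, healthy: bool) -> None:
        """Record the result of a link health check for one interface."""
        for iface in self.interfaces:
            if iface.name == name:
                iface.healthy = healthy
                return
        raise KeyError(name)

    def pick(self) -> Interface:
        """Choose the healthy interface that has carried the fewest bytes."""
        candidates = [i for i in self.interfaces if i.healthy]
        if not candidates:
            raise RuntimeError("no healthy interface available")
        # min() returns the first of equally loaded interfaces.
        return min(candidates, key=lambda i: i.bytes_sent)

    def send(self, nbytes: int) -> str:
        """Account for a transfer and return the interface name used."""
        iface = self.pick()
        iface.bytes_sent += nbytes
        return iface.name


if __name__ == "__main__":
    rails = MultiRail(["ib0", "ib1", "ib2", "ib3"])
    chunk = 1 << 20  # 1 MiB per transfer
    print([rails.send(chunk) for _ in range(8)])  # load spreads over all four
    rails.mark("ib1", False)                      # simulate a failed link
    print([rails.send(chunk) for _ in range(3)])  # traffic avoids ib1
    rails.mark("ib1", True)                       # link recovers
    print(rails.send(chunk))                      # ib1 is least loaded again
```

Least-loaded selection is used here because it gives both load balancing and failover from a single rule. A production system would also need to weigh link speed, topology, and in-flight traffic, none of which this sketch models.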