THE CRAY SHASTA ARCHITECTURE:
DESIGNED FOR THE EXASCALE ERA
Steve Scott
SVP, Senior Fellow, and CTO for HPC & AI
March 3, 2020
THE EXASCALE ERA IS UPON US
All three exascale systems announced worldwide are based on Cray Shasta.
It’s not just a new machine,
IT’S A NEW ERA
MAJOR TRENDS MOTIVATING THE SHASTA ARCHITECTURE
• Hot and heterogeneous
• Simulation & modeling
• Big data & analytics
• Artificial intelligence
• Data-intensive computing
HPE-CRAY SHASTA: Platform for the Exascale Era
HPC, AI, and analytics; cloud, IoT, and data management.
• Dynamic, cloud-like environment for hybrid workflows
• Wide diversity of processors
• Flexible, efficient, and extensible hardware infrastructure
• High-performance, tiered, integrated storage
• Slingshot HPC Ethernet interconnect
HPE ACQUISITION OF CRAY
• Closed September 25, 2019
• Organizations fully integrated within one month (we are one team)
• Product roadmaps reconciled and integrated before SC’19 in November 2019
• Fully merged January 1, 2020 (Cray subsidiary dissolved; brand continues)
CRAY TECHNOLOGIES WITHIN HPE
Antonio Neri, CRN, Oct 16, 2019 (Best of Breed Conference 2019): https://www.crn.com/slide-shows/data-center/antonio-neri-outposts-is-aws-bid-to-lock-data-in-public-cloud/1
"The reason we bought Cray is they have the foundational technology in the connect fabric and the software stack to manage these data-intensive workloads. That ultimately manifests itself in some sort of HPC cluster and in the future an Exascale supercomputer. You should expect us to take those technologies which are designed for scale, speed and latency into the commercial space."
SHASTA FLEXIBLE COMPUTE INFRASTRUCTURE
"Olympus" dense, scale-optimized cabinet:
• Direct warm-water cooling
• Supports high-powered processors at high density
• Flexible, high-density interconnect

"Apollo" standard 19" rack:
• Air cooled, with liquid-cooling options
• Wide range of available compute and storage

Same interconnect, same software environment.
SHASTA OLYMPUS INFRASTRUCTURE
Architected for maximum performance, density, efficiency, and scale
• Up to 64 compute blades, and 512
GPUs + 128 CPUs per cabinet
• Flexible bladed architecture supports
multiple generations of CPUs, GPUs,
and interconnect
• 100% direct liquid cooling enables 300 kW per cabinet (later up to 400 kW)
• Up to 64 Slingshot switches per cabinet
• Scales from one to hundreds of cabinets
[Slide diagram: Olympus compute blade options, each connecting to Slingshot with local DDR memory channels: AMD EPYC CPU blades (NERSC), Nvidia GPU blades (NERSC), Intel Xe blades (ANL), and AMD GPU blades (ORNL); blades hold 2 or 4 nodes depending on the configuration.]
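As a quick sanity check, the cabinet totals quoted above imply the per-blade and per-node accelerator counts shown in the blade options. This is a sketch using only the numbers on these slides, with the 2-nodes-per-blade split taken from the GPU blade options:

```python
# Quick consistency check (not from the slides verbatim): the quoted cabinet
# totals imply the per-blade and per-node GPU/CPU counts for the GPU blades.

blades_per_cabinet = 64
gpus_per_cabinet = 512
cpus_per_cabinet = 128
nodes_per_gpu_blade = 2        # from the "2 nodes per blade" GPU options above

gpus_per_blade = gpus_per_cabinet // blades_per_cabinet   # 8
cpus_per_blade = cpus_per_cabinet // blades_per_cabinet   # 2

print(f"{gpus_per_blade} GPUs + {cpus_per_blade} CPUs per blade")
print(f"{gpus_per_blade // nodes_per_gpu_blade} GPUs + "
      f"{cpus_per_blade // nodes_per_gpu_blade} CPU per node")

# Power density at the quoted 300 kW cabinet limit:
print(f"{300_000 / blades_per_cabinet:.0f} W per blade on average")
```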
SLINGSHOT OVERVIEW
Slingshot is Cray's 8th-generation scalable interconnect. Earlier, Cray pioneered adaptive routing, high-radix switch design, and the dragonfly topology.
• 64 ports x 200 Gbps per switch: over 250K endpoints with a diameter of just three hops
• Ethernet compliant: easy connectivity to datacenters and third-party storage ("HPC inside")
• World-class adaptive routing and QoS: high utilization at scale; flawless support for hybrid workloads
• Efficient congestion control: performance isolation between workloads
• Low, uniform latency: focus on tail latency, because real apps synchronize
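The "64 ports x 200 Gbps" and "over 250K endpoints in three hops" figures fit together through the dragonfly topology. The sketch below works through one plausible port split (16 endpoint, 31 intra-group, 16 global ports per 64-port switch); the split itself is an assumption for illustration, not a published Slingshot configuration:

```python
# Back-of-the-envelope dragonfly sizing for a 64-port switch. The port split
# below is an assumed example, not a published Slingshot configuration.

radix = 64
endpoint_ports = 16      # NICs attached to each switch
local_ports = 31         # all-to-all links within a 32-switch group
global_ports = 16        # links to other groups (63 of 64 ports used here)

switches_per_group = local_ports + 1                    # 32
globals_per_group = switches_per_group * global_ports   # 512
max_groups = globals_per_group + 1                      # one link to every other group
max_endpoints = max_groups * switches_per_group * endpoint_ports

print(f"max groups:    {max_groups}")
print(f"max endpoints: {max_endpoints:,}")  # 262,656 -> "over 250K endpoints"

# Worst-case route: local hop to the exit switch, one global hop to the
# destination group, local hop to the destination switch = 3 switch hops.
```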
SLINGSHOT PACKAGING
• Standard packaging: PCIe NIC (network card) and rack-mounted top-of-rack (TOR) switch, for Apollo and 3rd-party servers
• Custom packaging: dense, liquid-cooled switch with NIC mezzanine cards and integrated cabling
SLINGSHOT IS RUNNING AT SCALE AND ACHIEVING HIGH EFFICIENCY
[Chart: per-link bandwidth (GB/s, 0-30) versus global link number, showing the global link load under all-to-all communication.]
“Shandy” in-house system
• 8 groups
• 1024 nodes
• Dual CX5 injection per node
• 25 TB/s aggregate injection BW
• 50% global bandwidth taper
• 12.5 TB/s aggregate global BW
SLINGSHOT CONGESTION MANAGEMENT
• Hardware automatically tracks all outstanding packets
• Knows what is flowing between every pair of endpoints
• Quickly identifies and controls causes of congestion
• Pushes back on sources… just enough
• Frees up buffer space for everyone else
• Other traffic not affected and can pass stalled traffic
• Avoids HOL blocking across entire fabric
• Fundamentally different from traditional ECN-based congestion control
• Fast and stable across wide variety of traffic patterns
• Suitable for dynamic HPC traffic
• Performance isolation between apps on same QoS class
• Applications much less vulnerable to other traffic on the network
• Predictable runtimes
• Lower mean and tail latency – a big benefit in apps with global synchronization
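To make the pushback idea concrete, here is a minimal toy model; it is my own illustration of per-pair pushback, not Slingshot's actual hardware algorithm. The fabric tallies traffic per source-destination pair, cuts the injection window only for pairs aimed at an oversubscribed destination, and leaves everything else alone:

```python
# Toy model of endpoint-pair congestion pushback (an illustration of the idea
# above, not Slingshot's actual mechanism). Each destination can drain a fixed
# number of packets per step; only the (source, destination) pairs aimed at an
# oversubscribed destination get their injection window cut.

from collections import defaultdict

DRAIN_PER_STEP = 8   # packets a destination NIC can sink per time step
WINDOW_MAX = 8       # per-pair limit on packets in flight

def step(offered, windows):
    """offered: {(src, dst): packets the source wants to send this step}."""
    injected = {pair: min(want, windows[pair]) for pair, want in offered.items()}

    per_dst = defaultdict(int)            # "knows what is flowing between
    for (_, dst), n in injected.items():  #  every pair of endpoints"
        per_dst[dst] += n

    delivered = {}
    for (src, dst), n in injected.items():
        if per_dst[dst] <= DRAIN_PER_STEP:
            delivered[(src, dst)] = n                                  # uncongested: untouched
            windows[(src, dst)] = min(WINDOW_MAX, windows[(src, dst)] + 1)
        else:
            delivered[(src, dst)] = n * DRAIN_PER_STEP // per_dst[dst]  # fair share of the drain
            windows[(src, dst)] = max(1, windows[(src, dst)] // 2)      # push back on these sources
    return delivered, windows

# Three well-behaved flows plus a 4-into-1 incast onto destination "D".
offered = {("A", "B"): 3, ("B", "C"): 3, ("C", "A"): 3}
offered.update({(f"S{i}", "D"): 8 for i in range(4)})
windows = defaultdict(lambda: WINDOW_MAX)

for t in range(3):
    delivered, windows = step(offered, windows)
    incast = sum(v for (_, d), v in delivered.items() if d == "D")
    other = sum(v for (_, d), v in delivered.items() if d != "D")
    print(f"step {t}: incast delivered {incast}, other traffic {other}, "
          f"incast window {windows[('S0', 'D')]}")
```

In this toy run the incast traffic converges to the destination's drain rate while the other flows are never throttled; ECN-style schemes instead mark packets at congested switch queues and wait for the transport layer to react.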
CONGESTION MANAGEMENT PROVIDES PERFORMANCE ISOLATION
[Charts: average egress bandwidth per endpoint (Gb/s, 0-250) over roughly 2 ms of simulated time (µs) for three concurrent workloads: all-to-all, global sync, and many-to-one.]
Job interference in today's networks: congesting (green, many-to-one) traffic hurts well-behaved (blue, all-to-all) traffic, and really hurts latency-sensitive, synchronized (red, global-sync) traffic.
With Slingshot's advanced congestion management, the well-behaved traffic runs at 100% of peak, unaffected by the congesting workload.
NEW BENCHMARK: GPCNET (Global Performance and Congestion Network Tests)
• Developed in collaboration with NERSC and ANL
• Publicly available at https://github.com/netbench/GPCNET
• Goals:
• Proxy real-world communication patterns
• Measure network performance under load
• Look at both mean and tail latency
• Look at interference between workloads (how well does the network perform congestion management?)
• Highly configurable to explore workloads of interest
• Benchmark outputs a rich set of metrics, including absolute and relative performance
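As a rough idea of how the benchmark is used, here is a sketch of driving it from Python. The binary names network_test and network_load_test are the ones built from the repository above; the launcher, rank count, and paths are assumptions that vary by site:

```python
# Sketch of driving GPCNET from a job script. Binary names are those built by
# the GPCNET repository; mpirun flags, rank counts, and paths are assumptions.

import subprocess

def run_gpcnet(binary, ranks, log_path):
    """Run one GPCNET executable under MPI and capture its report."""
    with open(log_path, "w") as log:
        subprocess.run(["mpirun", "-n", str(ranks), f"./{binary}"],
                       stdout=log, stderr=subprocess.STDOUT, check=True)

# Isolated performance first, then the congestion/interference test.
run_gpcnet("network_test", ranks=512, log_path="network_test.log")
run_gpcnet("network_load_test", ranks=512, log_path="network_load_test.log")
```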
CONGESTION IMPACT IN REAL SYSTEMS
CI = Latency_congested / Latency_baseline
[Chart: random ring latency congestion impact by system, average and 99% tail, on a log scale from 0.1 to 10,000; CI = 1 marks the line of no congestion impact.]
• Crystal: Aries, 696 nodes, 20 processes per network port, 100% global network BW
• Theta: Aries, 4,096 nodes, 16 processes per network port, 50% global network BW
• Edison: Aries, 5,575 nodes, 24 processes per network port, 50% global network BW
• Osprey: EDR InfiniBand, 128 nodes, 20 processes per network port, 100% global network BW
• Sierra: EDR InfiniBand, 4,200 nodes, 20 processes per network port, 50% global network BW
• Summit: EDR InfiniBand, 4,500 nodes, 21 processes per network port, 100% global network BW
• Malbec: Slingshot (SS10), 485 nodes, 20 processes per network port, 50% global network BW
Observations:
• Impact worsens with scale and taper
• InfiniBand does somewhat better than Aries
• Slingshot does really well
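A minimal way to compute the CI metric defined above from raw measurements; the latency samples below are made-up numbers, while in practice they come from runs without and with congestor traffic (e.g., the GPCNET tests):

```python
# Compute CI = latency_congested / latency_baseline for the mean and a tail
# quantile, mirroring the "Average" and "99% Tail" bars in the chart above.

import statistics

def congestion_impact(baseline_us, congested_us, tail=0.99):
    def quantile(samples, q):
        s = sorted(samples)
        return s[min(len(s) - 1, int(q * len(s)))]
    return {
        "mean": statistics.mean(congested_us) / statistics.mean(baseline_us),
        f"{int(tail * 100)}% tail": quantile(congested_us, tail) / quantile(baseline_us, tail),
    }

# Hypothetical microsecond latencies; real values come from benchmark runs.
baseline = [2.1, 2.3, 2.2, 2.4, 2.2, 2.3, 2.5, 2.2]
congested = [2.6, 3.1, 2.8, 40.0, 2.9, 3.0, 55.0, 2.7]
print(congestion_impact(baseline, congested))
```

A CI of 1 corresponds to the "line of no congestion impact" in the chart above.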
SHASTA PULLS STORAGE ONTO SLINGSHOT NETWORK
Traditional model: compute nodes on the high-speed network reach tiered flash and HDD storage servers (OSS) through LNET router nodes and a separate storage area network.
Shasta model: compute nodes and ClusterStor E1000 servers attach directly to the Slingshot high-speed network, with flash OSS & MDS at ~80 GB/s and HDD OSS at ~30 GB/s, and no LNET routers or separate SAN in the path.
Benefits:
• Lower cost
• Lower complexity
• Lower latency
• Improved small I/O performance
CLUSTERSTOR E1000 FLEXIBILITY
• Extreme Performance (flash): 80/60 GB/s SSD read/write, 55.3 TB usable SSD capacity (3.2 TB drives), 6 x 200 Gbps network ports, 2 RU; roughly 15x faster than 2 x L300N (10 RU)
• Hybrid Flexibility: 80/60 GB/s SSD read/write, 55.3 TB usable SSD capacity, plus 15 GB/s HDD and 1.07 PB usable HDD capacity (14 TB drives), 4 x 200 Gbps, 6 RU; roughly 15x faster on flash and 0.7x on HDD versus 2 x L300N
• HDD Performance: 30 GB/s HDD, 2.14 PB usable HDD capacity, 2 x 200 Gbps, 10 RU; roughly 50% faster than 2 x L300N
• HDD Capacity: 30 GB/s HDD, 4.27 PB usable HDD capacity, 2 x 200 Gbps, 18 RU; roughly 50% faster than 2 x L300N
Per rack: up to 1,600 GB/s (read) and up to 4.2 PB usable capacity for flash; up to 120 GB/s and up to 10 PB usable capacity for HDD.
SHASTA: A MORE OPEN, CUSTOMIZABLE STACK
Cray XC stack: sleek, scalable, but monolithic.
Shasta stack layers: Shasta hardware (storage, compute, networks), hardware support services, infrastructure support services, platform support services, consumer.
Cray Shasta System Management Stack:
• Open, documented, RESTful APIs
• Ability to substitute different components
• Buildable source
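As a purely hypothetical illustration of what an open, RESTful management API lets an administrator script, here is a sketch; the host name, endpoint path, and token handling below are placeholders, not the documented Shasta API:

```python
# Hypothetical example only: host, endpoint path, and auth handling are
# placeholders, not the actual Shasta management API.

import os
import requests

BASE = "https://shasta-mgmt.example.com/apis"      # placeholder API gateway
TOKEN = os.environ["SHASTA_API_TOKEN"]             # token assumed to come from the site's auth service

resp = requests.get(f"{BASE}/inventory/v1/nodes",  # hypothetical inventory endpoint
                    headers={"Authorization": f"Bearer {TOKEN}"},
                    timeout=30)
resp.raise_for_status()
for node in resp.json():
    print(node)
```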
SHASTA SOFTWARE PLATFORM ARCHITECTURE
Two containerized software stacks sit side by side behind open APIs, expanding the power of supercomputing with the flexibility of cloud and full datacenter interoperability:
• Cray Programming Environment on the Cray Linux Environment: Linux + HPC extensions, HPC batch job management and orchestration (Kubernetes), network and I/O abstractions, parallel performance libraries, developer environment, and runtimes
• Cray Urika AI/Analytics Suite on a standard Linux environment: Linux, orchestration (Kubernetes), network and I/O abstractions, parallel performance libraries, analytics libraries & frameworks, analytics microservices, and the Urika manager
Shasta Management Services and the Shasta Monitoring Framework provide administrator system services alongside developer services, all delivered as containerized services over open APIs.
OPTIONS TO MEET A FULL RANGE OF AS-A-SERVICE NEEDS
Four offering models: GreenLake Flexible Capacity (consumption-based), HPC Platform as a Service (cloud architecture), Managed HPC as a Service (managed service, off-premises), and HPC as a Service in the Public Cloud (public cloud ecosystem).
• HPE GreenLake Flexible Capacity; ready for BlueData, Red Hat OCP, and Singularity; HPCM (and APIs), VMware, and Cray's software environment
• Managed HPC as a Service: as-a-service offerings (Advania, ScaleMatrix, Markley), data center offerings (Equinix, CyrusOne), SI partners (Accenture, DXC)
• HPC as a Service in the Public Cloud: ClusterStor in Azure, Cray in Azure for Manufacturing, Cray in Azure for EDA
Strategic partner to manage an end-to-end, hybrid HPC and AI portfolio across deployment and consumption models.
WE ARE ENTERING THE EXASCALE ERA
• HPC, Enterprise, and hyperscale are converging
• Data-centric, hybrid workflows: AI + analytics + HPC
• Growing complexity, but tremendous opportunity to extract value
• Need new infrastructure for these new workloads
• Shasta provides the infrastructure for the Exascale Era
• Chosen for all three announced exascale systems
• Flexibility (processors, storage, network, software)
• Extensibility (hardware and software)
• Scalability (Up to exascale, down to a single 19” rack)
• Standards-based (interoperable and open)
• Cloud-like software stack for dynamic, heterogeneous workloads
THANK YOU