© 2019 Arm Limited
Brent Gorda
September 25th, 2019
Arm in HPC
Ongoing research in Bristol: New Drugs ‘In Silico’
• Parkinson’s & Osteoporosis
Images courtesy of Bristol University
Multiphysics Simulations: Fluid Dynamics, Heat Diffusion, Electromagnetics
Images courtesy of Bristol University
What is “Super” or “High Performance” Computing?
Lake Tahoe: ~40 trillion gallons of water (4.0x10^13)
~2002 Supercomputers hit 40 Teraflops (Earth Simulator – Japan/NEC)
What is “Super” or “High Performance” Computing?
The Great Lakes hold ~6.5 Quadrillion gallons of water (6.5x10^15)
2008: Supercomputers hit 1 Petaflop, 1.0x10^15 (US, IBM Roadrunner)
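The water analogy on these two slides pairs one gallon with one floating-point operation per second. A minimal Python sketch of that comparison, using only the round figures quoted above (the constant and function names are illustrative, not from the talk):

```python
# Round figures quoted on the slides; the names are illustrative.
LAKE_TAHOE_GALLONS = 40e12       # ~40 trillion gallons
EARTH_SIMULATOR_FLOPS = 40e12    # ~40 teraflops (circa 2002)
GREAT_LAKES_GALLONS = 6.5e15     # ~6.5 quadrillion gallons
ROADRUNNER_FLOPS = 1.0e15        # 1 petaflop (2008)


def gallons_per_flop(gallons: float, flops: float) -> float:
    """Gallons of water per floating-point operation per second."""
    return gallons / flops


tahoe_ratio = gallons_per_flop(LAKE_TAHOE_GALLONS, EARTH_SIMULATOR_FLOPS)
lakes_ratio = gallons_per_flop(GREAT_LAKES_GALLONS, ROADRUNNER_FLOPS)
speedup_2002_to_2008 = ROADRUNNER_FLOPS / EARTH_SIMULATOR_FLOPS

print(f"Lake Tahoe vs. Earth Simulator: {tahoe_ratio:.1f} gallons per flop/s")
print(f"Great Lakes vs. Roadrunner: {lakes_ratio:.1f} gallons per flop/s")
print(f"1 petaflop vs. 40 teraflops: {speedup_2002_to_2008:.0f}x")
```

Written out this way, 40 trillion is 4.0x10^13, which is the same order of magnitude as 40 teraflops, so the Lake Tahoe comparison is roughly one gallon per flop/s.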
Top500 systems over the past 25 years
Chart: Top500 performance on a log scale, labeled from 100 Mflop/s to 1 Eflop/s, for 1994 to 2018.
Series and labeled endpoints: SUM (1.17 TFlop/s to 1.56 EFlop/s), N=1 (59.7 GFlop/s to 149 PFlop/s), N=500 (422 MFlop/s to 1.01 PFlop/s).
Annotated systems: Earth Simulator (NEC, ‘02), Jaguar (AMD, ‘09), Astra (HPE/Arm).
Images courtesy of www.top500.org
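The endpoint labels on the Top500 chart are enough to estimate how fast each curve grew. The sketch below is a rough calculation, not something from the talk: pairing each value with a series (SUM, N=1, N=500) is inferred from the magnitudes, and the 26-year span is an assumption, since the slide says "the past 25 years" and the axis ticks run from 1994 to 2018.

```python
import math

# ASSUMPTION: first and last labeled points are 26 years apart.
YEARS = 26

# (start, end) in flop/s, read from the chart labels.
# ASSUMPTION: values are matched to series by order of magnitude.
SERIES = {
    "SUM": (1.17e12, 1.56e18),
    "N=1": (59.7e9, 149e15),
    "N=500": (422e6, 1.01e15),
}


def annual_growth(start: float, end: float, years: float) -> float:
    """Compound annual growth factor between two endpoint values."""
    return (end / start) ** (1.0 / years)


def doubling_time_years(start: float, end: float, years: float) -> float:
    """Years needed to double at the compound rate implied by the endpoints."""
    return years * math.log(2) / math.log(end / start)


for name, (start, end) in SERIES.items():
    growth = annual_growth(start, end, YEARS)
    doubling = doubling_time_years(start, end, YEARS)
    print(f"{name}: x{growth:.2f} per year, doubling every {doubling:.2f} years")
```

Changing `YEARS` to 25 shifts the result only slightly, so the conclusion does not hinge on the exact span chosen.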
These are not embedded devices:
Images courtesy of Bristol University
Mont-Blanc
The “legacy” Mont-Blanc vision
Arm HPC User Group, Denver, Nov 13th 2017
Vision: to leverage the fast-growing market of mobile technology for scientific computation, HPC and data centers.
2012 2013 2014 2015 2016 2017 2018
Mont-Blanc 2
Mont-Blanc 3
Early Research into the Efficacy of Arm for HPC
Catalyst UK: Accelerating ARM Adoption in UK
Program Goals
– Deployment: Deployment of HPC clusters at multiple UK sites, supported for a 3-year period, providing access to academia & industry
– Adoption: Early adoption of ARM for HPC in UK; Apollo 70 Early Ship followed by customer collaboration
– Applications: Customer-driven porting and optimization
– Collaboration: Leveraging the successful “Project Comanche” model of customer-centric collaboration, but based instead on the Early Ship HPE Apollo 70 product
– Exascale: Establish foundation for Exascale collaboration
Measures of Success
Intended outcomes include:
– Critical HPC apps ported and demonstrated
– ISV engagements and demonstrations
– Demonstrated performance improvements
– Publications and follow-on collaborations
– Bugs filed, fixed & up-streamed to open source
Industry Partners
– HPE: Apollo 70, HPE Performance Software - Cluster Manager, HPE Performance Software - Message Passing Interface
– ARM: Allinea Studio (Compiler, Libraries, Forge-DDT & MAP), OpenHPC
– Mellanox: OFED, HPC-X, OpenMPI, OpenSHMEM, MXM, SHArP
– SuSE: SLES, OpenStack, HPC Module
– Cavium: ThunderX2 SoC, technical support
– Qualcomm: Centriq SoC, technical support (tentative)
UK Collaborations
– EPCC: WRF, OpenFOAM, Rolls-Royce Hydra optimization, 2 PhD candidates
– Leicester: Data-intensive apps, genomics, MOAB Torque, DiRAC collaboration
– Bristol: VASP, CASTEP, Gromacs, CP2K, Unified Model, Hydra, NAMD, Oasis, NEMO, OpenIFS, CASINO, LAMMPS
– UK Government: Dept. for Business, Energy & Industrial Strategy (BEIS)
Configs & Timeline
Typical for each site:
– 64 Apollo 70
  – Compute Nodes:
    – Cavium 32c, 2.2 GHz
    – 256GB memory (16GB DIMMs)
    – IB EDR CX5 Clos
    – 4096+ cores
– 6 CL4300 (tentative)
  – Services/Storage:
    – Qualcomm Centriq
Sep-Dec: Structure partnership, alignment
Jan: HPE/ARM SOW
Feb: Customer SoWs, quotations, POs
Mar: SW stack validation (3rd Party Runtime library)
Apr: Systems build, public announcements
May: Delivery and acceptance
HPE will deliver >12,000 cores across 3 sites; amongst the largest ARM HPC deployments in the world
Catalyst UK
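The per-site figures on the Catalyst UK slide can be cross-checked with a few lines of arithmetic. This is a hedged sketch: the `SiteConfig` class is hypothetical, and the dual-socket node layout is an assumption (the slide does not state a socket count), chosen because it reproduces the quoted "4096+ cores".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteConfig:
    """Per-site figures from the 'Typical for each site' list."""
    nodes: int = 64
    cores_per_socket: int = 32
    memory_gb_per_node: int = 256
    dimm_gb: int = 16
    # ASSUMPTION: dual-socket nodes; not stated on the slide.
    sockets_per_node: int = 2

    @property
    def cores(self) -> int:
        return self.nodes * self.sockets_per_node * self.cores_per_socket

    @property
    def dimms_per_node(self) -> int:
        return self.memory_gb_per_node // self.dimm_gb


SITES = 3  # "across 3 sites"
site = SiteConfig()
total_cores = SITES * site.cores

print(f"Cores per site: {site.cores}")
print(f"DIMMs per node: {site.dimms_per_node}")
print(f"Cores across {SITES} sites: {total_cores}")
```

Under that assumption the three sites together exceed the 12,000 cores quoted on the slide.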
Isambard: The World’s First Arm-based Production Supercomputer
Vanguard Astra by HPE: #156 on the Top500
• 2,592 HPE Apollo 70 compute nodes
• 5,184 CPUs, 145,152 cores, 2.3 PFLOPs (peak)
• Marvell ThunderX2 ARM SoC, 28 core, 2.0 GHz
• Memory per node: 128 GB (16 x 8 GB DR DIMMs)
• Aggregate capacity: 332 TB, 885 TB/s (peak)
• Mellanox IB EDR, ConnectX-5
• 112 36-port edge switches, 3 648-port spine switches
• Red Hat RHEL for Arm
• HPE Apollo 4520 All-flash Lustre storage
• Storage Capacity: 403 TB (usable)
• Storage Bandwidth: 244 GB/s
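The aggregate Astra figures follow from the per-node numbers. The sketch below recomputes them; node, CPU, core, clock and memory values come from the slide, while the per-core and per-channel figures are assumptions not stated there (8 double-precision flops per cycle per core, and 8 DDR4-2666 channels per socket), picked because they reproduce the quoted peaks.

```python
# Figures from the slide.
NODES = 2592
SOCKETS_PER_NODE = 2          # 5,184 CPUs / 2,592 nodes
CORES_PER_SOCKET = 28
CLOCK_HZ = 2.0e9
MEM_GB_PER_NODE = 128

# ASSUMPTIONS, not on the slide.
FLOPS_PER_CYCLE_PER_CORE = 8          # double precision, with FMA
MEM_CHANNELS_PER_SOCKET = 8
BYTES_PER_SEC_PER_CHANNEL = 2666e6 * 8  # DDR4-2666, 8-byte-wide channel

cpus = NODES * SOCKETS_PER_NODE
cores = cpus * CORES_PER_SOCKET
peak_flops = cores * CLOCK_HZ * FLOPS_PER_CYCLE_PER_CORE
memory_tb = NODES * MEM_GB_PER_NODE / 1000
peak_mem_bw = cpus * MEM_CHANNELS_PER_SOCKET * BYTES_PER_SEC_PER_CHANNEL

print(f"CPUs: {cpus:,}")
print(f"Cores: {cores:,}")
print(f"Peak: {peak_flops / 1e15:.1f} PFLOPs")
print(f"Aggregate memory: {memory_tb:.0f} TB")
print(f"Peak memory bandwidth: {peak_mem_bw / 1e12:.0f} TB/s")
```

With different assumed values for flops per cycle or memory channels the derived peaks would no longer match the slide, which is a quick way to check the assumptions.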
Exascale – the race underway at the high end
Projected Exascale System Dates
U.S.
▪ Sustained ES*: 2022-2023
▪ Peak ES: 2021
▪ ES Vendors: U.S.
▪ Processors: U.S. (some ARM?)
▪ Cost: $500M-$600M per system (for early systems), plus heavy R&D investments
China
▪ Sustained ES*: 2021-2022
▪ Peak ES: 2020
▪ Vendors: Chinese (multiple sites)
▪ Processors: Chinese (plus U.S.?)
▪ 13th 5-Year Plan
▪ Cost: $350-$500M per system, plus heavy R&D
EU
▪ Peak ES: 2023-2024
▪ Pre-ES: 2020-2022 (~$125M)
▪ Vendors: US and then European
▪ Processors: x86, ARM & RISC-V
▪ Initiatives: EuroHPC, EPI, ETP4HPC, JU
▪ Cost: Over $300M per system, plus heavy R&D investments
Japan
▪ Sustained ES*: ~2021/2022
▪ Peak ES: Likely as an AI/ML/DL system
▪ Vendors: Japanese
▪ Processors: Japanese ARM
▪ Cost: ~$1B, this includes both 1 system and the R&D costs
▪ They will also do many smaller size systems
* 1 exaflops on a 64-bit real application
© Hyperion Research
Exascale - Fujitsu A64FX
Exascale – European Processor Initiative
GPP and Common Architecture
Diagram labels: ZEUS, MPPA, eFPGA, FPGA, EPAC, HBM memories, DDR memories, PCIe gen5 links, HSL links, D2D links to adjacent chiplets
EPAC - EPI Accelerator (TITAN)
MPPA - Multi-Purpose Processing Array
eFPGA - embedded FPGA
Cryptographic ASIC (EU Sovereignty)
Arm HPC Software Ecosystem
Cluster Management Tools: Bright, HPE CMU, xCat, Warewulf
Linux OS Distro of choice: RHEL, SUSE, CENTOS,…
Arm Server Ready Platform: Standard OS compatible FW and RAS features
HPC Applications: Open-source, Owned, and Commercial ISV codes
Job schedulers and Resource Management: SLURM, IBM LSF, Altair PBS Pro, etc.
Programming Languages: Fortran, C, C++ via GNU, LLVM, Arm & OEMs
Debug and performance analysis tools: Arm Forge, Rogue Wave, TAU, etc.
Filesystems: BeeGFS, LUSTRE, ZFS, HDFS, GPFS
App/ISA specific optimizations, optimized libs and intrinsics: Arm PL, BLAS, FFTW, etc.
Parallelism standards: OpenMP (omp / gomp), MPI, SHMEM (see below)
Communication Stacks and run-times: Mellanox IB/OFED/HPC-X, OpenMPI, MPICH, MVAPICH2, OpenSHMEM, OpenUCX, HPE MPI
User-space utilities, scripting, containers, and other packages: Singularity, Openstack, OpenHPC, Python, NumPy, SciPy, etc.
Porting HPC apps to the Arm platforms
• The platform just works: porting in 2 days is the common experience
Build recipes online at https://gitlab.com/arm-hpc/packages/wikis/home
Applications: LAMMPS, CESM2, MrBayes, Bowtie, AMBER, Paraview, SIESTA, UM, NAMD, VASP, MILC, WRF, GEANT4, Quantum ESPRESSO, DL-Poly, NEMO, GAMESS, OpenFOAM, VisIT, QMCPACK, Abinit, BLAST, NWCHEM, BWA, GROMACS
Application areas: Chem/Phys, Weather, CFD, Visualization, Genomics
Arm in IOT
We design & license IP; we do not manufacture chips
Partners build products for their target markets
One size does not fit all
HPC is a great fit for co-design and collaboration
Partnership is key
Choice is good
21 billion chips in the past year: Mobile/Embedded/IoT/Automotive/GPUs
And now … servers
Arm Technology Connects the World
The New Architecture
Diagram labels: Edge, 5G, CORTEX, HPC, Cloud, Data Centers, Critical Data, Massive Amounts of Data
• Historically strong focus on high-end systems and balance:
  • B:F ratios of the late 1990s through 2010
  • Parallel processing at massive scale
  • Low-latency / high BW interconnects
  • Citing: S/W maintenance, roll-out, cooling/power
• Workloads:
  • Historical workloads: scientific simulation
  • Recent new workloads attracted to the “high-end” capabilities of HPC architectures: big data, Deep Learning/AI
• HPC leads in technology acceptance (think Formula-1)
HPC is an excellent partner for the ecosystem
HPC is an Architecture
Arm is data-driven, from the edge to the core
The Cloud to Edge Infrastructure Foundation for a World of 1T Intelligent Devices
Thank You
Arm.com/hpc

An Update on Arm HPC

  • 1. © 2019 Arm Limited Brent Gorda September 25th, 2019 Arm in HPC
  • 2. 2 © 2019 Arm Limited © 2019 Arm Limited • Parkinson’s & Osteoporosis Ongoing research in Bristol: New Drugs ‘In Silico’ Images courtesy of Bristol University
  • 3. 3 © 2019 Arm Limited © 2019 Arm Limited Multiphysics Simulations: Fluid Dynamics, Heat Diffusion, Electromagnetics Images courtesy of Bristol University
  • 4. 4 © 2019 Arm Limited © 2019 Arm Limited What is “Super” or “High Performance” Computing? Lake Tahoe ~40 Trillion Gallons of water (4.0x10^12) ~2002 Supercomputers hit 40 Teraflops (Earth Simulator – Japan/NEC)
  • 5. 5 © 2019 Arm Limited © 2019 Arm Limited What is “Super” or “High Performance” Computing? The Great Lakes hold ~6.5 Quadrillion gallons of water (6.5x10^15) 2008 Supercomputers hit 1 Petaflop 1.0x10^15 (US IBM Roadrunner)
  • 6. 6 © 2019 Arm Limited Top500 systems over the past 25 years 1.00E-01 1.00E+00 1.00E+01 1.00E+02 1.00E+03 1.00E+04 1.00E+05 1.00E+06 1.00E+07 1.00E+08 1.00E+09 1.00E+10 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 2016 2018 59.7 GFlop/s 422 MFlop/s 1.17 TFlop/s 149 PFlop/s 1.01 PFlop/s 1.56 EFlop/s 1 Gflop/s 1 Tflop/s 100 Mflop/s 100 Gflop/s 100 Tflop/s 10 Gflop/s 10 Tflop/s 1 Pflop/s 100 Pflop/s 10 Pflop/s 1 Eflop/s SUM N=1 N=500 Astra HPE/ArmEarth Simulator NEC ‘02 Jaguar AMD ‘09 Images courtesy of www.top500.org
  • 7. These are not embedded devices. Images courtesy of Bristol University.
  • 8. Mont-Blanc: Early Research into the Efficacy of Arm for HPC. The “legacy” Mont-Blanc vision (Arm HPC User Group, Denver, Nov 13th 2017): to leverage the fast growing market of mobile technology for scientific computation, HPC and data centers. Timeline (2012 to 2018): Mont-Blanc 2, Mont-Blanc 3.
  • 9. Catalyst UK: Accelerating ARM Adoption in UK
      Program Goals
      – Deployment: Deployment of HPC clusters at multiple UK sites, supported for 3-year period providing access to academia & industry
      – Adoption: Early adoption of ARM for HPC in UK; Apollo 70 Early Ship followed by customer collab.
      – Applications: Customer-driven porting and opt
      – Collaboration: Leveraging the successful “Project Comanche” model of customer-centric collaboration, but based instead on Early Ship HPE Apollo 70 product
      – Exascale: Establish foundation for Exascale collab
      Industry Partners
      – HPE: Apollo 70, HPE Performance Software - Cluster Manager, HPE Performance Software - Message Passing Interface
      – ARM: Allinea Studio (Compiler, Libraries, Forge-DDT & MAP), OpenHPC
      – Mellanox: OFED, HPC-X, OpenMPI, OpenSHMEM, MXM, SHArP
      – SuSE: SLES, OpenStack, HPC Module
      – Cavium: ThunderX2 SoC, technical support
      – Qualcomm: Centriq SoC, technical support (tentative)
      UK Collaborations
      – EPCC: WRF, OpenFOAM, Rolls Royce Hydra opt, 2 PhD candidates
      – Leicester: Data-intensive apps, genomics, MOAB Torque, DiRAC collab
      – Bristol: VASP, CASTEP, Gromacs, CP2K, Unified Model, Hydra, NAMD, Oasis, NEMO, OpenIFS, CASINO, LAMMPS
      – UK Government: Dept. for Bus., Energy & Industrial Strategy (BEIS)
      Measures of Success (intended outcomes include)
      – Critical HPC apps ported and demonstrated
      – ISV engagements and demonstrations
      – Demonstrated performance improvements
      – Publications and follow-on collaborations
      – Bugs filed, fixed & up-streamed to open source
      Configs (typical for each site)
      – 64 Apollo 70 (compute nodes): Cavium 32c, 2.2 GHz; 256GB memory (16GB DIMMs); IB EDR CX5 Clos; 4096+ cores
      – 6 CL4300 (tentative) (services/storage): Qualcomm Centriq
      Timeline
      – Sep-Dec: Structure partnership, alignment
      – Jan: HPE/ARM SOW
      – Feb: Customer SoWs, quotations, POs
      – Mar: SW stack validation (3rd Party Runtime library)
      – Apr: Systems build, public announcements
      – May: Delivery and acceptance
      HPE will deliver >12,000 cores across 3 sites; amongst the largest ARM HPC deployments in the world.
  • 10. Isambard: The World’s First Arm-based Production Supercomputer
  • 11. Vanguard Astra by HPE: #156 on top500
      – 2,592 HPE Apollo 70 compute nodes
      – 5,184 CPUs, 145,152 cores, 2.3 PFLOPs (peak)
      – Marvell ThunderX2 ARM SoC, 28 core, 2.0 GHz
      – Memory per node: 128 GB (16 x 8 GB DR DIMMs)
      – Aggregate capacity: 332 TB, 885 TB/s (peak)
      – Mellanox IB EDR, ConnectX-5
      – 112 36-port edges, 3 648-port spine switches
      – Red Hat RHEL for Arm
      – HPE Apollo 4520 All-flash Lustre storage
      – Storage Capacity: 403 TB (usable)
      – Storage Bandwidth: 244 GB/s
  • 12. Exascale – the race underway at the high end. Projected Exascale System Dates (© Hyperion Research)
      U.S.
      – Sustained ES*: 2022-2023
      – Peak ES: 2021
      – ES Vendors: U.S.
      – Processors: U.S. (some ARM?)
      – Cost: $500M-$600M per system (for early systems), plus heavy R&D investments
      China
      – Sustained ES*: 2021-2022
      – Peak ES: 2020
      – Vendors: Chinese (multiple sites)
      – Processors: Chinese (plus U.S.?)
      – 13th 5-Year Plan
      – Cost: $350-$500M per system, plus heavy R&D
      EU
      – Peak ES: 2023-2024
      – Pre-ES: 2020-2022 (~$125M)
      – Vendors: US and then European
      – Processors: x86, ARM & RISC-V
      – Initiatives: EuroHPC, EPI, ETP4HPC, JU
      – Cost: Over $300M per system, plus heavy R&D investments
      Japan
      – Sustained ES*: ~2021/2022
      – Peak ES: Likely as an AI/ML/DL system
      – Vendors: Japanese
      – Processors: Japanese ARM
      – Cost: ~$1B, this includes both 1 system and the R&D costs
      – They will also do many smaller size systems
      * 1 exaflops on a 64-bit real application
  • 13. Exascale - Fujitsu A64FX
  • 14. Exascale – European Processor Initiative: GPP and Common Architecture [Diagram blocks: ZEUS, MPPA, eFPGA, FPGA, EPAC, HBM memories, DDR memories, PCIe gen5 links, HSL links, D2D links to adjacent chiplets, Cryptographic ASIC (EU Sovereignty). Legend: EPAC = EPI Accelerator (TITAN); MPPA = Multi-Purpose Processing Array; eFPGA = embedded FPGA.]
  • 15. Arm HPC Software Ecosystem
      – Cluster Management Tools: Bright, HPE CMU, xCAT, Warewulf
      – Linux OS Distro of choice: RHEL, SUSE, CentOS, …
      – Arm Server Ready Platform: Standard OS compatible FW and RAS features
      – HPC Applications: Open-source, Owned, and Commercial ISV codes
      – Job schedulers and Resource Management: SLURM, IBM LSF, Altair PBS Pro, etc.
      – Programming Languages: Fortran, C, C++ via GNU, LLVM, Arm & OEMs
      – Debug and performance analysis tools: Arm Forge, Rogue Wave, TAU, etc.
      – Filesystems: BeeGFS, Lustre, ZFS, HDFS, GPFS
      – App/ISA specific optimizations, optimized libs and intrinsics: Arm PL, BLAS, FFTW, etc.
      – Communication Stacks and run-times: Mellanox IB/OFED/HPC-X, OpenMPI, MPICH, MVAPICH2, OpenSHMEM, OpenUCX, HPE MPI
      – Parallelism standards: OpenMP (omp / gomp), MPI, SHMEM (see below)
      – User-space utilities, scripting, containers, and other packages: Singularity, OpenStack, OpenHPC, Python, NumPy, SciPy, etc.
  • 16. Porting HPC apps to the Arm platforms
      – The platform just works: porting in 2 days is the common experience
      – Build recipes online at https://gitlab.com/arm-hpc/packages/wikis/home
      – Applications shown: LAMMPS, CESM2, MrBayes, Bowtie, AMBER, Paraview, SIESTA, UM, NAMD, VASP, MILC, WRF, GEANT4, Quantum ESPRESSO, DL-Poly, NEMO, GAMESS, OpenFOAM, VisIT, QMCPACK, Abinit, BLAST, NWCHEM, BWA, GROMACS
      – Categories: Chem/Phys, Weather, CFD, Visualization, Genomics
  • 17. Arm Technology Connects the World (Arm in IoT). We design & license IP; we do not manufacture chips. Partners build products for their target markets. One size does not fit all. HPC is a great fit for co-design and collaboration. Partnership is key. Choice is good. 21 billion chips in the past year: Mobile/Embedded/IoT/Automotive/GPUs. And now … servers.
  • 18. The New Architecture [Diagram labels: Edge, 5G, CORTEX, HPC, Cloud Data Centers; Critical Data, Massive Amounts of Data.]
  • 19. HPC is an Architecture
      – Historically strong focus on high-end systems and balance:
          – B:F ratios of the late 1990s through 2010
          – Parallel processing at massive scale
          – Low-latency / high BW interconnects
          – Citing: S/W maintenance, roll-out, cooling/power
      – Workloads:
          – Historical workloads: scientific simulation
          – Recent new workloads attracted to “high-end” capabilities of HPC architectures: big data, Deep Learning/AI
          – HPC leads in technology acceptance (think Formula-1)
      HPC is an excellent partner for the ecosystem.
  • 20. Arm is Data driven, from the edge to the core
  • 21. The Cloud to Edge Infrastructure Foundation for a World of 1T Intelligent Devices. Thank You. Arm.com/hpc
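
The headline figures quoted for Astra (slide 11) and for the Catalyst UK sites (slide 9) can be cross-checked with simple arithmetic. The sketch below is an illustration and not part of the talk. It assumes two sockets per Apollo 70 node (implied by 5,184 CPUs across 2,592 nodes, and by 4096 cores across 64 nodes of 32-core parts), 8 double-precision FLOPs per cycle per ThunderX2 core, and 8 channels of DDR4-2666 per socket. The per-core and per-channel values are assumptions that do not appear on the slides, and all function names are hypothetical.

```python
"""Back-of-envelope check of the Astra and Catalyst UK figures from the slides."""

# Assumptions (NOT stated on the slides).
SOCKETS_PER_NODE = 2           # implied by 5,184 CPUs / 2,592 nodes
DP_FLOPS_PER_CYCLE = 8         # assumed per-core double-precision rate
DDR_CHANNELS_PER_SOCKET = 8    # assumed memory channels per socket
DDR_TRANSFERS_PER_S = 2_666e6  # assumed DDR4-2666
BYTES_PER_TRANSFER = 8         # 64-bit wide DDR channel


def total_cores(nodes, sockets_per_node, cores_per_cpu):
    """Number of cores in a cluster of identical nodes."""
    return nodes * sockets_per_node * cores_per_cpu


def peak_flops(cores, clock_hz, flops_per_cycle):
    """Theoretical peak in FLOP/s: cores x clock x FLOPs per cycle."""
    return cores * clock_hz * flops_per_cycle


def aggregate_memory_tb(nodes, gb_per_node):
    """Total memory capacity in decimal terabytes."""
    return nodes * gb_per_node / 1000


def aggregate_bandwidth_tbs(nodes, sockets_per_node, channels_per_socket,
                            transfers_per_s, bytes_per_transfer):
    """Theoretical peak memory bandwidth in TB/s summed over all nodes."""
    per_node = (sockets_per_node * channels_per_socket
                * transfers_per_s * bytes_per_transfer)
    return nodes * per_node / 1e12


if __name__ == "__main__":
    # Astra, slide 11: 2,592 nodes, 28-core CPUs at 2.0 GHz, 128 GB per node.
    cores = total_cores(2_592, SOCKETS_PER_NODE, 28)
    print("Astra cores:", cores)
    print("Astra peak PFLOP/s:", peak_flops(cores, 2.0e9, DP_FLOPS_PER_CYCLE) / 1e15)
    print("Astra memory TB:", aggregate_memory_tb(2_592, 128))
    print("Astra memory TB/s:", aggregate_bandwidth_tbs(
        2_592, SOCKETS_PER_NODE, DDR_CHANNELS_PER_SOCKET,
        DDR_TRANSFERS_PER_S, BYTES_PER_TRANSFER))

    # Catalyst UK, slide 9: 64 nodes of 32-core CPUs per site, 3 sites.
    site_cores = total_cores(64, SOCKETS_PER_NODE, 32)
    print("Catalyst cores per site:", site_cores, "across 3 sites:", 3 * site_cores)
```

Under these assumptions the results land on the slide values after rounding: 145,152 cores, about 2.3 PFLOP/s peak, about 332 TB of memory, and 4096 cores per Catalyst site. The bandwidth estimate is only as good as the assumed channel count and memory speed.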
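
The Top500 chart on slide 6 labels a start and an end value for each of its three series, which is enough to estimate how fast performance has grown. The sketch below is illustrative only. It assumes a 25-year span (from the slide title) and pairs the six labelled values with the series by magnitude, since SUM must be at least N=1, which must be at least N=500. Both the pairing and the span are assumptions, and the function names are hypothetical.

```python
"""Rough growth-rate estimate from the Top500 values labelled on slide 6."""
import math

YEARS = 25  # assumed span, taken from "over the past 25 years"

# (start, end) in FLOP/s. The pairing of labels to series is an assumption
# made by magnitude: SUM >= N=1 >= N=500 at any point in time.
SERIES = {
    "SUM": (1.17e12, 1.56e18),
    "N=1": (59.7e9, 149e15),
    "N=500": (422e6, 1.01e15),
}


def annual_growth(start, end, years):
    """Compound annual growth factor between two performance values."""
    return (end / start) ** (1.0 / years)


def doubling_time_years(start, end, years):
    """Years needed to double at the compound rate implied by start and end."""
    return years * math.log(2) / math.log(end / start)


def summarize(series=SERIES, years=YEARS):
    """Map each series name to (annual growth factor, doubling time in years)."""
    return {
        name: (annual_growth(start, end, years),
               doubling_time_years(start, end, years))
        for name, (start, end) in series.items()
    }


if __name__ == "__main__":
    for name, (growth, doubling) in summarize().items():
        print(f"{name:>5}: x{growth:.2f} per year, "
              f"doubling every {doubling * 12:.0f} months")
```

With these inputs every series grows by a factor of roughly 1.75 to 1.8 per year, which corresponds to a doubling time of a little over one year.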