SlideShare ist ein Scribd-Unternehmen logo
1 von 21
Downloaden Sie, um offline zu lesen
© 2017 Arm Limited
Arm Architecture HPC Workshop
Linaro, HiSilicon, Huawei
HPC Network
Stack on Arm
Pavel Shamis/Pasha, Principal Research
Engineer
© 2017 Arm Limited
Arm Architecture HPC Workshop
Linaro, HiSilicon, Huawei
HPC Network
Stack on Arm
Pavel Shamis/Pasha, Principal Research
Engineer
© 2017 Arm Limited
Arm Architecture HPC Workshop
Linaro, HiSilicon, Huawei
Let’s talk about
MPI
Pavel Shamis/Pasha, Principal Research
Engineer
© 2017 Arm Limited
Let’s talk about
RDMA…
© 2017 Arm Limited5
VERBs API on Arm
• Besides bug fixes not much work was required
• Mellanox OFED 2.4 and above supports Arm
• Linux Kernel 4.5.0 and above (maybe even earlier)
• Linux Distribution Support – on going process
• OFED – no official ARMv8 support
© 2017 Arm Limited6
OpenUCX v1.3.0
• https://github.com/openucx/ucx/releases/tag/v1.3.0
WWW.OPENUCX.ORG
https://github.com/openucx/ucx
UC-T (Hardware Transports) - Low Level API
RMA, Atomic, Tag-matching, Send/Recv, Active Message
Transport for InfiniBand VERBs
driver
RC UD XRC DCT
Transport for intra-node host memory communication
SYSV POSIX KNEM CMA XPMEM
Transport for
Accelerator Memory
communucation
GPU
Transport for
Gemini/Aries
drivers
GNI
UC-S
(Services)
Common utilities
UC-P (Protocols) - High Level API
Transport selection, cross-transrport multi-rail, fragmentation, operations not supported by hardware
Message Passing API Domain:
tag matching, randevouze
PGAS API Domain:
RMAs, Atomics
Task Based API Domain:
Active Messages
I/O API Domain:
Stream
Utilities
Data
stractures
Hardware
MPICH, Open-MPI, etc.
OpenSHMEM, UPC, CAF, X10,
Chapel, etc.
Parsec, OCR, Legions, etc. Burst buffer, ADIOS, etc.
Applications
UCX
Memory
Management
OFA Verbs Driver Cray Driver OS Kernel Cuda
© 2017 Arm Limited7
UCX Features
• Support for InfiniBand and RoCE: RC, UD, DC
• InfiniBand: Hardware tag-matching, extended AMOs, etc.
• Support for GPU Devices/Memory – AMD ROCM, Nvidia CUDA
• Support for TCP (Beta)
• Java bindings (Beta)
• Support for Accelerated Verbs – 40% speedup on Arm compared to vanilla Verbs
• Support for UGNI API for Aries and Gemini (Thanks to ORNL and LANL!)
• Support for Shared Memory: KNEM, CMA, XPMEM, Posix, SySV
• Support for x86, ARMv8, Power
• Efficient memory polling – 36% increase in efficiency on Arm
• UCX interface is integrated with MPICH, OpenMPI, OSHMEM, ORNL-SHMEM, OSSS-SHMEM,
etc.
© 2017 Arm Limited
Let’s talk about MPI…
© 2017 Arm Limited9
Message Passing Interface - MPI
De-facto standard developed by HPC community - https://www.mpi-forum.org/
Excellent overview of MPI by Jeff Squyres - https://www.slideshare.net/jsquyres/the-
message-passing-interface-mpi-in-laymans-terms
Node 1 Node 2
© 2017 Arm Limited10
Programing models
MPICH 3.3b – works on ARMv8
Open MPI 3.x – works on ARMv8
MVAPICH 2.3b – works on ARMv8
OSHMEM – work on ARMv8
© 2017 Arm Limited11
HPE Comanche (Apollo 70) with Cavium Thunder X2 SINGLE core,
Mellanox ConnextX-4 100Gb/s (EDR) - Bandwidth
0
2000
4000
6000
8000
10000
12000
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
MB/s
Message Size
MPI Bandwidth
Higherisbetter
© 2017 Arm Limited12
0
1
2
3
4
5
6
7
8
9
0 1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192
Micro-Second
Message Size
MPI Latency
HPE Comanche (Apollo 70) with Cavium Thunder X2, Mellanox
ConnectX-4 100Gb/s (EDR) – Latency/Ping Pong
Lowerisbetter
© 2017 Arm Limited13
HPE Comanche (Apollo 70) with Cavium Thunder X2, Mellanox
ConnectX-4 100Gb/s (EDR) – MPI Message Rate (28 cores)
0
10
20
30
40
50
60
70
80
90
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
MessagePerSecond
Millions
Message Size
MPI Message Rate
Higherisbetter
© 2017 Arm Limited14
Lets talk about scale…
© 2017 Arm Limited15
MPI is HUGE!
• MPI Language bindings and compiler wrappers – C, Fortran
• MPI runtimes – PMI, PMI-2, ORTE, PMIX, etc.
• MPI Network layer – supports every possible and impossible exotic network device
Open MPI
MVAPICH
Over 1,200,000 code lines !
© 2017 Arm Limited16
Let’s talk about testing …
Thanks to Howard Pritchard (LANL) and Hari Subramoni (MVAPICH) for
providing this information
Open MPI - 150,000 (aggregated) nightly tests !
MVAPICH – 18,000 Core hours (750 days) per-
commit
© 2017 Arm Limited17
Open MPI testing
CPU arch: CPU types: x86_64 , ARMv8, PPC64le
Compilers: clang37, clang38, gcc (multiple versions, 32/64bit), ibm xlc , absoft 18.0,
armclang (18.3)
Distros: AWS linux 17.03, CentOS, CrayCLE, etc
Thanks to Howard Pritchard (LANL) for providing this information
Arch x Compiler x OS x Interconnect x MPI configurations
© 2017 Arm Limited18
How YOU can help ?
• Providing hardware and software
• Running CI for MPI projects (Jenkins, etc.)
• Running distributed/scale regression (MTT, etc)
• Been active member of the community (mail lists, github, etc)
• https://www.open-mpi.org
• https://www.mpich.org
• http://mvapich.cse.ohio-state.edu
• Answer user questions, fix bugs, etc
• Contribute new features and optimizations !
• Participate in MPI forum - https://www.mpi-forum.org
© 2017 Arm Limited19
HPCAC - http://hpcadvisorycouncil.com/
2020 © 2017 Arm Limited
The Arm trademarks featured in this presentation are registered trademarks or
trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights
reserved. All other marks featured may be trademarks of their respective owners.
www.arm.com/company/policies/trademarks
2121
Thank You!
Danke!
Merci!
!
!
Gracias!
Kiitos!
감사합니다
ध"यवाद
© 2017 Arm Limited

Weitere ähnliche Inhalte

Was ist angesagt?

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloLinaro
 
OpenPOWER foundation update new executive director and bright open future_i...
OpenPOWER  foundation update  new executive director and bright open future_i...OpenPOWER  foundation update  new executive director and bright open future_i...
OpenPOWER foundation update new executive director and bright open future_i...Ganesan Narayanasamy
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018Linaro
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorLinaro
 
Linux on RISC-V with Open Source Hardware (Open Source Summit Japan 2020)
Linux on RISC-V with Open Source Hardware (Open Source Summit Japan 2020)Linux on RISC-V with Open Source Hardware (Open Source Summit Japan 2020)
Linux on RISC-V with Open Source Hardware (Open Source Summit Japan 2020)Drew Fustini
 
Redfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureRedfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureBruno Cornec
 
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...Linaro
 
BKK16-305B ILP32 Performance on AArch64
BKK16-305B ILP32 Performance on AArch64BKK16-305B ILP32 Performance on AArch64
BKK16-305B ILP32 Performance on AArch64Linaro
 
IPMI is dead, Long live Redfish
IPMI is dead, Long live RedfishIPMI is dead, Long live Redfish
IPMI is dead, Long live RedfishBruno Cornec
 
Isn’t it Ironic that a Redfish is software defining you
Isn’t it Ironic that a Redfish is software defining you Isn’t it Ironic that a Redfish is software defining you
Isn’t it Ironic that a Redfish is software defining you Bruno Cornec
 
BKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing HadoopBKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing HadoopLinaro
 
180 nm Tape out experience using Open POWER ISA
180 nm Tape out experience using Open POWER ISA180 nm Tape out experience using Open POWER ISA
180 nm Tape out experience using Open POWER ISAGanesan Narayanasamy
 
Post-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC EcosystemPost-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC EcosystemLinaro
 
Copr HD OpenStack Day India
Copr HD OpenStack Day IndiaCopr HD OpenStack Day India
Copr HD OpenStack Day Indiaopenstackindia
 
Seagate - ceph day taiwan 2017 opening session
Seagate - ceph day taiwan 2017 opening sessionSeagate - ceph day taiwan 2017 opening session
Seagate - ceph day taiwan 2017 opening sessioninwin stack
 
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...Linaro
 
LAS16-200: SCMI - System Management and Control Interface
LAS16-200:  SCMI - System Management and Control InterfaceLAS16-200:  SCMI - System Management and Control Interface
LAS16-200: SCMI - System Management and Control InterfaceLinaro
 
2013 linux days final
2013 linux days final2013 linux days final
2013 linux days finalRandomShare
 
AWS Summit Singapore 2019 | Latest Trends for Cloud-Native Application Develo...
AWS Summit Singapore 2019 | Latest Trends for Cloud-Native Application Develo...AWS Summit Singapore 2019 | Latest Trends for Cloud-Native Application Develo...
AWS Summit Singapore 2019 | Latest Trends for Cloud-Native Application Develo...AWS Summits
 

Was ist angesagt? (20)

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
 
OpenPOWER foundation update new executive director and bright open future_i...
OpenPOWER  foundation update  new executive director and bright open future_i...OpenPOWER  foundation update  new executive director and bright open future_i...
OpenPOWER foundation update new executive director and bright open future_i...
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
 
Linux on RISC-V with Open Source Hardware (Open Source Summit Japan 2020)
Linux on RISC-V with Open Source Hardware (Open Source Summit Japan 2020)Linux on RISC-V with Open Source Hardware (Open Source Summit Japan 2020)
Linux on RISC-V with Open Source Hardware (Open Source Summit Japan 2020)
 
Redfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureRedfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined Infrastructure
 
OpenPOWER Latest Updates
OpenPOWER Latest UpdatesOpenPOWER Latest Updates
OpenPOWER Latest Updates
 
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete...
 
BKK16-305B ILP32 Performance on AArch64
BKK16-305B ILP32 Performance on AArch64BKK16-305B ILP32 Performance on AArch64
BKK16-305B ILP32 Performance on AArch64
 
IPMI is dead, Long live Redfish
IPMI is dead, Long live RedfishIPMI is dead, Long live Redfish
IPMI is dead, Long live Redfish
 
Isn’t it Ironic that a Redfish is software defining you
Isn’t it Ironic that a Redfish is software defining you Isn’t it Ironic that a Redfish is software defining you
Isn’t it Ironic that a Redfish is software defining you
 
BKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing HadoopBKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing Hadoop
 
180 nm Tape out experience using Open POWER ISA
180 nm Tape out experience using Open POWER ISA180 nm Tape out experience using Open POWER ISA
180 nm Tape out experience using Open POWER ISA
 
Post-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC EcosystemPost-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC Ecosystem
 
Copr HD OpenStack Day India
Copr HD OpenStack Day IndiaCopr HD OpenStack Day India
Copr HD OpenStack Day India
 
Seagate - ceph day taiwan 2017 opening session
Seagate - ceph day taiwan 2017 opening sessionSeagate - ceph day taiwan 2017 opening session
Seagate - ceph day taiwan 2017 opening session
 
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
LAS16-301: OpenStack on Aarch64, running in production, upstream improvements...
 
LAS16-200: SCMI - System Management and Control Interface
LAS16-200:  SCMI - System Management and Control InterfaceLAS16-200:  SCMI - System Management and Control Interface
LAS16-200: SCMI - System Management and Control Interface
 
2013 linux days final
2013 linux days final2013 linux days final
2013 linux days final
 
AWS Summit Singapore 2019 | Latest Trends for Cloud-Native Application Develo...
AWS Summit Singapore 2019 | Latest Trends for Cloud-Native Application Develo...AWS Summit Singapore 2019 | Latest Trends for Cloud-Native Application Develo...
AWS Summit Singapore 2019 | Latest Trends for Cloud-Native Application Develo...
 

Ähnlich wie HPC network stack on ARM - Linaro HPC Workshop 2018

UCX: An Open Source Framework for HPC Network APIs and Beyond
UCX: An Open Source Framework for HPC Network APIs and BeyondUCX: An Open Source Framework for HPC Network APIs and Beyond
UCX: An Open Source Framework for HPC Network APIs and BeyondEd Dodds
 
Exploring the Open Source Linux Ecosystem
Exploring the Open Source Linux EcosystemExploring the Open Source Linux Ecosystem
Exploring the Open Source Linux EcosystemIBM
 
Ucx an open source framework for hpc network ap is and beyond
Ucx  an open source framework for hpc network ap is and beyondUcx  an open source framework for hpc network ap is and beyond
Ucx an open source framework for hpc network ap is and beyondinside-BigData.com
 
Arm as a Viable Architecture for HPC and AI
Arm as a Viable Architecture for HPC and AIArm as a Viable Architecture for HPC and AI
Arm as a Viable Architecture for HPC and AIinside-BigData.com
 
Arm - ceph on arm update
Arm - ceph on arm updateArm - ceph on arm update
Arm - ceph on arm updateinwin stack
 
ONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep LearningONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep LearningHagay Lupesko
 
High-Performance and Scalable Designs of Programming Models for Exascale Systems
High-Performance and Scalable Designs of Programming Models for Exascale SystemsHigh-Performance and Scalable Designs of Programming Models for Exascale Systems
High-Performance and Scalable Designs of Programming Models for Exascale Systemsinside-BigData.com
 
OFI Overview 2019 Webinar
OFI Overview 2019 WebinarOFI Overview 2019 Webinar
OFI Overview 2019 Webinarseanhefty
 
Introducing ucx unified communications x framework
Introducing ucx unified communications x frameworkIntroducing ucx unified communications x framework
Introducing ucx unified communications x frameworkinside-BigData.com
 
Panda scalable hpc_bestpractices_tue100418
Panda scalable hpc_bestpractices_tue100418Panda scalable hpc_bestpractices_tue100418
Panda scalable hpc_bestpractices_tue100418inside-BigData.com
 
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale SystemsDesigning Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systemsinside-BigData.com
 
Efficient software development with heterogeneous devices
Efficient software development with heterogeneous devicesEfficient software development with heterogeneous devices
Efficient software development with heterogeneous devicesArm
 
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati..."The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...Edge AI and Vision Alliance
 
Ceph Day Chicago - Deploying flash storage for Ceph without compromising perf...
Ceph Day Chicago - Deploying flash storage for Ceph without compromising perf...Ceph Day Chicago - Deploying flash storage for Ceph without compromising perf...
Ceph Day Chicago - Deploying flash storage for Ceph without compromising perf...Ceph Community
 
Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...
Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...
Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...Arm
 

Ähnlich wie HPC network stack on ARM - Linaro HPC Workshop 2018 (20)

UCX: An Open Source Framework for HPC Network APIs and Beyond
UCX: An Open Source Framework for HPC Network APIs and BeyondUCX: An Open Source Framework for HPC Network APIs and Beyond
UCX: An Open Source Framework for HPC Network APIs and Beyond
 
RDMA on ARM
RDMA on ARMRDMA on ARM
RDMA on ARM
 
Exploring the Open Source Linux Ecosystem
Exploring the Open Source Linux EcosystemExploring the Open Source Linux Ecosystem
Exploring the Open Source Linux Ecosystem
 
Ucx an open source framework for hpc network ap is and beyond
Ucx  an open source framework for hpc network ap is and beyondUcx  an open source framework for hpc network ap is and beyond
Ucx an open source framework for hpc network ap is and beyond
 
Arm as a Viable Architecture for HPC and AI
Arm as a Viable Architecture for HPC and AIArm as a Viable Architecture for HPC and AI
Arm as a Viable Architecture for HPC and AI
 
ARM HPC Ecosystem
ARM HPC EcosystemARM HPC Ecosystem
ARM HPC Ecosystem
 
Arm - ceph on arm update
Arm - ceph on arm updateArm - ceph on arm update
Arm - ceph on arm update
 
An Update on Arm HPC
An Update on Arm HPCAn Update on Arm HPC
An Update on Arm HPC
 
Arm in HPC
Arm in HPCArm in HPC
Arm in HPC
 
ONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep LearningONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep Learning
 
High-Performance and Scalable Designs of Programming Models for Exascale Systems
High-Performance and Scalable Designs of Programming Models for Exascale SystemsHigh-Performance and Scalable Designs of Programming Models for Exascale Systems
High-Performance and Scalable Designs of Programming Models for Exascale Systems
 
OFI Overview 2019 Webinar
OFI Overview 2019 WebinarOFI Overview 2019 Webinar
OFI Overview 2019 Webinar
 
Introducing ucx unified communications x framework
Introducing ucx unified communications x frameworkIntroducing ucx unified communications x framework
Introducing ucx unified communications x framework
 
Panda scalable hpc_bestpractices_tue100418
Panda scalable hpc_bestpractices_tue100418Panda scalable hpc_bestpractices_tue100418
Panda scalable hpc_bestpractices_tue100418
 
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale SystemsDesigning Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
Efficient software development with heterogeneous devices
Efficient software development with heterogeneous devicesEfficient software development with heterogeneous devices
Efficient software development with heterogeneous devices
 
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati..."The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
 
Ceph Day Chicago - Deploying flash storage for Ceph without compromising perf...
Ceph Day Chicago - Deploying flash storage for Ceph without compromising perf...Ceph Day Chicago - Deploying flash storage for Ceph without compromising perf...
Ceph Day Chicago - Deploying flash storage for Ceph without compromising perf...
 
Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...
Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...
Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...
 

Mehr von Linaro

Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaLinaro
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allLinaro
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMULinaro
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MLinaro
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation Linaro
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootLinaro
 
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...Linaro
 
HKG18-317 - Arm Server Ready Program
HKG18-317 - Arm Server Ready ProgramHKG18-317 - Arm Server Ready Program
HKG18-317 - Arm Server Ready ProgramLinaro
 
HKG18-312 - CMSIS-NN
HKG18-312 - CMSIS-NNHKG18-312 - CMSIS-NN
HKG18-312 - CMSIS-NNLinaro
 
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...Linaro
 
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...Linaro
 
HKG18-212 - Trusted Firmware M: Introduction
HKG18-212 - Trusted Firmware M: IntroductionHKG18-212 - Trusted Firmware M: Introduction
HKG18-212 - Trusted Firmware M: IntroductionLinaro
 
HKG18-116 - RAS Solutions for Arm64 Servers
HKG18-116 - RAS Solutions for Arm64 ServersHKG18-116 - RAS Solutions for Arm64 Servers
HKG18-116 - RAS Solutions for Arm64 ServersLinaro
 
HKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with CoresightHKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with CoresightLinaro
 
HKG18-TR12 - LAVA for LITE Platforms and Tests
HKG18-TR12 - LAVA for LITE Platforms and TestsHKG18-TR12 - LAVA for LITE Platforms and Tests
HKG18-TR12 - LAVA for LITE Platforms and TestsLinaro
 
HKG18-419 - OpenHPC on Ansible
HKG18-419 - OpenHPC on AnsibleHKG18-419 - OpenHPC on Ansible
HKG18-419 - OpenHPC on AnsibleLinaro
 

Mehr von Linaro (19)

Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qa
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMU
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8M
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted boot
 
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
 
HKG18-317 - Arm Server Ready Program
HKG18-317 - Arm Server Ready ProgramHKG18-317 - Arm Server Ready Program
HKG18-317 - Arm Server Ready Program
 
HKG18-312 - CMSIS-NN
HKG18-312 - CMSIS-NNHKG18-312 - CMSIS-NN
HKG18-312 - CMSIS-NN
 
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integra...
 
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to ...
 
HKG18-212 - Trusted Firmware M: Introduction
HKG18-212 - Trusted Firmware M: IntroductionHKG18-212 - Trusted Firmware M: Introduction
HKG18-212 - Trusted Firmware M: Introduction
 
HKG18-116 - RAS Solutions for Arm64 Servers
HKG18-116 - RAS Solutions for Arm64 ServersHKG18-116 - RAS Solutions for Arm64 Servers
HKG18-116 - RAS Solutions for Arm64 Servers
 
HKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with CoresightHKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with Coresight
 
HKG18-TR12 - LAVA for LITE Platforms and Tests
HKG18-TR12 - LAVA for LITE Platforms and TestsHKG18-TR12 - LAVA for LITE Platforms and Tests
HKG18-TR12 - LAVA for LITE Platforms and Tests
 
HKG18-419 - OpenHPC on Ansible
HKG18-419 - OpenHPC on AnsibleHKG18-419 - OpenHPC on Ansible
HKG18-419 - OpenHPC on Ansible
 

Kürzlich hochgeladen

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 

Kürzlich hochgeladen (20)

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 

HPC network stack on ARM - Linaro HPC Workshop 2018

  • 1. © 2017 Arm Limited Arm Architecture HPC Workshop Linaro, HiSilicon, Huawei HPC Network Stack on Arm Pavel Shamis/Pasha, Principal Research Engineer
  • 2. © 2017 Arm Limited Arm Architecture HPC Workshop Linaro, HiSilicon, Huawei HPC Network Stack on Arm Pavel Shamis/Pasha, Principal Research Engineer
  • 3. © 2017 Arm Limited Arm Architecture HPC Workshop Linaro, HiSilicon, Huawei Let’s talk about MPI Pavel Shamis/Pasha, Principal Research Engineer
  • 4. © 2017 Arm Limited Let’s talk about RDMA…
  • 5. © 2017 Arm Limited5 VERBs API on Arm • Besides bug fixes not much work was required • Mellanox OFED 2.4 and above supports Arm • Linux Kernel 4.5.0 and above (maybe even earlier) • Linux Distribution Support – on going process • OFED – no official ARMv8 support
  • 6. © 2017 Arm Limited6 OpenUCX v1.3.0 • https://github.com/openucx/ucx/releases/tag/v1.3.0 WWW.OPENUCX.ORG https://github.com/openucx/ucx UC-T (Hardware Transports) - Low Level API RMA, Atomic, Tag-matching, Send/Recv, Active Message Transport for InfiniBand VERBs driver RC UD XRC DCT Transport for intra-node host memory communication SYSV POSIX KNEM CMA XPMEM Transport for Accelerator Memory communucation GPU Transport for Gemini/Aries drivers GNI UC-S (Services) Common utilities UC-P (Protocols) - High Level API Transport selection, cross-transrport multi-rail, fragmentation, operations not supported by hardware Message Passing API Domain: tag matching, randevouze PGAS API Domain: RMAs, Atomics Task Based API Domain: Active Messages I/O API Domain: Stream Utilities Data stractures Hardware MPICH, Open-MPI, etc. OpenSHMEM, UPC, CAF, X10, Chapel, etc. Parsec, OCR, Legions, etc. Burst buffer, ADIOS, etc. Applications UCX Memory Management OFA Verbs Driver Cray Driver OS Kernel Cuda
  • 7. © 2017 Arm Limited7 UCX Features • Support for InfiniBand and RoCE: RC, UD, DC • InfiniBand: Hardware tag-matching, extended AMOs, etc. • Support for GPU Devices/Memory – AMD ROCM, Nvidia CUDA • Support for TCP (Beta) • Java bindings (Beta) • Support for Accelerated Verbs – 40% speedup on Arm compared to vanilla Verbs • Support for UGNI API for Aries and Gemini (Thanks to ORNL and LANL!) • Support for Shared Memory: KNEM, CMA, XPMEM, Posix, SySV • Support for x86, ARMv8, Power • Efficient memory polling – 36% increase in efficiency on Arm • UCX interface is integrated with MPICH, OpenMPI, OSHMEM, ORNL-SHMEM, OSSS-SHMEM, etc.
  • 8. © 2017 Arm Limited Let’s talk about MPI…
  • 9. © 2017 Arm Limited9 Message Passing Interface - MPI De-facto standard developed by HPC community - https://www.mpi-forum.org/ Excellent overview of MPI by Jeff Squyres - https://www.slideshare.net/jsquyres/the- message-passing-interface-mpi-in-laymans-terms Node 1 Node 2
  • 10. © 2017 Arm Limited10 Programing models MPICH 3.3b – works on ARMv8 Open MPI 3.x – works on ARMv8 MVAPICH 2.3b – works on ARMv8 OSHMEM – work on ARMv8
  • 11. © 2017 Arm Limited11 HPE Comanche (Apollo 70) with Cavium Thunder X2 SINGLE core, Mellanox ConnextX-4 100Gb/s (EDR) - Bandwidth 0 2000 4000 6000 8000 10000 12000 1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288 1048576 2097152 4194304 MB/s Message Size MPI Bandwidth Higherisbetter
  • 12. © 2017 Arm Limited12 0 1 2 3 4 5 6 7 8 9 0 1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 Micro-Second Message Size MPI Latency HPE Comanche (Apollo 70) with Cavium Thunder X2, Mellanox ConnectX-4 100Gb/s (EDR) – Latency/Ping Pong Lowerisbetter
  • 13. © 2017 Arm Limited13 HPE Comanche (Apollo 70) with Cavium Thunder X2, Mellanox ConnectX-4 100Gb/s (EDR) – MPI Message Rate (28 cores) 0 10 20 30 40 50 60 70 80 90 1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288 1048576 2097152 4194304 MessagePerSecond Millions Message Size MPI Message Rate Higherisbetter
  • 14. © 2017 Arm Limited14 Lets talk about scale…
  • 15. © 2017 Arm Limited15 MPI is HUGE! • MPI Language bindings and compiler wrappers – C, Fortran • MPI runtimes – PMI, PMI-2, ORTE, PMIX, etc. • MPI Network layer – supports every possible and impossible exotic network device Open MPI MVAPICH Over 1,200,000 code lines !
  • 16. © 2017 Arm Limited16 Let’s talk about testing … Thanks to Howard Pritchard (LANL) and Hari Subramoni (MVAPICH) for providing this information Open MPI - 150,000 (aggregated) nightly tests ! MVAPICH – 18,000 Core hours (750 days) per- commit
  • 17. © 2017 Arm Limited17 Open MPI testing CPU arch: CPU types: x86_64 , ARMv8, PPC64le Compilers: clang37, clang38, gcc (multiple versions, 32/64bit), ibm xlc , absoft 18.0, armclang (18.3) Distros: AWS linux 17.03, CentOS, CrayCLE, etc Thanks to Howard Pritchard (LANL) for providing this information Arch x Compiler x OS x Interconnect x MPI configurations
  • 18. © 2017 Arm Limited18 How YOU can help ? • Providing hardware and software • Running CI for MPI projects (Jenkins, etc.) • Running distributed/scale regression (MTT, etc) • Been active member of the community (mail lists, github, etc) • https://www.open-mpi.org • https://www.mpich.org • http://mvapich.cse.ohio-state.edu • Answer user questions, fix bugs, etc • Contribute new features and optimizations ! • Participate in MPI forum - https://www.mpi-forum.org
  • 19. © 2017 Arm Limited19 HPCAC - http://hpcadvisorycouncil.com/
  • 20. 2020 © 2017 Arm Limited The Arm trademarks featured in this presentation are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners. www.arm.com/company/policies/trademarks