
HPC network stack on ARM - Linaro HPC Workshop 2018

Speaker: Pavel Shamis
Company: Arm
Speaker Bio:
"Pavel is a Principal Research Engineer at ARM with over 16 years of experience in development HPC solutions. His work is focused on co-design software and hardware building blocks for high-performance interconnect technologies, development communication middleware and novel programming models. Prior to joining ARM, he spent five years at Oak Ridge National Laboratory (ORNL) as a research scientist at Computer Science and Math Division (CSMD). In this role, Pavel was responsible for research and development multiple projects in high-performance communication domain including: Collective Communication Offload (CORE-Direct & Cheetah), OpenSHMEM, and OpenUCX. Before joining ORNL, Pavel spent ten years at Mellanox Technologies, where he led Mellanox HPC team and was one of the key driver in enablement Mellanox HPC software stack, including OFA software stack, OpenMPI, MVAPICH, OpenSHMEM, and other.

Pavel is a recipient of the prestigious R&D 100 Award for his contribution to the development of the CORE-Direct collective offload technology, and he has published more than 20 research papers.
"
Talk Title: HPC network stack on ARM
Talk Abstract:
Applications, programming languages, and libraries that leverage sophisticated network hardware capabilities have a natural advantage in today's and tomorrow's high-performance and data-center computing environments. Modern RDMA-based network interconnects provide incredibly rich functionality (RDMA, atomics, OS bypass, etc.) that enables low-latency and high-bandwidth communication services. This functionality is supported by a variety of interconnect technologies such as InfiniBand, RoCE, iWARP, Intel OPA, Cray's Aries/Gemini, and others. Over the last decade, the HPC community has developed a variety of user- and kernel-level protocols and libraries that enable high-performance applications over RDMA interconnects, including MPI, SHMEM, UPC, etc. With the emerging availability of HPC solutions based on the Arm CPU architecture, it is important to understand how Arm integrates with RDMA hardware and the HPC network software stack. In this talk, we will give an overview of the Arm architecture and system software stack, including MPI runtimes, OpenSHMEM, and OpenUCX.

Published in: Technology

HPC network stack on ARM - Linaro HPC Workshop 2018

  1. 1. © 2017 Arm Limited Arm Architecture HPC Workshop Linaro, HiSilicon, Huawei HPC Network Stack on Arm Pavel Shamis/Pasha, Principal Research Engineer
  2. 2. © 2017 Arm Limited Arm Architecture HPC Workshop Linaro, HiSilicon, Huawei HPC Network Stack on Arm Pavel Shamis/Pasha, Principal Research Engineer
  3. 3. © 2017 Arm Limited Arm Architecture HPC Workshop Linaro, HiSilicon, Huawei Let’s talk about MPI Pavel Shamis/Pasha, Principal Research Engineer
  4. 4. © 2017 Arm Limited Let’s talk about RDMA…
  5. 5. © 2017 Arm Limited VERBs API on Arm • Besides bug fixes, not much work was required • Mellanox OFED 2.4 and above supports Arm • Linux kernel 4.5.0 and above (maybe even earlier) • Linux distribution support – an ongoing process • OFED – no official ARMv8 support
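
As a quick illustration of the point above (not part of the deck): the standard Verbs API needs nothing Arm-specific at the application level, so a plain libibverbs program runs unchanged on an ARMv8 node. The sketch below simply enumerates the RDMA devices the node exposes; the build line (gcc list_devices.c -libverbs) is an assumption.

/* Hypothetical sketch, not from the presentation: list RDMA devices via libibverbs. */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num;
    struct ibv_device **list = ibv_get_device_list(&num);   /* NULL-terminated array */
    if (list == NULL) {
        perror("ibv_get_device_list");
        return 1;
    }
    for (int i = 0; i < num; i++)
        printf("device %d: %s\n", i, ibv_get_device_name(list[i]));
    ibv_free_device_list(list);
    return 0;
}
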
  6. 6. © 2017 Arm Limited OpenUCX v1.3.0 • https://github.com/openucx/ucx/releases/tag/v1.3.0 • WWW.OPENUCX.ORG • https://github.com/openucx/ucx • [Architecture diagram] Applications (MPICH, Open MPI, etc.; OpenSHMEM, UPC, CAF, X10, Chapel, etc.; PaRSEC, OCR, Legion, etc.; burst buffer, ADIOS, etc.) sit on UC-P (Protocols), the high-level API: transport selection, cross-transport multi-rail, fragmentation, and operations not supported by hardware, exposed through Message Passing (tag matching, rendezvous), PGAS (RMA, atomics), Task-Based (active messages), and I/O (stream) API domains. UC-P builds on UC-T (Hardware Transports), the low-level API for RMA, atomics, tag matching, send/recv, and active messages, with transports for InfiniBand Verbs (RC, UD, XRC, DCT), intra-node host memory (SysV, POSIX, KNEM, CMA, XPMEM), accelerator memory (GPU), and Gemini/Aries (GNI) drivers. UC-S (Services) supplies common utilities, data structures, and memory management. The stack sits on the OFA Verbs driver, the Cray driver, CUDA, and the OS kernel.
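
To make the layering above concrete, here is a minimal UC-P sketch (not from the deck): it reads the UCX configuration, creates a context requesting the tag-matching and RMA features, and spawns a single-threaded worker. The feature selection and build line (gcc ucp_init.c -lucp -lucs) are assumptions; a real application would go on to create endpoints and post communication operations.

/* Hypothetical sketch, not from the presentation: bring up the UCX high-level (UC-P) layer. */
#include <stdio.h>
#include <ucp/api/ucp.h>

int main(void)
{
    ucp_config_t  *config;
    ucp_context_h  context;
    ucp_worker_h   worker;

    if (ucp_config_read(NULL, NULL, &config) != UCS_OK)
        return 1;

    ucp_params_t params = {
        .field_mask = UCP_PARAM_FIELD_FEATURES,
        .features   = UCP_FEATURE_TAG | UCP_FEATURE_RMA   /* tag matching + RMA */
    };
    if (ucp_init(&params, config, &context) != UCS_OK)
        return 1;
    ucp_config_release(config);

    ucp_worker_params_t wparams = {
        .field_mask  = UCP_WORKER_PARAM_FIELD_THREAD_MODE,
        .thread_mode = UCS_THREAD_MODE_SINGLE
    };
    if (ucp_worker_create(context, &wparams, &worker) != UCS_OK)
        return 1;

    printf("UCP context and worker created\n");

    ucp_worker_destroy(worker);
    ucp_cleanup(context);
    return 0;
}
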
  7. 7. © 2017 Arm Limited UCX Features • Support for InfiniBand and RoCE: RC, UD, DC • InfiniBand: hardware tag matching, extended AMOs, etc. • Support for GPU devices/memory – AMD ROCm, Nvidia CUDA • Support for TCP (beta) • Java bindings (beta) • Support for Accelerated Verbs – 40% speedup on Arm compared to vanilla Verbs • Support for the uGNI API for Aries and Gemini (thanks to ORNL and LANL!) • Support for shared memory: KNEM, CMA, XPMEM, POSIX, SysV • Support for x86, ARMv8, Power • Efficient memory polling – 36% increase in efficiency on Arm • The UCX interface is integrated with MPICH, Open MPI, OSHMEM, ORNL-SHMEM, OSSS-SHMEM, etc.
  8. 8. © 2017 Arm Limited Let’s talk about MPI…
  9. 9. © 2017 Arm Limited Message Passing Interface - MPI • De-facto standard developed by the HPC community - https://www.mpi-forum.org/ • Excellent overview of MPI by Jeff Squyres - https://www.slideshare.net/jsquyres/the-message-passing-interface-mpi-in-laymans-terms • [Diagram: message exchange between Node 1 and Node 2]
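
For readers new to MPI, a minimal sketch (not from the deck) of the two-node exchange the slide's diagram alludes to: rank 0 sends a single integer to rank 1 with plain point-to-point calls. The value sent and the build/run line (mpicc hello.c && mpirun -np 2 ./a.out) are illustrative assumptions.

/* Hypothetical sketch, not from the presentation: one MPI send/recv between two ranks. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                                   /* "Node 1" side */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {                           /* "Node 2" side */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
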
  10. 10. © 2017 Arm Limited Programming models • MPICH 3.3b – works on ARMv8 • Open MPI 3.x – works on ARMv8 • MVAPICH 2.3b – works on ARMv8 • OSHMEM – works on ARMv8
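
To contrast the programming models listed above, a minimal OpenSHMEM sketch (not from the deck): instead of matched send/receive, PE 0 writes directly into PE 1's copy of a symmetric buffer with a one-sided put. The launcher names (oshcc/oshrun) and the value written are assumptions; run with at least two PEs.

/* Hypothetical sketch, not from the presentation: a one-sided OpenSHMEM put. */
#include <stdio.h>
#include <shmem.h>

int main(void)
{
    shmem_init();
    int me = shmem_my_pe();

    long *dst = shmem_malloc(sizeof(long));   /* symmetric allocation on every PE */
    *dst = 0;
    shmem_barrier_all();

    if (me == 0) {
        long value = 42;
        shmem_long_put(dst, &value, 1, 1);    /* write into PE 1's copy of dst */
    }
    shmem_barrier_all();

    if (me == 1)
        printf("PE 1 holds %ld\n", *dst);

    shmem_free(dst);
    shmem_finalize();
    return 0;
}
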
  11. 11. © 2017 Arm Limited HPE Comanche (Apollo 70) with Cavium ThunderX2, SINGLE core, Mellanox ConnectX-4 100Gb/s (EDR) – Bandwidth. [Plot: MPI bandwidth in MB/s vs. message size from 1 B to 4 MB; higher is better.]
  12. 12. © 2017 Arm Limited HPE Comanche (Apollo 70) with Cavium ThunderX2, Mellanox ConnectX-4 100Gb/s (EDR) – Latency/Ping-Pong. [Plot: MPI latency in microseconds vs. message size from 0 B to 8 KB; lower is better.]
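
Latency numbers like those above are typically produced with an OSU-style ping-pong; the sketch below (not from the deck, and not the actual benchmark code) shows the basic loop, where half of the averaged round-trip time approximates the one-way latency. Message size and iteration count are arbitrary assumptions.

/* Hypothetical sketch, not from the presentation: MPI ping-pong latency loop. */
#include <stdio.h>
#include <mpi.h>

#define ITERS 10000
#define SIZE  8                              /* bytes per message */

int main(int argc, char **argv)
{
    char buf[SIZE] = {0};
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < ITERS; i++) {
        if (rank == 0) {
            MPI_Send(buf, SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("average one-way latency: %.2f us\n", (t1 - t0) * 1e6 / (2.0 * ITERS));

    MPI_Finalize();
    return 0;
}
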
  13. 13. © 2017 Arm Limited HPE Comanche (Apollo 70) with Cavium ThunderX2, Mellanox ConnectX-4 100Gb/s (EDR) – MPI Message Rate (28 cores). [Plot: messages per second (millions) vs. message size from 1 B to 4 MB; higher is better.]
  14. 14. © 2017 Arm Limited Let's talk about scale…
  15. 15. © 2017 Arm Limited MPI is HUGE! • MPI language bindings and compiler wrappers – C, Fortran • MPI runtimes – PMI, PMI-2, ORTE, PMIx, etc. • MPI network layer – supports every possible and impossible exotic network device • Open MPI and MVAPICH: over 1,200,000 lines of code!
  16. 16. © 2017 Arm Limited Let's talk about testing… • Open MPI – 150,000 (aggregated) nightly tests! • MVAPICH – 18,000 core-hours (750 days) per commit • Thanks to Howard Pritchard (LANL) and Hari Subramoni (MVAPICH) for providing this information
  17. 17. © 2017 Arm Limited Open MPI testing • CPU architectures: x86_64, ARMv8, PPC64le • Compilers: clang37, clang38, gcc (multiple versions, 32/64-bit), IBM xlc, Absoft 18.0, armclang (18.3) • Distros: AWS Linux 17.03, CentOS, Cray CLE, etc. • Test matrix: Arch x Compiler x OS x Interconnect x MPI configurations • Thanks to Howard Pritchard (LANL) for providing this information
  18. 18. © 2017 Arm Limited How YOU can help? • Providing hardware and software • Running CI for MPI projects (Jenkins, etc.) • Running distributed/scale regression (MTT, etc.) • Being an active member of the community (mailing lists, GitHub, etc.) • https://www.open-mpi.org • https://www.mpich.org • http://mvapich.cse.ohio-state.edu • Answering user questions, fixing bugs, etc. • Contributing new features and optimizations! • Participating in the MPI Forum - https://www.mpi-forum.org
  19. 19. © 2017 Arm Limited HPCAC - http://hpcadvisorycouncil.com/
  20. 20. © 2017 Arm Limited The Arm trademarks featured in this presentation are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners. www.arm.com/company/policies/trademarks
  21. 21. © 2017 Arm Limited Thank You! Danke! Merci! Gracias! Kiitos! 감사합니다 धन्यवाद
