Benchmarking MPI Applications in Singularity Containers on Traditional HPC and Cloud Infrastructures
Andrei Plamadă, Jarunan Panyasantisuk
ETH Zürich – Scientific IT Services (ID | SIS)
2019 hpc-ch Forum – Cloud and Containers, 16.05.2019

Outline
§ Motivation
§ User experience:
  § Traditional HPC vs HPC in the Public Cloud
  § Singularity v2.6
§ Benchmarking MPI Applications
  § OSU Micro-Benchmarks
  § Machine Learning: TensorFlow

Motivation – Public Cloud is growing rapidly
§ 2018-2022: 20.2% CAGR for IaaS (see Forbes – Gartner)
§ Expectations
  § More competitive prices
  § More regions
  § More heterogeneous
§ Available in Switzerland
  § 2019-03-12 Google Cloud Platform in Zurich
§ Announced in Switzerland
  § 2018-03-14 Azure Switzerland North and West
Worldwide Public Cloud SaaS and IaaS Revenue Forecast (Billions of U.S. Dollars)
Year   2018   2019   2020   2021   2022
SaaS   80.0   94.8  110.5  126.7  143.7
IaaS   30.5   38.9   49.1   61.9   76.7
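
The quoted IaaS growth rate can be cross-checked against the forecast values above. The following is a minimal sketch and not the method used by the cited source: the function name `cagr` is hypothetical, and the number of compounding periods is an assumption because the slide does not state it. With the 2018 and 2022 IaaS values from the chart, counting four year-to-year intervals gives roughly 26% per year, while counting five periods gives roughly 20%, which is close to the quoted figure.

```python
def cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate as a fraction (0.20 means 20%)."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# IaaS revenue forecast in billions of U.S. dollars, copied from the chart.
IAAS_REVENUE = {2018: 30.5, 2019: 38.9, 2020: 49.1, 2021: 61.9, 2022: 76.7}

if __name__ == "__main__":
    first, last = IAAS_REVENUE[2018], IAAS_REVENUE[2022]
    # Assumption: the period count is not given in the source,
    # so both plausible conventions are printed.
    for periods in (4, 5):
        rate = 100 * cagr(first, last, periods)
        print(f"{periods} periods: {rate:.1f}% per year")
```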

Motivation – HPC is in the Cloud as per Press Releases
§ Amazon EC2
  § 2018-11-26 c5n Instances
    § Intel Xeon Platinum ~3.0 GHz, 72 vCPUs, 2.6 GB/vCPU, 100 Gbps
§ Azure
  § 2017-10-23 Cray in Azure
    § Cray XC-series, Cray CS-series
  § 2018-11-14 New H-series in preview*
    § AMD EPYC 7551 ~3.0 GHz: 60 vCPUs, 4.0 GB/vCPU, 100 Gbps EDR InfiniBand (2019-05-14 available)
    § Intel Xeon Platinum 8168 ~3.4 GHz: 44 vCPUs, 8.0 GB/vCPU, 100 Gbps EDR InfiniBand
§ Google Cloud Platform
  § 2019-04-02 Compute-Optimized VMs (C2)
    § 2nd Gen Intel Xeon Scalable Processors ~3.8 GHz, 60 vCPUs, 4.0 GB/vCPU

Motivation – Singularity as the container solution for HPC
§ Containers improve portability and can address the reproducibility issue in research (EnhanceR Survey - Science IT Consultants)
§ EnhanceR Survey - Infrastructure Providers for Container Use
§ Singularity:
  § Initially developed at LBL (Berkeley Lab) for the HPC use case (multi-tenancy)
  § Open source under the standard 3-clause BSD license: https://github.com/sylabs/singularity
  § Under active development, with 12 contributors who have more than 100 commits
  § Also available with commercial support: Singularity Pro
  § Used worldwide and recommended by vendors, e.g. NVIDIA, Azure Batch
  § Large worldwide community (Google Groups, Slack)
  § Swiss community - EnhanceR

Motivation – Singularity as the container solution for HPC
§ Main idea (diagram: software stack without and with a container)
  § Without container:
    § Host OS+Drivers+Middleware (OSDM)
    § MPI: mpirun, MPI Library
    § SSH Server
    § App: Shared MPI Library
  § With container:
    § Host OS+Drivers+Middleware (OSDM)
    § MPI: mpirun
    § SSH Server
    § Container OSDM: MPI, App, Shared MPI Library

Traditional HPC vs HPC in the Public Cloud
§ Traditional HPC (ETH – SIS – HPC)
  § Euler IV:
    § 2x18 core Intel Xeon Gold 6150 (2.7-3.7 GHz)
    § All cores available
    § HT available
    § 7.4 GB/core Memory
    § 100 Gbps InfiniBand
§ Public Cloud - Azure
  § HC-Series in preview – Standard_HC44rs
    § 2x24 core Intel Xeon Platinum 8168 (2.7-3.7 GHz)?
    § 2x2 cores used by the hypervisor?
    § HT disabled?
    § 8.0 GB/core Memory
    § 100 Gbps InfiniBand

User Experience – Traditional HPC vs HPC in the Public Cloud
§ Traditional HPC (ETH – SIS – HPC)
  § Ready to be used (LSF)
  § No maintenance / set-up
  § Login and Compute Nodes
  § Moderate flexibility regarding the software stack
  § Queue
  § It generally works as expected
§ Public Cloud - Azure
  § Needs to be set up (Slurm cluster) via CycleCloud
  § As admin, you are fully responsible
  § Master and Execute Nodes
  § High flexibility (as the admin), e.g. OpenMPI, MPICH, MVAPICH2, Intel MPI
  § Queue (high availability as admin)
  § Auto-scaling
  § https://github.com/Azure/cyclecloud-slurm/issues

User Experience on CentOS 7 – Singularity v2.6
Create
• Docker
• root access
• on your PC
Run
• Singularity
• on your PC or HPC infrastructure
§ Multi-node: MPICH ABI Compatibility initiative
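
The create-then-run workflow and the multi-node launch can be sketched as command lines. This is an illustrative sketch, not the authors' exact procedure: the image reference `example/osu-bench:latest`, the file name `osu.simg`, the binary path `/opt/osu/osu_bw` and both helper names are hypothetical, and it assumes the Docker image has been pushed to a registry that Singularity can reach through a `docker://` URI. In the multi-node case the host `mpirun` starts one `singularity exec` per rank, which is why the MPI library inside the container has to be ABI-compatible with the MPI on the host.

```python
import shlex
from typing import List


def build_image_cmd(image_path: str, docker_ref: str) -> List[str]:
    """Command that converts a Docker image into a Singularity image file."""
    return ["singularity", "build", image_path, f"docker://{docker_ref}"]


def mpi_run_cmd(num_ranks: int, image_path: str, app_argv: List[str]) -> List[str]:
    """Host-side MPI launch: mpirun starts `singularity exec` once per rank."""
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")
    return ["mpirun", "-np", str(num_ranks),
            "singularity", "exec", image_path, *app_argv]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    # shlex.join requires Python 3.8 or newer.
    print(shlex.join(build_image_cmd("osu.simg", "example/osu-bench:latest")))
    print(shlex.join(mpi_run_cmd(2, "osu.simg", ["/opt/osu/osu_bw"])))
```

Returning argument lists rather than shell strings keeps the commands safe to pass to `subprocess.run` without extra quoting.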

OSU Micro-Benchmarks – osu_bw (Gbps), 1000 iterations
Bytes  EN m2 v2.2  EC m2 v2.2  EC m2 v2.3  AN m2 v2.3  AC m2 v2.3
8            0.16        0.15        0.16        0.16        0.08
64           1.30        1.27        1.29        1.28        1.25
512          8.27        8.21        8.14        7.87        7.65
4K          37.41       37.65       37.42       37.23       36.54
32K         88.89       89.25       89.43       83.50       82.47
2M          94.75       94.59       95.19       94.25       94.30
16M         94.95       94.75       95.50       91.49       89.99
Abbreviations: Azure (A), Euler (E), MVAPICH2 (m2), Native (N), Container (C)
§ A naïve MPICH v3.3 build in the container works, but only reaches up to 10 Gbps (EC) / 4 Gbps (AC) because InfiniBand is not used
§ On Azure, host MPICH v3.3 with MVAPICH2 v2.3 in the container gives the same results as AC m2 v2.3 (up to 100 Gbps)
§ OpenMPI is not compatible with MPICH-derived MPI implementations, so mixing them does not work
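
The container overhead implied by the osu_bw table can be quantified as a relative change per message size. The sketch below copies the values from the table; the names `OSU_BW_GBPS`, `relative_change` and `container_vs_native` are hypothetical, and because the slide does not report variance, small differences should be read with care.

```python
# osu_bw results in Gbps, copied from the table above.
COLUMNS = ("EN m2 v2.2", "EC m2 v2.2", "EC m2 v2.3", "AN m2 v2.3", "AC m2 v2.3")
OSU_BW_GBPS = {
    "8":   (0.16, 0.15, 0.16, 0.16, 0.08),
    "64":  (1.30, 1.27, 1.29, 1.28, 1.25),
    "512": (8.27, 8.21, 8.14, 7.87, 7.65),
    "4K":  (37.41, 37.65, 37.42, 37.23, 36.54),
    "32K": (88.89, 89.25, 89.43, 83.50, 82.47),
    "2M":  (94.75, 94.59, 95.19, 94.25, 94.30),
    "16M": (94.95, 94.75, 95.50, 91.49, 89.99),
}


def relative_change(native: float, container: float) -> float:
    """Return (container - native) / native as a percentage."""
    return 100.0 * (container - native) / native


def container_vs_native(table, native_col: str, container_col: str) -> dict:
    """Map each message size to the container's bandwidth change in percent."""
    n = COLUMNS.index(native_col)
    c = COLUMNS.index(container_col)
    return {size: relative_change(row[n], row[c]) for size, row in table.items()}


if __name__ == "__main__":
    pairs = (("EN m2 v2.2", "EC m2 v2.2"), ("AN m2 v2.3", "AC m2 v2.3"))
    for native_col, container_col in pairs:
        print(f"{container_col} relative to {native_col}")
        changes = container_vs_native(OSU_BW_GBPS, native_col, container_col)
        for size, pct in changes.items():
            print(f"  {size:>4}: {pct:+.2f}%")
```

Negative values mean the container run delivered less bandwidth than the native run on the same platform.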

OSU Micro-Benchmarks – osu_latency (μs), 100000 iterations
Bytes  EN m2 v2.2  EC m2 v2.2  EC m2 v2.3  AN m2 v2.3  AC m2 v2.3
8            1.25        1.26        1.30        2.37        2.34
64           1.37        1.38        1.37        2.54        2.54
512          2.12        2.09        2.12        3.44        3.38
4K           3.44        3.34        3.63        5.16        5.30
32K          8.69        8.59        8.88       14.07       13.47
2M          28.46       28.39       28.54       39.62       38.71
16M        188.68      188.70      185.10      202.52      204.84
Abbreviations: Azure (A), Euler (E), MVAPICH2 (m2), Native (N), Container (C)
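
The latency table also allows a platform comparison. The sketch below divides the native Azure latency by the native Euler latency for each message size; the names are hypothetical, and since the two columns use different MVAPICH2 versions (v2.3 vs v2.2), the ratio mixes platform and library effects.

```python
# osu_latency results in microseconds, copied from the table above.
MESSAGE_SIZES = ("8", "64", "512", "4K", "32K", "2M", "16M")
EULER_NATIVE_US = (1.25, 1.37, 2.12, 3.44, 8.69, 28.46, 188.68)   # EN m2 v2.2
AZURE_NATIVE_US = (2.37, 2.54, 3.44, 5.16, 14.07, 39.62, 202.52)  # AN m2 v2.3


def latency_ratios(reference, other):
    """Return other/reference for each message size (above 1.0 means slower)."""
    if len(reference) != len(other):
        raise ValueError("both series must cover the same message sizes")
    return [o / r for r, o in zip(reference, other)]


if __name__ == "__main__":
    ratios = latency_ratios(EULER_NATIVE_US, AZURE_NATIVE_US)
    for size, ratio in zip(MESSAGE_SIZES, ratios):
        print(f"{size:>4} bytes: Azure/Euler latency ratio {ratio:.2f}")
```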

OSU Micro-Benchmarks – Dockerfile
[Dockerfile listing not captured in this transcript]

Machine Learning – TensorFlow – on Azure
(1 iteration – NO STATISTICS)
§ 2018-11-24: new N-Series Azure Virtual Machines (in preview)
  § Standard_ND40s_v2:
    § Intel Skylake: 40 vCPUs, 16.8 GB/vCPU
    § 8 x NVIDIA Tesla V100 NVLINK
Time to Solution (min)
GPUs  CUDA 9  CUDA 10  Singularity CUDA 10
1         87       63                   65
2        102       89                  59?
4         66       46                   45
8         28       19                   18
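
Speedup and parallel efficiency relative to the single-GPU run follow directly from the time-to-solution table. This is a sketch with hypothetical names; the timings come from a single iteration without statistics, and the 2-GPU Singularity value is marked with a question mark in the source, so the derived numbers are indicative only.

```python
GPU_COUNTS = (1, 2, 4, 8)

# Time to solution in minutes, copied from the table above.
TIME_TO_SOLUTION_MIN = {
    "CUDA 9": (87, 102, 66, 28),
    "CUDA 10": (63, 89, 46, 19),
    # The 2-GPU value (59) carries a question mark in the source table.
    "Singularity CUDA 10": (65, 59, 45, 18),
}


def scaling(times, gpu_counts=GPU_COUNTS):
    """Return (gpus, speedup, efficiency) tuples relative to the first entry.

    Assumes the first entry of `times` is the 1-GPU run.
    """
    baseline = times[0]
    rows = []
    for gpus, minutes in zip(gpu_counts, times):
        speedup = baseline / minutes
        rows.append((gpus, speedup, speedup / gpus))
    return rows


if __name__ == "__main__":
    for label, times in TIME_TO_SOLUTION_MIN.items():
        print(label)
        for gpus, speedup, efficiency in scaling(times):
            print(f"  {gpus} GPU(s): speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```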

Machine Learning – TensorFlow – Dockerfile (1/2)
[Dockerfile listing not captured in this transcript]

Machine Learning – TensorFlow – Dockerfile (2/2)
[Dockerfile listing not captured in this transcript]

Conclusion
§ User experience on Azure - HPC in the cloud is catching up:
  § CycleCloud Slurm cluster with compute-intensive VMs + 100 Gbps InfiniBand in preview
  § Large machine learning VMs (up to 8 x Tesla V100 NVLINK) in preview
§ Singularity Containers:
  § When the host is similar to the container, we did not observe any overhead
  § HPC partially breaks the portability of containers
    § The container should be compatible with the host infrastructure and the host MPI implementation
  § Updating CUDA drivers (9 to 10) might improve the time to solution
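
The requirement that the container match the host MPI implementation can be turned into a coarse pre-flight check. The sketch below only groups implementations by ABI family; the function names and the family labels are hypothetical, the list of MPICH ABI members is limited to well-known ones, and belonging to the same family is a necessary condition rather than a guarantee, since versions and interconnect drivers also matter.

```python
# Implementations taking part in the MPICH ABI Compatibility Initiative
# (partial list; version requirements are not modelled here).
MPICH_ABI_FAMILY = {"mpich", "mvapich2", "intel mpi", "cray mpt"}
OPEN_MPI_FAMILY = {"open mpi", "openmpi"}


def abi_family(name: str) -> str:
    """Classify an MPI implementation name into a coarse ABI family."""
    key = name.strip().lower()
    if key in MPICH_ABI_FAMILY:
        return "mpich-abi"
    if key in OPEN_MPI_FAMILY:
        return "open-mpi"
    return "unknown"


def same_abi_family(host_mpi: str, container_mpi: str) -> bool:
    """True only if both names are known and fall into the same family."""
    host = abi_family(host_mpi)
    container = abi_family(container_mpi)
    return host != "unknown" and host == container


if __name__ == "__main__":
    for host, container in (("MPICH", "MVAPICH2"), ("OpenMPI", "MVAPICH2")):
        verdict = "same family" if same_abi_family(host, container) else "mismatch"
        print(f"host {host} / container {container}: {verdict}")
```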

Contact
ETH Zürich
Andrei Plamadă
Scientific IT Services
Weinbergstrasse 11
8092 Zürich

Acknowledgements
§ SIS colleagues
  § Thomas Wüst
  § Urban Borstnik
  § Samuel Fux
§ EnhanceR colleagues
  § Alexander Kashev (UniBe)
§ Microsoft / Azure
  § Lukasz Miroslaw
  § Andy Howard

EnhanceR Survey - Infrastructure Providers for Container Use
https://forms.gle/JBW78qDPWabd4GDR8

AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 

Recently uploaded (20)

Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 

Benchmarking MPI Applications in Singularity Containers on Traditional HPC and Cloud Infrastructures

  • 1. ||ID | SIS 2019 hpc-ch Forum – Cloud and Containers Andrei Plamadă, Jarunan Panyasantisuk ETH Zürich – Scientific IT Services 16.05.2019 1 Benchmarking MPI Applications in Singularity Containers on Traditional HPC and Cloud Infrastructures Andrei Plamadă
  • 2. Outline
       § Motivation
       § User experience:
         § Traditional HPC vs HPC in the Public Cloud
         § Singularity v2.6
       § Benchmarking MPI Applications
         § OSU Micro-Benchmarks
         § Machine Learning: TensorFlow
  • 3. Motivation – Public Cloud is growing rapidly
       § 2018-2022: 20.2% CAGR for IaaS (see Forbes – Gartner)
       Chart: Worldwide Public Cloud SaaS and IaaS Revenue Forecast (Billions of U.S. Dollars)
         Year    2018    2019    2020    2021    2022
         SaaS    80.0    94.8   110.5   126.7   143.7
         IaaS    30.5    38.9    49.1    61.9    76.7
  • 4. Motivation – Public Cloud is growing rapidly
       § 2018-2022: 20.2% CAGR for IaaS (see Forbes – Gartner)
       § Expectations
         § More competitive prices
         § More regions
         § More heterogeneous
       Chart: Worldwide Public Cloud SaaS and IaaS Revenue Forecast (Billions of U.S. Dollars), values as on slide 3
  • 5. Motivation – Public Cloud is growing rapidly
       § 2018-2022: 20.2% CAGR for IaaS (see Forbes – Gartner)
       § Expectations
         § More competitive prices
         § More regions
         § More heterogeneous
       § Available in Switzerland
         § 2019-03-12 Google Cloud Platform in Zurich
       § Announced in Switzerland
         § 2018-03-14 Azure Switzerland North and West
       Chart: Worldwide Public Cloud SaaS and IaaS Revenue Forecast (Billions of U.S. Dollars), values as on slide 3
  • 6. Motivation – HPC is in the Cloud as per Press Releases
       § Amazon EC2
         § 2018-11-26 c5n Instances
           § Intel Xeon Platinum ~3.0 GHz, 72 vCPUs, 2.6 GB/vCPU, 100 Gbps
       § Azure
         § 2017-10-23 Cray in Azure
           § Cray XC-series, Cray CS-series
         § 2018-11-14 New H-series in preview*
           § AMD EPYC 7551 ~3.0 GHz: 60 vCPUs, 4.0 GB/vCPU, 100 Gbps EDR InfiniBand (2019-05-14 available)
           § Intel Xeon Platinum 8168 ~3.4 GHz: 44 vCPUs, 8.0 GB/vCPU, 100 Gbps EDR InfiniBand
       § Google Cloud Platform
         § 2019-04-02 Compute-Optimized VMs (C2)
           § 2nd Gen Intel Xeon Scalable Processors ~3.8 GHz, 60 vCPUs, 4.0 GB/vCPU
  • 7. Motivation – Singularity as the container solution for HPC
       § Containers improve portability and can address the reproducibility issue in research (EnhanceR Survey - Science IT Consultants)
       § EnhanceR Survey - Infrastructure Providers for Container Use
       § Singularity:
         § Developed initially at LBL - Berkeley Lab - for the HPC use case (multi-tenancy)
         § Open source with the standard BSD 3-clause license: https://github.com/sylabs/singularity
         § Under active development, with 12 contributors with more than 100 commits
         § Also available with commercial support: Singularity Pro
         § Used worldwide and recommended by vendors, e.g. NVIDIA, Azure Batch
         § Big worldwide community (Google Groups, Slack)
         § Swiss community - EnhanceR
  • 8. Motivation – Singularity as the container solution for HPC
       § Containers improve portability and can address the reproducibility issue in research (EnhanceR Survey - Science IT Consultants)
       § EnhanceR Survey - Infrastructure Providers for Container Use
       § Main idea
         Diagram 1: Host OS+Drivers+Middleware (OSDM) | MPI: mpirun, MPI Library | SSH Server | App: Shared MPI Library
         Diagram 2: Host OS+Drivers+Middleware (OSDM) | MPI: mpirun | SSH Server | Container: OSDM, MPI, App, Shared MPI Library
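The split shown on slide 8 (mpirun on the host, the application and its MPI library inside the container) can be sketched as a small command builder. This is an illustrative sketch only: the image name `osu.simg` and the binary path `/opt/osu/osu_bw` are hypothetical, and the helper `mpi_singularity_command` is not part of the talk. It only assembles the command line and does not run it.

```python
import shlex


def mpi_singularity_command(ranks, image, program, program_args=()):
    """Build a host-side mpirun command that starts each rank inside a container.

    The host provides mpirun; `singularity exec` starts the program that
    lives inside the image. Image and program names are placeholders.
    """
    if ranks < 1:
        raise ValueError("ranks must be at least 1")
    return [
        "mpirun", "-np", str(ranks),
        "singularity", "exec", image,
        program, *program_args,
    ]


# Hypothetical image and benchmark path, for illustration only.
cmd = mpi_singularity_command(2, "osu.simg", "/opt/osu/osu_bw")
print(shlex.join(cmd))
```

For this to work across nodes, the MPI library inside the image has to be compatible with the host's launcher, which is the constraint the later slides return to.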
  • 9. Traditional HPC vs HPC in the Public Cloud
       § Traditional HPC (ETH – SIS – HPC)
         § Euler IV:
           § 2x18 core Intel Xeon Gold 6150 (2.7-3.7 GHz)
           § All cores available
           § HT available
           § 7.4 GB/core Memory
           § 100 Gbps InfiniBand
       § Public Cloud - Azure
         § In preview HC-Series – Standard_HC44rs
           § 2x24 core Intel Xeon Plat 8168 (2.7-3.7 GHz)?
           § 2x2 cores used by the hypervisor?
           § HT disabled?
           § 8.0 GB/core Memory
           § 100 Gbps InfiniBand
  • 10. User Experience – Traditional HPC vs HPC in the Public Cloud
       § Traditional HPC (ETH – SIS – HPC)
         § Ready to be used (LSF)
         § No maintenance / set-up
         § Login and Compute Nodes
         § Moderate flexibility regarding the software stack
         § Queue
         § It generally works as expected
       § Public Cloud - Azure
         § Needs to be set up (Slurm Cluster) via CycleCloud
         § As admin fully responsible
         § Master and Execute Nodes
         § High flexibility (as the admin), e.g. OpenMPI, MPICH, MVAPICH2, Intel MPI
         § Queue (as admin high availability)
         § Auto-scaling
         § https://github.com/Azure/cyclecloud-slurm/issues
  • 11. User Experience on CentOS 7 – Singularity v2.6
       § Create: Docker, root access, on your PC
       § Run: Singularity, on your PC or HPC infrastructure
       § Multi-node: MPICH ABI Compatibility initiative
  • 12. OSU Micro-Benchmarks – osu_bw (Gbps), 1000 iterations
       Abbreviations: Azure (A), Euler (E), MVAPICH2 (m2), Native (N), Container (C)

       Bytes   EN m2 v2.2   EC m2 v2.2   EC m2 v2.3   AN m2 v2.3   AC m2 v2.3
       8             0.16         0.15         0.16         0.16         0.08
       64            1.30         1.27         1.29         1.28         1.25
       512           8.27         8.21         8.14         7.87         7.65
       4K           37.41        37.65        37.42        37.23        36.54
       32K          88.89        89.25        89.43        83.50        82.47
       2M           94.75        94.59        95.19        94.25        94.30
       16M          94.95        94.75        95.50        91.49        89.99

       § Naïve EC/AC with MPICH v3.3 works, but only up to 10/4 Gbps (no InfiniBand)
       § Host: AC MPICH v3.3, Container: m2 v2.3; results as for AC m2 v2.3 - up to 100 Gbps
       § OpenMPI is not compatible with MPICH-derived MPI implementations, so that combination does not work
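One way to read the osu_bw numbers on slide 12 is as a relative change of the container run against the native run on the same system. The sketch below copies the slide's values for the matching native/container pairs; the helper name `relative_change_percent` is an assumption made for illustration, and a single benchmark run per cell does not support strong conclusions.

```python
# osu_bw results in Gbps, copied from slide 12 (one column per configuration).
SIZES = ["8", "64", "512", "4K", "32K", "2M", "16M"]
BANDWIDTH_GBPS = {
    "EN m2 v2.2": [0.16, 1.30, 8.27, 37.41, 88.89, 94.75, 94.95],
    "EC m2 v2.2": [0.15, 1.27, 8.21, 37.65, 89.25, 94.59, 94.75],
    "AN m2 v2.3": [0.16, 1.28, 7.87, 37.23, 83.50, 94.25, 91.49],
    "AC m2 v2.3": [0.08, 1.25, 7.65, 36.54, 82.47, 94.30, 89.99],
}


def relative_change_percent(native, container):
    """Per-message-size change of container vs native bandwidth, in percent."""
    return [100.0 * (c - n) / n for n, c in zip(native, container)]


euler = relative_change_percent(BANDWIDTH_GBPS["EN m2 v2.2"],
                                BANDWIDTH_GBPS["EC m2 v2.2"])
azure = relative_change_percent(BANDWIDTH_GBPS["AN m2 v2.3"],
                                BANDWIDTH_GBPS["AC m2 v2.3"])

for size, e, a in zip(SIZES, euler, azure):
    print(f"{size:>4}  Euler {e:+7.2f}%   Azure {a:+7.2f}%")
```

Negative values mean the container run reported less bandwidth than the native run for that message size.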
  • 13. OSU Micro-Benchmarks – osu_latency (μs), 100000 iterations
       Abbreviations: Azure (A), Euler (E), MVAPICH2 (m2), Native (N), Container (C)

       Bytes   EN m2 v2.2   EC m2 v2.2   EC m2 v2.3   AN m2 v2.3   AC m2 v2.3
       8             1.25         1.26         1.30         2.37         2.34
       64            1.37         1.38         1.37         2.54         2.54
       512           2.12         2.09         2.12         3.44         3.38
       4K            3.44         3.34         3.63         5.16         5.30
       32K           8.69         8.59         8.88        14.07        13.47
       2M           28.46        28.39        28.54        39.62        38.71
       16M         188.68       188.70       185.10       202.52       204.84
  • 14. OSU Micro-Benchmarks – Dockerfile
  • 15. Machine Learning – TensorFlow – on Azure (1 iteration – NO STATISTICS)
       § 2018-11-24: new N-Series Azure Virtual Machines (in preview)
         § Standard_ND40s_v2:
           § Intel Skylake: 40 vCPUs, 16.8 GB/vCPU
           § 8 x NVIDIA Tesla V100 NVLINK

       Time to Solution (min)
       No of GPUs   CUDA 9   CUDA 10   Singularity CUDA 10
       1                87        63                    65
       2               102        89                   59?
       4                66        46                    45
       8                28        19                    18
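The time-to-solution values on slide 15 can be turned into speedup and parallel efficiency relative to the 1-GPU run. This is a sketch under stated assumptions: the function names are illustrative, the values come from a single run without statistics, and the 2-GPU Singularity entry is marked uncertain ("59?") on the slide.

```python
# Time to solution in minutes, copied from slide 15 (single run, no statistics).
TIME_MIN = {
    "CUDA 9": {1: 87, 2: 102, 4: 66, 8: 28},
    "CUDA 10": {1: 63, 2: 89, 4: 46, 8: 19},
    # The 2-GPU value is marked "59?" on the slide, so treat it as uncertain.
    "Singularity CUDA 10": {1: 65, 2: 59, 4: 45, 8: 18},
}


def speedup(times, gpus):
    """Speedup of the `gpus`-GPU run relative to the 1-GPU run."""
    return times[1] / times[gpus]


def efficiency(times, gpus):
    """Parallel efficiency: speedup divided by the number of GPUs."""
    return speedup(times, gpus) / gpus


for setup, times in TIME_MIN.items():
    for gpus in sorted(times):
        print(f"{setup:>20}  {gpus} GPU(s): "
              f"speedup {speedup(times, gpus):.2f}, "
              f"efficiency {efficiency(times, gpus):.2f}")
```

A speedup below 1, as for 2 GPUs with CUDA 9 and CUDA 10, means the multi-GPU run was slower than the 1-GPU run in this measurement.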
  • 16. Machine Learning – TensorFlow – Dockerfile (1/2)
  • 17. Machine Learning – TensorFlow – Dockerfile (2/2)
  • 18. Conclusion
       § User experience on Azure - HPC in the cloud is catching up:
         § CycleCloud Slurm Cluster with compute-intensive VMs + 100 Gbps InfiniBand in preview
         § Big machine learning VMs (up to 8 x Tesla V100 NVLINK) in preview
       § Singularity Containers:
         § When the host is similar to the container, we did not observe any overhead
         § HPC partially breaks the portability of containers
           § The container should be compatible with the host infrastructure and the host MPI implementation
         § Updating CUDA drivers (9 to 10) might improve the time to solution
  • 19. Contact
       ETH Zürich, Andrei Plamadă, Scientific IT Services, Weinbergstrasse 11, 8092 Zürich
       Acknowledgements
       § SIS colleagues: Thomas Wüst, Urban Borstnik, Samuel Fux
       § EnhanceR colleagues: Alexander Kashev (UniBe)
       § Microsoft / Azure: Lukasz Miroslaw, Andy Howard
       EnhanceR Survey - Infrastructure Providers for Container Use: https://forms.gle/JBW78qDPWabd4GDR8