High performance computing in the cloud is enabling large-scale compute- and graphics-intensive workloads across industries, from aerospace, automotive, and manufacturing to life sciences, financial services, and energy. AWS provides application developers and end users with unprecedented computational power for massively parallel applications in areas such as large-scale fluid and materials simulations, 3D content rendering, financial computing, and deep learning. This session provides an overview of HPC capabilities on AWS, describes the newest generations of accelerated computing instances (including P2), and highlights customer and partner use cases across industries.
Attendees learn about best practices for running HPC workflows in the cloud, including graphical pre- and post-processing, workflow automation, and optimization. Attendees also learn about new and emerging HPC use cases: in particular, deep learning training and inference, large-scale simulations, and high performance data analytics.
2. What to Expect from the Session
Overview of use cases for HPC in science, aerospace, automotive and
manufacturing, life sciences, financial services, and energy.
Overview of HPC capabilities on AWS, including:
• The newest compute-optimized instances
• P2 GPU instances
• F1 FPGA instances
Best practices for running traditional and new/emerging HPC workflows in
the cloud, including graphical pre- and post-processing and workflow
automation
4. Cloud unlocks HPC for a broad range of use cases:
• High-energy physics simulations
• Weather and climate modeling and prediction
• Analysis of fluids, structures, and materials
• Thermal and electromagnetic simulations
• Genomics, proteomics, and molecular dynamics
• Seismic and reservoir simulations
• 3D rendering and visualizations
• Deep learning training and inference
AWS for High Performance Computing…
9. Big Data meets Big Compute
"Fugro Roames has enabled Ergon Energy to reduce the cost of vegetation management from AU$100 million to AU$60 million per year."
- Josh Passenger, Technical Architect, Fugro Roames
• Aircraft equipped with cameras and laser sensors
• Repeated overflights of power networks
• Captured data is used to render detailed 3D models of the power lines and their environment
• Analytics and simulations are run to generate actionable reports for directing post-disaster repair and prioritizing ongoing maintenance
10. HPC for Engineering Simulations
HGST applications for engineering:
• Molecular dynamics, CAD, CFD, EDA
• Collaboration tools for engineering
• Big data for manufacturing yield analysis
Running drive-head simulations at scale:
• Millions of parallel parameter sweeps, running months of simulations in just hours
• Over 85,000 Intel cores running at peak, using Spot Instances
11. HPC for Aerospace
• 16M cell, polyhedral, external aero case
• Running on c4.8xlarge instances
• Demonstrates excellent scalability for typical CFD models
12. Mapping HPC Use Cases
Data Light: minimal requirements for high performance storage
Data Heavy: benefits from access to high performance storage
Workloads also divide into Clustered (Tightly Coupled) and Distributed/Grid (Loosely Coupled) categories.
Example use cases: fluid dynamics, weather forecasting, materials simulations, crash simulations, risk simulations, molecular modeling, contextual search, logistics simulations, animation and VFX, semiconductor verification, image processing/GIS, genomics, seismic processing, metagenomics, astrophysics, and deep learning.
13. Cluster HPC and Grid HPC on the Cloud
Cluster HPC: tightly coupled, latency-sensitive applications. Use larger EC2 compute instances, placement groups, and enhanced networking.
Grid HPC: loosely coupled, pleasingly parallel applications. Use a variety of EC2 instances, multiple AZs, Spot Instances, Auto Scaling, and Amazon SQS.
Grids of Clusters: use a grid strategy on the cloud to run a group of parallel, individually clustered HPC jobs.
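The loosely coupled grid pattern above can be sketched locally. The following is an illustrative stand-in only: a stdlib thread pool plays the role that Auto Scaled EC2 workers fed from Amazon SQS would play in a real deployment, and all names are hypothetical.

```python
# Minimal local sketch of the loosely coupled ("pleasingly parallel") grid
# pattern: independent jobs are mapped across a pool of workers with no
# inter-job communication. In a real deployment the workers would be Auto
# Scaled EC2 instances (often Spot) pulling work from Amazon SQS; here a
# standard library thread pool stands in for both.
from concurrent.futures import ThreadPoolExecutor

def run_job(params):
    """Stand-in for one independent simulation run (one parameter set)."""
    x = params["x"]
    return x * x  # pretend result

def run_grid(param_sets, workers=4):
    # Because jobs are independent, they can be spread across any number
    # of workers and scaled out simply by adding more of them.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_job, param_sets))

print(run_grid([{"x": i} for i in range(8)]))  # [0, 1, 4, 9, 16, 25, 36, 49]
```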
14. What Does This Mean for Simulations?
Expand the simulation domain
Run larger numbers of parallel, clustered HPC jobs
18. ANSYS Mechanical FEA Performance
ENGINE BLOCK (V17cg-3, PCG solver): static structural analysis of an engine block without the internal components
19. Performance for Fluid Dynamics on AWS
ANSYS Fluent
• AWS c4.8xlarge
• 140M cells
• F1 car CFD benchmark
http://www.ansys-blog.com/simulation-on-the-cloud/
20. Performance Considerations for HPC on AWS
Test using larger, real-world examples
• Use large cases for testing: do not benchmark scalability using only small examples
Domain decomposition
• Choose the number of cells per core for either per-core efficiency or for faster results
Instance types
• C4 or M4 are the best choices today
Network
• Use a placement group
• Enable enhanced networking
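One reason small cases mislead scalability benchmarks can be seen with Amdahl's law: small cases carry a proportionally larger serial fraction (communication, I/O), which caps speedup. The numbers below are made up for illustration.

```python
# Amdahl's law: speedup on n cores given the fraction of work that
# remains serial. Illustrates why a small benchmark case (large serial
# fraction) understates how well the same code scales on a large case.
def amdahl_speedup(serial_fraction, n_cores):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

# A small case with 10% serial work vs. a large case with 1% serial work,
# both on 256 cores:
print(round(amdahl_speedup(0.10, 256), 1))  # 9.7  -> looks like poor scaling
print(round(amdahl_speedup(0.01, 256), 1))  # 72.1 -> the code scales fine
```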
21. Domain Decomposition is Important
Choose a cell-to-core ratio to optimize core efficiency, to optimize license costs, or to achieve faster results: more cells per core gives higher per-core efficiency and lower license cost; fewer cells per core gives faster results.
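The cell-to-core tradeoff reduces to simple arithmetic. The helper below is a hypothetical illustration, not part of any solver's API; the 16M-cell figure matches the aero case mentioned earlier in the session, and the target ratios are example values only.

```python
# Hypothetical helper illustrating the domain decomposition tradeoff:
# a target cells-per-core ratio determines the core (and license) count.
# Larger ratios favor per-core efficiency and lower license cost;
# smaller ratios favor faster time-to-results.
import math

def cores_for_mesh(total_cells, cells_per_core):
    """Number of cores needed to hit a target cells-per-core ratio."""
    return math.ceil(total_cells / cells_per_core)

# Example: a 16M-cell external aero case.
# A conservative 100k cells/core uses fewer cores (cheaper licensing);
# an aggressive 25k cells/core uses more cores for faster turnaround.
print(cores_for_mesh(16_000_000, 100_000))  # 160
print(cores_for_mesh(16_000_000, 25_000))   # 640
```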
22. Performance Considerations for HPC on AWS
OS version
• Use Amazon Linux, or another distribution with a version 3.10 or later Linux kernel
Processor states and affinity
• Use P-state control to reduce processor frequency variability
• Use CPU affinity to pin threads to CPU cores
MPI libraries
• Intel MPI is recommended
Hyper-Threading
• Test with Hyper-Threading on and off
• Usually off is best, but not always
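A common way to combine the affinity and Hyper-Threading advice is to pin one thread per physical core. The sketch below assumes the usual EC2 vCPU layout (first hyper-thread of each core numbered before the siblings); verify the layout on your instance (e.g., via /proc/cpuinfo) before relying on it, and treat the function name as illustrative.

```python
# Sketch of choosing CPU pinning targets to effectively disable
# Hyper-Threading in software: run one thread per physical core by
# pinning only to one hardware thread per core. Assumes the common EC2
# layout where vCPUs [0, n_cores) are the first hyper-thread of each
# physical core and [n_cores, 2*n_cores) are their siblings.

def physical_core_vcpus(n_vcpus, threads_per_core=2):
    """vCPU IDs to pin to, one per physical core (assumed layout)."""
    n_cores = n_vcpus // threads_per_core
    return list(range(n_cores))

# Example: a c4.8xlarge exposes 36 vCPUs (18 physical cores).
print(physical_core_vcpus(36))  # [0, 1, ..., 17]

# On Linux, a rank could then be pinned with os.sched_setaffinity(0, {cpu_id}).
```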
26. CPU-Based Instances for HPC
Intel CPUs
• Up to 2.9 GHz, Turbo enabled up to 3.6 GHz
• Intel® Advanced Vector Extensions (Intel® AVX2)
• Control over C-States, P-States, and Hyper-threading
• C4, M4 are the most common instance types for HPC:
• Up to 64 vCPUs (32 physical cores)
• R3 and X1 for higher memory applications
• Up to 128 vCPUs (64 physical cores), up to 2 TB RAM
• Proprietary network delivering up to 20 Gbps
27. GPU and FPGA Instances
P2: GPU instance
• Up to 16 NVIDIA GK210 GPUs (8 NVIDIA K80 cards) in a single instance, with peer-to-peer PCIe GPU interconnect
• Supporting a wide variety of use cases including deep learning, HPC
simulations, financial computing, and batch rendering
F1: FPGA instance
• Up to 8 Xilinx Virtex® UltraScale+™ VU9P FPGAs in a single
instance, with peer-to-peer PCIe and bidirectional ring interconnects
• Designed for hardware-accelerated applications including financial
computing, genomics, accelerated search, and image processing
28. P2 GPU Instances
• Up to 16 K80 GPUs in a single instance
• Including peer-to-peer PCIe GPU interconnect
• Supporting a wide variety of use cases including deep
learning, HPC simulations, and batch rendering
Instance Size | GPUs | GPU Peer-to-Peer | vCPUs | Memory (GiB) | Network Bandwidth*
p2.xlarge     | 1    | -                | 4     | 61           | 1.25 Gbps
p2.8xlarge    | 8    | Y                | 32    | 488          | 10 Gbps
p2.16xlarge   | 16   | Y                | 64    | 732          | 20 Gbps
*In a placement group
29. F1 FPGA Instances
• Up to 8 Xilinx Virtex UltraScale+ VU9P FPGAs in a single instance, with four high-speed DDR4 channels per FPGA
• Largest size includes high performance FPGA interconnects via PCIe
Gen3 (FPGA Direct), and bidirectional ring (FPGA Link)
• Designed for hardware-accelerated applications including financial
computing, genomics, accelerated search, and image processing
Instance Size | FPGAs | FPGA Link | FPGA Direct | vCPUs | Memory (GiB) | NVMe Instance Storage | Network Bandwidth*
f1.2xlarge    | 1     | -         | -           | 8     | 122          | 1 x 480 GB            | 5 Gbps
f1.16xlarge   | 8     | Y         | Y           | 64    | 976          | 4 x 960 GB            | 30 Gbps
*In a placement group
31. Genomic Big Data
Scale
• Population-scale genomics
• Precision medicine for all
• Liquid biopsy cancer screenings
Size
• Each person's genome is ~100 GB of raw sequencing data
• Computationally intensive analysis
• Multiple copies stored forever
DNA data doubles every 7 months; CPU speeds double every 2 years
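The doubling-rate claim compounds dramatically. This back-of-the-envelope calculation (illustrative only) shows how far sequencing data growth outruns CPU speed growth over five years, given the rates stated on the slide.

```python
# Compound growth comparison for the slide's claim: DNA data doubling
# every 7 months vs. CPU speeds doubling every 24 months. The widening
# ratio is the motivation for hardware acceleration (GPUs, FPGAs).

def growth_factor(months, doubling_months):
    return 2 ** (months / doubling_months)

years = 5
months = years * 12
dna = growth_factor(months, 7)    # data growth over 5 years
cpu = growth_factor(months, 24)   # CPU speed growth over 5 years
print(f"DNA data: {dna:.0f}x, CPU speed: {cpu:.1f}x, gap: {dna / cpu:.0f}x")
```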
32. FPGAs for Genomics HPC
Highly Efficient
• Algorithms implemented in hardware
• Gate-level circuit design
• No instruction set overhead
Massively Parallel
• Massively parallel circuits
• Multiple compute engines
• Rapid FPGA reconfigurability
Speeds analysis of whole human genomes from hours to minutes
Unprecedented low cost for compute and compressed storage
35. Traditional HPC Stack
• Shared file storage
• HPC cluster
• License managers and cluster head nodes with job schedulers
• 3D graphics remote desktop servers
• Remote graphics workstations
• Storage cache
• Remote sites
• Remote backup
36. Migrating HPC to AWS
• Shared file storage
• Cloud-based, scaling HPC cluster on EC2
• License managers and cluster head nodes with job schedulers
• 3D graphics virtual workstations
• AWS Direct Connect to on-premises IT resources
• Thin or zero clients (no local data)
• Storage cache, with Amazon S3 and Amazon Glacier as backing stores
38. Deploying HPC on AWS (Optimized)
Use different on-demand HPC clusters for different applications or end users
1. Users access resources via secure VPN tunnel
2. Cloud desktops are GPU-enabled for graphics performance
3. Hardened and monitored proxy server used for all access
4. Optional: AWS CodeCommit used for source code repo
5. Continuous Integration server used to manage builds
6. Amazon Simple Queue Service (SQS) used for queue-based job submission
7. Application-specific compute nodes automatically scaled based on demand
8. License server can be on-premises, or in cloud with results and logs pushed to S3
9. Coverage tracking system notified and updated as jobs complete
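The queue-based submission pattern in step 6 can be sketched locally. Python's thread-safe stdlib queue stands in for Amazon SQS here, and the job names and worker logic are illustrative, not part of any AWS API.

```python
# Local sketch of queue-based job submission: producers enqueue job
# descriptions, workers dequeue and process them. In the architecture
# above, Amazon SQS holds the queue and auto scaled EC2 compute nodes
# are the workers; the queue module simulates that here.
import queue
import threading

job_queue = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        job = job_queue.get()
        if job is None:                     # sentinel: no more jobs
            job_queue.task_done()
            break
        outcome = f"{job['name']}: done"    # stand-in for running a solver
        with results_lock:
            results.append(outcome)
        job_queue.task_done()

# Submit jobs, then drain them with a small worker pool.
for i in range(5):
    job_queue.put({"name": f"job-{i}"})

threads = [threading.Thread(target=worker) for _ in range(2)]
for _ in threads:
    job_queue.put(None)                     # one sentinel per worker
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```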
40. AWS Storage Options for HPC Workloads
Amazon S3
• Secure, durable, highly scalable object storage with fast access and low cost
• For long-term durable storage of data in a readily accessible get/put format
• Use as the primary durable and scalable storage for HPC data
Amazon Glacier
• Secure, durable, long-term, highly cost-effective object storage
• For long-term storage and archival of data that is infrequently accessed
• Use for long-term, lower-cost archival of HPC data
EC2+EBS
• Create a single-AZ shared file system using EC2 and EBS, with third-party or open source software (e.g., Intel Lustre)
• For near-line storage of files optimized for high I/O performance
• Use for high-IOPS, temporary working storage
Amazon EFS
• Highly available, multi-AZ, fully managed network-attached elastic file system
• For near-line, highly available storage of files in a traditional NFS format (NFSv4)
• Use for read-often, temporary working storage
41. Secure Graphics and Collaboration
Cloud can be used for pre- and post-processing as well as HPC
• Use GPUs in the cloud for remote rendering and remote desktops
Cloud is more secure for collaboration
• Encrypt the data in flight and at rest
• Manage your own keys and credentials
• Deliver pixels to your collaborators, not the actual data
42. Options for Software Licensing
1) Customer Managed Application Hosting
• Customer has an account with AWS and manages the virtual infrastructure
• Cloud used for batch jobs via cluster management software
• Customer can also log in remotely and collaborate using GPU instances
• Customer maintains traditional software vendor relationships
• Software vendor optionally offers license flexibility for scalable computing
2) Software Vendor Managed Application Hosting
• SaaS or hybrid model for managed engineering apps in the cloud
• Customer pays the software vendor for cloud-hosted services
• Customer does not need to manage virtual infrastructure
Either method of software delivery is supported on AWS; the right method depends on customer requirements for security and governance, ease of deployment, etc.
45. HPC Partner on AWS: Cycle Computing
Virtual screening at Novartis:
• 10 million compounds screened against a cancer target, in only 9 hours
• Approximately 87,000 compute cores at peak
Engineering simulations at HGST:
• Millions of parameter sweeps, running months of simulations in just hours
• Over 85,000 Intel cores running at peak, using Spot Instances
www.cyclecomputing.com
46. HPC Partner on AWS: Rescale
APN Advanced Partner
Rescale's cloud HPC platform:
• Offers native integration with more than 180 simulation and machine learning applications in a SaaS environment
• Automation of systems tools and services enables seamless deployment on AWS
• JL & Associates used Rescale on AWS to run multiphase CFD analysis modeling boiling oil (C12H26)
• The team achieved their goal of steady-state convergence, which required 23k iterations at ~20 sec/iteration
Customer results:
• Reduced analysis time from 5.3 days to 12 hours
• Instantly scaled up to 48 cores
www.rescale.com
49. Next Steps
Visit aws.amazon.com/hpc
Additional sessions:
• CMP314 – Bringing Deep Learning to the Cloud with Amazon EC2
• CMP317 – Deep Learning, 3D Content Rendering, and Massively
Parallel, Compute-Intensive Workloads in the Cloud
• CMP318 – Building HPC Clusters as Code
• CMP320 – Delivering Graphical Applications on AWS