© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Adam Boeglin, HPC Solutions Architect
Monday, October 31, 2016
Launch a thousand core HPC cluster in minutes with AWS CfnCluster
Webinar Highlights
• What is CfnCluster and when to use it
• Architecture guidance to fit your security models
• How to install and configure CfnCluster
• Demo: Review of CfnCluster and managing compute at scale
Introduction to CfnCluster
• AWS CloudFormation + Cluster = CfnCluster
• Simple to install, easy to manage
• Everything you need to get a cluster up and running in minutes
• Head node with scheduler
• Shared NFS Storage
• /home
• /shared
• OpenMPI
• Compute nodes that grow and shrink on demand
Workloads Well Suited for CfnCluster
• Computational Fluid Dynamics
• Semiconductor Design
• Weather Modeling
• Genomics and Molecular Simulation
• Seismic and reservoir simulations
• 3D rendering and visualizations
• … anything that uses a traditional HPC scheduler
Cluster HPC and Grid HPC
• Cluster HPC: Tightly coupled, latency-sensitive applications. Use larger EC2 compute instances, placement groups, and Enhanced Networking.
• Grid HPC: Loosely coupled, pleasingly parallel. Requires very little node-to-node interaction.
• Grids of Clusters: Use a grid strategy on the cloud to run a group of parallel, individually clustered HPC jobs.
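As a rough illustration of the loosely coupled (grid) pattern, the sketch below runs independent tasks that never exchange data. The `simulate` function and the case IDs are hypothetical stand-ins for a real workload; on a cluster, each case would typically be its own scheduler job rather than a local thread.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(case_id: int) -> int:
    """Hypothetical stand-in for one independent job (one case of a parameter sweep)."""
    return sum(i * i for i in range(case_id * 1000))


def run_grid(case_ids: list) -> dict:
    """Run every case independently and collect the results by case ID."""
    # No task needs data from another task, so the cases can run
    # in any order and on any worker.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return dict(zip(case_ids, pool.map(simulate, case_ids)))


if __name__ == "__main__":
    print(run_grid([1, 2, 3, 4]))
```

A tightly coupled (cluster) job differs in that its processes communicate during the run, which is why it benefits from placement groups and Enhanced Networking.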
Computational Fluid Dynamics
ANSYS Fluent
• AWS c4.8xlarge
• 140M cells
• F1 car CFD benchmark
http://www.ansys-blog.com/simulation-on-the-cloud/
https://aws.amazon.com/hpc/cfncluster/
Configuration Options
• Operating System
• Amazon Linux
• CentOS 6
• CentOS 7
• Ubuntu 14.04
• Scheduler
• Sun Grid Engine (SGE)
• OpenLava
• Torque
• SLURM
• Storage Size & IOPS
• EBS & Instance Store Encryption
• Scaling Speed & Limits
• Provisioning Scripts
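A minimal sketch of how the operating system and scheduler choices might be turned into config values. The value strings (`alinux`, `centos6`, `sge`, and so on) and the `base_os` and `scheduler` key names are recalled from the CfnCluster configuration docs and should be treated as assumptions to verify; `cluster_settings` is a hypothetical helper, not part of the CfnCluster CLI.

```python
# Assumed mapping from the options listed on this slide to config values.
BASE_OS = {
    "Amazon Linux": "alinux",
    "CentOS 6": "centos6",
    "CentOS 7": "centos7",
    "Ubuntu 14.04": "ubuntu1404",
}
SCHEDULER = {
    "Sun Grid Engine (SGE)": "sge",
    "OpenLava": "openlava",
    "Torque": "torque",
    "SLURM": "slurm",
}


def cluster_settings(os_name: str, scheduler_name: str) -> dict:
    """Return base_os/scheduler settings, rejecting choices not on the slide."""
    if os_name not in BASE_OS:
        raise ValueError(f"unsupported operating system: {os_name}")
    if scheduler_name not in SCHEDULER:
        raise ValueError(f"unsupported scheduler: {scheduler_name}")
    return {"base_os": BASE_OS[os_name], "scheduler": SCHEDULER[scheduler_name]}


if __name__ == "__main__":
    print(cluster_settings("CentOS 7", "SLURM"))
```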
Many AWS services to tie it all together
• CloudFormation manages the state of the cluster
• Amazon CloudWatch & Auto Scaling let the compute fleet grow and shrink on demand
• Amazon SQS & Amazon SNS allow compute nodes to signal the master when they’re online
• AWS Identity and Access Management (IAM) allows for fine-grained access control
• Amazon S3 for storage of CloudFormation templates
Standalone CfnCluster
[Architecture diagram. Labels: Internet Gateway (IGW); region-1a; Master Server; Auto Scaling Compute Fleet; CloudFormation; Amazon S3; DynamoDB; Amazon SQS; CloudWatch.]
Isolated CfnCluster
[Architecture diagram. Labels: Internet Gateway (IGW); Public Subnet; VPC NAT gateway; Bastion Server; Private Subnet; Master Server; Auto Scaling Compute Fleet; CloudFormation; Amazon S3; DynamoDB; Amazon SQS; CloudWatch.]
Private Subnet Route Table
• VPC Traffic -> Local
• 0.0.0.0/0 -> NAT Gateway
Public Subnet Route Table
• VPC Traffic -> Local
• 0.0.0.0/0 -> Internet Gateway
Isolated CfnCluster w/ VPN
[Architecture diagram. Labels: Internet Gateway (IGW); Public Subnet; VPC NAT gateway; Private Subnet; Master Server; Auto Scaling Compute Fleet; Corporate Data Center; Engineer; VPN Connection; CloudFormation; Amazon S3; DynamoDB; Amazon SQS; CloudWatch.]
Private Subnet Route Table
• VPC Traffic -> Local
• Corp IP Range -> VPN
• 0.0.0.0/0 -> NAT Gateway
Public Subnet Route Table
• VPC Traffic -> Local
• Corp IP Range -> VPN
• 0.0.0.0/0 -> Internet Gateway
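To make the private subnet route table concrete, here is a small sketch of how a destination address is resolved: the most specific matching route wins. The CIDR ranges and the target names (`local`, `vpn`, `nat-gateway`) are hypothetical values picked for illustration; substitute your own VPC and corporate ranges.

```python
import ipaddress

# Hypothetical CIDR ranges, chosen for illustration only.
VPC_CIDR = "10.0.0.0/16"
CORP_CIDR = "192.168.0.0/16"

PRIVATE_SUBNET_ROUTES = [
    (VPC_CIDR, "local"),           # VPC Traffic -> Local
    (CORP_CIDR, "vpn"),            # Corp IP Range -> VPN
    ("0.0.0.0/0", "nat-gateway"),  # everything else -> NAT Gateway
]


def next_hop(routes, destination: str) -> str:
    """Return the target of the most specific (longest-prefix) matching route."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        raise LookupError(f"no route to {destination}")
    return max(matches, key=lambda match: match[0].prefixlen)[1]


if __name__ == "__main__":
    for dest in ("10.0.0.44", "192.168.5.10", "52.94.0.1"):
        print(dest, "->", next_hop(PRIVATE_SUBNET_ROUTES, dest))
```

Because the default route matches every address, it only applies when neither the VPC range nor the corporate range does.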
Private CfnCluster w/ VPN & Proxy
[Architecture diagram. Labels: Private Subnet; Master Server; Auto Scaling Compute Fleet; Corporate Data Center; Proxy Server; VPN Connection; Internet Connection; CloudFormation; Amazon S3; DynamoDB; Amazon SQS; CloudWatch.]
Private Subnet Route Table
• VPC Traffic -> Local
• Corp IP Range -> VPN
• 0.0.0.0/0 -> VPN
Creating an IAM User
• Create an IAM user with administrative privileges
• Fine-grained access controls can be applied later
• Generate an access key and secret key, and keep them safe
Create an SSH Key
• Generate or import the key you’ll use for user login
Installing the CfnCluster CLI
• On your desktop or a bastion server
$ sudo pip install cfncluster
Creating the Base Configuration
• First, create the base config required to start a cluster.
$ cfncluster configure
Edit the configuration file to meet your needs
• Reference the configuration docs
• http://cfncluster.readthedocs.io/en/latest/configuration.html
$ vim ~/.cfncluster/config
Launch the Cluster
$ cfncluster create mycluster
• Cluster creation usually takes ~15 minutes
• Completely managed by CloudFormation
Submit your first job
[ec2-user@ip-10-0-0-17 ~]$ cat hw.qsub
#!/bin/bash
#
#$ -cwd
#$ -j y
#$ -pe mpi 2
#$ -S /bin/bash
#
module load openmpi-x86_64
mpirun -np 2 hostname
[ec2-user@ip-10-0-0-17 ~]$ qsub hw.qsub
Your job 1 ("hw.qsub") has been submitted
[ec2-user@ip-10-0-0-17 ~]$ qstat
job-ID prior name user state submit/start at queue slots ja-task-ID
------------------------------------------------------------------------------------------------
1 0.55500 hw.qsub ec2-user r 02/01/2015 05:57:25 all.q@ip-10-0-0-44.ap-southeas 2
[ec2-user@ip-10-0-0-17 ~]$ ls -l
total 8
-rw-rw-r-- 1 ec2-user ec2-user 110 Feb 1 05:57 hw.qsub
-rw-r--r-- 1 ec2-user ec2-user 26 Feb 1 05:57 hw.qsub.o1
[ec2-user@ip-10-0-0-17 ~]$ cat hw.qsub.o1
ip-10-0-0-44
ip-10-0-0-45
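For repeated submissions, a script like hw.qsub can be generated rather than written by hand. The sketch below renders the same SGE directives for a given command and slot count; `sge_job_script` is a hypothetical helper, and the `mpi` parallel environment and `openmpi-x86_64` module names are taken from the session above.

```python
def sge_job_script(command: str, slots: int, module: str = "openmpi-x86_64") -> str:
    """Render an SGE submission script shaped like the hw.qsub example."""
    lines = [
        "#!/bin/bash",
        "#",
        "#$ -cwd",               # run the job from the submission directory
        "#$ -j y",               # merge stderr into the stdout file
        f"#$ -pe mpi {slots}",   # request slots in the 'mpi' parallel environment
        "#$ -S /bin/bash",       # run the script with bash
        "#",
        f"module load {module}",
        f"mpirun -np {slots} {command}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(sge_job_script("hostname", 2), end="")
```

Writing the returned text to a file and passing that file to qsub should behave like the hand-written script, assuming the same scheduler and module setup.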
EBS Snapshots for Software & Storage Management
• Install your applications and store any working data to /shared
• Create a snapshot of that volume
• Re-use that snapshot every time you launch your cluster
ebs_snapshot_id = snap-xxxxx
[Diagram labels: Master Server; Root & Home Volume (/ and /home); NFS Shared Volume (/shared); Amazon EBS Snapshot (snap-xxxxx).]
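One way to wire the snapshot ID into an existing config file is sketched below with Python's configparser. The section names `[cluster default]` and `[ebs custom]` and the `ebs_settings` key are assumptions recalled from the CfnCluster configuration docs, so check them against your generated config; `with_snapshot` is a hypothetical helper, and `snap-xxxxx` is the placeholder from the slide, not a real ID.

```python
import configparser
import io


def with_snapshot(config_text: str, snapshot_id: str, ebs_name: str = "custom") -> str:
    """Return config text whose cluster section points at an EBS section using the snapshot."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Assumed layout: the cluster section names an EBS section via ebs_settings.
    parser["cluster default"]["ebs_settings"] = ebs_name
    section = f"ebs {ebs_name}"
    if not parser.has_section(section):
        parser.add_section(section)
    parser[section]["ebs_snapshot_id"] = snapshot_id
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    base = "[cluster default]\nkey_name = mykey\n"
    print(with_snapshot(base, "snap-xxxxx"))
```

Note that configparser drops comments when it rewrites a file, so editing the file by hand may be preferable if your config is annotated.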
Upgrading Hardware is Easy!
• Simple upgrade from Ivy Bridge to Haswell
1. Let all compute nodes stop
2. Edit ~/.cfncluster/config and change
compute_instance_type = c3.8xlarge
to
compute_instance_type = c4.8xlarge
3. Update the cluster
$ cfncluster update mycluster
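Step 2 can also be scripted. This sketch rewrites only the `compute_instance_type` line and leaves the rest of the file, including comments, untouched; `set_compute_instance_type` is a hypothetical helper, and the sample config text is made up for illustration.

```python
import re


def set_compute_instance_type(config_text: str, new_type: str) -> str:
    """Replace the compute_instance_type value, keeping every other line as-is."""
    pattern = re.compile(r"^([ \t]*compute_instance_type[ \t]*=[ \t]*).*$", re.MULTILINE)
    # A function replacement avoids treating new_type as a regex template.
    updated, count = pattern.subn(lambda m: m.group(1) + new_type, config_text)
    if count == 0:
        raise ValueError("compute_instance_type not found in config")
    return updated


if __name__ == "__main__":
    sample = (
        "[cluster default]\n"
        "# compute fleet\n"
        "compute_instance_type = c3.8xlarge\n"
        "master_instance_type = c3.large\n"
    )
    print(set_compute_instance_type(sample, "c4.8xlarge"), end="")
```

After saving the edited text back to ~/.cfncluster/config, the update command in step 3 applies the change.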
Demo: Launching a Cluster
Thank you!
