SlideShare ist ein Scribd-Unternehmen logo
1 von 56
Deep Dive on Amazon EC2 Instances
Featuring Performance Optimization Best Practices
By
Liam Morrison, Solutions Architect
© 2016 Amazon Web Services, Inc. and its affiliates. All rights served. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon
Web Services, Inc.
What to Expect from the Session
 Understanding of the factors that goes into choosing an EC2
instance
 Defining system performance and how it is characterized for
different workloads
 How Amazon EC2 instances deliver performance while providing
flexibility and agility
 How to make the most of your EC2 instance experience through the
lens of several instance types
API
EC2
EC2
Amazon Elastic Compute Cloud is Big
Instances
Networking
Purchase options
Host Server
Hypervisor
Guest 1 Guest 2 Guest n
Amazon EC2 Instances
In the past
 First launched in August 2006
 M1 instance
 “One size fits all”
M1
Amazon EC2 Instances History
2006 2008 2010 2012 2014
2017
m1.small
m1.large
m1.xlarge
c1.medium
c1.xlarge
m2.xlarge
m2.4xlarge
m2.2xlarge
cc1.4xlarge
t1.micro
cg1.4xlarge
cc2.8xlarge
m1.medium
hi1.4xlarge
m3.xlarge
m3.2xlarge
hs1.8xlarge
cr1.8xlarge
c3.large
c3.xlarge
c3.2xlarge
c3.4xlarge
c3.8xlarge
g2.2xlarge
i2.xlarge
i2.2xlarge
i2.4xlarge
i2.4xlarge
m3.medium
m3.large
r3.large
r3.xlarge
r3.2xlarge
r3.4xlarge
r3.8xlarge
t2.micro
t2.small
t2.med
c4.large
c4.xlarge
c4.2xlarge
c4.4xlarge
c4.8xlarge
d2.xlarge
d2.2xlarge
d2.4xlarge
d2.8xlarge
g2.8xlarge
t2.large
m4.large
m4.xlarge
m4.2xlarge
m4.4xlarge
m4.10xlarge
x1.32xlarge
t2.nano
m4.16xlarge
p2.xlarge
p2.8xlarge
p2.16xlarge
t2.xlarge
t2.2xlarge
r4.large
r4.xlarge
r4.2xlarge
r4.4xlarge
r4.8xlarge
i3.large
i3.xlarge
i3.2xlarge
i3.4xlarge
r4.16xlarge
i3.8xlarge
i3.16xlarge
c5.large
c5.xlarge
c5.2xlarge
c5.4xlarge
c5.8xlarge
c5.16xlarge
f1.16xlarge
f1.2xlarge
Coming Soon!
Instance generation
c4.xlarge
Instance family Instance size
1 2 4 8 16 32
1
2
4
8
16
32
64
128
256
Memory(GB)
vCPU
g2.2xlarge
8 vCPU, 15 GB
1 x 60 SSD
NVIDIA GPU (1,536
CUDA cores, 4GB
Mem)
4 vCPU, 30.5 GB
i2.xlarge (High IO) - 1 x 800 SSD
d2.xlarge (Dense) - 3 x 2000
HDD
8 vCPU, 61 GB
i2.2xlarge (High IO) - 2x800 SSD
d2.2xlarge (Dense) - 6 x 2000
HDD
16 vCPU, 122 GB
i2.4xlarge (High IO) - 4x800 SSD
d2.4xlarge (Dense) - 12x2000 HDD
32 vCPU, 244 GB
i2.8xlarge (High IO) - 8x800 SSD
36 vCPU, 244 GB
d2.8xlarge (Dense) - 24x2000 HDD
m3.xlarge
4 vCPU, 15
GB
2 x 40 SSD
m3.2xlarge
8 vCPU, 30 GB
2 x 80 SSD
m3.large
2 vCPU, 7.5
GB
1 x 32 SSDm3.medium
1 vCPU, 3.75
GB,
1 x 4 SSD
t2.micro
1 vCPU,
1GB
EBS Only
t2.small
1 vCPU,
2GB
EBS Only
t2.medium
2 vCPU,
4GB
EBS Only
r3.large
2 vCPU, 15.25
GB
1 x 32 SSD
r3.xlarge
4 vCPU, 30.5 GB
1 x 80 SSD
r3.4xlarge
16 vCPU, 122 GB
1 x 320 SSD
r3.8xlarge
2 vCPU, 244 GB
2 x 320 SSD
2 vCPU, 3.75 GB
c4.large - EBS Only
c3.large - 2 x 16 SSD
4 vCPU, 7.5 GB
c4.xlarge - EBS Only
c3.xlarge - 2 x 40
SSD
8 vCPU, 15 GB
c4.2xlarge - EBS
Only
c3.2xlarge - 2 x 80
SSD
32 vCPU, 60 GB
c4.8xlarge - EBS Only
c3.8xlarge - 2 x 320
SSD
m4.large
2 vCPU, 8 GB
EBS Only
m4.xlarge
4 vCPU, 16
GB
EBS Only
m4.2xlarge
8 vCPU, 32
GB
EBS Only
m4.4xlarge
16 vCPU, 64
GB
EBS Only
m4.10xlarge
40 vCPU, 160GB
EBS Only
t2.large
2 vCPU, 8 GB
EBS Only
Storage Optimized
GPU Instances
General Purpose
Memory Optimized
Compute Optimized
New M4’s/T2 Large
t2.nano
1 vCPU, 512MB
EBS Only
g2.8xlarge
32vCPU, 60 GB
2 x 120 SSD
4 NVIDIA GPUs (1,536
CUDA cores, 4GB
Mem)
16 vCPU, 30 GB
c4.4xlarge - EBS Only
c3.4xlarge - 2 x 160
SSD
41 (latest generations) EC2 Instance Types
64
m4.16xlarge
64 vCPU, 256GB
EBS Only
P2.xlarge
4 vCPU, 61 GiB
NVIDIA K80 (2,496
CUDA cores, 12GiB
Mem)
r3.2xlarge
8 vCPU, 61 GB
1 x 160 SSD
Example: Web Application
 MediaWiki installed on Apache with 140 pages of content
 Load increased in intervals over time
Example: Web Application
 Memory stats
Example: Web Application
 Disk stats
Example: Web Application
 Network stats
Example: Web Application
 CPU stats
What’s a Virtual CPU? (vCPU)
 A vCPU is typically a hyper-threaded physical core*
 On Linux, “A” threads enumerated before “B” threads
 On Windows, threads are interleaved
 Divide vCPU count by 2 to get core count
 Cores by EC2 & RDS DB Instance type:
https://aws.amazon.com/ec2/virtualcores/
* The “t” family is special
Disable Hyper-Threading If You Need To
 Useful for FPU heavy applications
 Use ‘lscpu’ to validate layout
 Hot offline the “B” threads
for i in `seq 64 127`; do
echo 0 > /sys/devices/system/cpu/cpu${i}/online
done
 Set grub to only initialize the first half of
all threads
maxcpus=63
[ec2-user@ip-172-31-7-218 ~]$ lscpu
CPU(s): 128
On-line CPU(s) list: 0-127
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 4
NUMA node(s): 4
Model name: Intel(R) Xeon(R) CPU
Hypervisor vendor: Xen
Virtualization type: full
NUMA node0 CPU(s): 0-15,64-79
NUMA node1 CPU(s): 16-31,80-95
NUMA node2 CPU(s): 32-47,96-111
NUMA node3 CPU(s): 48-63,112-127
Instance Sizing
c4.8xlarge 2 - c4.4xlarge
≈
4 - c4.2xlarge
≈
8 - c4.xlarge
≈
Resource Allocation
 All resources assigned to you are dedicated to your instance with no
over commitment*
 All vCPUs are dedicated to you
 Memory allocated is assigned only to your instance
 Network resources are partitioned to avoid “noisy neighbors”
 Curious about the number of instances per host? Use “Dedicated
Hosts” as a guide.
*Again, the “T” family is special
“Launching new instances and running tests
in parallel is easy…[when choosing an
instance] there is no substitute for measuring
the performance of your full application.”
- EC2 documentation
Timekeeping Explained
 Timekeeping in an instance is deceptively hard
 gettimeofday(), clock_gettime(), QueryPerformanceCounter()
 The TSC
 CPU counter, accessible from userspace
 Requires calibration, vDSO
 Invariant on Sandy Bridge+ processors
 Xen pvclock; does not support vDSO
 On current generation instances, use TSC as clocksource
Benchmarking - Time Intensive Application
#include <sys/time.h>
#include <time.h>
#include <stdio.h>
#include <unistd.h>
int main()
{
time_t start,end;
time (&start);
for ( int x = 0; x < 100000000; x++ ) {
float f;
float g;
float h;
f = 123456789.0f;
g = 123456789.0f;
h = f * g;
struct timeval tv;
gettimeofday(&tv, NULL);
}
time (&end);
double dif = difftime (end,start);
printf ("Elasped time is %.2lf seconds.n", dif );
return 0;
}
Using the Xen Clock Source
[centos@ip-192-168-1-77 testbench]$ strace -c ./test
Elasped time is 12.00 seconds.
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
99.99 3.322956 2 2001862 gettimeofday
0.00 0.000096 6 16 mmap
0.00 0.000050 5 10 mprotect
0.00 0.000038 8 5 open
0.00 0.000026 5 5 fstat
0.00 0.000025 5 5 close
0.00 0.000023 6 4 read
0.00 0.000008 8 1 1 access
0.00 0.000006 6 1 brk
0.00 0.000006 6 1 execve
0.00 0.000005 5 1 arch_prctl
0.00 0.000000 0 1 munmap
------ ----------- ----------- --------- --------- ----------------
100.00 3.323239 2001912 1 total
Using the TSC Clock Source
[centos@ip-192-168-1-77 testbench]$ strace -c ./test
Elasped time is 2.00 seconds.
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
32.97 0.000121 7 17 mmap
20.98 0.000077 8 10 mprotect
11.72 0.000043 9 5 open
10.08 0.000037 7 5 close
7.36 0.000027 5 6 fstat
6.81 0.000025 6 4 read
2.72 0.000010 10 1 munmap
2.18 0.000008 8 1 1 access
1.91 0.000007 7 1 execve
1.63 0.000006 6 1 brk
1.63 0.000006 6 1 arch_prctl
0.00 0.000000 0 1 write
------ ----------- ----------- --------- --------- ----------------
100.00 0.000367 53 1 total
Change with:
Tip: Use TSC as clocksource
P-state and C-state control
 c4.8xlarge, d2.8xlarge, m4.10xlarge,
m4.16xlarge, p2.16xlarge, x1.16xlarge,
x1.32xlarge
 By entering deeper idle states, non-idle
cores can achieve up to 300MHz higher
clock frequencies
 But… deeper idle states require more
time to exit, may not be appropriate for
latency-sensitive workloads
 Limit c-state by adding
“intel_idle.max_cstate=1” to grub
Tip: P-state control for AVX2
 If an application makes heavy use of AVX2 on all cores, the processor
may attempt to draw more power than it should
 Processor will transparently reduce frequency
 Frequent changes of CPU frequency can slow an application
sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
See also: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/processor_state_control.html
Review: T2 Instances
 Lowest cost EC2 instance at $0.0065 per hour
 Burstable performance
 Fixed allocation enforced with CPU credits
Model vCPU Baseline CPU Credits
/ Hour
Memory
(GiB)
Storage
t2.nano 1 5% 3 .5 EBS Only
t2.micro 1 10% 6 1 EBS Only
t2.small 1 20% 12 2 EBS Only
t2.medium 2 40% 24 4 EBS Only
t2.large 2 60% 36 8 EBS Only
General purpose, web serving, developer environments, small databases
How Credits Work
 A CPU credit provides the performance of a
full CPU core for one minute
 An instance earns CPU credits at a steady rate
 An instance consumes credits when active
 Credits expire (leak) after 24 hours
Baseline rate
Credit
balance
Burst
rate
Tip: Monitor CPU credit balance
Review: X1 Instances
 Largest memory instance with 2 TB of DRAM
 Quad socket, Intel E7 processors with 128 vCPUs
Model vCPU Memory (GiB) Local Storage Network
x1.16xlarge 64 976 1x 1920GB SSD 10Gbps
x1.32xlarge 128 1952 2x 1920GB SSD 20Gbps
In-memory databases, big data processing, HPC workloads
NUMA
 Non-uniform memory access
 Each processor in a multi-CPU system has
local memory that is accessible through a
fast interconnect
 Each processor can also access memory
from other CPUs, but local memory access
is a lot faster than remote memory
 Performance is related to the number of
CPU sockets and how they are connected -
Intel QuickPath Interconnect (QPI)
QPI
122GB 122GB
16 vCPU’s 16 vCPU’s
r3.8xlarge
QPI
QPI
QPIQPI
QPI
488GB
488GB
488GB
488GB
32 vCPU’s 32 vCPU’s
32 vCPU’s 32 vCPU’s
x1.32xlarge
Tip: Kernel Support for NUMA Balancing
 An application will perform best when the threads of its processes are
accessing memory on the same NUMA node.
 NUMA balancing moves tasks closer to the memory they are accessing.
 This is all done automatically by the Linux kernel when automatic NUMA
balancing is active: version 3.8+ of the Linux kernel.
 Windows support for NUMA first appeared in the Enterprise and Data
Center SKUs of Windows Server 2003.
 Set “numa=off” or use numactl to reduce NUMA paging if your
application uses more memory than will fit on a single socket or has
threads that move between sockets
Operating Systems Impact Performance
 Memory intensive web application
 Created many threads
 Rapidly allocated/deallocated memory
 Comparing performance of RHEL6 vs RHEL7
 Notice high amount of “system” time in top
 Found a benchmark tool (ebizzy) with a similar performance profile
 Traced it’s performance with “perf”
On RHEL6
[ec2-user@ip-172-31-12-150-RHEL6 ebizzy-0.3]$ sudo perf stat ./ebizzy -S 10
12,409 records/s
real 10.00 s
user 7.37 s
sys 341.22 s
Performance counter stats for './ebizzy -S 10':
361458.371052 task-clock (msec) # 35.880 CPUs utilized
10,343 context-switches # 0.029 K/sec
2,582 cpu-migrations # 0.007 K/sec
1,418,204 page-faults # 0.004 M/sec
10.074085097 seconds time elapsed
RHEL6 Flame Graph Output
www.brendangregg.com/flamegraphs.html
On RHEL7
[ec2-user@ip-172-31-7-22-RHEL7 ~]$ sudo perf stat ./ebizzy-0.3/ebizzy -S 10
425,143 records/s
real 10.00 s
user 397.28 s
sys 0.18 s
Performance counter stats for './ebizzy-0.3/ebizzy -S 10':
397515.862535 task-clock (msec) # 39.681 CPUs utilized
25,256 context-switches # 0.064 K/sec
2,201 cpu-migrations # 0.006 K/sec
14,109 page-faults # 0.035 K/sec
10.017856000 seconds time elapsed
Up from 12,400 records/s!
Down from 1,418,204!
RHEL7 Flame Graph Output
Hardware
Split Driver Model
Driver Domain Guest Domain Guest Domain
VMM
Frontend
driver
Frontend
driver
Backend
driver
Device
Driver
Physical
CPU
Physical
Memory
Storage
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Sockets
Application
1
23
4
5
Granting in pre-3.8.0 Kernels
 Requires “grant mapping” prior to 3.8.0
 Grant mappings are expensive operations due to TLB flushes
SSD
Inter domain I/O:
(1) Grant memory
(2) Write to ring buffer
(3) Signal event
(4) Read ring buffer
(5) Map grants
(6) Read or write grants
(7) Unmap grants
read(fd, buffer,…)
I/O domain Instance
Granting in 3.8.0+ Kernels, Persistent and Indirect
 Grant mappings are set up in a pool one time
 Data is copied in and out of the grant pool
SSD
read(fd, buffer…)
I/O domain Instance
Grant pool
Copy to
and from
grant pool
Validating Persistent Grants
[ec2-user@ip-172-31-4-129 ~]$ dmesg | egrep -i 'blkfront'
Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated
disks.
blkfront: xvda: barrier or flush: disabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdb: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdc: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdd: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvde: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdf: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdg: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdh: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
blkfront: xvdi: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
2009 – Longer ago than you think
 Avatar was the top movie in the theaters
 Facebook overtook MySpace in active users
 President Obama was sworn into office
 The 2.6.32 Linux kernel was released
Tip: Use 3.10+ kernel
 Amazon Linux 13.09 or later
 Ubuntu 14.04 or later
 RHEL/Centos 7 or later
 Etc.
Device Pass Through: Enhanced Networking
 SR-IOV eliminates need for driver domain
 Physical network device exposes virtual function to instance
 Requires a specialized driver, which means:
 Your instance OS needs to know about it
 EC2 needs to be told your instance can use it
Hardware
After Enhanced Networking
Driver Domain Guest Domain Guest Domain
VMM
NIC
Driver
Physical
CPU
Physical
Memory
SR-IOV Network
Device
Virtual CPU
Virtual
Memory
CPU
Scheduling
Sockets
Application
1
2
3
NIC
Driver
Elastic Network Adapter
 Next Generation of Enhanced
Networking
 Hardware Checksums
 Multi-Queue Support
 Receive Side Steering
 20Gbps in a Placement Group
 New Open Source Amazon Network
Driver
Network Performance
 20 Gigabit & 10 Gigabit
 Measured one-way, double that for bi-directional (full duplex)
 High, Moderate, Low – A function of the instance size and EBS
optimization
 Not all created equal – Test with iperf if it’s important!
 Use placement groups when you need high and consistent instance
to instance bandwidth
 All traffic limited to 5 Gb/s when exiting EC2
EBS Performance
 Instance size affects throughput
 Match your volume size and
type to your instance
 Use EBS optimization if EBS
performance is important
Tip: Use HVM AMIs with EBS
 Timekeeping: use TSC
 C state and P state controls
 Monitor T2 CPU credits
 Use a modern Linux OS
 NUMA balancing
 Persistent grants for I/O performance
 Enhanced networking
 Profile your application
 Choose HVM AMIs
Summary: Getting the Most Out of EC2 Instances
 Bare metal performance goal, and in many scenarios already there
 History of eliminating hypervisor intermediation and driver domains
 Hardware assisted virtualization
 Scheduling and granting efficiencies
 Device pass through
Virtualization Themes
Next Steps
 Visit the Amazon EC2 documentation
 Launch an instance and try your app!
Thank You!
Remember to complete
your evaluations!

Weitere ähnliche Inhalte

Was ist angesagt?

Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
DataStax
 
(GAM403) From 0 to 60 Million Player Hours in 400B Star Systems
(GAM403) From 0 to 60 Million Player Hours in 400B Star Systems(GAM403) From 0 to 60 Million Player Hours in 400B Star Systems
(GAM403) From 0 to 60 Million Player Hours in 400B Star Systems
Amazon Web Services
 

Was ist angesagt? (20)

Deep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block StoreDeep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block Store
 
AWS Webcast - Achieving consistent high performance with Postgres on Amazon W...
AWS Webcast - Achieving consistent high performance with Postgres on Amazon W...AWS Webcast - Achieving consistent high performance with Postgres on Amazon W...
AWS Webcast - Achieving consistent high performance with Postgres on Amazon W...
 
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
 
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017
Deep Dive on Amazon EC2 Instances - AWS Summit Cape Town 2017
 
Adventures in RDS Load Testing
Adventures in RDS Load TestingAdventures in RDS Load Testing
Adventures in RDS Load Testing
 
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
 
Migrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureMigrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to Azure
 
Solr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySolr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the Ugly
 
Deep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS PerformanceDeep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS Performance
 
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014
 
Speeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudSpeeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the Cloud
 
Devnexus slides - Amazon Web Services
Devnexus slides - Amazon Web ServicesDevnexus slides - Amazon Web Services
Devnexus slides - Amazon Web Services
 
Openstack study-nova-02
Openstack study-nova-02Openstack study-nova-02
Openstack study-nova-02
 
FPGAs in the cloud? (October 2017)
FPGAs in the cloud? (October 2017)FPGAs in the cloud? (October 2017)
FPGAs in the cloud? (October 2017)
 
ELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log system
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
 
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
 
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
 
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...
 
(GAM403) From 0 to 60 Million Player Hours in 400B Star Systems
(GAM403) From 0 to 60 Million Player Hours in 400B Star Systems(GAM403) From 0 to 60 Million Player Hours in 400B Star Systems
(GAM403) From 0 to 60 Million Player Hours in 400B Star Systems
 

Andere mochten auch

Andere mochten auch (20)

Introduction on Amazon EC2
 Introduction on Amazon EC2 Introduction on Amazon EC2
Introduction on Amazon EC2
 
Getting the most Bang for your Buck with #EC2 #Winning
Getting the most Bang for your Buck with #EC2 #WinningGetting the most Bang for your Buck with #EC2 #Winning
Getting the most Bang for your Buck with #EC2 #Winning
 
Introduction to AWS Batch
Introduction to AWS BatchIntroduction to AWS Batch
Introduction to AWS Batch
 
Introduction to Amazon EC2 Spot
Introduction to Amazon EC2 SpotIntroduction to Amazon EC2 Spot
Introduction to Amazon EC2 Spot
 
Lessons & Use-Cases at Scale - Dr. Pete Stanski
Lessons & Use-Cases at Scale - Dr. Pete StanskiLessons & Use-Cases at Scale - Dr. Pete Stanski
Lessons & Use-Cases at Scale - Dr. Pete Stanski
 
What's New with AWS Lambda
What's New with AWS LambdaWhat's New with AWS Lambda
What's New with AWS Lambda
 
Real-time Data Processing using AWS Lambda
Real-time Data Processing using AWS LambdaReal-time Data Processing using AWS Lambda
Real-time Data Processing using AWS Lambda
 
Introduction to AWS Step Functions:
Introduction to AWS Step Functions: Introduction to AWS Step Functions:
Introduction to AWS Step Functions:
 
Modern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagementModern data architectures for real time analytics and engagement
Modern data architectures for real time analytics and engagement
 
Operating your Production API
Operating your Production APIOperating your Production API
Operating your Production API
 
Build a Website on AWS for Your First 10 Million Users
Build a Website on AWS for Your First 10 Million UsersBuild a Website on AWS for Your First 10 Million Users
Build a Website on AWS for Your First 10 Million Users
 
Modern Data Architectures for Business Insights at Scale
Modern Data Architectures for Business Insights at ScaleModern Data Architectures for Business Insights at Scale
Modern Data Architectures for Business Insights at Scale
 
Introduction to Cloud Computing with Amazon Web Services
Introduction to Cloud Computing with Amazon Web ServicesIntroduction to Cloud Computing with Amazon Web Services
Introduction to Cloud Computing with Amazon Web Services
 
Introducing Amazon Lex – A Service for Building Voice or Text Chatbots - Marc...
Introducing Amazon Lex – A Service for Building Voice or Text Chatbots - Marc...Introducing Amazon Lex – A Service for Building Voice or Text Chatbots - Marc...
Introducing Amazon Lex – A Service for Building Voice or Text Chatbots - Marc...
 
Modern Data Architectures for Real Time Analytics & Engagement
Modern Data Architectures for Real Time Analytics & EngagementModern Data Architectures for Real Time Analytics & Engagement
Modern Data Architectures for Real Time Analytics & Engagement
 
A Brief Look at Serverless Architecture
A Brief Look at Serverless ArchitectureA Brief Look at Serverless Architecture
A Brief Look at Serverless Architecture
 
Getting Started with Docker on AWS
Getting Started with Docker on AWSGetting Started with Docker on AWS
Getting Started with Docker on AWS
 
5 Ways To Surprise Your Audience (and keep their attention)
5 Ways To Surprise Your Audience (and keep their attention)5 Ways To Surprise Your Audience (and keep their attention)
5 Ways To Surprise Your Audience (and keep their attention)
 
Introduction to AWS X-Ray
Introduction to AWS X-RayIntroduction to AWS X-Ray
Introduction to AWS X-Ray
 
Introduction to Amazon Lightsail
Introduction to Amazon LightsailIntroduction to Amazon Lightsail
Introduction to Amazon Lightsail
 

Ähnlich wie Deep Dive on Amazon EC2

Ähnlich wie Deep Dive on Amazon EC2 (20)

SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...
AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...
AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...
 
Deep Dive on Amazon EC2 Instances (March 2017)
Deep Dive on Amazon EC2 Instances (March 2017)Deep Dive on Amazon EC2 Instances (March 2017)
Deep Dive on Amazon EC2 Instances (March 2017)
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
CMP301_Deep Dive on Amazon EC2 Instances
CMP301_Deep Dive on Amazon EC2 InstancesCMP301_Deep Dive on Amazon EC2 Instances
CMP301_Deep Dive on Amazon EC2 Instances
 
Deep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance Performance
 
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
 
Introduction on Amazon EC2
Introduction on Amazon EC2Introduction on Amazon EC2
Introduction on Amazon EC2
 
Deep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance Performance
 
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
 
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureCeph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
 
Build an affordable Cloud Stroage
Build an affordable Cloud StroageBuild an affordable Cloud Stroage
Build an affordable Cloud Stroage
 
Getting Started with Amazon EC2 and Compute Services
Getting Started with Amazon EC2 and Compute ServicesGetting Started with Amazon EC2 and Compute Services
Getting Started with Amazon EC2 and Compute Services
 
Tuning Solr for Logs: Presented by Radu Gheorghe, Sematext
Tuning Solr for Logs: Presented by Radu Gheorghe, SematextTuning Solr for Logs: Presented by Radu Gheorghe, Sematext
Tuning Solr for Logs: Presented by Radu Gheorghe, Sematext
 
Getting Started with Amazon EC2 and AWS Compute Services
Getting Started with Amazon EC2 and AWS Compute ServicesGetting Started with Amazon EC2 and AWS Compute Services
Getting Started with Amazon EC2 and AWS Compute Services
 
Deep Learning for Developers
Deep Learning for DevelopersDeep Learning for Developers
Deep Learning for Developers
 
Deep Dive Amazon EC2
Deep Dive Amazon EC2Deep Dive Amazon EC2
Deep Dive Amazon EC2
 
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech TalksDeep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
 
20160503 Amazed by AWS | Tips about Performance on AWS
20160503 Amazed by AWS | Tips about Performance on AWS20160503 Amazed by AWS | Tips about Performance on AWS
20160503 Amazed by AWS | Tips about Performance on AWS
 

Mehr von Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
raffaeleoman
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
Kayode Fayemi
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New Nigeria
Kayode Fayemi
 

Kürzlich hochgeladen (20)

My Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle BaileyMy Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle Bailey
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510
 
ICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdfICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdf
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
 
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdfAWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
 
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
 
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
 
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
 
Report Writing Webinar Training
Report Writing Webinar TrainingReport Writing Webinar Training
Report Writing Webinar Training
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.
 
Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubs
 
Dreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIIDreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio III
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New Nigeria
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
 
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdfThe workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 

Deep Dive on Amazon EC2

  • 1. Deep Dive on Amazon EC2 Instances Featuring Performance Optimization Best Practices By Liam Morrison, Solutions Architect © 2016 Amazon Web Services, Inc. and its affiliates. All rights served. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon Web Services, Inc.
  • 2. What to Expect from the Session  Understanding of the factors that goes into choosing an EC2 instance  Defining system performance and how it is characterized for different workloads  How Amazon EC2 instances deliver performance while providing flexibility and agility  How to make the most of your EC2 instance experience through the lens of several instance types
  • 3. API EC2 EC2 Amazon Elastic Compute Cloud is Big Instances Networking Purchase options
  • 4. Host Server Hypervisor Guest 1 Guest 2 Guest n Amazon EC2 Instances
  • 5. In the past  First launched in August 2006  M1 instance  “One size fits all” M1
  • 6. Amazon EC2 Instances History 2006 2008 2010 2012 2014 2017 m1.small m1.large m1.xlarge c1.medium c1.xlarge m2.xlarge m2.4xlarge m2.2xlarge cc1.4xlarge t1.micro cg1.4xlarge cc2.8xlarge m1.medium hi1.4xlarge m3.xlarge m3.2xlarge hs1.8xlarge cr1.8xlarge c3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarge g2.2xlarge i2.xlarge i2.2xlarge i2.4xlarge i2.4xlarge m3.medium m3.large r3.large r3.xlarge r3.2xlarge r3.4xlarge r3.8xlarge t2.micro t2.small t2.med c4.large c4.xlarge c4.2xlarge c4.4xlarge c4.8xlarge d2.xlarge d2.2xlarge d2.4xlarge d2.8xlarge g2.8xlarge t2.large m4.large m4.xlarge m4.2xlarge m4.4xlarge m4.10xlarge x1.32xlarge t2.nano m4.16xlarge p2.xlarge p2.8xlarge p2.16xlarge t2.xlarge t2.2xlarge r4.large r4.xlarge r4.2xlarge r4.4xlarge r4.8xlarge i3.large i3.xlarge i3.2xlarge i3.4xlarge r4.16xlarge i3.8xlarge i3.16xlarge c5.large c5.xlarge c5.2xlarge c5.4xlarge c5.8xlarge c5.16xlarge f1.16xlarge f1.2xlarge Coming Soon!
  • 8. 1 2 4 8 16 32 1 2 4 8 16 32 64 128 256 Memory(GB) vCPU g2.2xlarge 8 vCPU, 15 GB 1 x 60 SSD NVIDIA GPU (1,536 CUDA cores, 4GB Mem) 4 vCPU, 30.5 GB i2.xlarge (High IO) - 1 x 800 SSD d2.xlarge (Dense) - 3 x 2000 HDD 8 vCPU, 61 GB i2.2xlarge (High IO) - 2x800 SSD d2.2xlarge (Dense) - 6 x 2000 HDD 16 vCPU, 122 GB i2.4xlarge (High IO) - 4x800 SSD d2.4xlarge (Dense) - 12x2000 HDD 32 vCPU, 244 GB i2.8xlarge (High IO) - 8x800 SSD 36 vCPU, 244 GB d2.8xlarge (Dense) - 24x2000 HDD m3.xlarge 4 vCPU, 15 GB 2 x 40 SSD m3.2xlarge 8 vCPU, 30 GB 2 x 80 SSD m3.large 2 vCPU, 7.5 GB 1 x 32 SSDm3.medium 1 vCPU, 3.75 GB, 1 x 4 SSD t2.micro 1 vCPU, 1GB EBS Only t2.small 1 vCPU, 2GB EBS Only t2.medium 2 vCPU, 4GB EBS Only r3.large 2 vCPU, 15.25 GB 1 x 32 SSD r3.xlarge 4 vCPU, 30.5 GB 1 x 80 SSD r3.4xlarge 16 vCPU, 122 GB 1 x 320 SSD r3.8xlarge 2 vCPU, 244 GB 2 x 320 SSD 2 vCPU, 3.75 GB c4.large - EBS Only c3.large - 2 x 16 SSD 4 vCPU, 7.5 GB c4.xlarge - EBS Only c3.xlarge - 2 x 40 SSD 8 vCPU, 15 GB c4.2xlarge - EBS Only c3.2xlarge - 2 x 80 SSD 32 vCPU, 60 GB c4.8xlarge - EBS Only c3.8xlarge - 2 x 320 SSD m4.large 2 vCPU, 8 GB EBS Only m4.xlarge 4 vCPU, 16 GB EBS Only m4.2xlarge 8 vCPU, 32 GB EBS Only m4.4xlarge 16 vCPU, 64 GB EBS Only m4.10xlarge 40 vCPU, 160GB EBS Only t2.large 2 vCPU, 8 GB EBS Only Storage Optimized GPU Instances General Purpose Memory Optimized Compute Optimized New M4’s/T2 Large t2.nano 1 vCPU, 512MB EBS Only g2.8xlarge 32vCPU, 60 GB 2 x 120 SSD 4 NVIDIA GPUs (1,536 CUDA cores, 4GB Mem) 16 vCPU, 30 GB c4.4xlarge - EBS Only c3.4xlarge - 2 x 160 SSD 41 (latest generations) EC2 Instance Types 64 m4.16xlarge 64 vCPU, 256GB EBS Only P2.xlarge 4 vCPU, 61 GiB NVIDIA K80 (2,496 CUDA cores, 12GiB Mem) r3.2xlarge 8 vCPU, 61 GB 1 x 160 SSD
  • 9. Example: Web Application  MediaWiki installed on Apache with 140 pages of content  Load increased in intervals over time
  • 14. What’s a Virtual CPU? (vCPU)  A vCPU is typically a hyper-threaded physical core*  On Linux, “A” threads enumerated before “B” threads  On Windows, threads are interleaved  Divide vCPU count by 2 to get core count  Cores by EC2 & RDS DB Instance type: https://aws.amazon.com/ec2/virtualcores/ * The “t” family is special
  • 15.
  • 16. Disable Hyper-Threading If You Need To  Useful for FPU heavy applications  Use ‘lscpu’ to validate layout  Hot offline the “B” threads for i in `seq 64 127`; do echo 0 > /sys/devices/system/cpu/cpu${i}/online done  Set grub to only initialize the first half of all threads maxcpus=63 [ec2-user@ip-172-31-7-218 ~]$ lscpu CPU(s): 128 On-line CPU(s) list: 0-127 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 4 NUMA node(s): 4 Model name: Intel(R) Xeon(R) CPU Hypervisor vendor: Xen Virtualization type: full NUMA node0 CPU(s): 0-15,64-79 NUMA node1 CPU(s): 16-31,80-95 NUMA node2 CPU(s): 32-47,96-111 NUMA node3 CPU(s): 48-63,112-127
  • 17.
  • 18. Instance Sizing c4.8xlarge 2 - c4.4xlarge ≈ 4 - c4.2xlarge ≈ 8 - c4.xlarge ≈
  • 19. Resource Allocation  All resources assigned to you are dedicated to your instance with no over commitment*  All vCPUs are dedicated to you  Memory allocated is assigned only to your instance  Network resources are partitioned to avoid “noisy neighbors”  Curious about the number of instances per host? Use “Dedicated Hosts” as a guide. *Again, the “T” family is special
  • 20. “Launching new instances and running tests in parallel is easy…[when choosing an instance] there is no substitute for measuring the performance of your full application.” - EC2 documentation
  • 21. Timekeeping Explained  Timekeeping in an instance is deceptively hard  gettimeofday(), clock_gettime(), QueryPerformanceCounter()  The TSC  CPU counter, accessible from userspace  Requires calibration, vDSO  Invariant on Sandy Bridge+ processors  Xen pvclock; does not support vDSO  On current generation instances, use TSC as clocksource
  • 22. Benchmarking - Time Intensive Application #include <sys/time.h> #include <time.h> #include <stdio.h> #include <unistd.h> int main() { time_t start,end; time (&start); for ( int x = 0; x < 100000000; x++ ) { float f; float g; float h; f = 123456789.0f; g = 123456789.0f; h = f * g; struct timeval tv; gettimeofday(&tv, NULL); } time (&end); double dif = difftime (end,start); printf ("Elasped time is %.2lf seconds.n", dif ); return 0; }
  • 23. Using the Xen Clock Source [centos@ip-192-168-1-77 testbench]$ strace -c ./test Elasped time is 12.00 seconds. % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 99.99 3.322956 2 2001862 gettimeofday 0.00 0.000096 6 16 mmap 0.00 0.000050 5 10 mprotect 0.00 0.000038 8 5 open 0.00 0.000026 5 5 fstat 0.00 0.000025 5 5 close 0.00 0.000023 6 4 read 0.00 0.000008 8 1 1 access 0.00 0.000006 6 1 brk 0.00 0.000006 6 1 execve 0.00 0.000005 5 1 arch_prctl 0.00 0.000000 0 1 munmap ------ ----------- ----------- --------- --------- ---------------- 100.00 3.323239 2001912 1 total
  • 24. Using the TSC Clock Source [centos@ip-192-168-1-77 testbench]$ strace -c ./test Elasped time is 2.00 seconds. % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 32.97 0.000121 7 17 mmap 20.98 0.000077 8 10 mprotect 11.72 0.000043 9 5 open 10.08 0.000037 7 5 close 7.36 0.000027 5 6 fstat 6.81 0.000025 6 4 read 2.72 0.000010 10 1 munmap 2.18 0.000008 8 1 1 access 1.91 0.000007 7 1 execve 1.63 0.000006 6 1 brk 1.63 0.000006 6 1 arch_prctl 0.00 0.000000 0 1 write ------ ----------- ----------- --------- --------- ---------------- 100.00 0.000367 53 1 total
  • 25. Change with: Tip: Use TSC as clocksource
  • 26. P-state and C-state control  c4.8xlarge, d2.8xlarge, m4.10xlarge, m4.16xlarge, p2.16xlarge, x1.16xlarge, x1.32xlarge  By entering deeper idle states, non-idle cores can achieve up to 300MHz higher clock frequencies  But… deeper idle states require more time to exit, may not be appropriate for latency-sensitive workloads  Limit c-state by adding “intel_idle.max_cstate=1” to grub
  • 27. Tip: P-state control for AVX2  If an application makes heavy use of AVX2 on all cores, the processor may attempt to draw more power than it should  Processor will transparently reduce frequency  Frequent changes of CPU frequency can slow an application sudo sh -c "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo" See also: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/processor_state_control.html
  • 28. Review: T2 Instances  Lowest cost EC2 instance at $0.0065 per hour  Burstable performance  Fixed allocation enforced with CPU credits Model vCPU Baseline CPU Credits / Hour Memory (GiB) Storage t2.nano 1 5% 3 .5 EBS Only t2.micro 1 10% 6 1 EBS Only t2.small 1 20% 12 2 EBS Only t2.medium 2 40% 24 4 EBS Only t2.large 2 60% 36 8 EBS Only General purpose, web serving, developer environments, small databases
  • 29. How Credits Work  A CPU credit provides the performance of a full CPU core for one minute  An instance earns CPU credits at a steady rate  An instance consumes credits when active  Credits expire (leak) after 24 hours Baseline rate Credit balance Burst rate
  • 30. Tip: Monitor CPU credit balance
  • 31. Review: X1 Instances  Largest memory instance with 2 TB of DRAM  Quad socket, Intel E7 processors with 128 vCPUs Model vCPU Memory (GiB) Local Storage Network x1.16xlarge 64 976 1x 1920GB SSD 10Gbps x1.32xlarge 128 1952 2x 1920GB SSD 20Gbps In-memory databases, big data processing, HPC workloads
  • 32. NUMA  Non-uniform memory access  Each processor in a multi-CPU system has local memory that is accessible through a fast interconnect  Each processor can also access memory from other CPUs, but local memory access is a lot faster than remote memory  Performance is related to the number of CPU sockets and how they are connected - Intel QuickPath Interconnect (QPI)
  • 33. QPI 122GB 122GB 16 vCPU’s 16 vCPU’s r3.8xlarge
  • 34. QPI QPI QPIQPI QPI 488GB 488GB 488GB 488GB 32 vCPU’s 32 vCPU’s 32 vCPU’s 32 vCPU’s x1.32xlarge
  • 35. Tip: Kernel Support for NUMA Balancing  An application will perform best when the threads of its processes are accessing memory on the same NUMA node.  NUMA balancing moves tasks closer to the memory they are accessing.  This is all done automatically by the Linux kernel when automatic NUMA balancing is active: version 3.8+ of the Linux kernel.  Windows support for NUMA first appeared in the Enterprise and Data Center SKUs of Windows Server 2003.  Set “numa=off” or use numactl to reduce NUMA paging if your application uses more memory than will fit on a single socket or has threads that move between sockets
  • 36. Operating Systems Impact Performance  Memory intensive web application  Created many threads  Rapidly allocated/deallocated memory  Comparing performance of RHEL6 vs RHEL7  Notice high amount of “system” time in top  Found a benchmark tool (ebizzy) with a similar performance profile  Traced it’s performance with “perf”
  • 37. On RHEL6 [ec2-user@ip-172-31-12-150-RHEL6 ebizzy-0.3]$ sudo perf stat ./ebizzy -S 10 12,409 records/s real 10.00 s user 7.37 s sys 341.22 s Performance counter stats for './ebizzy -S 10': 361458.371052 task-clock (msec) # 35.880 CPUs utilized 10,343 context-switches # 0.029 K/sec 2,582 cpu-migrations # 0.007 K/sec 1,418,204 page-faults # 0.004 M/sec 10.074085097 seconds time elapsed
  • 38. RHEL6 Flame Graph Output www.brendangregg.com/flamegraphs.html
  • 39. On RHEL7 [ec2-user@ip-172-31-7-22-RHEL7 ~]$ sudo perf stat ./ebizzy-0.3/ebizzy -S 10 425,143 records/s real 10.00 s user 397.28 s sys 0.18 s Performance counter stats for './ebizzy-0.3/ebizzy -S 10': 397515.862535 task-clock (msec) # 39.681 CPUs utilized 25,256 context-switches # 0.064 K/sec 2,201 cpu-migrations # 0.006 K/sec 14,109 page-faults # 0.035 K/sec 10.017856000 seconds time elapsed Up from 12,400 records/s! Down from 1,418,204!
  • 41. Hardware Split Driver Model Driver Domain Guest Domain Guest Domain VMM Frontend driver Frontend driver Backend driver Device Driver Physical CPU Physical Memory Storage Device Virtual CPU Virtual Memory CPU Scheduling Sockets Application 1 23 4 5
  • 42. Granting in pre-3.8.0 Kernels  Requires “grant mapping” prior to 3.8.0  Grant mappings are expensive operations due to TLB flushes SSD Inter domain I/O: (1) Grant memory (2) Write to ring buffer (3) Signal event (4) Read ring buffer (5) Map grants (6) Read or write grants (7) Unmap grants read(fd, buffer,…) I/O domain Instance
  • 43. Granting in 3.8.0+ Kernels, Persistent and Indirect  Grant mappings are set up in a pool one time  Data is copied in and out of the grant pool SSD read(fd, buffer…) I/O domain Instance Grant pool Copy to and from grant pool
  • 44. Validating Persistent Grants [ec2-user@ip-172-31-4-129 ~]$ dmesg | egrep -i 'blkfront' Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated disks. blkfront: xvda: barrier or flush: disabled; persistent grants: enabled; indirect descriptors: enabled; blkfront: xvdb: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; blkfront: xvdc: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; blkfront: xvdd: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; blkfront: xvde: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; blkfront: xvdf: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; blkfront: xvdg: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; blkfront: xvdh: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; blkfront: xvdi: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled;
  • 45. 2009 – Longer ago than you think  Avatar was the top movie in the theaters  Facebook overtook MySpace in active users  President Obama was sworn into office  The 2.6.32 Linux kernel was released Tip: Use 3.10+ kernel  Amazon Linux 13.09 or later  Ubuntu 14.04 or later  RHEL/Centos 7 or later  Etc.
  • 46. Device Pass Through: Enhanced Networking  SR-IOV eliminates need for driver domain  Physical network device exposes virtual function to instance  Requires a specialized driver, which means:  Your instance OS needs to know about it  EC2 needs to be told your instance can use it
  • 47. Hardware After Enhanced Networking Driver Domain Guest Domain Guest Domain VMM NIC Driver Physical CPU Physical Memory SR-IOV Network Device Virtual CPU Virtual Memory CPU Scheduling Sockets Application 1 2 3 NIC Driver
  • 48. Elastic Network Adapter  Next Generation of Enhanced Networking  Hardware Checksums  Multi-Queue Support  Receive Side Steering  20Gbps in a Placement Group  New Open Source Amazon Network Driver
  • 49. Network Performance  20 Gigabit & 10 Gigabit  Measured one-way, double that for bi-directional (full duplex)  High, Moderate, Low – A function of the instance size and EBS optimization  Not all created equal – Test with iperf if it’s important!  Use placement groups when you need high and consistent instance to instance bandwidth  All traffic limited to 5 Gb/s when exiting EC2
  • 50. EBS Performance  Instance size affects throughput  Match your volume size and type to your instance  Use EBS optimization if EBS performance is important
  • 51. Tip: Use HVM AMIs with EBS
  • 52.  Timekeeping: use TSC  C state and P state controls  Monitor T2 CPU credits  Use a modern Linux OS  NUMA balancing  Persistent grants for I/O performance  Enhanced networking  Profile your application  Choose HVM AMIs Summary: Getting the Most Out of EC2 Instances
  • 53.  Bare metal performance goal, and in many scenarios already there  History of eliminating hypervisor intermediation and driver domains  Hardware assisted virtualization  Scheduling and granting efficiencies  Device pass through Virtualization Themes
  • 54. Next Steps  Visit the Amazon EC2 documentation  Launch an instance and try your app!