Amazon EBS provides persistent block-level storage volumes for use with Amazon EC2 instances. In this technical session, you will discover how Amazon EBS can take your application deployments on EC2 to the next level. Session attendees will learn about the Amazon EBS features and benefits, how to identify applications that are appropriate for use with Amazon EBS, best practices, and details about its performance and volume types. We discuss how to maximize Amazon EBS performance, with a special emphasis on low-latency, high-throughput applications like transactional and NoSQL databases, and big data analysis frameworks like Hadoop and Kafka. We will also dive deep and discuss Elastic Volumes, our latest EBS feature that allows you to dynamically increase capacity, tune performance, and change the type of EBS volumes on the fly. Throughout, we share tips for success.
2. The AWS Storage Portfolio
Cloud Data Migration
AWS Direct
Connect
AWS Snow*
data transport
family
3rd Party
Connectors
Transfer
Acceleration
AWS
Storage
Gateway
Amazon Kinesis
Firehose
Object
Amazon GlacierAmazon S3
Block
Amazon EBS
(persistent)
Amazon EC2
Instance Store
(ephemeral)
File
Amazon EFS
5. What is Amazon EC2 instance store?
EC2 instances • Local to instance
• Non-persistent data store
• Data not replicated (by default)
• No snapshot support
• SSD or HDD
Physical Host
Instance Store
or
10. What is EBS?
EC2
instance
EC2
instance
• Volumes persist independent of
EC2
• Select storage and compute
based on your workload
• Detach and attach between
instances within the same
Availability Zone
Availability Zone
AWS Region
EBS
volume
11. What is EBS?
EC2
instance
• Volumes attach to one
instance at a time
• Many volumes can attach to
an instance
• Separate boot volume from
data volumes
EBS
volume
(boot)
EBS
volume
(data)
EBS
volume
(data)
Availability Zone
AWS Region
16. or
Choosing an EBS volume type
What is more important to your workload?
IOPS Throughput?
17. i3
gp2
Choosing an EBS volume type
Latency?
< 1 ms Single-digit ms
Which is more important?
Cost Performance
IOPS
≤ 75,000> 75,000
is more important
18. EBS volume types: General Purpose SSD
gp2
Throughput: 160 MiB/s
Latency: Single-digit ms
Capacity: 1 GiB to 16 TiB
Baseline: 100 - 10,000 IOPS; 3 IOPS per GiB
Burst: 3,000 IOPS (for volumes up to 1,000 GiB)
Great for boot volumes, low-latency applications, and bursty databases
General Purpose SSD
20. Burst bucket: General Purpose SSD (gp2)
Max I/O credit per bucket is 5.4M
You can spend up to
3000 credits per second
Baseline performance = 3 IOPS per GiB or 100 IOPS
Always accumulating
3 IOPS per GiB per second
gp2
21. How long can I burst on gp2?
0
150
300
450
600
750
1 8 30 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950
MinutesofBurst
Volume size in GB
43 min
300 GiB
1 hour
500 GiB
10 hours
950 GiB
22. i3
gp2 io1
Choosing an EBS volume type
Latency?
< 1 ms Single-digit ms
Which is more important?
Cost Performance
IOPS
≤ 75,000> 75,000
is more important
23. EBS volume types: I/O Provisioned
Provisioned IOPS SSD
io1
Baseline: 100 to 20,000 IOPS
Throughput: 320 MiB/s
Latency: Single-digit ms
Capacity: 4 GiB to 16 TiB
Ideal for critical applications and databases with sustained IOPS
24. Scaling Provisioned IOPS SSD (io1)IOPS
0 2 16
1,000
5,000
10,000
15,000
20,000
6 90.4
MAX PROVISIONED IOPS
(Maximum IOPS:GB ratio of 50:1)
Available Provisioned IOPS
Volume Size (TiB)
~ 400 GiB
25. gp2 io1
Choosing an EBS volume type
Latency?
< 1 ms Single-digit ms
Which is more important?
PerformanceCost
IOPS
≤ 75,000> 75,000
is more important
Throughput?
i3
26. Throughput
is more important
Small, random I/O Large, sequential I/O
gp2 io1 st1
d2
Choosing an EBS volume type
Latency?
< 1 ms Single-digit ms ≤ 1,750 MiB/s
Aggregate throughput?
> 1,750 MiB/s
Which is more important?
Cost Performance
IOPS
≤ 75,000> 75,000
is more important
Which is more important?
Cost Performance
i3
27. EBS volume types: Throughput Provisioned
st1
Baseline: 40 MiB/s per TiB up to 500 MiB/s
Capacity: 500 GiB to 16 TiB
Burst: 250 MiB/s per TiB up to 500 MiB/s
Ideal for large-block, high-throughput sequential workloads
28. Throughput Optimized HDD – burst and base
0
125
250
375
500
625
0.5 1 2 4 6 8 10 12 14 16
ThroughputinMiB/s
Volume Size in TiB
Burst Base
320
ST1
29. Burst bucket: Throughput Optimized HDD (st1)
Max I/O bucket credit is 1 TiB of
credit per TiB in volume
You can spend up to 500
MiB/s per TiB
Baseline performance = 40 MiB/s per TiB
Always accumulating 40 MiB/s per TiB
st1
30. Up to 8 TiB in I/O credit
Always accumulating 320 MiB/s
You can spend up
to 500 MiB/s
Burst bucket: example 8 TiB st1 volume
Baseline performance = 320 MiB/s
st1
31. Throughput
is more important
Small, random I/O Large, sequential I/O
Which is more important?
Latency?
i2
gp2 io1 sc1 st1
d2
Choosing an EBS volume type
IOPS
≤ 75,000> 75,000
< 1 ms Single-digit ms ≤ 1,750 MiB/s
Aggregate throughput?
> 1,750 MiB/s
is more important
Cost Performance
Which is more important?
Cost Performance
32. sc1
EBS volume types: Throughput Provisioned
Baseline: 12 MiB/s per TB up to 192 MiB/s
Capacity: 500 GiB to 16 TiB
Burst: 80 MiB/s per TB up to 250 MiB/s
Ideal for sequential throughput workloads, such as logging and backup
Cold HDD
33. Cold HDD – burst and base
0
75
150
225
300
0.5 1 2 4 6 8 10 12 14 16
ThroughputinMiB/s
Volume size in TiB
Burst Base
192
SC1
34. Burst bucket: Cold HDD (sc1)
Max I/O bucket credit is 1 TiB of
credit per TiB in volume
You can spend up to 250
MiB/s per TiB
Baseline performance = 12 MiB/s per TiB
Always accumulating 12 MiB/s per TiB
sc1
35. Throughput
is more important
Small, random I/O Large, sequential I/O
Which is more important?
Latency?
i3
gp2 io1 sc1 st1
d2
Choosing an EBS volume type
IOPS
≤ 75,000> 75,000
< 1 ms Single-digit ms ≤ 1,250 MiB/s
Aggregate throughput?
> 1,250 MiB/s
is more important
Cost Performance
Which is more important?
Cost Performance
37. Baseline: 100 - 10,000 IOPS; 3 IOPS per GiB
EBS volume types: General Purpose SSD
gp2
Throughput: 160MiB/s
Latency: Single-digit ms
Capacity: 1 GiB to 16 TiB
Burst: 3,000 IOPS (for volumes up to 1 TiB)
Great for boot volumes, low-latency applications, and bursty databases
General Purpose SSD
38. I/O Provisioned Volumes Throughput Provisioned Volumes
sc1st1io1gp2
$0.10 per GiB $0.125 per GiB
$0.065 per PIOPS
* All prices are per month, prorated to the hour and from the us-west-2 Region as of April 2017
$0.045 per GiB $0.025 per GiB
Snapshot storage for all volume types is $0.05 per GiB per month
39. Hybrid volume use cases
c4
gp2
st1
Case Study:
Librato’s Experience Running Cassandra
Using Amazon EBS
Data files
Commit log
i2
50. EBS volume types: Throughput Provisioned
Throughput Optimized
HDD
st1
Baseline: 40 MiB/s per TiB up to 500 MiB/s
Capacity: 500 GiB to 16 TiB
Burst: 250 MiB/s per TiB up to 500 MiB/s
Ideal for large-block, high-throughput sequential workloads
53. Elastic Volumes - Automation Ideas
• Use a CloudWatch alarm to
watch for a volume that is
running at or near its IOPS limit.
Amazon CloudWatch
Right-sizing:
• Initiate workflow and approval
process to provision additional
IOPS
• Publish a “free space” metric to
CloudWatch and use a similar
approval process to resize the
volume and the filesystem.
54. Elastic Volumes - Automation Ideas
• Use metrics or schedules to
reduce IOPS or to change the
type of a volume
Amazon CloudWatch
Cost Reduction:
https://github.com/awslabs/aws-elastic-volumes
55. Hybrid volume use cases
gp2 st1
Case Study:
How Videology and Zendesk Modernized
Their Big Data Platforms on Amazon EBS
Hot data
0–7 Days
Warm data
8–30 days
sc1
Cold data
31–60 days
Tiered Elasticsearch data:
58. What is EBS encryption?
Encryption
• Attach both encrypted and unencrypted
• No volume performance impact
• Any current generation instance
• Supported by all EBS volume types
• Snapshots also encrypted
• No extra cost
• Boot and data volumes can be encrypted
60. Best practice: Encryption
Create a new AWS KMS master key for EBS
• Define key rotation policy
• Enable AWS CloudTrail auditing
• Control who can use key
• Control who can administer key
63. EBS is designed for:
What is EBS?
99.999% service availability
0.1% to 0.2% annual failure rate (AFR)
64. What is an EBS snapshot?
EBS
volume
Availability Zone
AWS Region
Amazon
S3
Availability Zone
Replica
EBS snapshot
65. How does an EBS snapshot work?
• Point-in-time backup of modified volume blocks
• Stored in S3, accessed via EBS APIs
• Subsequent snapshots are incremental
• Deleting snapshot will only remove data exclusive
to that snapshot
EBS
volume
EBS
snapshot
67. Best practice: taking snapshots from Linux
Quiesce I/O
1. Database: FLUSH and LOCK tables
2. Filesystem: sync and fsfreeze
3. EBS: snapshot all volumes
4. When CreateSnapshot API returns
success, it is safe to resume
68. Best practice: taking snapshots from Windows
1. sync equivalent available
2. Use Volume Shadow Copy Service (VSS) -
aware utilities for backups
3. EBS: backups on dedicated volume for
snapshots
70. Hybrid volume use cases
st1
Case study:
https://aws.amazon.com/solutions/case-studies/infor-ebs/
Transaction logs
“We’ve seen much stronger performance for our database backup
workloads with the Amazon EBS ST1 volumes, and we’re also
saving 75 percent on our monthly backup costs.”
Randy Young, Director of Cloud Operations, Infor
i2
st1
Full backups
st1
Partial backups
SQL Server
Database
EBS
snapshots
71. Tracking snapshots and costs (tagging)
• Custom tags provide the ability to assign key / value pairs to AWS
resources
• EBS snapshots supports custom tags for identification and
management
• Tags assigned to EBS snapshots can be activated as “cost
allocation” tags. Allows for greater visibility into snapshot storage
costs
EBS
snapshots
Development
EBS volume
Key = usage
Value = dev
Key = usage
Value = dev
EBS
snapshots
Production
EBS volume
Key = usage
Value = backup
Key = usage
Value = backup
72. Tracking snapshots and costs (tagging)
• Tags can be added to an existing snapshot using:
• AWS console
• API call:
• CLI command: aws ec2 create-tags --resources
snap-06d43f58bce0f49d9 --tags
Key=usage,Value=dev
CreateTags
Key = usage Value = dev
73. Tracking snapshots and costs (Cost Explorer)
• First, activate customer tag for cost allocation
• Generate reports with …..
• View usage and costs broken down by “usage” tag value
(example : metrics, dev, backup)
74. What can you do with a snapshot?
EBS
volume
Availability Zone
AWS Region
Amazon
S3
EBS snapshot
Availability Zone
EBS
volume
Replica Replica
75. What can you do with a snapshot?
EBS
volume
Availability Zone
AWS Region
Amazon
S3
EBS snapshot
EBS
volume
Availability Zone
AWS Region
EBS snapshot
Replica Replica
76. What can you do with a snapshot?
AWS Region
Public datasets on
AWS available as
EBS snapshots:
https://aws.amazon.com/public-data-sets/
• Genomic
• Census
• Global weather
• Transportation
77. Best practice: Automate snapshots
Key ingredients:
AWS Lambda Amazon EC2
Run Command
Tagging
https://aws.amazon.com/ec2/run-command/
78. Best practice: Automate snapshots
Lambda
scheduled event:
daily snapshots
EC2
Backup
Retention
30 days
Search for instances
tagged “Backup”
Use EC2 Run Command to
quiesce file system
Snapshot attached
volumes
Tag snapshots with
expire date
1. 2. 3. 4.
instances
79. Best practice: Automate snapshot expiration
Lambda
scheduled event:
daily expire
Search for snapshots
tagged to “Expire On”
today
Delete expired snapshots
1. 2.
EBS
snapshots
Backup
ExpireOn
Date
87. How do we count I/Os for GP2 and IO1?
When possible, we logically merge sequential I/Os (up to
256 KiB in size)
...To minimize I/O charges on IO1
and maximize burst on GP2
88. How do we count I/Os for GP2 and IO1?
Example 1: Random I/Os
• 4 random I/Os (i.e., non sequential I/Os)
• Each I/O 64 KiB
Up to 256 KiB
EC2
instance
EBS
Counted as 4 I/Os
89. How do we count I/Os for GP2 and IO1?
Example 2: Sequential I/O
• 4 sequential I/Os
• Each I/O 64 KiB
Up to 256 KiB
EC2
instance
EBS
Counted as 1 I/O
90. How do we count I/Os for GP2 and IO1?
Example 3: Large I/O
• 1 I/O
• 1024 KiB
Up to 256 KiB
EC2
instance
EBS
Counted as 4 I/Os
91. How do we count I/Os for ST1 and SC1?
• When possible, we merge sequential I/Os (up to 1 MiB in size)
• Workloads with primarily large, sequential I/Os perform best on ST1
and SC1
• Examples: Big Data/EMR, Hadoop, Kafka, log processing, data
warehouses
92. How do we count I/Os for ST1 and SC1?
Example 1: Random I/Os
• 4 random I/Os
• Each I/O 64 KiB
Up to 1024 KiB
EC2
instance
EBS
Counted as 4 I/Os or 4 MiB/s of burst
93. How do we count I/Os for ST1 and SC1?
Example 2: Sequential I/O
• 4 sequential I/Os
• Each I/O 1024 KiB
Up to 1024 KiB
EC2
instance
EBS
Counted as 4 I/Os or 4 MiB/s of burst
94. How do we count I/Os for ST1 and SC1?
Example 3: Mixed I/O
• 2 * 512 KB sequential I/Os
• 2 * 64 KB random I/Os
• 2 * 128 KB sequential I/Os
Up to 1024 KiB
EC2
instance
EBS
Counted as 4 I/Os or 4 MiB/s of burst (but only ~ 1.4 MiB of data transferred)
95. Burst balance for ST1 and SC1 - Sequential
0
25
50
75
100
125
0 1 2 3 4 5 6 7 8 9 10
BurstBalance%
Time in Hours
1 MB Sequential
4 TiB ST1 volume
1 MB Sequential:
500 MB/s for 3 hours
96. Burst balance for ST1 and SC1 - Random
0
25
50
75
100
125
0 1 2 3 4 5 6 7 8 9 10
BurstBalance%
Time in Hours
16 KB Random
4 TiB ST1 volume
1 MiB Sequential:
500 MB/s for 3 hours
16 KiB Random:
8 MB/s for 3 hours
97. Burst balance for ST1 and SC1
4 TiB ST1 Volume5,400 GiB
87 GiB
0
1500
3000
4500
6000
Data Transferred in GiB
Data Transfer Comparison: Sequential vs. Random
1 MiB Sequential 16 KiB Random
1 MiB Sequential:
5.4 TiB transferred
16 KiB Random:
87 GiB transferred
98. 2046 sectors x 512 bytes/sector = ~1024 KiB
$ iostat –xm
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
xvdf 0.00 0.20 0.00 523.40 0.00 523.00 2046.44 3.99 7.62 1.61 100.00
Verify workload I/O patterns
iostat for Linux
perfmon for Windows
102. ST1 & SC1: Linux performance tuning
Increase read-ahead buffer:
• Recommended for high-throughput read workloads
• Per volume configuration
• Default is 128 KiB (256 sectors) for Amazon Linux
• Smaller or random I/O will degrade performance
For example:
$ sudo blockdev –setra 2048 /dev/xvdf
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html
103. ST1 & SC1: Linux performance tuning
Increase maximum I/O size:
• Recommended for high-throughput read workloads
• Per instance configuration for all volumes
• Default is 128 KiB (32 pages) for Amazon Linux
• Increased memory consumption per volume
3.10 – 4.5:
kernel /boot/vmlinuz-4.4.5-15.26.amzn1.x86_64 root=LABEL=/ console=ttyS0 xen_blkfront.max=256
4.6+:
kernel /boot/vmlinuz-4.9.20-11.31.amzn1.x86_64 root=LABEL=/ console=ttyS0
xen_blkfront.max_indirect_segments=256
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html
104. How do I monitor my burst balance?
VolumeWriteOpsBurstBalance
900,000
write IOs
over 5 min =
3000 IOPS
450,000
write IOs
over 5 min =
1500 IOPS
One Hour
105. What is an EBS-optimized instance?
EBS
volume
Availability Zone
AWS Region
EBS-optimized
EC2 instance
106. What is an EBS-optimized instance?
EC2 instances
InternetDatabases
c3.2xlarge
~ 125 MiB/s
Shared
S3
EBS
107. What is an EBS-optimized instance?
c3.2xlarge
~ 125 MiB/s
Shared
~ 125 MiB/s
Dedicated
EBS
EC2 instances
InternetDatabases
S3
108. What is an EBS-optimized instance?
More details:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html
• Dedicated network bandwidth for EBS I/O
• Enabled by default on c4, d2, m4, r4, p2, and x1 instances
• Can be enabled at instance launch or on a running instance
• Not an option on some 10 Gbps instance types
• (c3.8xlarge, r3.8xlarge, i2.8xlarge)
112. Best practice: RAID
When to RAID?
• Storage requirement > 16 TiB
• Throughput requirement > 500 MiB/s
• IOPS requirement > 20,000 @ 16K
EBS EBS EBS
113. Best practice: RAID
Avoid RAID for redundancy
• EBS data is already replicated
• RAID1 halves available EBS bandwidth
• RAID5/6 loses 20 – 30% of usable I/O to parity
EBS EBS EBS
114. EBS volume initialization
New EBS volume? New EBS volume from snapshot?
• Attach and it’s ready to go • Initialize for best performance
• Random read across volume
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html
116. Summary
Use encryption if you
need it
Take snapshots,
tag snapshots
Select the right
instance for your
workload
Select the right
volume for your
workload