Read the blogpost: http://skyscrape.rs/2013/11/15/awsugbe-2-aws-use-cases-and-s3-best-practicesupload-performance/
At the second AWS User Group Belgium, I presented “S3 Intro, tips and filling it up with data quickly”. The first half was a general introduction to S3 and how to use it. The second half focused on how to get your data onto S3 as quickly as possible using standard tools.
After some theory on best practices, we ran some tests and drew conclusions. The tests started at around 18 megabytes per second of data transferred from an EC2 ramdisk to S3. With a few simple optimisations, however, we got up to 248 megabytes per second using just standard command line tools.
The two main contributors to this dramatic performance increase were:
- instance type and related IO performance class
- the use of multiple upload threads.
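To make the second point concrete, here is a minimal sketch of the idea in Python. It is not the tooling used in the tests (those used standard command line tools). The `upload_part` function is a hypothetical stand-in for one real network upload, and the part size and thread count are illustrative values only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_into_parts(data: bytes, part_size: int) -> List[bytes]:
    """Cut a payload into fixed-size parts (the last one may be shorter)."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def upload_part(part_number: int, part: bytes) -> Tuple[int, int]:
    """Hypothetical stand-in for one upload request.

    A real implementation would send the bytes to S3 here; this one only
    reports how many bytes it was given.
    """
    return part_number, len(part)


def parallel_upload(data: bytes, part_size: int, threads: int) -> int:
    """Upload all parts through a pool of worker threads.

    Returns the total number of bytes handed to upload_part.
    """
    parts = split_into_parts(data, part_size)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = pool.map(upload_part, range(1, len(parts) + 1), parts)
        return sum(sent for _, sent in results)


if __name__ == "__main__":
    payload = b"x" * 1000
    print(parallel_upload(payload, part_size=64, threads=10))
```

Because each upload spends most of its time waiting on the network, threads are usually enough to keep several streams busy at once; the right number of threads is something to measure rather than assume.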
Theoretically a Very High I/O instance should go up to 10 Gbit/s, or about 1.1 gigabytes per second. Some people on the internet (http://improve.dk/pushing-the-limits-of-amazon-s3-upload-performance/) claim to have reached such speeds. Alex shed some light on how we might reach that goal by taking into account how S3 indexing and partitioning work. These two links might help:
http://www.slideshare.net/AmazonWebServices/building-scalable-applications-on-amazon-s3-stg303
http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html
Unfortunately I haven't had the time to test that out yet. Any takers? :-)
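One technique commonly described in S3 performance guidance of that period is to put a short hash in front of each key name, so that keys which would otherwise sort next to each other (for example date-based names) are spread across index partitions. Whether this is exactly what Alex had in mind is an assumption on my part, and it was not part of the tests. The function name, the 4-character prefix length and the example key below are all illustrative.

```python
import hashlib


def partition_friendly_key(key: str, prefix_len: int = 4) -> str:
    """Prepend a short, deterministic hex prefix derived from the key itself.

    Sequential names such as '2013-11-06/upload-0001.bin' then no longer
    share a common leading prefix.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{key}"


if __name__ == "__main__":
    for n in range(1, 4):
        print(partition_friendly_key(f"2013-11-06/upload-{n:04d}.bin"))
```

The prefix is derived from the key, so the full object name can always be recomputed from the original name when reading the data back.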
S3 intro, tips and filling it up with data aws ug be #2
1. S3
Intro, tips and filling it up with data quickly
AWS User Group Belgium #2 - 2013/11/06
@fdenkens
frederik@skyscrape.rs
http://skyscrape.rs
2. The Skyscrapers ...
● help companies figure out cloud
● design and build platforms in the cloud
● take care of the complete lifecycle, so you can focus on your business
26. Some numbers
threads | moderate IO  | high IO      | very high IO
        | avg    max   | avg    max   | avg    max
1       | 18     23    | 21     20    | 19     19
10      | 90     112   | 100    118   | 153    164
40      | 86     114   | 114    119   | 248    248
50      | 86     117   | 119    122   | 207    242
(Megabytes per second)
27. Conclusions (1)
● Optimisation is certainly possible
● Single stream max: 150 Mbit/s (about 20 MB/s)
● Newer generations are faster, slightly
● Couldn’t get to 10 Gbit/s
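For anyone who wants to check the units, here is a small sketch of the conversions behind these numbers. It assumes decimal units (1 byte = 8 bits, 1 Gbit = 1000 Mbit) and ignores protocol overhead. The function names are my own.

```python
def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert megabits per second to megabytes per second."""
    return mbit_per_s / 8


def mbyte_to_gbit(mbyte_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second (decimal units)."""
    return mbyte_per_s * 8 / 1000


if __name__ == "__main__":
    # Single stream: 150 Mbit/s expressed in MB/s
    print(f"{mbit_to_mbyte(150):.2f} MB/s")
    # Best measured run: 248 MB/s expressed in Gbit/s
    print(f"{mbyte_to_gbit(248):.3f} Gbit/s")
```

Under those assumptions the best run of 248 MB/s works out to roughly 2 Gbit/s, so about a fifth of the theoretical 10 Gbit/s link.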
28. Conclusions (2)
● Instance IO classes = relative concept
● 50 threads seems to be the sweet spot
● Part size seemed not that important
● Do error control on multi-part
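As an illustration of the last point, here is a minimal sketch of per-part error control: compute an MD5 checksum before sending, retry on failure, and only accept a part when the checksum that comes back matches. The `send` callable and `make_flaky_sender` are hypothetical stand-ins for a real upload call, and the assumption that the service returns the hex MD5 of the part is mine; check what your own tool or SDK actually returns.

```python
import hashlib
from typing import Callable


def upload_with_retry(part: bytes,
                      send: Callable[[bytes], str],
                      max_attempts: int = 3) -> str:
    """Send one part, retrying on I/O errors or checksum mismatches.

    Returns the verified checksum, or raises RuntimeError once
    max_attempts have been used up.
    """
    expected = hashlib.md5(part).hexdigest()
    for _ in range(max_attempts):
        try:
            returned = send(part)
        except IOError:
            continue  # transient failure: try this part again
        if returned == expected:
            return returned
    raise RuntimeError(f"part failed after {max_attempts} attempts")


def make_flaky_sender(failures: int) -> Callable[[bytes], str]:
    """Hypothetical sender that fails a fixed number of times, then succeeds."""
    state = {"remaining": failures}

    def send(part: bytes) -> str:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise IOError("simulated network error")
        return hashlib.md5(part).hexdigest()

    return send


if __name__ == "__main__":
    print(upload_with_retry(b"some part data", make_flaky_sender(2)))
```

Retrying a single part is much cheaper than restarting the whole upload, which is the main reason to do the checking per part.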