With AWS, you can choose the right storage service for the right use case. This session shows the range of AWS choices that are available to you: Amazon S3, Amazon EBS, Amazon EFS, Amazon Glacier and Cloud Data Migration solutions.
4. Cloud Data Migration
Direct Connect
Snow* data transport family
3rd Party Connectors
Transfer Acceleration
Storage Gateway
Kinesis Firehose
The AWS Storage Portfolio
Object: Amazon S3, Amazon Glacier
Block: Amazon EBS (persistent), Amazon EC2 Instance Store (ephemeral)
File: Amazon EFS
5. AWS Storage: Value in Every GB
Proven Reliability and Scale
• 11 years experience
• Secure, Available, Durable
Innovation
• Analytics, Tagging, Inventory, Lifecycle management
• Glacier, EFS services
Cloud Data Migration
• Snow*, Storage Gateway, S3-TA, Kinesis
• Technology partner solutions
Integration with other AWS services
• Athena, EMR, EC2, Redshift, QuickSight…
6. 1 PB raw storage
800 TB usable storage
600 TB allocated storage
400 TB application data
Storage Utilization – Pay as you Grow
Amazon Storage
8. Choice of storage classes on Amazon S3
Storage class | Data type | Access time | Price
S3 Standard | Active data | Milliseconds | $0.021/GB/mo
S3 Standard - Infrequent Access | Infrequently accessed data | Milliseconds | $0.0125/GB/mo
Amazon Glacier | Archive data | Minutes to Hours | $0.004/GB/mo
9. A Closer Look: S3-IA and Amazon Glacier
S3 - IA
• Same durability and throughput as S3 Standard
• Instant access
• $0.01/GB on each data retrieval
Amazon Glacier
• Same 11 9s durability as S3 Standard
• 3 types of retrieval (expedited, standard, bulk)
– 1-5 min ($0.03/GB), 3-5 hr ($0.01/GB), 5-12 hr ($0.0025/GB)
• Suitable for tape replacement scenarios
S3 Standard - Infrequent
Access
Amazon Glacier
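The trade-off between S3 Standard and S3 - IA can be worked out from the per-GB prices quoted on these slides. The following is a minimal sketch, not an official pricing calculator: it uses only the storage and retrieval prices shown here ($0.021 and $0.0125 per GB per month, $0.01 per GB retrieved) and ignores request charges and any minimum-size or minimum-duration charges. The function names and the 10,000 GB workload in the usage note are hypothetical.

```python
# Prices copied from the slides (USD per GB); they may be out of date.
STANDARD_PER_GB_MONTH = 0.021
IA_PER_GB_MONTH = 0.0125
IA_RETRIEVAL_PER_GB = 0.01


def monthly_cost_standard(stored_gb: float) -> float:
    """Storage-only monthly cost in S3 Standard."""
    return stored_gb * STANDARD_PER_GB_MONTH


def monthly_cost_ia(stored_gb: float, retrieved_gb: float) -> float:
    """Monthly cost in S3 - IA: storage plus the per-GB retrieval fee."""
    return stored_gb * IA_PER_GB_MONTH + retrieved_gb * IA_RETRIEVAL_PER_GB


def break_even_retrieval_fraction() -> float:
    """Fraction of the stored data retrieved per month at which both classes cost the same."""
    return (STANDARD_PER_GB_MONTH - IA_PER_GB_MONTH) / IA_RETRIEVAL_PER_GB
```

Under these assumptions, storing 10,000 GB and retrieving 1,000 GB per month costs about $135 in S3 - IA versus about $210 in S3 Standard, and the two classes break even when roughly 85% of the stored data is retrieved each month.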
11. Data-driven storage management for S3
• Analyze storage usage to transition the right data to the right storage class
• Understand how storage usage changes as your S3 objects get older
• Discover how much of your storage is retrieved over time
12. Manage your data: Object Level Tags
Data Classification and Management
Manage data based on what it is as opposed to where it's located
• Easy data management
• Classify your data
• Tag your objects with key-value pairs
• Write policies once based on the type of data
Classification, Lifecycle Policy, Access Control
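Tagging objects with key-value pairs and writing a policy once per data type can be sketched as below. This is an illustration under stated assumptions, not the presenters' code: the helper names and the example tags are hypothetical, the dictionaries follow the shape that boto3's `put_object_tagging` and lifecycle rule filters accept, and the limits (10 tags per object, 128-character keys, 256-character values) are the S3 limits as I understand them.

```python
MAX_TAGS_PER_OBJECT = 10   # S3 limit on tags per object
MAX_KEY_LENGTH = 128       # characters
MAX_VALUE_LENGTH = 256     # characters


def build_tag_set(tags: dict) -> dict:
    """Validate key-value tags and return them in the TagSet shape used for object tagging."""
    if len(tags) > MAX_TAGS_PER_OBJECT:
        raise ValueError("too many tags for one object")
    for key, value in tags.items():
        if not key or len(key) > MAX_KEY_LENGTH or len(value) > MAX_VALUE_LENGTH:
            raise ValueError(f"invalid tag: {key!r}")
    return {"TagSet": [{"Key": k, "Value": v} for k, v in sorted(tags.items())]}


def tag_filter(key: str, value: str) -> dict:
    """Lifecycle rule filter that selects objects by what they are (a tag), not where they are."""
    return {"Tag": {"Key": key, "Value": value}}


# Hypothetical usage with boto3 (not executed here):
#   s3.put_object_tagging(Bucket="example-bucket", Key="report.pdf",
#                         Tagging=build_tag_set({"classification": "confidential"}))
```

The same `tag_filter` dictionary can serve as the `Filter` of a lifecycle rule, so one policy applies to every object carrying that tag regardless of its prefix.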
13. New S3 features: Pulling it all together
• SaaS security & compliance solution
• Built on AWS
• On-Premise NAS migration to AWS S3, Glacier
• Multi-PB migration
• Performant and scalable multi-tenant storage subsystem
• 1.7PB per-month, growing +110% per-year
• 4000 customers, growing +35% per-year
• Leveraged new AWS Storage Features
• S3 tagging for scalability, lifecycle management
• S3 analytics to understand data access patterns
• Cross-region replication
• Glacier expedited retrieval for multi-region availability
“We couldn’t have completed the migration, optimized performance, cost optimized via lifecycle & classified our (4000+) customers without using S3’s new analytics, tagging and lifecycle features”
14. S3 Use Case: Data Lake
Use Case
• Quickly ingest data from many sources, and store it efficiently in one location
• Multiple analytics/processing tools (e.g., Athena, EMR, Rekognition, Redshift) to speed time to value
• Migrate on-premises Data Warehouses, Hadoop & Big Data clusters to AWS
Value Proposition
• Decouple storage and compute—scale optimally and cost efficiently
• Eliminate silos—centrally manage, govern and access all data
• Use the right analytic tools for the job and evolve seamlessly as requirements change, with no data migration
Customer Example: FINRA
• Analyzes & stores 75 Billion events per day with S3, EMR & Redshift
• Securely stores 5PB of historical data on S3 for deeper ad-hoc analytics
• Increased agility and speed to results
• Estimated savings of $10-20M per year over previous on-premise solution
15. Object Storage -- Data Lifecycle Management
• Transition Standard to Standard-IA
• Transition Standard-IA to Amazon Glacier
• Expiration lifecycle policy
• Versioning support
• Prefix support
[Figure: data access frequency over time, from T to T + 365 days]
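The transitions and expiration described on this slide can be expressed as a single lifecycle rule. The sketch below builds the configuration as a plain dictionary in the shape boto3's `put_bucket_lifecycle_configuration` expects. The day counts (30, 90, 365), the rule ID, and the `logs/` prefix are assumptions chosen for illustration, not values from the presentation.

```python
def build_lifecycle_configuration(prefix: str,
                                  ia_after_days: int = 30,
                                  glacier_after_days: int = 90,
                                  expire_after_days: int = 365) -> dict:
    """Build one prefix-scoped rule: Standard -> Standard-IA -> Glacier -> expiration."""
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("day counts must increase from IA to Glacier to expiration")
    if ia_after_days < 30:
        # S3 requires objects to be at least 30 days old before moving to Standard-IA.
        raise ValueError("Standard-IA transition needs at least 30 days")
    return {
        "Rules": [
            {
                "ID": "tier-down-and-expire",          # hypothetical rule name
                "Filter": {"Prefix": prefix},          # prefix support
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


# Hypothetical usage with boto3 (not executed here):
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="example-bucket",
#       LifecycleConfiguration=build_lifecycle_configuration("logs/"))
```

Keeping the rule as data makes it easy to review the day counts against the observed access pattern before applying it to a bucket.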
16. Amazon Glacier
Archival storage for infrequently accessed data
Amazon Glacier is optimized for infrequent retrieval
Stop managing physical media
Even lower cost than Amazon S3; same high durability
17. Amazon S3 and Glacier Durability
4 9s durability
5 9s durability
S3 - IA Glacier
11 9s durability
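A durability figure quoted as a count of nines can be turned into an expected loss rate. The sketch below assumes that "N 9s" means an annual per-object durability of 1 - 10^-N, which is an interpretation rather than something the slide states; the function names and the object count in the usage note are hypothetical.

```python
from fractions import Fraction


def annual_loss_probability(nines: int) -> Fraction:
    """Chance of losing a given object in a year, assuming durability of 1 - 10**-nines."""
    return Fraction(1, 10 ** nines)


def expected_objects_lost_per_year(object_count: int, nines: int) -> Fraction:
    """Expected number of objects lost per year across the whole data set."""
    return object_count * annual_loss_probability(nines)


def years_per_lost_object(object_count: int, nines: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_objects_lost_per_year(object_count, nines)
```

Under this assumption, 10,000,000 objects stored at 11 9s would lose one object every 10,000 years on average, while the same data set at 4 9s would be expected to lose 1,000 objects per year.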
18. Glacier Retrievals: Expedited and Bulk Retrievals
Retrieval option | Expedited | Standard | Bulk
Data Access Time | 1 - 5 minutes | 3 - 5 hours | 5 - 12 hours
Data Retrievals | $0.03 per GB | $0.01 per GB | $0.0025 per GB
Retrieval Requests | $0.01 per request | $0.05 per 1,000 requests | $0.025 per 1,000 requests
• Expedited: designed for occasional urgent access to a small number of archives
• Standard: Low-cost option for retrieving data in just a few hours
• Bulk: Lowest cost option optimized for large retrievals, up to petabytes of data in 12 hours
• Three flexible and powerful retrieval options to access any of your Glacier data
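Choosing among the three retrieval options is a matter of deadline versus cost. The sketch below encodes the access times and prices from this slide and picks the cheapest option that still meets a deadline. It is a simplified model: it uses the upper end of each access-time range as the worst case, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalTier:
    worst_case_hours: float   # upper end of the access-time range on the slide
    per_gb_usd: float
    per_request_usd: float


TIERS = {
    "expedited": RetrievalTier(5 / 60, 0.03, 0.01),
    "standard": RetrievalTier(5.0, 0.01, 0.05 / 1000),
    "bulk": RetrievalTier(12.0, 0.0025, 0.025 / 1000),
}


def retrieval_cost(tier: str, gigabytes: float, requests: int) -> float:
    """Data retrieval charge plus request charge for one retrieval job."""
    t = TIERS[tier]
    return gigabytes * t.per_gb_usd + requests * t.per_request_usd


def cheapest_tier(deadline_hours: float, gigabytes: float, requests: int) -> str:
    """Cheapest tier whose worst-case access time fits within the deadline."""
    candidates = [name for name, t in TIERS.items() if t.worst_case_hours <= deadline_hours]
    if not candidates:
        raise ValueError("no retrieval option is fast enough for this deadline")
    return min(candidates, key=lambda name: retrieval_cost(name, gigabytes, requests))
```

For example, retrieving 1,000 GB in 1,000 requests costs about $2.53 with Bulk, and a 6-hour deadline rules Bulk out and selects Standard.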
19. Backup and Recovery
S3 & Glacier Use Case: Backup & Recovery
IDC predicts the disk-based data protection and recovery market will grow at a compound annual growth rate (CAGR) of 6.3%, with revenue totaling $17.0 billion in 2017
20. Enterprise Backup: Direct Integration or Gateway
Cloud Gateway: Application servers -> Media server -> Cloud Gateway (local disk) -> Internet or AWS Direct Connect -> Amazon S3, Amazon S3-IA, Amazon Glacier
Cloud Connector/Native Integration: Application servers -> Media server with cloud connector (local disk) -> Internet or AWS Direct Connect -> Amazon S3, Amazon S3-IA, Amazon Glacier
21. Which On-Premise Backup Software? All of them!
AWS Storage Gateway VTL
Native S3 Integration
22. S3/Glacier Use Case: Enterprise Backup
Use Case
• On-premises backup to tape or disk
Value Proposition
• Reduce costs – eliminate tape/hw or VTL
• Simplify management
• Infinite scale compared to on-prem
• Maintain existing backup software and paradigms
Customer: Dow Jones
• Before AWS, backup strategy was tape
• 30,000 offsite tapes to manage, poor ROI, and very poor recovery SLAs
• Engaged AWS and Commvault to build next generation on S3 and Glacier
• Now protecting over 4,000 instances at lower cost and higher resiliency.
24. What is Amazon EFS?
• A fully managed file system for Amazon EC2 instances
• Exposes a file system interface that works with standard operating system APIs
• Provides file system access semantics (consistency, locking)
• Sharable across thousands of instances
• Designed to grow elastically to petabyte scale
• Built for performance across a wide variety of workloads
• Highly available and durable
25. Building your own on the cloud is too much work and is expensive
Use a shared file layer
Replicate EBS volumes (1 per EC2 instance)
• Substantial management overhead (sync data, provision and manage volumes)
• Costly (one volume per instance)
• Complex to set up and maintain
• Scale challenges
• Costly (compute + storage)
26. We focused on changing the game
(1) Simple (2) Elastic (3) Scalable
Highly durable
Highly available
27. Amazon EFS is simple
Fully managed
- No hardware, network, file layer
- Create a scalable file system in seconds!
Seamless integration with existing tools and apps
- NFS v4.1—widespread, open
- Standard file system access semantics
- Works with standard OS file system APIs
Simple pricing = simple forecasting
28. Amazon EFS is elastic
File systems grow and shrink automatically as you add and remove files
No need to provision storage capacity or performance
You pay only for the storage space you use, with no minimum fee
29. Amazon EFS is scalable
File systems can grow to petabyte scale
Throughput and IOPS scale automatically as file systems grow
Consistent low latencies regardless of file system size
Support for thousands of concurrent NFS connections
30. Highly durable and highly available (multi-AZ)
Designed to sustain AZ offline conditions
Superior to traditional NAS availability models
Appropriate for production/tier 0 applications
31. Two performance modes designed to support a broad spectrum of use cases
General purpose mode (Default: Recommended for most use cases)
Optimized for latency-sensitive applications and general-purpose file-based workloads; this mode is the best option for the majority of use cases
Max I/O mode
Optimized for large-scale and data-heavy applications where tens, hundreds, or thousands of EC2 instances are accessing the file system; it scales to higher levels of aggregate throughput and ops per second with a tradeoff of slightly higher latencies for file operations
Use CloudWatch to determine whether your application can benefit from Max I/O; if not, you’ll get the best performance in general purpose mode
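The CloudWatch check mentioned here can be sketched as a simple rule over metric samples. The example assumes the samples come from the EFS `PercentIOLimit` metric, which reports how close a General Purpose file system is to its I/O limit. The 95% threshold and the "half of the samples" rule are assumptions made for illustration, not AWS guidance, and the function name is hypothetical.

```python
def recommend_performance_mode(percent_io_limit_samples,
                               threshold: float = 95.0,
                               sustained_fraction: float = 0.5) -> str:
    """Suggest a performance mode from PercentIOLimit samples (values from 0 to 100)."""
    samples = list(percent_io_limit_samples)
    if not samples:
        raise ValueError("need at least one metric sample")
    near_limit = sum(1 for value in samples if value >= threshold)
    # Recommend Max I/O only when the file system sits near its limit most of the time.
    if near_limit / len(samples) >= sustained_fraction:
        return "maxIO"
    return "generalPurpose"
```

The returned strings match the performance mode names used by the EFS API, so the result could be passed on when creating a new file system; samples that stay well below the limit keep the default mode.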
32. Access your EFS file system via Direct Connect
On-premises servers -> Direct Connect -> EFS in your Amazon VPC
33. EFS TCO Comparison vs. Build-your-Own
500GB with High Availability
Build-your-Own: Compute $350/month + Inter-AZ Bandwidth $30/month + Storage $205/month = $585/month
Amazon EFS: $150/month
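The comparison on this slide reduces to simple arithmetic. The sketch below assumes that the $350, $30, and $205 figures are the compute, inter-AZ bandwidth, and storage components of the build-your-own option (they sum to the $585 shown) and that $150 is the Amazon EFS figure; that assignment is inferred from the slide layout, and the variable names are hypothetical.

```python
# Monthly figures from the slide, in USD, for 500GB with high availability.
BUILD_YOUR_OWN = {"compute": 350, "inter_az_bandwidth": 30, "storage": 205}
EFS_MONTHLY = 150  # assumed to be the EFS figure


def monthly_total(components: dict) -> int:
    """Sum the monthly cost components of one option."""
    return sum(components.values())


def monthly_savings() -> int:
    """Difference between the build-your-own total and the EFS figure."""
    return monthly_total(BUILD_YOUR_OWN) - EFS_MONTHLY


def savings_fraction() -> float:
    """Savings as a fraction of the build-your-own total."""
    return monthly_savings() / monthly_total(BUILD_YOUR_OWN)
```

Under these assumptions the build-your-own option costs $585 per month, so EFS would save $435 per month, or roughly three quarters of the total.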
40. EBS Use Case: Big Data (Hadoop, Kafka, logs)
Use Case
• EBS ST1 and SC1 Hard Disk Drive (HDD) volumes for throughput intensive, sequential workloads like Hadoop (EMR, Hortonworks, Cloudera), Vertica, Kafka, and Logs/Splunk.
Value Proposition
• Flexibility: Right-size the instance on CPU and memory, not storage. You are not limited to D2 instances; with EBS, choose any instance size
• Cost: Less expensive than on-prem, GP2, IO1, ephemeral storage
• Replication: Increased durability allows less app level replication
• Data Persistence: Detach and reattach volumes to resize your cluster
Customer Example: Netflix
• Chose 400 x M4.16xlarge + 4PB of EBS ST1 vs. D2.4xlarge
• “Our main motivation to move to EBS is for cost efficiency: we can pick an instance that has the CPU/mem profile we need, and attach appropriate EBS storage size (based on thruput and iops) to it”
42. AWS Snow Family
Snowball: Petabyte-scale data migration
Snowball Edge: Compute & Storage for Hybrid/Edge workloads
Snowmobile: Exabyte-scale data migration
43. How Snowball moves data into and out of AWS
Steps: Create a job -> Connect the Snowball -> Copy data to the Snowball -> Your data moved to Amazon S3
Job status: Job created -> In transit to you -> Delivered to you -> Delivered to AWS -> At AWS -> Job completed
44. AWS Snowball Use Cases
Cloud Migration
Disaster Recovery
Data Center Decommission
Content Distribution
45. What
• 10-100PB in a 45 foot-long, secure (256-bit) ruggedized container truck
Where & When
• Available in all AWS regions
How
• Data transferred via multiple 40Gbps interfaces up to 1Tb/s (100PB in a few weeks)
• Appears as NFS mount point
• Customer orders a Snowmobile, we dispatch it to their site, they hook it up and fill it, it returns
How much does it cost
• $0.005/GB/mo based on provisioned capacity (from site departure to AWS ingestion completion)
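The throughput and pricing figures on this slide can be checked with a little arithmetic. The sketch below assumes decimal units (1 PB = 10^15 bytes = 1,000,000 GB) and models real-world overhead with a hypothetical `efficiency` factor; neither assumption comes from the slide.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(petabytes: float,
                  terabits_per_second: float = 1.0,
                  efficiency: float = 1.0) -> float:
    """Days needed to load the given amount of data at the given line rate."""
    bits = petabytes * 1e15 * 8
    seconds = bits / (terabits_per_second * 1e12 * efficiency)
    return seconds / SECONDS_PER_DAY


def monthly_cost_usd(provisioned_petabytes: float, per_gb_month: float = 0.005) -> float:
    """Monthly charge based on provisioned capacity at the per-GB price on the slide."""
    return provisioned_petabytes * 1_000_000 * per_gb_month
```

At the full 1 Tb/s, 100PB takes a little over 9 days to load; at half that effective rate it takes about 18.5 days, which is consistent with "a few weeks". Provisioning 100PB at $0.005/GB/mo works out to about $500,000 per month.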
46. Snowmobile: Migrate 100PB in weeks
Use Case
• Mass-migration of large data sets – 10PB to 100PB
Value Proposition
• Costly/impossible to move this much data before
Customer Example: DigitalGlobe (Satellite Imagery)
• 100PB image library = 6 billion square kilometers - 1PB new image every year
• Prior to Snowball/mobile, stored data in their own DC
• Needed elastic compute power to retrieve and analyze images
• Wanted to move data to the cloud, but no feasible solution
• Snowmobile lets DigitalGlobe migrate 60PB of data to the cloud
48. AWS Snowball Edge
Petabyte-scale hybrid device with onboard compute and storage
• 100 TB local storage
• Local compute equivalent to an Amazon EC2 m4.4xlarge instance
• 10GBase-T, 10/25Gb SFP28, and 40Gb QSFP+ copper, and optical networking
• Ruggedized and rack-mountable
RE:INVENT 2016 LAUNCH
49. Snowball Edge key features
S3-compatible endpoint
File interface (NFS)
Clustering
Run AWS Lambda functions
Faster data transfer
Encryption
50. Hybrid capabilities beyond data migration
COLLECTION: Collect data -> Create job -> Copy data -> Moved to S3
MIGRATION: Create job -> Copy data -> Moved to S3
51. Snowball Edge can…
Extend of your data center: Encrypted, secure, and embedded compute
Process data: Write data directly as data is generated
Expedite move: Offers a fast and cost effective way to ensure data can be quickly transferred to and from the cloud
Simplify data transfer: Uses standard and familiar tools for the data transfer process
52. Snowball Edge Use Cases
Offline Staging
Local Tiering and Compute
IoT
Local Transformation
53. Snowball Use Case: Data Collection from Remote/Mobile
Use Case
• Collect/import PBs in remote locations - Ships, Oil Rigs, etc.
Value proposition
• Lower on-prem infrastructure & management cost
• Speeds collection and import vs. HD or WAN alternative
Customer Example: Oregon State University
• Collect and analyze oceanic and coastal images (60TB/week)
• Environmental and ocean ecosystem research
• Prior to Snowball, used many small HDDs – took weeks to months to upload
• $4MM+ in infrastructure investment -- Expensive and inefficient
“Snowball lets us migrate TBs of data in days at a fraction of the cost”
55. Amazon S3 Transfer Acceleration
Uploader -> AWS Edge Location -> S3 Bucket
Optimized Throughput!
Typically 50%–300% faster
Change your endpoint, not your code
59 global edge locations
No firewall exceptions
No client software required
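"Change your endpoint, not your code" means pointing the uploader at the bucket's accelerate hostname instead of the regular one. The sketch below builds both hostnames from a bucket name. It assumes the `bucketname.s3-accelerate.amazonaws.com` endpoint format and the requirement that accelerated buckets have DNS-compliant names without periods; the function names and the bucket name in the usage note are hypothetical.

```python
import re

# Lowercase letters, digits, and hyphens only; 3 to 63 characters; no periods.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")


def standard_endpoint(bucket: str) -> str:
    """Regular virtual-hosted-style hostname for a bucket."""
    return f"{bucket}.s3.amazonaws.com"


def accelerate_endpoint(bucket: str) -> str:
    """Transfer Acceleration hostname for a bucket; rejects names that cannot be accelerated."""
    if not _BUCKET_NAME.match(bucket):
        raise ValueError(f"bucket name {bucket!r} is not usable with Transfer Acceleration")
    return f"{bucket}.s3-accelerate.amazonaws.com"
```

For a bucket named `my-videos`, the application would upload to `my-videos.s3-accelerate.amazonaws.com` rather than `my-videos.s3.amazonaws.com`. With boto3, the same switch is available through the client configuration option `use_accelerate_endpoint`, provided acceleration has been enabled on the bucket.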
56. How fast is S3 transfer acceleration?
[Chart: time (hrs) for a 500 GB upload from these edge locations to a bucket in Singapore, Public Internet vs. S3 Transfer Acceleration: Rio De Janeiro, Warsaw, New York, Atlanta, Madrid, Virginia, Melbourne, Paris, Los Angeles, Seattle, Tokyo, Singapore]
57. Use Case Example: Media Uploads
“We loved how easy it was to get started with S3 transfer acceleration: just a simple endpoint change in our application and done. S3 transfer acceleration reduces the average time it takes for us to ingest videos from our global user base by almost half. This gives our customers the ability to edit and share videos sooner where speed is a critical factor. All this for a fraction of the cost of the solution we evaluated before.”
- Brian Kaiser, CTO
59. What is AWS Storage Gateway?
Works with your existing applications
Secure and durable storage in AWS
Low latency for frequently used data
Scalable and cost-effective on-premises storage - $.01/GB written to AWS + S3/Amazon Glacier storage fees
Service connecting an on-premises software appliance with cloud-based storage
60. Hybrid storage use cases and architectures for AWS Storage Gateway
Enabling cloud workloads
Move data to AWS storage for Big Data, cloud bursting, or migration
Tiered cloud storage
Easily add AWS storage to your on-premises environment
Backup, archive, and disaster recovery
Cost effective storage in AWS with local or cloud restore
62. File gateway
On-premises file storage maintained as objects in Amazon S3
Data stored and retrieved from your S3 buckets
One-to-one mapping from files-to-objects
File metadata stored in object metadata
Bucket access managed by IAM role you own and manage
Use S3 Lifecycle Policies, versioning, or CRR to manage data
[Diagram: Application Server -> NFS v3 / v4.1 -> File Gateway (Customer Premises) -> HTTPS -> S3 Standard, S3 Standard - Infrequent Access, Glacier]
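The one-to-one mapping from files to objects can be pictured as a path-to-key translation. The sketch below is purely illustrative: the function names, the share root, and the metadata key names are hypothetical and are not the gateway's actual implementation or metadata schema.

```python
from pathlib import PurePosixPath


def object_key_for(share_root: str, file_path: str) -> str:
    """Map a file under the NFS share to the key of the S3 object that holds it."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(share_root))
    if str(relative) == ".":
        raise ValueError("the share root itself is not a file")
    return relative.as_posix()


def object_metadata_for(owner_uid: int, group_gid: int, mode: int) -> dict:
    """Carry POSIX file attributes as object metadata (key names are hypothetical)."""
    return {
        "file-owner": str(owner_uid),
        "file-group": str(group_gid),
        "file-permissions": oct(mode),
    }
```

With a share mounted at `/mnt/share`, the file `/mnt/share/reports/2017/q1.csv` would correspond to the object key `reports/2017/q1.csv`, which is why lifecycle policies, versioning, and replication can then manage the data like any other S3 object.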
63. Volume gateway
On-premises volume storage backed by Amazon S3 with EBS snapshots
Block storage in S3 accessed via the volume gateway
Compression of data in-transit and at-rest
Backup on-premises volumes to EBS snapshots
Create on-premises volumes from EBS snapshots
Up to 1PB of total volume storage per gateway
[Diagram: Application Server -> iSCSI -> Volume Gateway (Customer Premises) -> HTTPS -> Storage Gateway bucket in Amazon S3, Amazon EBS snapshots]
64. Volume Gateway Use Case: Tier Storage to the Cloud
Use Case
• Tier data to S3 for high capacity, offsite storage and snapshots/recovery
Value proposition
• Lower storage costs
• Higher durability and better backup
Customer Example: JustGiving
• Data stored in S3 reduces cost of onsite storage
• EBS snapshots used for data protection and business continuance
“Storage Gateway works transparently, in a lights out way, archiving off to a separate AWS account with a simple grandfather-father-son snapshot plan in place”
65. Tape gateway
Virtual tape storage in Amazon S3 and Glacier with VTL management
Virtual tape storage in S3 and Glacier accessed via tape gateway
Data compressed in-transit and at-rest
Up to 1 PB total tape storage per gateway, unlimited archive capacity
Supports leading backup applications:
[Diagram: Backup Server -> iSCSI -> Tape Gateway with MEDIA CHANGER and TAPE DRIVE (Customer Premises) -> HTTPS -> Virtual Tapes stored in Amazon S3, Archived Tapes stored in Amazon Glacier]
66. Storage Partner Solutions
Technology Solutions with AWS Storage Competency
aws.amazon.com/backup-recovery/partner-solutions/
Note: Represents a sample of storage partners
Note: Dell-EMC, IBM and Veritas have solutions and are working towards competency requirements
Primary Storage/Tiering: Solutions that leverage file, block, object, and streamed data formats as an extension to on-premises storage
Backup and Recovery: Solutions that leverage Amazon S3 for durable data backup
Archive: Solutions that leverage Amazon Glacier for durable and cost-effective long-term data backup
BCDR: Solutions that utilize AWS to enable recovery strategies focused on RTO and RPO requirements