5. AWS storage solutions
Block: Amazon EBS, Amazon EC2 Instance Store
File: Amazon EFS
Object: Amazon S3, Amazon Glacier
Data Transfer: AWS Direct Connect, AWS Snowball, ISV Connectors, Amazon Kinesis Firehose, Amazon S3 Transfer Acceleration, AWS Storage Gateway
6. A Concept - the Content Lake
Inspired by the Data Lake (a term coined by James Dixon in 2010)
A single store of all the digital content you create and acquire, in any form or format
• Don't assume any particular resolutions or formats (now or in the future)
• It is up to the consumer (the application consuming the content) to use the appropriate infrastructure for processing
7. Amazon S3 : the Content Lake
• Durable, cost-effective and fast
• Highly scalable front end
– Multi-part uploads (parallel writes)
– Range GETs (parallel reads)
• No need for capacity planning or provisioning
• Use Amazon S3 with on-premises storage in a hybrid model
• Secure
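A minimal sketch of the parallel-read pattern: split an object into byte ranges and fetch each range with its own ranged GET. The range arithmetic is shown in full; the bucket and key names are hypothetical.

```python
def byte_ranges(size, part_size):
    """Split an object of `size` bytes into HTTP Range header values of at most `part_size` bytes."""
    return [
        f"bytes={start}-{min(start + part_size, size) - 1}"
        for start in range(0, size, part_size)
    ]

def parallel_download(bucket, key, size, part_size=8 * 1024 * 1024):
    """Fetch each range concurrently with S3 ranged GETs (requires boto3 and AWS credentials)."""
    import boto3
    from concurrent.futures import ThreadPoolExecutor
    s3 = boto3.client("s3")

    def fetch(rng):
        return s3.get_object(Bucket=bucket, Key=key, Range=rng)["Body"].read()

    with ThreadPoolExecutor(max_workers=8) as pool:
        parts = pool.map(fetch, byte_ranges(size, part_size))
    return b"".join(parts)

# The range math alone, on a 20-byte object with 8-byte parts:
print(byte_ranges(20, 8))  # → ['bytes=0-7', 'bytes=8-15', 'bytes=16-19']
```

The same idea in reverse (multi-part upload) gives the parallel-write path mentioned above.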
8. Object Storage Options
             S3 Standard       S3 Standard - Infrequent Access   Amazon Glacier
Use case     Active data/WIP   Active Archive/Mezzanine          Deep Archive/Retention
Retrieval    Milliseconds      Milliseconds                      3-5 hours
Price        $0.03/GB/mo       $0.0125/GB/mo                     $0.007/GB/mo
10. Storage pricing - pay only for what you use
1 PB raw storage → 800 TB usable storage → 600 TB allocated storage → 400 TB application data
With AWS Cloud Storage, you pay only for the data you actually store (the 400 TB of application data), not the capacity you provision.
11. Storage Tiering - Data Lifecycle
- Transition Standard to Standard-IA
- Transition Standard-IA to Amazon Glacier
- Expiration lifecycle policy
- Versioning support
- Prefix support
[Figure: data access frequency declining over time, from T through T+365 days]
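The tiering rules above map directly onto an S3 lifecycle configuration. A sketch with boto3; the bucket name, prefix, and day thresholds are illustrative, not prescriptive.

```python
# Illustrative lifecycle rule: Standard → Standard-IA at 30 days,
# Standard-IA → Glacier at 90 days, expire at 365 days, scoped to a prefix.
LIFECYCLE = {
    "Rules": [
        {
            "ID": "tier-then-expire",
            "Status": "Enabled",
            "Filter": {"Prefix": "media/"},           # prefix support
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},              # expiration lifecycle policy
        }
    ]
}

def apply_lifecycle(bucket):
    """Attach the rules to a bucket (requires boto3 and AWS credentials)."""
    import boto3
    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=LIFECYCLE
    )
```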
12. Save money on storage
58% savings moving from S3 Standard to S3 Standard-IA ($0.03 → $0.0125/GB/mo)
44% savings moving from S3 Standard-IA to Amazon Glacier ($0.0125 → $0.007/GB/mo)
* Assumes the highest public pricing tier
13. Hydrating the Content Lake
Ways to get data into the massively scalable Amazon S3 front end:
- Amazon S3 multi-part upload over the Internet
- AWS Direct Connect (N x 1G | 10G links)
- AWS Snowball
14. What is Snowball? Petabyte scale data transport
- 80 TB capacity, 10G network interface
- E-ink shipping label
- Ruggedized case ("8.5G impact"), rain and dust resistant
- Tamper-resistant case and electronics
- All data encrypted end-to-end
15. AWS Snowball - Petabyte scale data transport solution
Scale and Speed
• Up to 80 TB capacity per device
• 10 Gbps and 1 Gbps connectivity
• Parallel data transfer enables PBs transferred in a week
Secure
• Tamper-resistant enclosure
• 256-bit encryption with KMS
• Secure data erasure
Simple
• Manage the entire process through the AWS Console
• Lightweight data transfer client
• Notifications
17. How fast is Snowball?
• Less than 1 day to transfer 300 TB with 4x 80 TB Snowballs; less than 1 week including shipping
• Number of days to transfer 300 TB via the Internet at typical utilizations:

Utilization   1 Gbps   500 Mbps   300 Mbps   150 Mbps
25%           95       190        316        632
50%           47       95         158        316
75%           32       63         105        211
18. What does it cost?
Dimension                                     Price
Usage charge per job                          $250.00
Extra-day charge (first 10 days* are free)    $15.00
Data transfer in                              $0.00/GB
Data transfer out                             $0.02/GB
Shipping**                                    Varies
Amazon S3 charges                             Standard storage and request fees apply

* The free period starts one day after the appliance is delivered to you. The day the appliance is received at your site and the day it is shipped out are also free and not counted against the 10-day free usage time.
** Shipping charges are based on your shipment destination and the shipping option (e.g., overnight, 2-day) you choose.

Transfer 1 PB with 13 devices in parallel in 1 week!
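The 1 PB claim is easy to sanity-check: at 80 TB per device you need ceil(1000 / 80) = 13 Snowballs, and the per-job charges alone come to 13 x $250. A back-of-the-envelope calculation (shipping and any extra-day charges excluded):

```python
import math

CAPACITY_TB = 80        # usable capacity per Snowball device
JOB_CHARGE = 250.00     # usage charge per job, in USD

def devices_needed(total_tb):
    """How many Snowballs are needed to hold `total_tb` of data."""
    return math.ceil(total_tb / CAPACITY_TB)

def base_cost(total_tb):
    """Job charges only; data transfer in is $0.00/GB, shipping varies."""
    return devices_needed(total_tb) * JOB_CHARGE

print(devices_needed(1000))  # → 13 devices for 1 PB
print(base_cost(1000))       # → 3250.0
```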
19. Data ingestion into AWS storage services
AWS Import/Export Snowball
• Accelerate PBs with AWS-provided appliances
• NEW 80 TB model
AWS Storage Gateway
• Instant hybrid cloud
• Up to 120 MB/s cloud upload rate (4x improvement)
Amazon Kinesis Firehose
• Ingest data streams directly into AWS data stores
AWS Direct Connect
• COLO to AWS
ISV Connectors
• CommVault
• VERITAS
• Dalet, Vidispine, etc.
NEW: S3 Transfer Acceleration
• Accelerate object transfer up to 300% faster using AWS's private network
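On the client side, opting into Transfer Acceleration is a one-line config change in boto3, sketched below; the bucket name is hypothetical, and the bucket itself must have acceleration enabled.

```python
# Route S3 requests through the s3-accelerate endpoint.
ACCELERATE = {"use_accelerate_endpoint": True}

def accelerated_client():
    """Build an S3 client that uses the Transfer Acceleration endpoint
    (requires boto3; the target bucket must have acceleration enabled)."""
    import boto3
    from botocore.config import Config
    return boto3.client("s3", config=Config(s3=ACCELERATE))

# Usage (sketch, with a hypothetical bucket):
# accelerated_client().upload_file("movie.mxf", "my-media-bucket", "ingest/movie.mxf")
```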
20. Media Archive and Metadata (cloud transition)
Corporate data center:
- Onsite Archive with HSM (on-premise tape)
- Offsite Tape Archive
- Metadata (Asset Manager)
- Processing Tasks
21. Media Archive (transition to the cloud)
Corporate data center:
- Onsite Archive with HSM (on-premise tape), Offsite Tape Archive
- Metadata (Asset Manager)
- Processing Tasks
AWS Region (via AWS Direct Connect):
- Amazon Glacier
- Cloud DAM (syncing metadata from on-prem)
22. Media Archive (transition to the cloud)
Corporate data center:
- Onsite Archive with HSM (on-premise tape), Offsite Tape Archive
- Metadata (Asset Manager)
- Processing Tasks
AWS Region (via AWS Direct Connect):
- Amazon Glacier and Amazon S3
- Cloud MAM (syncing metadata from on-prem)
- Cloud-based processing tasks
23. Media Archive (transition to the cloud)
Corporate data center:
- Onsite Archive with HSM (on-premise tape) plus an onsite cache, Offsite Tape Archive
- Metadata (Asset Manager)
- Processing Tasks
AWS Region (via AWS Direct Connect):
- Amazon Glacier and Amazon S3
- Cloud DAM (syncing metadata from on-prem)
- Cloud-based processing tasks
25. Reference Architecture – Content Processing Pipeline (Using Lambda)
1. Ingest content via the S3 multi-part API.
2. An S3 notification triggers a Lambda function to start a transcoding job on Amazon Elastic Transcoder.
3. S3 serves as backend storage for the content files, accessible to other processing tasks.
4. A second S3 notification triggers a Lambda function to generate a signed URL to share the file and to update the CMS or metadata store.
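A sketch of the two Lambda handlers in this pipeline. The pipeline ID, preset ID, and key names are hypothetical placeholders, and the event parsing assumes the standard S3 notification event shape.

```python
def object_from_event(event):
    """Pull the bucket and key out of an S3 notification event."""
    record = event["Records"][0]["s3"]
    return record["bucket"]["name"], record["object"]["key"]

def start_transcode(event, context):
    """Triggered on ingest: start an Elastic Transcoder job
    (requires boto3; IDs below are placeholders)."""
    import boto3
    bucket, key = object_from_event(event)
    boto3.client("elastictranscoder").create_job(
        PipelineId="1111111111111-abcde1",  # hypothetical pipeline ID
        Input={"Key": key},
        Output={"Key": f"transcoded/{key}", "PresetId": "1351620000001-000010"},
    )

def share_file(event, context):
    """Triggered when the transcoded output lands: return a time-limited signed URL."""
    import boto3
    bucket, key = object_from_event(event)
    return boto3.client("s3").generate_presigned_url(
        "get_object", Params={"Bucket": bucket, "Key": key}, ExpiresIn=3600
    )
```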
27. Q&A
Learn more at: http://aws.amazon.com/s3/
http://aws.amazon.com/glacier/
http://aws.amazon.com/importexport/
henryz@amazon.com
28. Media Solution: Sony DADC
On-demand cloud-based media supply chain and delivery solution
Problem Statement:
• Challenged by on-prem legacy infrastructure.
• Provide a performant, secure and economical media distribution solution.
• Decrease time to market for their customers' finished content.
Use of AWS:
• EC2 for content processing, with SWF, SQS and SNS for media workflow automation
• S3 for storage, Glacier for content archive
• CloudFront for OTT delivery
Business Benefits:
• Workflow pipelines can be run in a highly parallelized fashion through AWS elastic scalability.
• Significantly shortened content delivery SLA, with a new AWS-enabled target of 1 hour.
• Fully migrating away from on-prem infrastructure.
30. S3 versioning
Preserve, retrieve, and restore every version of every object stored in your bucket.
S3 automatically adds new versions and preserves deleted objects with delete markers.
Easily control the number of versions kept by using lifecycle expiration policies.
Easy to turn on in the AWS Management Console.
[Figure: with versioning enabled, a PUT of Key = photo.gif stores a new version (ID = 121212) alongside the existing one (ID = 111111)]
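Versioning can also be turned on through the API; a sketch with boto3, where the bucket name is supplied by the caller:

```python
# Request body for enabling versioning on a bucket.
VERSIONING = {"Status": "Enabled"}   # "Suspended" pauses versioning instead

def enable_versioning(bucket):
    """Turn on versioning (requires boto3 and AWS credentials).
    Afterwards, each PUT of an existing key adds a new version, and a
    DELETE inserts a delete marker instead of removing the data."""
    import boto3
    boto3.client("s3").put_bucket_versioning(
        Bucket=bucket, VersioningConfiguration=VERSIONING
    )
```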
31. Amazon S3 event notifications
Delivers notifications to Amazon SNS, Amazon SQS, or AWS Lambda when events occur in Amazon S3.
• Support for notification when objects are created via Put, Post, Copy, or Multipart Upload.
• Support for notification when objects are deleted, as well as filtering on prefixes and suffixes for all types of notifications.
[Figure: S3 events fan out as notifications to an SNS topic, an SQS queue, or a Lambda function]
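A sketch of a notification configuration wiring object-created events under a prefix/suffix filter to a Lambda function; the function ARN, account ID, and filter values are hypothetical.

```python
# Notify a Lambda function when .mxf files land under media/ .
NOTIFICATION = {
    "LambdaFunctionConfigurations": [
        {
            "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:on-ingest",
            "Events": ["s3:ObjectCreated:*"],   # covers Put, Post, Copy, Multipart Upload
            "Filter": {
                "Key": {
                    "FilterRules": [
                        {"Name": "prefix", "Value": "media/"},
                        {"Name": "suffix", "Value": ".mxf"},
                    ]
                }
            },
        }
    ]
}

def configure(bucket):
    """Attach the configuration to a bucket (requires boto3 and AWS credentials)."""
    import boto3
    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration=NOTIFICATION
    )
```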
32. Elastic File System - Rendering in the Cloud
• Designed to support petabyte-scale file systems
• Throughput scales linearly with storage
• Same latency spec across each AZ
• Thousands of concurrent NFS connections
• Works great for large I/O sizes
• Pay only for what you use, not what you provision
• Managed, with multi-copy durability
33. Securing your content on AWS
• MPAA alignment – AWS meets the latest content security guidelines (Aug 2015)
• VPC private endpoint for Amazon S3 – enables a true private workflow capability
• Encryption and key management capabilities