This document provides an overview of AWS data transfer services for ingesting data into AWS cloud storage. It describes Internet/VPN ingest, which uses existing internet connectivity to put data directly into S3 buckets. For accelerating data transfer, it recommends S3 transfer acceleration, CloudFront, or AWS Direct Connect, depending on needs. AWS Snowball is described as a way to transfer petabyte-scale data by shipping physical storage devices. Storage Gateway allows on-premises data to be cached locally with backups stored in AWS. Kinesis Firehose can continuously load streaming data into S3, Redshift, or Elasticsearch.
6. What is Internet/VPN?
Globally available
Default method of ingesting content into Amazon S3
Simple standards-based (HTTP) connection
Use your existing internet connection
Available in a VPC for VPN connectivity
Acceleration through multipart upload
Data transfer into AWS is free
VPN connections using VPC virtual private gateway
• $0.05 per VPN connection-hour
• $0.048 per VPN connection-hour for connections to the Tokyo region
7. How does Internet/VPN ingest work?
Ingest data directly into S3 buckets with existing internet connectivity and through console or API endpoints
Accelerate data transfer using multipart upload
[Diagram: two paths into an S3 bucket in an AWS Region. Internet: direct to the S3 endpoints. Internet through VPN + VPC: customer gateway, VPN connection, VPC.]
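The multipart upload mentioned above splits one object into parts that can be sent independently. The following is a minimal sketch of how a client might plan those parts, not the way any particular SDK does it. The function name `plan_parts` and the 8 MiB default part size are illustrative assumptions; the 5 MiB minimum part size and the 10,000-part limit are the documented S3 limits.

```python
import math

MIN_PART_SIZE = 5 * 1024**2   # S3 minimum part size (5 MiB); the last part may be smaller
MAX_PARTS = 10_000            # S3 maximum number of parts in one multipart upload


def plan_parts(object_size, part_size=8 * 1024**2):
    """Return (part_number, offset, length) tuples that cover the whole object.

    part_size is a hypothetical default chosen for illustration.
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size when the requested one would need more than MAX_PARTS parts.
    part_size = max(part_size, MIN_PART_SIZE, math.ceil(object_size / MAX_PARTS))
    parts = []
    offset = 0
    number = 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts
```

Each tuple describes one byte range that could be uploaded on its own connection, which is where the acceleration comes from.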
9. What is Amazon S3 transfer acceleration?
Network- and protocol-based data transfer service
Acceleration of data ingress/egress with S3 buckets
Typically 50% to 300% faster
Feature of S3 enabled at the bucket level
Available in all S3 regions worldwide
No client/server software required
No code changes to your application
No firewall exceptions
Simple pricing model
10. Ingest & Egress with S3 transfer acceleration
[Diagram: Uploader -> AWS edge location -> S3 bucket, with optimized throughput]
Uses 55 AWS global edge locations
AWS determines best edge location
Data transfer optimized between edge and customer, and edge and S3
Data is not stored on the edge cache
11. Service traffic flow: Client to S3 bucket example
[Diagram, flow numbered 1 to 4: the customer client resolves b1.s3-accelerate.amazonaws.com through Amazon Route 53, then sends an HTTP/S PUT/POST of “upload_files.zip” to an AWS edge location; the upload continues as an HTTPS PUT/POST to an EC2 proxy in the AWS region and on to the S3 bucket b1.s3-accelerate.amazonaws.com]
Fully managed file transfer acceleration using all AWS edge locations
Data is not cached on the AWS edge location
12. Using the service is as easy as 1, 2, 3…
1. Enable the service in the AWS Management Console
• Or use permissions through API: s3:PutAccelerateConfiguration
2. Update application to point to new S3 URL
• Update “bucket.s3.amazonaws.com” to “<bucket-name>.s3-accelerate.amazonaws.com”
• Original bucket location and contents are the same, only namespace changes
3. Start uploading data to Amazon S3
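The URL change described on this slide can be sketched as a small helper. This is an illustration under stated assumptions, not an AWS-provided function: the name `to_accelerate_url` is hypothetical, and the helper only handles the virtual-hosted form `<bucket-name>.s3.amazonaws.com` shown on the slide, leaving any other URL unchanged.

```python
from urllib.parse import urlsplit, urlunsplit

STANDARD_SUFFIX = ".s3.amazonaws.com"
ACCELERATE_SUFFIX = ".s3-accelerate.amazonaws.com"


def to_accelerate_url(url):
    """Rewrite <bucket>.s3.amazonaws.com to <bucket>.s3-accelerate.amazonaws.com."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith(STANDARD_SUFFIX):
        return url  # not a virtual-hosted S3 URL of the form on the slide
    bucket = host[: -len(STANDARD_SUFFIX)]
    netloc = bucket + ACCELERATE_SUFFIX
    if parts.port:
        netloc += f":{parts.port}"
    # Path, query string, and fragment are carried over unchanged.
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Only the host name changes, which matches the slide's point that the bucket location and contents stay the same.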
13. How fast is S3 transfer acceleration?
[Chart: 500 GB upload from these edge locations to a bucket in Singapore, public internet compared with S3 transfer acceleration. Vertical axis: Time [hrs]. Locations: Rio de Janeiro, Warsaw, New York, Atlanta, Madrid, Virginia, Melbourne, Paris, Los Angeles, Seattle, Tokyo, Singapore.]
15. Pricing*
Dimension                                 Price/GB
Data transfer in from Internet**          $0.04 (Edge location in US, EU, JP)
                                          $0.08 (Edge location in rest of the world)
Data transfer out to Internet             $0.04
Data transfer out to another AWS region   $0.04
Amazon S3 charges                         Standard data transfer charges apply
*Plus standard Amazon S3 data transfer charges apply
**Accelerated performance or there is no bandwidth charge
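As a worked example of the per-GB rates on this slide, the sketch below estimates the acceleration fee for an upload. The function name `acceleration_ingest_fee` is hypothetical, the rates are copied from the slide and may be out of date, and standard Amazon S3 charges are not included.

```python
from decimal import Decimal

# Per-GB rates for data transfer in from the Internet, as listed on the slide.
RATE_IN_US_EU_JP = Decimal("0.04")
RATE_IN_REST_OF_WORLD = Decimal("0.08")


def acceleration_ingest_fee(gigabytes, edge_in_us_eu_jp=True):
    """Return the acceleration fee in USD for uploading `gigabytes` GB."""
    rate = RATE_IN_US_EU_JP if edge_in_us_eu_jp else RATE_IN_REST_OF_WORLD
    return Decimal(str(gigabytes)) * rate
```

For a 500 GB upload, this gives $20 through an edge location in the US, EU, or JP and $40 through an edge location elsewhere.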
17. What is Amazon CloudFront?
Global content delivery network
55 edge locations worldwide
Supports ingest through PUT/POST methods
Works with S3 multipart upload
Supports SSL SNI and TLS connections
Integrated with ACM and AWS WAF for additional security
Proxy ingest to S3, EC2 and even your own origins
Tiered and custom pricing models
18. Using CloudFront to ingest data into AWS
[Diagram: the customer client sends an HTTP/S PUT/POST of “upload_files.zip” to a CloudFront edge location, which forwards it to Amazon EC2, an S3 bucket, or ELB in the AWS region, or to a custom origin]
Ingest content into S3, EC2, ELB, or your own custom origin with Amazon CloudFront
Use cache behaviors to direct to the correct origin based on PATH pattern matching
Restrict access through geo restriction or AWS WAF Web ACL
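The path-pattern routing described above can be pictured with a short sketch. It is not CloudFront's implementation: Python's `fnmatchcase` stands in for CloudFront's own pattern matching, which it only approximates, and the patterns and origin names in `BEHAVIORS` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical cache behaviors as (path pattern, origin name), checked in order.
# The final "*" entry plays the role of the default behavior.
BEHAVIORS = [
    ("/uploads/*", "s3-bucket-origin"),
    ("/api/*", "elb-origin"),
    ("*", "custom-origin"),
]


def pick_origin(path, behaviors=BEHAVIORS):
    """Return the origin of the first behavior whose pattern matches the path."""
    for pattern, origin in behaviors:
        if fnmatchcase(path, pattern):
            return origin
    raise LookupError(f"no behavior matches {path!r}")
```

Because the first matching pattern wins, more specific patterns belong ahead of the catch-all entry.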
19. Amazon CloudFront pricing
Data transfer out of Amazon CloudFront to your origin server is billed at the rates listed in the Regional Data Transfer Out to Origin (per GB) table.
Data transfer out of Amazon CloudFront to the Internet is charged at the rates listed in the Regional Data Transfer Out to Internet (per GB) table.
Amazon CloudFront offers additional pricing options through a CloudFront Reserved Capacity (CFRC) contract. Contact sales for additional details and pricing.
21. What is AWS Direct Connect?
Dedicated, 1 or 10 GE private pipes into AWS
Create private (VPC) or public virtual interfaces to AWS
Reduced data-out rates (data-in still free)
Consistent network performance
At least 1 location to each AWS region
Option for redundant connections
Uses BGP to exchange routing information over a VLAN
22. Physical connection
• Cross-connect at the location
• Single-mode optical fiber
- 1000Base-LX or 10GBASE-LR
• Potential onward delivery through Direct Connect partner
• Customer router
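23. At the Direct Connect location
[Diagram: the customer network (CORP) reaches a customer router in the colocation at the DX location; a cross-connect links the customer router to the AWS Direct Connect routers, which connect to the AWS backbone network. A demarcation line marks the edge of the customer's network.]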
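24. Dedicated port through Direct Connect partner
[Diagram: the customer router (CORP) reaches partner equipment in the colocation at the DX location over an access circuit on the partner network; a cross-connect links the partner equipment to the AWS Direct Connect routers, which connect to the AWS backbone network. A demarcation line is marked on the partner network side.]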
25. Direct Connect Locations
AWS Region                        AWS Direct Connect Location
Asia Pacific (Singapore)          Equinix SG2, GPX, Mumbai
Asia Pacific (Seoul)              KINX, Seoul
Asia Pacific (Sydney)             Equinix SY3, Global Switch
Asia Pacific (Tokyo)              Equinix OS1, Equinix TY2
China (Beijing)                   Sinnet JiuXianqiao IDC, CIDS Jiachuang IDC
EU (Frankfurt)                    Equinix FR5, Interxion Frankfurt
EU (Ireland)                      TelecityGroup, London Docklands, Eircom Clonshaugh,
                                  Equinix LD4 - LD6, London
South America (Sao Paulo)         Terremark NAP do Brasil, Tivit
US East (Virginia)                CoreSite NY1 & NY2, Equinix DC1 - DC6 & DC10
US West (Northern California)     CoreSite One Wilshire & 900 North Alameda, CA,
                                  Equinix SV1 & SV5
US West (Oregon)                  Equinix SE2 & SE3, Switch SUPERNAP, Las Vegas
AWS GovCloud (US)                 Equinix SV1 & SV5
27. Amazon Kinesis: Streaming data made easy
Services make it easy to capture, deliver, and process streams on AWS
Amazon Kinesis Streams
• For technical developers
• Build your own custom applications that process or analyze streaming data
Amazon Kinesis Firehose
• For all developers, data scientists
• Easily load massive volumes of streaming data into Amazon S3, Amazon Redshift and Amazon Elasticsearch Service
Amazon Kinesis Analytics
• For all developers, data scientists
• Easily analyze data streams using standard SQL queries
• Coming soon
28. Amazon Kinesis Firehose
Load massive volumes of streaming data into Amazon S3, Amazon Redshift and Amazon Elasticsearch Service
Zero administration: Capture and deliver streaming data into Amazon S3, Amazon Redshift, and other destinations without writing an application or managing infrastructure.
Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into data destinations in as little as 60 seconds using simple configurations.
Seamless elasticity: Seamlessly scales to match data throughput without intervention.
[Diagram: capture and submit streaming data to Firehose -> Firehose loads streaming data continuously into S3, Amazon Redshift and Amazon Elasticsearch -> analyze streaming data using your favorite BI tools]
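The batch-and-compress step described on this slide can be illustrated with a small buffer that flushes on size or age. This is a conceptual sketch only, not how Firehose is implemented: the class `BatchBuffer`, its parameters, and the newline-joined record format are assumptions, encryption is left out, and the age check runs only when a new record arrives.

```python
import gzip
import time


class BatchBuffer:
    """Collects records and delivers them as one gzip-compressed batch."""

    def __init__(self, deliver, max_bytes=1024 * 1024, max_age_seconds=60,
                 clock=time.monotonic):
        self.deliver = deliver                  # callable that receives the compressed bytes
        self.max_bytes = max_bytes              # flush once this many bytes are buffered
        self.max_age_seconds = max_age_seconds  # flush once the oldest record is this old
        self.clock = clock                      # injectable clock, useful for testing
        self.records = []
        self.size = 0
        self.started = None

    def put(self, record: bytes):
        if self.started is None:
            self.started = self.clock()
        self.records.append(record)
        self.size += len(record)
        too_big = self.size >= self.max_bytes
        too_old = self.clock() - self.started >= self.max_age_seconds
        if too_big or too_old:
            self.flush()

    def flush(self):
        if not self.records:
            return
        # Join records with newlines and compress the whole batch at once.
        self.deliver(gzip.compress(b"\n".join(self.records)))
        self.records, self.size, self.started = [], 0, None
```

Passing a function that writes to a destination as `deliver` gives one compressed object per batch, which is the shape of output the slide describes.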
29. Amazon Kinesis Firehose use cases
Vertical/use case              Accelerated ingest-load to final destination for analytics
Ad tech/marketing analytics    Advertising data aggregation
Consumer online/gaming         Online customer engagement data aggregation
Financial services             Market/financial transaction order data collection
IoT/sensor data                Fitness device, vehicle sensor, telemetry data ingestion
31. What is AWS Storage Gateway?
Service connecting an on-premises software appliance with cloud-based storage
Works with your existing applications
Secure and durable storage in AWS
Low latency for frequently used data
Scalable and cost-effective on-premises storage: $125 per gateway per month + S3/Amazon Glacier storage fees
32. Common uses for AWS Storage Gateway
Backup and archive
Disaster recovery
Data migration
33. How does AWS Storage Gateway work?
[Diagram: on the customer premises, an application server connects to the AWS Storage Gateway appliance; the appliance reaches the AWS Storage Gateway back end over the Internet, AWS Direct Connect, or S3 transfer acceleration; the back end stores data in Amazon S3, Amazon Glacier, and Amazon EBS snapshots]
34. AWS Storage Gateway configurations
iSCSI block storage
• Gateway-stored volumes: Low latency for all your data with point-in-time backups to AWS
• Gateway-cached volumes: Low latency for frequently used data with all data stored in AWS
iSCSI virtual tape storage
• Gateway virtual tape library (VTL): Replacement for on-premises physical tape infrastructure for backup and archive
35. Gateway-virtual tape library (VTL)
• Replace or augment your aging tape infrastructure with durable object storage
• Virtual tapes stored in AWS, frequently accessed data cached on-premises
• Up to 1,500 tapes, up to 2.5 TB each, for up to 150 TB per gateway-VTL
• Unlimited number of tapes in virtual tape shelf (VTS)
[Diagram: in the customer data center, a backup server (initiator) connects to the AWS Storage Gateway VM, which presents a media changer and tape drive and holds an upload buffer and cache storage; the VM connects to the AWS Storage Gateway service, where gateway-VTL storage is backed by Amazon S3 and VTS storage is backed by Amazon Glacier]
37. What is AWS Snowball?
Petabyte-scale data transport
• E-ink shipping label
• Ruggedized case, 8.5G impact
• All data encrypted end-to-end
• Rain- and dust-resistant
• Tamper-resistant case and electronics
• 80 TB
• 10 GE network
39. How fast is Snowball?
• Less than 1 day to transfer 200 TB through 3 x 10 GE connections with 3 Snowballs, less than 1 week, including shipping
• Number of days to transfer 200 TB through the Internet at typical utilizations
               Internet connection speed
Utilization    1 Gbps    500 Mbps    300 Mbps    150 Mbps
25%            71        141         236         471
50%            36        71          118         236
75%            24        47          79          157
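The figures on this slide follow from dividing the data volume by the usable bandwidth. The sketch below shows that arithmetic under stated assumptions: decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and no protocol overhead. The function name `transfer_days` is hypothetical. The slide does not state its own assumptions, so the results land close to the table rather than matching it exactly.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(terabytes, link_mbps, utilization):
    """Days needed to move `terabytes` over a link at the given utilization (0 to 1)."""
    bits = terabytes * 1e12 * 8                    # decimal terabytes to bits
    effective_bps = link_mbps * 1e6 * utilization  # usable bits per second
    return bits / effective_bps / SECONDS_PER_DAY
```

For example, `transfer_days(200, 1000, 0.25)` is about 74 days, near the 71 days listed for 1 Gbps at 25% utilization, while `transfer_days(200, 30_000, 1.0)` is about 0.6 days, consistent with the claim of less than 1 day over three 10 GE connections.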
40. Use cases: AWS Snowball
• Cloud migration
• Disaster recovery
• Data center decommissioning
• Content distribution
41. Pricing
Dimension                                     Price
Usage charge per job                          $200.00 (50 TB)
                                              $250.00 (80 TB)
Extra day charge (first 10 days* are free)    $15.00
Data transfer in                              $0.00/GB
Data transfer out                             $0.03/GB
Shipping**                                    Varies
Amazon S3 charges                             Standard storage and request fees apply
* Starts one day after the appliance is delivered to you. The first day the appliance is received at your site and the last day the appliance is shipped are also free and not included in the 10-day free usage time.
** Shipping charges are based on your shipment destination and the shipping option (e.g., overnight, 2-day) you choose.
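To make the pricing table concrete, the sketch below estimates the cost of one Snowball job. It is an illustration only: the function name `snowball_job_cost` is hypothetical, `onsite_days` is assumed to already exclude the free receiving and shipping days from the footnote, the prices are copied from the slide and may be out of date, and shipping and Amazon S3 charges are left out.

```python
from decimal import Decimal

# Prices as listed on the slide.
JOB_FEE = {50: Decimal("200.00"), 80: Decimal("250.00")}  # keyed by appliance size in TB
EXTRA_DAY_FEE = Decimal("15.00")
FREE_DAYS = 10
TRANSFER_OUT_PER_GB = Decimal("0.03")


def snowball_job_cost(capacity_tb, onsite_days, gb_out=0):
    """Return the job cost in USD, excluding shipping and Amazon S3 charges."""
    extra_days = max(0, onsite_days - FREE_DAYS)
    return (JOB_FEE[capacity_tb]
            + extra_days * EXTRA_DAY_FEE
            + Decimal(str(gb_out)) * TRANSFER_OUT_PER_GB)
```

An 80 TB appliance kept on site for 12 days would come to $250 plus two extra days at $15, or $280, before shipping.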
43. Amazon storage partner ecosystem
• Gateway/NAS
• Data management
• Sync and share
• Backup/DR
• Content and acceleration
• Archive
• File System
44. Backup to AWS approaches
[Diagram: two approaches for backing up to Amazon S3, Amazon S3-IA, and Amazon Glacier over the Internet or AWS Direct Connect.
Cloud gateway: application servers -> media server with local disk -> cloud gateway.
Backup SW cloud connector: application servers -> media server with cloud connector and local disk.]
45. CommVault ties together on-premises and cloud-data strategies
Commvault orchestrates the enterprise
• Back up in the cloud: Keep backups of cloud workloads internal to the cloud
• Back up to the cloud: Allow on-premises workloads to leverage AWS
• Disaster recovery to the cloud: Automate disaster recovery to the cloud on a scheduled basis
• Workload portability: Rest assured that virtual servers can be moved from on-premises to the cloud and back, keeping your data available wherever you need it
• Archiving to the cloud: Move legacy data to tier 2 storage in the cloud for long-term archive
Together, AWS and Commvault minimize networking, storage, and infrastructure costs while providing the business a sound data protection and disaster-recovery strategy.
47. NetApp AltaVault backup from on-premises to S3/Amazon Glacier
Solve backup and archive headaches with cloud-integrated storage
• 90% reduction in time, cost, and data volumes
• Shrink recovery times from days to minutes
• 85% of backup and software providers supported
• Seamlessly integrates into existing storage and backup software environment
• Caches recent backups locally, vaults older copies to the cloud
• Deduplicates, compresses, and encrypts
• Store data in the public or private cloud of choice
• AltaVault also available on Marketplace to protect cloud-native workloads
Common backup applications integrated with AltaVault: NetApp SnapProtect, Arcserve, CommVault Simpana, EMC NetWorker, HP Data Protector, IBM Tivoli Storage Manager, Symantec Backup Exec, Symantec NetBackup, Veeam, Microsoft SQL Server, Oracle RMAN
[Diagram: on-premises storage (FAS, E-series, non-NetApp storage) and backup applications send data to the NetApp AltaVault cloud-integrated storage appliance, which stores it in S3 and Amazon Glacier in AWS]
49. Summary: When to use each service
IF YOU NEED:                                                        CONSIDER:
An optimized or replacement Internet connection to:
• connect directly into an AWS regional data center                 Direct Connect
• migrate TB or PB of data to the cloud                             Snowball
• accelerate data transfer                                          S3 transfer acceleration, CloudFront, AWS partner
A friendly interface into S3 to:
• cache data locally in a hybrid model (for performance reasons)    Storage Gateway, AWS partner
• redirect backups or archives with minimal disruption              Storage Gateway, AWS partner
• aggregate data streams from multiple devices                      Kinesis Firehose