2. AWS Government, Education, and Nonprofit Symposium
Washington, DC | June 25-26, 2015
Session agenda
• Context: on-premises disaster recovery (DR) using AWS
• Why AWS for recovery of on-premises IT infrastructure
• The ascending levels of DR
• DR/continuity scenarios
• Demo
• Q&A
3. Terminology
Business Continuity
Business continuity ensures that an
organization's critical business functions
continue to operate or recover quickly
despite serious incidents.
Disaster Recovery
Disaster recovery (DR) enables the
recovery or continuation of vital technology
infrastructure and systems following a
natural or human-induced disaster.
Recovery Time Objective
RTO is the targeted duration within which a
business process must be restored after a
disaster or disruption.
Recovery Point Objective
RPO is the maximum targeted period in
which data might be lost from an IT
service due to a major incident.
4. Understanding RTO and RPO
[Figure: timeline centered on the disaster. RPO covers the transactions lost before the disaster; RTO covers the down time after it.]
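The relationship between the two objectives and an actual incident can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the function names and the sample timestamps are hypothetical, and it assumes the last usable recovery point, the disaster, and the moment service is restored are all known.

```python
from datetime import datetime, timedelta


def achieved_rpo_rto(last_recovery_point, disaster, service_restored):
    """Return (data-loss window, downtime) for a single incident."""
    if not last_recovery_point <= disaster <= service_restored:
        raise ValueError("expected last_recovery_point <= disaster <= service_restored")
    data_loss = disaster - last_recovery_point   # compared against the RPO
    downtime = service_restored - disaster       # compared against the RTO
    return data_loss, downtime


def meets_objectives(data_loss, downtime, rpo, rto):
    """True when both the data-loss window and the downtime are within target."""
    return data_loss <= rpo and downtime <= rto


# Hypothetical incident: nightly backup at 02:00, disaster at 14:30, service back at 18:00.
last_backup = datetime(2015, 6, 25, 2, 0)
disaster = datetime(2015, 6, 25, 14, 30)
restored = datetime(2015, 6, 25, 18, 0)

data_loss, downtime = achieved_rpo_rto(last_backup, disaster, restored)
print(data_loss, downtime)  # 12:30:00 3:30:00
print(meets_objectives(data_loss, downtime, timedelta(hours=24), timedelta(hours=24)))
print(meets_objectives(data_loss, downtime, timedelta(hours=12), timedelta(hours=4)))
```

In this made-up incident the 24-hour targets are met, while a 12-hour RPO is missed because 12.5 hours of transactions were lost.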
5. Plan for various types of disasters
6. History of DR
Traditional DR has posed many challenges for
enterprises:
• Building and maintaining regional data centers
• Failed DR tests
• Not meeting RPO & RTO
• High technical debt
7. AWS compared to traditional disaster recovery
Conventional
• High cost to build disaster recovery
sites or data centers (CAPEX)
• High cost of storage, backup,
archival and retrieval tools, and
processes (OPEX)
• Difficult planning, procurement, and
deployment
• Challenging to verify DR plans
• Single level of DR across the
organization
AWS
• Low upfront investment
(CAPEX)
• On-demand costs (OPEX)
• Consistent experience across AWS
environments
• Recovery automation
• Separate levels of DR per
application or business unit
8. DR topology map
Your data centers → AWS
• DNS → Route 53
• Load balancers → ELB/appliance
• Web/app servers → EC2/Auto Scaling
• Database servers → DB failover nodes
• AD/authentication → AD failover nodes
• Data centers → Availability Zones
• Disaster recovery → Multi-region
9. Ascending levels of DR options
Backup & Restore
• Backup of on-premises data to AWS to use in a DR event
• RPO: 24 hours
• RTO: 24 hours
• Cost: $
Pilot Light
• Replicate data and minimal running services into AWS, ready to take over and flare up
• RPO: 12 hours
• RTO: 4 hours
• Cost: $$
Warm Standby
• Replicate data and services into AWS, ready to take over
• RPO: 1-4 hours
• RTO: 15 min
• Cost: $$$
Multi-Site
• Replicated and load-balanced environments that are both actively taking production traffic
• RPO: <15 min
• RTO: 0-5 min
• Cost: $$$$
Continuity scale across the levels: business continuity begins → uninterrupted business continuity
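One way to use these figures is to pick the cheapest level whose RPO and RTO both fit a workload's requirements. The sketch below assumes the upper end of each range on the slide is the bound (for example, 4 hours for "1-4 hours" and 15 minutes for "<15 min"); the `LEVELS` table and `cheapest_level` function are illustrative names, not anything from the presentation.

```python
from datetime import timedelta

# (level, RPO bound, RTO bound, relative cost in "$" signs), taken from the slide.
# Assumption: the upper end of each stated range is used as the bound.
LEVELS = [
    ("Backup & Restore", timedelta(hours=24), timedelta(hours=24), 1),
    ("Pilot Light", timedelta(hours=12), timedelta(hours=4), 2),
    ("Warm Standby", timedelta(hours=4), timedelta(minutes=15), 3),
    ("Multi-Site", timedelta(minutes=15), timedelta(minutes=5), 4),
]


def cheapest_level(required_rpo, required_rto):
    """Return the lowest-cost level that satisfies both targets, or None."""
    for name, rpo, rto, _cost in sorted(LEVELS, key=lambda level: level[3]):
        if rpo <= required_rpo and rto <= required_rto:
            return name
    return None


print(cheapest_level(timedelta(hours=24), timedelta(hours=24)))
print(cheapest_level(timedelta(hours=12), timedelta(hours=1)))
print(cheapest_level(timedelta(minutes=1), timedelta(minutes=1)))
```

Because the slide gives indicative figures rather than guarantees, a real selection would also weigh cost, testing effort, and per-application criticality.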
10. Services added through the levels of DR
Levels: Backup & Restore, Pilot Light, Warm Standby, Multi-Site
• Storage: S3, Storage Gateway, Amazon Glacier, EBS volumes
• Networking: Route 53, Direct Connect, VPN, VPC, multiple Direct Connect connections
• Compute: EC2, ELB, Auto Scaling
• Deployment/Management: CloudFormation, IAM
11. Backup and restore architecture
~$200 / Month in US-EAST + VPN
[Diagram: Internet traffic for www.example.com goes to the active on-premises production environment in the corporate data center (web servers, app servers, DB server, 1 TB data volume, backup system, Storage Gateway over iSCSI). A VPN connection links it to the AWS region (AWS DR failover), which holds the S3 bucket and Glacier archive.]
Monthly cost breakdown:
• S3 (1TB): $31/Month (shown twice on the slide)
• Amazon Glacier (2TB): $22/Month
• Storage Gateway: $125/Month
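The slide pairs an S3 bucket with a Glacier archive but does not say how backups move from one to the other. One common option, assumed here purely for illustration, is an S3 lifecycle rule. The sketch below only builds the rule as a plain dictionary in the shape accepted by boto3's `put_bucket_lifecycle_configuration`; the `backups/` prefix, the 30-day threshold, and the function name are hypothetical.

```python
import json


def glacier_lifecycle_rule(prefix, days_in_s3):
    """Build a lifecycle configuration that transitions objects under
    `prefix` to the GLACIER storage class after `days_in_s3` days."""
    if days_in_s3 < 0:
        raise ValueError("days_in_s3 must be non-negative")
    rule_name = prefix.strip("/") or "all"
    return {
        "Rules": [
            {
                "ID": f"archive-{rule_name}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_in_s3, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


# Hypothetical policy: keep backups in S3 for 30 days, then archive them.
config = glacier_lifecycle_rule("backups/", 30)
print(json.dumps(config, indent=2))
```

Applying the rule would require an S3 client and a real bucket name, which are left out so the example runs without AWS credentials.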
12. Backup and restore details
• Suitable for:
– Solutions that can sustain higher technical debt
– Less business-critical workloads
– Low-cost DR option
• Leverage existing investments in
– De-duplication
– Compression
– WAN acceleration
13. Partner backup to cloud option
• Popular DR storage appliance for storing backup
data on AWS
• De-dupes, encrypts, optimizes
• Customer-managed encryption keys
• Connects to Amazon S3 and Amazon Glacier
• Physical, virtual, or AWS-based appliance
Amazon S3: $0.03 per GB / month
After SteelStore (30:1 storage reduction over 3 years): $0.001 per GB / month, or $1/Terabyte/month
AVAILABLE IN
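The effective price quoted on this slide follows from dividing the S3 list price by the reduction ratio. A short check of that arithmetic, assuming 1 TB = 1000 GB (the function name is illustrative):

```python
def effective_price_per_gb(list_price_per_gb, reduction_ratio):
    """Price per GB of source data after a reduction_ratio:1 storage reduction."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return list_price_per_gb / reduction_ratio


per_gb = effective_price_per_gb(0.03, 30)   # $0.03 per GB / month at 30:1
per_tb = per_gb * 1000                      # assumes 1 TB = 1000 GB
print(f"${per_gb:.3f} per GB / month, ${per_tb:.2f} per TB / month")
```

The result matches the slide's $0.001 per GB and $1 per terabyte per month.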
14. Pilot light architecture
[Diagram: Route 53 serves www.example.com. Corporate data center: on-premises active production with web servers, app servers, DB server, and a 1 TB data volume. AWS region (labeled "AWS Active Production"): DB server and 1 TB data volume. The two sides are linked by Direct Connect, with data replication between the DB servers.]
15. Pilot light architecture
$309 / Month in US-EAST + Direct Connect
[Diagram: Route 53 serves www.example.com. Corporate data center: on-premises active production with web servers, app servers, DB server, and a 1 TB data volume. AWS region (labeled "AWS Active Production"): ELB, web servers, app servers, DB server, and a 1 TB data volume, with CloudFormation shown alongside. The two sides are linked by Direct Connect, with data replication between them.]
Monthly cost breakdown:
• EBS (GP2): $100/Month
• EC2 (m3.xlarge): $205/Month
• EC2 (t2.medium): $0/Month
• ELB (100GB Data): $0/Month
• EC2 (t2.small): $0/Month
• ELB (100GB Data): $0/Month
• R53 (1M Query): $4/Month
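The diagram places Route 53 in front of both sites but does not state which routing policy is used. As a sketch under that caveat, the example below assumes failover routing and builds the change batch in the shape accepted by the Route 53 `ChangeResourceRecordSets` API. The IP addresses come from documentation ranges, the health check ID is a placeholder, and a real AWS-side record would more likely be an alias to the ELB than a plain A record.

```python
def failover_record(name, role, ip_address, health_check_id=None, ttl=60):
    """Build one failover A record; role is "PRIMARY" or "SECONDARY"."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{name}-{role.lower()}",
        "Failover": role,
        "TTL": ttl,
        "ResourceRecords": [{"Value": ip_address}],
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return record


def failover_change_batch(name, on_prem_ip, aws_ip, health_check_id):
    """On-premises is primary (health checked); the AWS pilot light is secondary."""
    return {
        "Comment": "Fail over from on-premises to the AWS DR site",
        "Changes": [
            {"Action": "UPSERT",
             "ResourceRecordSet": failover_record(name, "PRIMARY", on_prem_ip, health_check_id)},
            {"Action": "UPSERT",
             "ResourceRecordSet": failover_record(name, "SECONDARY", aws_ip)},
        ],
    }


# Hypothetical values throughout.
batch = failover_change_batch(
    "www.example.com.", "203.0.113.10", "198.51.100.20", "hypothetical-health-check-id"
)
for change in batch["Changes"]:
    rrset = change["ResourceRecordSet"]
    print(rrset["Failover"], rrset["ResourceRecords"][0]["Value"])
```

With records like these, DNS answers switch to the secondary only when the primary's health check fails, which is why the short TTL matters for the achievable RTO.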
16. Pilot light details
Considerations
Suitable for:
• Solutions that need lower
RTO & RPO
• More business-critical
workloads
• Mid-range cost DR option
3rd Party & Marketplace
• CloudEndure
• Racemi
• Others
17. Warm standby architecture
$410 / Month in us-east-1 + AWS Direct Connect
[Diagram: Route 53 serves www.example.com. Corporate data center: on-premises active production with web servers, app servers, DB server, and a 1 TB data volume. AWS region (labeled "AWS Active Production"): ELB, web servers, app servers, DB server, and a 1 TB data volume, with CloudFormation shown alongside. The two sides are linked by Direct Connect, with data replication between them.]
Monthly cost breakdown:
• EBS (GP2): $100/Month
• EC2 (m3.xlarge): $205/Month
• EC2 (t2.medium): $41/Month
• ELB (100GB Data): $19/Month
• EC2 (t2.small): $22/Month
• ELB (100GB Data): $19/Month
• R53 (1M Query): $4/Month
18. Multi-site architecture
$473 / Month in us-east-1 + AWS Direct Connect
[Diagram: Route 53 serves www.example.com. Corporate data center: on-premises active production with web servers, app servers, DB server, and a 1 TB data volume. AWS region (labeled "AWS Active Production"): ELB, web servers, app servers, DB server, and a 1 TB data volume, with CloudFormation shown alongside. The two sides are linked by Direct Connect, with data replication between them.]
Monthly cost breakdown:
• EBS (GP2): $100/Month
• EC2 (m3.xlarge): $205/Month
• EC2 (t2.medium): $82/Month
• ELB (100GB Data): $19/Month
• EC2 (t2.small): $44/Month
• ELB (100GB Data): $19/Month
• R53 (1M Query): $4/Month
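The line items on the pilot light, warm standby, and multi-site slides can be summed to check the headline totals and to see what each step up costs. The figures below are copied from the slides; the dictionary layout, the "(1)"/"(2)" suffixes that distinguish the two ELB entries, and the function name are illustrative. Direct Connect is excluded, as on the slides.

```python
# Monthly line items in USD, as listed on the three architecture slides.
MONTHLY_COSTS = {
    "Pilot light": {
        "EBS (GP2)": 100, "EC2 (m3.xlarge)": 205, "EC2 (t2.medium)": 0,
        "ELB (1)": 0, "EC2 (t2.small)": 0, "ELB (2)": 0, "R53 (1M Query)": 4,
    },
    "Warm standby": {
        "EBS (GP2)": 100, "EC2 (m3.xlarge)": 205, "EC2 (t2.medium)": 41,
        "ELB (1)": 19, "EC2 (t2.small)": 22, "ELB (2)": 19, "R53 (1M Query)": 4,
    },
    "Multi-site": {
        "EBS (GP2)": 100, "EC2 (m3.xlarge)": 205, "EC2 (t2.medium)": 82,
        "ELB (1)": 19, "EC2 (t2.small)": 44, "ELB (2)": 19, "R53 (1M Query)": 4,
    },
}


def monthly_total(level):
    """Sum the line items for one DR level."""
    return sum(MONTHLY_COSTS[level].values())


totals = {level: monthly_total(level) for level in MONTHLY_COSTS}
baseline = totals["Pilot light"]
for level, total in totals.items():
    print(f"{level}: ${total}/month (+${total - baseline} over pilot light)")
```

The sums reproduce the $309, $410, and $473 totals shown on the slides. The increase comes entirely from the t2 instances and load balancers, which cost nothing in pilot light and are billed in the two higher levels.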
19. Warm standby and multi-site details
Considerations
Suitable for:
• Solutions that require RTO
& RPO in minutes
• Core business-critical
functions
• Higher-cost DR option
Partners
• Partner ecosystem
23. Thank You.
This presentation will be loaded to SlideShare the week following the Symposium.
http://www.slideshare.net/AmazonWebServices
Editor's notes
Introduction
Attended the Summit in 2011 and now presenting in 2015
Briefly introduce some of the things we will do
Grab attention with $1000 giveaway
Describe how it will work with a partner engagement
Not discussing BC in depth; we will discuss disaster recovery, which is part of BC
BC is the recovery model for business functions
DR covers the technology and infrastructure systems
There will be more questions as we get into the panel discussion during the Q&A
Tell the story of a friend, now the CEO of a mid-sized enterprise, who lost his entire office, data center, and building in a fire. After telling the story, relate it to RPO and RTO.
A small company with only a few employees at the time is now thousands strong
There are many options and variations for setting up disaster recovery. Business requirements such as RPO and RTO drive much of this.
Most of the DR scenarios depend on these two key metrics.
If it harms critical business processes, it may be a disaster
Time-based definition – how long can the business stand the pain?
Think about the Probability of occurrence
Fire, flood, hurricane, tornado, earthquake, volcanoes
Plane crashes, vandalism, terrorism, riots, sabotage, loss of personnel, etc.
Anything that diminishes or destroys normal data processing capabilities
User error / corruption / hacking attack (thief icon)
User-initiated threat
High availability in the context of corrupt data
Systems corruption (the systems stop functioning)
Discuss my past experience designing and building traditional DR data centers, and the complexity that came with those scenarios.
Lead into the next slide that shows the advantages of AWS
Very manual process.
Challenges with High Technical debt and runbooks for executing a DR
Conventional vs AWS
High upfront capex
Multi-Region vs. Multi Data Center messaging / Geographic separation
Mention Data Guard being used with RDS
Compare AZs to DR data centers
Discuss the application to be used throughout all the scenarios
Open Source Software to be used for all layers
Qualify upfront: Simple, Stateless application
Backup and restore to on-prem or other location.
Same application; database replication is the key difference.
Pilot Light architecture
Note the addition of Direct Connect
Costs for Direct Connect are not included
Consider augmenting this with existing technologies
Let's look at the warm standby scenario
The term warm standby is used to describe a DR scenario in which a scaled-down version of a fully functional environment is always up and running.
A warm standby solution extends the pilot light approach and decreases the overall recovery time
Extremely low RTO/RPO
Automation becomes a critical element as you ascend to this level of Disaster Recovery
Multi-Site
Running both sites at once
Database replication going both directions
Costs: Remember this doesn't include Direct Connect costs.
Warm standby and multi-site are great options for low RTO/RPO
We have many partners in this space that are ready to assist you with these challenges.
What are common challenges you have seen as customers move up the levels of DR from Backup and Restore to Pilot Light and so on?
I am struggling with my on-premises DR today, what would be the best approach to start leveraging AWS services?
My current DR solution is very high in technical debt, how does leveraging AWS help me reduce that debt?
If I am a traditional enterprise, what would be the first place I should look to start doing DR on AWS?
How do you see AWS changing the way enterprises are doing DR?
Tell me what types of technical debt are most common in traditional DR and how AWS helps reduce those same types of technical debt?
What are some of the common
Why is DR easier to do on AWS than traditional On-premises deployments?