



Cloud Immortality: Architecting for
High Availability & Disaster Recovery

Brian Adler, Sr. Professional Services Architect, RightScale
Josep Blanquer, Sr. Systems Architect, RightScale



Agenda
•   Terminology/Level-Setting
•   Takeaways
•   Designing for Failure
•   Cloud and component definitions (more terminology)
•   HA and DR Checklists and Best Practices
•   Building a complete HA example across multiple clouds
•   Conclusions / Q&A




Real Cloud Experience. Shared.



Terminology
• Fault Tolerance
     • Designs incorporating redundancy and replication to enable systems to
       continue operating properly (perhaps at a degraded level) if one or more
       components fail
• High Availability (HA)
     • Fault Tolerant systems are measured by their Availability in terms of
       planned and unplanned service outages for end users
          •   99% Availability = 3.65 days of downtime per year
          •   99.9% Availability = 8.76 hours of downtime per year
          •   99.99% Availability = 52.6 minutes of downtime per year
          •   99.999% Availability = 5.26 minutes of downtime per year

• Disaster Recovery (DR)
     • The process, policies and procedures related to restoring critical systems
       after a catastrophic event





Takeaways
• Introduction to architectural options for designing highly
  available, fault-tolerant applications

• Best Practices for implementation of these architectural options

• Multi-Availability Zone (AZ), Multi-Region, and Multi-Cloud
     • Architectural options
     • Considerations/pros and cons of these options







Designing for Failure
• Large-scale failures in the cloud are rare, but they do happen
• Application owners are ultimately responsible for
  availability and recoverability
• Balance the cost and complexity of HA efforts against the
  risks you are willing to bear
• Cloud infrastructure has made DR and HA remarkably
  affordable compared with past options
     •   Multi-server
     •   Multi-AZ
     •   Multi-Region
     •   Multi-Cloud







What do we mean by “Cloud”?
• A cloud is a physical datacenter entity behind an API endpoint

• What does that really mean?
     •   Amazon Web Services is not a cloud
     •   EC2 is not a cloud
     •   Eucalyptus, Cloud.com, OpenStack are not clouds
     •   EC2 US-East, Rackspace, "my private cloud"… these are clouds
     •   An availability zone is not a cloud (but it is part of one)


   Think of a cloud as a “resource pool” accessed via an API






Regions & Availability Zones




 • Zones within a region share a LAN (high bandwidth, low latency, private IP access)
 • Zones use separate power sources and are physically segregated
 • Regions are "islands" and share no resources



Overcoming Multi-Cloud Pain Points
• APIs differ
     • Different sets of available resources
     • Different formats, encodings and versions
• Abstractions and features differ
     • Network architectures differ: VLANs, security groups, NAT, IPs, ACLs, …
     • Storage architectures differ: local/attachable disks, backup, snapshots, …
     • Hypervisors, machine images, cost models, billing, reporting, etc.




    Each cloud is unique in some/many/all respects, with different
    access mechanisms and varying functionalities provided
    by the managed resources.




Overcoming Multi-Cloud Pain Points
• Navigating the obstacles
     • Design using generic concepts (“durable storage”) yet deploy using cloud
       specifics (“EBS volumes”)
     • Have tools that translate your concepts to cloud-specific ones (e.g.
       scripts/recipes that choose the correct provider for the desired resource)
     • Think of how to share resources across clouds (i.e. data sharing)
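
One way to picture "design generic, deploy specific" is a small lookup layer between the design vocabulary and each cloud's resource types. The sketch below is hypothetical: the catalog, the cloud identifiers and the `resolve` helper are assumptions made for illustration, not RightScale's tooling or any provider's API.

```python
from dataclasses import dataclass

# Hypothetical catalog: generic concept -> cloud -> provider-specific resource type.
# Cloud names and resource labels are illustrative placeholders.
RESOURCE_CATALOG = {
    "durable_storage": {
        "ec2-us-east": "EBS volume",
        "private-cloud": "attachable block volume",
    },
    "object_storage": {
        "ec2-us-east": "S3 bucket",
        "rackspace": "Cloud Files container",
    },
}


@dataclass(frozen=True)
class ResolvedResource:
    concept: str        # what the design asks for
    cloud: str          # where it is being deployed
    provider_type: str  # what that cloud actually provides


def resolve(concept: str, cloud: str) -> ResolvedResource:
    """Translate a generic design concept into the resource type a given cloud offers."""
    try:
        provider_type = RESOURCE_CATALOG[concept][cloud]
    except KeyError:
        # Fail loudly: a missing mapping is a design gap, not something to guess at.
        raise LookupError(f"no mapping for {concept!r} on {cloud!r}") from None
    return ResolvedResource(concept, cloud, provider_type)


if __name__ == "__main__":
    print(resolve("durable_storage", "ec2-us-east"))
```

Keeping the mapping in one place means the application design only ever mentions "durable storage", and adding a new cloud becomes a catalog change rather than a redesign.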







Agenda
• Terminology/Level-Setting
• Takeaways
• Designing for Failure
• Cloud and component definitions (more terminology)
• Infrastructure abstraction and automation as building blocks
  for highly available applications
• HA and DR Checklists and Best Practices
• Building a complete HA example across multiple clouds
• Conclusions / Q&A






HA/DR Checklist for Risk Mitigation

• Determine who owns the architecture, DR process and testing.
• Develop expertise in-house and/or get outside help.
• Conduct a risk assessment for each application.
• Specify your target Recovery Time Objective
  and Recovery Point Objective.
• Design for failure starting with application architecture. This will help
  drive the infrastructure architecture.
• Implement HA best practices balancing cost, complexity and risk.
     • Automate infrastructure for consistency and reliability.
• Document operational processes and automations.
• Test the failover... then test it again.
• Release the Chaos Monkey.
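
The Recovery Time Objective (RTO) and Recovery Point Objective (RPO) only become useful once a plan is measured against them. A minimal sketch of such a check, assuming the backup interval and a rehearsed restore time are known; the class and function names are hypothetical:

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryTargets:
    rto: timedelta  # Recovery Time Objective: longest tolerable time to restore service
    rpo: timedelta  # Recovery Point Objective: longest tolerable window of lost data


def check_plan(targets: RecoveryTargets,
               backup_interval: timedelta,
               estimated_restore_time: timedelta) -> list[str]:
    """Return a list of violations; an empty list means the plan fits the targets."""
    problems = []
    # Worst case, a failure hits just before the next backup and one full interval is lost.
    if backup_interval > targets.rpo:
        problems.append(
            f"backup interval {backup_interval} exceeds RPO {targets.rpo}")
    # The restore time should come from an actual failover test, not an estimate on paper.
    if estimated_restore_time > targets.rto:
        problems.append(
            f"restore time {estimated_restore_time} exceeds RTO {targets.rto}")
    return problems


if __name__ == "__main__":
    targets = RecoveryTargets(rto=timedelta(hours=1), rpo=timedelta(minutes=15))
    print(check_plan(targets, timedelta(hours=1), timedelta(minutes=30)))
```

Running a check like this after every failover test keeps the stated objectives tied to measured numbers.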





General HA Best Practices
• Avoid single points of failure
• Always place at least one of each component (load balancers,
  app servers, databases) in at least two AZs
• Maintain sufficient capacity to absorb AZ / cloud failures
     • Reserved Instances – guarantee capacity is available in a separate
       region/cloud
• Replicate data across AZs, and back up or replicate across
  clouds/regions for failover
• Set up monitoring, alerts and operations to identify problems and
  automate their resolution or the failover process
• Design stateless applications for resilience to reboot / relaunch
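
The "at least two AZs" rule is easy to verify mechanically from an inventory of servers. The sketch below assumes the inventory is available as (tier, zone) pairs; the tier names and the function are hypothetical and do not come from any provider API.

```python
from collections import defaultdict

# Tiers named on the slide; the identifiers themselves are illustrative.
REQUIRED_TIERS = ("load_balancer", "app_server", "database")


def single_points_of_failure(servers, min_zones=2):
    """Return the tiers (sorted) that run in fewer than `min_zones` availability zones.

    `servers` is an iterable of (tier, zone) pairs. A tier with no servers at all
    is also reported, since it occupies zero zones.
    """
    zones_by_tier = defaultdict(set)
    for tier, zone in servers:
        zones_by_tier[tier].add(zone)
    return sorted(t for t in REQUIRED_TIERS if len(zones_by_tier[t]) < min_zones)


if __name__ == "__main__":
    deployment = [
        ("load_balancer", "zone1"), ("load_balancer", "zone2"),
        ("app_server", "zone1"), ("app_server", "zone1"), ("app_server", "zone2"),
        ("database", "zone1"), ("database", "zone1"),  # master and slave share a zone
    ]
    print(single_points_of_failure(deployment))
```

In the sample inventory the database master and slave sit in the same zone, so a single AZ outage would take out both copies even though the tier looks redundant on paper.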





The Start: A Typical Single-Cloud Application

               Cloud 1
             Zone1         Zone2
                   LB     LB
             App        App … App
               Master          Slave

• Three-tiered application
     • RR DNS load balancers
     • Array of application servers
     • Master-slave databases
• Isolation features
     • Faults: using Availability Zones
     • Security: security groups / firewalls
• Plus
     • Monitoring
     • Alerts
     • Automation



Scaling: Bursting App Capacity to another Cloud


             Private Cloud           Public Cloud


               LB       LB




             App    App … App




               Master        Slave






Scaling: Bursting App Capacity to another Cloud


             Private Cloud             Public Cloud


                   LB     LB




             App        App … App      App   App … App   CPU
                                                               Scale Up
                                                               Scale Down


                                         ServerArray
               Master          Slave
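
The diagram shows a ServerArray in the public cloud that scales up and down on a CPU signal. A minimal sketch of that kind of threshold decision follows; the thresholds, size limits and function name are assumptions chosen for illustration, and this is not a description of how RightScale's ServerArray decides internally.

```python
def scaling_decision(cpu_samples, current_size, *,
                     scale_up_at=80.0, scale_down_at=30.0,
                     min_size=1, max_size=10):
    """Return the new array size given recent CPU utilization samples (percent).

    Grows by one server when average CPU is above `scale_up_at`, shrinks by one
    when it is below `scale_down_at`, and otherwise leaves the size unchanged.
    The result is always clamped to [min_size, max_size].
    """
    if not cpu_samples:
        return current_size  # no data: do nothing rather than guess
    average = sum(cpu_samples) / len(cpu_samples)
    if average > scale_up_at:
        return min(current_size + 1, max_size)
    if average < scale_down_at:
        return max(current_size - 1, min_size)
    return current_size


if __name__ == "__main__":
    print(scaling_decision([90, 85, 95], current_size=3))
    print(scaling_decision([10, 20], current_size=3))
```

Leaving a gap between the scale-up and scale-down thresholds keeps the array from oscillating when load hovers near a single cutoff.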






Scaling: Bursting App Capacity to another Cloud

                                                     Register with LBs:
                                                     • Using SSH
             Private Cloud                           • Using RightNet       Public Cloud
                                                     • Central repo


                   LB     LB




             App        App … App                  Public Internet           App   App … App   CPU
                                                                                                     Scale Up
                                                                                                     Scale Down


                                                                               ServerArray
               Master          Slave



                                       Security:                     Contact DB:
                                       • Access Control              • Using DNS / IP
                                       • Encryption                  • Executing a script






DR: Replicating Data in another Cloud


              Private Cloud                                                  Public Cloud


                     LB       LB




               App        App … App                  Public Internet         App     App … App       CPU
                                                                                                           Scale Up
                                                                                                           Scale Down


                                                                               ServerArray
           snap
            snap
             snap
              snap
                     Master        Slave   snap
                                            snap
                                             snap
                                              snap
                                                                                           snap
                                                                                            snap
                                   Slave                                                     snap
                                                                                              snap




                                                     Access and encryption         Where’s the data?





DR: Replicating Data in another Cloud


              Private Cloud                                                Public Cloud


                     LB       LB




               App        App … App                  Public Internet       App   App … App         CPU
                                                                                                         Scale Up
                                                                                                         Scale Down


                                                                             ServerArray
           snap
            snap
             snap
              snap
                     Master        Slave   snap
                                            snap
                                             snap
                                              snap
                                                                                         snap
                                                                                          snap
                                                                                 Slave     snap
                                                                                            snap




                                                       S3/Cloudfiles
                                                                       •     Global storage
                                                                       •     Live replication + rsync
                                                                       •     NoSQL



Full distribution: Splitting the Load Balancers


              Private Cloud                                            Public Cloud


                     LB       LB




               App        App … App                  Public Internet   App   App … App         CPU
                                                                                                     Scale Up
                                                                                                     Scale Down


                                                                         ServerArray
           snap
            snap
             snap
              snap
                     Master        Slave   snap
                                            snap
                                             snap
                                              snap
                                                                                     snap
                                                                                      snap
                                                                             Slave     snap
                                                                                        snap







So how do I make my service immortal?
• Be pessimistic: Design for failure
     • Assume everything will fail and architect a solution capable of handling it
     • Trade off between levels of resiliency and cost
• Embrace the Cloud mentality: Unprecedented features
     •   Build architectural building blocks you can reuse, build up and tear down
     •   Use the powerful provider capabilities within a cloud
     •   Build the glue across clouds for global redundancy (RightScale helps a lot)
     •   Automate every last detail
• Have it your way: No single architecture fits all cases
     • Reuse all patterns that fit your use case
     • Customize where they don't fit
• Crawl, then walk: Build HA within a cloud, then expand
     • Cost increases exponentially; apply the 80-20 rule
     • Multi-AZ HA, solid DR plan, then full multi-cloud HA
Real Cloud Experience. Shared.

Weitere ähnliche Inhalte

Was ist angesagt?

Gluecon 2013 - Netflix Cloud Native Tutorial Details (part 2)
Gluecon 2013 - Netflix Cloud Native Tutorial Details (part 2)Gluecon 2013 - Netflix Cloud Native Tutorial Details (part 2)
Gluecon 2013 - Netflix Cloud Native Tutorial Details (part 2)Adrian Cockcroft
 
Netflix Global Applications - NoSQL Search Roadshow
Netflix Global Applications - NoSQL Search RoadshowNetflix Global Applications - NoSQL Search Roadshow
Netflix Global Applications - NoSQL Search RoadshowAdrian Cockcroft
 
Netflix Architecture Tutorial at Gluecon
Netflix Architecture Tutorial at GlueconNetflix Architecture Tutorial at Gluecon
Netflix Architecture Tutorial at GlueconAdrian Cockcroft
 
Netflix Velocity Conference 2011
Netflix Velocity Conference 2011Netflix Velocity Conference 2011
Netflix Velocity Conference 2011Adrian Cockcroft
 
Netflix Cloud Platform Building Blocks
Netflix Cloud Platform Building BlocksNetflix Cloud Platform Building Blocks
Netflix Cloud Platform Building BlocksSudhir Tonse
 
High Availability and Disaster Recovery
High Availability and Disaster RecoveryHigh Availability and Disaster Recovery
High Availability and Disaster RecoveryAkelios
 
Netflix Global Cloud Architecture
Netflix Global Cloud ArchitectureNetflix Global Cloud Architecture
Netflix Global Cloud ArchitectureAdrian Cockcroft
 
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...Adrian Cockcroft
 
Cloud Computing for Developers and Architects - QCon 2008 Tutorial
Cloud Computing for Developers and Architects - QCon 2008 TutorialCloud Computing for Developers and Architects - QCon 2008 Tutorial
Cloud Computing for Developers and Architects - QCon 2008 TutorialStuart Charlton
 
Stacking up with OpenStack: building for High Availability
Stacking up with OpenStack: building for High AvailabilityStacking up with OpenStack: building for High Availability
Stacking up with OpenStack: building for High AvailabilityOpenStack Foundation
 
MED202 Netflix’s Transcoding Transformation - AWS re: Invent 2012
MED202 Netflix’s Transcoding Transformation - AWS re: Invent 2012MED202 Netflix’s Transcoding Transformation - AWS re: Invent 2012
MED202 Netflix’s Transcoding Transformation - AWS re: Invent 2012Amazon Web Services
 
CloudStack Architecture Future
CloudStack Architecture FutureCloudStack Architecture Future
CloudStack Architecture FutureKimihiko Kitase
 
Intuit CTOF 2011 - Netflix for Mobile in the Cloud
Intuit CTOF 2011 - Netflix for Mobile in the CloudIntuit CTOF 2011 - Netflix for Mobile in the Cloud
Intuit CTOF 2011 - Netflix for Mobile in the CloudSid Anand
 
Back that *aa s up – bridging multiple clouds for bursting and redundancy
Back that *aa s up – bridging multiple clouds for bursting and redundancyBack that *aa s up – bridging multiple clouds for bursting and redundancy
Back that *aa s up – bridging multiple clouds for bursting and redundancyRightScale
 
Disaster Recovery using Amazon Web Services - Webinar
Disaster Recovery using Amazon Web Services - WebinarDisaster Recovery using Amazon Web Services - Webinar
Disaster Recovery using Amazon Web Services - WebinarAmazon Web Services
 
RunE2E Case Study: SAP BusinessObjects in the AWS Cloud
RunE2E Case Study: SAP BusinessObjects in the AWS CloudRunE2E Case Study: SAP BusinessObjects in the AWS Cloud
RunE2E Case Study: SAP BusinessObjects in the AWS CloudAlex Gramling
 
Cloud Architecture Tutorial - Platform Component Architecture (2of3)
Cloud Architecture Tutorial - Platform Component Architecture (2of3)Cloud Architecture Tutorial - Platform Component Architecture (2of3)
Cloud Architecture Tutorial - Platform Component Architecture (2of3)Adrian Cockcroft
 

Was ist angesagt? (20)

Netflix in the Cloud
Netflix in the CloudNetflix in the Cloud
Netflix in the Cloud
 
Gluecon 2013 - Netflix Cloud Native Tutorial Details (part 2)
Gluecon 2013 - Netflix Cloud Native Tutorial Details (part 2)Gluecon 2013 - Netflix Cloud Native Tutorial Details (part 2)
Gluecon 2013 - Netflix Cloud Native Tutorial Details (part 2)
 
Netflix Global Applications - NoSQL Search Roadshow
Netflix Global Applications - NoSQL Search RoadshowNetflix Global Applications - NoSQL Search Roadshow
Netflix Global Applications - NoSQL Search Roadshow
 
Netflix Architecture Tutorial at Gluecon
Netflix Architecture Tutorial at GlueconNetflix Architecture Tutorial at Gluecon
Netflix Architecture Tutorial at Gluecon
 
Netflix Velocity Conference 2011
Netflix Velocity Conference 2011Netflix Velocity Conference 2011
Netflix Velocity Conference 2011
 
NetflixOSS Meetup
NetflixOSS MeetupNetflixOSS Meetup
NetflixOSS Meetup
 
Netflix Cloud Platform Building Blocks
Netflix Cloud Platform Building BlocksNetflix Cloud Platform Building Blocks
Netflix Cloud Platform Building Blocks
 
High Availability and Disaster Recovery
High Availability and Disaster RecoveryHigh Availability and Disaster Recovery
High Availability and Disaster Recovery
 
Netflix Global Cloud Architecture
Netflix Global Cloud ArchitectureNetflix Global Cloud Architecture
Netflix Global Cloud Architecture
 
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...
 
Gluecon keynote
Gluecon keynoteGluecon keynote
Gluecon keynote
 
Cloud Computing for Developers and Architects - QCon 2008 Tutorial
Cloud Computing for Developers and Architects - QCon 2008 TutorialCloud Computing for Developers and Architects - QCon 2008 Tutorial
Cloud Computing for Developers and Architects - QCon 2008 Tutorial
 
Stacking up with OpenStack: building for High Availability
Stacking up with OpenStack: building for High AvailabilityStacking up with OpenStack: building for High Availability
Stacking up with OpenStack: building for High Availability
 
MED202 Netflix’s Transcoding Transformation - AWS re: Invent 2012
MED202 Netflix’s Transcoding Transformation - AWS re: Invent 2012MED202 Netflix’s Transcoding Transformation - AWS re: Invent 2012
MED202 Netflix’s Transcoding Transformation - AWS re: Invent 2012
 
CloudStack Architecture Future
CloudStack Architecture FutureCloudStack Architecture Future
CloudStack Architecture Future
 
Intuit CTOF 2011 - Netflix for Mobile in the Cloud
Intuit CTOF 2011 - Netflix for Mobile in the CloudIntuit CTOF 2011 - Netflix for Mobile in the Cloud
Intuit CTOF 2011 - Netflix for Mobile in the Cloud
 
Back that *aa s up – bridging multiple clouds for bursting and redundancy
Back that *aa s up – bridging multiple clouds for bursting and redundancyBack that *aa s up – bridging multiple clouds for bursting and redundancy
Back that *aa s up – bridging multiple clouds for bursting and redundancy
 
Disaster Recovery using Amazon Web Services - Webinar
Disaster Recovery using Amazon Web Services - WebinarDisaster Recovery using Amazon Web Services - Webinar
Disaster Recovery using Amazon Web Services - Webinar
 
RunE2E Case Study: SAP BusinessObjects in the AWS Cloud
RunE2E Case Study: SAP BusinessObjects in the AWS CloudRunE2E Case Study: SAP BusinessObjects in the AWS Cloud
RunE2E Case Study: SAP BusinessObjects in the AWS Cloud
 
Cloud Architecture Tutorial - Platform Component Architecture (2of3)
Cloud Architecture Tutorial - Platform Component Architecture (2of3)Cloud Architecture Tutorial - Platform Component Architecture (2of3)
Cloud Architecture Tutorial - Platform Component Architecture (2of3)
 

Ähnlich wie Cloud Immortality - Architecting for High Availability & Disaster Recovery

Building Blocks for Private and Hybrid Clouds
Building Blocks for Private and Hybrid CloudsBuilding Blocks for Private and Hybrid Clouds
Building Blocks for Private and Hybrid CloudsRightScale
 
Rightscale Webinar: Building Blocks for Private and Hybrid Clouds
Rightscale Webinar: Building Blocks for Private and Hybrid CloudsRightscale Webinar: Building Blocks for Private and Hybrid Clouds
Rightscale Webinar: Building Blocks for Private and Hybrid CloudsRightScale
 
High Availability in the Cloud - Architectural Best Practices
High Availability in the Cloud - Architectural Best PracticesHigh Availability in the Cloud - Architectural Best Practices
High Availability in the Cloud - Architectural Best PracticesRightScale
 
AWS and VMware: How to Architect and Manage Hybrid Environments
AWS and VMware: How to Architect and Manage Hybrid EnvironmentsAWS and VMware: How to Architect and Manage Hybrid Environments
AWS and VMware: How to Architect and Manage Hybrid EnvironmentsRightScale
 
RightScale: Single Pane of Glass at Computerworld 2013
RightScale:  Single Pane of Glass at Computerworld 2013RightScale:  Single Pane of Glass at Computerworld 2013
RightScale: Single Pane of Glass at Computerworld 2013RightScale
 
Rightscale webinar-key-design-considerations-private-hybrid-clouds
Rightscale webinar-key-design-considerations-private-hybrid-cloudsRightscale webinar-key-design-considerations-private-hybrid-clouds
Rightscale webinar-key-design-considerations-private-hybrid-cloudsRightScale
 
Rightscale Webinar: Designing Private & Hybrid Clouds (Hosted by Citrix)
Rightscale Webinar: Designing Private & Hybrid Clouds (Hosted by Citrix)Rightscale Webinar: Designing Private & Hybrid Clouds (Hosted by Citrix)
Rightscale Webinar: Designing Private & Hybrid Clouds (Hosted by Citrix)RightScale
 
ServerTemplate™ Deep Dive: Configuration for Multi-Cloud Environments
ServerTemplate™ Deep Dive: Configuration for Multi-Cloud EnvironmentsServerTemplate™ Deep Dive: Configuration for Multi-Cloud Environments
ServerTemplate™ Deep Dive: Configuration for Multi-Cloud EnvironmentsRightScale
 
Getting Started with MariaDB with Docker
Getting Started with MariaDB with DockerGetting Started with MariaDB with Docker
Getting Started with MariaDB with DockerMariaDB plc
 
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudYARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudDataWorks Summit
 
CloudStack-Developer-Day
CloudStack-Developer-DayCloudStack-Developer-Day
CloudStack-Developer-DayKimihiko Kitase
 
Architecting a Private Cloud - Cloud Expo
Architecting a Private Cloud - Cloud ExpoArchitecting a Private Cloud - Cloud Expo
Architecting a Private Cloud - Cloud Exposmw355
 
The Netflix Open Source Platform
The Netflix Open Source PlatformThe Netflix Open Source Platform
The Netflix Open Source PlatformRuslan Meshenberg
 
Application-level Disaster Recovery on OpenStack
Application-level Disaster Recovery on OpenStackApplication-level Disaster Recovery on OpenStack
Application-level Disaster Recovery on OpenStackAli Hodroj
 
Web Scale Applications using NeflixOSS Cloud Platform
Web Scale Applications using NeflixOSS Cloud PlatformWeb Scale Applications using NeflixOSS Cloud Platform
Web Scale Applications using NeflixOSS Cloud PlatformSudhir Tonse
 

Ähnlich wie Cloud Immortality - Architecting for High Availability & Disaster Recovery (20)

Building Blocks for Private and Hybrid Clouds
Building Blocks for Private and Hybrid CloudsBuilding Blocks for Private and Hybrid Clouds
Building Blocks for Private and Hybrid Clouds
 
Rightscale Webinar: Building Blocks for Private and Hybrid Clouds
Rightscale Webinar: Building Blocks for Private and Hybrid CloudsRightscale Webinar: Building Blocks for Private and Hybrid Clouds
Rightscale Webinar: Building Blocks for Private and Hybrid Clouds
 
High Availability in the Cloud - Architectural Best Practices
High Availability in the Cloud - Architectural Best PracticesHigh Availability in the Cloud - Architectural Best Practices
High Availability in the Cloud - Architectural Best Practices
 
Building a Hybrid Cloud
Building a Hybrid CloudBuilding a Hybrid Cloud
Building a Hybrid Cloud
 
AWS and VMware: How to Architect and Manage Hybrid Environments
AWS and VMware: How to Architect and Manage Hybrid EnvironmentsAWS and VMware: How to Architect and Manage Hybrid Environments
AWS and VMware: How to Architect and Manage Hybrid Environments
 
RightScale: Single Pane of Glass at Computerworld 2013
RightScale:  Single Pane of Glass at Computerworld 2013RightScale:  Single Pane of Glass at Computerworld 2013
RightScale: Single Pane of Glass at Computerworld 2013
 
Rightscale webinar-key-design-considerations-private-hybrid-clouds
Rightscale webinar-key-design-considerations-private-hybrid-cloudsRightscale webinar-key-design-considerations-private-hybrid-clouds
Rightscale webinar-key-design-considerations-private-hybrid-clouds
 
Rightscale Webinar: Designing Private & Hybrid Clouds (Hosted by Citrix)
Rightscale Webinar: Designing Private & Hybrid Clouds (Hosted by Citrix)Rightscale Webinar: Designing Private & Hybrid Clouds (Hosted by Citrix)
Rightscale Webinar: Designing Private & Hybrid Clouds (Hosted by Citrix)
 
ServerTemplate™ Deep Dive: Configuration for Multi-Cloud Environments
ServerTemplate™ Deep Dive: Configuration for Multi-Cloud EnvironmentsServerTemplate™ Deep Dive: Configuration for Multi-Cloud Environments
ServerTemplate™ Deep Dive: Configuration for Multi-Cloud Environments
 
Cloud patterns
Cloud patternsCloud patterns
Cloud patterns
 
Getting Started with MariaDB with Docker
Getting Started with MariaDB with DockerGetting Started with MariaDB with Docker
Getting Started with MariaDB with Docker
 
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And CloudYARN Containerized Services: Fading The Lines Between On-Prem And Cloud
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud
 
CloudStack-Developer-Day
CloudStack-Developer-DayCloudStack-Developer-Day
CloudStack-Developer-Day
 
Architecting a Private Cloud - Cloud Expo
Architecting a Private Cloud - Cloud ExpoArchitecting a Private Cloud - Cloud Expo
Architecting a Private Cloud - Cloud Expo
 
Migrating to Public Cloud
Migrating to Public CloudMigrating to Public Cloud
Migrating to Public Cloud
 
The Netflix Open Source Platform
The Netflix Open Source PlatformThe Netflix Open Source Platform
The Netflix Open Source Platform
 
Application-level Disaster Recovery on OpenStack
Application-level Disaster Recovery on OpenStackApplication-level Disaster Recovery on OpenStack
Application-level Disaster Recovery on OpenStack
 
Web Scale Applications using NeflixOSS Cloud Platform
Web Scale Applications using NeflixOSS Cloud PlatformWeb Scale Applications using NeflixOSS Cloud Platform
Web Scale Applications using NeflixOSS Cloud Platform
 
Avoiding cloud lock-in
Avoiding cloud lock-inAvoiding cloud lock-in
Avoiding cloud lock-in
 
Cloud computing What Why How
Cloud computing What Why HowCloud computing What Why How
Cloud computing What Why How
 


Cloud Immortality - Architecting for High Availability & Disaster Recovery

  • 1. Cloud Immortality: Architecting for High Availability & Disaster Recovery
      • Brian Adler, Sr. Professional Services Architect, RightScale
      • Josep Blanquer, Sr. Systems Architect, RightScale
  • 2. Agenda
      • Terminology/Level-Setting
      • Takeaways
      • Designing for Failure
      • Cloud and component definitions (more terminology)
      • HA and DR Checklists and Best Practices
      • Building a complete HA example across multiple clouds
      • Conclusions / Q&A
  • 3. Terminology
      • Fault Tolerance
          • Designs incorporating redundancy and replication to enable systems to continue operating properly (perhaps at a degraded level) if one or more components fail
      • High Availability (HA)
          • Fault-tolerant systems are measured by their availability in terms of planned and unplanned service outages for end users
              • 99% availability = 3.65 days of downtime per year
              • 99.9% availability = 8.76 hours of downtime per year
              • 99.99% availability = 52.56 minutes of downtime per year
              • 99.999% availability = 5.26 minutes of downtime per year
      • Disaster Recovery (DR)
          • The processes, policies and procedures related to restoring critical systems after a catastrophic event
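The downtime figures on the Terminology slide follow from simple arithmetic: the unavailable fraction of a year, expressed in minutes. The sketch below reproduces them, assuming a 365-day year; the function name `downtime_minutes_per_year` is illustrative and not taken from the presentation.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        # Show the same budget in minutes, hours and days for comparison.
        print(f"{target}% -> {minutes:.2f} min, {minutes / 60:.2f} h, {minutes / 1440:.2f} d")
```

Each additional "nine" cuts the downtime budget by a factor of ten, which is one way to see why the cost of availability rises so quickly.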
  • 4. Agenda
      • Terminology/Level-Setting
      • Takeaways
      • Designing for Failure
      • Cloud and component definitions (more terminology)
      • HA and DR Checklists and Best Practices
      • Building a complete HA example across multiple clouds
      • Conclusions / Q&A
  • 5. Takeaways
      • Introduction to architectural options for designing highly available, fault-tolerant applications
      • Best practices for implementation of these architectural options
      • Multi-Availability Zone (AZ), Multi-Region, and Multi-Cloud
          • Architectural options
          • Considerations/pros and cons of these options
  • 6. Agenda
      • Terminology/Level-Setting
      • Takeaways
      • Designing for Failure
      • Cloud and component definitions (more terminology)
      • HA and DR Checklists and Best Practices
      • Building a complete HA example across multiple clouds
      • Conclusions / Q&A
  • 7. Designing for Failure
      • Large-scale failures in the cloud are rare but do happen
      • Application owners are ultimately responsible for availability and recoverability
      • Balance the cost and complexity of HA efforts against the risk(s) you are willing to bear
      • Cloud infrastructure has made DR and HA remarkably affordable versus past options
          • Multi-server
          • Multi-AZ
          • Multi-Region
          • Multi-Cloud
  • 8. Agenda
      • Terminology/Level-Setting
      • Takeaways
      • Designing for Failure
      • Cloud and component definitions (more terminology)
      • HA and DR Checklists and Best Practices
      • Building a complete HA example across multiple clouds
      • Conclusions / Q&A
  • 9. What do we mean by "Cloud"?
      • A cloud is a physical datacenter entity behind an API endpoint
      • What does that really mean?
          • Amazon Web Services is not a cloud
          • EC2 is not a cloud
          • Eucalyptus, Cloud.com, OpenStack are not clouds
          • EC2 US-East, Rackspace, "my private cloud"… these are clouds
          • An availability zone is not a cloud (but it is part of one)
      • Think of a cloud as a "resource pool" accessed via an API
  • 10. Regions & Availability Zones
      • Zones within a region share a LAN (high bandwidth, low latency, private IP access)
      • Zones utilize separate power sources and are physically segregated
      • Regions are "islands" and share no resources
  • 11. Overcoming Multi-Cloud Pain Points
      • APIs differ
          • Different sets of available resources
          • Different formats, encodings and versions
      • Abstractions and features differ
          • Network architectures differ: VLANs, security groups, NAT, IPs, ACLs, …
          • Storage architectures differ: local/attachable disks, backup, snapshots, …
          • Hypervisors, machine images, cost models, billing, reporting, etc.
      • Each cloud is unique in some/many/all respects, with different access mechanisms and varying functionalities provided by the managed resources.
  • 12. Overcoming Multi-Cloud Pain Points
      • Navigating the obstacles
          • Design using generic concepts ("durable storage") yet deploy using cloud specifics ("EBS volumes")
          • Have tools that translate your concepts to cloud-specific ones (e.g. scripts/recipes that choose the correct provider for the desired resource)
          • Think of how to share resources across clouds (i.e. data sharing)
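One way to picture "design using generic concepts, deploy using cloud specifics" is a lookup that translates a generic concept into the resource type a given cloud offers. The sketch below is only an illustration: the pairing of "durable storage" with EBS volumes comes from the slide, while the cloud names, the second mapping entry and the `resolve` function are hypothetical placeholders.

```python
# Hypothetical mapping from generic concepts to cloud-specific resource types.
# Only "durable storage" -> EBS volume is taken from the slide; the cloud keys
# and the private-cloud entry are illustrative assumptions.
RESOURCE_MAP = {
    "ec2-us-east": {"durable storage": "EBS volume"},
    "my-private-cloud": {"durable storage": "attachable block volume"},
}


def resolve(cloud: str, concept: str) -> str:
    """Translate a generic concept into the resource type used on one cloud."""
    try:
        return RESOURCE_MAP[cloud][concept]
    except KeyError:
        # Fail loudly so a missing translation is caught before deployment.
        raise LookupError(f"no mapping for {concept!r} on cloud {cloud!r}") from None


if __name__ == "__main__":
    for cloud in RESOURCE_MAP:
        print(cloud, "->", resolve(cloud, "durable storage"))
```

Keeping the translation in one table means the application design only ever refers to the generic concept, and adding a cloud becomes a matter of adding a row.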
  • 13. Agenda
      • Terminology/Level-Setting
      • Takeaways
      • Designing for Failure
      • Cloud and component definitions (more terminology)
      • Infrastructure abstraction and automation as building blocks for highly available applications
      • HA and DR Checklists and Best Practices
      • Building a complete HA example across multiple clouds
      • Conclusions / Q&A
  • 14. HA/DR Checklist for Risk Mitigation
      • Determine who owns the architecture, DR process and testing.
      • Develop expertise in-house and/or get outside help.
      • Conduct a risk assessment for each application.
      • Specify your target Recovery Time Objective and Recovery Point Objective.
      • Design for failure starting with application architecture. This will help drive the infrastructure architecture.
      • Implement HA best practices balancing cost, complexity and risk.
      • Automate infrastructure for consistency and reliability.
      • Document operational processes and automations.
      • Test the failover... then test it again.
      • Release the Chaos Monkey.
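The checklist item on Recovery Point Objective can be made concrete with a small check. The sketch below assumes a simplified model that is not stated in the presentation: data is protected only by periodic snapshots, so the worst-case data loss is roughly one full snapshot interval. The function name `meets_rpo` is hypothetical.

```python
from datetime import timedelta


def meets_rpo(snapshot_interval: timedelta, rpo: timedelta) -> bool:
    """Return True if the worst-case loss (one snapshot interval) fits within the RPO.

    Simplifying assumption: snapshots are the only protection and complete instantly.
    """
    return snapshot_interval <= rpo


if __name__ == "__main__":
    target_rpo = timedelta(minutes=15)
    for interval in (timedelta(minutes=10), timedelta(hours=1)):
        print(interval, "meets RPO" if meets_rpo(interval, target_rpo) else "violates RPO")
```

A check like this is most useful when it runs against the real backup schedule, so that a schedule change that breaks the target is noticed before a failure does.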
  • 15. General HA Best Practices
      • Avoid single points of failure
      • Always place (at least) one of each component (load balancers, app servers, databases) in at least two AZs
      • Maintain sufficient capacity to absorb AZ / cloud failures
      • Reserved Instances: guarantee capacity is available in a separate region/cloud
      • Replicate data across AZs and back up or replicate across clouds/regions for failover
      • Set up monitoring, alerts and operations to identify problems and automate resolution or the failover process
      • Design stateless applications for resilience to reboot / relaunch
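The rule of placing each component in at least two AZs lends itself to an automated check. The following is a minimal sketch, assuming the deployment can be described as (component, zone) pairs; the function name and the sample deployment are made up for illustration.

```python
from collections import defaultdict


def single_points_of_failure(servers, min_zones=2):
    """Return the component types that run in fewer than `min_zones` zones.

    `servers` is an iterable of (component, zone) pairs, e.g. ("app server", "zone1").
    """
    zones_by_component = defaultdict(set)
    for component, zone in servers:
        zones_by_component[component].add(zone)
    return sorted(c for c, zones in zones_by_component.items() if len(zones) < min_zones)


if __name__ == "__main__":
    deployment = [
        ("load balancer", "zone1"), ("load balancer", "zone2"),
        ("app server", "zone1"), ("app server", "zone2"),
        ("database", "zone1"),  # no database replica in a second zone yet
    ]
    print(single_points_of_failure(deployment))  # ['database']
```

Two servers in the same zone still count as one zone here, which matches the intent of the rule: redundancy only helps if it survives the loss of a zone.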
  • 16. Agenda
      • Terminology/Level-Setting
      • Takeaways
      • Designing for Failure
      • Cloud and component definitions (more terminology)
      • Infrastructure abstraction and automation as building blocks for highly available applications
      • HA and DR Checklists and Best Practices
      • Building a complete HA example across multiple clouds
      • Conclusions / Q&A
  • 17. The Start: A Typical Single-Cloud Application
      • Three-tiered application
          • RR DNS load balancers
          • Array of application servers
          • Master-Slave databases
      • Isolation features
          • Faults: using Availability Zones
          • Security: Security Groups / Firewalls
      • Plus
          • Monitoring
          • Alerts
          • Automation
      • [Diagram labels: Cloud 1; Zone1, Zone2; LB, LB; App, App, … App; Master, Slave]
  • 18. Scaling: Bursting App Capacity to another Cloud
      • [Diagram labels: Private Cloud, Public Cloud; LB, LB; App, App, … App; Master, Slave]
  • 19. Scaling: Bursting App Capacity to another Cloud
      • [Diagram labels: Private Cloud, Public Cloud; LB, LB; App, App, … App in each cloud; ServerArray with CPU-driven Scale Up / Scale Down; Master, Slave]
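The diagram on this slide labels the ServerArray with "CPU", "Scale Up" and "Scale Down". A rough sketch of what such a CPU-driven decision could look like follows; the thresholds (80% and 30%), the averaging rule and the function name are assumptions for illustration and do not describe how RightScale's ServerArray actually decides.

```python
def scaling_decision(cpu_percentages, scale_up_at=80.0, scale_down_at=30.0):
    """Return "scale_up", "scale_down" or "hold" from per-server CPU readings.

    The thresholds are hypothetical defaults, not values from the presentation.
    """
    if not cpu_percentages:
        return "hold"  # no data: do nothing rather than guess
    average = sum(cpu_percentages) / len(cpu_percentages)
    if average > scale_up_at:
        return "scale_up"
    if average < scale_down_at:
        return "scale_down"
    return "hold"


if __name__ == "__main__":
    print(scaling_decision([92.0, 88.0, 95.0]))  # busy array
    print(scaling_decision([12.0, 20.0, 15.0]))  # idle array
    print(scaling_decision([55.0, 60.0, 48.0]))  # within the band
```

Leaving a gap between the two thresholds avoids flapping, where the array would otherwise grow and shrink repeatedly around a single cut-off.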
  • 20. Scaling: Bursting App Capacity to another Cloud
      • Register with LBs
          • Using SSH
          • Using RightNet
          • Central repo
      • Security
          • Access Control
          • Encryption
      • Contact DB
          • Using DNS / IP
          • Executing a script
      • [Diagram labels: Private Cloud, Public Cloud, Public Internet; LB, LB; App, App, … App in each cloud; ServerArray with CPU-driven Scale Up / Scale Down; Master, Slave]
  • 21. DR: Replicating Data in another Cloud
      • Access and encryption
      • Where's the data?
      • [Diagram labels: Private Cloud, Public Cloud, Public Internet; LB, LB; App, App, … App in each cloud; ServerArray with CPU-driven Scale Up / Scale Down; Master, Slave, Slave; snapshots (snap)]
  • 22. DR: Replicating Data in another Cloud
      • S3/Cloudfiles
      • Global storage
      • Live replication + rsync
      • NoSQL
      • [Diagram labels: Private Cloud, Public Cloud, Public Internet; LB, LB; App, App, … App in each cloud; ServerArray with CPU-driven Scale Up / Scale Down; Master, Slave, Slave; snapshots (snap)]
  • 23. Full distribution: Splitting the Load Balancers
      • [Diagram labels: Private Cloud, Public Cloud, Public Internet; LB, LB; App, App, … App in each cloud; ServerArray with CPU-driven Scale Up / Scale Down; Master, Slave, Slave; snapshots (snap)]
  • 24. Agenda
      • Terminology/Level-Setting
      • Takeaways
      • Designing for Failure
      • Cloud and component definitions (more terminology)
      • Infrastructure abstraction and automation as building blocks for highly available applications
      • Building a complete HA example across multiple clouds
      • Conclusions / Q&A
  • 25. So how do I make my service immortal?
      • Be pessimistic: design for failure
          • Assume everything will fail and architect a solution capable of handling it
          • Trade off between levels of resiliency and cost
      • Embrace the cloud mentality: unprecedented features
          • Build architectural building blocks you can reuse, build up and tear down
          • Use the powerful provider capabilities within a cloud
          • Build the glue across clouds for global redundancy (RightScale helps a lot)
          • Automate every single last detail
      • Have it your way: no single architecture fits all cases
          • Reuse all patterns that fit your use case
          • Customize where they don't fit
      • Crawl, then walk: build HA within a cloud, then expand
          • Cost increases exponentially (the 80-20 rule)
          • Multi-AZ HA, solid DR plan, then full multi-cloud HA