AWS 201: Breakout Track Singapore
“Design for Failure”
HA and DR Best Practices
Harish Ganesan
Co-founder & CTO
8KMiles
www.twitter.com/harish11g
http://www.linkedin.com/in/harishganesan
Agenda

• Explain HA architecture with a real customer case
• Understand how to architect a web app in AWS with
  – High Availability
  – DR
  – Scalability
• Why AWS?
About the Customer

• Online e-commerce company
• NASDAQ listed
• Application consumed by online users, mobile and web services
Requirements

• High Availability on all tiers with no SPOF
• Auto-scalable and elastic infrastructure
• Ability to serve millions of requests per day
• Serve peak HTTP traffic of 8000+ reqs/sec
• Serve peak HTTPS traffic of 2500+ reqs/sec
• 65% of the business is done during the holiday season, so no downtime is acceptable
• Ease of monitoring, backup and deployment
• Optimal DR setup (cost vs RTO/RPO)
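
The peak figures above can be turned into a rough fleet-size estimate. The following is only a sizing sketch: the 8000 and 2500 reqs/sec values come from the requirements, while the per-instance throughput (400 reqs/sec) and the 25% headroom are hypothetical placeholders that would have to be replaced with load-test results.

```python
import math

PEAK_HTTP_RPS = 8000   # from the requirements
PEAK_HTTPS_RPS = 2500  # from the requirements


def instances_needed(peak_rps: float, per_instance_rps: float,
                     headroom: float = 0.25) -> int:
    """Instances required to serve peak_rps while keeping `headroom` spare."""
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    # Only a fraction of each instance is planned for steady peak load.
    usable_rps = per_instance_rps * (1 - headroom)
    return math.ceil(peak_rps / usable_rps)


total_peak_rps = PEAK_HTTP_RPS + PEAK_HTTPS_RPS
# 400 reqs/sec per instance is an assumed figure, not a measurement.
fleet_size = instances_needed(total_peak_rps, per_instance_rps=400)
```

HTTPS requests usually cost more CPU than HTTP requests when SSL is terminated on the instance, so a real estimate would size the two traffic classes separately.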
Technology and Tiers

• Multi-tiered Linux, Apache, Java web site on AWS
• Database tier using MySQL
• Cache tier
• Integration tier with queues and background programs
• HTTP and HTTPS protocols
What did 8KMiles do?

• Consulting: architected the entire website infrastructure on AWS
• Implementation:
  – Configured the infrastructure on AWS
  – Developed custom DevOps scripts on AWS
• Supported the site during Thanksgiving and the holiday season
• Cloud Development Partner:
  – Currently re-engineering the customer app to leverage more AWS services
Customer Case
A simple LAMJ Architecture

[Diagram: a single Availability Zone (US-EAST-1a) inside AWS Security Groups, containing one Web/App/Cache Server, Integration Services and a MySQL DB, monitored by CloudWatch]

1. Web/App Server interacts with MySQL for queries and transactions
What is the problem in this architecture?
A simple LAMJ Architecture

[Diagram: the same single-AZ (US-EAST-1a) layout, with one Web/App/Cache Server, Integration Services and a MySQL DB inside AWS Security Groups, monitored by CloudWatch]

Single Point of Failure at multiple tiers

Not a Highly Available Architecture
How to avoid SPOF and build a robust architecture?
Step 1: Distribute the Application to Multiple Tiers

[Diagram: US-EAST-1a inside AWS Security Groups, with the Web/App Server, Integration Service tier and MySQL DB on separate EC2 instances, monitored by CloudWatch]

1. Separate out the individual tiers into separate EC2 instances
Step 2: Add Multiple Servers in each layer

[Diagram: US-EAST-1a inside AWS Security Groups, with multiple instances in the Web/App Server tier, the Integration Service tier and the MySQL DB tier, monitored by CloudWatch]

1. Add multiple EC2 instances in every tier
Building HA @ Load Balancing Tier
Load Balancing Tier

• Load Balancing Options
  – ELB
  – HAProxy
  – Nginx
Why AWS ELB?

• AWS ELB provides a load balancing service that can front thousands of EC2 servers
• AWS ELB automatically scales the backend load balancing servers up and down
• In theory, there is no upper limit on the response rate of AWS ELB
• It can handle 20000+ concurrent requests easily (RightScale benchmark)
• AWS ELB works seamlessly with AWS Auto Scaling
Why AWS ELB?

• AWS ELB is well integrated with other AWS services
• No maintenance
• Pay as you go
Load balancing Layer

[Diagram: Online / Web / Mobile clients reach the AWS Elastic Load Balancer, which fronts the Web/App Server and MySQL DB in US-EAST-1a inside AWS Security Groups]

1. Simple round-robin algorithm
2. Health checks, SSL termination
3. ELB is a highly available service with no SPOF
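
To make the round-robin and health-check behaviour on this slide concrete, here is a toy model in Python. It is not how ELB is implemented internally; it only illustrates rotating through backends while skipping the ones that failed their last health check. The instance names are made up.

```python
from itertools import cycle
from typing import Dict, Iterator, List, Optional


class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked unhealthy."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._healthy: Dict[str, bool] = {b: True for b in self._backends}
        self._ring: Iterator[str] = cycle(self._backends)

    def mark(self, backend: str, healthy: bool) -> None:
        """Record the result of a health check for one backend."""
        if backend not in self._healthy:
            raise KeyError(backend)
        self._healthy[backend] = healthy

    def next_backend(self) -> Optional[str]:
        """Return the next healthy backend in rotation, or None if all are down."""
        # One full turn of the ring is enough to visit every backend once.
        for _ in range(len(self._backends)):
            candidate = next(self._ring)
            if self._healthy[candidate]:
                return candidate
        return None


lb = RoundRobinBalancer(["web-1", "web-2", "web-3"])  # hypothetical names
first_pass = [lb.next_backend() for _ in range(3)]
lb.mark("web-2", healthy=False)  # simulate a failed health check
second_pass = [lb.next_backend() for _ in range(4)]
```

After `web-2` is marked unhealthy, requests alternate between the two remaining backends until a later health check marks it healthy again.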
Building HA @ Web/App Tier
High Availability @ Web/App tier

[Diagram: AWS Elastic Load Balancer in front of an Auto Scaling group of Web/App Servers in US-EAST-1a inside AWS Security Groups, with S3, Puppet, the Integration Service Tier and MySQL DB]

1. Add AWS Auto Scaling to the Web/App tier
2. Tie AWS Auto Scaling to AWS ELB
3. Deploy the app using Puppet
Designing HA @ Web/App Tier

• AWS Auto Scaling manages unhealthy EC2 instances
• AWS Auto Scaling ensures that a minimum number of Web/App EC2 instances are always running
• In the event of a failure, new instances are launched automatically within 30-120 seconds
• ELB traffic is seamlessly directed to the Auto Scaled EC2 instances
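
The "minimum number of instances" guarantee can be pictured as a reconciliation loop: drop whatever is unhealthy, then launch until the minimum is met again. The sketch below is a simplified in-memory model, not the AWS Auto Scaling API; the class name and the instance id format are invented for illustration.

```python
from itertools import count
from typing import Dict, List


class ToyScalingGroup:
    """Keeps at least `min_size` healthy instances by replacing failed ones."""

    def __init__(self, min_size: int) -> None:
        if min_size < 0:
            raise ValueError("min_size must not be negative")
        self.min_size = min_size
        self.instances: Dict[str, bool] = {}  # instance id -> healthy?
        self._ids = count(1)

    def launch(self) -> str:
        """Start one new (healthy) instance and return its id."""
        instance_id = f"i-{next(self._ids):04d}"  # made-up id format
        self.instances[instance_id] = True
        return instance_id

    def reconcile(self) -> List[str]:
        """Terminate unhealthy instances, then launch until min_size is met."""
        unhealthy = [i for i, ok in self.instances.items() if not ok]
        for instance_id in unhealthy:
            del self.instances[instance_id]
        launched = []
        while len(self.instances) < self.min_size:
            launched.append(self.launch())
        return launched


group = ToyScalingGroup(min_size=2)
initial = group.reconcile()          # first run brings the group up to size
group.instances[initial[0]] = False  # simulate an instance failure
replaced = group.reconcile()         # the failed instance is replaced
```

In the real service the gap between the failure and the replacement serving traffic is the 30-120 seconds quoted above, since a new instance has to boot and pass its health checks.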
Designing HA @ Web/App Tier

• Deploy the application / patches in the Auto Scaling environment using Puppet / S3 scripts
• Choose the right EC2 instance type
  – Large (less CPU intensive, heap 5.5 GB RAM)
  – High-CPU Extra Large (more CPU intensive, heap 5.5 GB RAM, concurrent GC)
• Points to remember
  – Do not store sessions in the memory of the web/app server
  – Rotate and move the log files to S3 periodically
  – Move uploaded data files and images to S3 or GlusterFS
What happens when the US-EAST-1a AZ fails?

Solution: Leverage AWS Multi-AZ architecture
Source: AWS

[Diagram: HTTP/S requests from the browser or mobile devices hit the AWS Elastic Load Balancer, which fronts Auto Scaling groups of Web/App EC2 instances in AZ US-EAST-1a and AZ US-EAST-1b inside AWS Security Groups]

1. Infrastructure is spread across multiple AZs of AWS inside a Region
2. AWS Elastic Load Balancer directs requests to EC2 instances across multiple AZs
3. Amazon Auto Scaling automatically launches new EC2 instances across multiple AZs
4. No code changes required to leverage Multi-AZ
High Availability @ Web/App/DEX layer

• AZs are connected by a low-latency network
• AZs are insulated from failures in other Availability Zones *
• AWS Auto Scaling can manage EC2 instances across AZs
• AWS ELB can direct load to EC2 instances across AZs
• AWS CloudWatch can monitor EC2 instance availability across AZs
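
One way to picture what "managing instances across AZs" means when an AZ fails: the desired capacity stays the same and is redistributed over the zones that are still healthy. The function below is a minimal sketch of that even-spread idea, with an invented name; it does not reproduce the actual Auto Scaling rebalancing logic.

```python
from typing import Dict, List


def spread_across_azs(desired: int, healthy_azs: List[str]) -> Dict[str, int]:
    """Split `desired` instances as evenly as possible over the healthy AZs."""
    if desired < 0:
        raise ValueError("desired must not be negative")
    if not healthy_azs:
        raise ValueError("no healthy Availability Zone available")
    base, extra = divmod(desired, len(healthy_azs))
    # The first `extra` zones take one additional instance each.
    return {az: base + (1 if i < extra else 0)
            for i, az in enumerate(healthy_azs)}


normal = spread_across_azs(6, ["us-east-1a", "us-east-1b"])
after_az_failure = spread_across_azs(6, ["us-east-1b"])
```

Note the capacity implication: if a single AZ must be able to carry the full load, the surviving zone needs enough instance capacity (and account limits) to absorb the instances from the failed one.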
High Availability @ DB layer
Database Tier

• Options
  – MySQL Master-Slave replication
  – MySQL NDB Cluster
  – RDS MySQL Master – Standby
  – RDS MySQL Master – Standby + Read Replicas
High Availability @ DB Layer

[Diagram: AWS Elastic Load Balancer in front of Auto Scaling groups of Web/App EC2 instances in US-EAST-1A and US-EAST-1B inside AWS Security Groups; the RDS Master and RDS Standby sit in different AZs, each AZ also holds a Read Replica, with S3 and CloudWatch alongside]

1. Read Replicas launched in multiple AZs for HA
2. RDS Standby is launched in a different AZ from the RDS Master for HA
3. Web/App hosted on Amazon EC2 transacts with the RDS Master and reads from the Read Replicas
High Availability @ DB Layer

• RDS Master and RDS Standby in multiple AZs for HA
• Read Replicas in multiple AZs for HA
• Offers no SPOF at the AZ level
• Read Replicas can be launched/terminated without affecting RDS Master availability
• In the event of an RDS Master failure, the RDS Standby is automatically promoted
• Promotion takes <180 seconds and needs no changes in the application
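
"No changes in the application" still assumes the application tolerates a short window in which the database endpoint is unreachable. A common way to handle that is to retry with backoff for roughly the promotion time. The helper below is a generic sketch, not part of any AWS SDK; the 180-second budget mirrors the figure above, and the other defaults are arbitrary assumptions.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retry(operation: Callable[[], T],
                    max_wait_s: float = 180.0,
                    first_delay_s: float = 1.0,
                    max_delay_s: float = 15.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry `operation` with exponential backoff on ConnectionError.

    Gives up (re-raising the last error) once the next delay would push the
    cumulative wait past `max_wait_s`.
    """
    waited = 0.0
    delay = first_delay_s
    while True:
        try:
            return operation()
        except ConnectionError:
            if waited + delay > max_wait_s:
                raise
            sleep(delay)
            waited += delay
            delay = min(delay * 2, max_delay_s)


# Simulated query that fails three times, as it might during a failover.
attempts = {"n": 0}


def flaky_query() -> str:
    attempts["n"] += 1
    if attempts["n"] <= 3:
        raise ConnectionError("master unavailable")
    return "ok"


delays: List[float] = []
result = call_with_retry(flaky_query, sleep=delays.append)  # record, don't sleep
```

The `sleep` parameter is injectable only so the example runs instantly; production code would leave it at `time.sleep`.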
High Availability @ DB Layer

• DB snapshot and MySQL dump facilities are available
• Automatic full backups at configured maintenance windows
• Point-in-time recovery up to the last minute
• Recovery might require app-layer configuration changes
High Availability @ DB Layer

• Points to remember
  – RDS supports only the MySQL InnoDB engine
  – Give more memory to the RDS Master
     • Use Extra Large or High Memory instance types
  – Keep your Read Replicas and RDS Master the same size
  – Multiple Read Replicas can be load balanced using HAProxy
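
The deck load balances Read Replicas with HAProxy. The same split of "writes to the master, reads to the replicas" can also be sketched at the application level, which is what the following example does instead; it is an illustration of the routing idea, not the customer's implementation. The class name and hostnames are hypothetical.

```python
from itertools import cycle
from typing import List


class ReadWriteRouter:
    """Sends writes to the master endpoint and spreads reads over replicas."""

    READ_PREFIXES = ("select", "show")

    def __init__(self, master: str, replicas: List[str]) -> None:
        self._master = master
        # Fall back to the master for reads when no replica is configured.
        self._reads = cycle(replicas or [master])

    def endpoint_for(self, sql: str) -> str:
        """Pick the endpoint for one SQL statement based on its first keyword."""
        statement = sql.lstrip().lower()
        if statement.startswith(self.READ_PREFIXES):
            return next(self._reads)  # round robin over the replicas
        return self._master


router = ReadWriteRouter(
    master="master.db.example.internal",          # hypothetical hostnames
    replicas=["replica-1.db.example.internal",
              "replica-2.db.example.internal"],
)
read_1 = router.endpoint_for("SELECT * FROM orders WHERE id = 42")
write = router.endpoint_for("UPDATE orders SET status = 'paid' WHERE id = 42")
read_2 = router.endpoint_for("  select count(*) from orders")
```

Because Read Replicas are updated asynchronously, a read that must see a write made moments earlier should be sent to the master rather than routed by keyword alone.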
Use AWS Building blocks in your architecture
Use AWS Building blocks

• AWS building blocks are built with
  – Inherent fault tolerance
  – HA and scalability


• The following building blocks were used
  – S3, CloudFront, Route 53, CloudWatch, SNS, SQS, SES, ELB, EIP, EBS
Application Architecture in AWS

[Diagram: Browser / Web Services / Mobile clients resolve through Route 53 to the Elastic Load Balancer, with AWS CloudFront CDN alongside. Auto Scaling groups of Amazon EC2 servers run in AZ US-EAST-1a and AZ US-EAST-1b inside AWS Security Groups, sharing ElastiCache. The database layer has a DB Master, a DB Standby, Read Slave 1 and Read Slave 2. Supporting services: S3, SQS, Puppet, CloudWatch, AWS Simple Email Service and AWS Simple Notification Service (alerts)]
How is it used in the project?

• ELB – Load balancing
• Route 53 – DNS mappings, round-robin algorithm
• CloudFront – Assets, HTML, CSS, JS, images
• S3 – Logs, snapshots, images
• CloudWatch – Monitors CPU, ELB, RDS and custom metrics
• SNS – System alerts
• SES – Emails (password, activation, app alerts)
• EBS – EBS-backed AMI for the Web/App tier
• EIP – Elastic IP for the Puppet server
What happens if the entire AWS region is affected?

Solution: Design HA/DR across Regions
High Availability across AWS Regions

[Diagram: the two hosting regions]

• Main web site is hosted in the AWS Singapore region
• DR web site is hosted in AWS Tokyo
DR / HA Options in AWS

Option        Recovery time    Relative cost
Cold DR       > Few hours      $
Warm DR       > 1-2 hours      $$
Hot DR        In minutes       $$$
Hot Active    No downtime      $$$$
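
The cost versus recovery-time trade-off above can be expressed as a small selection rule: pick the cheapest option whose recovery time fits the business's RTO budget. The sketch below encodes the four options with numeric RTO values; those numbers are assumptions standing in for the slide's wording ("few hours", "in minutes") and should be replaced with figures measured in DR drills.

```python
from typing import List, NamedTuple


class DrOption(NamedTuple):
    name: str
    rto_minutes: float  # assumed numeric stand-in for the slide's wording
    cost_rank: int      # 1 = "$" ... 4 = "$$$$"


OPTIONS: List[DrOption] = [
    DrOption("Cold DR", 240, 1),   # "> few hours": 4 h is an assumption
    DrOption("Warm DR", 90, 2),    # "> 1-2 hours": 90 min is an assumption
    DrOption("Hot DR", 10, 3),     # "in minutes": 10 min is an assumption
    DrOption("Hot Active", 0, 4),  # "no downtime"
]


def cheapest_option(max_rto_minutes: float) -> DrOption:
    """Return the lowest-cost option whose RTO fits within the budget."""
    if max_rto_minutes < 0:
        raise ValueError("max_rto_minutes must not be negative")
    candidates = [o for o in OPTIONS if o.rto_minutes <= max_rto_minutes]
    return min(candidates, key=lambda o: o.cost_rank)


two_hour_budget = cheapest_option(120).name
half_hour_budget = cheapest_option(30).name
```

A fuller model would also take the RPO into account, since the options differ in how much data can be lost as well as in how long recovery takes.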
Cold DR

[Diagram: Amazon Route 53 in front of two sites. Active: AWS Singapore, with an ELB, Web/App EC2 instances, a database layer (Master and Standby) and Puppet. Passive: AWS Tokyo, with an ELB, Web/App EC2 instances and a database layer (Master and Standby). DB snapshots / dumps are synced from Singapore to Tokyo every X hours]
Cold DR

• When the primary is down, the entire secondary site is manually activated in Cold DR
• RTO of more than a few hours to get the secondary site up and running
• RPO – data loss is acceptable
• CloudFormation templates can be configured on primary and secondary
• AMIs, app and DB data are synced periodically
Cold DR

• EIP problem – Integration services (FTP, web services)
• Cost effective
• Most common
Warm DR

[Diagram: Amazon Route 53 in front of two sites. Active: AWS Singapore, with an ELB, Web/App EC2 instances, a database layer (Master and Standby) and Puppet. Passive: AWS Tokyo, with an ELB, Web/App EC2 instances, a database layer (Master and Standby) and Puppet. Databases are replicated asynchronously between the AWS regions]
Warm DR

• When the primary is down, the secondary site is manually activated in Warm DR
• RTO of more than 1 hour to get the secondary site up and running
• RPO – minimal data loss is acceptable
• CloudFormation templates can be configured on the primary and secondary sites
• DB data is replicated asynchronously
• Only the DB and Puppet servers are ready and running
Warm DR

• AMIs, application patches and deployments are managed through Puppet
• EIP problem – Integration services (FTP, web services)
• Costlier than Cold DR
• Recommended in many use cases
Hot DR

[Diagram: Amazon Route 53 in front of two sites. Active: AWS Singapore, with an ELB, Web/App EC2 instances, a database layer (Master and Standby) and Puppet. Passive: AWS Tokyo, with an ELB, Web/App EC2 instances, a database layer (Master and Standby) and Puppet. Databases are replicated asynchronously between the AWS regions]
Hot DR

• When the primary is down, the secondary site is activated in Hot DR
• RTO of more than a few minutes to get the secondary site up and running
• RPO – very minimal data loss is acceptable
• CloudFormation templates can be configured on the primary and secondary sites
• All the tiers are ready and running in the secondary site but not serving live transactions
Hot DR

• DB data is replicated asynchronously
• AMIs, application patches and deployments are managed through Puppet
• EIP problem – Integration services (FTP, web services)
• Costlier than Warm DR
• Rarely used
Hot Active

[Diagram: directional DNS / traffic through Amazon Route 53 to two active sites. AWS Singapore and AWS Tokyo each have an ELB, Web/App EC2 instances, a database layer (Master and Standby) and Puppet. Databases are replicated asynchronously in both directions between the AWS regions]
Hot Active-Active

• Both primary and secondary sites are active
• RTO of more than a few seconds to direct the traffic from the primary to the secondary site
• RPO – negligible data loss
• A managed DNS server provides automatic failover at the DNS level in case of an outage at the primary website location
• Transparent switch between the websites hosted in AWS Singapore and AWS Tokyo within 30-60 seconds during an outage
Hot Active-Active

• Automatic traffic diversion to the nearest site location
• Managed/directional DNS servers are a globally distributed and highly available service
• Persistent data is replicated asynchronously (2-way)
• AMIs, application patches and deployments are managed through distributed Puppet
• EIP problem – Integration services (FTP, web services)
• Use case specific
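
The combination of "nearest site" routing and DNS-level failover can be reduced to one decision: answer each lookup with the lowest-latency region that is currently passing health checks. The function below is a toy model of that decision, not the behaviour of any specific managed DNS product; the region names match the deck, while the latency figures are invented.

```python
from typing import Dict, Optional


def pick_region(latency_ms: Dict[str, float],
                healthy: Dict[str, bool]) -> Optional[str]:
    """Return the lowest-latency region that is passing health checks."""
    # Regions without a health record are treated as unhealthy.
    candidates = {region: ms for region, ms in latency_ms.items()
                  if healthy.get(region, False)}
    if not candidates:
        return None
    return min(candidates, key=lambda region: candidates[region])


# Hypothetical latencies measured from one user's resolver.
user_latency = {"singapore": 40.0, "tokyo": 95.0}

normal_answer = pick_region(user_latency, {"singapore": True, "tokyo": True})
failover_answer = pick_region(user_latency, {"singapore": False, "tokyo": True})
```

How quickly users actually follow a changed answer depends on the DNS record TTL and on resolver caching, which is why the switch is quoted in tens of seconds rather than as instantaneous.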
Hot Active-Active

• The website deployed in both regions can scale and shrink according to load
• Cost effective for large server farm deployments
• Low latency achieved through traffic direction
• No customers are lost because of load or availability problems. Ops are happy!
Hot Active-Active

• Technically complex and intricate setup
• Costlier to build and operate (sophistication comes at a cost)
• No unified infrastructure management currently exists for this architecture
  – Example: Directional DNS console
  – AWS console
  – Puppet console
Summary

• Understood how to architect HA on AWS for the LAMJ website case
• Understood AWS building blocks for HA and fault tolerance
• How to achieve High Availability across AWS Availability Zones (AZs)?
• How to achieve High Availability across AWS regions?
Need help architecting High Availability solutions on AWS?
Leave it to the experts, we will handle this



Cloud Architecture Consulting
Cloud Application Development
Cloud Migration & Implementation
Cloud Adoption Strategy


                                   “Let's get the job done”
Q&A


“All you need is an idea and the cloud will execute it for you.” (Structure 2010 event)
                           - Dr Werner Vogels , CTO of Amazon on 8KMiles

                                    Contact :

                                cloud@8KMiles.com

                               harish@8KMiles.com

                           www.twitter.com/harish11g

                 http://www.linkedin.com/in/harishganesan

Weitere ähnliche Inhalte

Was ist angesagt?

Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...
Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...
Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...Amazon Web Services
 
Auto scaling websites in the cloud
Auto scaling websites in the cloudAuto scaling websites in the cloud
Auto scaling websites in the cloudDavid Veksler
 
Building Fault Tolerant Applications in the cloud - AWS Summit 2012 - NYC
Building Fault Tolerant Applications in the cloud - AWS Summit 2012 - NYC Building Fault Tolerant Applications in the cloud - AWS Summit 2012 - NYC
Building Fault Tolerant Applications in the cloud - AWS Summit 2012 - NYC Amazon Web Services
 
Scale New Business Peaks with Amazon AutoScaling - Harish Ganesan
Scale New Business Peaks with Amazon AutoScaling - Harish GanesanScale New Business Peaks with Amazon AutoScaling - Harish Ganesan
Scale New Business Peaks with Amazon AutoScaling - Harish GanesanAmazon Web Services
 
T1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsT1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsAmazon Web Services
 
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...Amazon Web Services
 
AWS Summit 2011: Designing Fault Tolerant Applicatons
AWS Summit 2011: Designing Fault Tolerant ApplicatonsAWS Summit 2011: Designing Fault Tolerant Applicatons
AWS Summit 2011: Designing Fault Tolerant ApplicatonsAmazon Web Services
 
Scalable Database Options on AWS
Scalable Database Options on AWSScalable Database Options on AWS
Scalable Database Options on AWSAmazon Web Services
 
Cloud Connect 2013- Lock Stock and x Smoking EC2's
Cloud Connect 2013- Lock Stock and x Smoking EC2'sCloud Connect 2013- Lock Stock and x Smoking EC2's
Cloud Connect 2013- Lock Stock and x Smoking EC2'sHarish Ganesan
 
Journey Through the AWS Cloud; Application Services
Journey Through the AWS Cloud; Application ServicesJourney Through the AWS Cloud; Application Services
Journey Through the AWS Cloud; Application ServicesAmazon Web Services
 
Amazon Ec2 Application Design
Amazon Ec2 Application DesignAmazon Ec2 Application Design
Amazon Ec2 Application Designguestd0b61e
 
Building Web Scale Applications with AWS
Building Web Scale Applications with AWSBuilding Web Scale Applications with AWS
Building Web Scale Applications with AWSAmazon Web Services
 
Building High-availability Websites on AWS
Building High-availability Websites on AWSBuilding High-availability Websites on AWS
Building High-availability Websites on AWSAmazon Web Services
 
AWS RDS Presentation - DOAG Conference
AWS RDS Presentation - DOAG Conference AWS RDS Presentation - DOAG Conference
AWS RDS Presentation - DOAG Conference Amazon Web Services
 

Was ist angesagt? (20)

Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...
Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...
Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...
 
Auto Scaling Groups
Auto Scaling GroupsAuto Scaling Groups
Auto Scaling Groups
 
Your First Week with Amazon EC2
Your First Week with Amazon EC2Your First Week with Amazon EC2
Your First Week with Amazon EC2
 
Auto scaling websites in the cloud
Auto scaling websites in the cloudAuto scaling websites in the cloud
Auto scaling websites in the cloud
 
Building Fault Tolerant Applications in the cloud - AWS Summit 2012 - NYC
Building Fault Tolerant Applications in the cloud - AWS Summit 2012 - NYC Building Fault Tolerant Applications in the cloud - AWS Summit 2012 - NYC
Building Fault Tolerant Applications in the cloud - AWS Summit 2012 - NYC
 
CMS on AWS Deep Dive
CMS on AWS Deep DiveCMS on AWS Deep Dive
CMS on AWS Deep Dive
 
Scale New Business Peaks with Amazon AutoScaling - Harish Ganesan
Scale New Business Peaks with Amazon AutoScaling - Harish GanesanScale New Business Peaks with Amazon AutoScaling - Harish Ganesan
Scale New Business Peaks with Amazon AutoScaling - Harish Ganesan
 
T1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsT1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on aws
 
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...
 
AWS Summit 2011: Designing Fault Tolerant Applicatons
AWS Summit 2011: Designing Fault Tolerant ApplicatonsAWS Summit 2011: Designing Fault Tolerant Applicatons
AWS Summit 2011: Designing Fault Tolerant Applicatons
 
Scalable Database Options on AWS
Scalable Database Options on AWSScalable Database Options on AWS
Scalable Database Options on AWS
 
Cloud Connect 2013- Lock Stock and x Smoking EC2's
Cloud Connect 2013- Lock Stock and x Smoking EC2'sCloud Connect 2013- Lock Stock and x Smoking EC2's
Cloud Connect 2013- Lock Stock and x Smoking EC2's
 
Journey Through the AWS Cloud; Application Services
Journey Through the AWS Cloud; Application ServicesJourney Through the AWS Cloud; Application Services
Journey Through the AWS Cloud; Application Services
 
Intro to Amazon ECS
Intro to Amazon ECSIntro to Amazon ECS
Intro to Amazon ECS
 
Introduction to AWS Batch
Introduction to AWS BatchIntroduction to AWS Batch
Introduction to AWS Batch
 
Amazon EC2 Masterclass
Amazon EC2 MasterclassAmazon EC2 Masterclass
Amazon EC2 Masterclass
 
Amazon Ec2 Application Design
Amazon Ec2 Application DesignAmazon Ec2 Application Design
Amazon Ec2 Application Design
 
Building Web Scale Applications with AWS
Building Web Scale Applications with AWSBuilding Web Scale Applications with AWS
Building Web Scale Applications with AWS
 
Building High-availability Websites on AWS
Building High-availability Websites on AWSBuilding High-availability Websites on AWS
Building High-availability Websites on AWS
 
AWS RDS Presentation - DOAG Conference
AWS RDS Presentation - DOAG Conference AWS RDS Presentation - DOAG Conference
AWS RDS Presentation - DOAG Conference
 

Ähnlich wie Aws 201:Advanced Breakout Track on HA and DR

Running High Availability Websites with Acquia and AWS
Running High Availability Websites with Acquia and AWSRunning High Availability Websites with Acquia and AWS
Running High Availability Websites with Acquia and AWSAcquia
 
Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conf...
Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conf...Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conf...
Scalable Architecture on Amazon AWS Cloud - Indicthreads cloud computing conf...IndicThreads
 
PHP LAMP AWS RightSscale
PHP LAMP AWS RightSscalePHP LAMP AWS RightSscale
PHP LAMP AWS RightSscalemaxgribov
 
Getting Started with Docker on AWS
Getting Started with Docker on AWSGetting Started with Docker on AWS
Getting Started with Docker on AWSAmazon Web Services
 
AWS Architecting Cloud Apps - Best Practices and Design Patterns By Jinesh Varia
AWS Architecting Cloud Apps - Best Practices and Design Patterns By Jinesh VariaAWS Architecting Cloud Apps - Best Practices and Design Patterns By Jinesh Varia
AWS Architecting Cloud Apps - Best Practices and Design Patterns By Jinesh VariaAmazon Web Services
 
AWS Webcast - Best Practices in Architecting for the Cloud
AWS Webcast - Best Practices in Architecting for the CloudAWS Webcast - Best Practices in Architecting for the Cloud
AWS Webcast - Best Practices in Architecting for the CloudAmazon Web Services
 
AWS fault tolerant architecture
AWS fault tolerant architectureAWS fault tolerant architecture
AWS fault tolerant architectureskadyan1
 
Architecting for the Cloud: Best Practices
Architecting for the Cloud: Best PracticesArchitecting for the Cloud: Best Practices
Architecting for the Cloud: Best PracticesAmazon Web Services
 
NWCloud Cloud Track - Best Practices for Architecting in the Cloud
NWCloud Cloud Track - Best Practices for Architecting in the CloudNWCloud Cloud Track - Best Practices for Architecting in the Cloud




AWS 201: Advanced Breakout Track on HA and DR

  • 1. AWS 201: Breakout Track Singapore. “Design for Failure”: HA and DR Best Practices. Harish Ganesan, Co-founder & CTO, 8KMiles. www.twitter.com/harish11g http://www.linkedin.com/in/harishganesan
  • 2. Agenda • Explain HA architecture with a real customer case • Understand how to architect a web app in AWS with – High Availability – DR – Scalability • Why AWS?
  • 3. About the Customer • Online ecommerce company • NASDAQ listed • Application consumed by online users, mobile and web services
  • 4. Requirements • High Availability on all tiers with no SPOF • Auto-scalable and elastic infrastructure • Ability to serve millions of requests per day • Serve peak HTTP traffic of 8000+ reqs/sec • Serve peak HTTPS traffic of 2500+ reqs/sec • 65% of the business is done during the holidays, so no downtime is affordable • Ease of monitoring, backup and deployment • Optimal DR setup (cost vs RTO/RPO)
  • 5. Technology and Tiers • Multi-tiered Linux, Apache, Java web site on AWS • Database tier using MySQL • Cache tier • Integration tier with queues and background programs • HTTP and HTTPS protocols
  • 6. What did 8KMiles do? • Consulting: Architected the entire website infrastructure on AWS • Implementation: – Configured the infrastructure on AWS – Developed custom DevOps scripts on AWS • Supported the site during Thanksgiving and the holidays • Cloud Development Partner: – Currently re-engineering the customer app to leverage more AWS services
  • 8. A simple LAMJ Architecture (1) Web/App Server interacts with MySQL for queries and transactions [Diagram: US-EAST-1a, AWS Security Groups: Web/App/Cache Server, Integration Services, MySQL DB; CloudWatch]
  • 9. What is the problem with this architecture?
  • 10. A simple LAMJ Architecture • Single Point of Failure at multiple tiers • Not a Highly Available architecture [Diagram: US-EAST-1a, AWS Security Groups: Web/App/Cache Server, Integration Services, MySQL DB; CloudWatch]
  • 11. How to avoid SPOF and build a robust architecture?
  • 12. Step 1: Distribute the Application to Multiple Tiers (1) Separate out the individual tiers into separate EC2 instances [Diagram: US-EAST-1a, AWS Security Groups: Web/App Server, Integration Service tier, MySQL DB; CloudWatch]
  • 13. Step 2: Add Multiple Servers in Each Layer (1) Add multiple EC2 instances in every tier [Diagram: US-EAST-1a, AWS Security Groups: Web/App Server, Integration Service tier, MySQL DB; CloudWatch]
  • 14. Building HA @ Load Balancing Tier
  • 15. Load Balancing Tier • Load Balancing Options – ELB – HAProxy – Nginx
  • 16. Why AWS ELB? • AWS ELB provides a load balancing service that can front thousands of EC2 servers • AWS ELB automatically scales the load balancing servers in the backend up and down • AWS ELB has no fixed theoretical limit on its response rate • It can handle 20000+ concurrent requests easily (RightScale benchmark) • AWS ELB works seamlessly with AWS Auto Scaling
  • 17. Why AWS ELB? • AWS ELB integrates well with other AWS services • No maintenance • Pay as you go
  • 18. Load Balancing Layer (1) Simple round-robin algorithm (2) Health checks, SSL termination (3) ELB is a highly available service with no SPOF [Diagram: Online / Web / Mobile traffic enters the AWS Elastic Load Balancer; US-EAST-1a, AWS Security Groups: Web/App Server, MySQL DB]
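The round-robin and health-check behaviour named on slide 18 can be illustrated with a small sketch. This is not how ELB is implemented internally; it is a toy model, and the class name `RoundRobinBalancer` and the backend names `web-1` to `web-3` are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_unhealthy(self, backend):
        # In a real setup a periodic health check would call this.
        self.healthy.discard(backend)

    def mark_healthy(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Visit each backend at most once per call, skipping unhealthy ones.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


lb = RoundRobinBalancer(["web-1", "web-2", "web-3"])
first_pass = [lb.next_backend() for _ in range(3)]
lb.mark_unhealthy("web-2")  # simulate a failed health check
after_failure = [lb.next_backend() for _ in range(4)]
```

Once `web-2` is marked unhealthy, requests alternate between the two remaining servers, which is the effect the health check is meant to achieve.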
  • 19. Building HA @ Web/App Tier
  • 20. High Availability @ Web/App Tier (1) Add AWS Auto Scaling to the Web/App tier (2) Tie AWS Auto Scaling to AWS ELB (3) Deploy the app using Puppet [Diagram: AWS Elastic Load Balancer; US-EAST-1a, AWS Security Groups: Web/App Server under Auto Scaling, S3, Puppet, Integration Service Tier, MySQL DB]
  • 21. Designing HA @ Web/App Tier • AWS Auto Scaling replaces unhealthy EC2 instances • AWS Auto Scaling ensures that a minimum number of Web/App EC2 instances is always running • In the event of a failure, new instances are launched automatically within 30-120 seconds • ELB traffic is seamlessly directed to the auto-scaled EC2 instances
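The "minimum number of instances" guarantee on slide 21 amounts to a reconciliation loop: drop unhealthy instances, then launch replacements until the minimum is met. A minimal simulation follows; it does not call any AWS API, and `ScalingGroup`, `reconcile` and the instance id format are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ScalingGroup:
    """In-memory stand-in for an Auto Scaling group with a minimum size."""

    min_size: int
    instances: dict = field(default_factory=dict)  # instance id -> healthy flag
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def launch(self):
        instance_id = f"i-{next(self._ids):04d}"
        self.instances[instance_id] = True
        return instance_id

    def reconcile(self):
        """Terminate unhealthy instances, then launch up to min_size."""
        unhealthy = [i for i, ok in self.instances.items() if not ok]
        for instance_id in unhealthy:
            del self.instances[instance_id]
        launched = []
        while len(self.instances) < self.min_size:
            launched.append(self.launch())
        return unhealthy, launched


group = ScalingGroup(min_size=2)
group.reconcile()                  # initial fill: i-0001 and i-0002
group.instances["i-0001"] = False  # simulate a failed health check
terminated, launched = group.reconcile()
```

After the second `reconcile()` call the group is back at two healthy instances, with `i-0001` replaced by a freshly launched one.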
  • 22. Designing HA @ Web/App Tier • Deploy the application and patches in the Auto Scaling environment using Puppet / S3 scripts • Choose the right EC2 instance type – Large (less CPU intensive, heap 5.5 GB RAM) – High-CPU Extra Large (more CPU intensive, heap 5.5 GB RAM, concurrent GC) • Points to remember – Do not store sessions in the memory of the web/app server – Rotate the log files and move them to S3 periodically – Move uploaded data files and images to S3 or GlusterFS
  • 23. What happens when the US-EAST-1a AZ fails? Solution: Leverage AWS Multi-AZ architecture
  • 25. (1) Infrastructure is spread across multiple AZs of AWS inside a Region; HTTP/S requests hit the Amazon Load Balancer from the browser or mobile devices (2) AWS Elastic Load Balancer directs requests to EC2 instances across multiple AZs (3) Amazon Auto Scaling automatically launches new EC2 instances across multiple AZs (4) No code changes are required to leverage Multi-AZ [Diagram: AWS Elastic Load Balancer; AZ US-EAST-1a and AZ US-EAST-1b inside AWS Security Groups, each with Web/App EC2 under Auto Scaling]
  • 26. High Availability @ Web/App/DEX Layer • AZs are connected by a low-latency network • AZs are insulated from failures in other Availability Zones * • AWS Auto Scaling can manage EC2 instances across AZs • AWS ELB can direct load to EC2 instances across AZs • AWS CloudWatch can monitor EC2 instance availability across AZs
  • 28. Database Tier • Options – MySQL Master-Slave replication – MySQL NDB Cluster – RDS MySQL Master/Standby – RDS MySQL Master/Standby + Read Replicas
  • 29. High Availability @ DB Layer (1) Read Replicas are launched in multiple AZs for HA (2) The RDS Standby is launched in a different AZ from the RDS Master for HA (3) Web/App hosted on Amazon EC2 transacts with the RDS Master and reads from the Read Replicas [Diagram: AWS Elastic Load Balancer; US-EAST-1a and US-EAST-1b inside AWS Security Groups, each with Web/App EC2 under Auto Scaling and a Read Replica; RDS Master and RDS Standby; S3; CloudWatch]
  • 30. High Availability @ DB Layer • RDS Master and RDS Standby in multiple AZs for HA • Read Replicas in multiple AZs for HA • No SPOF at the AZ level • Read Replicas can be launched or terminated without affecting RDS Master availability • In the event of an RDS Master failure, the RDS Standby is promoted automatically • Promotion takes under 180 seconds and requires no changes in the application
  • 31. High Availability @ DB Layer • DB snapshots and MySQL dumps are available • Automatic full backups at the configured backup window • Point-in-time recovery up to the last minute • Recovery might require app-layer configuration changes
  • 32. High Availability @ DB Layer • Points to remember – RDS point-in-time restore and snapshot restore are supported only with the MySQL InnoDB engine – Give more memory to the RDS Master • Use Extra Large or High-Memory instance types – Keep the Read Replicas the same size as the RDS Master – Multiple Read Replicas can be load balanced using HAProxy
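Slides 29 to 32 describe sending transactions to the RDS Master and spreading reads over the Read Replicas. The deck does this with HAProxy. The sketch below is a swapped-in application-level stand-in for the same idea. `ReadWriteRouter` and the endpoint names are hypothetical, and the `SELECT` prefix check is a deliberate simplification.

```python
from itertools import cycle


class ReadWriteRouter:
    """Route writes to the master and spread reads over replicas round-robin."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        # Simplification: anything that is not a plain SELECT goes to the master.
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.master


router = ReadWriteRouter("master.db.internal",
                         ["replica-1.db.internal", "replica-2.db.internal"])
read_a = router.endpoint_for("SELECT * FROM orders WHERE id = 7")
read_b = router.endpoint_for("select count(*) from orders")
write = router.endpoint_for("INSERT INTO orders (id) VALUES (8)")
```

Because replication to Read Replicas is asynchronous, reads that must see a just-committed write would still need to go to the master; the sketch does not model that case.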
  • 33. Use AWS Building blocks in your architecture
  • 34. Use AWS Building Blocks • AWS building blocks come with built-in – Inherent fault tolerance – HA and scalability • The following building blocks were used – S3, CloudFront, Route 53, CloudWatch, SNS, SQS, SES, ELB, EIP, EBS
  • 35. Application Architecture in AWS [Diagram: Browser / Web Services / Mobile; Route 53; AWS CloudFront CDN; Elastic Load Balancer; AZ US-EAST-1a and AZ US-EAST-1b inside AWS Security Groups, each with Amazon EC2 Servers under Auto Scaling; ElastiCache; DB Master, DB Standby, Read Slave 1, Read Slave 2; S3; SQS; Puppet; CloudWatch; AWS Simple Email Service; AWS Simple Notification Service (Alerts)]
  • 36. How are they used in the project? • ELB – Load balancing • Route 53 – DNS mappings, round-robin algorithm • CloudFront – Assets, HTML, CSS, JS, images • S3 – Logs, snapshots, images • CloudWatch – Monitor CPU, ELB, RDS, custom metrics • SNS – System alerts • SES – Emails (password, activation, app alerts) • EBS – EBS-backed AMI for the Web/App tier • EIP – Elastic IP for the Puppet server
  • 37. What happens if an entire AWS Region is affected? Solution: Design HA/DR across Regions
  • 38. High Availability across AWS Regions • The DR web site is hosted in the AWS Tokyo region • The main web site is hosted in the AWS Singapore region
  • 39. DR / HA Options in AWS (recovery time, relative cost) • Cold DR: > few hours, $ • Warm DR: > 1-2 hours, $$ • Hot DR: in minutes, $$$ • Hot Active: no downtime, $$$$
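The cost versus recovery-time trade-off on slide 39 can be turned into a simple selection rule: pick the cheapest option whose recovery time fits the business target. In the sketch below the minute values are illustrative assumptions standing in for the slide's "few hours", "1-2 hours" and "minutes"; they are not figures from the deck. `cheapest_option` is a hypothetical helper.

```python
from typing import NamedTuple


class DrOption(NamedTuple):
    name: str
    rto_minutes: int    # assumed worst-case recovery time, illustrative only
    relative_cost: int  # 1 ($) to 4 ($$$$), following the slide's ordering


DR_OPTIONS = [
    DrOption("Cold DR", 480, 1),
    DrOption("Warm DR", 120, 2),
    DrOption("Hot DR", 15, 3),
    DrOption("Hot Active", 0, 4),
]


def cheapest_option(max_rto_minutes):
    """Return the lowest-cost option that recovers within the target."""
    candidates = [o for o in DR_OPTIONS if o.rto_minutes <= max_rto_minutes]
    if not candidates:
        raise ValueError("no DR option satisfies the requested RTO")
    return min(candidates, key=lambda o: o.relative_cost).name


choice_4h = cheapest_option(240)
choice_30m = cheapest_option(30)
choice_zero = cheapest_option(0)
```

A real decision would also weigh RPO (acceptable data loss), which the later slides discuss per option.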
  • 40. Cold DR [Diagram: Amazon Route 53 in front of two sites, Passive (AWS Tokyo) and Active (AWS Singapore); each site has an ELB, Web/App EC2 instances and a Database Layer with Master and Standby; Puppet; DB snapshots / dumps synced every X hours]
  • 41. Cold DR • When the primary is down, the entire secondary site is activated manually • RTO: more than a few hours to get the secondary site up and running • RPO: data loss is acceptable • CloudFormation templates can be configured on the primary and secondary sites • AMIs, app and DB data are synced periodically
  • 42. Cold DR • EIP problem – Integration Services (FTP, Web Services) • Cost effective • Most common
  • 43. Warm DR [Diagram: Amazon Route 53 in front of two sites, Passive (AWS Tokyo) and Active (AWS Singapore); each site has an ELB, Web/App EC2 instances, a Database Layer with Master and Standby, and Puppet; asynchronous replication of databases between AWS regions]
  • 44. Warm DR • When the primary is down, the secondary site is activated manually • RTO: more than 1 hour to get the secondary site up and running • RPO: minimal data loss is acceptable • CloudFormation templates can be configured on the primary and secondary sites • DB data are replicated asynchronously • Only the DB and Puppet servers are ready and running
  • 45. Warm DR • AMIs, application patches and deployments are managed through Puppet • EIP problem – Integration Services (FTP, Web Services) • Costlier than Cold DR • Recommended in many use cases
  • 46. Hot DR [Diagram: Amazon Route 53 in front of two sites in AWS Singapore and AWS Tokyo, one Passive and one Active; each site has an ELB, Web/App EC2 instances, a Database Layer with Master and Standby, and Puppet; asynchronous replication of databases between AWS regions]
  • 47. Hot DR • When the primary is down, the secondary site is activated • RTO: more than a few minutes to get the secondary site up and running • RPO: very minimal data loss is acceptable • CloudFormation templates can be configured on the primary and secondary sites • All the tiers in the secondary are ready and running, but not serving live transactions
  • 48. Hot DR • DB data are replicated asynchronously • AMIs, application patches and deployments are managed through Puppet • EIP problem – Integration Services (FTP, Web Services) • Costlier than Warm DR • Rarely used
  • 49. Hot Active [Diagram: Directional DNS / traffic and Amazon Route 53 in front of two Active sites, AWS Singapore and AWS Tokyo; each site has an ELB, Web/App EC2 instances, a Database Layer with Master and Standby, and Puppet; 2-way asynchronous replication of databases between AWS regions]
  • 50. Hot Active-Active • Both the primary and secondary sites are active • RTO: more than a few seconds to direct traffic from the primary to the secondary site • RPO: negligible data loss • A managed DNS service provides automatic failover at the DNS level in case of an outage at the primary website location • Transparent switch between the websites hosted in AWS Singapore and AWS Tokyo within 30-60 seconds during an outage
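The DNS-level failover on slide 50 can be sketched as a resolver that switches to the secondary site after several consecutive failed health checks. This is a simplified model, not the behaviour of any specific managed DNS product. `FailoverResolver`, the site names, the 10-second check interval and the threshold of 3 are assumptions chosen so that detection lands near the slide's 30-60 second window.

```python
class FailoverResolver:
    """Answer with the primary site until it fails several checks in a row."""

    def __init__(self, primary, secondary, failure_threshold=3):
        self.primary = primary
        self.secondary = secondary
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def record_health_check(self, primary_ok):
        # A single success resets the counter, so brief blips do not fail over.
        if primary_ok:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    def resolve(self):
        if self.consecutive_failures >= self.failure_threshold:
            return self.secondary
        return self.primary


CHECK_INTERVAL_SECONDS = 10  # assumed health check interval

resolver = FailoverResolver("singapore.example.com", "tokyo.example.com")
answers = []
for primary_ok in [True, False, False, False, True]:
    resolver.record_health_check(primary_ok)
    answers.append(resolver.resolve())

detection_seconds = resolver.failure_threshold * CHECK_INTERVAL_SECONDS
```

In practice the DNS record TTL adds to the switch-over time, since clients keep using cached answers until the TTL expires.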
  • 51. Hot Active-Active • Automatic traffic diversion to the nearest site location • Managed/Directional DNS servers are a globally distributed and highly available service • Persistent data are replicated asynchronously (2-way) • AMIs, application patches and deployments are managed through distributed Puppet • EIP problem – Integration Services (FTP, Web Services) • Use-case specific
  • 52. Hot Active-Active • The website deployed in both regions can scale and shrink according to load • Cost effective for large server farm deployments • Low latency achieved through traffic direction • No customers are lost because of load or availability problems. Ops are happy!
  • 53. Hot Active-Active • Technically complex and intricate setup • Costlier to build and operate (sophistication comes at a cost) • No unified infrastructure management currently exists for this architecture – Example: Directional DNS console – AWS console – Puppet console
  • 54. Summary • Understood how to architect HA on AWS for the LAMJ website case • Understood AWS building blocks for HA and fault tolerance • How to achieve High Availability across AWS Availability Zones (AZs) • How to achieve High Availability across AWS regions
  • 55. Need help architecting High Availability solutions on AWS?
  • 56. Leave it to the experts, we will handle this • Cloud Architecture Consulting • Cloud Application Development • Cloud Migration & Implementation • Cloud Adoption Strategy “Let's get the job done”
  • 57. Q&A “All you need is an idea and the cloud will execute it for you.” (Structure 2010 event) - Dr Werner Vogels, CTO of Amazon, on 8KMiles. Contact: cloud@8KMiles.com harish@8KMiles.com www.twitter.com/harish11g http://www.linkedin.com/in/harishganesan