Protect your app from Outages
Ron Zavner, Applications Architect at Gigaspaces


                                              February 2013
AGENDA


   AWS and outages
   Outage impact
   Disaster Recovery – it’s all about redundancy!
   Cloudify as a solution for redundancy
   Demo with Cloudify on EC2




2               ® Copyright 2013 GigaSpaces Ltd. All Rights Reserved
AWS USAGE


                •   AWS – around 0.5M servers
                •   Facebook – less than 0.1M servers
                •   Google – around 1M servers




THE OUTAGE PROBLEM




OUTAGE – APRIL 21, 2011




OUTAGE - JUNE 29, 2012




OUTAGE - OCTOBER 22, 2012




OUTAGE - CHRISTMAS EVE 2012




IS THAT WHAT YOU EXPECT?




99% - 3.65 days downtime per year
99.9% - 8.76 hours downtime per year
99.99% - 52.56 minutes downtime per year
99.999% - 5.26 minutes downtime per year
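
The figures above follow from multiplying the unavailable fraction by the length of a year. A minimal sketch of that arithmetic, assuming a 365-day year; the function name `downtime_minutes_per_year` is illustrative and not part of the original deck:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years are ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, allowed at a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


for pct in (99, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% -> {minutes:.2f} min/year ({minutes / 60:.2f} h)")
```

Each extra "nine" divides the allowed downtime by ten, which is why the step from 99.9% to 99.999% moves the budget from hours to minutes.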




OUTAGE IMPACT – DESIGN FOR FAILURES




Outage could cost…
$89K per hour for Amadeus
$225K per hour for PayPal!
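
To put the hourly figures in yearly terms, one can multiply them by the downtime implied by an availability level. This is only an illustrative sketch: it assumes a flat hourly cost and a 365-day year, and the helper name `yearly_outage_cost` is hypothetical. The results are arithmetic on the slide's numbers, not reported losses.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760; leap years are ignored

# Per-hour figures as quoted on the slide (USD).
HOURLY_COST_USD = {"Amadeus": 89_000, "PayPal": 225_000}


def yearly_outage_cost(hourly_cost_usd: float, availability_percent: float) -> float:
    """Cost of the downtime implied by an availability level (assumes a flat hourly cost)."""
    downtime_hours = (100 - availability_percent) / 100 * HOURS_PER_YEAR
    return hourly_cost_usd * downtime_hours


for company, cost in HOURLY_COST_USD.items():
    print(f"{company}: ${yearly_outage_cost(cost, 99.9):,.0f} per year at 99.9% availability")
```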




DISASTER RECOVERY




MULTI CLOUD




PREPARE FOR DISASTER RECOVERY




• Assign a dedicated expert for DR architecture
• Define the target recovery time (RTO) and recovery point (RPO)
• Assume every tier can fail
• Use monitoring and alerts
• Document your operational processes




CHAOS MONKEY




CLONE YOUR ENVIRONMENT




CLONE YOUR DATA




CLOUDIFY POSITIONING IN THE CLOUD STACK

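[Diagram: layers of the cloud stack plotted by productivity (vertical axis) against control (horizontal axis)]

• PaaS: Heroku, GAE, CloudFoundry, OpenShift
• DevOps (Automation): Rightscale, Enstratus, Puppet, Chef
• IaaS: Public clouds (AWS, Rackspace, ..), Private clouds (VMware, OpenStack, ..)
• Top-right corner of the chart: High productivity with full control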
CLONE YOUR ENV - HOW DOES IT WORK?
EXTENSIVE PLATFORM SUPPORT




USE ANY CLOUD




GETTING COMPUTE RESOURCES IN A PORTABLE WAY

compute {
  template "SMALL_LINUX"
}

// Template definition for an OpenStack-based cloud
SMALL_LINUX : template{
  imageId "1234"
  machineMemoryMB 3200
  hardwareId "103"
  remoteDirectory "/root/gs-files"
  localDirectory "upload"
  keyFile "gigaPGHP.pem"
  options ([
    "openstack.securityGroup" : "default",
    "openstack.keyPair" : "gigaPGHP"
  ])
  privileged true
}

// Template definition for EC2
SMALL_LINUX : template{
  imageId "us-east-1/ami-76f0061f"
  remoteDirectory "/home/ec2-user/gs-files"
  machineMemoryMB 1600
  hardwareId "m1.small"
  locationId "us-east-1"
  localDirectory "upload"
  keyFile "myKeyFile.pem"
  options ([
    "securityGroups" : ["default"] as String[],
    "keyPair" : "myKeyFile"
  ])
  overrides (["jclouds.ec2.ami-query":"",
              "jclouds.ec2.cc-ami-query":""])
  privileged true
}


DATA REPLICATION


• Cloudify Replicated MySQL Recipe
• Generic replication service using WAN Gateway




GENERIC REPLICATION SERVICE OVER WAN




[Diagram: replication over WAN between sites in London, New York and Hong Kong]

• In-Memory Speed
• Scalable and Efficient
• High Availability and Self-Healing
VERIFI (CURRENT) DEPLOYMENT ARCHITECTURE




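[Diagram: single-region deployment in one availability region (US-West: Oregon), built from 4 recipes]

• mod_cluster on an EC2 instance, facing the Internet
• JBoss on an EC2 instance
• PostgreSQL on an EC2 instance with a data volume
• Cassandra on an EC2 instance with a data volume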
TARGET ARCHITECTURE
Bootstrap two EC2 clouds in different regions and install the “verifi” application on each. The second cloud uses a slightly modified
(extended) Postgres recipe so that it acts as a slave, and it runs no app servers. When the primary region fails, the second cloud spins up
app server instances and turns its data instance into the master, then bootstraps another “slave” cloud in another region.

[Diagram: two availability regions linked by replication]

• Availability Region (US-West Oregon): Internet-facing mod_cluster, JBoss, Postgres Master (with data volume) and Cassandra (with data volume), each on an EC2 instance
• Availability Region (US-East Virginia): mod_cluster, JBoss, Postgres Slave (with data volume) and Cassandra (with data volume), each on an EC2 instance
FAILOVER SCENARIO

• Initial deployment: the primary deployment of the application is bootstrapped onto cloud #1 (Region US-West Oregon: app servers and PostgreSQL). Another, slightly modified application recipe is bootstrapped as cloud #2 (Region US-East Virginia: PostgreSQL only), polling cloud #1 for failure (liveness poll) and acting as a PostgreSQL DB slave.
• Region failure occurs in cloud #1.
• Cloud #2 (Region US-East Virginia): turn the Postgres slave into master and start app server instances*.
• Cloud #3 (Region US-West California): bootstrap another cloud in a different region using the same application recipe used to bootstrap cloud #2 above*. A liveness poll now runs between cloud #2 and cloud #3.
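
The failover sequence can be sketched as a small state machine on the standby side. This is a hedged illustration, not Cloudify's implementation: the `StandbyCloud` class, the `failure_threshold` of three consecutive missed polls, and the recorded action strings are all assumptions made for the example, and real poll results would come from a health check rather than a list of booleans.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StandbyCloud:
    """Hypothetical model of the standby cloud (cloud #2 in the slides)."""
    name: str
    failure_threshold: int = 3      # assumed: consecutive missed polls before failover
    role: str = "slave"
    app_servers_running: bool = False
    missed_polls: int = 0
    actions: List[str] = field(default_factory=list)

    def poll(self, primary_alive: bool) -> bool:
        """Record one liveness poll; return True if this poll triggered failover."""
        if self.role == "master":
            return False            # already failed over, nothing more to do
        if primary_alive:
            self.missed_polls = 0   # any successful poll resets the counter
            return False
        self.missed_polls += 1
        if self.missed_polls < self.failure_threshold:
            return False
        self._fail_over()
        return True

    def _fail_over(self) -> None:
        # Order follows the slide: promote the database, start app servers,
        # then bootstrap a fresh standby in another region.
        self.role = "master"
        self.actions.append("promote postgres slave to master")
        self.app_servers_running = True
        self.actions.append("start app server instances")
        self.actions.append("bootstrap new standby cloud in another region")


def run_polls(standby: StandbyCloud, results: List[bool]) -> int:
    """Feed a sequence of poll results; return the index that triggered failover, or -1."""
    for i, alive in enumerate(results):
        if standby.poll(alive):
            return i
    return -1


cloud2 = StandbyCloud("us-east-1")
print(run_polls(cloud2, [True, False, True, False, False, False]))  # prints 5
print(cloud2.actions)
```

Requiring several consecutive missed polls is a design choice in this sketch: it trades a slightly longer recovery time for fewer false failovers caused by a single dropped poll.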
DEMO ON EC2 - 5 MINUTES SETUP

/* Credentials - You must enter your
 * cloud provider account credentials
 */


user="ENTER_USER_HERE"
apiKey="ENTER_API_KEY_HERE"
keyFile="ENTER_KEY_FILE_HERE"
keyPair="ENTER_KEY_PAIR_HERE"


// Advanced usage


hardwareId="m1.small"
locationId="us-east-1"
linuxImageId="us-east-1/ami-1624987f"
ubuntuImageId="us-east-1/ami-82fa58eb"




SUMMARY


 AWS and outages
 Outage impact
 Disaster Recovery – it’s all about redundancy!
      Cloning your environment – app stack
      Cloning your DB – Replication
 Cloudify as a solution for Redundancy
      Use recipes to work on any cloud
      Fast and customized data replication
 Demo with Cloudify on EC2



QUESTIONS & ANSWERS




    Thank You!
RonZ@gigaspaces.com

How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 

Protect your app from Outages

  • 1. Protect your app from Outages Ron Zavner, Applications Architect at Gigaspaces February 2013
  • 2. AGENDA • AWS and outages • Outage impact • Disaster Recovery – it’s all about redundancy! • Cloudify as a solution for redundancy • Demo with Cloudify on EC2 ® Copyright 2013 GigaSpaces Ltd. All Rights Reserved
  • 3. AWS USAGE • AWS – around 0.5M servers • Facebook – less than 0.1M servers • Google – around 1M servers
  • 5. OUTAGE – APRIL 21, 2011
  • 6. OUTAGE - JUNE 29, 2012
  • 7. OUTAGE - OCTOBER 22, 2012
  • 8. OUTAGE - CHRISTMAS EVE 2012
  • 9. THAT’S WHAT YOU EXPECT? Downtime per year: • 99% - 3.65 days • 99.9% - 8.76 hours • 99.99% - 53 minutes • 99.999% - 5.26 minutes
  • 10. OUTAGE IMPACT – DESIGN FOR FAILURES An outage could cost… • $89K per hour for Amadeus • $225K per hour for PayPal!
  • 13. PREPARE FOR DISASTER RECOVERY • Dedicated expert for DR architecture • Define target recovery time & point • Assume every tier can fail • Use monitoring and alerts • Document your operational processes
  • 19. CLOUDIFY POSITIONING IN THE CLOUD STACK [Diagram; axes: Productivity and Control; label: High productivity with full control] • PaaS: Heroku, CloudFoundry, GAE, OpenShift • DevOps (Automation): Rightscale, Enstratus, Puppet, Chef • IaaS: Public clouds (AWS, Rackspace, ..), Private clouds (VMware, OpenStack, ..)
  • 20. CLONE YOUR ENV - HOW DOES IT WORK?
  • 21. EXTENSIVE PLATFORM SUPPORT
  • 23. GETTING COMPUTE RESOURCES IN A PORTABLE WAY
      compute { template "SMALL_LINUX" }

      // First template on the slide (uses OpenStack options)
      SMALL_LINUX : template {
        imageId "1234"
        machineMemoryMB 3200
        hardwareId "103"
        remoteDirectory "/root/gs-files"
        localDirectory "upload"
        keyFile "gigaPGHP.pem"
        options ([
          "openstack.securityGroup" : "default",
          "openstack.keyPair" : "gigaPGHP"
        ])
        privileged true
      }

      // Second template on the slide (uses EC2 settings)
      SMALL_LINUX : template {
        imageId "us-east-1/ami-76f0061f"
        remoteDirectory "/home/ec2-user/gs-files"
        machineMemoryMB 1600
        hardwareId "m1.small"
        locationId "us-east-1"
        localDirectory "upload"
        keyFile "myKeyFile.pem"
        options ([
          "securityGroups" : ["default"] as String[],
          "keyPair" : "myKeyFile"
        ])
        overrides (["jclouds.ec2.ami-query":"", "jclouds.ec2.cc-ami-query":""])
        privileged true
      }
  • 24. DATA REPLICATION • Cloudify Replicated MySQL Recipe • Generic replication service using WAN Gateway
  • 25. GENERIC REPLICATION SERVICE OVER WAN [Diagram; sites: London, New York, Hong Kong] • In-Memory Speed • Scalable and Efficient • High Availability and Self-Healing
  • 27. VERIFI (CURRENT) DEPLOYMENT ARCHITECTURE [Diagram labels: Internet; mod_cluster, JBoss, PostgreSQL and Cassandra on EC2 instances with data volumes; 4 recipes; availability region US-West (Oregon)]
  • 28. TARGET ARCHITECTURE Bootstrap two EC2 clouds in different regions and install the “verifi” application on each. The second cloud will have a slightly modified (extended) Postgres recipe so that it acts as a slave, with no running app servers. Upon failure of the primary zone, the second cloud will spin up instances of the app servers and turn the data instance into the master, then bootstrap another “slave” cloud in another zone. [Diagram labels: Internet; mod_cluster, JBoss, Cassandra and data volumes on EC2 instances in each region; replication from Postgres Master to Postgres Slave; availability regions US-West (Oregon) and US-East (Virginia)]
  • 29. FAILOVER SCENARIO Upon initial deployment, the primary deployment of the application will be bootstrapped onto cloud #1. Another, slightly modified application recipe will be bootstrapped as cloud #2, polling cloud #1 for failure and acting as a PostgreSQL DB slave. [Diagram, before failure: Cloud #1 and Cloud #2 with PostgreSQL and App Servers, liveness poll, regions US-West (Oregon) and US-East (Virginia)] When a region failure occurs: • Turn the Postgres slave into the master and start app server instances* • Bootstrap another cloud in a different region using the same application recipe used to bootstrap cloud #2 above* [Diagram, after failure: Cloud #1, Cloud #2 and Cloud #3 with App Servers and PostgreSQL, liveness poll, regions US-West (California) and US-East (Virginia)]
  • 30. DEMO ON EC2 - 5 MINUTES SETUP
      /* Credentials - You must enter your
       * cloud provider account credentials */
      user="ENTER_USER_HERE"
      apiKey="ENTER_API_KEY_HERE"
      keyFile="ENTER_KEY_FILE_HERE"
      keyPair="ENTER_KEY_PAIR_HERE"

      // Advanced usage
      hardwareId="m1.small"
      locationId="us-east-1"
      linuxImageId="us-east-1/ami-1624987f"
      ubuntuImageId="us-east-1/ami-82fa58eb"
  • 31. SUMMARY • AWS and outages • Outage impact • Disaster Recovery – it’s all about redundancy! • Cloning your environment – app stack • Cloning your DB – Replication • Cloudify as a solution for Redundancy • Use recipes to work on any cloud • Fast and customized data replication • Demo with Cloudify on EC2
  • 32. QUESTIONS & ANSWERS Thank You! RonZ@gigaspaces.com
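
The failover scenario on slide 29 (the standby cloud polls the primary, then promotes itself when the region fails) can be sketched roughly as below. This is only an illustration, not Cloudify code: the `FailoverMonitor` class, the `is_primary_alive` probe, the `max_misses` threshold and the action names are all hypothetical, and a real deployment would call the cloud and database tooling instead of recording strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FailoverMonitor:
    """Runs on the standby cloud (cloud #2) and watches the primary (cloud #1)."""

    is_primary_alive: Callable[[], bool]  # hypothetical liveness probe
    max_misses: int = 3                   # assumed threshold, not from the slides
    misses: int = 0
    role: str = "slave"
    actions: List[str] = field(default_factory=list)

    def poll_once(self) -> str:
        """Perform one liveness poll and return the current role."""
        if self.role == "master":
            return self.role              # already failed over, nothing to poll
        if self.is_primary_alive():
            self.misses = 0               # a healthy poll resets the counter
            return self.role
        self.misses += 1
        if self.misses >= self.max_misses:
            self._fail_over()
        return self.role

    def _fail_over(self) -> None:
        # Steps in the order listed on the slide; recorded here instead of executed.
        self.actions.append("promote_postgres_slave_to_master")
        self.actions.append("start_app_servers")
        self.actions.append("bootstrap_new_slave_cloud")
        self.role = "master"
```

Requiring several consecutive missed polls before promoting is a design choice made for this sketch: it keeps a single dropped poll from triggering a full region failover.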

Editor's notes

  1. A high-ranking Amazon executive said there are 60,000 different customers across the various Amazon Web Services, and most of them are not the startups normally associated with on-demand computing. Rather, the biggest customers, in both number and amount of computing resources consumed, are divisions of banks, pharmaceutical companies and other large corporations that try AWS once for a temporary project and then get hooked. According to Statspotting.com in March 2012, a researcher estimates that Amazon Web Services is using at least 454,400 servers in seven data center hubs around the globe. Let us try this: Google is powered by a million servers, maybe a little more than that, and Amazon has half a million servers. Now things fall into place. Facebook, the service that takes up one fourth of all our time online, is powered by fewer than 100,000 servers. Biggest customers: Pinterest, Instagram, Netflix, Heroku, Quora, Foursquare, etc. Amazon Web Services runs more than 835,000 requests per second for hundreds of thousands of customers in 190 countries, including 300 government agencies and 1,500 educational institutions.
  2. The Amazon cloud proved itself in that sufficient resources were available worldwide, so many well-prepared users could continue operating with relatively little downtime. But because Amazon’s reliability has been incredible, many users were not well prepared, leading to widespread outages. The Amazon EC2 outage of April 2011 was the worst in cloud computing’s history up to that point. It made the front page of many news sites, including the New York Times, probably because many people were shocked by how many web sites and services rely on EC2. Microsoft Azure outages: Dec 28, 2012: some owners of Microsoft's Xbox 360 game console were unable to access some of their cloud-based save storage files. July 26, 2012: service for Microsoft’s Windows Azure Europe region went down for more than two hours. Feb 29, 2012: the ultimate result was service impacts of 8-10 hours for users of Azure data centers in Dublin, Ireland, Chicago, and San Antonio.
  3. Some parts of Amazon Web Services suffered a major outage. A portion of volumes utilizing the Elastic Block Store (EBS) service became "stuck" and were unable to fulfill read/write requests. It took at least two days for service to be fully restored. Reddit, one of the better-known sites to go down due to the error, said it has 700 EBS volumes with Amazon. Sites like Quora and Reddit were able to come back online in "read-only" mode, but users couldn't post new content for many hours.
  4. For the second time in less than a month, Amazon’s Northern Virginia data center suffered an outage, impacting many popular services such as Instagram, Pinterest and Netflix. Several websites that rely on Amazon Web Services were taken offline by a severe storm of historic proportions in the Northern Virginia area, where Amazon's largest data center is located. Amazon previously suffered an outage in its Northern Virginia facilities on June 14, 2012. A line of severe storms packing winds of up to 80 mph caused extensive damage and power outages in Virginia. Dominion Virginia Power crews are assessing damages and will be restoring power where safe to do so.
  5. A major outage occurred, affecting many sites such as Reddit, Foursquare, Pinterest, and others. The cause was a latent bug in an operational data collection agent. A memory leak and a failed monitoring system caused the Amazon Web Services outage on Monday that took out Reddit and other major services. In a post on Friday night, AWS explained that the problem arose after a simple replacement of a data collection server. After installation, the server did not propagate its DNS address correctly, so a fraction of servers did not get the message. Those servers kept trying to reach the server, which led to a memory leak that then went out of control because an internal monitoring alarm failed. Eventually the system ground to a virtual stop and millions of customers felt the pain.
  6. Amazon AWS again suffered an outage, causing services such as Netflix instant video to be unavailable for some customers, particularly in the northeastern US. Amazon later issued a statement detailing the issues with the Elastic Load Balancing service that led up to the outage. The disruption began shortly after noon Pacific time on December 24, when data was accidentally deleted by a developer during maintenance on the East Coast Elastic Load Balancing system, which is designed to distribute traffic volume among servers. "Netflix is designed to handle failure of all or part of a single availability zone in a region as we run across three zones and operate with no loss of functionality on two," the company said in a blog post this afternoon. "We are working on ways of extending our resiliency to handle partial or complete regional outages."
  7. Fault-tolerant systems are measured by their uptime / downtime for end users. Amazon says it is "committed" to 99.95 percent uptime.
  8. Although AWS went offline for only a few hours, the downtime did have an impact on customers’ businesses. There is no known data on the number of people affected by a cloud computing service outage. It is estimated that the travel service provider Amadeus loses $89,000 per hour during any cloud computing outage, while PayPal loses around $225,000 per hour.
  9. DR: the processes and procedures you follow to restore your system after a catastrophic event. Cloud infrastructure has made DR much easier and more affordable compared to previous options. The cloud can also suffer from large-scale failures because of network, power or other IT failures. Application owners need to be responsible for HA and DR; they can use multiple servers, AZs, regions and even clouds. Zones within a region share a LAN, so they have high bandwidth, low latency and private IP access. Zones use separate power resources. Regions are “islands”: they share no resources.
  10. Each cloud is unique in many respects, offering a different API and different functionality for managing resources: a different set of available resources; different formats, encodings and versions; different security groups, machine images, snapshots, etc.
  11. Make sure to have a dedicated expert to manage your DR architecture, processes and testing. Define your target recovery time and recovery point. Be pessimistic and design for failure: assume everything will fail and design a solution that is capable of handling it. Avoid single points of failure: all parts of your app should be highly available (different AZs / regions / clouds), including load balancers, app servers, web servers, message bus and database. Use monitoring and alerts for failover processes and for every change in state. Document your DR operational processes and automations. Try to “break” different parts of your application in different ways (unplug the network, turn a machine off, etc.), then try again.
  12. Netflix has open sourced “Chaos Monkey,” its tool designed to purposely cause failure in order to increase the resiliency of an application in Amazon Web Services (AWS). It’s a timely move, as AWS has had its fair share of outages. With tools like Chaos Monkey, companies can be better prepared when a cloud infrastructure has a failure. In a blog post, Netflix says that this is the first of several tools that it will open source to help companies better manage the services they run in cloud infrastructures. Next up is likely to be Janitor Monkey, which helps keep an environment tidy and costs down. Chaos Monkey has achieved its own fame for its innovative approach. According to Netflix, the tool “randomly disables production instances to make sure it can survive common types of failure without any customer impact. The name comes from the idea of unleashing a wild monkey with a weapon in your data center (or cloud region) to randomly shoot down instances and chew through cables, all the while we continue serving our customers without interruption.”
  13. Netflix provides an excellent toolset for surviving outages at the operations level. In this part I want to zoom in on the design implications for our application. The core principle for surviving failure is actually fairly simple, and it applies to any system, not just the cloud: airplanes, missiles, cars, etc. In the end it's all about redundancy. The degree of tolerance is often determined by how many alternate systems, or parts of the system, we have in our design and how well they are separated from one another. It is also determined by how fast we can detect the broken part of our system and make the switch. In software terms, our system is built out of two main groups of parts: the business logic and the data. Making a redundant software application that can survive failure is often based on setting up clones of those two parts of our system.
  14. We need abstraction: we don’t want to be locked in. We want to use tools that offer this abstraction layer both for daily management and for DR. Such a tool should translate our architecture concepts into cloud-specific properties (using recipes). To clone our application's business logic, we need to ensure that all parts of our system run the exact same version of all our software components. That includes not just the binaries but also the configuration and the scripts that run our application and, more importantly, that all our post-deployment procedures such as failover, scaling and monitoring are also kept consistent. Quite often, what makes cloning our business logic complex is that the information on how to run our application is scattered across many different sources, such as scripts and the minds of the people who run those apps. To make the job of cloning our application much simpler, and thus more consistent, we need to capture all the information for running our apps in the same place. Configuration management tools such as Chef, Puppet and, in the case of Amazon, CloudFormation can help in this regard.
  15. RDS read replica - Amazon RDS uses MySQL’s built-in replication functionality to create a special type of DB Instance called a Read Replica that allows you to elastically scale out beyond the capacity constraints of a single DB Instance for read-heavy database workloads. Once you create a Read Replica, database updates on the source DB Instance are replicated to the Read Replica using MySQL’s native, asynchronous replication. Since Read Replicas leverage standard MySQL replication, they may fall behind their sources, and they are therefore not intended to be used for enhancing fault tolerance in the event of source DB Instance failure or Availability Zone failure.
  16. There are lots of patterns for avoiding failure. It took Netflix a lot of development work to build a framework that can handle them well. Most users and startups don't have the luxury of implementing them themselves. You need a tool that will enable you to automate those patterns in a consistent way. Enter Cloudify.
  17. Cloudify is an enterprise-class open source PaaS stack that sits between your application and your chosen cloud. It enables your application to concentrate on doing what it does best, leaving Cloudify to ensure that the resources it needs are available regardless of the cloud and stack used.
  18. Any App, Any Stack: Move your application to the cloud without making any code changes, regardless of the application stack (Java/Spring, Java EE, Ruby on Rails, …), database store (relational such as MySQL or non-relational such as Apache Cassandra), or any other middleware components it uses. This enables you to achieve your objective of no code changes. To make setting all this up simpler, we tried to bake all those patterns into ready-made tools, scripted into out-of-the-box recipes. The Cloudify recipes include: database cluster recipes with support for MySQL, MongoDB, Cassandra, Postgres, etc.; integration with Chef and Puppet; automation of failover, scaling and continuous maintenance of our application; application recipes that allow you to capture all aspects of running your application, including post-deployment aspects such as failover, scaling and monitoring.
  19. Need to add Puppet graphics next to Chef
  20. Any Cloud: Move your application to any cloud environment, from any environment, at any time. Cloudify supports public clouds (Amazon EC2, Windows Azure, Rackspace, …) and private clouds (OpenStack, cloud.com, VMware vCenter, Citrix XenServer, …). Moreover, to support enterprises that want to deploy the same application in multiple environments (say, for cloud bursting), Cloudify’s framework is designed to be flexible enough to handle any application stack while isolating the application completely from the underlying cloud runtime. Because the APIs and configuration of a cloud are hidden from your application, it can easily be moved from cloud to cloud. This enables you to achieve your objective of no lock-in.
  21. Cloudify MySQL master-slave recipe: the first service instance automatically becomes the master, and the other instances (two in our case) are the slaves. Generic replication service using the XAP WAN Gateway.
  22. Monitor data-mutating SQL statements on the source site: turn on the MySQL query log and write a listener (“Feeder”) to intercept data-mutating SQL statements, then write them to the GigaSpaces In-Memory Data Grid. Replicate data-mutating SQL statements over the WAN: I used GigaSpaces WAN Replication to replicate the SQL statements between the data grids of the primary and secondary sites in a real-time and transactional manner. Execute data-mutating SQL statements on the target site: write a listener (“Processor”) to intercept incoming SQL statements on the data grid and execute them on the local MySQL DB. The network connectivity between the primary and secondary sites can be addressed in several ways, ranging from load balancing between the sites, through setting up a VPN between the sites, up to using designated products such as Cisco’s Connected Cloud Solution.
  23. There are lots of patterns for avoiding failure. It took Netflix a lot of development work to build a framework that can handle them well. Most users and startups don't have the luxury of implementing them themselves. You need a tool that will enable you to automate those patterns in a consistent way. Enter Cloudify.
  24. The cloud brings a lot of promise for making our business more agile. The cloud has also become a huge shared infrastructure in which every failure has a much more significant impact on our business worldwide. The experience of the past year has taught us that even a robust cloud infrastructure such as Amazon's can fail. Through this experience we've learned that rather than relying on the infrastructure to prevent failure, we need to design our systems to cope with failure and get used to failure as a way of life. Having said that, the investment required to build a robust application can be fairly large and is not something that everyone can afford. Tools like Cloudify, Chef, Puppet and, if you're a pure Amazon shop, Netflix <framework> can greatly reduce this effort by making a lot of those patterns pre-baked into recipes.
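
The availability figures on slide 9 and the 99.95 percent commitment in note 7 translate into yearly downtime by simple arithmetic, and the hourly loss estimates in note 8 turn that downtime into a rough cost. The sketch below assumes a 365-day year; the function names are made up for illustration, and the cost calculation is a naive multiplication, not a model of how either company actually loses revenue.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of downtime per year allowed by a given availability level."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def outage_cost(hours: float, cost_per_hour: float) -> float:
    """Naive outage cost: duration multiplied by an hourly loss estimate."""
    return hours * cost_per_hour


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.95, 99.99, 99.999):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% uptime: {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
    # Hourly loss estimate for Amadeus taken from note 8; the pairing with
    # a 99.95% availability level is only an example.
    print(outage_cost(downtime_hours_per_year(99.95), 89_000))
```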