
Zero Downtime JEE Architectures


Zero Downtime Architectures based on JEE platform. Almost every big enterprise with online business tries to design its applications in a way that they are always online. But is it also the case when we upgrade the database cluster? When we switch the whole data center? Based on a customer project we try to present common architecture principles that enable you to do all this without any service interruption and the most important: without any stress.



  1. Zero Downtime Architectures
     Alexander Penev
     ByteSource Technology Consulting GmbH
     Neubaugasse 43, 1070 Vienna, Austria
  2. whoami
     Alexander Penev
     Email: alexander.penev@bytesource.net
     Twitter: @apenev @ByteSourceNet
     JEE, databases, Linux, TCP/IP
     Fan of (automatic) testing, TDD, ADD, BDD…
     Likes to design highly available and scalable systems :-)
  3. Zero Downtime Architectures
     ● Based on a customer project with the classic JEE application stack
       ● Classic web applications with server-side code
       ● HTTP-based APIs
     ● Goals, concepts and implementation techniques
     ● Constraints and limitations
     ● Development guidelines
     ● How these concepts can be applied to the new cutting-edge technologies
       ● Single-page JavaScript-based apps
       ● Mobile clients
       ● REST APIs
       ● Node.js
       ● NoSQL stores
  4. Zero Downtime Architecture?
     ● My database server has 99.999% uptime
     ● We have a Tomcat cluster
     ● Redundant power supply
     ● Second datacenter
     ● Load balancer
     ● Distribute routes over OSPF
     ● Deploy my application online
     ● Second ISP
     ● Session replication
     ● Monitoring
     ● Data replication
     ● Auto restarts
  5. Zero Downtime architecture: our definition
     The services are always available from the end user's point of view.
  6. Our Vision
     Identify all sources of downtime and remove them all
     http://www.meteleco.com/wp-content/uploads/2011/09/p360.jpg
  7. When could we have a downtime (unplanned)?
     ● Human errors
     ● Server node has crashed
       ● Power supply is broken, RAM chip burned out, OS just crashed
     ● Server software just crashed
       ● IO errors, software bug, tablespace full
     ● Network is unavailable
       ● Router crashed, uplink down
     ● Datacenter is down
       ● Uplinks down (the notorious excavator :-) )
       ● Flood/fire
       ● Air conditioning broken
       ● Hit by a nuke (not so often :-) )
  8. When could we need a downtime (planned)?
     ● Replace a hardware part
     ● Replace a router/switch
     ● Firmware upgrade
     ● Upgrade/exchange the storage
     ● Configuration of the connection pool
     ● Configuration of the cluster
     ● Upgrade the cluster software
     ● Recover from a logical data error
     ● Upgrade the database software
     ● Deploy a new version of our software
     ● Move the application to another data center
  9. How can we avoid downtime
     ● Redundancy
       ● Hardware, network
       ● Uplinks
       ● Datacenters
       ● Software
     ● Monitoring
       ● Detect exhausted resources before the application notices them
       ● Detect a failed node and replace it
     ● Software design
       ● Idempotent service calls
       ● Backwards compatibility
       ● Live releases
     ● Scalability
       ● Scale on more load
       ● Protect from attacks (e.g. DDoS)
  10. Requirements for a Zero Downtime Architecture: handling of events of failure or maintenance

      Event/Application category                                             | Online applications | Batch jobs
      -----------------------------------------------------------------------|---------------------|------------------------------------
      Failure or maintenance of an internet uplink/router/switch             | Yes                 | Yes
      Failure or maintenance of a firewall node, load balancer node or a network component | Yes  | Yes
      Failure or maintenance of a web server node                            | Yes                 | N/A
      Failure or maintenance of an application server node                   | Yes                 | partly (will be restarted)
      Failure or maintenance of a database node                              | Yes                 | partly
      Switchover of a datacenter: switching only one application (group)     | Yes                 | Yes (maintenance), partly (failure)
      Switchover of a datacenter: switching all applications                 | Yes                 | Yes (maintenance), partly (failure)
      New application deployment                                             | Yes                 | Yes
      Upgrade of operating system                                            | Yes                 | Yes
      Upgrade of an arbitrary middleware software                            | Yes                 | Yes
      Upgrade of database software                                           | Yes                 | Yes
      Overload of processing nodes                                           | Yes                 | Yes
      Failure of a single JVM                                                | Yes                 | No
      Failure of a node due to leak of system resources                      | Yes                 | No
  11. Our goals and constraints
      ● Reduce downtime to 0
      ● Keep the costs low
        ● No expensive proprietary hardware
      ● Minimize the potential application changes/rewrites
      http://www.signwarehouse.com/blog/how-to-keep-fixed-costs-low/
  12. Our Concepts 1/4
      ● Independent applications or application groups
      ● One application (group) = one IP address
      ● Communication between applications exclusively over this IP address!
      http://www.binaryguys.de/media/catalog/product/cache/1/image/313x313/9df78eab33525d08d6e5fb8d27136e95/3/6/36.noplacelikelocalhost_1_4.jpg
  13. Our Concepts 2/4
      Treat the internet and internal traffic independently
  14. Our Concepts 3/4
      ● Reduce the downtime within a datacenter to 0
        ● Highly available network
        ● Redundant firewalls and load balancers
        ● Web server farms
        ● Application server clusters with session replication
        ● Oracle RAC cluster
        ● Downtime-free application deployments
  15. Our Concepts 4/4
      ● Replicate the data on both datacenters
      ● and make the applications switchable
  16. Implementation: Network (Layer 2)
  17. Concepts: Internet traffic, BGP (Border Gateway Protocol) 1/2
      ● Every datacenter has fully redundant uplinks
      ● Own provider-independent IP address range (assigned by RIPE)
        ● Hard to get at the moment (but not impossible)
      ● Propagate these addresses to the rest of the internet through both ISPs using BGP
        ● Both DCs announce our addresses
        ● The network path of one announcement can be preferred (for cost reasons)
      ● Switch of internet traffic
        ● Gracefully, by changing the preferences of the announcements – not a single TCP session lost
        ● In case of disaster the backup route is propagated automatically within seconds to minutes (depending on the internet distance)
      ● Protects us from connectivity problems between our ISPs and our customers' ISPs
  18. Concepts: Internet traffic, use DNS? 2/2
      ● We don't use DNS for switching
        ● A datacenter switch based on DNS could take up to months to reach all customers and their software (e.g. JVMs caching DNS entries, the default behaviour)
        ● No need to restart browsers, applications and proxies on the customer site. The customer doesn't see any change at all (except that the route to us has changed)
      ● DNS is good for load balancing but not for high availability!
  19. Concepts: Internal traffic
      ● OSPF (Open Shortest Path First) protocol for dynamic routing
        ● Deals with redundant paths completely transparently
        ● Can also do load balancing
      ● The second-level firewalls (in front of the load balancers) announce the address to the rest of the routers
      ● To switch the processing of a service, its firewall just has to announce the route (could also be a /32) with a higher priority; after a second the traffic goes through the new route.
      ● Could also be used for an unattended switch of the whole datacenter
        ● Just announce the same IPs from both sites with different priorities
        ● If one datacenter dies there are only announcements from the other one
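The priority-based announcement described above could be sketched, for example, as an FRR/Quagga `ospfd` fragment. This is an illustrative assumption, not the actual router configuration from the project; the router ID, interface names and cost values are invented:

```
! Announce the service address 10.8.8.23/32 into OSPF area 0.
! The standby site configures a higher interface cost, so its
! route is only preferred when the primary announcement disappears.
router ospf
 ospf router-id 192.0.2.1
 network 10.8.8.23/32 area 0.0.0.0
!
interface eth0
 ip ospf cost 10
! the standby site would use e.g. "ip ospf cost 100" instead
```

With this setup, switching the service to the other datacenter only requires changing the cost (or withdrawing the announcement) on one side; the routers converge within seconds.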
  20. Our Concepts
      ● Independent applications or application groups
      ● Independent internet and internal network traffic
      ● Reduce downtime within a DC
      ● Replicate the data between the DCs and make the application switchable
  21. Zero Downtime within a datacenter
      ● Highly available network
        ● Redundant switches – again using the Spanning Tree Protocol
        ● Redundant firewalls, routers, load balancers
          – Active/passive clusters
          – VRRP protocol implemented by keepalived
          – iptables with conntrackd
      ● Apache web server farms
        ● Managed by the load balancer
      ● Application server cluster
        ● Weblogic cluster
        ● with session replication,
        ● automatic retries and restarts
      ● Oracle RAC database cluster
      ● Deployment without downtime
  22. Failover within one datacenter: Apache plugin (mod_wl)
      Session ID format: sessionid!primary_server_id!secondary_server_id
      Source: http://egeneration.beasys.com/wls/docs100/cluster/wwimages/cluster-06-1-2.gif
  23. Development guidelines (HTTPSession)
      ● If you need a session then you most probably want to replicate it
      ● Example (weblogic.xml)
      ● Generally all requests of one session go to the same application instance
      ● When it fails (answers with 50x, dies or does not answer within a given period) the backup instance is involved
      ● The session attributes are only replicated to the backup node when HTTPSession.setAttribute was called. HTTPSession.getAttribute("foo").changeSomething() will not be replicated!
      ● Every attribute stored in the HTTPSession must be serializable!
      ● The ServletContext will not be replicated in any case.
      ● If you implement caches they will probably have different contents on every node (unless we use a 3rd-party cluster-aware cache). Probably the best practice is not to rely on the data being present and to declare the cache transient
      ● Keep the session small in size and do regular reattaching.
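The weblogic.xml example referenced on this slide is not readable in this export. A minimal sketch of what such a session-replication descriptor typically looks like (element names as in the WebLogic 10.x descriptor; treat the exact values as assumptions, not the project's actual configuration):

```xml
<!-- weblogic.xml: replicate HTTP sessions in memory across the cluster -->
<weblogic-web-app>
  <session-descriptor>
    <!-- use in-memory replication to a secondary node when running clustered -->
    <persistent-store-type>replicated_if_clustered</persistent-store-type>
  </session-descriptor>
</weblogic-web-app>
```

With `replicated_if_clustered` the same deployment unit also works on a single non-clustered server, which keeps development setups simple.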
  24. Development guidelines (cluster handling)
      ● Return proper HTTP return codes to the client
        ● A common practice is to return a well-formed error page with HTTP code 200
        ● That is only a good practice if you are sure that the cluster is incapable of recovering from the error (example: a missing page will be missing on the other node too)
        ● But an exhausted resource (like heap, datasource) could be present on the other node
        ● It is hard to implement this yourself, therefore Weblogic offers you help:
          ● You can bind the number of execution threads to a datasource capacity
          ● Shut down the node if an OutOfMemoryError occurs – but use it with extreme care!
      ● Design for idempotence
        ● Make all your methods idempotent as far as possible.
        ● For those that cannot be idempotent (e.g. sendMoney(Money money, Account account)) prevent re-execution:
          – By using a ticketing service
          – By declaring them as not idempotent:
            <LocationMatch /pathto/yourservlet>
                SetHandler weblogic-handler
                Idempotent OFF
            </LocationMatch>
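The ticketing approach mentioned above could look roughly like this minimal Java sketch (not the project's actual code; `TransferService`, `ticketId` and the other names are invented for illustration). A retried request carries the same ticket, so a replay after a failover becomes a harmless no-op instead of a double booking:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: guard a non-idempotent operation with a client-supplied ticket id.
class TransferService {
    // In a real cluster this map would live in a shared store (DB, cache),
    // not in a single JVM; a ConcurrentMap keeps the sketch self-contained.
    private final ConcurrentMap<String, Boolean> processedTickets = new ConcurrentHashMap<>();

    /** Returns true if the transfer was executed, false if this ticket was already processed. */
    boolean sendMoney(String ticketId, long amountCents, String targetAccount) {
        if (processedTickets.putIfAbsent(ticketId, Boolean.TRUE) != null) {
            return false; // duplicate delivery after a retry/failover: do nothing
        }
        // ... execute the actual (non-idempotent) money transfer here ...
        return true;
    }
}
```

The same ticket id must be generated on the client (or load balancer) side and reused for every retry of the same logical request.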
  25. Development guidelines (Datasources)
      ● Don't build your own connection pools; take them from the application server via JNDI lookup
      ● As we are using Oracle RAC, the datasource must be a multipool consisting of single datasources, one per RAC node
        – One can take one of the single datasources out of the multipool (online)
        – Load balancing is guaranteed
        – Reconfiguring the pool online
      ● Example Spring config:
      ● Example without Spring:
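The Spring example on the original slide is an image that did not survive the export. A typical JNDI lookup of the container-managed (multi)datasource in Spring XML might look like this; the JNDI name is an assumption:

```xml
<!-- Look up the WebLogic-managed multipool datasource via JNDI
     instead of building a connection pool inside the application. -->
<bean id="dataSource" class="org.springframework.jndi.JndiObjectFactoryBean">
  <property name="jndiName" value="jdbc/myMultiDataSource"/>
  <!-- resolve the name relative to java:comp/env first -->
  <property name="resourceRef" value="true"/>
</bean>
```

Without Spring, the equivalent is a plain `new InitialContext().lookup("jdbc/myMultiDataSource")` cast to `javax.sql.DataSource`.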
  26. Basic monitoring
      ● Different possibilities for monitoring on Weblogic
      ● Standard admin console
        – Threads (stuck, in use, etc.), JVM (heap size, usage etc.), online thread dumps
        – Connection pool statistics
        – Transaction manager statistics
        – Application statistics (per servlet), WorkManager statistics
      ● Diagnostic console
        – Online monitoring only
        – All attributes exposed by Weblogic MBeans can be monitored
        – Demo: diagnostics console
      ● Diagnostic images
        – On demand, on shutdown, regularly
        – Useful for problem analysis (especially for after-crash analysis)
        – For analysing resource leaks: Demo: analyse a connection leak and a stuck thread
      ● SNMP and diagnostic modules
        – All MBean attributes can be monitored by SNMP
        – Gauge, string, counter monitors, log filters, attribute changes
        – Collected metrics, watches and notifications
  27. Zero downtime deployment
      ● 2 clusters within one datacenter
        ● Managed by the Apache LB
        ● (simple script based on the session ID)
        ● Both are active during normal operations
      ● Before we deploy the new release we switch off cluster 1
        ● Old sessions go to both clusters 1 and 2
        ● New sessions go to cluster 2 only
      ● When all sessions of cluster 1 expire we deploy the new version
        ● Test it
        ● If everything is OK, then we put it back into the Apache load balancer
      ● Now we take cluster 2 off
        ● Until all sessions expire
        ● The same procedure as above
      ● Then we deploy on the second datacenter
  28. Our Concepts
      ● Independent applications or application groups
      ● Independent internet and internal network traffic
      ● Reduce/avoid downtime within a DC
      ● Replicate the data between the DCs and make the application switchable
  29. Our requirements again
      (The same event/application matrix as on slide 10.)
  30. Replicate the data between the DCs
      ● Bidirectional data replication between DCs
      ● Oracle Streams/GoldenGate
      http://docs.oracle.com/cd/E11882_01/server.112/e10705/man_gen_rep.htm#STREP013
  31. Cross-cluster replication: 2 clusters in 2 datacenters
  32. Application groups
      ● One or more applications without hard dependencies to or from other applications
      ● Why application groups?
        ● Switching many applications at once leads to long downtimes and higher risk
        ● Switching a single one is not possible if there are hard dependencies on the database level to other applications
      ● Identify groups of applications that are critically dependent on each other but not on other applications outside the group
        ● Always switch such groups at once
        ● The bigger the group, the longer the downtime
          – A single application in the HA category will be able to switch without any downtime, just delayed requests
      ● A dependency is critical (hard) if it leads to issues (editing the same record on different DCs will definitely be problematic, reading data for reporting is not)
        – Must be identified on a case-by-case basis
  33. Identify application groups
  34. Switch application by application
  35. Example of a switch procedure of an application group
  36. Applications: Limitations
      ● No bulk transactions
      ● No DB sequences
      ● No file-based sequences
      ● No shared file system storage
      ● Use a central batch system
      ● All new releases have to be compatible with the previous release
      ● Stick to the infrastructure
  37. Our Concepts
      ● Independent applications or application groups
      ● Independent internet and internal network traffic
      ● Reduce/avoid downtime within a DC
      ● Replicate the data between the DCs and make the application switchable
  38. Our requirements once again
      (The same event/application matrix as on slide 10.)
  39. Modern Architectures: how do the concepts fit?
  40. Modern Architectures: Application Layer
      ● Web apps
        ● Completely independent of the backend
        ● Using only REST APIs
        ● 90% of the state is locally managed (supported by frameworks like AngularJS and BackboneJS)
        ● Must be compatible with different versions of the REST API (at least 2 versions)
        ● If websockets are used, then it is more tricky, see backend.
      ● New mobile versions managed by app stores
        ● Good to have an upgrade reminder (to limit the supported versions)
      ● REST API must be versioned and backwards compatible
      ● Messaging over message clouds is transparent. HA managed by the vendors
      ● Stateful services
        ● e.g. OAuth v1/v2
          – Normally by DB persistence
  41. Session Replication
      ● Less needed than with server-side applications
        ● Frameworks like AngularJS, BackboneJS, Ember etc. manage their own sessions, routing etc.
      ● but still needed
        ● Weblogic: no change
        ● Tomcat possibly with a JDBC store
        ● Jetty with Terracotta
        ● Node.js: secure (digitally signed) sessions stored in cookies
          – Senchalabs Connect
          – Mozilla/node-client-sessions
          ● https://hacks.mozilla.org/2012/12/using-secure-client-side-sessions-to-build-simple-and-scalable-node-js-applications-a-node-js-holiday-season-part-3/
  42. Backend: Bidirectional Data Replication
      ● Elasticsearch
        ● Currently no cross-cluster replication
        ● But it is on their roadmap
      ● CouchDB
        ● Very flexible replication, whether within one or across multiple datacenters
        ● Bidirectional replication is possible
      ● MongoDB
        ● One-directional replication is possible and mature
        ● Bidirectional is not possible at the moment
        ● A workaround would be: one MongoDB per app and strict separation of the apps
      ● Hadoop HDFS
        ● Currently no cross-cluster replication available
        ● e.g. Facebook wrote their own replication for Hive
        ● Will possibly arrive soon with Apache Falcon http://falcon.incubator.apache.org/
  43. Questions? Thank you for your attention!
  44. Some pictures in this presentation were purchased from iStockphoto LP. The price paid applies for the use of the pictures within the scope of a standard license, which includes, among other things, online publications including websites up to a maximum image size of 800 x 600 pixels (video: 640 x 480 pixels). Some icons from https://www.iconfinder.com/ are used under the Creative Commons public domain license from the following authors: Artbees, Neurovit and Pixel Mixer (http://pixel-mixer.com). All other trademarks mentioned herein are the property of their respective owners.
  45. Backup slides
  46. Big picture example architecture
  47. Key features
      ● 2 datacenters
        ● Both active (but probably running different applications)
        ● Independent uplinks
        ● Redundant interconnect
      ● Applications are deployed and running on both
      ● Application cluster in every datacenter
        ● Session replication within every datacenter
        ● Cross replication between the 2 datacenters
        ● e.g. with a Weblogic cluster
      ● Bidirectional database replication
        ● e.g. 2 independent Oracle RACs, one in each datacenter
        ● Replication over Streams/GoldenGate
      ● Monitoring of all critical resources
        ● Hardware nodes
        ● Connection pools
        ● JVM heaps
      ● Application switch
  48. Concepts: other network components
      ● Firewalls
        ● First-level firewalls
          – Cisco routers
          – Stateless firewalls
          – Not very restrictive
        ● Second-level firewalls (in front of the application load balancers)
          – Should be stateful – based on Linux/iptables with conntrackd (for failover)
          – Stateful, connection tracking
          – Very restrictive
          – Rate limiting of new connections (DoS or slashdot effect)
      ● All firewalls will be/are in active/hot-standby mode.
      ● On a controlled failover (both are running and we switch them) not a single TCP connection should be affected (except small delays)
      ● In a disaster case it takes some seconds until the cluster software detects the crash of the node and initiates the failover. No TCP connections should be lost, but there is a very small risk
  49. Example of a switch procedure of an application group
      ● Preparation steps
        ● Check the health of the replication processes.
        ● Stop all batch applications (by stopping the job scheduling system). If the time pressure for the switch is high, just kill all running jobs (they should be restartable anyway, also currently).
        ● Switch off the keepalive feature on all httpd servers
      ● Switching steps
        ● Change the firewall rules on the second-layer firewalls, so that any new connection request (SYN flag set) is dropped.
        ● Wait until the data is synchronized on both sides (e.g. by monitoring a heartbeat table) and no more httpd processes are active.
        ● Switch the application traffic to the other DC (by changing the routing of their IP addresses).
        ● Clean up (remove the dropping of SYN packets on the “old” site etc.)
      ● This procedure is done per application group until all applications are running
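The "drop new connections" step could, for instance, be implemented with an iptables rule like the following. The chain and exact match are assumptions; the slides do not show the actual rule set:

```
# Drop only new connection attempts (SYN set, ACK clear) so that
# established sessions can drain and replication can catch up.
iptables -I FORWARD -p tcp --syn -j DROP

# Clean-up step after the switch: remove the same rule again.
iptables -D FORWARD -p tcp --syn -j DROP
```

Because `--syn` only matches connection-establishment packets, existing TCP sessions keep flowing until they finish, which is exactly what the drain phase of the switch procedure needs.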
  50. Application clusters (Weblogic)
      ● Features of Weblogic that we use
        ● mod_wl
          – Manages the stickiness and failover to backup nodes
          – Automatic retry of failed requests
            ● On time-outs
            ● On response header 50x
        ● Multipools
          – Gracefully remove a database node out of the pool
          – Gracefully change parameters of connection pools
          – Guaranteed balance of connections between database nodes
        ● Binding execution threads to connection pools
        ● Auto shutdown (+ restart) of nodes on OutOfMemoryError
        ● Session replication (also over both DCs)
        ● Thread monitoring (detect dead or long-running threads etc.)
        ● Diagnostic images and alarms
  51. Apache plugin failover
      Source: e-docs.bea.com
  52. Deployment of connection pools
      ● One datasource per Oracle RAC node
        ● Set the initial capacity to a value that will be sufficient for the usual load of the application
          – Creation of new connections is expensive
        ● Set the max capacity to a value that will be sufficient in a high-load scenario
          – The overall number of connections should match the connection limit on the database side
        ● Set JDBC parameters in the connection pool and not globally (e.g. v8compatibility=true)
        ● Check connections on reserve
        ● You can set DB session parameters in the init SQL property (e.g. alter session set NLS_SORT='GERMAN')
        ● Enable 2-phase commit only if you need it (expensive)
        ● Prepared statement caching does not bring much performance (at least for Oracle databases) but costs open cursors in the database (per connection!), so don't use it unless you have a very good reason to.
      ● One multipool containing all single datasources for one database
        ● Strategy: load balancing
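In WebLogic these pool settings live in the JDBC module descriptor of each datasource. A sketch of the relevant fragment, with illustrative values rather than the project's real numbers:

```xml
<!-- Fragment of a *-jdbc.xml module descriptor for one RAC-node datasource -->
<jdbc-connection-pool-params>
  <!-- cover the usual load up front: creating connections is expensive -->
  <initial-capacity>20</initial-capacity>
  <!-- bounded by the connection limit on the database side -->
  <max-capacity>50</max-capacity>
  <!-- validate each connection before handing it to the application -->
  <test-connections-on-reserve>true</test-connections-on-reserve>
  <!-- per-session DB settings, as suggested on the slide -->
  <init-sql>SQL alter session set NLS_SORT='GERMAN'</init-sql>
</jdbc-connection-pool-params>
```

One such descriptor exists per RAC node; the multipool then references all of them with the load-balancing strategy.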

Editor's notes

  • reduce downtime to 0
    keep the costs low
    use linux
    use x64 hw
    SW Licenses as low as possible
    Minimize changes of applications
