The document discusses Linux high availability (HA). It begins with a definition of HA and describes how, during a failure, one service takes over for another transparently to end users. It discusses HA tools such as Pacemaker, the "brain" that monitors services and resources and decides whether to restart or move services when failures occur, and Corosync, the messaging layer beneath it. It provides examples of configuring and working with HA clusters using tools such as pcs (the Pacemaker/Corosync configuration tool) and promotes automation, monitoring, and realistic expectations about HA.
Linux HA anno 2014: Understanding High Availability
1. Linux HA anno 2014
Julien Pivotto
LOADays, Antwerp
April 4th, 2014
2. whoami
• sysadmin @ inuits
• open-source defender for 7+ years
• devops believer
• @roidelapluie on twitter/github
Julien Pivotto Linux HA
4. What is HA
• High Availability
• One service fails → another takes over its job
• Transparent for the end user
5. Where HA will NOT help
• It is not about scalability
• It will not fix your application
• It will not make your application stable
• It is not a one-size-fits-all solution
• It is not about performance
• It is not a backup
6. Why care about HA?
• Service goes down at 5pm on Friday?
• Downtime makes users unhappy
• Downtime costs money
7. What will not work
• Virtualization will not make your app HA
• VM mirroring is not HA
• Live migrations are not HA
• Containers are not HA
• Cloud lol
8. HA is about services
9. Start on a good basis
• Automation
• Monitoring
• CI / CD
• Testing
• ... Then, start working on HA
10. Eliminate the SPOF
• Single Point of Failure
• Hardware fails
• Disks always fail
• etc.
• Replicate...
11. Split-Brain
• Nodes can't talk to each other
• They think they are alone
• They each make decisions and take leadership
• Data inconsistency
12. Fencing
• Shoot the other node in the head
• Be sure a node is dead
• Preserve integrity of the data
• Combine with quorum
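The quorum idea that fencing is combined with can be sketched in a few lines. This is a minimal illustration that assumes one vote per node and a strict-majority rule; the function name `has_quorum` is hypothetical and not part of any cluster tool.

```python
def has_quorum(votes_seen: int, total_votes: int) -> bool:
    """A partition has quorum only if it holds a strict majority of all votes."""
    if total_votes <= 0 or not 0 <= votes_seen <= total_votes:
        raise ValueError("invalid vote counts")
    return votes_seen * 2 > total_votes


# Three-node cluster split 2/1: only the two-node side may keep running services.
print(has_quorum(2, 3), has_quorum(1, 3))

# Two-node cluster split 1/1: neither side holds a majority.
print(has_quorum(1, 2))
```

In the two-node split neither half wins the vote, which is one reason quorum alone is not enough and the surviving side still has to fence the other before taking over.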
13. Monitoring
• Checking that a PID is running is useless
• Result-based monitoring
• Extract data out of it
• E.g. try to insert into the DB
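A result-based check of the "try to insert into the DB" kind might look like the sketch below. It uses Python's built-in `sqlite3` purely as a stand-in for the real database; the table name `ha_probe` and the function name `check_db_by_insert` are assumptions made for the example.

```python
import sqlite3
import time


def check_db_by_insert(conn: sqlite3.Connection) -> bool:
    """Return True only if a write followed by a read-back succeeds."""
    marker = str(time.time_ns())  # unique value so stale rows cannot pass the check
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS ha_probe (marker TEXT)")
        conn.execute("DELETE FROM ha_probe")
        conn.execute("INSERT INTO ha_probe (marker) VALUES (?)", (marker,))
        row = conn.execute("SELECT marker FROM ha_probe").fetchone()
        conn.commit()
    except sqlite3.Error:
        # Any database error means the service is not delivering results.
        return False
    return row is not None and row[0] == marker


conn = sqlite3.connect(":memory:")
print("database healthy:", check_db_by_insert(conn))
conn.close()
```

Unlike a PID check, this fails when the process is alive but cannot actually serve writes.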
14. Cluster?
• Active/active: everything is active
• Active/passive: nodes in standby
• N+1: one node waiting in standby
• N+M: M nodes waiting in standby
• Can mix them, etc.
21. The State
• Stateless application
• Everything in DB
• Avoid temp files
• Disaster recovery
22. The right tools
• Make relevant choices for your app
• Look for HA in databases
• Look for HA in queuing systems
• Look for HA in filesystems?
• Master/master vs master/slave
23. The configuration
• Same config everywhere
• Use puppet, chef, ...
• Config in one place
• KISS
25. Pacemaker
• It is the brain
• Decides what to do, when
• Gets information from resources
• Depends on messaging and cluster manager
• Does not require shared storage
26. Decisions
• A node fails, now what?
• A service fails, now what?
• Restart? Move?
• Needs to be quick and without intervention
• Scores, policies
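The "Restart? Move?" question can be illustrated with a toy policy. The sketch below is loosely modelled on Pacemaker's `migration-threshold` resource option and is a deliberate simplification: the real policy engine weighs many more inputs, and the function name `decide` is hypothetical.

```python
def decide(fail_count: int, migration_threshold: int) -> str:
    """Restart in place until the resource has failed too often on this node."""
    if fail_count < 0 or migration_threshold < 1:
        raise ValueError("fail_count must be >= 0 and migration_threshold >= 1")
    # Below the threshold the node is still trusted; at or above it, move away.
    return "restart" if fail_count < migration_threshold else "move"


for failures in range(1, 4):
    print(failures, "failure(s):", decide(failures, migration_threshold=3))
```

The point is that the decision is taken from recorded state and a policy, without waiting for an operator.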
27. CIB
• Cluster Information Base
• XML shared across the cluster
• Updated using "pcs"
• Contains knowledge about the cluster
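Because the CIB is plain XML, it can be inspected with any XML library. The fragment below is a hand-written, heavily trimmed illustration of how a primitive is stored; the `id` values are invented for the example and a real CIB contains far more sections. The helper `primitive_params` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; a real CIB also carries nodes, constraints and status.
CIB_FRAGMENT = """
<cib>
  <configuration>
    <resources>
      <primitive id="ClusterIP" class="ocf" provider="heartbeat" type="IPaddr2">
        <instance_attributes id="ClusterIP-params">
          <nvpair id="ClusterIP-ip" name="ip" value="192.168.122.101"/>
          <nvpair id="ClusterIP-mask" name="cidr_netmask" value="32"/>
        </instance_attributes>
      </primitive>
    </resources>
  </configuration>
</cib>
"""


def primitive_params(cib_xml: str) -> dict:
    """Map each primitive id to its name/value parameters."""
    root = ET.fromstring(cib_xml.strip())
    return {
        prim.get("id"): {nv.get("name"): nv.get("value") for nv in prim.iter("nvpair")}
        for prim in root.iter("primitive")
    }


print(primitive_params(CIB_FRAGMENT))
```

On a live cluster the XML would come from the cluster itself (for example via `cibadmin --query`) rather than from a string literal.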
28. Primitives
• Service, IP address, mountpoint, ...
• Basic building blocks of a cluster
• Take a lot of parameters
primitive ClusterIP ocf:heartbeat:IPaddr2 \
    params ip="192.168.122.101" cidr_netmask="32" \
    op monitor interval="30s"
29. Resource Agent
• Script
• How to start
• How to stop
• How to change state (promote, demote)
• How to monitor (real monitoring)
• An init script, but way better
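The shape of a resource agent can be sketched as a dispatcher that maps actions to handlers and reports a numeric result. Real agents are usually shell scripts; the class below is a toy in-memory stand-in (`ToyAgent` is a hypothetical name) that uses the OCF exit codes 0 (success), 3 (unimplemented action) and 7 (not running).

```python
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


class ToyAgent:
    """In-memory stand-in for a service; a real agent would manage a daemon."""

    def __init__(self) -> None:
        self.running = False

    def start(self) -> int:
        self.running = True
        return OCF_SUCCESS

    def stop(self) -> int:
        # Stopping an already stopped resource still counts as success.
        self.running = False
        return OCF_SUCCESS

    def monitor(self) -> int:
        return OCF_SUCCESS if self.running else OCF_NOT_RUNNING

    def dispatch(self, action: str) -> int:
        handlers = {"start": self.start, "stop": self.stop, "monitor": self.monitor}
        handler = handlers.get(action)
        return handler() if handler else OCF_ERR_UNIMPLEMENTED


agent = ToyAgent()
for action in ("monitor", "start", "monitor", "promote"):
    print(action, "->", agent.dispatch(action))
```

A production `monitor` would perform a result-based check of the service instead of reading a flag.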
30. Clones
• Same resource running on multiple hosts
• Define minimum and maximum of running primitives
• Possible to run multiple on the same node
clone WebIP ClusterIP \
    meta globally-unique="true" clone-max="2" clone-node-max="2"
31. Master Slave (ms)
• Set of primitives with roles
• Masters and slaves (e.g. mysql, ldap)
• Can promote slaves to master
• Can demote masters to slave
• Multiple slaves / masters
ms WebDataClone WebData \
    meta master-max="2" master-node-max="1" \
    clone-max="2" clone-node-max="1"
32. Group
• Group of primitives of different kinds
• Implies colocation
• Starts in a fixed order
• Stops in the opposite order
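The ordering rule for groups is simple enough to show directly. In this sketch the group is just a list, and the member names are borrowed from the deck's other examples; putting them together in one group is an assumption made for illustration.

```python
from typing import List


def start_sequence(group: List[str]) -> List[str]:
    """Members start in the order in which the group lists them."""
    return list(group)


def stop_sequence(group: List[str]) -> List[str]:
    """Members stop in the reverse of the start order."""
    return list(reversed(group))


web_group = ["ClusterIP", "WebFS", "WebSite"]
print("start:", start_sequence(web_group))
print("stop: ", stop_sequence(web_group))
```

Stopping in reverse means a dependent service goes down before the things it relies on, such as its filesystem or IP address.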
33. Colocation
• Constraint
• Must run on the same host
• Has a score
• Order matters
• e.g. VIP with service
colocation website-with-ip inf: WebSite ClusterIP
34. Location
• Set preferred location
• Has a score
location prefer-apache-1 WebSite 50: apache-1
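How scores turn into a placement can be sketched as follows. This is a simplified model, not Pacemaker's actual algorithm: it assumes scores per node are summed, that INFINITY is 1,000,000, that -INFINITY always wins, and that a node with a negative total cannot host the resource. The names `add_scores` and `preferred_node` are hypothetical.

```python
from functools import reduce
from typing import Dict, List, Optional

INFINITY = 1_000_000  # assumed value at which a score is treated as infinite


def add_scores(a: int, b: int) -> int:
    """Add two scores; -INFINITY beats everything, +INFINITY beats finite values."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


def preferred_node(scores: Dict[str, List[int]]) -> Optional[str]:
    """Pick the node with the highest non-negative total, or None if all are banned."""
    totals = {node: reduce(add_scores, values, 0) for node, values in scores.items()}
    eligible = [node for node in sorted(totals) if totals[node] >= 0]
    if not eligible:
        return None
    return max(eligible, key=lambda node: totals[node])


# Mirrors "location prefer-apache-1 WebSite 50: apache-1"; apache-2 is an assumed second node.
print(preferred_node({"apache-1": [50], "apache-2": [0]}))
```

A score of 50 is only a preference: adding a -INFINITY entry for apache-1 would push the resource to apache-2 instead.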
35. Order
• What starts after what
• Even across nodes
• Has a score
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
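Order constraints amount to a dependency graph over actions, so a valid sequence can be derived with a topological sort. The sketch below uses Python's standard `graphlib` (Python 3.9+); the second constraint, starting WebSite after WebFS, is an assumption added to make the chain longer, and `action_order` is a hypothetical helper.

```python
from graphlib import CycleError, TopologicalSorter
from typing import List, Tuple


def action_order(constraints: List[Tuple[str, str]]) -> List[str]:
    """Turn (first, then) pairs into one sequence that respects every pair."""
    sorter = TopologicalSorter()
    for first, then in constraints:
        sorter.add(then, first)  # 'then' may only run after 'first'
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        raise ValueError(f"ordering constraints form a loop: {exc.args[1]}") from exc


constraints = [
    ("WebDataClone:promote", "WebFS:start"),
    ("WebFS:start", "WebSite:start"),  # assumed extra constraint for illustration
]
print(action_order(constraints))
```

Contradictory constraints show up as a cycle, which is worth catching before they reach a cluster.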
37. Maintenance
• Manually move resources
• Set a DO-NOT-MANAGE flag
• Do not forget to revert
39. CMAN
• Manages membership and quorum
• Notifies pacemaker when something changes
• Starts and manages corosync
• Needs a cluster.conf that contains all the nodes
• Managed via ccs
• Will propagate the changes
40. Corosync
• Messaging layer
• Controlled via CMAN
• Next version will take over from CMAN
42. Distributions
• Developed mainly by Red Hat and SUSE
• Used with OpenStack too
• Converging on a single stack
• Available in * distros
43. crmsh vs pcs
• crmsh was more widely used
• Disappeared in CentOS 6.4
• Getting used to pcs
• One goal: modify the CIB
• pcs is young/not widely used
45. Create a resource
pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
    ip=192.168.0.120 cidr_netmask=32 op monitor interval=30s
46. Create constraints
pcs constraint colocation add WebFS WebDataClone \
    INFINITY with-rsc-role=Master
pcs constraint order promote WebDataClone then start WebFS
48. Check the status of the cluster
pcs status
Last updated: Fri Sep 14 12:41:12 2012
Last change: Fri Sep 14 12:41:08 2012 via crm_attribute on pcmk-1
Stack: corosync
Current DC: pcmk-1 (1) - partition with quorum
Version: 1.1.8-1.el7-60a19ed12fdb4d5c6a6b6767f52e5391e447fec0
2 Nodes configured, unknown expected votes
5 Resources configured.
Node pcmk-1 (1): standby
Online: [ pcmk-2 ]
Full list of resources:
ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2
WebSite (ocf::heartbeat:apache): Started pcmk-2
Master/Slave Set: WebDataClone [WebData]
Masters: [ pcmk-2 ]
Stopped: [ WebData:1 ]
WebFS (ocf::heartbeat:Filesystem): Started pcmk-2
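For quick automation, the resource lines of a status report like the one above can be picked out with a regular expression. This is a rough sketch that assumes the exact text layout shown on this slide; the layout changes between Pacemaker versions, so the pattern and the helper `parse_resources` are illustrative only.

```python
import re
from typing import Dict, Optional, Tuple

# Matches lines such as "ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2".
RESOURCE_LINE = re.compile(
    r"^\s*(?P<name>\S+)\s+\((?P<agent>[^)]+)\):\s+(?P<state>\S+)(?:\s+(?P<node>\S+))?\s*$"
)

SAMPLE = """\
Node pcmk-1 (1): standby
Online: [ pcmk-2 ]
ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2
WebSite (ocf::heartbeat:apache): Started pcmk-2
Master/Slave Set: WebDataClone [WebData]
Masters: [ pcmk-2 ]
WebFS (ocf::heartbeat:Filesystem): Started pcmk-2
"""


def parse_resources(status_text: str) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map resource name to (state, node); the node is None when none is listed."""
    found = {}
    for line in status_text.splitlines():
        match = RESOURCE_LINE.match(line)
        if match:
            found[match.group("name")] = (match.group("state"), match.group("node"))
    return found


print(parse_resources(SAMPLE))
```

Node, quorum and master/slave lines are skipped on purpose; anything beyond a quick check is better served by a machine-readable status source than by scraping text.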
50. Percona Replication Manager
• MySQL replication with pacemaker
• Complete, good documentation
• Resource agents
• Supports multi-slave setups
51. MySQL Resource Agent
• Keeps track of a score for each slave
• In case of failure, will switch to the best-scored slave
• Can be reused in your cluster
• https://github.com/percona/percona-pacemaker-agents
60. Be clever
• KISS
• Automate
• Monitor
• Be realistic
61. Do not promise the impossible
• HA will not fix your app (WONTFIX)
• Working together (devops)
• Not about scale
• Not about stability
• Do not talk in nines
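To see what "talking in nines" actually commits to, the arithmetic can be spelled out. The sketch assumes a 365-day year and counts all downtime equally, planned or not; `yearly_downtime_hours` is a hypothetical helper.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def yearly_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service may be down while still meeting the target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR


for target in (99.0, 99.9, 99.99):
    print(f"{target}% allows {yearly_downtime_hours(target):.2f} hours of downtime per year")
```

Three nines leaves roughly 8.76 hours a year and four nines under an hour, which a single bad failover can consume.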
62. Linux HA
• Reliable
• Pacemaker, Corosync, CMAN
• pcs, crmsh, ccs
• A lot of reading
• A lot of experience to build
63. RTFM
• http://clusterlabs.org
• Clusters From Scratch
• Pacemaker Explained
• http://blog.clusterlabs.org
• Old: http://linux-ha.org