Overview
PostgreSQL 9.0
Clustering
Backups
Monitoring
Automation
The end
The mission
Before the migration
Table of contents
1 Overview
    The mission
    Before the migration
2 PostgreSQL 9.0
    Intro
    Streaming replication
    Master configuration
    Slave configuration
    PostgreSQL specific tricks
    Setting up a slave
3 Clustering
    Set up of corosync
    OCF resource
4 Backups
    Cron jobs
    BackupPC
5 Monitoring
    Nagios
    Munin
6 Automation
    Puppet module
    The node file
    #TODO
7 The end
Julien Pivotto PostgreSQL 9.0 HA
C.E.R.I.S.E
• A web application
    • Plone (Python)
    • 15k+ visits, 500k+ pages and 2,000,000+ hits each month
    • Developed by Affinitic
• Several databases
    • PostgreSQL 9.0
    • Oracle database
• Several servers/services
    • Two reverse proxies in failover HA
    • Two application servers in load-balancing HA
    • Two PostgreSQL servers in failover HA
    • An Oracle database server
    • A development server
    • A Pentaho server
• Being integrated in Jenkins (to be continued...)
PostgreSQL before the migration
• PostgreSQL 8.3.7
• No native support for HA
• High availability with Heartbeat 2 and DRBD
• Installed on the application servers
• Nothing automated
• Failover only: the passive node is not even available read-only
• Installed in November 2008
Monitoring before the installation
• Icinga
    • DRBD check
    • Simple connection check to PostgreSQL
• Graphing with Cacti
    • Size of the databases
    • Connections to the database
    • Checkpoints
Backups before the installation
• Backups were taken every hour on the same machine
• External backups once a day, on disk and on tape
• Backups are made with the pg_dump command
• BackupPC fetches those files
PostgreSQL 9.0
• PostgreSQL 9.0 was released in September 2010
• It brings native replication to PostgreSQL
• There is no native failover tool
• So we need to use PostgreSQL + Corosync
• The setup of PostgreSQL is managed by Puppet
Write-Ahead Logging
• Every change to a data file must first be written to a log file
• Fewer disk writes: only the log file needs to be flushed to disk to
guarantee that a transaction is committed, rather than every
data file changed by the transaction
What is streaming replication
• Streaming replication provides the capability to ship WAL (XLOG)
records to standby servers and apply them there
• It's possible to have multiple standby servers
• Standby servers can be read-only ("Hot standby")
Disadvantages/Specifications of streaming replication
• Streaming replication supports only asynchronous log shipping
    • But when the database is in use, the delay is close to that of
    synchronous log shipping
• Adding a standby server requires manual action
    • But in our case we will only have one standby server
• PostgreSQL does not provide an HA feature
    • But Corosync does
• Replication is single-threaded
Master configuration
The master only needs one configuration file.
Configuration not related to SR (streaming replication)
#Postgresql configuration
#http://www.postgresql.org/docs/9.0/interactive/index.html
listen_addresses = '*'
max_connections = 200
shared_buffers = 4096MB
work_mem = 4096MB
effective_cache_size = 10024MB
commit_delay = 100000
# NOTE: effective_cache_size is set a second time here; PostgreSQL uses the last value it reads
effective_cache_size = 2560
log_destination = 'stderr'
log_directory = 'pg_log'
logging_collector = on
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_min_messages = notice
log_min_duration_statement = 1000
log_line_prefix = '%t %u '
log_statement = 'none'
datestyle = 'iso, dmy'
Master configuration
• wal_level = hot_standby
    Lets the standby server answer read-only queries
• max_wal_senders = 2
    We allow up to 2 standby nodes
• wal_keep_segments = 128
    The minimum number of WAL segments to keep on the master
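To get a feel for what wal_keep_segments = 128 costs in disk space, here is a small calculation. It assumes the default WAL segment size of 16 MB; a server built with another segment size would give a different figure.

```shell
#!/bin/bash
# Disk space retained by wal_keep_segments.
# Assumption: default WAL segment size of 16 MB.
wal_keep_segments=128
segment_mb=16
retained_mb=$((wal_keep_segments * segment_mb))
echo "${retained_mb} MB"   # prints "2048 MB"
```

So the master keeps roughly 2 GB of WAL around, which is also how far a standby can fall behind before it needs a fresh base backup.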
Slave configuration
• The slave requires at least two configuration files
    • A postgresql.conf file
    • A recovery.conf file, used to apply the WAL (XLOG) records shipped by
    the master
• A trigger file to stop replication can be specified
postgresql.conf - Configuration related to SR
wal_level = hot_standby
hot_standby = on
Note that this file also starts with the same first part as the
master configuration file.
Slave configuration
recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=192.168.177.2 user=replicuser'
• standby_mode means that this is a standby server
• primary_conninfo describes the connection to the master
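The trigger file mentioned earlier is declared in the same recovery.conf. The sketch below writes such a file into a scratch directory. The trigger path /tmp/pgsql.trigger is a made-up example, not the one used on these servers; the host and user are the ones from the slide.

```shell
#!/bin/bash
# Write a recovery.conf for a 9.0 standby into a scratch directory.
# Assumption: the trigger path is hypothetical; pick one that suits your layout.
conf_dir=$(mktemp -d)
cat > "$conf_dir/recovery.conf" <<'EOF'
standby_mode = 'on'
primary_conninfo = 'host=192.168.177.2 user=replicuser'
# Creating this file makes the standby leave recovery and accept writes.
trigger_file = '/tmp/pgsql.trigger'
EOF
cat "$conf_dir/recovery.conf"
```

With that setting, running touch on the trigger path on the standby ends replication and promotes the standby.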
Replication user
• A superuser for replication, called replicuser, has to be created
• The SQL command to create it is
CREATE USER replicuser SUPERUSER LOGIN CONNECTION LIMIT 1 ENCRYPTED PASSWORD 'foobar';
pg_hba.conf
• pg_hba.conf is the file that holds the access control rules for
PostgreSQL connections
• In that file we add both nodes as "trusted", and the
replication user as trusted too
pg_hba.conf
hostnossl all all 10.0.10.8/32 trust
hostnossl all all 10.0.10.9/32 trust
hostnossl replication replicuser 192.168.177.2/24 trust
Setting up a slave
• You have to run a few commands on the master each time
you add a new standby server
Adding a standby server
psql -c "SELECT pg_start_backup('label', true)"
rsync -a ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid --exclude '*-master' \
    --exclude '*-slave'
psql -c "SELECT pg_stop_backup()"
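One thing the three commands above do not cover is a failing rsync, which would leave the master in backup mode. A minimal sketch of a wrapper that always calls pg_stop_backup() follows. The function name sync_standby and the PGDATA fallback /var/lib/pgsql/data are assumptions made for this example; the destination is the one from the slide.

```shell
#!/bin/bash
# Sketch: copy a base backup to a standby and always leave backup mode.
# Assumptions: sync_standby is a hypothetical helper name, and PGDATA falls
# back to /var/lib/pgsql/data when it is not set.
sync_standby() {
    local pgdata="${PGDATA:-/var/lib/pgsql/data}"
    local dest="${1:-standby:/srv/pgsql/standby/}"
    local status=0

    psql -c "SELECT pg_start_backup('label', true)" || return 1
    rsync -a "${pgdata}/" "$dest" \
        --exclude postmaster.pid --exclude '*-master' --exclude '*-slave' || status=$?
    # Leave backup mode even when rsync failed.
    psql -c "SELECT pg_stop_backup()" || status=$?
    return "$status"
}
```

It would be called on the master as sync_standby, or with another rsync destination as its first argument. A non-zero return code means the copy should not be trusted.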
Corosync configuration
• The goal of corosync is to switch the master and slave roles
when needed
• It ensures that a master is online and connected to the
router
• The two servers are connected to each other on eth1
• Corosync is installed by Puppet
• We take it from the Clusterlabs repositories
• We use a custom master/slave OCF resource to manage
the PostgreSQL master/slave pair
crm.conf
The main configuration file of corosync is
/etc/corosync/crm.conf. It contains all the
resources, nodes, etc.
Defining the nodes
node babar.interne.arsia.be \
    attributes standby="off"
node dumbo.interne.arsia.be \
    attributes standby="off"
This code defines the two nodes; standby="off" tells the cluster that
both are active and may run resources as soon as it starts.
crm.conf
Defining the primitives
primitive pgsql ocf:inuits:pgsql-ms
primitive virt_ip ocf:heartbeat:IPaddr2 \
    params nic="eth0" iflabel="0" ip="10.0.10.10" cidr_netmask="24" broadcast="10.0.10.255" \
    meta target-role="Started" is-managed="true"
primitive ping ocf:pacemaker:ping \
    params host_list="10.0.10.1" \
    op monitor interval="10s" timeout="10s" \
    op start interval="0" timeout="45s" \
    op stop interval="0" timeout="50s"
• We define 3 primitives:
    • pgsql, the PostgreSQL primitive
    • virt_ip, the floating IP address
    • ping, the primitive that checks that the servers are
    connected to the router
crm.conf
Configuring the primitives
ms pgsql-ms pgsql \
    params pgsqlconfig="/var/lib/pgsql/data/postgresql.conf" \
    lsb_script="/etc/init.d/postgresql-9.0" \
    pgsqlrecovery="/var/lib/pgsql/data/recovery.conf" \
    meta clone-max="2" clone-node-max="1" master-max="1" master-node-max="1" notify="false"
clone clone-ping ping \
    meta globally-unique="false"
• We configure the PostgreSQL M/S: the init script, the
configuration files...
• We also configure the ping resource as a clone (it will be
launched on both servers)
crm.conf
Defining the constraints and cluster properties
group PSQL virt_ip
location connected PSQL \
    rule $id="connected-rule" -inf: not_defined pingd or pingd lte 0
colocation ip_psql inf: PSQL pgsql-ms:Master
property $id="cib-bootstrap-options" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore" \
    default-resource-stickiness="INFINITY"
rsc_defaults $id="rsc_defaults-options" \
    migration-threshold="INFINITY" \
    failure-timeout="10" \
    resource-stickiness="INFINITY"
• These lines ensure that the master is always on the same
node as the floating IP address
• And also that the master is connected to the router
OCF resource
• There is a custom OCF resource to manage the master/slave
PostgreSQL
• It is based on an example resource written by Andrew
Beekhof from Clusterlabs
• The file has to be in
/usr/lib/ocf/resource.d/inuits/pgsql-ms
OCF resource
• The script does the following:
    • It moves postgresql.conf-master to
    postgresql.conf when a node is promoted to master
    • It moves postgresql.conf-slave to postgresql.conf
    when a node is demoted to slave
    • It ensures that recovery.conf-slave is in place as recovery.conf
    on the slave and that recovery.conf is absent on the master
    • It starts/restarts PostgreSQL when needed
• I will post that file on GitHub soon
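Until the real script is published, the file handling described above can be sketched as follows. This is not the author's OCF resource: switch_role is a hypothetical helper, and it copies the template files instead of moving them so that the -master and -slave templates stay available for the next role change.

```shell
#!/bin/bash
# Sketch of the config switch done on promote/demote.
# Assumptions: switch_role is a hypothetical name; templates are copied, not moved.
switch_role() {  # usage: switch_role <master|slave> <pgdata>
    local role="$1" pgdata="$2"
    case "$role" in
        master|slave) ;;
        *) echo "unknown role: $role" >&2; return 2 ;;
    esac
    cp -p "$pgdata/postgresql.conf-$role" "$pgdata/postgresql.conf" || return 1
    if [ "$role" = "slave" ]; then
        # A standby needs recovery.conf to start in standby mode.
        cp -p "$pgdata/recovery.conf-slave" "$pgdata/recovery.conf"
    else
        # A master must not find a recovery.conf at startup.
        rm -f "$pgdata/recovery.conf"
    fi
}
```

A real resource agent would call something like this from its promote and demote actions and then restart PostgreSQL.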
Backups of the databases
• Sometimes, you need backups (especially when you don't have
backups...)
• We take a backup every hour on each node (one node at minute 0,
the other at minute 30)
• We take a backup every day on each node
• We take one more backup on each node just before the BackupPC
run
• We keep 24 hourly backups and 7 daily backups on disk
• With BackupPC we keep months of backups
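Hourly backup script
/usr/local/bin/backup_hourly.sh
#!/bin/bash
# Fail the pipeline when pg_dump fails, not only when gzip fails
set -o pipefail
# Hour of the day (00-23)
DATE=$(date +%H)
BACKUP_PATH=/var/lib/backups/hourly
for db in foobar_db foobar2_db
do
    # Repoint the "current" symlink only when the dump succeeded
    /usr/bin/pg_dump "$db" | gzip > "$BACKUP_PATH/${db}_$DATE.pgsql.gz" &&
        ln -fs "$BACKUP_PATH/${db}_$DATE.pgsql.gz" "$BACKUP_PATH/${db}_current.pgsql.gz"
done
The daily script is almost the same.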
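The retention comes from the file name alone: the dump is named after a time slot, so a new dump overwrites the one from the previous cycle. The snippet below shows the idea with a hypothetical helper, slot_name. Using %u (day of the week) for the daily slot is an assumption borrowed from the BackupPC script, since the daily script itself is not shown.

```shell
#!/bin/bash
# Show how the slot in the file name bounds the number of dumps kept.
# Assumption: slot_name is a hypothetical helper, not part of the original scripts.
slot_name() {  # usage: slot_name <db> <date-format>
    echo "${1}_$(date +"$2").pgsql.gz"
}

slot_name foobar_db %H   # hour 00..23: at most 24 files per database
slot_name foobar_db %u   # weekday 1..7: at most 7 files per database
```

No cleanup job is needed with this scheme, at the price of never keeping more than one cycle on disk.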
BackupPC script
/usr/local/bin/backup_backuppc.sh
#!/bin/bash
# Fail the pipeline when pg_dump fails, not only when gzip fails
set -o pipefail
# Day of the week (1-7)
DATE=$(date +%u)
BACKUP_PATH=/var/lib/backups/backuppc
for db in cerise trackitquality trackit zodb_cerise
do
    # Repoint the "current" symlink only when the dump succeeded
    /usr/bin/pg_dump -U postgres "$db" | gzip > "$BACKUP_PATH/${db}_$DATE.pgsql.gz" &&
        ln -fs "$BACKUP_PATH/${db}_$DATE.pgsql.gz" "$BACKUP_PATH/${db}_current.pgsql.gz"
done
In the BackupPC config, I added the following:
BackupPC config
$Conf{DumpPreUserCmd} = '$sshPath -t -q -x -l backuppc $host /usr/local/bin/backup_backuppc.sh';
Check hot_standby latency
• The check_postgres.pl script has a check for hot_standby
delay
• But it has to be told which node is the master and which is the
slave, and we do not know that in advance
• So, here is a bash script I wrote to find the M/S order
Master/slave replication check
#!/bin/bash
/usr/lib64/nagios/plugins/check_postgres.pl --db="$1" \
    --action hot_standby_delay -w 300 -c 600 --host="$(
        crm_resource --resource pgsql-ms --locate |
        awk '/Master/ {master=$6} / $/ {slave=$6} END {print master","slave}'
    )"
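The awk part can be tried without a cluster. The sample lines below are an assumption about what crm_resource --locate prints (host name in the sixth field, "Master" after it on the master, a trailing space on the slave); they are not captured output, so check them against your Pacemaker version. locate_ms is a hypothetical wrapper name.

```shell
#!/bin/bash
# Turn "crm_resource --locate" style output into "master,slave".
# Assumption: the sample lines mimic the expected format, they are not real output.
locate_ms() {
    awk '/Master/ {master=$6} / $/ {slave=$6} END {print master","slave}'
}

printf 'resource pgsql-ms is running on: %s Master\nresource pgsql-ms is running on: %s \n' \
    babar.interne.arsia.be dumbo.interne.arsia.be | locate_ms
```

With this sample the function prints babar.interne.arsia.be,dumbo.interne.arsia.be, which is the master-first, comma-separated host list the check script passes to --host.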
Puppet module
• The puppet postgres module is forked from Kris Buytaert's
GitHub page
• It is modified to remove all references to services, because we
want corosync to manage them
• It creates the users, the superusers and the databases
• It is a parameterized class, with a "cluster" parameter, so we
can also install a standalone PostgreSQL
• The cache sizes are parameterized too, so we can also use it
in Vagrant boxes
• Here are some examples from the module, which I will upload to
GitHub ASAP
Class postgres
The postgres class installs the packages and runs initdb.
init.pp
class postgres (
  $cluster = 'no',
  $running_ip = '127.0.0.1'
){ ...
• The cluster parameter indicates whether we want clustering
• running_ip is used for the SQL commands. In a cluster,
put the cluster's IP address here.
Example in the node file
Here is the result in the node file:
dumbo.pp
node babar {
  class {
    'postgres':
      cluster    => 'yes',
      running_ip => '10.0.10.10',
  }
  include postgres::munin
  include postgres::backup
  include cluster::node
  postgres::config {
    $::fqdn: listen => '*',
  }
Example in the node file
dumbo.pp
  postgres::hba {
    $::fqdn:
      allowedrules => [
        "host all all ${::ipaddress}/32 trust",
        'hostnossl all all 10.0.10.8/32 trust',
        'hostnossl all all 10.0.10.9/32 trust',
        'hostnossl all all 10.0.10.10/32 trust',
        'hostnossl replication replicuser 192.168.177.2/24 trust',
      ],
  }
Example in the node file
dumbo.pp
  postgres::createsuperuser {
    'replicuser':
      passwd => 'foobar',
  }
  postgres::createuser {
    'cerise':
      passwd => 'foobar',
  }
  postgres::createdb {
    'zodb_cerise':
      owner   => 'cerise',
      require => Postgres::Createuser['cerise'],
  }
}
#TODO
• The first synchronisation is not puppetized
• More advanced checks on the database #monitoringsucks
(e.g. slow queries)
• A disaster recovery plan
• Improve the OCF script
• Check the content of the backups
• ...