SlideShare ist ein Scribd-Unternehmen logo
1 von 35
Downloaden Sie, um offline zu lesen
1
Alexander Kukushkin
PGConf US 2017, Jersey City
Patroni - HA PostgreSQL made easy
2
ABOUT ME
Alexander Kukushkin
Database Engineer @ZalandoTech
Email: alexander.kukushkin@zalando.de
Twitter: @cyberdemn
3
ZALANDO AT A GLANCE
~3.6billion EURO
net sales 2016
~165
million
visits
per
month
>12,000
employees in
Europe
50%
return rate across
all categories
~20
million
active customers
~200,000
product choices
>1,500
brands
15
countries
4
ZALANDO TECHNOLOGY
BERLIN
5
ZALANDO TECHNOLOGY
BERLIN
DORTMUND
DUBLIN
HELSINKI
ERFURT
MÖNCHENGLADBACH
HAMBURG
6
ZALANDO TECHNOLOGY
● > 150 databases in DC
● > 130 databases on AWS
● > 1600 tech employees
● We are hiring!
7
POSTGRESQL
● Rock-solid by default
● Transactional DDL
● Standard-compliant modern SQL
● Blazing performance
● PostgreSQL is a community
The world’s most advanced open-source database
8
RUNNING DATABASES AT SCALE
9
RUNNING DATABASES AT SCALE
10
CLOUD DATABASES
● Rapid deployments
● Commodity hardware (cattle vs pets)
● Standard configuration and automatic tuning
11
12
AUTOMATIC FAILOVER
“PostgreSQL does not
provide the system
software required to
identify a failure on the
primary and notify the
standby database
server.”
CC0 Public Domain
13
EXISTING AUTOMATIC FAILOVER SOLUTIONS
● Promote a replica when the master is not responding
○ Split brain/potentially many masters
● Use one monitor node to make decisions
○ Monitor node is a single point of failure
○ Former master needs to be killed (STONITH)
● Use multiple monitor nodes
○ Distributed consistency problem
14
DISTRIBUTED CONSISTENCY PROBLEM
https://www.flickr.com/photos/kevandotorg
15
PATRONI APPROACH
● Use Distributed Configuration System (DCS): Etcd, Zookeeper or Consul
● Built-in distributed consensus (RAFT, Zab)
● Session/TTL to expire data (i.e. master key)
● Key-value storage for cluster information
● Atomic operations (CAS)
● Watches for important keys
16
DCS STRUCTURE
● /service/cluster/
○ config
○ initialize
○ members/
■ dbnode1
■ dbnode2
○ leader
○ optime/
■ leader
○ failover
17
● initialize
○ "key": "/service/testcluster/initialize",
"value": "6303731710761975832"
● leader/optime
○ "key": "/service/testcluster/optime/leader",
"value": "67393608"
● config
○ "key": "/service/testcluster/config",
"value": "{"postgresql":{"parameters":{"max_connections":"200"}}}"
KEYS THAT NEVER EXPIRE
18
● leader
○ "key": "/service/testcluster/leader",
"value": "dbnode2",
"ttl": 22
● members
○ "key": "/service/testcluster/members/dbnode2",
“value": "{"role":"master","state":"running","xlog_location":67393608,
"conn_url":"postgres://172.17.0.3:5432/postgres",
"api_url":"http://172.17.0.3:8008/patroni"}",
"ttl": 22
KEYS WITH TTL
19
● Initialization race
● initdb by a winner of an initialization race
● Waiting for the leader key by the rest of the nodes
● Bootstrapping of non-leader nodes (pg_basebackup)
BOOTSTRAPPING OF A NEW CLUSTER
20
● Update the leader key or demote if update failed
● Write the leader/optime (xlog position)
● Update the member key
● Add/delete replication slots for other members
EVENT LOOP OF A RUNNING CLUSTER (MASTER)
21
● Check that the cluster has a leader
○ Check recovery.conf points to the correct leader
○ Join the leader race if a leader is not present
● Add/delete replication slots for cascading replicas
● Update the member key
EVENT LOOP OF A RUNNING CLUSTER (REPLICA)
22
● Check whether the member is the healthiest
○ Evaluate its xlog position against all other members
● Try to acquire the leader lock
● Promote itself to become a master after acquiring the lock
LEADER RACE
23
LEADER RACE
CREATE (“/leader”, “A”, ttl=30, prevExists=False)
CREATE (“/leader”, “B”, ttl=30, prevExists=False)
Success
Fail
promote
A
B
24
LIVE DEMO
25
PATRONI FEATURES
● Manual and Scheduled Failover
● Synchronous mode
● Attach the old master with pg_rewind
● Customizable replica creation methods
● Linux watchdog support (coming soon)
● Pause (maintenance) mode
● patronictl
26
● Change Patroni/PostgreSQL parameters via Patroni REST API
○ Store them in DCS and apply dynamically on all nodes
● Ensure identical configuration of the following parameters on all members:
○ ttl, loop_wait, retry_timeout, maximum_lag_on_failover
○ wal_level, hot_standby
○ max_connections, max_prepared_transactions, max_locks_per_transaction,
max_worker_processes, track_commit_timestamp, wal_log_hints
○ wal_keep_segments, max_replication_slots
● Inform the user that PostgreSQL needs to be restarted (pending_restart flag)
DYNAMIC CONFIGURATION
27
BUILDING HA POSTGRESQL BASED ON PATRONI
● Client traffic routing
○ patroni callbacks
○ confd + haproxy, pgbouncer
● Backup and recovery
○ WAL-E, barman
● Monitoring
○ Nagios, zabbix, zmon
Image by flickr user https://www.flickr.com/photos/brickset/
28
SPILO: DOCKER + PATRONI + WAL-E + AWS/K8S
29
SPILO DEPLOYMENT
30
AUTOMATIC FAILOVER IS HARD
31
WHEN SHOULD THE MASTER DEMOTE ITSELF?
● Chances of data loss vs write availability
● Avoiding too many master switches (retry_timeout, loop_wait, ttl)
● 2 x retry_timeout + loop_wait < ttl
● Zookeeper and Consul session duration quirks
32
CHOOSING A NEW MASTER
● Reliability/performance of the host or connection
○ nofailover tag
● XLOG position
○ highest xlog position = the best candidate
○ xlog > leader/optime - maximum_lag_on_failover
■ maximum_lag_on_failover > size of WAL segment (16MB) for disaster recovery
33
ATTACHING THE OLD MASTER BACK AS REPLICA
● Diverged timelines after the former master crash
● pg_rewind
○ use_pg_rewind
○ remove_data_directory_on_rewind_failure
34
USEFUL LINKS
● Spilo: https://github.com/zalando/spilo
● Confd: http://www.confd.io
● Etcd: https://github.com/coreos/etcd
● RAFT: http://thesecretlivesofdata.com/raft/
35
Questions?
https://github.com/zalando/patroni

Weitere ähnliche Inhalte

Was ist angesagt?

The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialThe Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialJean-François Gagné
 
PostgreSQL HA
PostgreSQL   HAPostgreSQL   HA
PostgreSQL HAharoonm
 
Tuning Autovacuum in Postgresql
Tuning Autovacuum in PostgresqlTuning Autovacuum in Postgresql
Tuning Autovacuum in PostgresqlMydbops
 
PostgreSQL Replication High Availability Methods
PostgreSQL Replication High Availability MethodsPostgreSQL Replication High Availability Methods
PostgreSQL Replication High Availability MethodsMydbops
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergendistributed matters
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
PostgreSQL Deep Internal
PostgreSQL Deep InternalPostgreSQL Deep Internal
PostgreSQL Deep InternalEXEM
 
Webinar: PostgreSQL continuous backup and PITR with Barman
Webinar: PostgreSQL continuous backup and PITR with BarmanWebinar: PostgreSQL continuous backup and PITR with Barman
Webinar: PostgreSQL continuous backup and PITR with BarmanGabriele Bartolini
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundMasahiko Sawada
 
ProxySQL for MySQL
ProxySQL for MySQLProxySQL for MySQL
ProxySQL for MySQLMydbops
 
Postgresql database administration volume 1
Postgresql database administration volume 1Postgresql database administration volume 1
Postgresql database administration volume 1Federico Campoli
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsAlexander Korotkov
 
ProxySQL and the Tricks Up Its Sleeve - Percona Live 2022.pdf
ProxySQL and the Tricks Up Its Sleeve - Percona Live 2022.pdfProxySQL and the Tricks Up Its Sleeve - Percona Live 2022.pdf
ProxySQL and the Tricks Up Its Sleeve - Percona Live 2022.pdfJesmar Cannao'
 
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte Data
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte DataProblems with PostgreSQL on Multi-core Systems with MultiTerabyte Data
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte DataJignesh Shah
 
Multi Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on KubernetesMulti Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on KubernetesOhyama Masanori
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIAltinity Ltd
 

Was ist angesagt? (20)

The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialThe Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication Tutorial
 
PostgreSQL HA
PostgreSQL   HAPostgreSQL   HA
PostgreSQL HA
 
Tuning Autovacuum in Postgresql
Tuning Autovacuum in PostgresqlTuning Autovacuum in Postgresql
Tuning Autovacuum in Postgresql
 
PostgreSQL replication
PostgreSQL replicationPostgreSQL replication
PostgreSQL replication
 
PostgreSQL Replication High Availability Methods
PostgreSQL Replication High Availability MethodsPostgreSQL Replication High Availability Methods
PostgreSQL Replication High Availability Methods
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
PostgreSQL and RAM usage
PostgreSQL and RAM usagePostgreSQL and RAM usage
PostgreSQL and RAM usage
 
PostgreSQL Deep Internal
PostgreSQL Deep InternalPostgreSQL Deep Internal
PostgreSQL Deep Internal
 
Webinar: PostgreSQL continuous backup and PITR with Barman
Webinar: PostgreSQL continuous backup and PITR with BarmanWebinar: PostgreSQL continuous backup and PITR with Barman
Webinar: PostgreSQL continuous backup and PITR with Barman
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
 
5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance
 
ProxySQL for MySQL
ProxySQL for MySQLProxySQL for MySQL
ProxySQL for MySQL
 
Postgresql database administration volume 1
Postgresql database administration volume 1Postgresql database administration volume 1
Postgresql database administration volume 1
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
 
ProxySQL and the Tricks Up Its Sleeve - Percona Live 2022.pdf
ProxySQL and the Tricks Up Its Sleeve - Percona Live 2022.pdfProxySQL and the Tricks Up Its Sleeve - Percona Live 2022.pdf
ProxySQL and the Tricks Up Its Sleeve - Percona Live 2022.pdf
 
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte Data
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte DataProblems with PostgreSQL on Multi-core Systems with MultiTerabyte Data
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte Data
 
Multi Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on KubernetesMulti Master PostgreSQL Cluster on Kubernetes
Multi Master PostgreSQL Cluster on Kubernetes
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
 

Ähnlich wie Patroni - HA PostgreSQL made easy

Eko10 Workshop Opensource Database Auditing
Eko10  Workshop Opensource Database AuditingEko10  Workshop Opensource Database Auditing
Eko10 Workshop Opensource Database AuditingJuan Berner
 
Eko10 workshop - OPEN SOURCE DATABASE MONITORING
Eko10 workshop - OPEN SOURCE DATABASE MONITORINGEko10 workshop - OPEN SOURCE DATABASE MONITORING
Eko10 workshop - OPEN SOURCE DATABASE MONITORINGPablo Garbossa
 
PostgreSQL Monitoring using modern software stacks
PostgreSQL Monitoring using modern software stacksPostgreSQL Monitoring using modern software stacks
PostgreSQL Monitoring using modern software stacksShowmax Engineering
 
Elasticsearch on Kubernetes
Elasticsearch on KubernetesElasticsearch on Kubernetes
Elasticsearch on KubernetesJoerg Henning
 
PERFORMANCE_SCHEMA and sys schema
PERFORMANCE_SCHEMA and sys schemaPERFORMANCE_SCHEMA and sys schema
PERFORMANCE_SCHEMA and sys schemaFromDual GmbH
 
PGConf.ASIA 2019 Bali - Patroni in 2019 - Alexander Kukushkin
PGConf.ASIA 2019 Bali - Patroni in 2019 - Alexander KukushkinPGConf.ASIA 2019 Bali - Patroni in 2019 - Alexander Kukushkin
PGConf.ASIA 2019 Bali - Patroni in 2019 - Alexander KukushkinEqunix Business Solutions
 
MariaDB Paris Workshop 2023 - Performance Optimization
MariaDB Paris Workshop 2023 - Performance OptimizationMariaDB Paris Workshop 2023 - Performance Optimization
MariaDB Paris Workshop 2023 - Performance OptimizationMariaDB plc
 
Journey through high performance django application
Journey through high performance django applicationJourney through high performance django application
Journey through high performance django applicationbangaloredjangousergroup
 
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC
 
PostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesPostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesInMobi Technology
 
MySQL HA Orchestrator Proxysql Consul.pdf
MySQL HA Orchestrator Proxysql Consul.pdfMySQL HA Orchestrator Proxysql Consul.pdf
MySQL HA Orchestrator Proxysql Consul.pdfYunusShaikh49
 
FOSDEM 2015: gdb tips and tricks for MySQL DBAs
FOSDEM 2015: gdb tips and tricks for MySQL DBAsFOSDEM 2015: gdb tips and tricks for MySQL DBAs
FOSDEM 2015: gdb tips and tricks for MySQL DBAsValerii Kravchuk
 
MySQL for Oracle DBAs
MySQL for Oracle DBAsMySQL for Oracle DBAs
MySQL for Oracle DBAsFromDual GmbH
 
Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2PgTraining
 
High performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbHigh performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbWei Shan Ang
 
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json  postgre-sql vs. mongodbPGConf APAC 2018 - High performance json  postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json postgre-sql vs. mongodbPGConf APAC
 
Learn backend java script
Learn backend java scriptLearn backend java script
Learn backend java scriptTsuyoshi Maeda
 

Ähnlich wie Patroni - HA PostgreSQL made easy (20)

Eko10 Workshop Opensource Database Auditing
Eko10  Workshop Opensource Database AuditingEko10  Workshop Opensource Database Auditing
Eko10 Workshop Opensource Database Auditing
 
Eko10 workshop - OPEN SOURCE DATABASE MONITORING
Eko10 workshop - OPEN SOURCE DATABASE MONITORINGEko10 workshop - OPEN SOURCE DATABASE MONITORING
Eko10 workshop - OPEN SOURCE DATABASE MONITORING
 
PostgreSQL Monitoring using modern software stacks
PostgreSQL Monitoring using modern software stacksPostgreSQL Monitoring using modern software stacks
PostgreSQL Monitoring using modern software stacks
 
Elasticsearch on Kubernetes
Elasticsearch on KubernetesElasticsearch on Kubernetes
Elasticsearch on Kubernetes
 
Varnish - PLNOG 4
Varnish - PLNOG 4Varnish - PLNOG 4
Varnish - PLNOG 4
 
PERFORMANCE_SCHEMA and sys schema
PERFORMANCE_SCHEMA and sys schemaPERFORMANCE_SCHEMA and sys schema
PERFORMANCE_SCHEMA and sys schema
 
The Accidental DBA
The Accidental DBAThe Accidental DBA
The Accidental DBA
 
PGConf.ASIA 2019 Bali - Patroni in 2019 - Alexander Kukushkin
PGConf.ASIA 2019 Bali - Patroni in 2019 - Alexander KukushkinPGConf.ASIA 2019 Bali - Patroni in 2019 - Alexander Kukushkin
PGConf.ASIA 2019 Bali - Patroni in 2019 - Alexander Kukushkin
 
MariaDB Paris Workshop 2023 - Performance Optimization
MariaDB Paris Workshop 2023 - Performance OptimizationMariaDB Paris Workshop 2023 - Performance Optimization
MariaDB Paris Workshop 2023 - Performance Optimization
 
Web scale monitoring
Web scale monitoringWeb scale monitoring
Web scale monitoring
 
Journey through high performance django application
Journey through high performance django applicationJourney through high performance django application
Journey through high performance django application
 
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
 
PostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesPostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major Features
 
MySQL HA Orchestrator Proxysql Consul.pdf
MySQL HA Orchestrator Proxysql Consul.pdfMySQL HA Orchestrator Proxysql Consul.pdf
MySQL HA Orchestrator Proxysql Consul.pdf
 
FOSDEM 2015: gdb tips and tricks for MySQL DBAs
FOSDEM 2015: gdb tips and tricks for MySQL DBAsFOSDEM 2015: gdb tips and tricks for MySQL DBAs
FOSDEM 2015: gdb tips and tricks for MySQL DBAs
 
MySQL for Oracle DBAs
MySQL for Oracle DBAsMySQL for Oracle DBAs
MySQL for Oracle DBAs
 
Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2
 
High performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbHigh performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodb
 
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json  postgre-sql vs. mongodbPGConf APAC 2018 - High performance json  postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
 
Learn backend java script
Learn backend java scriptLearn backend java script
Learn backend java script
 

Kürzlich hochgeladen

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 

Kürzlich hochgeladen (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

Patroni - HA PostgreSQL made easy

  • 1. 1 Alexander Kukushkin PGConf US 2017, Jersey City Patroni - HA PostgreSQL made easy
  • 2. 2 ABOUT ME Alexander Kukushkin Database Engineer @ZalandoTech Email: alexander.kukushkin@zalando.de Twitter: @cyberdemn
  • 3. 3 ZALANDO AT A GLANCE ~3.6billion EURO net sales 2016 ~165 million visits per month >12,000 employees in Europe 50% return rate across all categories ~20 million active customers ~200,000 product choices >1,500 brands 15 countries
  • 6. 6 ZALANDO TECHNOLOGY ● > 150 databases in DC ● > 130 databases on AWS ● > 1600 tech employees ● We are hiring!
  • 7. 7 POSTGRESQL ● Rock-solid by default ● Transactional DDL ● Standard-compliant modern SQL ● Blazing performance ● PostgreSQL is a community The world’s most advanced open-source database
  • 10. 10 CLOUD DATABASES ● Rapid deployments ● Commodity hardware (cattle vs pets) ● Standard configuration and automatic tuning
  • 11. 11
  • 12. 12 AUTOMATIC FAILOVER “PostgreSQL does not provide the system software required to identify a failure on the primary and notify the standby database server.” CC0 Public Domain
  • 13. 13 EXISTING AUTOMATIC FAILOVER SOLUTIONS ● Promote a replica when the master is not responding ○ Split brain/potentially many masters ● Use one monitor node to make decisions ○ Monitor node is a single point of failure ○ Former master needs to be killed (STONITH) ● Use multiple monitor nodes ○ Distributed consistency problem
  • 15. 15 PATRONI APPROACH ● Use Distributed Configuration System (DCS): Etcd, Zookeeper or Consul ● Built-in distributed consensus (RAFT, Zab) ● Session/TTL to expire data (i.e. master key) ● Key-value storage for cluster information ● Atomic operations (CAS) ● Watches for important keys
  • 16. 16 DCS STRUCTURE ● /service/cluster/ ○ config ○ initialize ○ members/ ■ dbnode1 ■ dbnode2 ○ leader ○ optime/ ■ leader ○ failover
  • 17. 17 ● initialize ○ "key": "/service/testcluster/initialize", "value": "6303731710761975832" ● leader/optime ○ "key": "/service/testcluster/optime/leader", "value": "67393608" ● config ○ "key": "/service/testcluster/config", "value": "{"postgresql":{"parameters":{"max_connections":"200"}}}" KEYS THAT NEVER EXPIRE
  • 18. 18 ● leader ○ "key": "/service/testcluster/leader", "value": "dbnode2", "ttl": 22 ● members ○ "key": "/service/testcluster/members/dbnode2", “value": "{"role":"master","state":"running","xlog_location":67393608, "conn_url":"postgres://172.17.0.3:5432/postgres", "api_url":"http://172.17.0.3:8008/patroni"}", "ttl": 22 KEYS WITH TTL
  • 19. 19 ● Initialization race ● initdb by a winner of an initialization race ● Waiting for the leader key by the rest of the nodes ● Bootstrapping of non-leader nodes (pg_basebackup) BOOTSTRAPPING OF A NEW CLUSTER
  • 20. 20 ● Update the leader key or demote if update failed ● Write the leader/optime (xlog position) ● Update the member key ● Add/delete replication slots for other members EVENT LOOP OF A RUNNING CLUSTER (MASTER)
  • 21. 21 ● Check that the cluster has a leader ○ Check recovery.conf points to the correct leader ○ Join the leader race if a leader is not present ● Add/delete replication slots for cascading replicas ● Update the member key EVENT LOOP OF A RUNNING CLUSTER (REPLICA)
  • 22. 22 ● Check whether the member is the healthiest ○ Evaluate its xlog position against all other members ● Try to acquire the leader lock ● Promote itself to become a master after acquiring the lock LEADER RACE
  • 23. 23 LEADER RACE CREATE (“/leader”, “A”, ttl=30, prevExists=False) CREATE (“/leader”, “B”, ttl=30, prevExists=False) Success Fail promote A B
  • 25. 25 PATRONI FEATURES ● Manual and Scheduled Failover ● Synchronous mode ● Attach the old master with pg_rewind ● Customizable replica creation methods ● Linux watchdog support (coming soon) ● Pause (maintenance) mode ● patronictl
  • 26. 26 ● Change Patroni/PostgreSQL parameters via Patroni REST API ○ Store them in DCS and apply dynamically on all nodes ● Ensure identical configuration of the following parameters on all members: ○ ttl, loop_wait, retry_timeout, maximum_lag_on_failover ○ wal_level, hot_standby ○ max_connections, max_prepared_transactions, max_locks_per_transaction, max_worker_processes, track_commit_timestamp, wal_log_hints ○ wal_keep_segments, max_replication_slots ● Inform the user that PostgreSQL needs to be restarted (pending_restart flag) DYNAMIC CONFIGURATION
  • 27. 27 BUILDING HA POSTGRESQL BASED ON PATRONI ● Client traffic routing ○ patroni callbacks ○ confd + haproxy, pgbouncer ● Backup and recovery ○ WAL-E, barman ● Monitoring ○ Nagios, zabbix, zmon Image by flickr user https://www.flickr.com/photos/brickset/
  • 28. 28 SPILO: DOCKER + PATRONI + WAL-E + AWS/K8S
  • 31. 31 WHEN SHOULD THE MASTER DEMOTE ITSELF? ● Chances of data loss vs write availability ● Avoiding too many master switches (retry_timeout, loop_wait, ttl) ● 2 x retry_timeout + loop_wait < ttl ● Zookeeper and Consul session duration quirks
  • 32. 32 CHOOSING A NEW MASTER ● Reliability/performance of the host or connection ○ nofailover tag ● XLOG position ○ highest xlog position = the best candidate ○ xlog > leader/optime - maximum_lag_on_failover ■ maximum_lag_on_failover > size of WAL segment (16MB) for disaster recovery
  • 33. 33 ATTACHING THE OLD MASTER BACK AS REPLICA ● Diverged timelines after the former master crash ● pg_rewind ○ use_pg_rewind ○ remove_data_directory_on_rewind_failure
  • 34. 34 USEFUL LINKS ● Spilo: https://github.com/zalando/spilo ● Confd: http://www.confd.io ● Etcd: https://github.com/coreos/etcd ● RAFT: http://thesecretlivesofdata.com/raft/