SlideShare ist ein Scribd-Unternehmen logo
1 von 34
Downloaden Sie, um offline zu lesen
Patroni:
Kubernetes-native
PostgreSQL companion
PGConf APAC 2018
Singapore
ALEXANDER KUKUSHKIN
23-03-2018
2
ABOUT ME
Alexander Kukushkin
Database Engineer @ZalandoTech
Email: alexander.kukushkin@zalando.de
Twitter: @cyberdemn
3
ZALANDO
15 markets
6 fulfillment centers
20 million active customers
3.6 billion € net sales 2016
165 million visits per month
12,000 employees in Europe
4
FACTS & FIGURES
> 300 databases
on premise
> 150
on AWS EC2
> 200
on K8S
5
Bot pattern and Patroni
Postgres-operator
Patroni on Kubernetes, first attempt
Kubernetes-native Patroni
Live-demo
AGENDA
6
● small python daemon
● implements “bot” pattern
● runs next to PostgreSQL
● decides on promotion/demotion
● uses DCS to run leader election and keep cluster state
Bot pattern and Patroni
7
● Distributed Consensus/Configuration Store (Key-Value)
● Uses RAFT (Etcd, Consul) or ZAB (ZooKeeper)
● Write succeed only if majority of nodes acknowledge it
(quorum)
● Supports Atomic operations (CompareAndSet)
● Can expire objects after TTL
http://thesecretlivesofdata.com/raft/
DCS
8
Bot pattern: leader alive
Primary
NODE A
Standby
NODE B
Standby
NODE C
UPDATE(“/leader”, “A”, ttl=30,
prevValue=”A”)Success
WATCH (/leader)
WATCH (/leader)
/leader: “A”, ttl: 30
9
Bot pattern: master dies, leader key holds
Primary
Standby
Standby
WATCH (/leader)
WATCH (/leader)
/leader: “A”, ttl: 17
NODE A
NODE B
NODE C
10
Bot pattern: leader key expires
Standby
Standby
Notify (/leader, expired=true)
Notify (/leader, expired=true)
/leader: “A”, ttl: 0
NODE B
NODE C
11
Bot pattern: who will be the next master?
Standby
Standby
Node B:
GET A:8008/patroni -> failed/timeout
GET C:8008/patroni -> wal_position: 100
Node C:
GET A:8008/patroni -> failed/timeout
GET B:8008/patroni -> wal_position: 100
NODE B
NODE C
12
Bot pattern: leader race among equals
Standby
Standby
/leader: “C”, ttl: 30
CREATE (“/leader”, “C”,
ttl=30, prevExists=False)
CREATE (“/leader”, “B”,
ttl=30, prevExists=False)
FAIL
SUCCESS
NODE B
NODE C
13
Bot pattern: promote and continue
replication
Standby
Primary
/leader: “C”, ttl: 30WATCH(/leader
)
promote
NODE B
NODE C
14
DCS STRUCTURE
● /service/cluster-name/
○ config {"postgresql":{"parameters":{"max_connections":300}}}
○ initialize ”6303731710761975832” (database system identifier)
○ members/
■ dbnode1 {"role":"replica","state":"running”,"conn_url":"postgres://172.17.0.2:5432/postgres"}
■ dbnode2 {"role":"master","state":"running”,"conn_url":"postgres://172.17.0.3:5432/postgres"}
○ leader dbnode2
○ optime/
■ leader “67393608” # ← absolute wal positition
15
AWS DEPLOYMENT
16
“Kubernetes is an open-source system for automating deployment, scaling,
and management of containerized applications.
It groups containers that make up an application into logical units (Pods) for
easy management and discovery. Kubernetes builds upon 15 years of
experience of running production workloads at Google, combined with
best-of-breed ideas and practices from the community.”
kubernetes.io
KUBERNETES
17
Spilo & Patroni on K8S v1
Node
Pod: demo-0
role: replica
PersistentVolume
PersistentVolume
Node
Pod: demo-1
role: master
StatefulSet: demo
Secret: demoUPDATE()
WATCH()
Service: demo-replica
labelSelector: role=replica
Service: demo
labelSelector: role=master
18
Spilo & Patroni on K8S v1
● We will deploy Etcd on Kubernetes
● Depoy Spilo with PetSet (old name for StatefulSet)
● And quickly hack a callback script for Patroni, which will
label the Pod we are running in with the current role
(master, replica)
● And use Services with labelSelectors for traffic routing
19
Can we get rid from Etcd?
● Use labelSelector to find all Kubernetes objects
associated with the given cluster
○ Pods - cluster members
○ ConfigMaps or Endpoints to keep configuration
● Every iteration of HA loop we will update labels and
metadata on the objects (the same way as we updating
keys in Etcd)
● It is even possible to do CAS operation using K8S API
20
No K8S API for expiring objects
How to do leader election?
21
Do it on the client side!
● Leader should periodically update ConfigMap or Endpoint
○ Update must happen as CAS operation
○ Demote to read-only in case of failure
● All other members should check that leader ConfigMap (or
Endpoint) is being updated
○ If there are no updates during TTL => do leader election
22
Kubernetes-native Patroni
Node
Pod: demo-0
role: replica
PersistentVolume
PersistentVolume
Node
Pod: demo-1
role: master
StatefulSet: demo
Endpoint: demo Service: demo
Secret: demo
UPDATE()
W
ATCH()
Endpoint: demo-config
Service: demo-replica
labelSelector: role=replica
23
DEMO TIME
24
● No dependency on Etcd
● When using Endpoint for leader
election we can also maintain
subsets with the IP of the
leader Pod
● 100% Kubernetes-native
solution
Kubernetes API as DCS
CONSPROS
● Can’t tolerante arbitrary clock
skew rate
● OpenShift doesn’t allow to put
IP from the Pods rage into the
Endpoint
● SLA for K8S API on GCE
prommiss only 99.5% availability
25
DEPLOYMENT
26
How to deploy it
● kubectl create -f your-cluster.yaml
● Use Patroni Helm Chart + Spilo
● Use postgres-operator
27
POSTGRES-OPERATOR
● Creates CustomResourceDefinition Postgresql and watches it
● When new Postgresql object is created - deploys a new cluster
○ Creates Secrets, Endpoints, Services and StatefulSet
● When Postgresql object is updated - updates StatefulSet
○ and does a rolling upgrade
● Periodically syncs running clusters with the manifests
● When Postgresql object is deleted - cleans everything up
28
DEPLOYMENT WITH OPERATOR
29
CLUSTER STATUS
30
PostgreSQL
manifest
Stateful set
Spilo pod
Kubernetes cluster
PATRONI
Postgres
operator
pod
Endpoint
Service
Client
application
Postgres
operator
config mapCluster
secrets
Database
deployer
create
create
create
watch
deploy
Update with
actual master
role
31
Monitoring & Backups
● Things to monitor:
○ Pods status (via K8S API)
○ Patroni & PostgreSQL state
○ Replication state and lag
● Always do Backups!
○ And always test them!
GET http://$POD_IP:8008/patroni
for every Pod in the cluster, check
that state=running and compare
xlog_position with the master
32
Our learnings
● We run Kubernetes on top of AWS infrastructure
○ Availability of K8S API in our case is very close to 100%
○ PersistentVolume (EBS) attach/detach sometimes buggy and slow
● Kubernetes cluster upgrade
○ Require rotating all nodes and can cause multiple switchovers
■ Thanks to postgres-operator it is solved, now we need only one
● Kubernetes node autoscaler
○ Sometimes terminates the nodes were Spilo/Patroni/PostgreSQL runs
■ Patroni handles it gracefully, by doing a switchover
33
LINKS
● Patroni: https://github.com/zalando/patroni
● Patroni Documentation: https://patroni.readthedocs.io
● Spilo: https://github.com/zalando/spilo
● Helm chart: https://github.com/unguiculus/charts/tree/feature/patroni/incubator/patroni
● Postgres-operator: https://github.com/zalando-incubator/postgres-operator
Thank you!

Weitere ähnliche Inhalte

Was ist angesagt?

What is new in PostgreSQL 14?
What is new in PostgreSQL 14?What is new in PostgreSQL 14?
What is new in PostgreSQL 14?Mydbops
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performancePostgreSQL-Consulting
 
Postgresql database administration volume 1
Postgresql database administration volume 1Postgresql database administration volume 1
Postgresql database administration volume 1Federico Campoli
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsCommand Prompt., Inc
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL AdministrationEDB
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
PostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication CheatsheetPostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication CheatsheetAlexey Lesovsky
 
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...ScaleGrid.io
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 MinutesSveta Smirnova
 
ProxySQL - High Performance and HA Proxy for MySQL
ProxySQL - High Performance and HA Proxy for MySQLProxySQL - High Performance and HA Proxy for MySQL
ProxySQL - High Performance and HA Proxy for MySQLRené Cannaò
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseAltinity Ltd
 
patroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymentpatroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymenthyeongchae lee
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovAltinity Ltd
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergendistributed matters
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIAltinity Ltd
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions Yugabyte
 

Was ist angesagt? (20)

What is new in PostgreSQL 14?
What is new in PostgreSQL 14?What is new in PostgreSQL 14?
What is new in PostgreSQL 14?
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
 
PostgreSQL and RAM usage
PostgreSQL and RAM usagePostgreSQL and RAM usage
PostgreSQL and RAM usage
 
Postgresql database administration volume 1
Postgresql database administration volume 1Postgresql database administration volume 1
Postgresql database administration volume 1
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System Administrators
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL Administration
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
PostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication CheatsheetPostgreSQL Streaming Replication Cheatsheet
PostgreSQL Streaming Replication Cheatsheet
 
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
 
ClickHouse Keeper
ClickHouse KeeperClickHouse Keeper
ClickHouse Keeper
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 Minutes
 
ProxySQL - High Performance and HA Proxy for MySQL
ProxySQL - High Performance and HA Proxy for MySQLProxySQL - High Performance and HA Proxy for MySQL
ProxySQL - High Performance and HA Proxy for MySQL
 
Query logging with proxysql
Query logging with proxysqlQuery logging with proxysql
Query logging with proxysql
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
 
patroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymentpatroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deployment
 
Introduction to Galera Cluster
Introduction to Galera ClusterIntroduction to Galera Cluster
Introduction to Galera Cluster
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike SteenbergenMeet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
Meet Spilo, Zalando’s HIGH-AVAILABLE POSTGRESQL CLUSTER - Feike Steenbergen
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
 

Ähnlich wie Patroni: Kubernetes-native PostgreSQL companion

PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC
 
PGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander Kukushkin
PGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander KukushkinPGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander Kukushkin
PGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander KukushkinEqunix Business Solutions
 
Intro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps WorkshopIntro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps WorkshopWeaveworks
 
[WSO2Con USA 2018] Deploying Applications in K8S and Docker
[WSO2Con USA 2018] Deploying Applications in K8S and Docker[WSO2Con USA 2018] Deploying Applications in K8S and Docker
[WSO2Con USA 2018] Deploying Applications in K8S and DockerWSO2
 
[WSO2Con EU 2018] Deploying Applications in K8S and Docker
[WSO2Con EU 2018] Deploying Applications in K8S and Docker[WSO2Con EU 2018] Deploying Applications in K8S and Docker
[WSO2Con EU 2018] Deploying Applications in K8S and DockerWSO2
 
Free GitOps Workshop + Intro to Kubernetes & GitOps
Free GitOps Workshop + Intro to Kubernetes & GitOpsFree GitOps Workshop + Intro to Kubernetes & GitOps
Free GitOps Workshop + Intro to Kubernetes & GitOpsWeaveworks
 
WKSctl: Gitops Management of Kubernetes Clusters
WKSctl: Gitops Management of Kubernetes ClustersWKSctl: Gitops Management of Kubernetes Clusters
WKSctl: Gitops Management of Kubernetes ClustersWeaveworks
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusTobias Schmidt
 
[WSO2Con Asia 2018] Deploying Applications in K8S and Docker
[WSO2Con Asia 2018] Deploying Applications in K8S and Docker[WSO2Con Asia 2018] Deploying Applications in K8S and Docker
[WSO2Con Asia 2018] Deploying Applications in K8S and DockerWSO2
 
Yannis Zarkadas. Enterprise data science workflows on kubeflow
Yannis Zarkadas. Enterprise data science workflows on kubeflowYannis Zarkadas. Enterprise data science workflows on kubeflow
Yannis Zarkadas. Enterprise data science workflows on kubeflowMarynaHoldaieva
 
Yannis Zarkadas. Stefano Fioravanzo. Enterprise data science workflows on kub...
Yannis Zarkadas. Stefano Fioravanzo. Enterprise data science workflows on kub...Yannis Zarkadas. Stefano Fioravanzo. Enterprise data science workflows on kub...
Yannis Zarkadas. Stefano Fioravanzo. Enterprise data science workflows on kub...Lviv Startup Club
 
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsDevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsAmbassador Labs
 
Kubernetes Introduction
Kubernetes IntroductionKubernetes Introduction
Kubernetes IntroductionMiloš Zubal
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftChester Chen
 
Deploying PostgreSQL on Kubernetes
Deploying PostgreSQL on KubernetesDeploying PostgreSQL on Kubernetes
Deploying PostgreSQL on KubernetesJimmy Angelakos
 
Creating Kubernetes multi clusters with ClusterAPI in the Hetzner Cloud
Creating Kubernetes multi clusters with ClusterAPI in the Hetzner CloudCreating Kubernetes multi clusters with ClusterAPI in the Hetzner Cloud
Creating Kubernetes multi clusters with ClusterAPI in the Hetzner CloudTobias Schneck
 
Container Camp London (2016-09-09)
Container Camp London (2016-09-09)Container Camp London (2016-09-09)
Container Camp London (2016-09-09)craigbox
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetesRishabh Indoria
 
Scalable Clusters On Demand
Scalable Clusters On DemandScalable Clusters On Demand
Scalable Clusters On DemandBogdan Kyryliuk
 

Ähnlich wie Patroni: Kubernetes-native PostgreSQL companion (20)

PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companionPGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
 
PGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander Kukushkin
PGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander KukushkinPGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander Kukushkin
PGConf.ASIA 2019 Bali - PostgreSQL on K8S at Zalando - Alexander Kukushkin
 
Intro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps WorkshopIntro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps Workshop
 
[WSO2Con USA 2018] Deploying Applications in K8S and Docker
[WSO2Con USA 2018] Deploying Applications in K8S and Docker[WSO2Con USA 2018] Deploying Applications in K8S and Docker
[WSO2Con USA 2018] Deploying Applications in K8S and Docker
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
[WSO2Con EU 2018] Deploying Applications in K8S and Docker
[WSO2Con EU 2018] Deploying Applications in K8S and Docker[WSO2Con EU 2018] Deploying Applications in K8S and Docker
[WSO2Con EU 2018] Deploying Applications in K8S and Docker
 
Free GitOps Workshop + Intro to Kubernetes & GitOps
Free GitOps Workshop + Intro to Kubernetes & GitOpsFree GitOps Workshop + Intro to Kubernetes & GitOps
Free GitOps Workshop + Intro to Kubernetes & GitOps
 
WKSctl: Gitops Management of Kubernetes Clusters
WKSctl: Gitops Management of Kubernetes ClustersWKSctl: Gitops Management of Kubernetes Clusters
WKSctl: Gitops Management of Kubernetes Clusters
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with Prometheus
 
[WSO2Con Asia 2018] Deploying Applications in K8S and Docker
[WSO2Con Asia 2018] Deploying Applications in K8S and Docker[WSO2Con Asia 2018] Deploying Applications in K8S and Docker
[WSO2Con Asia 2018] Deploying Applications in K8S and Docker
 
Yannis Zarkadas. Enterprise data science workflows on kubeflow
Yannis Zarkadas. Enterprise data science workflows on kubeflowYannis Zarkadas. Enterprise data science workflows on kubeflow
Yannis Zarkadas. Enterprise data science workflows on kubeflow
 
Yannis Zarkadas. Stefano Fioravanzo. Enterprise data science workflows on kub...
Yannis Zarkadas. Stefano Fioravanzo. Enterprise data science workflows on kub...Yannis Zarkadas. Stefano Fioravanzo. Enterprise data science workflows on kub...
Yannis Zarkadas. Stefano Fioravanzo. Enterprise data science workflows on kub...
 
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsDevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
 
Kubernetes Introduction
Kubernetes IntroductionKubernetes Introduction
Kubernetes Introduction
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
 
Deploying PostgreSQL on Kubernetes
Deploying PostgreSQL on KubernetesDeploying PostgreSQL on Kubernetes
Deploying PostgreSQL on Kubernetes
 
Creating Kubernetes multi clusters with ClusterAPI in the Hetzner Cloud
Creating Kubernetes multi clusters with ClusterAPI in the Hetzner CloudCreating Kubernetes multi clusters with ClusterAPI in the Hetzner Cloud
Creating Kubernetes multi clusters with ClusterAPI in the Hetzner Cloud
 
Container Camp London (2016-09-09)
Container Camp London (2016-09-09)Container Camp London (2016-09-09)
Container Camp London (2016-09-09)
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
 
Scalable Clusters On Demand
Scalable Clusters On DemandScalable Clusters On Demand
Scalable Clusters On Demand
 

Kürzlich hochgeladen

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 

Kürzlich hochgeladen (20)

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

Patroni: Kubernetes-native PostgreSQL companion

  • 1. Patroni: Kubernetes-native PostgreSQL companion PGConf APAC 2018 Singapore ALEXANDER KUKUSHKIN 23-03-2018
  • 2. 2 ABOUT ME Alexander Kukushkin Database Engineer @ZalandoTech Email: alexander.kukushkin@zalando.de Twitter: @cyberdemn
  • 3. 3 ZALANDO 15 markets 6 fulfillment centers 20 million active customers 3.6 billion € net sales 2016 165 million visits per month 12,000 employees in Europe
  • 4. 4 FACTS & FIGURES > 300 databases on premise > 150 on AWS EC2 > 200 on K8S
  • 5. 5 Bot pattern and Patroni Postgres-operator Patroni on Kubernetes, first attempt Kubernetes-native Patroni Live-demo AGENDA
  • 6. 6 ● small python daemon ● implements “bot” pattern ● runs next to PostgreSQL ● decides on promotion/demotion ● uses DCS to run leader election and keep cluster state Bot pattern and Patroni
  • 7. 7 ● Distributed Consensus/Configuration Store (Key-Value) ● Uses RAFT (Etcd, Consul) or ZAB (ZooKeeper) ● Write succeed only if majority of nodes acknowledge it (quorum) ● Supports Atomic operations (CompareAndSet) ● Can expire objects after TTL http://thesecretlivesofdata.com/raft/ DCS
  • 8. 8 Bot pattern: leader alive Primary NODE A Standby NODE B Standby NODE C UPDATE(“/leader”, “A”, ttl=30, prevValue=”A”)Success WATCH (/leader) WATCH (/leader) /leader: “A”, ttl: 30
  • 9. 9 Bot pattern: master dies, leader key holds Primary Standby Standby WATCH (/leader) WATCH (/leader) /leader: “A”, ttl: 17 NODE A NODE B NODE C
  • 10. 10 Bot pattern: leader key expires Standby Standby Notify (/leader, expired=true) Notify (/leader, expired=true) /leader: “A”, ttl: 0 NODE B NODE C
  • 11. 11 Bot pattern: who will be the next master? Standby Standby Node B: GET A:8008/patroni -> failed/timeout GET C:8008/patroni -> wal_position: 100 Node C: GET A:8008/patroni -> failed/timeout GET B:8008/patroni -> wal_position: 100 NODE B NODE C
  • 12. 12 Bot pattern: leader race among equals Standby Standby /leader: “C”, ttl: 30 CREATE (“/leader”, “C”, ttl=30, prevExists=False) CREATE (“/leader”, “B”, ttl=30, prevExists=False) FAIL SUCCESS NODE B NODE C
  • 13. 13 Bot pattern: promote and continue replication Standby Primary /leader: “C”, ttl: 30WATCH(/leader ) promote NODE B NODE C
  • 14. 14 DCS STRUCTURE ● /service/cluster-name/ ○ config {"postgresql":{"parameters":{"max_connections":300}}} ○ initialize ”6303731710761975832” (database system identifier) ○ members/ ■ dbnode1 {"role":"replica","state":"running”,"conn_url":"postgres://172.17.0.2:5432/postgres"} ■ dbnode2 {"role":"master","state":"running”,"conn_url":"postgres://172.17.0.3:5432/postgres"} ○ leader dbnode2 ○ optime/ ■ leader “67393608” # ← absolute wal positition
  • 16. 16 “Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units (Pods) for easy management and discovery. Kubernetes builds upon 15 years of experience of running production workloads at Google, combined with best-of-breed ideas and practices from the community.” kubernetes.io KUBERNETES
  • 17. 17 Spilo & Patroni on K8S v1 Node Pod: demo-0 role: replica PersistentVolume PersistentVolume Node Pod: demo-1 role: master StatefulSet: demo Secret: demoUPDATE() WATCH() Service: demo-replica labelSelector: role=replica Service: demo labelSelector: role=master
  • 18. 18 Spilo & Patroni on K8S v1 ● We will deploy Etcd on Kubernetes ● Depoy Spilo with PetSet (old name for StatefulSet) ● And quickly hack a callback script for Patroni, which will label the Pod we are running in with the current role (master, replica) ● And use Services with labelSelectors for traffic routing
  • 19. 19 Can we get rid from Etcd? ● Use labelSelector to find all Kubernetes objects associated with the given cluster ○ Pods - cluster members ○ ConfigMaps or Endpoints to keep configuration ● Every iteration of HA loop we will update labels and metadata on the objects (the same way as we updating keys in Etcd) ● It is even possible to do CAS operation using K8S API
  • 20. 20 No K8S API for expiring objects How to do leader election?
  • 21. 21 Do it on the client side! ● Leader should periodically update ConfigMap or Endpoint ○ Update must happen as CAS operation ○ Demote to read-only in case of failure ● All other members should check that leader ConfigMap (or Endpoint) is being updated ○ If there are no updates during TTL => do leader election
  • 22. 22 Kubernetes-native Patroni Node Pod: demo-0 role: replica PersistentVolume PersistentVolume Node Pod: demo-1 role: master StatefulSet: demo Endpoint: demo Service: demo Secret: demo UPDATE() W ATCH() Endpoint: demo-config Service: demo-replica labelSelector: role=replica
  • 24. 24 ● No dependency on Etcd ● When using Endpoint for leader election we can also maintain subsets with the IP of the leader Pod ● 100% Kubernetes-native solution Kubernetes API as DCS CONSPROS ● Can’t tolerante arbitrary clock skew rate ● OpenShift doesn’t allow to put IP from the Pods rage into the Endpoint ● SLA for K8S API on GCE prommiss only 99.5% availability
  • 26. 26 How to deploy it ● kubectl create -f your-cluster.yaml ● Use Patroni Helm Chart + Spilo ● Use postgres-operator
  • 27. 27 POSTGRES-OPERATOR ● Creates CustomResourceDefinition Postgresql and watches it ● When new Postgresql object is created - deploys a new cluster ○ Creates Secrets, Endpoints, Services and StatefulSet ● When Postgresql object is updated - updates StatefulSet ○ and does a rolling upgrade ● Periodically syncs running clusters with the manifests ● When Postgresql object is deleted - cleans everything up
  • 30. 30 PostgreSQL manifest Stateful set Spilo pod Kubernetes cluster PATRONI Postgres operator pod Endpoint Service Client application Postgres operator config mapCluster secrets Database deployer create create create watch deploy Update with actual master role
  • 31. 31 Monitoring & Backups ● Things to monitor: ○ Pods status (via K8S API) ○ Patroni & PostgreSQL state ○ Replication state and lag ● Always do Backups! ○ And always test them! GET http://$POD_IP:8008/patroni for every Pod in the cluster, check that state=running and compare xlog_position with the master
  • 32. 32 Our learnings ● We run Kubernetes on top of AWS infrastructure ○ Availability of K8S API in our case is very close to 100% ○ PersistentVolume (EBS) attach/detach sometimes buggy and slow ● Kubernetes cluster upgrade ○ Require rotating all nodes and can cause multiple switchovers ■ Thanks to postgres-operator it is solved, now we need only one ● Kubernetes node autoscaler ○ Sometimes terminates the nodes were Spilo/Patroni/PostgreSQL runs ■ Patroni handles it gracefully, by doing a switchover
  • 33. 33 LINKS ● Patroni: https://github.com/zalando/patroni ● Patroni Documentation: https://patroni.readthedocs.io ● Spilo: https://github.com/zalando/spilo ● Helm chart: https://github.com/unguiculus/charts/tree/feature/patroni/incubator/patroni ● Postgres-operator: https://github.com/zalando-incubator/postgres-operator