Distributed Fun with etcd
And the consensus problem
DistSys Riyadh Meetup
Abdulaziz AlMalki @almalki_am
Agenda
● The consensus problem
● Paxos and raft
● What is etcd?
● etcd use cases
● etcd as a kv store
● etcd consistency guarantees
● etcd failure modes
● Leader election
● Distributed locks
● Distributed cluster configuration
● Service discovery
● How kubernetes uses etcd
● Demo:
○ PostgreSQL leader election with patroni and etcd
○ Using etcd and confd for dynamic pull based cluster reconfiguration
The consensus problem
What is consensus?
Getting a group of processes to agree on a value
Properties:
● Termination: eventually, every non-faulty process decides some value
● Agreement: all non-faulty processes decide the same value
● Integrity: a process decides only once
● Validity: the value must have been proposed by some process
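The four properties can be checked mechanically on the record of one finished run. The sketch below is only an illustration: the function name, the trace format, and the process names are assumptions, and termination can only be checked as "has decided by the end of the trace".

```python
from typing import Dict, Hashable, Iterable, List


def check_consensus(proposed: Iterable[Hashable],
                    decisions: Dict[str, List[Hashable]],
                    non_faulty: Iterable[str]) -> Dict[str, bool]:
    """Check one finished run against the four consensus properties.

    proposed   -- values proposed by any process
    decisions  -- process name -> values it decided, in order (hypothetical format)
    non_faulty -- names of the processes that did not fail
    """
    proposed_set = set(proposed)
    decided = [value for values in decisions.values() for value in values]
    return {
        # every non-faulty process has decided at least one value
        "termination": all(bool(decisions.get(p)) for p in non_faulty),
        # no two decided values differ
        "agreement": len(set(decided)) <= 1,
        # no process decided more than once
        "integrity": all(len(values) <= 1 for values in decisions.values()),
        # every decided value was proposed by some process
        "validity": all(value in proposed_set for value in decided),
    }
```

For example, a run where p1 and p2 both decide "a" and the failed p3 decides nothing satisfies all four checks.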
The consensus problem
Reaching an agreement (consensus) is an important step in many distributed
computing problems:
● synchronizing replicated state machines and making sure all replicas have the
same (consistent) view of system state.
● electing a leader
● mutual exclusion (distributed locks)
● managing group membership/failure detection
● deciding to commit or abort for distributed transactions
But...
There's always a but.
Is it possible to achieve consensus in distributed systems?
It depends...
Distributed System Models
Synchronous model
● messages are received within a known bounded time
● the drift of each process's local clock has a known bound
● each step in a process has a known time bound
● e.g. a supercomputer
Asynchronous model
● no bounds on message transmission delays
● arbitrary drift rate of local clocks
● no bounds on process execution
● e.g. the Internet
Back to consensus
Is it possible to achieve consensus in distributed systems?
Yes & No
Yes in the synchronous model
No in the asynchronous model: no deterministic protocol can guarantee termination if even one process may fail
Why?
FLP Proof
Impossibility of distributed consensus with one faulty process (1985)
Fischer, Lynch and Paterson
https://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf
Result:
“We show that every protocol for this problem has the possibility of nontermination,
even with only one faulty process. By way of contrast, solutions are known for the
synchronous case, the "Byzantine Generals" problem.”
Paxos
Leslie Lamport discovered the algorithm in the late 1980s
Used by Google Chubby
Guarantees safety, but not liveness
● Safety: agreement property, guaranteed
● Liveness: termination property, not guaranteed
Eventual liveness
Hard to understand and implement!
Raft
Reliable, Replicated, Redundant, And Fault-Tolerant
(was supposed to be named Redundo)
https://groups.google.com/forum/#!topic/raft-dev/95rZqptGpmU
Developed by Diego Ongaro and John Ousterhout from Stanford University
Designed to be easy to understand
Published in 2014: https://raft.github.io/raft.pdf
More Info and related research can be found here: https://raft.github.io/
Demo
The Secret Lives of Data (An interactive demo that explains how raft works)
http://thesecretlivesofdata.com/raft/
RaftScope: a raft cluster running in your browser that you can interact with to see
Raft in action
https://raft.github.io/raftscope/
etcd playground
http://play.etcd.io/play
etcd
etcd is a distributed key value store that provides a reliable way to store data
across a cluster of machines.
etcd is used by Kubernetes as the backend for service discovery and for storing
cluster state and configuration
Cloud Foundry uses etcd to store cluster state and configuration and as a global
lock service
etcd
etcd is written in Go and uses the Raft consensus algorithm to manage a
highly-available replicated log.
https://github.com/etcd-io/etcd
Production-grade
The name comes from the Unix "/etc" directory plus "d" for distributed systems
Originally developed for CoreOS to get automatic, zero-downtime Linux kernel
updates using Locksmith, which implements a distributed semaphore over etcd to
ensure that only a subset of a cluster is rebooting at any given time.
etcd use cases
Best suited to storing metadata and configuration, for example to coordinate
processes
Can handle a few GB of data with consistent ordering
etcd replicates all data within a single consistent replication group, no sharding
etcd provides distributed coordination primitives such as event watches, leases,
elections, and distributed shared locks out of the box.
etcd as a kv store
The API is exposed as gRPC services:
● KV - Creates, updates, fetches, and deletes key-value pairs.
● Watch - Monitors changes to keys.
● Lease - Primitives for consuming client keep-alive messages.
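How the three services fit together can be sketched with a toy in-memory model. This is not etcd's client API: the class ToyKV, its method names, and the explicit `now` clock are all assumptions made for illustration. It shows a global revision bumped on every change, watchers notified per key, and keys deleted when their lease expires.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple


class ToyKV:
    """In-memory stand-in for KV, Watch and Lease (hypothetical, not etcd's API)."""

    def __init__(self) -> None:
        self.revision = 0  # bumped on every change
        self.data: Dict[str, Tuple[str, Optional[int]]] = {}  # key -> (value, lease id)
        self.watchers: Dict[str, List[Callable]] = defaultdict(list)
        self.leases: Dict[int, float] = {}  # lease id -> expiry time

    def _notify(self, kind: str, key: str, value: Optional[str]) -> None:
        self.revision += 1
        for callback in self.watchers[key]:
            callback(kind, key, value, self.revision)

    # KV
    def put(self, key: str, value: str, lease: Optional[int] = None) -> None:
        if lease is not None and lease not in self.leases:
            raise KeyError("unknown lease")
        self.data[key] = (value, lease)
        self._notify("PUT", key, value)

    def get(self, key: str) -> Optional[str]:
        entry = self.data.get(key)
        return entry[0] if entry else None

    def delete(self, key: str) -> None:
        if self.data.pop(key, None) is not None:
            self._notify("DELETE", key, None)

    # Watch
    def watch(self, key: str, callback: Callable) -> None:
        self.watchers[key].append(callback)

    # Lease
    def grant(self, lease_id: int, ttl: float, now: float) -> None:
        self.leases[lease_id] = now + ttl

    def keep_alive(self, lease_id: int, ttl: float, now: float) -> None:
        if lease_id in self.leases:
            self.leases[lease_id] = now + ttl

    def tick(self, now: float) -> None:
        """Expire leases and delete the keys attached to them."""
        expired = [lid for lid, expiry in self.leases.items() if expiry <= now]
        for lease_id in expired:
            del self.leases[lease_id]
            attached = [k for k, (_, lid) in self.data.items() if lid == lease_id]
            for key in attached:
                self.delete(key)
```

A client that stops sending keep-alives simply lets its lease run out, and every watcher of its keys sees a DELETE event.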
Demo
etcdctl
https://github.com/etcd-io/etcd/blob/master/etcdctl/README.md
Interacting with etcd
https://github.com/etcd-io/etcd/blob/master/Documentation/dev-guide/interacting_v3.md
etcd consistency guarantees
● Atomicity
○ All API requests are atomic; an operation either completes entirely or not at all.
○ For watch requests, all events generated by one operation will be in one watch response.
● Consistency
○ sequential consistency: a client reads the same events in the same order
○ etcd does not ensure linearizability for watch operations
○ etcd ensures linearizability for all other operations by default
○ For lower latency and higher throughput, use serializable reads, which may return stale data
with respect to the quorum
● Isolation
○ etcd ensures serializable isolation
● Durability
○ Any completed operations are durable
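The difference between a linearizable and a serializable read can be illustrated with a toy model of one lagging follower. Everything here is an assumption for illustration (the class ToyCluster and its methods are hypothetical); a real member coordinates with the leader rather than reading a shared list.

```python
from typing import Dict, List, Optional, Tuple


class ToyCluster:
    """One committed log plus one follower that applies it lazily (hypothetical)."""

    def __init__(self) -> None:
        self.committed: List[Tuple[str, str]] = []  # entries agreed by the quorum
        self.follower_applied = 0                   # how many entries the follower applied
        self.follower_state: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        # The write is committed by the quorum; the follower has not applied it yet.
        self.committed.append((key, value))

    def _catch_up(self) -> None:
        for key, value in self.committed[self.follower_applied:]:
            self.follower_state[key] = value
        self.follower_applied = len(self.committed)

    def serializable_get(self, key: str) -> Optional[str]:
        # Served from local state only: fast, but possibly stale.
        return self.follower_state.get(key)

    def linearizable_get(self, key: str) -> Optional[str]:
        # The follower first applies everything committed so far, then answers.
        self._catch_up()
        return self.follower_state.get(key)
```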
etcd failure modes
Minor followers failure
● with fewer than half of the members failing, etcd continues running
● clients should automatically reconnect to other operating members
Leader failure
● etcd cluster automatically elects a new leader
● electing a new leader takes about one election timeout
● write requests sent during the election are queued until a new leader is elected
● writes already sent to the old leader but not yet committed may be lost
Majority failure
● etcd cluster fails and cannot accept more writes
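● the cluster recovers once a majority of members become available again
Network partition
● the side with a majority of members stays available; the minority side becomes unavailable
● seen from the majority side, it is a minor followers failure if the leader is on that side, or a leader failure if the leader is on the minority side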
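The failure modes above all follow from majority quorum arithmetic. A minimal sketch, with hypothetical function names:

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


def can_commit_writes(members: int, reachable: int) -> bool:
    """A side of a partition accepts writes only if it holds a majority."""
    return reachable >= quorum(members)
```

A 5-member cluster split 3 | 2 keeps accepting writes on the 3-member side only. Going from 3 to 4 members raises the quorum from 2 to 3 without tolerating any additional failure, which is why odd cluster sizes are the usual choice.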
Leader election
https://github.com/etcd-io/etcd/blob/v3.2.17/Documentation/dev-guide/api_concurrency_reference_v3.md
Distributed locks
https://github.com/etcd-io/etcd/blob/v3.2.17/Documentation/dev-guide/api_concurrency_reference_v3.md
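One common way to build both leader election and a lock on a store like etcd is to have each contender create its own key under a shared prefix and treat the oldest surviving key as the winner. The toy model below sketches that idea only; the class ToyElection and its methods are hypothetical and do not reproduce the API documented at the link above.

```python
import itertools
from typing import Dict, Optional


class ToyElection:
    """Contenders under one prefix; the oldest live entry is the leader or lock holder."""

    def __init__(self) -> None:
        self._revisions = itertools.count(1)
        self._contenders: Dict[str, int] = {}  # candidate -> creation revision

    def campaign(self, candidate: str) -> None:
        # A candidate that is already waiting keeps its original position.
        if candidate not in self._contenders:
            self._contenders[candidate] = next(self._revisions)

    def resign(self, candidate: str) -> None:
        # Also models a crashed candidate whose lease expired.
        self._contenders.pop(candidate, None)

    def leader(self) -> Optional[str]:
        if not self._contenders:
            return None
        return min(self._contenders, key=self._contenders.get)
```

Because the order is fixed at creation time, a candidate that resigns and campaigns again goes to the back of the queue.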
Distributed cluster configuration
Use etcd as a central configuration store
● all consumers have immediate access to configuration data
● etcd makes it easy for applications to watch for changes
● reduces the time between a configuration change and propagation of that
change throughout the infrastructure
● failed nodes get latest config immediately after recovery
(Pushing config files to servers lacks all of the above)
Service Discovery
Services register/heartbeat/deregister themselves
Clients (or load balancers) watch etcd for endpoints and use it to connect
e.g.
/services/<service_name>/<instance_id> = <instance_address>
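A client that has read all keys under /services/ can turn them into an endpoint table. The sketch assumes the key layout shown above; the function name and the sample addresses are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def endpoints_by_service(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, str]]:
    """Group (key, address) pairs by service name, then by instance id."""
    services: Dict[str, Dict[str, str]] = defaultdict(dict)
    for key, address in pairs:
        parts = key.strip("/").split("/")
        # Skip anything that does not match /services/<service_name>/<instance_id>
        if len(parts) != 3 or parts[0] != "services":
            continue
        _, service, instance = parts
        services[service][instance] = address
    return dict(services)
```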
How kubernetes uses etcd
● Kubernetes stores data, state, and metadata in etcd
● All access to etcd goes through the apiserver
● Kubernetes stores the ideal state and the actual state.
● Kubernetes control loop (kube-controller-manager) watches these states of the
cluster through the apiserver and if these two states have diverged, it’ll make
changes to reconcile them.
● Clusters using etcd3 preserve changes in the last 5 minutes by default.
GET /api/v1/namespaces/test/pods?watch=1&resourceVersion=10245
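The reconcile step of a control loop can be sketched as a diff between the two states. This is a simplification with hypothetical names and a made-up state format (object name -> spec), not the kube-controller-manager's code.

```python
from typing import Dict, List, Tuple


def reconcile(desired: Dict[str, dict], actual: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Return the actions needed to move the actual state to the desired state."""
    actions: List[Tuple[str, str]] = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions
```

When the two states already match, the function returns an empty list and the loop has nothing to do.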
How kubernetes uses etcd
Figure: Create Pod flow. Source: heptio.com
Patroni
Patroni: A Template for PostgreSQL HA with ZooKeeper, etcd or Consul
https://github.com/zalando/patroni
https://github.com/zalando/patroni/blob/master/patroni/dcs/etcd.py
Patroni originated as a fork of Governor, the project from Compose
https://github.com/helm/charts/tree/master/incubator/patroni
HA PostgreSQL Clusters with Docker
https://github.com/zalando/spilo
Confd
Manage local application configuration files using templates and data from etcd
http://www.confd.io/
● Sync configuration files by polling etcd and processing template resources.
● Reloading applications to pick up new config file changes
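The poll, render, reload cycle can be sketched as follows. This uses Python's string.Template as a stand-in for confd's own template syntax, and the function name and sample template are assumptions.

```python
from string import Template
from typing import Dict, Optional


def render_if_changed(template: str, values: Dict[str, str],
                      current: Optional[str]) -> Optional[str]:
    """Render the template from polled values.

    Returns the new file content, or None when it matches `current`,
    in which case no rewrite and no application reload is needed.
    """
    rendered = Template(template).substitute(values)
    return None if rendered == current else rendered
```

Reloading only when the rendered output differs keeps a frequent polling interval from restarting the application on every poll.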
References and further reading
A Brief Tour of FLP Impossibility
https://www.the-paper-trail.org/post/2008-08-13-a-brief-tour-of-flp-impossibility/
Distributed Systems, Failures, and Consensus
https://www2.cs.duke.edu/courses/fall07/cps212/consensus.pdf
Consensus
https://www.cs.rutgers.edu/~pxk/417/notes/content/consensus.html
etcd github
https://github.com/etcd-io/etcd
etcd Concurrency primitives
https://github.com/etcd-io/etcd/tree/master/clientv3/concurrency
Consistency Models
https://jepsen.io/consistency
https://aphyr.com/posts/313-strong-consistency-models
Cloud Computing Concepts, Part 1 & 2
https://www.coursera.org/learn/cloud-computing/
https://www.coursera.org/learn/cloud-computing-2
Distributed Consensus
https://homepage.cs.uiowa.edu/~ghosh/16612.week11.pdf
How to Build a Highly Available System Using Consensus
https://www.microsoft.com/en-us/research/publication/how-to-build-a-highly-available-system-using-consensus/
In Search of an Understandable Consensus Algorithm
https://www.usenix.org/conference/atc14/technical-sessions/presentation/ongaro
Tech Talk - Raft, In Search of an Understandable Consensus Algorithm by Diego Ongaro
https://www.youtube.com/watch?v=LAqyTyNUYSY&feature=youtu.be
The Raft Consensus Algorithm
https://raft.github.io/
State machine replication
https://en.wikipedia.org/wiki/State_machine_replication
Kube-controller-manager
https://kubernetes.io/docs/concepts/overview/components/
https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/
go-config: a dynamic config framework
https://github.com/micro/go-config

Distributed fun with etcd

  • 1. Distributed Fun with And the consensus problem DistSys Riyadh Meetup Abdulaziz AlMalki @almalki_am
  • 2. Agenda ● The consensus problem ● Paxos and raft ● What is etcd? ● etcd use cases ● etcd as a kv store ● etcd consistency guarantees ● etcd failure modes ● Leader election ● Distributed locks
  • 3. Agenda ● Distributed cluster configuration ● Service discovery ● How kubernetes uses etcd ● Demo: ○ PostgreSQL leader election with patroni and etcd ○ Using etcd and confd for dynamic pull based cluster reconfiguration
  • 4. The consensus problem What is consensus? Getting a group of processes to agree on a value Properties: ● Termination: eventually, every non-faulty process decides some value ● Agreement: all processes select the same value ● Integrity: a process decides only once ● Validity: The value must have proposed by some process
  • 5. The consensus problem Reaching an agreement (consensus) is an important step in many distributed computing problems: ● synchronizing replicated state machines and making sure all replicas have the same (consistent) view of system state. ● electing a leader ● mutual exclusion (distributed locks) ● managing group membership/failure detection ● deciding to commit or abort for distributed transactions
  • 6. But... There's always a but. Is it possible to achieve consensus in distributed systems? It depends..
  • 7. Distributed System Models Synchronous model ● messages are received within a known bounded time ● drift of each process's local clock has a known bound ● each step in a process has a known bound ● e.g. a supercomputer Asynchronous model ● no bounds on message transmission delays ● arbitrary drift rate of local clocks ● no bounds on process execution ● e.g. the Internet
  • 8. Back to consensus Is it possible to achieve consensus in distributed systems? Yes and no: yes in the synchronous model, no in the asynchronous model. Why?
  • 9. FLP Proof Impossibility of distributed consensus with one faulty process (1985) Fischer, Lynch and Paterson https://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf Result: “We show that every protocol for this problem has the possibility of nontermination, even with only one faulty process. By way of contrast, solutions are known for the synchronous case, the "Byzantine Generals" problem.”
  • 10. Paxos Leslie Lamport discovered the algorithm in the late 1980s Used by Google Chubby Guarantees safety, but not liveness ● Safety: agreement property, guaranteed ● Liveness: termination property, not guaranteed Eventual liveness Hard to understand and implement!
  • 11. Raft Reliable, Replicated, Redundant, And Fault-Tolerant (was supposed to be named Redundo) https://groups.google.com/forum/#!topic/raft-dev/95rZqptGpmU Developed by Diego Ongaro and John Ousterhout from Stanford University Designed to be easy to understand Published in 2014: https://raft.github.io/raft.pdf More Info and related research can be found here: https://raft.github.io/
  • 12. Demo The Secret Lives of Data (An interactive demo that explains how raft works) http://thesecretlivesofdata.com/raft/ RaftScope: a raft cluster running in your browser that you can interact with to see Raft in action https://raft.github.io/raftscope/ etcd playground http://play.etcd.io/play
  • 13. etcd etcd is a distributed key-value store that provides a reliable way to store data across a cluster of machines. Kubernetes uses etcd as the backend for service discovery and for storing cluster state and configuration. Cloud Foundry uses etcd to store cluster state and configuration and as a global lock service.
  • 14. etcd etcd is written in Go and uses the Raft consensus algorithm to manage a highly available replicated log. https://github.com/etcd-io/etcd Production-grade. Named after the Unix "/etc" directory plus "d" for distributed systems. Originally developed for CoreOS to get automatic, zero-downtime Linux kernel updates using Locksmith, which implements a distributed semaphore over etcd to ensure that only a subset of a cluster is rebooting at any given time.
  • 15. etcd use cases Best suited to storing metadata and configuration, for example to coordinate processes. Can handle a few GB of data with consistent ordering. etcd replicates all data within a single consistent replication group, with no sharding. etcd provides distributed coordination primitives such as event watches, leases, elections, and distributed shared locks out of the box.
  • 16. etcd as a kv store gRPC remote procedure call ● KV - Creates, updates, fetches, and deletes key-value pairs. ● Watch - Monitors changes to keys. ● Lease - Primitives for consuming client keep-alive messages.
  • 18. etcd consistency guarantees ● Atomicity ○ All API requests are atomic; an operation either completes entirely or not at all. ○ For watch requests, all events generated by one operation will be in one watch response. ● Consistency ○ sequential consistency: a client reads the same events in the same order ○ etcd does not ensure linearizability for watch operations ○ etcd ensures linearizability for all other operations by default ○ For lower latency and higher throughput, use serializable reads, which may return stale data with respect to quorum ● Isolation ○ etcd ensures serializable isolation ● Durability ○ Any completed operations are durable
  • 19. etcd failure modes Minor followers failure ● with less than half of the members failing, etcd continues running ● clients should automatically reconnect to other operating members Leader failure ● etcd cluster automatically elects a new leader ● takes about an election timeout to elect a new leader ● requests sent during the election are queued ● writes already sent to the old leader but not yet committed may be lost
  • 20. etcd failure modes Majority failure ● etcd cluster fails and cannot accept more writes ● the cluster recovers from a majority failure once a majority of members becomes available again Network partition ● behaves like either a minor followers failure or a leader failure, depending on which side of the partition the leader is on
  • 23. Distributed cluster configuration Use etcd as a central configuration store ● all consumers have immediate access to configuration data ● etcd makes it easy for applications to watch for changes ● reduces the time between a configuration change and propagation of that change throughout the infrastructure ● failed nodes get latest config immediately after recovery (Pushing config files to servers lacks all of the above)
  • 24. Service Discovery Services register/heartbeat/deregister themselves Clients (or load balancers) watch etcd for endpoints and use it to connect e.g. /services/<service_name>/<instance_id> = <instance_address>
  • 25. How kubernetes uses etcd ● Kubernetes stores data, state, and metadata in etcd ● All access to etcd goes through the apiserver ● Kubernetes stores the ideal state and the actual state. ● Kubernetes control loop (kube-controller-manager) watches these states of the cluster through the apiserver and if these two states have diverged, it’ll make changes to reconcile them. ● Clusters using etcd3 preserve changes in the last 5 minutes by default. GET /api/v1/namespaces/test/pods?watch=1&resourceVersion=10245
  • 26. How kubernetes uses etcd Create Pod Flow. Source: heptio.com
  • 27. Patroni Patroni: A Template for PostgreSQL HA with ZooKeeper, etcd or Consul https://github.com/zalando/patroni https://github.com/zalando/patroni/blob/master/patroni/dcs/etcd.py Patroni originated as a fork of Governor, the project from Compose https://github.com/helm/charts/tree/master/incubator/patroni HA PostgreSQL Clusters with Docker https://github.com/zalando/spilo
  • 28. Confd Manage local application configuration files using templates and data from etcd http://www.confd.io/ ● Syncs configuration files by polling etcd and processing template resources. ● Reloads applications to pick up new config file changes.
  • 29. References and further reading A Brief Tour of FLP Impossibility https://www.the-paper-trail.org/post/2008-08-13-a-brief-tour-of-flp-impossibility/ Distributed Systems, Failures, and Consensus https://www2.cs.duke.edu/courses/fall07/cps212/consensus.pdf Consensus https://www.cs.rutgers.edu/~pxk/417/notes/content/consensus.html
  • 30. References and further reading etcd github https://github.com/etcd-io/etcd etcd Concurrency primitives https://github.com/etcd-io/etcd/tree/master/clientv3/concurrency Consistency Models https://jepsen.io/consistency https://aphyr.com/posts/313-strong-consistency-models
  • 31. References and further reading Cloud Computing Concepts, Part 1 & 2 https://www.coursera.org/learn/cloud-computing/ https://www.coursera.org/learn/cloud-computing-2 Distributed Consensus https://homepage.cs.uiowa.edu/~ghosh/16612.week11.pdf How to Build a Highly Available System Using Consensus https://www.microsoft.com/en-us/research/publication/how-to-build-a-highly-available-system-using-consensus/
  • 32. References and further reading In Search of an Understandable Consensus Algorithm https://www.usenix.org/conference/atc14/technical-sessions/presentation/ongaro Tech Talk - Raft, In Search of an Understandable Consensus Algorithm by Diego Ongaro https://www.youtube.com/watch?v=LAqyTyNUYSY&feature=youtu.be The Raft Consensus Algorithm https://raft.github.io/
  • 33. References and further reading State machine replication https://en.wikipedia.org/wiki/State_machine_replication Kube-controller-manager https://kubernetes.io/docs/concepts/overview/components/ https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/ go-config: a dynamic config framework https://github.com/micro/go-config
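
The failure modes on slides 19 and 20 come down to majority (quorum) arithmetic: a Raft-based cluster keeps accepting writes only while a majority of its members is available. A minimal sketch of that arithmetic follows. The function names are hypothetical helpers chosen for illustration; they are not part of etcd or of any etcd client library.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority of the cluster."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


def can_accept_writes(members: int, failed: int) -> bool:
    """True if the surviving members still form a majority."""
    if not 0 <= failed <= members:
        raise ValueError("failed must be between 0 and members")
    return members - failed >= quorum(members)


if __name__ == "__main__":
    # Print quorum size and tolerated failures for common cluster sizes.
    for size in (1, 3, 4, 5, 7):
        print(size, quorum(size), fault_tolerance(size))
```

Under these assumptions a 3-member cluster tolerates one failure and a 5-member cluster tolerates two. Growing a 3-member cluster to 4 members raises the quorum to 3 without raising the number of tolerated failures, which is why odd cluster sizes are the usual choice.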