SlideShare ist ein Scribd-Unternehmen logo
1 von 51
Dynamic Reconfiguration of ZooKeeper

             Alex Shraer
    (presented by Benjamin Reed)
Why ZooKeeper?




•
    Lots of servers
•
    Lots of processes
•
    High volumes of data
•
    Highly complex software systems
•
    … mere mortal developers
What ZooKeeper gives you
●   Simple programming model
●   Coordination of distributed processes
●   Fast notification of changes
●   Elasticity
●   Easy setup
●   High availability
ZooKeeper Configuration

• Membership
• Role of each server
  – E.g., follower or observer
• Quorum System spec
  – Zookeeper: majority or hierarchical
• Network addresses & ports
• Timeouts, directory paths, etc.
Zookeeper - distributed and replicated
                                 ZooKeeper Service
                                    Leader

             Server     Server        Server            Server        Server




    Client   Client   Client     Client        Client        Client     Client   Client


• All servers store a copy of the data (in memory)
• A leader is elected at startup
• Reads served by followers, all updates go through leader
• Update acked when a quorum of servers have persisted the
  change (on disk)
• Zookeeper uses ZAB - its own atomic broadcast protocol
Dynamic Membership Changes
• Necessary in every long-lived system!
• Examples:
   – Cloud computing: adapt to changing load, don’t pre-allocate!
   – Failures: replacing failed nodes with healthy ones
   – Upgrades: replacing out-of-date nodes with up-to-date ones
   – Free up storage space: decreasing the number of replicas
   – Moving nodes: within the network or the data center
   – Increase resilience by changing the set of servers
  Example: asynch. replication works as long as > #servers/2 operate:
Dynamic Membership Changes
• Necessary in every long-lived system!
• Examples:
   – Cloud computing: adapt to changing load, don’t pre-allocate!
   – Failures: replacing failed nodes with healthy ones
   – Upgrades: replacing out-of-date nodes with up-to-date ones
   – Free up storage space: decreasing the number of replicas
   – Moving nodes: within the network or the data center
   – Increase resilience by changing the set of servers
  Example: asynch. replication works as long as > #servers/2 operate:
Dynamic Membership Changes
• Necessary in every long-lived system!
• Examples:
   – Cloud computing: adapt to changing load, don’t pre-allocate!
   – Failures: replacing failed nodes with healthy ones
   – Upgrades: replacing out-of-date nodes with up-to-date ones
   – Free up storage space: decreasing the number of replicas
   – Moving nodes: within the network or the data center
   – Increase resilience by changing the set of servers
  Example: asynch. replication works as long as > #servers/2 operate:
Dynamic Membership Changes
• Necessary in every long-lived system!
• Examples:
   – Cloud computing: adapt to changing load, don’t pre-allocate!
   – Failures: replacing failed nodes with healthy ones
   – Upgrades: replacing out-of-date nodes with up-to-date ones
   – Free up storage space: decreasing the number of replicas
   – Moving nodes: within the network or the data center
   – Increase resilience by changing the set of servers
  Example: asynch. replication works as long as > #servers/2 operate:
Dynamic Membership Changes
• Necessary in every long-lived system!
• Examples:
   – Cloud computing: adapt to changing load, don’t pre-allocate!
   – Failures: replacing failed nodes with healthy ones
   – Upgrades: replacing out-of-date nodes with up-to-date ones
   – Free up storage space: decreasing the number of replicas
   – Moving nodes: within the network or the data center
   – Increase resilience by changing the set of servers
  Example: asynch. replication works as long as > #servers/2 operate:
Dynamic Membership Changes
• Necessary in every long-lived system!
• Examples:
   – Cloud computing: adapt to changing load, don’t pre-allocate!
   – Failures: replacing failed nodes with healthy ones
   – Upgrades: replacing out-of-date nodes with up-to-date ones
   – Free up storage space: decreasing the number of replicas
   – Moving nodes: within the network or the data center
   – Increase resilience by changing the set of servers
  Example: asynch. replication works as long as > #servers/2 operate:
Hazards of Manual Reconfiguration
                                     E
       A

                       C


        {A, B, C}

        B              {A, B, C}     D




           {A, B, C}


       • Goal: add servers E and D
Hazards of Manual Reconfiguration
                                           E
        A

                           C


       {A, B, C, D, E}                      {A, B, C, D, E}

         B               {A, B, C, D, E}   D




       {A, B, C, D, E}
                                           {A, B, C, D, E}


        • Goal: add servers E and D
        • Change Configuration
Hazards of Manual Reconfiguration
                                           E
        A

                           C


       {A, B, C, D, E}                      {A, B, C, D, E}

         B               {A, B, C, D, E}   D




       {A, B, C, D, E}
                                           {A, B, C, D, E}


        • Goal: add servers E and D
        • Change Configuration
        • Restart Servers
Hazards of Manual Reconfiguration
                                           E
        A

                           C


       {A, B, C, D, E}                      {A, B, C, D, E}

         B               {A, B, C, D, E}   D




       {A, B, C, D, E}
                                           {A, B, C, D, E}


        • Goal: add servers E and D
        • Change Configuration
        • Restart Servers
Hazards of Manual Reconfiguration
                                           E
        A

                           C


       {A, B, C, D, E}                      {A, B, C, D, E}

         B               {A, B, C, D, E}   D




       {A, B, C, D, E}
                                           {A, B, C, D, E}


        • Goal: add servers E and D
        • Change Configuration
        • Restart Servers
Hazards of Manual Reconfiguration
                                           E
        A

                           C


       {A, B, C, D, E}                      {A, B, C, D, E}

         B               {A, B, C, D, E}   D




       {A, B, C, D, E}
                                           {A, B, C, D, E}


        •    Goal: add servers E and D
        •    Change Configuration
        •    Restart Servers
        •    Lost    and    !
18

          Just use a coordination service!
     • Zookeeper is the coordination service
        – Don’t want to deploy another system to coordinate it!


     • Who will reconfigure that system ?
        – GFS has 3 levels of coordination services


     • More system components -> more management overhead


     • Use Zookeeper to reconfigure itself!
        – Other systems store configuration information in Zookeeper
        – Can we do the same??
        – Only if there are no failures
Recovery in Zookeeper

                  C               E

                           B


setData(/x, 5)

                                  D
                      A
Recovery in Zookeeper

                  C               E

                           B


setData(/x, 5)

                                  D
                      A
Recovery in Zookeeper

                  C               E

                           B


setData(/x, 5)

                                  D
                      A
Recovery in Zookeeper

                  C               E

                           B


setData(/x, 5)

                                  D
                      A
This doesn’t work for reconfigurations!
                                                                        E
                               C


                                                     B
                               {A, B, C, D, E}                          {A, B, C, D, E}


setData(/zookeeper/config, {A, B, F})
                                                      {A, B, C, D, E}   D
      remove C, D, E add F



             F
                                                                        {A, B, C, D, E}
                                        A




                                         {A, B, C, D, E}
This doesn’t work for reconfigurations!
                                                                          E
                               C


                                                        B
                               {A, B, C, D, E}                            {A, B, C, D, E}


setData(/zookeeper/config, {A, B, F})
                                                        {A, B, C, D, E}   D
      remove C, D, E add F



             F
                                                                          {A, B, C, D, E}
                                        A



 {A, B, F}
                                            {A, B, F}
This doesn’t work for reconfigurations!
                                                                              E
                                   C


                                                            B
                                   {A, B, C, D, E}                            {A, B, C, D, E}


    setData(/zookeeper/config, {A, B, F})
                                                            {A, B, C, D, E}   D
          remove C, D, E add F



                  F
                                                                              {A, B, C, D, E}
                                            A



      {A, B, F}
                                                {A, B, F}

•   Must persist the decision to reconfigure in the old
    config before activating the new config!
•   Once such decision is reached, must not allow further
    ops to be committed in old config
Our Solution
•   Correct
•   Fully automatic
•   No external services or additional components
•   Minimal changes to Zookeeper
•   Usually unnoticeable to clients
    – Pause operations only in rare circumstances
    – Clients work with a single configuration
• Rebalances clients across servers in new configuration

• Reconfigures immediately

• Speculative Reconfiguration
    – Reconfiguration (and commands that follow it) speculatively sent out by the
      primary, similarly to all other updates
Principles
●   Commit reconfig in a quorum of the old ensemble
    –   Submit reconfig op just like any other update
●   Make sure new ensemble has latest state before
    becoming active
    –   Get quorum of synced followers from new config
    –   Get acks from both old and new ensembles before committing
        updates proposed between reconfig op and activation
    –   Activate new configuration when reconfig commits
●   Once new ensemble active old ensemble cannot commit
    or propose new updates
●   Gossip activation through leader election and syncing
●   Verify configuration id of leader and follower
Failure free flow
Reconfiguration scenario 1
                                 E
   A

                   C


    {A, B, C}                        {A, B, C}

    B              {A, B, C}     D




       {A, B, C}
                                      {A, B, C}


   • Goal: add servers E and D
Reconfiguration scenario 1
                               E
   A

                   C


    {A, B, C}

    B              {A, B, C}   D




       {A, B, C}


   • Goal: add servers E and D
   •    doesn't commit until quorums of
     both ensembles ack
Reconfiguration scenario 1
                               E
   A

                   C


    {A, B, C}                      {A, B, C}

    B              {A, B, C}   D




       {A, B, C}
                                    {A, B, C}


   • Goal: add servers E and D
   •    doesn't commit until quorums of
     both ensembles ack
Reconfiguration scenario 1
                               E
   A

                   C


    {A, B, C}                      {A, B, C}

    B              {A, B, C}   D




       {A, B, C}
                                    {A, B, C}


   • Goal: add servers E and D
   •    doesn't commit until quorums of
     both ensembles ack
Reconfiguration scenario 1
                                 E
    A

                     C


   {A, B, C, D, E}               {A, B, C, D, E}

     B               {A, B, C}   D




   {A, B, C, D, E}
                                  {A, B, C, D, E}


    • Goal: add servers E and D
    •    doesn't commit until quorums of
      both ensembles ack
Reconfiguration scenario 1
                                 E
    A

                     C


   {A, B, C, D, E}               {A, B, C, D, E}

     B               {A, B, C}   D




   {A, B, C, D, E}
                                  {A, B, C, D, E}


    • Goal: add servers E and D
    •    doesn't commit until quorums of
      both ensembles ack
    • E and D gossip new configuration
      to C
Reconfiguration scenario 1
                                       E
    A

                       C


   {A, B, C, D, E}                     {A, B, C, D, E}

     B               {A, B, C, D, E}   D




   {A, B, C, D, E}
                                        {A, B, C, D, E}


    • Goal: add servers E and D
    •    doesn't commit until quorums of
      both ensembles ack
    • E and D gossip new configuration
      to C
Example - reconfig using CLI
reconfig -add 1=host1.com:1234:1235:observer;1239

         -add 2=host2.com:1236:1237:follower;1231 -remove 5
●
    Change follower 1 to an observer and change its ports
●
    Add follower 2 to the ensemble
●
    Remove follower 5 from the ensemble

reconfig -file myNewConfig.txt -v 234547
●
    Change the current config to the one in myNewConfig.txt
●
    But only if current config version is 234547

getConfig -w -c
●
    set a watch on /zookeeper/config
●
    -c means we only want the new connection string for clients
When it will not work
●   Quorum of new ensemble must be in sync
●   Another reconfig in progress
●   Version condition check fails
How do you know you are done
●   Write something somewhere
The “client side” of reconfiguration
• When system changes, clients need to stay connected
   – The usual solution: directory service (e.g., DNS)
• Re-balancing load during reconfiguration is also important!
• Goal: uniform #clients per server with minimal client migration
   – Migration should be proportional to change in membership




                  X 10   X 10   X 10
The “client side” of reconfiguration
• When system changes, clients need to stay connected
   – The usual solution: directory service (e.g., DNS)
• Re-balancing load during reconfiguration is also important!
• Goal: uniform #clients per server with minimal client migration
   – Migration should be proportional to change in membership




                   X 10   X 10   X 10
Our approach - Probabilistic Load Balancing
• Example 1 :


                X 10   X 10   X 10
Our approach - Probabilistic Load Balancing
• Example 1 :


                X 10   X 10   X 10
Our approach - Probabilistic Load Balancing
• Example 1 :


                       X6       X6       X6        X6     X6

   –   Each client moves to a random new server with probability 0.4
   –   1 – 3/5 = 0.4

   –   Exp. 40% clients will move off of each server
Our approach - Probabilistic Load Balancing
• Example 1 :


                        X6       X6       X6        X6     X6

    –   Each client moves to a random new server with probability 0.4
    –   1 – 3/5 = 0.4

    –   Exp. 40% clients will move off of each server
●
    Example 2 :



                        X6       X6      X6         X6     X6
Our approach - Probabilistic Load Balancing
• Example 1 :


                        X6       X6       X6        X6     X6

    –   Each client moves to a random new server with probability 0.4
    –   1 – 3/5 = 0.4

    –   Exp. 40% clients will move off of each server
●
    Example 2 :



                        X6       X6      X6         X6     X6
Our approach - Probabilistic Load Balancing
• Example 1 :


                         X6         X6       X6           X6      X6
    –   Each client moves to a random new server with probability 0.4
    –   1 – 3/5 = 0.4

    –   Exp. 40% clients will move off of each server
●
    Example 2 :
                                                   4/18        4/18    10/18




                         X6        X6       X6            X6      X6

    –   Connected clients don’t move
    –   Disconnected clients move to old servers with prob 4/18 and new one with prob
        10/18
    –   Exp. 8 clients will move from A, B, C to D, E and 10 to F
Our approach - Probabilistic Load Balancing
• Example 1 :


                         X6         X6       X6           X6      X6
    –   Each client moves to a random new server with probability 0.4
    –   1 – 3/5 = 0.4

    –   Exp. 40% clients will move off of each server
●
    Example 2 :
                                                   4/18        4/18      10/18




                                                        X 10      X 10      X 10

    –   Connected clients don’t move
    –   Disconnected clients move to old servers with prob 4/18 and new one with prob
        10/18
    –   Exp. 8 clients will move from A, B, C to D, E and 10 to F
Current Load Balancing
ProbabilisticCurrent Load Balancing
 When moving from config. S to S’:
E (load (i, S ' )) = load (i, S ) +     ∑ load ( j, S ) ⋅ Pr( j → i ) − load (i, S ) ∑ Pr(i → j )
                                      j∈S ∧ j ≠i                                 j∈S ' ∧ j ≠i

 expected #clients       #clients
 connected to i in S’   connected                      #clients
(10 in last example)     to i in S                                               #clients
                                                    moving to i from         moving from i to
                                                   other servers in S       other servers in S’
 Solving for Pr we get case-specific probabilities.
 Input: each client answers locally
 Question 1: Are there more servers now or less ?
 Question 2: Is my server being removed?
 Output: 1) disconnect or stay connected to my server
          if disconnect 2) Pr(connect to one of the old servers)
                 and Pr(connect to newly added server)
Implementation
• Implemented in Zookeeper (Java & C), integration ongoing
   – 3 new Zookeeper API calls: reconfig, getConfig, updateServerList
   – feature requested since 2008, expected in 3.5.0 release (july 2012)
• Dynamic changes to:
   –   Membership
   –   Quorum System
   –   Server roles
   –   Addresses & ports
• Reconfiguration modes:
   – Incremental (add servers E and D, remove server B)
   – Non-incremental (new config = {A, C, D, E})
   – Blind or conditioned (reconfig only if current config is #5)
• Subscriptions to config changes
   – Client can invoke client-side re-balancing upon change
52

                                      Summary
     • Design and implementation of reconfiguration for Apache Zookeeper
        – being contributed into Zookeeper codebase


     • Much simpler than state of the art, using properties already provided by Zookeeper

     • Many nice features:
        – Doesn’t limit concurrency
        – Reconfigures immediately
        – Preserves primary order
        – Doesn’t stop client ops
        – Zookeeper used by online systems, any delay must be avoided
        – Clients work with a single configuration at a time
        – No external services
        – Includes client-side rebalancing

Weitere ähnliche Inhalte

Was ist angesagt?

今話題のいろいろなコンテナランタイムを比較してみた
今話題のいろいろなコンテナランタイムを比較してみた今話題のいろいろなコンテナランタイムを比較してみた
今話題のいろいろなコンテナランタイムを比較してみたKohei Tokunaga
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamSATOSHI TAGOMORI
 
Kafka’s New Control Plane: The Quorum Controller | Colin McCabe, Confluent
Kafka’s New Control Plane: The Quorum Controller | Colin McCabe, ConfluentKafka’s New Control Plane: The Quorum Controller | Colin McCabe, Confluent
Kafka’s New Control Plane: The Quorum Controller | Colin McCabe, ConfluentHostedbyConfluent
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
 
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...NETWAYS
 
Transaction Management on Cassandra
Transaction Management on CassandraTransaction Management on Cassandra
Transaction Management on CassandraScalar, Inc.
 
MySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & ClusterMySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & ClusterKenny Gryp
 
[OpenInfra Days Korea 2018] (Track 2) Neutron LBaaS 어디까지 왔니? - Octavia 소개
[OpenInfra Days Korea 2018] (Track 2) Neutron LBaaS 어디까지 왔니? - Octavia 소개[OpenInfra Days Korea 2018] (Track 2) Neutron LBaaS 어디까지 왔니? - Octavia 소개
[OpenInfra Days Korea 2018] (Track 2) Neutron LBaaS 어디까지 왔니? - Octavia 소개OpenStack Korea Community
 
知っておくべきCephのIOアクセラレーション技術とその活用方法 - OpenStack最新情報セミナー 2015年9月
知っておくべきCephのIOアクセラレーション技術とその活用方法 - OpenStack最新情報セミナー 2015年9月知っておくべきCephのIOアクセラレーション技術とその活用方法 - OpenStack最新情報セミナー 2015年9月
知っておくべきCephのIOアクセラレーション技術とその活用方法 - OpenStack最新情報セミナー 2015年9月VirtualTech Japan Inc.
 
[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouseVianney FOUCAULT
 
Cilium - BPF & XDP for containers
 Cilium - BPF & XDP for containers Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containersDocker, Inc.
 
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin	Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin Vietnam Open Infrastructure User Group
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache icebergAlluxio, Inc.
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBYugabyteDB
 
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...HostedbyConfluent
 

Was ist angesagt? (20)

今話題のいろいろなコンテナランタイムを比較してみた
今話題のいろいろなコンテナランタイムを比較してみた今話題のいろいろなコンテナランタイムを比較してみた
今話題のいろいろなコンテナランタイムを比較してみた
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: Bigdam
 
macvlan and ipvlan
macvlan and ipvlanmacvlan and ipvlan
macvlan and ipvlan
 
Kafka’s New Control Plane: The Quorum Controller | Colin McCabe, Confluent
Kafka’s New Control Plane: The Quorum Controller | Colin McCabe, ConfluentKafka’s New Control Plane: The Quorum Controller | Colin McCabe, Confluent
Kafka’s New Control Plane: The Quorum Controller | Colin McCabe, Confluent
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube...
 
Transaction Management on Cassandra
Transaction Management on CassandraTransaction Management on Cassandra
Transaction Management on Cassandra
 
MySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & ClusterMySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & Cluster
 
[OpenInfra Days Korea 2018] (Track 2) Neutron LBaaS 어디까지 왔니? - Octavia 소개
[OpenInfra Days Korea 2018] (Track 2) Neutron LBaaS 어디까지 왔니? - Octavia 소개[OpenInfra Days Korea 2018] (Track 2) Neutron LBaaS 어디까지 왔니? - Octavia 소개
[OpenInfra Days Korea 2018] (Track 2) Neutron LBaaS 어디까지 왔니? - Octavia 소개
 
知っておくべきCephのIOアクセラレーション技術とその活用方法 - OpenStack最新情報セミナー 2015年9月
知っておくべきCephのIOアクセラレーション技術とその活用方法 - OpenStack最新情報セミナー 2015年9月知っておくべきCephのIOアクセラレーション技術とその活用方法 - OpenStack最新情報セミナー 2015年9月
知っておくべきCephのIOアクセラレーション技術とその活用方法 - OpenStack最新情報セミナー 2015年9月
 
[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse
 
Cilium - BPF & XDP for containers
 Cilium - BPF & XDP for containers Cilium - BPF & XDP for containers
Cilium - BPF & XDP for containers
 
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin	Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
Kata Container - The Security of VM and The Speed of Container | Yuntong Jin
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache iceberg
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
 
Oracle Database 12c : Multitenant
Oracle Database 12c : MultitenantOracle Database 12c : Multitenant
Oracle Database 12c : Multitenant
 
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...
 

Ähnlich wie Dynamic Reconfiguration of Apache ZooKeeper

Understanding histogramppt.prn
Understanding histogramppt.prnUnderstanding histogramppt.prn
Understanding histogramppt.prnLeyi (Kamus) Zhang
 
NYAI - Scaling Machine Learning Applications by Braxton McKee
NYAI - Scaling Machine Learning Applications by Braxton McKeeNYAI - Scaling Machine Learning Applications by Braxton McKee
NYAI - Scaling Machine Learning Applications by Braxton McKeeRizwan Habib
 
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16MLconf
 
Graph analysis platform comparison, pregel/goldenorb/giraph
Graph analysis platform comparison, pregel/goldenorb/giraphGraph analysis platform comparison, pregel/goldenorb/giraph
Graph analysis platform comparison, pregel/goldenorb/giraphAndrew Yongjoon Kong
 
Presentation v mware roi tco calculator
Presentation   v mware roi tco calculatorPresentation   v mware roi tco calculator
Presentation v mware roi tco calculatorsolarisyourep
 
C++ unit-1-part-11
C++ unit-1-part-11C++ unit-1-part-11
C++ unit-1-part-11Jadavsejal
 
Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
Greg Hogan – To Petascale and Beyond- Apache Flink in the CloudsGreg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
Greg Hogan – To Petascale and Beyond- Apache Flink in the CloudsFlink Forward
 
Lucene revolution 2011
Lucene revolution 2011Lucene revolution 2011
Lucene revolution 2011Takahiko Ito
 
Using GPUs to Handle Big Data with Java
Using GPUs to Handle Big Data with JavaUsing GPUs to Handle Big Data with Java
Using GPUs to Handle Big Data with JavaTim Ellison
 
Genomics Is Not Special: Towards Data Intensive Biology
Genomics Is Not Special: Towards Data Intensive BiologyGenomics Is Not Special: Towards Data Intensive Biology
Genomics Is Not Special: Towards Data Intensive BiologyUri Laserson
 
UKOUG2018 - I Know what you did Last Summer [in my Database].pptx
UKOUG2018 - I Know what you did Last Summer [in my Database].pptxUKOUG2018 - I Know what you did Last Summer [in my Database].pptx
UKOUG2018 - I Know what you did Last Summer [in my Database].pptxMarco Gralike
 
Provenance Annotation and Analysis to Support Process Re-Computation
Provenance Annotation and Analysis to Support Process Re-ComputationProvenance Annotation and Analysis to Support Process Re-Computation
Provenance Annotation and Analysis to Support Process Re-ComputationPaolo Missier
 
Real World Optimization
Real World OptimizationReal World Optimization
Real World OptimizationDavid Golden
 
Extend starfish to Support the Growing Hadoop Ecosystem
Extend starfish to Support the Growing Hadoop EcosystemExtend starfish to Support the Growing Hadoop Ecosystem
Extend starfish to Support the Growing Hadoop EcosystemFei Dong
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016MLconf
 
MapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementMapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementKyong-Ha Lee
 
What's new in Doctrine
What's new in DoctrineWhat's new in Doctrine
What's new in DoctrineJonathan Wage
 
Eventually Consistent Data Structures (from strangeloop12)
Eventually Consistent Data Structures (from strangeloop12)Eventually Consistent Data Structures (from strangeloop12)
Eventually Consistent Data Structures (from strangeloop12)Sean Cribbs
 
Eventually-Consistent Data Structures
Eventually-Consistent Data StructuresEventually-Consistent Data Structures
Eventually-Consistent Data StructuresSean Cribbs
 
Towards a General Approach for Symbolic Model-Checker Prototyping
Towards a General Approach for Symbolic Model-Checker PrototypingTowards a General Approach for Symbolic Model-Checker Prototyping
Towards a General Approach for Symbolic Model-Checker PrototypingEdmundo López Bóbeda
 

Ähnlich wie Dynamic Reconfiguration of Apache ZooKeeper (20)

Understanding histogramppt.prn
Understanding histogramppt.prnUnderstanding histogramppt.prn
Understanding histogramppt.prn
 
NYAI - Scaling Machine Learning Applications by Braxton McKee
NYAI - Scaling Machine Learning Applications by Braxton McKeeNYAI - Scaling Machine Learning Applications by Braxton McKee
NYAI - Scaling Machine Learning Applications by Braxton McKee
 
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
 
Graph analysis platform comparison, pregel/goldenorb/giraph
Graph analysis platform comparison, pregel/goldenorb/giraphGraph analysis platform comparison, pregel/goldenorb/giraph
Graph analysis platform comparison, pregel/goldenorb/giraph
 
Presentation v mware roi tco calculator
Presentation   v mware roi tco calculatorPresentation   v mware roi tco calculator
Presentation v mware roi tco calculator
 
C++ unit-1-part-11
C++ unit-1-part-11C++ unit-1-part-11
C++ unit-1-part-11
 
Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
Greg Hogan – To Petascale and Beyond- Apache Flink in the CloudsGreg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds
 
Lucene revolution 2011
Lucene revolution 2011Lucene revolution 2011
Lucene revolution 2011
 
Using GPUs to Handle Big Data with Java
Using GPUs to Handle Big Data with JavaUsing GPUs to Handle Big Data with Java
Using GPUs to Handle Big Data with Java
 
Genomics Is Not Special: Towards Data Intensive Biology
Genomics Is Not Special: Towards Data Intensive BiologyGenomics Is Not Special: Towards Data Intensive Biology
Genomics Is Not Special: Towards Data Intensive Biology
 
UKOUG2018 - I Know what you did Last Summer [in my Database].pptx
UKOUG2018 - I Know what you did Last Summer [in my Database].pptxUKOUG2018 - I Know what you did Last Summer [in my Database].pptx
UKOUG2018 - I Know what you did Last Summer [in my Database].pptx
 
Provenance Annotation and Analysis to Support Process Re-Computation
Provenance Annotation and Analysis to Support Process Re-ComputationProvenance Annotation and Analysis to Support Process Re-Computation
Provenance Annotation and Analysis to Support Process Re-Computation
 
Real World Optimization
Real World OptimizationReal World Optimization
Real World Optimization
 
Extend starfish to Support the Growing Hadoop Ecosystem
Extend starfish to Support the Growing Hadoop EcosystemExtend starfish to Support the Growing Hadoop Ecosystem
Extend starfish to Support the Growing Hadoop Ecosystem
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
 
MapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementMapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvement
 
What's new in Doctrine
What's new in DoctrineWhat's new in Doctrine
What's new in Doctrine
 
Eventually Consistent Data Structures (from strangeloop12)
Eventually Consistent Data Structures (from strangeloop12)Eventually Consistent Data Structures (from strangeloop12)
Eventually Consistent Data Structures (from strangeloop12)
 
Eventually-Consistent Data Structures
Eventually-Consistent Data StructuresEventually-Consistent Data Structures
Eventually-Consistent Data Structures
 
Towards a General Approach for Symbolic Model-Checker Prototyping
Towards a General Approach for Symbolic Model-Checker PrototypingTowards a General Approach for Symbolic Model-Checker Prototyping
Towards a General Approach for Symbolic Model-Checker Prototyping
 

Mehr von DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Mehr von DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Kürzlich hochgeladen

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Kürzlich hochgeladen (20)

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Dynamic Reconfiguration of Apache ZooKeeper

  • 1. Dynamic Reconfiguration of ZooKeeper Alex Shraer (presented by Benjamin Reed)
  • 2. Why ZooKeeper? • Lots of servers • Lots of processes • High volumes of data • Highly complex software systems • … mere mortal developers
  • 3. What ZooKeeper gives you ● Simple programming model ● Coordination of distributed processes ● Fast notification of changes ● Elasticity ● Easy setup ● High availability
  • 4. ZooKeeper Configuration • Membership • Role of each server – E.g., follower or observer • Quorum System spec – Zookeeper: majority or hierarchical • Network addresses & ports • Timeouts, directory paths, etc.
  • 5. Zookeeper - distributed and replicated ZooKeeper Service Leader Server Server Server Server Server Client Client Client Client Client Client Client Client • All servers store a copy of the data (in memory) • A leader is elected at startup • Reads served by followers, all updates go through leader • Update acked when a quorum of servers have persisted the change (on disk) • Zookeeper uses ZAB - its own atomic broadcast protocol
  • 6. Dynamic Membership Changes • Necessary in every long-lived system! • Examples: – Cloud computing: adapt to changing load, don’t pre-allocate! – Failures: replacing failed nodes with healthy ones – Upgrades: replacing out-of-date nodes with up-to-date ones – Free up storage space: decreasing the number of replicas – Moving nodes: within the network or the data center – Increase resilience by changing the set of servers Example: asynch. replication works as long as > #servers/2 operate:
  • 7. Dynamic Membership Changes • Necessary in every long-lived system! • Examples: – Cloud computing: adapt to changing load, don’t pre-allocate! – Failures: replacing failed nodes with healthy ones – Upgrades: replacing out-of-date nodes with up-to-date ones – Free up storage space: decreasing the number of replicas – Moving nodes: within the network or the data center – Increase resilience by changing the set of servers Example: asynch. replication works as long as > #servers/2 operate:
  • 8. Dynamic Membership Changes • Necessary in every long-lived system! • Examples: – Cloud computing: adapt to changing load, don’t pre-allocate! – Failures: replacing failed nodes with healthy ones – Upgrades: replacing out-of-date nodes with up-to-date ones – Free up storage space: decreasing the number of replicas – Moving nodes: within the network or the data center – Increase resilience by changing the set of servers Example: asynch. replication works as long as > #servers/2 operate:
  • 9. Dynamic Membership Changes • Necessary in every long-lived system! • Examples: – Cloud computing: adapt to changing load, don’t pre-allocate! – Failures: replacing failed nodes with healthy ones – Upgrades: replacing out-of-date nodes with up-to-date ones – Free up storage space: decreasing the number of replicas – Moving nodes: within the network or the data center – Increase resilience by changing the set of servers Example: asynch. replication works as long as > #servers/2 operate:
  • 10. Dynamic Membership Changes • Necessary in every long-lived system! • Examples: – Cloud computing: adapt to changing load, don’t pre-allocate! – Failures: replacing failed nodes with healthy ones – Upgrades: replacing out-of-date nodes with up-to-date ones – Free up storage space: decreasing the number of replicas – Moving nodes: within the network or the data center – Increase resilience by changing the set of servers Example: asynch. replication works as long as > #servers/2 operate:
  • 11. Dynamic Membership Changes • Necessary in every long-lived system! • Examples: – Cloud computing: adapt to changing load, don’t pre-allocate! – Failures: replacing failed nodes with healthy ones – Upgrades: replacing out-of-date nodes with up-to-date ones – Free up storage space: decreasing the number of replicas – Moving nodes: within the network or the data center – Increase resilience by changing the set of servers Example: asynch. replication works as long as > #servers/2 operate:
  • 12. Hazards of Manual Reconfiguration E A C {A, B, C} B {A, B, C} D {A, B, C} • Goal: add servers E and D
  • 13. Hazards of Manual Reconfiguration E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C, D, E} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • Change Configuration
  • 14. Hazards of Manual Reconfiguration E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C, D, E} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • Change Configuration • Restart Servers
  • 15. Hazards of Manual Reconfiguration E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C, D, E} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • Change Configuration • Restart Servers
  • 16. Hazards of Manual Reconfiguration E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C, D, E} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • Change Configuration • Restart Servers
  • 17. Hazards of Manual Reconfiguration E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C, D, E} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • Change Configuration • Restart Servers • Lost and !
  • 18. 18 Just use a coordination service! • Zookeeper is the coordination service – Don’t want to deploy another system to coordinate it! • Who will reconfigure that system ? – GFS has 3 levels of coordination services • More system components -> more management overhead • Use Zookeeper to reconfigure itself! – Other systems store configuration information in Zookeeper – Can we do the same?? – Only if there are no failures
  • 19. Recovery in Zookeeper C E B setData(/x, 5) D A
  • 20. Recovery in Zookeeper C E B setData(/x, 5) D A
  • 21. Recovery in Zookeeper C E B setData(/x, 5) D A
  • 22. Recovery in Zookeeper C E B setData(/x, 5) D A
  • 23. This doesn’t work for reconfigurations! E C B {A, B, C, D, E} {A, B, C, D, E} setData(/zookeeper/config, {A, B, F}) {A, B, C, D, E} D remove C, D, E add F F {A, B, C, D, E} A {A, B, C, D, E}
  • 24. This doesn’t work for reconfigurations! E C B {A, B, C, D, E} {A, B, C, D, E} setData(/zookeeper/config, {A, B, F}) {A, B, C, D, E} D remove C, D, E add F F {A, B, C, D, E} A {A, B, F} {A, B, F}
  • 25. This doesn’t work for reconfigurations! E C B {A, B, C, D, E} {A, B, C, D, E} setData(/zookeeper/config, {A, B, F}) {A, B, C, D, E} D remove C, D, E add F F {A, B, C, D, E} A {A, B, F} {A, B, F} • Must persist the decision to reconfigure in the old config before activating the new config! • Once such decision is reached, must not allow further ops to be committed in old config
  • 26. Our Solution • Correct • Fully automatic • No external services or additional components • Minimal changes to Zookeeper • Usually unnoticeable to clients – Pause operations only in rare circumstances – Clients work with a single configuration • Rebalances clients across servers in new configuration • Reconfigures immediately • Speculative Reconfiguration – Reconfiguration (and commands that follow it) speculatively sent out by the primary, similarly to all other updates
  • 27. Principles ● Commit reconfig in a quorum of the old ensemble – Submit reconfig op just like any other update ● Make sure new ensemble has latest state before becoming active – Get quorum of synced followers from new config – Get acks from both old and new ensembles before committing updates proposed between reconfig op and activation – Activate new configuration when reconfig commits ● Once new ensemble active old ensemble cannot commit or propose new updates ● Gossip activation through leader election and syncing ● Verify configuration id of leader and follower
  • 29. Reconfiguration scenario 1 E A C {A, B, C} {A, B, C} B {A, B, C} D {A, B, C} {A, B, C} • Goal: add servers E and D
  • 30. Reconfiguration scenario 1 E A C {A, B, C} B {A, B, C} D {A, B, C} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack
  • 31. Reconfiguration scenario 1 E A C {A, B, C} {A, B, C} B {A, B, C} D {A, B, C} {A, B, C} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack
  • 32. Reconfiguration scenario 1 E A C {A, B, C} {A, B, C} B {A, B, C} D {A, B, C} {A, B, C} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack
  • 33. Reconfiguration scenario 1 E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack
  • 34. Reconfiguration scenario 1 E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack • E and D gossip new configuration to C
  • 35. Reconfiguration scenario 1 E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C, D, E} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack • E and D gossip new configuration to C
  • 36. Example - reconfig using CLI reconfig -add 1=host1.com:1234:1235:observer;1239 -add 2=host2.com:1236:1237:follower;1231 -remove 5 ● Change follower 1 to an observer and change its ports ● Add follower 2 to the ensemble ● Remove follower 5 from the ensemble reconfig -file myNewConfig.txt -v 234547 ● Change the current config to the one in myNewConfig.txt ● But only if current config version is 234547 getConfig -w -c ● set a watch on /zookeeper/config ● -c means we only want the new connection string for clients
  • 37. When it will not work ● Quorum of new ensemble must be in sync ● Another reconfig in progress ● Version condition check fails
  • 38. How do you know you are done ● Write something somewhere
  • 39. The “client side” of reconfiguration • When system changes, clients need to stay connected – The usual solution: directory service (e.g., DNS) • Re-balancing load during reconfiguration is also important! • Goal: uniform #clients per server with minimal client migration – Migration should be proportional to change in membership X 10 X 10 X 10
  • 40. The “client side” of reconfiguration • When system changes, clients need to stay connected – The usual solution: directory service (e.g., DNS) • Re-balancing load during reconfiguration is also important! • Goal: uniform #clients per server with minimal client migration – Migration should be proportional to change in membership X 10 X 10 X 10
  • 41. Our approach - Probabilistic Load Balancing • Example 1 : X 10 X 10 X 10
  • 42. Our approach - Probabilistic Load Balancing • Example 1 : X 10 X 10 X 10
  • 43. Our approach - Probabilistic Load Balancing • Example 1 : X6 X6 X6 X6 X6 – Each client moves to a random new server with probability 0.4 – 1 – 3/5 = 0.4 – Exp. 40% clients will move off of each server
  • 44. Our approach - Probabilistic Load Balancing • Example 1 : X6 X6 X6 X6 X6 – Each client moves to a random new server with probability 0.4 – 1 – 3/5 = 0.4 – Exp. 40% clients will move off of each server ● Example 2 : X6 X6 X6 X6 X6
  • 45. Our approach - Probabilistic Load Balancing • Example 1 : X6 X6 X6 X6 X6 – Each client moves to a random new server with probability 0.4 – 1 – 3/5 = 0.4 – Exp. 40% clients will move off of each server ● Example 2 : X6 X6 X6 X6 X6
  • 46. Our approach - Probabilistic Load Balancing • Example 1 : X6 X6 X6 X6 X6 – Each client moves to a random new server with probability 0.4 – 1 – 3/5 = 0.4 – Exp. 40% clients will move off of each server ● Example 2 : 4/18 4/18 10/18 X6 X6 X6 X6 X6 – Connected clients don’t move – Disconnected clients move to old servers with prob 4/18 and new one with prob 10/18 – Exp. 8 clients will move from A, B, C to D, E and 10 to F
  • 47. Our approach - Probabilistic Load Balancing • Example 1 : X6 X6 X6 X6 X6 – Each client moves to a random new server with probability 0.4 – 1 – 3/5 = 0.4 – Exp. 40% clients will move off of each server ● Example 2 : 4/18 4/18 10/18 X 10 X 10 X 10 – Connected clients don’t move – Disconnected clients move to old servers with prob 4/18 and new one with prob 10/18 – Exp. 8 clients will move from A, B, C to D, E and 10 to F
  • 49. ProbabilisticCurrent Load Balancing When moving from config. S to S’: E (load (i, S ' )) = load (i, S ) + ∑ load ( j, S ) ⋅ Pr( j → i ) − load (i, S ) ∑ Pr(i → j ) j∈S ∧ j ≠i j∈S ' ∧ j ≠i expected #clients #clients connected to i in S’ connected #clients (10 in last example) to i in S #clients moving to i from moving from i to other servers in S other servers in S’ Solving for Pr we get case-specific probabilities. Input: each client answers locally Question 1: Are there more servers now or less ? Question 2: Is my server being removed? Output: 1) disconnect or stay connected to my server if disconnect 2) Pr(connect to one of the old servers) and Pr(connect to newly added server)
  • 50. Implementation • Implemented in Zookeeper (Java & C), integration ongoing – 3 new Zookeeper API calls: reconfig, getConfig, updateServerList – feature requested since 2008, expected in 3.5.0 release (july 2012) • Dynamic changes to: – Membership – Quorum System – Server roles – Addresses & ports • Reconfiguration modes: – Incremental (add servers E and D, remove server B) – Non-incremental (new config = {A, C, D, E}) – Blind or conditioned (reconfig only if current config is #5) • Subscriptions to config changes – Client can invoke client-side re-balancing upon change
  • 51. 52 Summary • Design and implementation of reconfiguration for Apache Zookeeper – being contributed into Zookeeper codebase • Much simpler than state of the art, using properties already provided by Zookeeper • Many nice features: – Doesn’t limit concurrency – Reconfigures immediately – Preserves primary order – Doesn’t stop client ops – Zookeeper used by online systems, any delay must be avoided – Clients work with a single configuration at a time – No external services – Includes client-side rebalancing