SlideShare ist ein Scribd-Unternehmen logo
1 von 76
Downloaden Sie, um offline zu lesen
ZooKeeper
  Scott Leberknight
Your mission..
   (should you choose to accept it)

           Build a distributed lock service


           Only one process may own the lock


           Must preserve ordering of requests


           Ensure proper lock release



                  . .this message will self destruct in 5 seconds
Mission
Training
“Distributed Coordination
         Service”
Distributed, hierarchical ïŹlesystem

High availability, fault tolerant

Performant (i.e. it’s fast)

Facilitates loose coupling
Fallacies of Distributed
       Computing..
  # 1 The network is reliable.




       http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing
Partial Failure

Did my message get through?


Did the operation complete?


      Or, did it fail???
Follow the Leader
                           One elected leader
      Many followers




Followers may lag leader
                             Eventual consistency
What problems can it solve?
  Group membership


  Distributed data structures
  (locks, queues, barriers, etc.)




  Reliable conïŹguration service


  Distributed workïŹ‚ow
Training exercise:
 Group Membership
Get connected..
public ZooKeeper connect(String hosts, int timeout)
      throws IOException, InterruptedException {

    final CountDownLatch signal = new CountDownLatch(1);
    ZooKeeper zk = new ZooKeeper(hosts, timeout, new Watcher() {
      @Override
      public void process(WatchedEvent event) {
        if (event.getState() ==
            Watcher.Event.KeeperState.SyncConnected) {
          signal.countDown();
        }
      }
    });
    signal.await();
    return zk;
}
                                  must wait for connected event!
ZooKeeper Sessions           Tick time

                       Session timeout

                     Automatic failover
znode

Persistent or ephemeral

Optional sequential numbering

Can hold data (like a ïŹle)

Persistent ones can have children (like a directory)
public void create(String groupName)
    throws KeeperException, InterruptedException {

    String path = "/" + groupName;
    String createdPath = zk.create(path,
      null /*data*/,
      ZooDefs.Ids.OPEN_ACL_UNSAFE,
      CreateMode.PERSISTENT);
    System.out.println("Created " + createdPath);
}
public void join(String groupName, String memberName)
    throws KeeperException, InterruptedException {

    String path = "/" + groupName + "/" + memberName;
    String createdPath = zk.create(path,
      null /*data*/,
      ZooDefs.Ids.OPEN_ACL_UNSAFE,
      CreateMode.EPHEMERAL);
    System.out.println("Created " + createdPath);
}
public void list(String groupName)
    throws KeeperException, InterruptedException {

    String path = "/" + groupName;
    try {
      List<String> children = zk.getChildren(path, false);
      for (String child : children) {
        System.out.println(child);
      }
    } catch (KeeperException.NoNodeException e) {
      System.out.printf("Group %s does not existn", groupName);
    }
}
public void delete(String groupName)
    throws KeeperException, InterruptedException {

    String path = "/" + groupName;
    try {
      List<String> children = zk.getChildren(path, false);
      for (String child : children) {
        zk.delete(path + "/" + child, -1);
      }
      zk.delete(path, -1); // parent
    } catch (KeeperException.NoNodeException e) {
      System.out.printf("%s does not existn", groupName);
    }
}


            -1   deletes unconditionally , or specify version
initial group members




node-4 died
Architecture
region




      Data Model
customer        account




address        transaction
Hierarchical ïŹlesystem


Comprised of znodes
                                     znode data
                                       (< 1 MB)
Watchers


Atomic znode access (reads/writes)


Security
(via authentication, ACLs)
znode - types


Persistent          Persistent Sequential


Ephemeral           Ephemeral Sequential


        die when session expires
znode - sequential



  {
         sequence numbers
znode - operations
 Operation     Type
  create       write
  delete       write
  exists       read
getChildren    read
  getData      read
  setData      write
  getACL       read
  setACL       write
   sync        read
APIs
 synchronous

 asynchronous
Synchronous
  public void list(final String groupName)
      throws KeeperException, InterruptedException {

      String path = "/" + groupName;
      try {
        List<String> children = zk.getChildren(path, false);
        // process children...
      }
      catch (KeeperException.NoNodeException e) {
        // handle non-existent path...
      }
  }
aSync
 public void list(String groupName)
     throws KeeperException, InterruptedException {

     String path = "/" + groupName;
     zk.getChildren(path, false,
       new AsyncCallback.ChildrenCallback() {
         @Override
         public void processResult(int rc, String path,
             Object ctx, List<String> children) {
           // process results when get called back later...
         }
       }, null /* optional context object */);
 }
znode - Watchers

Set (one-time) watches on read operations



    Write operations trigger watches
          on affected znodes



 Re-register watches on events (optional)
public interface Watcher {

    void process(WatchedEvent event);

    // other details elided...
}
public class ChildZNodeWatcher implements Watcher {
    private Semaphore semaphore = new Semaphore(1);

    public void watchChildren()
          throws InterruptedException, KeeperException {
        semaphore.acquire();
        while (true) {
            List<String> children = zk.getChildren(lockPath, this);
            display(children);
            semaphore.acquire();
        }
    }
    @Override public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeChildrenChanged) {
          semaphore.release();
        }
    }
    // other details elided...
}
Gotcha!
          When using watchers...


          ...updates can be missed!
                (not seen by client)
Data Consistency
“A foolish consistency is the hobgoblin of little
minds, adored by little statesmen and philosophers
and divines. With consistency a great soul has simply
                                     nothing to do.”




                             - Ralph Waldo Emerson
                                      (Self-Reliance)
Data Consistency in ZK
Sequential updates                Durability (of writes)


Atomicity (all or nothin’)        Bounded lag time
                                    (eventual consistency)



Consistent client view
 (across all ZooKeeper servers)
Ensemble
broadcast                            broadcast




                                write
           Follower                                 Leader                           Follower




read                  read                  write            read                read                    read



  client              client               client            client             client          client
Leader election
                               “majority rules”
Atomic broadcast


Clients have session on one server


Writes routed through leader


Reads from server memory
Sessions
Client-requested timeout period



Automatic keep-alive (heartbeats)



 Automatic/transparent failover
Writes
All go through leader


       Broadcast to followers


                Global ordering
                                        every update
                                          has unique
                        zxid (ZooKeeper transaction id)
Reads
“Follow-the-leader”
                              Eventual consistency

         Can lag leader


                 In-memory (fast!)
Training Complete

      PA SSED
Fin al Mis sion:

Distributed Lock
Objectives

Mutual exclusion between processes



       Decouple lock users
Building Blocks
   parent lock znode

   ephemeral, sequential child znodes



                   lock-node


  child-1              child-2    ...   child-N
sample-lock
sample-lock


process-1


Lock
sample-lock


process-1   process-2


Lock
sample-lock


process-1   process-2     process-3


Lock
sample-lock


process-2     process-3


 Lock
sample-lock


              process-3


              Lock
sample-lock
pseudocode     (remember that stuff?)


  1. create child ephemeral, sequential znode

  2. list children of lock node and set a watch

  3. if znode from step 1 has lowest number,
     then lock is acquired. done.


  4. else wait for watch event from step 2,
     and then go back to step 2 on receipt
This is OK, but the re are problems. .




                          (perhaps a main B bus undervolt?)
Problem #1 - Connection Loss

     If partial failure on znode creation...



...then how do we know if znode was created?
Solution #1 (Connection Loss)

    Embed session id in child znode names


            lock-<sessionId>-


Failed-over client checks for child w/ sessionId
Problem #2 - The Herd Effect
Many clients watching lock node.



NotiïŹcation sent to all watchers on lock release...



     ...possible large spike in network trafïŹc
Solution #2 (The Herd Effect)
       Set watch only on child znode
       with preceding sequence number


 e.g. client holding lock-9 watches only lock-8


     Note, lock-9 could watch lock-7
               (e.g. lock-8 client died)
Implementation
Do it yourself?



...or be lazy and use existing code?
    org.apache.zookeeper.recipes.lock.WriteLock
    org.apache.zookeeper.recipes.lock.LockListener
Implementation, part deux

 WriteLock calls back when lock acquired (async)

 Maybe you want a synchronous client model...

 Let’s do some decoration...
public class BlockingWriteLock {
  private CountDownLatch signal = new CountDownLatch(1);
    // other instance variable definitions elided...


    public BlockingWriteLock(String name, ZooKeeper zookeeper,
        String path, List<ACL> acls) {
        this.name = name;
        this.path = path;
        this.writeLock = new WriteLock(zookeeper, path,
                                       acls, new SyncLockListener());
    }

    public void lock() throws InterruptedException, KeeperException {
      writeLock.lock();
        signal.await();
    }

    public void unlock() { writeLock.unlock(); }

    class SyncLockListener implements LockListener { /* next slide */ }
}
                                                                          55
class SyncLockListener implements LockListener {

    @Override public void lockAcquired() {
      signal.countDown();
    }

    @Override public void lockReleased() {
    }

}




                                                   56
BlockingWriteLock lock =
  new BlockingWriteLock(myName, zooKeeper, path,
                         ZooDefs.Ids.OPEN_ACL_UNSAFE);
try {
  lock.lock();
  // do something while we have the lock
}
catch (Exception ex) {
  // handle appropriately...
}
finally {
  lock.unlock();
}
                            Easy to forget!

                                                         57
(we can do a little better, right?)
public class DistributedOperationExecutor {
  private final ZooKeeper _zk;
  public DistributedOperationExecutor(ZooKeeper zk) { _zk = zk; }

    public Object withLock(String name, String lockPath,
                            List<ACL> acls,
                            DistributedOperation op)
        throws InterruptedException, KeeperException {
      BlockingWriteLock writeLock =
        new BlockingWriteLock(name, _zk, lockPath, acl);
      try {
        writeLock.lock();
        return op.execute();
      } finally {
        writeLock.unlock();
      }
    }
}
executor.withLock(myName, path, ZooDefs.Ids.OPEN_ACL_UNSAFE,
   new DistributedOperation() {
     @Override public Object execute() {
       // do something while we have the lock
       return whateverTheResultIs;
     }
   }
);
Run it!
Mission:

      OMP LIS HED
ACC
Review..
let’s crowdsource
   (for free beer, of course)
Like a ïŹlesystem, except distributed & replicated


Build distributed coordination, data structures, etc.


High-availability, reliability


Automatic session failover, keep-alive


Writes via leader, in-memory reads (fast)
Refs
zookeeper.apache.org/




                        shop.oreilly.com/product/0636920021773.do
                        (3rd edition pub date is May 29, 2012)
photo attributions
  Follow the Leader - http://www.ïŹ‚ickr.com/photos/davidspinks/4211977680/


  Antique Clock - http://www.ïŹ‚ickr.com/photos/cncphotos/3828163139/


  Skull & Crossbones - http://www.ïŹ‚ickr.com/photos/halfanacre/2841434212/


  Ralph Waldo Emerson - http://www.americanpoems.com/poets/emerson


  1921 Jazzing Orchestra - http://en.wikipedia.org/wiki/File:Jazzing_orchestra_1921.png


   Apollo 13 Patch - http://science.ksc.nasa.gov/history/apollo/apollo-13/apollo-13.html


  Herd of Sheep - http://www.ïŹ‚ickr.com/photos/freefoto/639294974/


  Running on Beach - http://www.ïŹ‚ickr.com/photos/kcdale99/3148108922/


  Crowd - http://www.ïŹ‚ickr.com/photos/laubarnes/5449810523/
                                                                             (others from iStockPhoto...)
(my info)


scott.leberknight@nearinfinity.com
www.nearinfinity.com/blogs/
twitter: sleberknight

Weitere Àhnliche Inhalte

Was ist angesagt?

Introduction to the Disruptor
Introduction to the DisruptorIntroduction to the Disruptor
Introduction to the DisruptorTrisha Gee
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisArnab Mitra
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Storesconfluent
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used forAljoscha Krettek
 
Redis persistence in practice
Redis persistence in practiceRedis persistence in practice
Redis persistence in practiceEugene Fidelin
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseC4Media
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.Taras Matyashovsky
 
Best practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultBest practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultDataWorks Summit
 
CockroachDB
CockroachDBCockroachDB
CockroachDBandrei moga
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
 
Flink Streaming
Flink StreamingFlink Streaming
Flink StreamingGyula FĂłra
 
Apache Kudu: Technical Deep Dive‹‹
Apache Kudu: Technical Deep Dive‹‹Apache Kudu: Technical Deep Dive‹‹
Apache Kudu: Technical Deep Dive‹‹Cloudera, Inc.
 
Kafka: Internals
Kafka: InternalsKafka: Internals
Kafka: InternalsKnoldus Inc.
 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB ClusterMongoDB
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkHortonworks
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!Guido Schmutz
 

Was ist angesagt? (20)

Introduction to the Disruptor
Introduction to the DisruptorIntroduction to the Disruptor
Introduction to the Disruptor
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Stores
 
Bigtable and Dynamo
Bigtable and DynamoBigtable and Dynamo
Bigtable and Dynamo
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
 
Redis persistence in practice
Redis persistence in practiceRedis persistence in practice
Redis persistence in practice
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL Database
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.
 
Best practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultBest practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at Renault
 
CockroachDB
CockroachDBCockroachDB
CockroachDB
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Flink Streaming
Flink StreamingFlink Streaming
Flink Streaming
 
Apache Kudu: Technical Deep Dive‹‹
Apache Kudu: Technical Deep Dive‹‹Apache Kudu: Technical Deep Dive‹‹
Apache Kudu: Technical Deep Dive‹‹
 
Kafka: Internals
Kafka: InternalsKafka: Internals
Kafka: Internals
 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB Cluster
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 

Ähnlich wie Apache ZooKeeper

Distributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperDistributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperAlex Ehrnschwender
 
What`s new in Java 7
What`s new in Java 7What`s new in Java 7
What`s new in Java 7Georgian Micsa
 
Much ado about randomness. What is really a random number?
Much ado about randomness. What is really a random number?Much ado about randomness. What is really a random number?
Much ado about randomness. What is really a random number?Aleksandr Yampolskiy
 
Virtualizing Java in Java (jug.ru)
Virtualizing Java in Java (jug.ru)Virtualizing Java in Java (jug.ru)
Virtualizing Java in Java (jug.ru)aragozin
 
Java basic tutorial by sanjeevini india
Java basic tutorial by sanjeevini indiaJava basic tutorial by sanjeevini india
Java basic tutorial by sanjeevini indiaSanjeev Tripathi
 
Java basic tutorial by sanjeevini india
Java basic tutorial by sanjeevini indiaJava basic tutorial by sanjeevini india
Java basic tutorial by sanjeevini indiasanjeeviniindia1186
 
Nodejs Intro Part One
Nodejs Intro Part OneNodejs Intro Part One
Nodejs Intro Part OneBudh Ram Gurung
 
Node.js Event Loop & EventEmitter
Node.js Event Loop & EventEmitterNode.js Event Loop & EventEmitter
Node.js Event Loop & EventEmitterSimen Li
 
Node intro
Node introNode intro
Node introcloudhead
 
Python twisted
Python twistedPython twisted
Python twistedMahendra M
 
Let'swift "Concurrency in swift"
Let'swift "Concurrency in swift"Let'swift "Concurrency in swift"
Let'swift "Concurrency in swift"Hyuk Hur
 
Reactive Programming in .Net - actorbased computing with Akka.Net
Reactive Programming in .Net - actorbased computing with Akka.NetReactive Programming in .Net - actorbased computing with Akka.Net
Reactive Programming in .Net - actorbased computing with Akka.NetSören Stelzer
 
55j7
55j755j7
55j7swein2
 
soft-shake.ch - Hands on Node.js
soft-shake.ch - Hands on Node.jssoft-shake.ch - Hands on Node.js
soft-shake.ch - Hands on Node.jssoft-shake.ch
 
DevoxxUK: Is your profiler speaking the same language as you?
DevoxxUK: Is your profiler speaking the same language as you?DevoxxUK: Is your profiler speaking the same language as you?
DevoxxUK: Is your profiler speaking the same language as you?Simon Maple
 
Devoxx PL: Is your profiler speaking the same language as you?
Devoxx PL: Is your profiler speaking the same language as you?Devoxx PL: Is your profiler speaking the same language as you?
Devoxx PL: Is your profiler speaking the same language as you?Simon Maple
 
Devoxx PL: Is your profiler speaking the same language as you?
Devoxx PL: Is your profiler speaking the same language as you?Devoxx PL: Is your profiler speaking the same language as you?
Devoxx PL: Is your profiler speaking the same language as you?Simon Maple
 

Ähnlich wie Apache ZooKeeper (20)

Distributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperDistributed Applications with Apache Zookeeper
Distributed Applications with Apache Zookeeper
 
What`s new in Java 7
What`s new in Java 7What`s new in Java 7
What`s new in Java 7
 
Curator intro
Curator introCurator intro
Curator intro
 
Much ado about randomness. What is really a random number?
Much ado about randomness. What is really a random number?Much ado about randomness. What is really a random number?
Much ado about randomness. What is really a random number?
 
Virtualizing Java in Java (jug.ru)
Virtualizing Java in Java (jug.ru)Virtualizing Java in Java (jug.ru)
Virtualizing Java in Java (jug.ru)
 
Java basic tutorial by sanjeevini india
Java basic tutorial by sanjeevini indiaJava basic tutorial by sanjeevini india
Java basic tutorial by sanjeevini india
 
Java basic tutorial by sanjeevini india
Java basic tutorial by sanjeevini indiaJava basic tutorial by sanjeevini india
Java basic tutorial by sanjeevini india
 
Nodejs Intro Part One
Nodejs Intro Part OneNodejs Intro Part One
Nodejs Intro Part One
 
Multi-Threading
Multi-ThreadingMulti-Threading
Multi-Threading
 
Node.js Event Loop & EventEmitter
Node.js Event Loop & EventEmitterNode.js Event Loop & EventEmitter
Node.js Event Loop & EventEmitter
 
Node intro
Node introNode intro
Node intro
 
Python twisted
Python twistedPython twisted
Python twisted
 
Let'swift "Concurrency in swift"
Let'swift "Concurrency in swift"Let'swift "Concurrency in swift"
Let'swift "Concurrency in swift"
 
Reactive Programming in .Net - actorbased computing with Akka.Net
Reactive Programming in .Net - actorbased computing with Akka.NetReactive Programming in .Net - actorbased computing with Akka.Net
Reactive Programming in .Net - actorbased computing with Akka.Net
 
55j7
55j755j7
55j7
 
soft-shake.ch - Hands on Node.js
soft-shake.ch - Hands on Node.jssoft-shake.ch - Hands on Node.js
soft-shake.ch - Hands on Node.js
 
Java
Java Java
Java
 
DevoxxUK: Is your profiler speaking the same language as you?
DevoxxUK: Is your profiler speaking the same language as you?DevoxxUK: Is your profiler speaking the same language as you?
DevoxxUK: Is your profiler speaking the same language as you?
 
Devoxx PL: Is your profiler speaking the same language as you?
Devoxx PL: Is your profiler speaking the same language as you?Devoxx PL: Is your profiler speaking the same language as you?
Devoxx PL: Is your profiler speaking the same language as you?
 
Devoxx PL: Is your profiler speaking the same language as you?
Devoxx PL: Is your profiler speaking the same language as you?Devoxx PL: Is your profiler speaking the same language as you?
Devoxx PL: Is your profiler speaking the same language as you?
 

Mehr von Scott Leberknight (20)

JShell & ki
JShell & kiJShell & ki
JShell & ki
 
JUnit Pioneer
JUnit PioneerJUnit Pioneer
JUnit Pioneer
 
JDKs 10 to 14 (and beyond)
JDKs 10 to 14 (and beyond)JDKs 10 to 14 (and beyond)
JDKs 10 to 14 (and beyond)
 
Unit Testing
Unit TestingUnit Testing
Unit Testing
 
SDKMAN!
SDKMAN!SDKMAN!
SDKMAN!
 
JUnit 5
JUnit 5JUnit 5
JUnit 5
 
AWS Lambda
AWS LambdaAWS Lambda
AWS Lambda
 
Dropwizard
DropwizardDropwizard
Dropwizard
 
RESTful Web Services with Jersey
RESTful Web Services with JerseyRESTful Web Services with Jersey
RESTful Web Services with Jersey
 
httpie
httpiehttpie
httpie
 
jps & jvmtop
jps & jvmtopjps & jvmtop
jps & jvmtop
 
Cloudera Impala, updated for v1.0
Cloudera Impala, updated for v1.0Cloudera Impala, updated for v1.0
Cloudera Impala, updated for v1.0
 
Java 8 Lambda Expressions
Java 8 Lambda ExpressionsJava 8 Lambda Expressions
Java 8 Lambda Expressions
 
Google Guava
Google GuavaGoogle Guava
Google Guava
 
Cloudera Impala
Cloudera ImpalaCloudera Impala
Cloudera Impala
 
iOS
iOSiOS
iOS
 
HBase Lightning Talk
HBase Lightning TalkHBase Lightning Talk
HBase Lightning Talk
 
Hadoop
HadoopHadoop
Hadoop
 
wtf is in Java/JDK/wtf7?
wtf is in Java/JDK/wtf7?wtf is in Java/JDK/wtf7?
wtf is in Java/JDK/wtf7?
 
CoffeeScript
CoffeeScriptCoffeeScript
CoffeeScript
 

KĂŒrzlich hochgeladen

Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vĂĄzquez
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

KĂŒrzlich hochgeladen (20)

Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Apache ZooKeeper

  • 1. ZooKeeper Scott Leberknight
  • 2. Your mission.. (should you choose to accept it) Build a distributed lock service Only one process may own the lock Must preserve ordering of requests Ensure proper lock release . .this message will self destruct in 5 seconds
  • 4. “Distributed Coordination Service” Distributed, hierarchical ïŹlesystem High availability, fault tolerant Performant (i.e. it’s fast) Facilitates loose coupling
  • 5. Fallacies of Distributed Computing.. # 1 The network is reliable. http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing
  • 6. Partial Failure Did my message get through? Did the operation complete? Or, did it fail???
  • 7. Follow the Leader One elected leader Many followers Followers may lag leader Eventual consistency
  • 8. What problems can it solve? Group membership Distributed data structures (locks, queues, barriers, etc.) Reliable conïŹguration service Distributed workïŹ‚ow
  • 11. public ZooKeeper connect(String hosts, int timeout) throws IOException, InterruptedException { final CountDownLatch signal = new CountDownLatch(1); ZooKeeper zk = new ZooKeeper(hosts, timeout, new Watcher() { @Override public void process(WatchedEvent event) { if (event.getState() == Watcher.Event.KeeperState.SyncConnected) { signal.countDown(); } } }); signal.await(); return zk; } must wait for connected event!
  • 12. ZooKeeper Sessions Tick time Session timeout Automatic failover
  • 13. znode Persistent or ephemeral Optional sequential numbering Can hold data (like a ïŹle) Persistent ones can have children (like a directory)
  • 14. public void create(String groupName) throws KeeperException, InterruptedException { String path = "/" + groupName; String createdPath = zk.create(path, null /*data*/, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT); System.out.println("Created " + createdPath); }
  • 15. public void join(String groupName, String memberName) throws KeeperException, InterruptedException { String path = "/" + groupName + "/" + memberName; String createdPath = zk.create(path, null /*data*/, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL); System.out.println("Created " + createdPath); }
  • 16. public void list(String groupName) throws KeeperException, InterruptedException { String path = "/" + groupName; try { List<String> children = zk.getChildren(path, false); for (String child : children) { System.out.println(child); } } catch (KeeperException.NoNodeException e) { System.out.printf("Group %s does not existn", groupName); } }
  • 17. public void delete(String groupName) throws KeeperException, InterruptedException { String path = "/" + groupName; try { List<String> children = zk.getChildren(path, false); for (String child : children) { zk.delete(path + "/" + child, -1); } zk.delete(path, -1); // parent } catch (KeeperException.NoNodeException e) { System.out.printf("%s does not existn", groupName); } } -1 deletes unconditionally , or specify version
  • 20. region Data Model customer account address transaction
  • 21. Hierarchical ïŹlesystem Comprised of znodes znode data (< 1 MB) Watchers Atomic znode access (reads/writes) Security (via authentication, ACLs)
  • 22. znode - types Persistent Persistent Sequential Ephemeral Ephemeral Sequential die when session expires
  • 23. znode - sequential { sequence numbers
  • 24. znode - operations Operation Type create write delete write exists read getChildren read getData read setData write getACL read setACL write sync read
  • 26. Synchronous public void list(final String groupName) throws KeeperException, InterruptedException { String path = "/" + groupName; try { List<String> children = zk.getChildren(path, false); // process children... } catch (KeeperException.NoNodeException e) { // handle non-existent path... } }
  • 27. aSync public void list(String groupName) throws KeeperException, InterruptedException { String path = "/" + groupName; zk.getChildren(path, false, new AsyncCallback.ChildrenCallback() { @Override public void processResult(int rc, String path, Object ctx, List<String> children) { // process results when get called back later... } }, null /* optional context object */); }
  • 28. znode - Watchers Set (one-time) watches on read operations Write operations trigger watches on affected znodes Re-register watches on events (optional)
  • 29. public interface Watcher { void process(WatchedEvent event); // other details elided... }
  • 30. public class ChildZNodeWatcher implements Watcher { private Semaphore semaphore = new Semaphore(1); public void watchChildren() throws InterruptedException, KeeperException { semaphore.acquire(); while (true) { List<String> children = zk.getChildren(lockPath, this); display(children); semaphore.acquire(); } } @Override public void process(WatchedEvent event) { if (event.getType() == Event.EventType.NodeChildrenChanged) { semaphore.release(); } } // other details elided... }
  • 31. Gotcha! When using watchers... ...updates can be missed! (not seen by client)
  • 33. “A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. With consistency a great soul has simply nothing to do.” - Ralph Waldo Emerson (Self-Reliance)
  • 34. Data Consistency in ZK Sequential updates Durability (of writes) Atomicity (all or nothin’) Bounded lag time (eventual consistency) Consistent client view (across all ZooKeeper servers)
  • 36. broadcast broadcast write Follower Leader Follower read read write read read read client client client client client client
  • 37. Leader election “majority rules” Atomic broadcast Clients have session on one server Writes routed through leader Reads from server memory
  • 38. Sessions Client-requested timeout period Automatic keep-alive (heartbeats) Automatic/transparent failover
  • 39. Writes All go through leader Broadcast to followers Global ordering every update has unique zxid (ZooKeeper transaction id)
  • 40. Reads “Follow-the-leader” Eventual consistency Can lag leader In-memory (fast!)
  • 41. Training Complete PA SSED
  • 42. Fin al Mis sion: Distributed Lock
  • 43. Objectives Mutual exclusion between processes Decouple lock users
  • 44. Building Blocks parent lock znode ephemeral, sequential child znodes lock-node child-1 child-2 ... child-N
  • 47. sample-lock process-1 process-2 Lock
  • 48. sample-lock process-1 process-2 process-3 Lock
  • 49. sample-lock process-2 process-3 Lock
  • 50. sample-lock process-3 Lock
  • 52. pseudocode (remember that stuff?) 1. create child ephemeral, sequential znode 2. list children of lock node and set a watch 3. if znode from step 1 has lowest number, then lock is acquired. done. 4. else wait for watch event from step 2, and then go back to step 2 on receipt
  • 53. This is OK, but the re are problems. . (perhaps a main B bus undervolt?)
  • 54. Problem #1 - Connection Loss If partial failure on znode creation... ...then how do we know if znode was created?
  • 55. Solution #1 (Connection Loss) Embed session id in child znode names lock-<sessionId>- Failed-over client checks for child w/ sessionId
  • 56. Problem #2 - The Herd Effect
  • 57. Many clients watching lock node. NotiïŹcation sent to all watchers on lock release... ...possible large spike in network trafïŹc
  • 58. Solution #2 (The Herd Effect) Set watch only on child znode with preceding sequence number e.g. client holding lock-9 watches only lock-8 Note, lock-9 could watch lock-7 (e.g. lock-8 client died)
  • 59. Implementation Do it yourself? ...or be lazy and use existing code? org.apache.zookeeper.recipes.lock.WriteLock org.apache.zookeeper.recipes.lock.LockListener
  • 60. Implementation, part deux WriteLock calls back when lock acquired (async) Maybe you want a synchronous client model... Let’s do some decoration...
  • 61. public class BlockingWriteLock { private CountDownLatch signal = new CountDownLatch(1); // other instance variable definitions elided... public BlockingWriteLock(String name, ZooKeeper zookeeper, String path, List<ACL> acls) { this.name = name; this.path = path; this.writeLock = new WriteLock(zookeeper, path, acls, new SyncLockListener()); } public void lock() throws InterruptedException, KeeperException { writeLock.lock(); signal.await(); } public void unlock() { writeLock.unlock(); } class SyncLockListener implements LockListener { /* next slide */ } } 55
  • 62. class SyncLockListener implements LockListener { @Override public void lockAcquired() { signal.countDown(); } @Override public void lockReleased() { } } 56
  • 63. BlockingWriteLock lock = new BlockingWriteLock(myName, zooKeeper, path, ZooDefs.Ids.OPEN_ACL_UNSAFE); try { lock.lock(); // do something while we have the lock } catch (Exception ex) { // handle appropriately... } finally { lock.unlock(); } Easy to forget! 57
  • 64. (we can do a little better, right?)
  • 65. public class DistributedOperationExecutor { private final ZooKeeper _zk; public DistributedOperationExecutor(ZooKeeper zk) { _zk = zk; } public Object withLock(String name, String lockPath, List<ACL> acls, DistributedOperation op) throws InterruptedException, KeeperException { BlockingWriteLock writeLock = new BlockingWriteLock(name, _zk, lockPath, acl); try { writeLock.lock(); return op.execute(); } finally { writeLock.unlock(); } } }
  • 66. executor.withLock(myName, path, ZooDefs.Ids.OPEN_ACL_UNSAFE, new DistributedOperation() { @Override public Object execute() { // do something while we have the lock return whateverTheResultIs; } } );
  • 68.
  • 69. Mission: OMP LIS HED ACC
  • 71. let’s crowdsource (for free beer, of course)
  • 72. Like a ïŹlesystem, except distributed & replicated Build distributed coordination, data structures, etc. High-availability, reliability Automatic session failover, keep-alive Writes via leader, in-memory reads (fast)
  • 73. Refs
  • 74. zookeeper.apache.org/ shop.oreilly.com/product/0636920021773.do (3rd edition pub date is May 29, 2012)
  • 75. photo attributions Follow the Leader - http://www.ïŹ‚ickr.com/photos/davidspinks/4211977680/ Antique Clock - http://www.ïŹ‚ickr.com/photos/cncphotos/3828163139/ Skull & Crossbones - http://www.ïŹ‚ickr.com/photos/halfanacre/2841434212/ Ralph Waldo Emerson - http://www.americanpoems.com/poets/emerson 1921 Jazzing Orchestra - http://en.wikipedia.org/wiki/File:Jazzing_orchestra_1921.png Apollo 13 Patch - http://science.ksc.nasa.gov/history/apollo/apollo-13/apollo-13.html Herd of Sheep - http://www.ïŹ‚ickr.com/photos/freefoto/639294974/ Running on Beach - http://www.ïŹ‚ickr.com/photos/kcdale99/3148108922/ Crowd - http://www.ïŹ‚ickr.com/photos/laubarnes/5449810523/ (others from iStockPhoto...)