SlideShare ist ein Scribd-Unternehmen logo
1 von 31
ID generation



PHP London 2012-08-02
@davegardnerisme
@davegardnerisme




hailoapp.com/dave
(for a £5 discount)
MySQL auto increment



                                    DC 1



               1,2,3,4…
       MySQL              Web App
MySQL auto increment


   • Numeric IDs
   • Go up with time
   • Not resilient
MySQL multi-master replication



                                     DC 1

                1,3,5,7…
        MySQL
                           Web App
                2,4,6,8…
        MySQL
MySQL multi-master replication


   • Numeric IDs
   • Do not go up with time
   • Some resilience
DC 1

DC 2   DC 4
                       DC 5




                              DC 6
         DC 3




                 Going global…
MySQL in multi DC setup

                            DC 1

           1,2,3…   Web
   MySQL
                    App
                                   WAN LINK


                     DC 2




                            ?                 Web
                                              App
Flickr MySQL ticket server

                             DC 1


   Ticket   1,3,5…   Web                 WAN link not required to
   Server            App                          generate an ID

                                      WAN LINK


                      DC 2


                             Ticket   4,6,8…     Web
                             Server              App
Flickr MySQL ticket server


    • Numeric IDs
    • Do not go up with time
    • Resilient and distributed
    • ID generation separated from
      data store
The anatomy of a ticket server

                                        DC


    Web      Web            Web   Web
    App      App            App   App




                   Ticket
                   Server
Making things simpler



                                          DC


    Web      Web        Web      Web
    App      App        App      App

   ID gen    ID gen     ID gen   ID gen
UUIDs


   • 128 bits
   • Could use type 4 (Random) or
     type 1 (MAC address with time
     component)
   • Can generate on each machine
     with no co-ordination
Type 4 – random


                             version

 xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
        variant (8, 9, A or B)



 f47ac10b-58cc-4372-a567-0e02b2c3d479
5.3 x          1036


possible values for a type 4 UUID
1.1 x        1019


UUIDs we could generate per second
     since the Universe began
2.1 x          1027


Olympic swimming pools filled if each
possible value contributed a millilitre
Type 1 – MAC address


 51063800-dc76-11e1-9fae-001c42000009


   • Time component is based on 100
     nanosecond intervals since
     October 15, 1582
   • Most significant bits of timestamp
     shifted to least significant bits of
     UUID
Type 1 – MAC address


   • The address (MAC) of the
     computer that generated the ID is
     encoded into it
   • Lexical ordering essentially
     meaningless
   • Deterministically unique
There are some other options…
No co-ordination needed


Deterministically unique


K-ordered (time-ordered
       lexically)
Twitter Snowflake


   • Under 64 bits
   • No co-ordination (after startup)
   • K-ordered
   • Scala service, Thrift interface,
     uses Zookeeper for configuration
Twitter Snowflake


   41 bits   Timestamp
             millisecond precision, bespoke epoch

   10 bits   Configured machine ID
   12 bits   Sequence number
Twitter Snowflake


   77669839702851584


   = (timestamp << 22)
    | (machine << 12)
    | sequence
Boundary Flake


   • 128 bits
   • No co-ordination at all
   • K-ordered
   • Erlang service
Boundary Flake


   64 bits   Timestamp
             millisecond precision, 1970 epoch

   48 bits   MAC address
   16 bits   Sequence number
PHP Cruftflake


   • Based on Twitter Snowflake
   • No co-ordination (after startup)
   • K-ordered
   • PHP, ZeroMQ interface, uses
     Zookeeper for configuration
Questions?
References
Flickr distributed ticket server
http://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-
the-cheap/

UUIDs
http://tools.ietf.org/html/rfc4122

How random are random UUIDs?
http://stackoverflow.com/a/2514722/15318

Twitter Snowflake
https://github.com/twitter/snowflake

Boundary Flake
https://github.com/boundary/flake

PHP Cruftflake
https://github.com/davegardnerisme/cruftflake
private function mintId64($timestamp, $machine, $sequence)
{
    $timestamp = (int)$timestamp;
    $value = ($timestamp << 22) | ($machine << 12) | $sequence;
    return (string)$value;
}

private function mintId32($timestamp, $machine, $sequence)
{
    $hi = (int)($timestamp / pow(2,10));
    $lo = (int)($timestamp * pow(2, 22));

    // stick in the machine + sequence to the low bit
    $lo = $lo | ($machine << 12) | $sequence;

    // reconstruct into a string of numbers
    $hex = pack('N2', $hi, $lo);
    $unpacked = unpack('H*', $hex);
    $value = $this->hexdec($unpacked[1]);
    return (string)$value;
}
public function generate()
{
    $t = floor($this->timer->getUnixTimestamp()
             - $this->epoch);
    if ($t !== $this->lastTime) {
        $this->sequence = 0;
        $this->lastTime = $t;
    } else {
        $this->sequence++;
        if ($this->sequence > 4095) {
             throw new OverflowException('Sequence overflow');
        }
    }

    if (PHP_INT_SIZE === 4) {
        return $this->mintId32($t, $this->machine,
             $this->sequence);
    } else {
        return $this->mintId64($t, $this->machine,
             $this->sequence);
    }
}

Weitere ähnliche Inhalte

Was ist angesagt?

From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.Taras Matyashovsky
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseenissoz
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservicespflueras
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Big Data in Real-Time at Twitter
Big Data in Real-Time at TwitterBig Data in Real-Time at Twitter
Big Data in Real-Time at Twitternkallen
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
Thrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased ComparisonThrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased ComparisonIgor Anishchenko
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
 
Stream processing with Apache Flink (Timo Walther - Ververica)
Stream processing with Apache Flink (Timo Walther - Ververica)Stream processing with Apache Flink (Timo Walther - Ververica)
Stream processing with Apache Flink (Timo Walther - Ververica)KafkaZone
 
Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureDan McKinley
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeDatabricks
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiFlink Forward
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMike Dirolf
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersSATOSHI TAGOMORI
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingTill Rohrmann
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions Yugabyte
 
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScalePinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScaleSeunghyun Lee
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache icebergAlluxio, Inc.
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotXiang Fu
 
Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Philip Fisher-Ogden
 

Was ist angesagt? (20)

From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Big Data in Real-Time at Twitter
Big Data in Real-Time at TwitterBig Data in Real-Time at Twitter
Big Data in Real-Time at Twitter
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Thrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased ComparisonThrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased Comparison
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Stream processing with Apache Flink (Timo Walther - Ververica)
Stream processing with Apache Flink (Timo Walther - Ververica)Stream processing with Apache Flink (Timo Walther - Ververica)
Stream processing with Apache Flink (Timo Walther - Ververica)
 
Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds Architecture
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
 
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScalePinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache iceberg
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
 
Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014
 

Andere mochten auch

Velocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPFVelocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPFBrendan Gregg
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016Brendan Gregg
 
Velocity 2015 linux perf tools
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf toolsBrendan Gregg
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixBrendan Gregg
 
Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Brendan Gregg
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance AnalysisBrendan Gregg
 

Andere mochten auch (7)

Velocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPFVelocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPF
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 
Velocity 2015 linux perf tools
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf tools
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at Netflix
 
Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 

Ähnlich wie Unique ID generation in distributed systems

MySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesMySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesBernd Ocklin
 
Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...DataWorks Summit
 
Introduction to web security @ confess 2012
Introduction to web security @ confess 2012Introduction to web security @ confess 2012
Introduction to web security @ confess 2012jakobkorherr
 
.NET Cloud-Native Bootcamp Minneapolis
.NET Cloud-Native Bootcamp Minneapolis.NET Cloud-Native Bootcamp Minneapolis
.NET Cloud-Native Bootcamp MinneapolisVMware Tanzu
 
Fixing twitter
Fixing twitterFixing twitter
Fixing twitterRoger Xia
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...smallerror
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...xlight
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterJohn Adams
 
Using Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.comUsing Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.comDamien Krotkine
 
Game On! Exploring Microservices with a Text-Based Adventure Game
Game On! Exploring Microservices with a Text-Based Adventure GameGame On! Exploring Microservices with a Text-Based Adventure Game
Game On! Exploring Microservices with a Text-Based Adventure GameErin Schnabel
 
SPS Ozarks 2012: Kerberos Survival Guide
SPS Ozarks 2012: Kerberos Survival GuideSPS Ozarks 2012: Kerberos Survival Guide
SPS Ozarks 2012: Kerberos Survival GuideJ.D. Wade
 
Infrastructure API Lightning Talk by Jeremy Pollard of box.com
Infrastructure API Lightning Talk by Jeremy Pollard of box.comInfrastructure API Lightning Talk by Jeremy Pollard of box.com
Infrastructure API Lightning Talk by Jeremy Pollard of box.comDevOps4Networks
 
How to over-engineer things and have fun? | Oto Brglez, OPALAB
How to over-engineer things and have fun? | Oto Brglez, OPALABHow to over-engineer things and have fun? | Oto Brglez, OPALAB
How to over-engineer things and have fun? | Oto Brglez, OPALABHostedbyConfluent
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 
Getting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of DatastaxGetting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of DatastaxData Con LA
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...DataStax Academy
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsJulien Anguenot
 
Asynchronous design with Spring and RTI: 1M events per second
Asynchronous design with Spring and RTI: 1M events per secondAsynchronous design with Spring and RTI: 1M events per second
Asynchronous design with Spring and RTI: 1M events per secondStuart (Pid) Williams
 
Realtime Messaging und verteilte Systeme mit SharePoint und Windows Azure Ser...
Realtime Messaging und verteilte Systeme mit SharePoint und Windows Azure Ser...Realtime Messaging und verteilte Systeme mit SharePoint und Windows Azure Ser...
Realtime Messaging und verteilte Systeme mit SharePoint und Windows Azure Ser...Damir Dobric
 

Ähnlich wie Unique ID generation in distributed systems (20)

MySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion QueriesMySQL Cluster Scaling to a Billion Queries
MySQL Cluster Scaling to a Billion Queries
 
Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...
 
Introduction to web security @ confess 2012
Introduction to web security @ confess 2012Introduction to web security @ confess 2012
Introduction to web security @ confess 2012
 
.NET Cloud-Native Bootcamp Minneapolis
.NET Cloud-Native Bootcamp Minneapolis.NET Cloud-Native Bootcamp Minneapolis
.NET Cloud-Native Bootcamp Minneapolis
 
Fixing twitter
Fixing twitterFixing twitter
Fixing twitter
 
Fixing_Twitter
Fixing_TwitterFixing_Twitter
Fixing_Twitter
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling Twitter
 
Using Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.comUsing Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.com
 
Game On! Exploring Microservices with a Text-Based Adventure Game
Game On! Exploring Microservices with a Text-Based Adventure GameGame On! Exploring Microservices with a Text-Based Adventure Game
Game On! Exploring Microservices with a Text-Based Adventure Game
 
SPS Ozarks 2012: Kerberos Survival Guide
SPS Ozarks 2012: Kerberos Survival GuideSPS Ozarks 2012: Kerberos Survival Guide
SPS Ozarks 2012: Kerberos Survival Guide
 
Infrastructure API Lightning Talk by Jeremy Pollard of box.com
Infrastructure API Lightning Talk by Jeremy Pollard of box.comInfrastructure API Lightning Talk by Jeremy Pollard of box.com
Infrastructure API Lightning Talk by Jeremy Pollard of box.com
 
How to over-engineer things and have fun? | Oto Brglez, OPALAB
How to over-engineer things and have fun? | Oto Brglez, OPALABHow to over-engineer things and have fun? | Oto Brglez, OPALAB
How to over-engineer things and have fun? | Oto Brglez, OPALAB
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Getting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of DatastaxGetting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of Datastax
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
 
Asynchronous design with Spring and RTI: 1M events per second
Asynchronous design with Spring and RTI: 1M events per secondAsynchronous design with Spring and RTI: 1M events per second
Asynchronous design with Spring and RTI: 1M events per second
 
Realtime Messaging und verteilte Systeme mit SharePoint und Windows Azure Ser...
Realtime Messaging und verteilte Systeme mit SharePoint und Windows Azure Ser...Realtime Messaging und verteilte Systeme mit SharePoint und Windows Azure Ser...
Realtime Messaging und verteilte Systeme mit SharePoint und Windows Azure Ser...
 

Mehr von Dave Gardner

Cabs, Cassandra, and Hailo (at Cassandra EU)
Cabs, Cassandra, and Hailo (at Cassandra EU)Cabs, Cassandra, and Hailo (at Cassandra EU)
Cabs, Cassandra, and Hailo (at Cassandra EU)Dave Gardner
 
Cabs, Cassandra, and Hailo
Cabs, Cassandra, and HailoCabs, Cassandra, and Hailo
Cabs, Cassandra, and HailoDave Gardner
 
Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13Dave Gardner
 
Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13Dave Gardner
 
Cassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsCassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsDave Gardner
 
Learning Cassandra
Learning CassandraLearning Cassandra
Learning CassandraDave Gardner
 
Cassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraCassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraDave Gardner
 
Intro slides from Cassandra London July 2011
Intro slides from Cassandra London July 2011Intro slides from Cassandra London July 2011
Intro slides from Cassandra London July 2011Dave Gardner
 
2011.07.18 cassandrameetup
2011.07.18 cassandrameetup2011.07.18 cassandrameetup
2011.07.18 cassandrameetupDave Gardner
 
Cassandra + Hadoop = Brisk
Cassandra + Hadoop = BriskCassandra + Hadoop = Brisk
Cassandra + Hadoop = BriskDave Gardner
 
Introduction to Cassandra at London Web Meetup
Introduction to Cassandra at London Web MeetupIntroduction to Cassandra at London Web Meetup
Introduction to Cassandra at London Web MeetupDave Gardner
 
Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Dave Gardner
 

Mehr von Dave Gardner (13)

Cabs, Cassandra, and Hailo (at Cassandra EU)
Cabs, Cassandra, and Hailo (at Cassandra EU)Cabs, Cassandra, and Hailo (at Cassandra EU)
Cabs, Cassandra, and Hailo (at Cassandra EU)
 
Cabs, Cassandra, and Hailo
Cabs, Cassandra, and HailoCabs, Cassandra, and Hailo
Cabs, Cassandra, and Hailo
 
Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13
 
Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13
 
Cassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsCassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patterns
 
Learning Cassandra
Learning CassandraLearning Cassandra
Learning Cassandra
 
Cassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraCassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache Cassandra
 
Intro slides from Cassandra London July 2011
Intro slides from Cassandra London July 2011Intro slides from Cassandra London July 2011
Intro slides from Cassandra London July 2011
 
2011.07.18 cassandrameetup
2011.07.18 cassandrameetup2011.07.18 cassandrameetup
2011.07.18 cassandrameetup
 
Cassandra + Hadoop = Brisk
Cassandra + Hadoop = BriskCassandra + Hadoop = Brisk
Cassandra + Hadoop = Brisk
 
Introduction to Cassandra at London Web Meetup
Introduction to Cassandra at London Web MeetupIntroduction to Cassandra at London Web Meetup
Introduction to Cassandra at London Web Meetup
 
Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2
 
PHP and Cassandra
PHP and CassandraPHP and Cassandra
PHP and Cassandra
 

Kürzlich hochgeladen

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 

Kürzlich hochgeladen (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 

Unique ID generation in distributed systems

  • 1. ID generation PHP London 2012-08-02 @davegardnerisme
  • 3. MySQL auto increment DC 1 1,2,3,4… MySQL Web App
  • 4. MySQL auto increment • Numeric IDs • Go up with time • Not resilient
  • 5. MySQL multi-master replication DC 1 1,3,5,7… MySQL Web App 2,4,6,8… MySQL
  • 6. MySQL multi-master replication • Numeric IDs • Do not go up with time • Some resilience
  • 7. DC 1 DC 2 DC 4 DC 5 DC 6 DC 3 Going global…
  • 8. MySQL in multi DC setup DC 1 1,2,3… Web MySQL App WAN LINK DC 2 ? Web App
  • 9. Flickr MySQL ticket server DC 1 Ticket 1,3,5… Web WAN link not required to Server App generate an ID WAN LINK DC 2 Ticket 4,6,8… Web Server App
  • 10. Flickr MySQL ticket server • Numeric IDs • Do not go up with time • Resilient and distributed • ID generation separated from data store
  • 11. The anatomy of a ticket server DC Web Web Web Web App App App App Ticket Server
  • 12. Making things simpler DC Web Web Web Web App App App App ID gen ID gen ID gen ID gen
  • 13. UUIDs • 128 bits • Could use type 4 (Random) or type 1 (MAC address with time component) • Can generate on each machine with no co-ordination
  • 14. Type 4 – random version xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx variant (8, 9, A or B) f47ac10b-58cc-4372-a567-0e02b2c3d479
  • 15. 5.3 x 1036 possible values for a type 4 UUID
  • 16. 1.1 x 1019 UUIDs we could generate per second since the Universe began
  • 17. 2.1 x 1027 Olympic swimming pools filled if each possible value contributed a millilitre
  • 18. Type 1 – MAC address 51063800-dc76-11e1-9fae-001c42000009 • Time component is based on 100 nanosecond intervals since October 15, 1582 • Most significant bits of timestamp shifted to least significant bits of UUID
  • 19. Type 1 – MAC address • The address (MAC) of the computer that generated the ID is encoded into it • Lexical ordering essentially meaningless • Deterministically unique
  • 20. There are some other options…
  • 21. No co-ordination needed Deterministically unique K-ordered (time-ordered lexically)
  • 22. Twitter Snowflake • Under 64 bits • No co-ordination (after startup) • K-ordered • Scala service, Thrift interface, uses Zookeeper for configuration
  • 23. Twitter Snowflake 41 bits Timestamp millisecond precision, bespoke epoch 10 bits Configured machine ID 12 bits Sequence number
  • 24. Twitter Snowflake 77669839702851584 = (timestamp << 22) | (machine << 12) | sequence
  • 25. Boundary Flake • 128 bits • No co-ordination at all • K-ordered • Erlang service
  • 26. Boundary Flake 64 bits Timestamp millisecond precision, 1970 epoch 48 bits MAC address 16 bits Sequence number
  • 27. PHP Cruftflake • Based on Twitter Snowflake • No co-ordination (after startup) • K-ordered • PHP, ZeroMQ interface, uses Zookeeper for configuration
  • 29. References Flickr distributed ticket server http://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on- the-cheap/ UUIDs http://tools.ietf.org/html/rfc4122 How random are random UUIDs? http://stackoverflow.com/a/2514722/15318 Twitter Snowflake https://github.com/twitter/snowflake Boundary Flake https://github.com/boundary/flake PHP Cruftflake https://github.com/davegardnerisme/cruftflake
  • 30. private function mintId64($timestamp, $machine, $sequence) { $timestamp = (int)$timestamp; $value = ($timestamp << 22) | ($machine << 12) | $sequence; return (string)$value; } private function mintId32($timestamp, $machine, $sequence) { $hi = (int)($timestamp / pow(2,10)); $lo = (int)($timestamp * pow(2, 22)); // stick in the machine + sequence to the low bit $lo = $lo | ($machine << 12) | $sequence; // reconstruct into a string of numbers $hex = pack('N2', $hi, $lo); $unpacked = unpack('H*', $hex); $value = $this->hexdec($unpacked[1]); return (string)$value; }
  • 31. public function generate() { $t = floor($this->timer->getUnixTimestamp() - $this->epoch); if ($t !== $this->lastTime) { $this->sequence = 0; $this->lastTime = $t; } else { $this->sequence++; if ($this->sequence > 4095) { throw new OverflowException('Sequence overflow'); } } if (PHP_INT_SIZE === 4) { return $this->mintId32($t, $this->machine, $this->sequence); } else { return $this->mintId64($t, $this->machine, $this->sequence); } }