Big problems, massive data

Stratified B-trees
Versioned dictionaries

•   put(k, ver, data)                          Monday 12:00   v10

•   get(k_start, k_end, ver)
                                               Monday 16:00   v11
•   clone(v): create a child of v that
    inherits the latest version of its keys
                                                  Now         v12


    This talk: a versioned dictionary with fast updates,
    and optimal space/query/update tradeoffs
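
To make the interface concrete, here is a toy in-memory model of the put/get/clone semantics (a minimal Python sketch, not the stratified B-tree itself; all names are illustrative). Each version stores only its own writes and falls back along its parent chain, so clone is O(1) and a child inherits the latest version of its keys.

    class VersionedDict:
        """Toy reference model of put/get/clone semantics (not the real on-disk structure)."""

        def __init__(self):
            self.parent = {0: None}        # version -> parent version (0 is the root)
            self.writes = {0: {}}          # version -> {key: data} written at that version
            self.next_version = 1

        def put(self, k, ver, data):
            self.writes[ver][k] = data

        def clone(self, v):
            # create a child of v that inherits the latest version of its keys
            child = self.next_version
            self.next_version += 1
            self.parent[child] = v
            self.writes[child] = {}
            return child

        def get(self, k_start, k_end, ver):
            # range query: latest visible value of each key in [k_start, k_end] at ver
            result = {}
            v = ver
            while v is not None:           # walk towards the root; the nearest write wins
                for k, data in self.writes[v].items():
                    if k_start <= k <= k_end and k not in result:
                        result[k] = data
                v = self.parent[v]
            return sorted(result.items())

    # d = VersionedDict(); d.put("a", 0, 1); v = d.clone(0); d.put("a", v, 2)
    # d.get("a", "z", 0) -> [("a", 1)]       d.get("a", "z", v) -> [("a", 2)]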
Why?

•   Powerful: cloning, time-travel,            Monday 12:00   v10
    cache and space-efficiency, ...

•   Give developers a recent branch            Monday 16:00   v11
    of the live dataset

•   Expose different views of the                 Now         v12   v13
    same base dataset
                                               Run analytics/tests/etc on
                                               this clone, without
                                               performance impact.
State of the art: copy-on-write

  Used in ZFS, WAFL, Btrfs, ... Apply path-copying [DSST] to
  the B-tree

  Problems:
  •  Space blowup: each update may rewrite an entire path
  •  Slow updates: as above
  •  Needs random IO to scale
  •  Concurrency is tricky

  A log file system makes updates sequential, but relies on
  garbage collection (Achilles heel!)
  important for flash

                              ~ log (2^30)/log 10000          ~ log (2^30)/10000
                                = 3 IOs/update                  = 0.003 IOs/update

                              CoW B-tree                       This talk
                           [ZFS, WAFL, Btrfs, ...]

  Update                   O(logB Nv)                          O((log Nv) / B)
                           random IOs                          cache-oblivious IOs

  Range query              O(Z/B) random                       O(Z/B) sequential
    (size Z)

  Space                    O(N B logB Nv)                      O(N)


  Nv = #keys live (accessible) at version v
  B = “block size”, say 1MB at 100 bytes/entry = 10000 entries
  complication: B is asymmetric for flash..
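
These back-of-envelope figures can be reproduced directly (a quick Python check, using the N = 2^30 keys and B = 10000 entries-per-block assumptions stated under the table):

    import math

    N = 2 ** 30          # ~1 billion keys live at version v
    B = 10_000           # entries per "block": 1MB at 100 bytes/entry

    print(math.log(N, B))      # CoW B-tree:        log_B(N) random IOs  ~ 2.3 (the slide rounds to ~3)
    print(math.log2(N) / B)    # stratified B-tree: (log N) / B IOs      = 0.003 per update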
Unversioned Case
[Doubling Array]
Doubling Array
                      Inserts

 Buffer arrays in memory until we have > B of them

    insert 2           →   [2]
    insert 9           →   [2] [9]        →  merge  →  [2 9]
    insert 11          →   [11] [2 9]
    insert 8           →   [8 11] [2 9]   →  merge  →  [2 8 9 11]
                                                            etc...

Similar to log-structured merge trees (LSM), cache-
oblivious lookahead array (COLA), ...
O(log N) “levels”, each element is rewritten once per level
  O((log N) / B) IOs
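
A minimal in-memory sketch of the unversioned insert path (Python; illustrative only, since the real arrays live on disk and carry versions). Level i holds either nothing or one sorted array of 2^i keys, and an insert can trigger a cascade of merges exactly as in the picture above; each element is rewritten once per level, which is where the O((log N) / B) amortized bound comes from once arrays of size ≥ B are merged sequentially on disk.

    class DoublingArray:
        def __init__(self):
            self.levels = []                  # levels[i] is None or a sorted list of 2**i keys

        def insert(self, key):
            carry = [key]                     # a sorted array of size 2**0
            i = 0
            while True:
                if i == len(self.levels):
                    self.levels.append(None)
                if self.levels[i] is None:    # empty slot: the carry lands here
                    self.levels[i] = carry
                    return
                # slot occupied: merge two equal-sized sorted arrays and carry upwards
                # (a real implementation streams the merge in O(size) sequential IO)
                carry = sorted(carry + self.levels[i])
                self.levels[i] = None
                i += 1

    da = DoublingArray()
    for k in (2, 9, 11, 8):
        da.insert(k)
    print(da.levels)                          # [None, None, [2, 8, 9, 11]]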
Doubling Array
                 Queries

• Add an index to each array to do lookups

• query(k) searches each array independently

• Bloom Filters can help exclude arrays from
  search

• ... but don’t help with range queries
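
A sketch of the query side (Python; `might_contain` is a stand-in for whatever Bloom-filter interface is used, which the slides don't fix): each level's sorted array is searched independently, newest level first, and a filter that answers "definitely not here" lets us skip an array entirely, but only for point lookups. It can be run against `da.levels` from the insert sketch above.

    import bisect

    def query(levels, key, filters=None):
        # search each array independently, newest level first
        for i, arr in enumerate(levels):
            if arr is None:
                continue
            if filters is not None and not filters[i].might_contain(key):
                continue                      # filter says the key cannot be in this array
            j = bisect.bisect_left(arr, key)
            if j < len(arr) and arr[j] == key:
                return (i, j)                 # found in level i at offset j
        return None                           # not present at any level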
Fractional Cascading

• Fractional Cascading:
    use information from the search at level l
         to help the search at level l+1

• From each array, sample every 4th element
  and put a pointer to it in the previous level

• These ‘forward pointers’ give bounds for the
  search in the next array

• In case you might get unlucky with the
  sampling, add regular ‘secondary’ pointers to
  the nearest forward pointer above and below
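
A rough Python sketch of the forward-pointer idea (assumptions: every level is a non-empty sorted list of distinct keys, and the sample rate of 4 follows the slide; this sketch locates the bracketing samples by key, so the 'secondary' pointers are not modelled separately). Each level stores, for the next level, every 4th key together with its position there; a lookup binary-searches the whole first array, then only the window of at most 4 entries between the bracketing forward pointers at each later level.

    import bisect

    SAMPLE = 4

    def build_pointers(levels):
        # for each level i, record (key, index in level i+1) for every 4th key of level i+1
        pointers = []
        for i in range(len(levels) - 1):
            nxt = levels[i + 1]
            pointers.append([(nxt[j], j) for j in range(0, len(nxt), SAMPLE)])
        pointers.append([])                       # the last level points nowhere
        return pointers

    def cascaded_search(levels, pointers, key):
        hits = []
        lo, hi = 0, len(levels[0])                # full binary search at the first level only
        for i, arr in enumerate(levels):
            j = bisect.bisect_left(arr, key, lo, hi)
            if j < hi and arr[j] == key:
                hits.append((i, j))
            if i + 1 < len(levels):
                ptrs = pointers[i]
                sample_keys = [k for k, _ in ptrs]
                p = bisect.bisect_right(sample_keys, key)
                # bracketing forward pointers bound the search window in the next array
                lo = ptrs[p - 1][1] if p > 0 else 0
                hi = ptrs[p][1] if p < len(ptrs) else len(levels[i + 1])
        return hits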
Versioned case (sketch)
Adding versions

                  version 1
 k1   k2   k3   k4   k5   k6   k7   k8   k9  k10  k11  k12  k13   k6
                                                                       version 2

if the layout is good for v1 ...
                       ... then it’s bad for v2

if you try to keep all versions of a key close ...
 k1   k2   k3   k4   k5   k6   k6   k6   k6   k6  ...  k7   k8   k9  k10  k11  k12  k13

                       ... then it’s bad for all versions
      (versions 2, 3, 4, ...)
Density

  Example array, W = {v1, v2, v3}:

     k0,v0,x   k1,v0,x   k1,v2,x   k2,v1,x   k2,v2,x   k2,v3,x   k3,v1,x   k3,v2,x

  (version tree in the figure: v0 at the root, with v4 and v5 on one
   branch and v1 on the other; v2 and v3 are children of v1)

     live(v1) = 4
     live(v2) = 4
     live(v3) = 4
     density  = 4/8

•   Arrays are tagged with a version set W

•   f(A,v) = (#elements in A live at version v) / |A|

•   density(A,W) = min{w in W} f(A,w)

•   We say the array (A,W) is dense if density ≥ 1/5

•   Tradeoff: high density means good range queries, but many duplicates
    (imagine density 1 and density 1/N)
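
The density definitions can be checked against the example above (a Python sketch; the exact shape of the version tree in the figure is partly an assumption, with v5 taken to hang off v4, but only v0, v1, v2, v3 matter for W = {v1, v2, v3}). An entry (k, w) is live at v when w is the nearest ancestor-or-self of v that writes k.

    def live_entries(entries, parents, v):
        # entries live at v: for each key, the write at the nearest ancestor-or-self of v
        path = []
        node = v
        while node is not None:                    # v, parent(v), ..., root
            path.append(node)
            node = parents[node]
        writes_at = {}
        for k, w in entries:
            writes_at.setdefault(w, set()).add(k)
        live, seen = set(), set()
        for node in path:                          # nearest version first, so its writes win
            for k in writes_at.get(node, ()):
                if k not in seen:
                    seen.add(k)
                    live.add((k, node))
        return live

    def density(entries, parents, W):
        # density(A, W) = min over w in W of f(A, w) = (#elements live at w) / |A|
        return min(len(live_entries(entries, parents, w)) for w in W) / len(entries)

    parents = {"v0": None, "v4": "v0", "v5": "v4", "v1": "v0", "v2": "v1", "v3": "v1"}
    entries = [("k0", "v0"), ("k1", "v0"), ("k1", "v2"), ("k2", "v1"),
               ("k2", "v2"), ("k2", "v3"), ("k3", "v1"), ("k3", "v2")]
    print(density(entries, parents, {"v1", "v2", "v3"}))   # 0.5, i.e. 4/8 as on the slide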
                   Range queries

                        (k,*)

    •   imagine scanning over each accessible array

    •   density => trivially true for large (‘voluminous’) range queries

    •   for point queries:
        •   amortize over all k for a fixed version v
        •   each query examines disjoint regions of the array
        •   density implies total size examined = O(Nv log Nv)

 From the paper: for much smaller range queries, the worst-case performance
 may be the same as for a point query. We now prove the amortized bound,
 which applies to smaller queries.

   Theorem 2. A range query at version v costs O(log Nv + Z/B) amortized
   I/Os.

   Proof (start). We first consider just point queries, and amortize the
   cost of lookup(k, v) over all keys live at v. Let l(k, v) be the cost of
   lookup(k, v); then the amortized cost is given by Σ_k l(k, v) / Nv. For
   an array Ai, let l(k, v, Ai) be the number of I/Os used in examining
   elements in Ai for lookup(k, v). The idea is ...
Don’t worry, stay dense!

    •   Version sets disjoint at each level -- lookups examine one array/level

    •   merge arrays with intersecting version sets

    •   the result of a merge might not be dense

    •   Answer: density amplification! (see the sketch after the diagram)


        promote            merge          density amplification        demote
 ...    {1,2}  {1,3}   ->  {1,2,3}    ->  {2,3}  {1}               ->  ...
        {4}                {4}            {4}
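
A small sketch of the merge rule in the pipeline above (Python; the array contents here are placeholders): arrays at a level whose version sets intersect, directly or transitively, are merged into one, so the version sets that remain are pairwise disjoint and a lookup touches at most one array per level. Density amplification itself (splitting {1,2,3} back into denser pieces such as {2,3} and {1}) is not modelled here; it would run on each merged result.

    def merge_intersecting(arrays):
        # arrays: list of (sorted_entries, version_set)
        groups = []
        for entries, versions in arrays:
            entries, versions = list(entries), set(versions)
            keep = []
            for g_entries, g_versions in groups:
                if g_versions & versions:          # overlapping version sets: absorb this group
                    entries += g_entries
                    versions |= g_versions
                else:
                    keep.append((g_entries, g_versions))
            keep.append((sorted(entries), versions))
            groups = keep
        return groups

    level = [(["a", "c"], {1, 2}), (["b"], {1, 3}), (["d"], {4})]
    print(merge_intersecting(level))
    # [(['a', 'b', 'c'], {1, 2, 3}), (['d'], {4})]   -- matching the diagram above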
“density amplification”

  Example (from the figure): an array over versions v0 ... v5 in which
  live(v0) = 2 has density = 2/11. A version split produces:

     split 1:   live(v0) = 2, live(v5) = 4
                density = 2/4

     split 2:   live(v4) = 2, live(v1) = 3, live(v2) = 3, live(v3) = 3
                density = 2/7
From the paper:

  If (A, V) also satisfies (L-live) then every split of it does (since all
  live elements are included), and likewise for (L-edge). It follows that
  version splitting (A, V) – which necessarily has no promotable versions –
  results in a set of arrays all of which satisfy all of the L-* conditions
  necessary to stay at level l.

  The main result of this process is the following.

  Lemma 3 (Promotion). The fraction of lead elements over all output arrays
  after a version split is ≥ 1/39.

  Proof (start). First, we claim that under the same conditions as the
  version split lemma, if in addition |A| < 2M and live(v) ≥ M/3 for all v,
  then the number of output strata is at most 13. Consider the arrays which
  obey the lead fraction constraint. Each has size at least M/3, since at
  least one version is live in it, and at least half of the array is lead,
  so at least M/6 lead keys. The total number of lead keys in the array A
  is ≤ 2M, since the array itself is no larger than this; it follows that
  there can be no more than ...
Update bound

  Theorem 1. The stratified doubling array performs updates to a leaf
  version v in a cache-oblivious O(log Nv / B) amortized I/Os.

  •  Not possible to use the basic amortized method (some elements are in
     many arrays; some elements are merged many times)

  •  Idea: charge the cost of merges/splits to lead elements only
     •  (k,v) appears as lead in exactly 1 array -> always N total lead
     •  each lead element receives $c/B on promotion
     •  total charge for version v is O(log Nv / B)

From the paper:

  On snapshot or clone of version v to a new descendant version v', v' is
  registered for each array A which is currently registered to the parent
  of v. This does not require any I/Os.

  ... there is a version split of (A, V), say (Ai, Vi) for i = 1 ... n,
  such that each array satisfies (L-dense) and (L-size) for level l, and
  there is at most one index i for which lead(Ai) < |Ai|/2.

  Proof of Theorem 1 (start). Assume we have at our disposal a memory
  buffer of size at least B (recall that B is not known to the algorithm).
  Then each array that is involved in a disk merge has size at least B, so
  a merge of some number of arrays of total size k elements costs O(k/B)
  I/Os. In the COLA [5], each element exists in exactly one array and may
  participate in O(log N) merges, which immediately gives the desired
  amortized bound. In the scheme described here, elements may exist in many
  arrays, and elements may participate in many merges at the same level
  (e.g. when an array at level l is version split and some subarrays remain
  at level l after the version split). Nevertheless, we shall prove the
  theorem ...
Does it work?

  Insert rate, as a function of dictionary size
    (y-axis, log scale: inserts per second, 100 to 1e+06;
     x-axis: keys (millions), 1 to 10;
     Stratified B-tree vs CoW B-tree: roughly 3 orders of magnitude apart)

  Range rate, as a function of dictionary size
    (y-axis, log scale: reads per second, 10000 to 1e+09;
     x-axis: keys (millions), 1 to 10;
     Stratified B-tree vs CoW B-tree: roughly 1 order of magnitude apart)
bitbucket.org/acunu
                                          www.acunu.com/download




Apache, Apache Cassandra, Cassandra, Hadoop, and the eye and
elephant logos are trademarks of the Apache Software Foundation.

2011.06.20 stratified-btree

  • 1. Big problems, Massive data Stratified B-trees
  • 2. Versioned dictionaries • put(k,ver,data) • get(k_start,k_end,ver) • clone(v): create a child of v that inherits the latest version of its keys [figure: version timeline v10 (Monday 12:00) -> v11 (Monday 16:00) -> v12 (now)]
  • 3. Versioned dictionaries • put(k,ver,data) • get(k_start,k_end,ver) • clone(v): create a child of v that inherits the latest version of its keys [same figure] This talk: a versioned dictionary with fast updates, and optimal space/query/update tradeoffs
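To pin down what these three operations are supposed to mean, here is a deliberately naive in-memory sketch in Python. It models only the semantics of put/get/clone from the slide, not any of the space or update bounds the talk is about; the class and field names are mine, not from the talk.

```python
class VersionedDict:
    """Naive in-memory model of the put/get/clone interface on this slide.
    It makes no attempt at the space or update bounds discussed later;
    it only pins down the semantics. Version 0 is the pre-existing root."""

    def __init__(self):
        self.parent = {}          # version -> parent version
        self.writes = {0: {}}     # version -> {key: value} written *at* that version
        self.next_version = 1

    def put(self, key, ver, value):
        self.writes[ver][key] = value

    def clone(self, ver):
        """Create a child of `ver` that inherits the latest version of its keys."""
        child = self.next_version
        self.next_version += 1
        self.parent[child] = ver
        self.writes[child] = {}
        return child

    def get(self, k_start, k_end, ver):
        """Range query [k_start, k_end] at `ver`: walk towards the root,
        so the nearest ancestor's write for each key wins."""
        out, v = {}, ver
        while True:
            for k, val in self.writes[v].items():
                if k_start <= k <= k_end and k not in out:
                    out[k] = val
            if v not in self.parent:
                return dict(sorted(out.items()))
            v = self.parent[v]
```

For example, after d.put("k1", 0, "a"); v1 = d.clone(0); d.put("k1", v1, "b"), a get at version 0 still sees "a" for k1 while the clone sees "b".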
  • 4. Why? • Powerful: cloning, time-travel, cache and space-efficiency, ... • Give developers a recent branch of live dataset • Expose different views of same base dataset [figure: a clone v13 alongside v12 -- run analytics/tests/etc on this clone, without performance impact]
  • 5. State of the art: copy-on-write Used in ZFS, WAFL, Btrfs, ... Apply path-copying [DSST] to the B-tree
  • 6. State of the art: copy-on-write Used in ZFS, WAFL, Btrfs, ... Apply path-copying [DSST] to the B-tree Problems: • Space blowup: Each update may rewrite an entire path • Slow updates: as above • Needs random IO to scale • Concurrency is tricky
  • 7. State of the art: copy-on-write Used in ZFS, WAFL, Btrfs, ... Apply path-copying [DSST] to the B-tree Problems: • Space blowup: Each update may rewrite an entire path • Slow updates: as above • Needs random IO to scale • Concurrency is tricky A log file system makes updates sequential, but relies on garbage collection (achilles heel!)
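To make the "each update may rewrite an entire path" problem concrete, here is a minimal sketch of path copying [DSST] on a plain binary search tree (the systems above apply it to a B-tree; a BST keeps the sketch short, and the names are mine):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Node:
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def insert(root: Optional[Node], key: int, value: str) -> Node:
    """Path copying: every node on the root-to-key path is copied into the
    new version; everything off the path is shared with the old version."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value, insert(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value, root.left, insert(root.right, key, value))
    return Node(key, value, root.left, root.right)   # overwrite in the new version

v1 = insert(insert(insert(None, 5, "e"), 2, "b"), 8, "h")
v2 = insert(v1, 2, "B")        # v1 is untouched; v2 re-created only the path 5 -> 2
assert v1.right is v2.right    # the untouched subtree is shared
```

Every new version copies O(depth) nodes; on disk each copied node is a fresh block, which is where the random I/O and the space blowup called out on this slide come from.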
  • 8. CoW B-tree [ZFS,WAFL,Btrfs,..]: Update: O(log_B Nv) random IOs (~ log(2^30)/log 10000 = 3 IOs/update); Range query (size Z): O(Z/B) random; Space: O(N B log_B Nv). Nv = #keys live (accessible) at version v; B = “block size”, say 1MB at 100 bytes/entry = 10000 entries; complication: B is asymmetric for flash..
  • 9. CoW B-tree [ZFS,WAFL,Btrfs,..] vs. this talk: Update: O(log_B Nv) random IOs (~ log(2^30)/log 10000 = 3 IOs/update) vs. O((log Nv)/B) cache-oblivious IOs (~ log(2^30)/10000 = 0.003 IOs/update -- important for flash); Range query (size Z): O(Z/B) random vs. O(Z/B) sequential; Space: O(N B log_B Nv) vs. O(N). Nv = #keys live (accessible) at version v; B = “block size”, say 1MB at 100 bytes/entry = 10000 entries; complication: B is asymmetric for flash..
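The headline numbers on these two slides can be reproduced directly with the footnote's assumptions (N = 2^30 keys, B = 10000 entries per block):

```python
import math

N = 2 ** 30      # keys
B = 10_000       # entries per "block": 1 MB at ~100 bytes/entry

cow_update = math.log(N) / math.log(B)   # O(log_B Nv) random IOs per update
this_talk  = math.log2(N) / B            # O((log Nv) / B) IOs per update

print(round(cow_update, 2))   # ~2.26, i.e. roughly the 3 random IOs quoted
print(this_talk)              # 0.003
```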
  • 11. Doubling Array Inserts Buffer arrays in memory until we have > B of them
  • 12. Doubling Array Inserts 2 Buffer arrays in memory until we have > B of them
  • 13. Doubling Array Inserts 2 9 Buffer arrays in memory until we have > B of them
  • 14. Doubling Array Inserts 2 9 Buffer arrays in memory until we have > B of them
  • 15. Doubling Array Inserts 2 9 Buffer arrays in memory until we have > B of them
  • 16. Doubling Array Inserts 2 9
  • 17. Doubling Array Inserts 11 2 9
  • 18. Doubling Array Inserts 11 2 9 8
  • 19. Doubling Array Inserts 2 9 8 11
  • 20. Doubling Array Inserts 2 9 8 11
  • 21. Doubling Array Inserts 2 8 9 11 etc... Similar to log-structured merge trees (LSM), cache-oblivious lookahead array (COLA), ... O(log N) “levels”, each element is rewritten once per level O((log N) / B) IOs
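A minimal, unversioned sketch of the doubling-array / COLA idea built up over the last few slides; the class name and details are mine. Level i holds either nothing or one sorted run of 2^i keys, and an insert cascades merges upward, so every element is rewritten at most once per level:

```python
import heapq

class DoublingArray:
    """Minimal sketch of an (unversioned) doubling array: level i is either
    empty or one sorted run of 2^i keys; inserting cascades merges upward."""

    def __init__(self):
        self.levels = []                 # levels[i] is [] or a sorted list of 2^i keys

    def insert(self, key):
        carry = [key]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append([])
            if not self.levels[i]:
                self.levels[i] = carry   # empty slot: drop the merged run here
                return
            # slot occupied: merge the two sorted runs and carry to the next level
            carry = list(heapq.merge(self.levels[i], carry))
            self.levels[i] = []
            i += 1

da = DoublingArray()
for k in [2, 9, 11, 8]:
    da.insert(k)
# da.levels == [[], [], [2, 8, 9, 11]] -- the four keys end up as one run at level 2
```

Because all writes are sequential merges of whole runs, the amortized cost per insert is O((log N)/B) block transfers, as on the slide.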
  • 22. Doubling Array Queries
  • 23. Doubling Array Queries • Add an index to each array to do lookups
  • 24. Doubling Array Queries query(k) • Add an index to each array to do lookups • query(k) searches each array independently
  • 25. Doubling Array Queries query(k) • Bloom Filters can help exclude arrays from search • ... but don’t help with range queries
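A point query over the sketch above simply searches each non-empty level independently, as on these slides; a per-level Bloom filter could skip levels that cannot contain the key, but (as the slide notes) it would not help range queries. The function below is illustrative only:

```python
import bisect

def lookup(levels, key):
    """Search every level's sorted run independently, newest (smallest) first.
    A per-level Bloom filter could let us skip levels that cannot contain
    `key`, but it would not help a range query."""
    for i, run in enumerate(levels):
        j = bisect.bisect_left(run, key)
        if j < len(run) and run[j] == key:
            return i, j          # found at level i, offset j
    return None
```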
  • 27. Fractional Cascading • Fractional Cascading: Use information from search at level l to help search at level l+1 • From each array, sample every 4th element and put a pointer to it in previous level
  • 28. Fractional Cascading found entry • Fractional Cascading: Use information from search at level l to help search at level l+1 • From each array, sample every 4th element and put a pointer to it in previous level
  • 29. Fractional Cascading found entry ‘forward pointers’ give bounds for search in next array • Fractional Cascading: Use information from search at level l to help search at level l+1 • From each array, sample every 4th element and put a pointer to it in previous level
  • 37. Fractional Cascading • In case you might get unlucky with the sampling...
  • 38. Fractional Cascading • In case you might get unlucky with the sampling... • ... add regular ‘secondary’ pointers to nearest FP above and below
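Below is a self-contained, illustrative sketch of the sampling just described: every 4th element of level l+1 is copied into level l as a "forward pointer", and a lookup uses the nearest pointer at or before its position to bound where the search resumes in the next level. It handles plain keys only, ignores versions, and omits the "secondary" pointers of the previous slide; all names are mine.

```python
import bisect

def build_cascade(levels, stride=4):
    """levels: sorted key lists, smallest/newest level first.
    Returns augmented levels as (keys, fwd): fwd[i] is an index into the next
    augmented level if keys[i] is a sampled forward pointer, else -1."""
    aug = [(list(levels[-1]), [-1] * len(levels[-1]))]
    for lvl in reversed(levels[:-1]):
        below_keys, _ = aug[0]
        entries = [(k, -1) for k in lvl]
        # sample every `stride`-th key of the level below, pointing at the
        # leftmost occurrence of that key down there
        entries += [(below_keys[j], bisect.bisect_left(below_keys, below_keys[j]))
                    for j in range(0, len(below_keys), stride)]
        entries.sort()               # real keys sort before equal samples (-1 < ptr)
        aug.insert(0, ([k for k, _ in entries], [p for _, p in entries]))
    return aug

def lookup(aug, key):
    """Search level by level; the forward pointers bound where the search
    resumes in the next level instead of restarting from scratch."""
    lo = 0
    for depth, (keys, fwd) in enumerate(aug):
        i = bisect.bisect_left(keys, key, lo)
        if i < len(keys) and keys[i] == key:
            if fwd[i] == -1:
                return depth, i                  # a real entry at this level
            lo = fwd[i]                          # an exact sample: follow it down
            continue
        j = i - 1                                # nearest sample at or before `key`
        while j >= lo and fwd[j] == -1:
            j -= 1
        lo = fwd[j] if j >= lo else 0            # no sample in the window: no bound
    return None

levels = [[20, 50], [5, 20, 35, 60, 80],
          [1, 3, 5, 10, 20, 30, 35, 40, 50, 60, 70, 80, 90]]
aug = build_cascade(levels)
print(lookup(aug, 35))   # (1, 4): found at level 1
print(lookup(aug, 90))   # (2, 12): only the last level holds 90
```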
  • 40. Adding versions: [figure: version 1 holds keys k1 ... k13 in one sorted array; version tree v1 -> v2] if layout is good for v1 ...
  • 41. Adding versions: [figure: version 2 writes k6] if layout is good for v1 ... then it's bad for v2
  • 42. Adding versions: if you try to keep all versions of a key close... [figure: v2's k6 stored next to v1's k6 inside v1's array]
  • 43. Adding versions: ... then it's bad for all versions [figure: many copies of k6, one per version 2, 3, 4, ..., pushing k7 ... k13 away]
  • 44. Density: [figure: a versioned array tagged with version set W = {v1, v2, v3}, its version tree (v0, v4, v5, v1, v2, v3), and its on-disk layout: (k0,v0,x) (k1,v0,x) (k1,v2,x) (k2,v1,x) (k2,v2,x) (k2,v3,x) (k3,v1,x) (k3,v2,x)] • Arrays are tagged with a version set W
  • 45. Density: [same figure: live(v1) = live(v2) = live(v3) = 4, so density = 4/8] • f(A,v) = (#elements in A live at version v) / |A| • density(A,W) = min{w in W} f(A,w)
  • 46. Density: [same figure] • We say the array (A,W) is dense if density ≥ 1/5 • Tradeoff: high density means good range queries, but many duplicates (imagine density 1 and density 1/N)
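The density definition is mechanical enough to sketch directly. The helpers below use hypothetical names: an entry (key, w) of an array counts as live at v if w lies on v's root path and no other entry for the same key was written closer to v. The version tree used in the example is my reading of the figure (it reproduces the live counts and the 4/8 density from the notes).

```python
def root_path(v, parent):
    """Versions from v up to the root, closest first."""
    path = [v]
    while v in parent:
        v = parent[v]
        path.append(v)
    return path

def live_count(entries, v, parent):
    """Number of entries of the array that are live at version v
    (one per key: the entry written closest to v on v's root path)."""
    depth = {u: i for i, u in enumerate(root_path(v, parent))}
    closest = {}
    for key, w in entries:
        if w in depth and (key not in closest or depth[w] < depth[closest[key]]):
            closest[key] = w
    return len(closest)

def density(entries, W, parent):
    """density(A, W) = min over w in W of live(w) / |A|; dense means >= 1/5."""
    return min(live_count(entries, w, parent) for w in W) / len(entries)

# An assumed version tree consistent with the worked example in the notes
# (v0 -> v1 -> {v2, v3}; the figure's v4 and v5 are omitted here):
parent = {"v1": "v0", "v2": "v1", "v3": "v1"}
A = [("k0", "v0"), ("k1", "v0"), ("k1", "v2"), ("k2", "v1"),
     ("k2", "v2"), ("k2", "v3"), ("k3", "v1"), ("k3", "v2")]
print(density(A, ["v1", "v2", "v3"], parent))   # 0.5, i.e. the 4/8 from the slide
```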
  • 47. Range queries: [paper excerpt: Theorem 2 -- a range query at version v costs O(log Nv + Z/B) amortized I/Os, matching the optimal bound; for much smaller range queries the worst case may be the same as a point query, so the proof amortizes the cost l(k,v) of lookup(k,v) over all keys live at v, i.e. it bounds the sum over keys k of l(k,v)/Nv] • imagine scanning over each accessible array • density => trivially true for large ('voluminous') range queries • for point queries: amortize over all k for a fixed version v; each query examines disjoint regions of the array; density implies total size examined = O(Nv log Nv)
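For intuition, here is a hedged sketch of "scanning over each accessible array": a range query at version v scans every array over the query interval and keeps, per key, the entry written closest to v on its root path. It deliberately ignores the per-array indexes, Bloom filters and fractional cascading from earlier slides, and the names are mine.

```python
import bisect

def range_query(arrays, k_lo, k_hi, v, parent):
    """arrays: sorted lists of (key, write_version, value), newest array first.
    Returns {key: value} as visible at version v for keys in [k_lo, k_hi]."""
    path = [v]
    while v in parent:
        v = parent[v]
        path.append(v)
    depth = {u: i for i, u in enumerate(path)}   # distance from the query version

    best = {}                                    # key -> (distance, value)
    for arr in arrays:
        i = bisect.bisect_left(arr, (k_lo,))     # first entry with key >= k_lo
        while i < len(arr) and arr[i][0] <= k_hi:
            key, w, value = arr[i]
            if w in depth and (key not in best or depth[w] < best[key][0]):
                best[key] = (depth[w], value)
            i += 1
    return {k: val for k, (_, val) in best.items()}
```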
  • 48. Don’t worry, stay dense! • Version sets disjoint at each level -- lookups examine one array/level • merge arrays with intersecting version sets • the result of a merge might not be dense • Answer: density amplification! [figure: promote / merge / density amplification / demote acting on arrays tagged with version sets {1,2}, {2,3}, {1,2,3}, {1,3}, {1}, {4}]
  • 49. “Density amplification” [worked example figure: the merged array has density 2/11 < 1/5, so it is not dense; split 1 = (A1, {v0, v5}) has size 4 and density 2/4; split 2 = (A2, {v4, v1, v2, v3}) has size 7 and density 2/7; both splits have size < 8 and density ≥ 1/5, so they can remain at the current level]
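The acceptance test behind this worked example is small enough to state as code. The size bound 2^(l+1), the level l = 2 and the density threshold 1/5 are taken from the slides and the editor's notes; the live counts for the merged array are my reading of the figure, and the function name is hypothetical.

```python
def stays_at_level(size, live_counts, level, min_density=1 / 5):
    """A split (A_i, V_i) may stay at level l if |A_i| < 2^(l+1) and
    min_v live(v) / |A_i| >= 1/5 (i.e. it is dense)."""
    return size < 2 ** (level + 1) and min(live_counts) / size >= min_density

# The slide's example, at level l = 2 (size bound 2^(l+1) = 8):
print(stays_at_level(11, [2, 4, 2, 3, 3, 3], level=2))  # merged: False (size 11 >= 8, density 2/11 < 1/5)
print(stays_at_level(4, [2, 4], level=2))               # split 1 {v0, v5}: True (density 1/2)
print(stays_at_level(7, [2, 3, 3, 3], level=2))         # split 2 {v4, v1, v2, v3}: True (density 2/7)
```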
  • 50. “Density amplification” (paper excerpt behind the figure): If (A, V) also satisfies (L-live) then every split of it does (since all live elements are included), and likewise for (L-edge). It follows that version splitting (A, V), which necessarily has no promotable versions, results in a set of arrays all of which satisfy all of the L-* conditions necessary to stay at level l. The main result of this process is Lemma 3 (Promotion): the fraction of lead elements over all output arrays after a version split is ≥ 1/39. Proof sketch: under the same conditions as the version split lemma, if in addition |A| < 2M and live(v) >= M/3 for all v, then the number of output strata is at most 13; each output array obeying the lead-fraction constraint has size at least M/3 (since at least one version is live in it) and at least half of it is lead, so at least M/6 lead keys, while the total number of lead keys in the array A is ≤ 2M, since the array itself is no larger than this; it follows that there can be no more than ...
  • 51. Update bound (paper excerpt): On snapshot or clone of version v to a new descendant version v', v' is registered for each array A which is currently registered to the parent of v; this does not require any I/Os. Theorem 1: the stratified doubling array performs updates to a leaf version v in cache-oblivious O(log Nv / B) amortized I/Os. Proof idea: assume a memory buffer of size at least B (recall that B is not known to the algorithm); then each array involved in a disk merge has size at least B, so a merge of arrays of total size k elements costs O(k/B) I/Os. In the COLA [5], each element exists in exactly one array and may participate in O(log N) merges, which immediately gives the desired amortized bound. In the scheme described here, elements may exist in many arrays and may participate in many merges at the same level (e.g. when an array at level l is version split and some subarrays remain at level l after the version split); nevertheless, the theorem can still be proved.
  • 52. Update bound (same paper excerpt, with the argument summarised in bullets): • Not possible to use the basic amortized method (some elements are in many arrays; some elements are merged many times) • Idea: charge the cost of merges/splits to lead elements only • (k,v) appears as lead in exactly 1 array -> always N total lead entries • each lead element receives $c/B on promotion • total charge for version v is O(log Nv / B)
  • 53. Update bound (continued): the excerpt adds the version split lemma (there is a version split of (A, V), say (Ai, Vi) for i = 1 ... n, such that each array satisfies (L-dense) and (L-size) for level l, and there is at most one index i for which lead(Ai) < |Ai|/2), Lemma 3 (Promotion), and a line of pseudocode ("9: return [split(r)]"). The bullets repeat the charging argument of the previous slide: charge merges/splits to lead elements only; each lead element receives $c/B on promotion; total charge for version v is O(log Nv / B).
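Putting the last two slides together, the charging argument has the following back-of-the-envelope shape (c is the constant hidden in the O-notation; the 1/39 is Lemma 3):

$$ \text{amortized I/Os per update at leaf version } v \;\le\; \underbrace{O(\log N_v)}_{\text{levels a lead element is promoted through}} \cdot \underbrace{c/B}_{\text{charge per promotion}} \;=\; O\!\left(\frac{\log N_v}{B}\right), $$

since a merge of total size $k$ costs $O(k/B)$ I/Os (slide 51) and its lead elements can pay for it because they make up at least a $1/39$ fraction of any version split's output.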
  • 55. Insert rate, as a function of dictionary size: [plot: inserts per second (log scale, 100 to 1e+06) against keys (millions, 1 to 10) for the Stratified B-tree and the CoW B-tree; the Stratified B-tree is ~3 orders of magnitude faster]
  • 56. Range rate, as a function of dictionary size: [plot: reads per second (log scale, 10000 to 1e+09) against keys (millions, 1 to 10); the Stratified B-tree is ~1 order of magnitude faster]
  • 57. bitbucket.org/acunu www.acunu.com/download Apache, Apache Cassandra, Cassandra, Hadoop, and the eye and elephant logos are trademarks of the Apache Software Foundation.

Editor's Notes

  11. LolCoW. If you want to do fast updates, the CoW technique cannot help -- CoW is built around the assumption that every update can do a lookup and update reference counts.
  55. The crucial notion is density. A versioned array, a version tree and its layout on disk. Versions v1, v2, v3 are tagged, so dark entries are lead entries. The entry (k0,v0,x) is written in v0, so it is not a lead entry, but it is live at v1, v2 and v3. Similarly, (k1,v0,x) is live at v1 and v3 (since it was not overwritten there) but not at v2. The live counts are as follows: live(v1) = 4, live(v2) = 4, live(v3) = 4, so density = 4/8. In practice, the on-disk layout can be compressed by writing the key once for all the versions, and other well-known techniques.
  59. Example of density amplification. The merged array has density $\frac{2}{11} < \frac{1}{5}$, so it is not dense. We find a split into two parts: the first split $(A_{1},\{v_{0},v_{5}\})$ has size 4 and density $\frac{1}{2}$. The second split $(A_{2},\{v_{4}, v_{1}, v_{2},v_{3}\})$ has size 7 and density $\frac{2}{7}$. Both splits have size $<8$ and density $\ge \frac{1}{5}$, so they can remain at the current level. We start at the root version and greedily search for a version $v$ and some subset of its children whose split arrays can be merged into one dense array at level $l$. More precisely, letting $\mathcal{U}=\bigcup_{i} \mathcal{W'}[v_{i}]$, we search for a subset of $v$'s children $\{v_{i}\}$ such that $$|\mathrm{split}(\mathcal{A'},\mathcal{U})| < 2^{l+1}.$$ If no such set exists at $v$, we recurse into the child $v_{i}$ maximizing $|\mathrm{split}(\mathcal{A'}, \mathcal{W'}[v_{i}])|$. It is possible to show that this always finds a dense split. Once such a set $\mathcal{U}$ is identified, the corresponding array is written out, and we recurse on the remainder $\mathrm{split}(\mathcal{A'}, \mathcal{W'} \setminus \mathcal{U})$. Figure \ref{fig:split} gives an example of density amplification.
  65. The plot shows range query performance (elements/s extracted using range queries of size 1000). The CoW B-tree is limited by random IO here ((100/s * 32KB) / (200 bytes/key) = 16384 keys/s), but the Stratified B-tree is CPU-bound (OCaml is single-threaded). Preliminary performance results from a highly-concurrent in-kernel implementation suggest that well over 500k updates/s are possible with 16 cores.