CASSANDRA DOES WHAT?
   CODE MANIA 2012
  Aaron Morton, Apache Cassandra Committer
               @aaronmorton
           www.thelastpickle.com




    Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
Cassandra?
Cassandra is...

              Scalable
Cassandra is...

          Distributed
Cassandra is...

     Highly Available
Cassandra uses...

    Column Families
Cassandra is...

                  Fast
Cassandra is...

                  Fun
                  (Really.)
Why Cassandra?
Why Cassandra?

             Scale
Why Cassandra?

       Operations
Why Cassandra?

       Data Model
Today.

         Cluster
         Data Model
           Node
Cluster.

 Store the ‘foo’ row.
Store ‘foo’.
                                Node 1 - 'foo'




               Node 4 - 'foo'                    Node 2 - 'foo'




                                Node 3 - 'foo'
Cluster Capacity?

            Limited.
Replication Factor specifies the
  number of row replicas.
              (RF)
Everything is a copy.

        No Master Slave
        Replication.
Store ‘foo’ with Replication Factor 3.
                              Node 1 - 'foo'




                     Node 4                    Node 2 - 'foo'




                              Node 3 - 'foo'
Cluster Capacity?
   (Node Capacity × Number of Nodes) / Replication Factor
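A rough worked example of the capacity formula. The node size and node count below are made-up numbers for illustration, not figures from the talk.

```python
def cluster_capacity(node_capacity_gb, number_of_nodes, replication_factor):
    """Usable capacity when every row is stored replication_factor times."""
    return node_capacity_gb * number_of_nodes / replication_factor


# Hypothetical cluster: 4 nodes of 500 GB each, Replication Factor 3.
print(cluster_capacity(500, 4, 3))
# Adding nodes at the same RF adds capacity.
print(cluster_capacity(500, 6, 3))
```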
Scalable Capacity?



                 ✓
Consistent Hashing...	


 Evenly map keys to
       nodes.
Consistent Hashing...	

        Minimise key
      movements when
     nodes join or leave.
Partitioner...
     RandomPartitioner
   transforms Keys to Tokens
           using MD5.
         (Default Partitioner, there are others.)
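A minimal sketch of turning a key into a token, assuming the RandomPartitioner reads the 16 byte MD5 digest as a signed big-endian integer and takes its absolute value. The function name is hypothetical, and the tokens 10 and 90 used on the slides are simplified stand-ins for values like these.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    """MD5 the key, read the digest as a signed integer, take abs()."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


for key in (b"foo", b"fop"):
    print(key, random_partitioner_token(key))
```

Similar keys get unrelated tokens, which is what spreads rows evenly around the ring.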
Keys and Tokens?
    key     'fop'   'foo'




  token 0    10     90      99
128 Bit Unsigned Integer Token.

170,141,183,460,469,231,731,687,303,715,884,105,728
Token Ring.
                          99   0
                  'foo'            'fop'
              token: 90            token: 10
Partitioning...

   Assign a Token to
      each node.
                  (initial_token)
Token Ranges.
                                   Node 1
                                   token: 0

                            76-0               1-25




                  Node 4                              Node 2
                token: 75                             token: 25




                                   Node 3
                                   token: 50
Token Ranges.
    Node        Token   Range From   Range To
      1           0         76          0

      2          25         1           25

      3          50         26          50

      4          75         51          75
Locate Token Range.
                                              Node 1
                                              token: 0


                      'foo'
                      token: 90


                                    Node 4                Node 2
                                  token: 75               token: 25




                                              Node 3
                                              token: 50
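Locating the token range can be sketched as a search over the sorted node tokens. This uses the simplified 0-99 ring from the slides; the names RING and owner are hypothetical.

```python
from bisect import bisect_left

# Ring from the slides: node name keyed by its token (0-99 token space).
RING = {0: "Node 1", 25: "Node 2", 50: "Node 3", 75: "Node 4"}


def owner(token, ring=RING):
    """Return the node whose range (previous token, own token] holds token."""
    tokens = sorted(ring)
    index = bisect_left(tokens, token)  # first node token >= token
    if index == len(tokens):            # past the last token: wrap to the first node
        index = 0
    return ring[tokens[index]]


print(owner(90))  # 'foo' has token 90, which falls in the range 76-0
```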
Replication Strategy selects
Replication Factor number of
      nodes for a row.
SimpleStrategy selects
 nodes by Token Order.
    (Non default, there are others.)
SimpleStrategy with RF 3.
                                          Node 1
                                          token: 0


                  'foo'
                  token: 90


                                Node 4                Node 2
                              token: 75               token: 25




                                          Node 3
                                          token: 50
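A sketch of replica selection by token order, again on the simplified 0-99 ring from the slides. The function name is hypothetical and racks and Data Centres are ignored.

```python
from bisect import bisect_left

RING_TOKENS = [0, 25, 50, 75]
NODES = ["Node 1", "Node 2", "Node 3", "Node 4"]


def simple_strategy_replicas(token, replication_factor):
    """Start at the node that owns the token, then walk the ring clockwise."""
    start = bisect_left(RING_TOKENS, token) % len(RING_TOKENS)
    count = min(replication_factor, len(NODES))  # cannot have more replicas than nodes
    return [NODES[(start + step) % len(NODES)] for step in range(count)]


print(simple_strategy_replicas(90, 3))  # 'foo' with RF 3
```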
NetworkTopologyStrategy uses a
 Replication Factor per Data
           Centre.
            (Default.)
NetworkTopologyStrategy...

    Stripes replicas
     across racks.
Multi DC Replication with RF 3 and RF 2.
                          Node 1                              Node 1
                          token: 0                            token: 1


  'foo'
  token: 90


                Node 4    West DC     Node 2        Node 4    East DC     Node 2
              token: 75               token: 25   token: 76               token: 26




                          Node 3                              Node 3
                          token: 50                           token: 51
The Snitch knows which Data
 Centre and rack contains a
            Node.
SimpleSnitch.
 Places all nodes in the same
        DC and rack.
          (Default, there are others.)
PropertyFileSnitch.
   Places nodes in multiple
      DCs and racks using
         configuration.
             (There are others.)
Ec2Snitch.
  Places nodes in a DC using
  the AWS Region and a rack
    using Availability Zone.
             (There are others.)
DynamicSnitch.
Re-orders nodes according to
their observed performance.
           (Wraps other snitch.)
Clients connect to
 any node in the
      cluster.
Coordinator handles
  a request for a
       client.
The Client and the Coordinator.
                                            Node 1
                                            token: 0


                    'foo'
                    token: 90


                                  Node 4                Node 2
                                token: 75               token: 25




                                            Node 3
                    Client
                                            token: 50
Nodes Gossip about
  other nodes.
Gossip?
Nodes share information with
a small number of neighbours.
Who share information with...
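A toy simulation of how information spreads. It assumes each node swaps everything it knows with one randomly chosen peer per round, which is a simplification and not Cassandra's actual gossip implementation. All names are hypothetical.

```python
import random


def gossip_round(knowledge, fanout, rng):
    """Each node exchanges what it knows with `fanout` randomly chosen peers."""
    nodes = list(knowledge)
    for node in nodes:
        others = [n for n in nodes if n != node]
        for peer in rng.sample(others, min(fanout, len(others))):
            shared = knowledge[node] | knowledge[peer]
            knowledge[node] = shared
            knowledge[peer] = set(shared)


def rounds_until_converged(node_count, fanout=1, seed=0, max_rounds=1000):
    """Count rounds until every node knows about every other node."""
    rng = random.Random(seed)
    knowledge = {n: {n} for n in range(node_count)}  # each node starts knowing itself
    everyone = set(range(node_count))
    for round_number in range(1, max_rounds + 1):
        gossip_round(knowledge, fanout, rng)
        if all(known == everyone for known in knowledge.values()):
            return round_number
    return None  # did not converge within max_rounds


print(rounds_until_converged(8))
```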
Scalable Throughput?


            ✓
Distributed?


               ✓
Node Down
   (oh noes)
Node Down.
                                     Node 1
                                     token: 0


             'foo'
             token: 90


                           Node 4                Node 2
                         token: 75               token: 25




                                     Node 3
             Client
                                     token: 50
Client specified
Consistency Level.
Consistency Level...

   Any*, One, Two,
       Three,
Consistency Level...
          QUORUM,
       LOCAL_QUORUM,
       EACH_QUORUM*
Quorum?
     floor(RF / 2) +1
QUORUM at Replication Factor...
   Replication Factor   2 or 3   4 or 5   6 or 7
   QUORUM                 2        3        4
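The QUORUM formula as a one-line function, checked against the table for Replication Factor 2 through 7.

```python
def quorum(replication_factor):
    """floor(RF / 2) + 1"""
    return replication_factor // 2 + 1


for rf in range(2, 8):
    print(rf, quorum(rf))
```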
UnavailableException
TimedOutException
Node Down with Hinted Handoff.
                                          Node 1
                                          'foo'


                  'foo'
                  token: 90


                               Node 4              Node 2
                           'foo' for #3            'foo'




                                          Node 3
                  Client
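A sketch of what the Coordinator decides when a replica is down. The exception class here is a local stand-in named after the one on the slide, the function name is hypothetical, and which node stores the hint is left out of the sketch.

```python
class UnavailableException(Exception):
    """Too few replicas are alive to satisfy the Consistency Level."""


def coordinate_write(row_key, replicas, live_nodes, required_acks):
    """Return (nodes to write to now, hints to replay later)."""
    live = [node for node in replicas if node in live_nodes]
    if len(live) < required_acks:
        # Fail before writing anything: the Consistency Level cannot be met.
        raise UnavailableException(
            f"need {required_acks} replicas, only {len(live)} alive")
    # One hint per down replica, replayed when that node comes back.
    hints = [(node, row_key) for node in replicas if node not in live_nodes]
    return live, hints


replicas = ["Node 1", "Node 2", "Node 3"]
live_nodes = {"Node 1", "Node 2", "Node 4"}  # Node 3 is down
print(coordinate_write("foo", replicas, live_nodes, required_acks=2))
```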
Cluster.

 Read the ‘foo’ row.
Read ‘foo’.
                                      Node 1
                                      token: 0


              'foo'
              token: 90


                            Node 4                Node 2
                          token: 75               token: 25




                                      Node 3
              Client
                                      token: 50
Consistency Level
  nodes must
   respond.
Read ‘foo’ at QUORUM.
                                       Node 1
                                       'foo'


                  'foo'
                  token: 90


                              Node 4            Node 2
                                                'foo'




                                       Node 3
                  Client
Consistency Level
nodes must agree.
Digests used to
detect differences.
Timestamps used to
resolve differences.
Differences in the ‘foo’ row.
    Column        Node 1               Node 2               Node 3
    purple        cromulent (ts 10)    cromulent (ts 10)    <missing>
    monkey        embiggens (ts 10)    embiggens (ts 10)    debigulator (ts 5)
    dishwasher    tomato (ts 10)       tomato (ts 10)       tomacco (ts 15)
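Resolving the differences in the ‘foo’ row can be sketched as "highest timestamp wins" per column. The function name is hypothetical, and tie-breaking between equal timestamps is not handled here.

```python
def reconcile(*responses):
    """Merge per-node responses; for each column the highest timestamp wins."""
    merged = {}
    for response in responses:
        for column, (value, timestamp) in response.items():
            if column not in merged or timestamp > merged[column][1]:
                merged[column] = (value, timestamp)
    return merged


node_1 = {"purple": ("cromulent", 10),
          "monkey": ("embiggens", 10),
          "dishwasher": ("tomato", 10)}
node_2 = dict(node_1)
node_3 = {"monkey": ("debigulator", 5),   # purple is missing on Node 3
          "dishwasher": ("tomacco", 15)}

print(reconcile(node_1, node_2, node_3))
```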
Consistent Read.
                    Node 1                                           Node 1



                   cromulent


                             cromulent
          Node 4                         Node 2            Node 4               Node 2

                   <empty>                                          cromulent
                                                      cromulent




 Client                                           Client
                    Node 3                                           Node 3
Read Repair is active
  on a fraction of
     requests.
       (10% by default)
QUORUM with and without Read Repair.
                  Node 1                              Node 1




         Node 4            Node 2            Node 4            Node 2




                  Node 3                              Node 3
Client                              Client
I can haz Consistency ?

           R +W > N
  (#Read Nodes + #Write Nodes > Replication Factor)
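The R + W > N rule as a small check. The function name is hypothetical; the example values use QUORUM = 2 at Replication Factor 3.

```python
def read_sees_latest_write(read_nodes, write_nodes, replication_factor):
    """True when every read set must overlap every write set by at least one node."""
    return read_nodes + write_nodes > replication_factor


print(read_sees_latest_write(2, 2, 3))  # QUORUM reads and QUORUM writes at RF 3
print(read_sees_latest_write(1, 1, 3))  # ONE reads and ONE writes at RF 3
```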
Anti Entropy...

 Hash key ranges on
  each node using
   Merkle Trees.
Anti Entropy...

  Stream differences
   between nodes.
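A minimal Merkle Tree sketch for finding which key ranges differ between two nodes. It assumes both nodes hash the same power-of-two number of ranges and uses SHA-256 for convenience; the function names and sample payloads are hypothetical.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(range_payloads):
    """Return the tree as a list of levels: leaves first, root last."""
    level = [_hash(payload) for payload in range_payloads]
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its two children.
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_ranges(tree_a, tree_b):
    """Walk down from the root, descending only into subtrees whose hashes differ."""
    top = len(tree_a) - 1
    suspects = [0] if tree_a[top][0] != tree_b[top][0] else []
    for depth in range(top - 1, -1, -1):
        suspects = [child
                    for parent in suspects
                    for child in (2 * parent, 2 * parent + 1)
                    if tree_a[depth][child] != tree_b[depth][child]]
    return suspects  # indexes of the key ranges that need streaming


node_a = build_tree([b"foo:tomato", b"", b"bar:1", b"baz:2"])
node_b = build_tree([b"foo:tomacco", b"", b"bar:1", b"baz:2"])
print(differing_ranges(node_a, node_b))
```

Only the ranges returned need to be streamed; matching subtrees are skipped after a single hash comparison.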
Highly Available?


             ✓
Today.
            Cluster
         Data Model
            Node
Data Model so far.


    Row Key:   Column        Column   Column


                 (Incomplete.)
Data Model.
                           Keyspace

               Column Family   Column Family   Column Family
                  Column          Column          Column
    Row Key:      Column          Column          Column
                  Column          Column          Column


                (Excludes Super Columns.)
Rows are the unit
 of replication.
The Column Family
   is the unit of
      storage.
Row and Column
Family are the unit
   of querying.
API...
                           Mutate
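# pycassa - Python

>>> import pycassa
>>> # 'Keyspace1' is an assumed keyspace name; the pool defaults to localhost:9160.
>>> pool = pycassa.ConnectionPool('Keyspace1')
>>> col_fam = pycassa.ColumnFamily(pool, 'ColumnFamily1')

>>> col_fam.insert('row_key', {'col_name': 'col_val'})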
API...
                  Mutate
# Cassandra Query Language (CQL)

INSERT INTO ColumnFamily1 (KEY, col_name)
VALUES ('row_key', 'col_value');
API...
                     Delete
# pycassa - Python

>>> col_fam.remove('row_key')

>>> col_fam.remove('row_key', ['col_name'])
API...
                  Delete
# Cassandra Query Language (CQL)

DELETE FROM ColumnFamily1 WHERE key IN
('row_key');

DELETE col_name FROM ColumnFamily1 WHERE
key = 'row_key';
Batch Mutate saves
 on round trips.
      (It’s not a Tx.)
API...
                     Get, Multi-Get
# pycassa - Python

>>> col_fam.get('row_key')
{'col_name': 'col_val', 'col_name2': 'col_val2'}

>>> col_fam.multi_get(['row_key'], ['col_name'])
{'row_key': {'col_name': 'col_val'}}
API...
             Get, Multi-Get
# Cassandra Query Language (CQL)

SELECT * FROM ColumnFamily1;

SELECT col_name FROM ColumnFamily1 WHERE
KEY IN ('row_key');
API...
                     Get Range*
# pycassa - Python

>>> col_fam.get_range(start='row_key')
{
'row_key' : {'col_name': 'col_val'},
'row_key50': {'col_name': 'col_val'},
'row_key2': {'col_name': 'col_val'}
}
API...
               Get Range*
# Cassandra Query Language (CQL)

SELECT * FROM ColumnFamily1 WHERE KEY >=
'row_key';
Column Families?


            ✓
Today.
          Cluster
         Data Model
          Node
Optimised for
  Writes.
Write path...
  Append to Write
    Ahead Log.
  (fsync every 10s by default, other options available)
Write path...
   Merge Columns
   into Memtable.
        (Lock free, always in memory.)
Write path...
           Done.
Fast for writes?


             ✓
(Later.)
      Asynchronously flush
      Memtable to new files.
           (May be 10s or 100s of MB in size.)
Data is stored in
immutable SSTables.
      (Sorted String Table.)
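The write path and the later flush can be sketched with a toy node. Everything here lives in memory and the class is hypothetical; a real node appends the log to disk and writes SSTables as files.

```python
class ToyNode:
    def __init__(self):
        self.commit_log = []  # stand-in for the append-only Write Ahead Log
        self.memtable = {}    # row key -> {column: (value, timestamp)}
        self.sstables = []    # each flush adds one immutable, key-sorted table

    def write(self, row_key, column, value, timestamp):
        # 1. Append to the Write Ahead Log.
        self.commit_log.append((row_key, column, value, timestamp))
        # 2. Merge the column into the Memtable. Done.
        row = self.memtable.setdefault(row_key, {})
        if column not in row or timestamp > row[column][1]:
            row[column] = (value, timestamp)

    def flush(self):
        # Later: write the Memtable out as a new sorted, immutable table.
        sstable = tuple((key, tuple(sorted(self.memtable[key].items())))
                        for key in sorted(self.memtable))
        self.sstables.append(sstable)
        self.memtable = {}
        self.commit_log = []  # flushed writes no longer need replaying


node = ToyNode()
node.write("foo", "dishwasher", "tomato", 10)
node.write("foo", "purple", "cromulent", 10)
node.flush()
print(node.sstables)
```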
SSTable files.
                 *-Data.db
                 *-Index.db
                 *-Filter.db
        (Also *-Statistics.db and *-Digest.sha1)
SSTables.
         SSTable 1             SSTable 2     SSTable 3         SSTable 4        SSTable 5
   foo:                   foo:                           foo:
    dishwasher (ts 10):    frink (ts 20):                 dishwasher (ts 15):
     tomato                 flayven                         tomacco
    purple (ts 10):        monkey (ts 10):
     cromulent              embiggens
Read Path...
   Read columns from each
  SSTable, then merge results.
               (Roughly speaking.)
Read Path...
     Use Bloom Filter to
 determine if a row key does
    not exist in a SSTable.
               (In memory)
Bloom Filter says if a key is
 definitely not present, or
  present with a certain
        probability.
    (Default false positive rate is 0.0744%)
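A minimal Bloom Filter sketch showing the "definitely not present, or probably present" behaviour. Deriving the bit positions from seeded MD5 digests is an assumption made for brevity, not the hash Cassandra uses, and the sizes are arbitrary.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, hash_count=3):
        self.size_bits = size_bits
        self.hash_count = hash_count
        self.bits = [False] * size_bits

    def _positions(self, key):
        # One bit position per seeded hash of the key.
        for seed in range(self.hash_count):
            digest = hashlib.md5(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest, "big") % self.size_bits

    def add(self, key):
        for position in self._positions(key):
            self.bits[position] = True

    def might_contain(self, key):
        # False means definitely absent; True means present or a false positive.
        return all(self.bits[position] for position in self._positions(key))


sstable_filter = BloomFilter()
sstable_filter.add("foo")
print(sstable_filter.might_contain("foo"))
print(sstable_filter.might_contain("bar"))
```

A False answer lets the read skip that SSTable without touching disk.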
Read Path...
     Search for prior key in
       *-Index.db sample.
               (In memory)
Read Path...
 Scan *-Index.db from prior
key to find the search key and
     its *-Data.db offset.
               (On disk.)
Read Path...
Read *-Data.db from offset, all
 columns or specific pages.
               (Default 64KB page size.)
Read purple, monkey, dishwasher.
               Bloom Filter           Bloom Filter         Bloom Filter          Bloom Filter          Bloom Filter

  Memory      Index Sample           Index Sample         Index Sample          Index Sample          Index Sample


  Disk
            SSTable 1-Index.db     SSTable 2-Index.db   SSTable 3-Index.db    SSTable 4-Index.db    SSTable 5-Index.db


             SSTable 1-Data.db      SSTable 2-Data.db   SSTable 3-Data.db      SSTable 4-Data.db    SSTable 5-Data.db
           foo:                   foo:                                       foo:
            dishwasher (ts 10):    frink (ts 20):                             dishwasher (ts 15):
             tomato                 flayven                                     tomacco
            purple (ts 10):        monkey (ts 10):
             cromulent              embiggens
Merge SSTables.
    Column        SSTable 1            SSTable 2            SSTable 4
    purple        cromulent (ts 10)    -                    -
    monkey        -                    embiggens (ts 10)    -
    dishwasher    tomato (ts 10)       -                    tomacco (ts 15)
Key Cache caches row key
position in *-Data.db file.
  (Removes up to 1 disk seek per SSTable.)
Read with Key Cache.
               Bloom Filter           Bloom Filter         Bloom Filter          Bloom Filter          Bloom Filter

                Key Cache              Key Cache           Key Cache              Key Cache            Key Cache

  Memory      Index Sample           Index Sample         Index Sample          Index Sample          Index Sample


  Disk
            SSTable 1-Index.db     SSTable 2-Index.db   SSTable 3-Index.db    SSTable 4-Index.db    SSTable 5-Index.db


             SSTable 1-Data.db      SSTable 2-Data.db   SSTable 3-Data.db      SSTable 4-Data.db    SSTable 5-Data.db
           foo:                   foo:                                       foo:
            dishwasher (ts 10):    frink (ts 20):                             dishwasher (ts 15):
             tomato                 flayven                                     tomacco
            purple (ts 10):        monkey (ts 10):
             cromulent              embiggens
Row Cache caches entire row.
        (Removes all disk IO.)
Read with Row Cache.
                                                               Row Cache

                  Bloom Filter           Bloom Filter         Bloom Filter          Bloom Filter          Bloom Filter

                   Key Cache              Key Cache           Key Cache              Key Cache            Key Cache


     Memory       Index Sample          Index Sample         Index Sample          Index Sample          Index Sample


     Disk
               SSTable 1-Index.db     SSTable 2-Index.db   SSTable 3-Index.db   SSTable 4-Index.db     SSTable 5-Index.db


                SSTable 1-Data.db      SSTable 2-Data.db   SSTable 3-Data.db      SSTable 4-Data.db    SSTable 5-Data.db
              foo:                   foo:                                       foo:
               dishwasher (ts 10):    frink (ts 20):                             dishwasher (ts 15):
                tomato                 flayven                                      tomacco
               purple (ts 10):        monkey (ts 10):
                cromulent              embiggens
Fast for reads?


             ✓
Tombstones ensure all replicas
       see a delete.
      (Purged after 10 days, configurable.)
Merge SSTables with Tombstones.
    Column        SSTable 1            SSTable 2            SSTable 4
    purple        cromulent (ts 10)    -                    <tombstone> (ts 15)
    monkey        -                    embiggens (ts 10)    -
    dishwasher    tomato (ts 10)       -                    tomacco (ts 15)
Merge node response with Tombstones.
    Column        Node 1               Node 2               Node 3
    purple        cromulent (ts 10)    cromulent (ts 10)    <tombstone> (ts 15)
    monkey        embiggens (ts 10)    embiggens (ts 10)    debigulator (ts 5)
    dishwasher    tomato (ts 10)       tomato (ts 10)       tomacco (ts 15)
Compaction merges truth from
  multiple SSTables into one
 SSTable with the same truth.
   (Manual and continuous background process.)
Compaction.
    Column        SSTable 1            SSTable 2            SSTable 4              New
    purple        cromulent (ts 10)    -                    <tombstone> (ts 15)    <tombstone> (ts 15)
    monkey        -                    embiggens (ts 10)    -                      embiggens (ts 10)
    dishwasher    tomato (ts 10)       -                    tomacco (ts 15)        tomacco (ts 15)
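Compaction of the three SSTables above, sketched as a merge where the highest timestamp wins and Tombstones are carried into the new SSTable. The purge_tombstones_before parameter is a hypothetical stand-in for the configurable purge period, compared directly against the toy timestamps.

```python
TOMBSTONE = "<tombstone>"


def compact(sstables, purge_tombstones_before=None):
    """Merge several SSTables into one with the same truth."""
    merged = {}
    for sstable in sstables:
        for column, (value, timestamp) in sstable.items():
            if column not in merged or timestamp > merged[column][1]:
                merged[column] = (value, timestamp)
    if purge_tombstones_before is not None:
        # Drop Tombstones that are old enough to be purged.
        merged = {column: (value, timestamp)
                  for column, (value, timestamp) in merged.items()
                  if not (value == TOMBSTONE and timestamp < purge_tombstones_before)}
    return merged


sstable_1 = {"purple": ("cromulent", 10), "dishwasher": ("tomato", 10)}
sstable_2 = {"monkey": ("embiggens", 10)}
sstable_4 = {"purple": (TOMBSTONE, 15), "dishwasher": ("tomacco", 15)}

print(compact([sstable_1, sstable_2, sstable_4]))
```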
Today.

          Cluster
         Data Model
           Node
Papers.
• Cassandra - A Decentralized Structured Storage System (Lakshman et al.).
• Bigtable: A Distributed Storage System for Structured Data (Chang et al.).
• Dynamo: Amazon’s Highly Available Key-value Store (DeCandia et al.).
• Eventually Consistent (Werner Vogels).
• Epidemic algorithms for replicated database maintenance (Demers et al.).
• Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services (Gilbert et al.).
• Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the world wide web (Karger et al.).
• The φ Accrual Failure Detector (Hayashibara et al.).
Aaron Morton
                     @aaronmorton
                   www.thelastpickle.com




Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Cassandra does what? Code Mania 2012

  • 1. CASSANDRA DOES WHAT? CODE MANIA 2012 Aaron Morton, Apache Cassandra Committer @aaronmorton www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
  • 3. Cassandra is... Scalable
  • 4. Cassandra is... Distributed
  • 5. Cassandra is... Highly Available
  • 6. Cassandra uses... Column Families
  • 8. Cassandra is... Fun (Really.)
  • 10. Why Cassandra? Scale
  • 11. Why Cassandra? Operations
  • 12. Why Cassandra? Data Model
  • 13. Today. Cluster Data Model Node
  • 14. Cluster. Store the ‘foo’ row.
  • 15. Store ‘foo’. Node 1 - 'foo' Node 4 - 'foo' Node 2 - 'foo' Node 3 - 'foo'
  • 16. Cluster Capacity? Limited.
  • 17. Replication Factor specifies the number of row replicas. (RF)
  • 18. Everything is a copy. Master Slave Replication.
  • 19. Store ‘foo’ with Replication Factor 3. Node 1 - 'foo' Node 4 Node 2 - 'foo' Node 3 - 'foo'
  • 20. Cluster Capacity? (Node Capacity × Number of Nodes) / Replication Factor
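
The capacity formula on slide 20 can be worked through with made-up numbers. This is only a sketch: the function name and the 300 GB figure are illustrative assumptions, and it ignores real-world overheads such as compaction headroom.

```python
def usable_capacity(node_capacity_gb, node_count, replication_factor):
    """Approximate unique data a cluster can hold when every row is stored RF times."""
    return node_capacity_gb * node_count / replication_factor

# Four hypothetical 300 GB nodes at RF 3.
print(usable_capacity(300, 4, 3))  # 400.0
```

Adding nodes raises the numerator, which is why capacity scales with cluster size while RF stays fixed.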
  • 22. Consistent Hashing... Evenly map keys to nodes.
  • 23. Consistent Hashing... Minimise key movements when nodes join or leave.
  • 24. Partitioner... RandomPartitioner transforms Keys to Tokens using MD5. (Default Partitioner, there are others.)
  • 25. Keys and Tokens? key 'fop' 'foo' token 0 10 90 99
  • 26. 128 Bit Unsigned Integer Token. 170,141,183,460,469,231,731,687,303,715,884,105,728
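
A rough sketch of turning a key into a token with MD5. It assumes, as I understand RandomPartitioner to work, that the 16-byte digest is read as a signed big-endian integer and its absolute value is used, which gives tokens between 0 and 2**127 (the number on slide 26). The name `md5_token` is made up for illustration; the tokens 10 and 90 on the slides are toy values, not real hashes.

```python
import hashlib

MAX_TOKEN = 2 ** 127  # 170,141,183,460,469,231,731,687,303,715,884,105,728

def md5_token(key: bytes) -> int:
    """Map a row key to a token (assumption: abs of the signed MD5 value)."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))

for key in (b"foo", b"fop"):
    print(key, md5_token(key))
```

Keys that differ by one character land far apart on the ring, which is what spreads rows evenly across nodes.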
  • 27. Token Ring. [Ring diagram, tokens 0 to 99: 'fop' token: 10; 'foo' token: 90.]
  • 28. Partitioning... Assign a Token to each node. (initial_token)
  • 29. Token Ranges. [Ring diagram: Node 1 token: 0 (range 76-0); Node 2 token: 25 (range 1-25); Node 3 token: 50; Node 4 token: 75.]
  • 30. Token Ranges.
        Node  Token  Range From  Range To
        1     0      76          0
        2     25     1           25
        3     50     26          50
        4     75     51          75
  • 31. Locate Token Range. [Ring diagram: Node 1 token: 0; Node 2 token: 25; Node 3 token: 50; Node 4 token: 75; 'foo' token: 90.]
  • 32. Replication Strategy selects Replication Factor number of nodes for a row.
  • 33. SimpleStrategy selects nodes by Token Order. (Non default, there are others.)
  • 34. SimpleStrategy with RF 3. [Ring diagram: Node 1 token: 0; Node 2 token: 25; Node 3 token: 50; Node 4 token: 75; 'foo' token: 90.]
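
The token range lookup and the SimpleStrategy walk around the ring can be sketched with the toy ring from the slides. The function names are hypothetical, and this is a simplified model rather than Cassandra's implementation: a token belongs to the first node whose token is greater than or equal to it, wrapping around to Node 1, and the remaining replicas are the next nodes in token order.

```python
from bisect import bisect_left

# Toy ring from the slides: (token, node), sorted by token.
RING = [(0, "Node 1"), (25, "Node 2"), (50, "Node 3"), (75, "Node 4")]
TOKENS = [token for token, _ in RING]

def owner_index(token):
    """Index of the node whose range holds the token (range 76-0 wraps to Node 1)."""
    return bisect_left(TOKENS, token) % len(RING)

def simple_strategy_replicas(token, replication_factor):
    """Owner first, then the next nodes clockwise until RF nodes are chosen."""
    first = owner_index(token)
    count = min(replication_factor, len(RING))
    return [RING[(first + step) % len(RING)][1] for step in range(count)]

print(simple_strategy_replicas(90, 3))  # ['Node 1', 'Node 2', 'Node 3']
print(simple_strategy_replicas(10, 3))  # ['Node 2', 'Node 3', 'Node 4']
```

With 'foo' at token 90 the replicas are Nodes 1, 2 and 3, matching the RF 3 diagram on slide 19.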
  • 35. NetworkTopologyStrategy uses a Replication Factor per Data Centre. (Default.)
  • 36. NetworkTopologyStrategy... Stripes replicas across racks.
  • 37. Multi DC Replication with RF 3 and RF 2. [Two ring diagrams. West DC: Node 1 token: 0; Node 2 token: 25; Node 3 token: 50; Node 4 token: 75. East DC: Node 1 token: 1; Node 2 token: 26; Node 3 token: 51; Node 4 token: 76. 'foo' token: 90.]
  • 38. The Snitch knows which Data Centre and rack contains a Node.
  • 39. SimpleSnitch. Places all nodes in the same DC and rack. (Default, there are others.)
  • 40. PropertyFileSnitch. Places nodes in multiple DCs and racks using configuration. (There are others.)
  • 41. EC2Snitch. Places nodes in a DC using the AWS Region and a rack using Availability Zone. (There are others.)
  • 42. DynamicSnitch. Re-orders nodes according to their observed performance. (Wraps other snitch.)
  • 43. Clients connect to any node in the cluster.
  • 44. Coordinator handles a request for a client.
  • 45. The Client and the Coordinator. [Ring diagram with a Client: Node 1 token: 0; Node 2 token: 25; Node 3 token: 50; Node 4 token: 75; 'foo' token: 90.]
  • 46. Nodes Gossip about other nodes.
  • 47. Gossip? Nodes share information with a small number of neighbours. Who share information with...
  • 50. Node Down (oh noes)
  • 51. Node Down. [Ring diagram with a Client: Node 1 token: 0; Node 2 token: 25; Node 3 token: 50; Node 4 token: 75; 'foo' token: 90.]
  • 53. Consistency Level... Any*, One, Two, Three,
  • 54. Consistency Level... QUORUM, LOCAL_QUORUM, EACH_QUORUM*
  • 55. Quorum? floor(RF / 2) + 1
  • 56. QUORUM at Replication Factor...
        Replication Factor  QUORUM
        2 or 3              2
        4 or 5              3
        6 or 7              4
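
The quorum formula from slide 55 reproduces the table on slide 56. A minimal sketch; the function name is an illustrative choice.

```python
def quorum(replication_factor):
    """floor(RF / 2) + 1, using integer division for the floor."""
    return replication_factor // 2 + 1

print([quorum(rf) for rf in range(2, 8)])  # [2, 2, 3, 3, 4, 4]
```

Note that RF 2 gives a quorum of 2, so a single node down blocks QUORUM requests for the rows it replicates.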
  • 59. Node Down with Hinted Handoff. Node 1 'foo' 'foo' token: 90 Node 4 Node 2 'foo' for #3 'foo' Node 3 Client
  • 60. Cluster. Read the ‘foo’ row.
  • 61. Read ‘foo’. [Ring diagram with a Client: Node 1 token: 0; Node 2 token: 25; Node 3 token: 50; Node 4 token: 75; 'foo' token: 90.]
  • 62. Consistency Level nodes must respond.
  • 63. Read ‘foo’ at QUORUM. Node 1 'foo' 'foo' token: 90 Node 4 Node 2 'foo' Node 3 Client
  • 65. Digests used to detect differences.
  • 67. Differences in the ‘foo’ row.
        Column      Node 1                      Node 2                      Node 3
        purple      cromulent (timestamp 10)    cromulent (timestamp 10)    <missing>
        monkey      embiggens (timestamp 10)    embiggens (timestamp 10)    debigulator (timestamp 5)
        dishwasher  tomato (timestamp 10)       tomato (timestamp 10)       tomacco (timestamp 15)
  • 68. Consistent Read. Node 1 Node 1 cromulent cromulent Node 4 Node 2 Node 4 Node 2 <empty> cromulent cromulent Client Client Node 3 Node 3
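
How a coordinator could reconcile the three responses from slide 67 can be sketched as "highest timestamp wins per column". This is a simplified model under stated assumptions: the function names are hypothetical, and ties on equal timestamps are not handled here. Any replica whose response differs from the merged row is a candidate for repair.

```python
def reconcile(responses):
    """responses: {node: {column: (value, timestamp)}} -> newest cell per column."""
    merged = {}
    for columns in responses.values():
        for name, cell in columns.items():
            if name not in merged or cell[1] > merged[name][1]:
                merged[name] = cell
    return merged

def nodes_needing_repair(responses, merged):
    """Replicas whose view of the row differs from the reconciled row."""
    return sorted(node for node, columns in responses.items() if columns != merged)

responses = {
    "Node 1": {"purple": ("cromulent", 10), "monkey": ("embiggens", 10),
               "dishwasher": ("tomato", 10)},
    "Node 2": {"purple": ("cromulent", 10), "monkey": ("embiggens", 10),
               "dishwasher": ("tomato", 10)},
    "Node 3": {"monkey": ("debigulator", 5), "dishwasher": ("tomacco", 15)},
}
merged = reconcile(responses)
print(merged)
print(nodes_needing_repair(responses, merged))
```

In this example every replica is stale in some column: Nodes 1 and 2 lack the newer dishwasher value, and Node 3 lacks purple and has an old monkey value.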
  • 69. Read Repair is active on a fraction of requests. (10% by default)
  • 70. QUORUM with and without Read Repair. Node 1 Node 1 Node 4 Node 2 Node 4 Node 2 Node 3 Node 3 Client Client
  • 71. I can haz Consistency? R + W > N (#Read Nodes + #Write Nodes > Replication Factor)
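
The R + W > N rule is easy to check by hand. A small sketch with a hypothetical function name: when the read set and the write set must overlap in at least one replica, a read sees the latest acknowledged write.

```python
def read_sees_latest_write(read_nodes, write_nodes, replication_factor):
    """True when any R replicas and any W replicas must share at least one node."""
    return read_nodes + write_nodes > replication_factor

print(read_sees_latest_write(2, 2, 3))  # QUORUM reads and writes at RF 3: True
print(read_sees_latest_write(1, 1, 3))  # ONE reads and writes at RF 3: False
print(read_sees_latest_write(1, 3, 3))  # write to all three, read ONE: True
```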
  • 72. Anti Entropy... Hash key ranges on each node using Merkle Trees.
  • 73. Anti Entropy... Stream differences between nodes.
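
The Merkle tree comparison behind anti entropy can be sketched as follows. This is a toy model, not Cassandra's implementation: the four fixed token ranges, the use of SHA-256, and all function names are assumptions made for illustration. Each node hashes its rows per token range into leaves, parents hash their two children, and two trees are compared from the root down so that only mismatching ranges need to be streamed.

```python
import hashlib

LEAF_RANGES = [(0, 25), (25, 50), (50, 75), (75, 100)]  # toy token ranges

def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(rows):
    """rows maps token -> value. Returns tree levels, leaves first, root last."""
    leaves = []
    for low, high in LEAF_RANGES:
        in_range = sorted((t, v) for t, v in rows.items() if low <= t < high)
        leaves.append(_hash(repr(in_range).encode()))
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([_hash(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def differing_ranges(tree_a, tree_b):
    """Descend only into subtrees whose hashes differ; return mismatching ranges."""
    def walk(level, index):
        if tree_a[level][index] == tree_b[level][index]:
            return []
        if level == 0:
            return [LEAF_RANGES[index]]
        return walk(level - 1, 2 * index) + walk(level - 1, 2 * index + 1)
    return walk(len(tree_a) - 1, 0)

node_1 = {10: "fop-v1", 40: "bar-v1", 90: "foo-v2"}
node_3 = {10: "fop-v1", 40: "bar-v1", 90: "foo-v1"}  # stale copy of token 90
print(differing_ranges(build_tree(node_1), build_tree(node_3)))  # [(75, 100)]
```

Only the range containing token 90 is reported, so only that range would be exchanged between the two nodes.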
  • 75. Today. Cluster Data Model Node
  • 76. Data Model so far. Row Key: Column Column Column (Incomplete.)
  • 77. Data Model. [Diagram: a Keyspace contains Column Families; within a Column Family, each Row Key maps to a set of Columns.] (Excludes Super Columns.)
  • 78. Rows are the unit of replication.
  • 79. The Column Family is the unit of storage.
  • 80. Row and Column Family are the unit of querying.
  • 81. API... Mutate
        # pycassa - Python
        >>> col_fam = pycassa.ColumnFamily(pool, 'ColumnFamily1')
        >>> col_fam.insert('row_key', {'col_name': 'col_val'})
  • 82. API... Mutate
        -- Cassandra Query Language (CQL)
        INSERT INTO ColumnFamily1 (KEY, col_name) VALUES ('row_key', 'col_value');
  • 83. API... Delete
        # pycassa - Python
        >>> col_fam.remove('row_key')
        >>> col_fam.remove('row_key', ['col_name'])
  • 84. API... Delete
        -- Cassandra Query Language (CQL)
        DELETE FROM ColumnFamily1 WHERE key IN ('row_key');
        DELETE col_name FROM ColumnFamily1 WHERE key = 'row_key';
  • 85. Batch Mutate saves on round trips. (It’s not a Tx.)
  • 86. API... Get, Multi-Get
        # pycassa - Python
        >>> col_fam.get('row_key')
        {'col_name': 'col_val', 'col_name2': 'col_val2'}
        >>> col_fam.multiget(['row_key'], ['col_name'])
        {'row_key': {'col_name': 'col_val'}}
  • 87. API... Get, Multi-Get
        -- Cassandra Query Language (CQL)
        SELECT * FROM ColumnFamily1;
        SELECT col_name FROM ColumnFamily1 WHERE KEY IN ('row_key');
  • 88. API... Get Range*
        # pycassa - Python
        >>> col_fam.get_range(start='row_key')
        {
            'row_key': {'col_name': 'col_val'},
            'row_key50': {'col_name': 'col_val'},
            'row_key2': {'col_name': 'col_val'}
        }
  • 89. API... Get Range*
        -- Cassandra Query Language (CQL)
        SELECT * FROM ColumnFamily1 WHERE KEY >= 'row_key';
  • 91. Today. Cluster Data Model Node
  • 92. Optimised for Writes.
  • 93. Write path... Append to Write Ahead Log. (fsync every 10s by default, other options available)
  • 94. Write path... Merge Columns into Memtable. (Lock free, always in memory.)
  • 95. Write path... Done.
  • 97. (Later.) Asynchronously flush Memtable to new files. (May be 10’s or 100’s of MB in size.)
  • 98. Data is stored in immutable SSTables. (Sorted String table.)
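
The write path on slides 93 to 98 can be modelled in a few lines. This is an in-memory toy under stated assumptions: the class and method names are invented, nothing is written to disk or fsynced, and clearing the log on flush is a simplification. A write appends to the log and merges into the memtable; a later flush turns the memtable into a sorted, immutable SSTable.

```python
class ToyNode:
    def __init__(self):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # row_key -> {column: (value, timestamp)}
        self.sstables = []     # immutable, sorted snapshots

    def write(self, row_key, column, value, timestamp):
        """Write path: append to the log, merge into the memtable, done."""
        self.commit_log.append((row_key, column, value, timestamp))
        row = self.memtable.setdefault(row_key, {})
        current = row.get(column)
        if current is None or timestamp >= current[1]:
            row[column] = (value, timestamp)

    def flush(self):
        """Later: freeze the memtable into a new sorted SSTable."""
        if not self.memtable:
            return
        sstable = tuple(
            (row_key, tuple(sorted(columns.items())))
            for row_key, columns in sorted(self.memtable.items())
        )
        self.sstables.append(sstable)
        self.memtable = {}
        self.commit_log.clear()  # flushed writes no longer need replaying

node = ToyNode()
node.write("foo", "dishwasher", "tomato", 10)
node.write("foo", "purple", "cromulent", 10)
node.flush()
node.write("foo", "dishwasher", "tomacco", 15)
node.flush()
print(len(node.sstables))  # 2
```

The second write to dishwasher does not touch the first SSTable; the row is now spread over two files, which is why reads have to merge.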
  • 99. SSTable files. *-Data.db *-Index.db *-Filter.db (Also *-Statistics.db and *-Digest.sha1)
  • 100. SSTables. [Diagram of SSTables 1 to 5. SSTable 1: foo: dishwasher (ts 10): tomato; purple (ts 10): cromulent. SSTable 2: foo: frink (ts 20): flayven; monkey (ts 10): embiggens. SSTable 4: foo: dishwasher (ts 15): tomacco.]
  • 101. Read Path... Read columns from each SSTable, then merge results. (Roughly speaking.)
  • 102. Read Path... Use Bloom Filter to determine if a row key does not exist in a SSTable. (In memory)
  • 103. Bloom Filter says if a key is definitely not present, or present with a certain probability. (Default false positive rate is 0.0744%)
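
A minimal Bloom filter sketch shows why the answer is "definitely not present" or "probably present". The sizes, the hashing scheme and the class name are illustrative assumptions and do not reflect Cassandra's implementation or its default false positive rate.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, hash_count=3):
        self.size_bits = size_bits
        self.hash_count = hash_count
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, key):
        # Derive several bit positions by salting the key with the hash index.
        for i in range(self.hash_count):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        """False means definitely absent; True means present or a false positive."""
        return all((self.bits >> position) & 1 for position in self._positions(key))

sstable_filter = BloomFilter()
sstable_filter.add("foo")
print(sstable_filter.might_contain("foo"))  # True
```

A key that was added always answers True, so the filter can safely be used to skip SSTables on a False answer.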
  • 104. Read Path... Search for prior key in *-Index.db sample. (In memory)
  • 105. Read Path... Scan *-Index.db from prior key to find the search key and its *-Data.db offset. (On disk.)
  • 106. Read Path... Read *-Data.db from offset, all columns or specific pages. (Default 64KB page size.)
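
The index steps on slides 104 to 106 can be sketched as a two-stage lookup. This is a toy under stated assumptions: the index contents, offsets, sampling interval and function name are invented, and keys are compared in plain string order for simplicity. The in-memory sample finds the closest prior key, then the full index is scanned forward from there to find the data file offset.

```python
from bisect import bisect_right

# Hypothetical *-Index.db contents: (row key, offset into *-Data.db), sorted by key.
INDEX = [("bar", 0), ("baz", 120), ("foo", 310), ("fop", 455), ("qux", 600), ("zap", 790)]
SAMPLE_EVERY = 2

# In-memory sample: every Nth key with its position in the full index.
SAMPLE = [(key, pos) for pos, (key, _) in enumerate(INDEX) if pos % SAMPLE_EVERY == 0]
SAMPLE_KEYS = [key for key, _ in SAMPLE]

def find_offset(search_key):
    """Return the data file offset for search_key, or None if it is not indexed."""
    i = bisect_right(SAMPLE_KEYS, search_key) - 1   # prior key in the sample
    if i < 0:
        return None                                  # sorts before every indexed key
    start = SAMPLE[i][1]
    for key, offset in INDEX[start:]:                # "on disk" scan from prior key
        if key == search_key:
            return offset
        if key > search_key:
            return None
    return None

print(find_offset("fop"))  # 455
print(find_offset("cat"))  # None
```

A larger sampling interval saves memory at the cost of a longer scan through the index file.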
  • 107. Read purple, monkey, dishwasher. [Diagram: each of SSTables 1 to 5 has a Bloom Filter and an Index Sample in memory, and *-Index.db and *-Data.db files on disk. The ‘foo’ row is in SSTable 1 (dishwasher (ts 10): tomato; purple (ts 10): cromulent), SSTable 2 (frink (ts 20): flayven; monkey (ts 10): embiggens) and SSTable 4 (dishwasher (ts 15): tomacco).]
  • 108. Merge SSTables.
        Column      SSTable 1                   SSTable 2                   SSTable 4
        purple      cromulent (timestamp 10)
        monkey                                  embiggens (timestamp 10)
        dishwasher  tomato (timestamp 10)                                   tomacco (timestamp 15)
  • 109. Key Cache caches row key position in *-Data.db file. (Removes up to 1 disk seek per SSTable.)
  • 110. Read with Key Cache. [Diagram: as for slide 107, with a Key Cache per SSTable held in memory alongside the Bloom Filter and Index Sample.]
  • 111. Row Cache caches entire row. (Removes all disk IO.)
  • 112. Read with Row Cache. [Diagram: as for slide 110, with a single Row Cache in memory in front of the per-SSTable Bloom Filters and Key Caches.]
  • 114. Tombstones ensure all replicas see a delete. (Purged after 10 days, configurable.)
  • 115. Merge SSTables with Tombstones. Column SSTable 1 SSTable 2 SSTable 4 cromulent <tombstone> purple (timestamp 10) (timestamp 15) embiggens monkey (timestamp 10) tomato tomacco dishwasher (timestamp 10) (timestamp 15)
  • 116. Merge node response with Tombstones.
        Column      Node 1                      Node 2                      Node 3
        purple      cromulent (timestamp 10)    cromulent (timestamp 10)    <tombstone> (timestamp 15)
        monkey      embiggens (timestamp 10)    embiggens (timestamp 10)    debigulator (timestamp 5)
        dishwasher  tomato (timestamp 10)       tomato (timestamp 10)       tomacco (timestamp 15)
  • 117. Compaction merges truth from multiple SSTables into one SSTable with the same truth. (Manual and continuous background process.)
  • 118. Compaction. Column SSTable 1 SSTable 2 SSTable 4 New cromulent <tombstone> <tombstone> purple (timestamp 10) (timestamp 15) (timestamp 15) embiggens embiggens monkey (timestamp 10) (timestamp 10) tomato tomacco tomacco dishwasher (timestamp 10) (timestamp 15) (timestamp 15)
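
Compaction with tombstones can be sketched as a merge that keeps the newest cell per column and drops a tombstone only once it is older than a grace period. This is a toy model with assumptions stated plainly: the function name is hypothetical, timestamps and the grace period use the slides' small integers rather than real clock values, and the tombstone is placed in SSTable 4 for the example (the merged result is the same whichever of the newer SSTables holds it).

```python
TOMBSTONE = "<tombstone>"

def compact(sstables, now, gc_grace):
    """Merge SSTables into one: newest timestamp wins, expired tombstones are purged."""
    merged = {}
    for sstable in sstables:
        for column, (value, timestamp) in sstable.items():
            if column not in merged or timestamp > merged[column][1]:
                merged[column] = (value, timestamp)
    return {
        column: cell
        for column, cell in merged.items()
        if cell[0] != TOMBSTONE or cell[1] > now - gc_grace
    }

sstable_1 = {"purple": ("cromulent", 10), "dishwasher": ("tomato", 10)}
sstable_2 = {"monkey": ("embiggens", 10)}
sstable_4 = {"purple": (TOMBSTONE, 15), "dishwasher": ("tomacco", 15)}

print(compact([sstable_1, sstable_2, sstable_4], now=20, gc_grace=10))
print(compact([sstable_1, sstable_2, sstable_4], now=30, gc_grace=10))
```

The first call keeps the purple tombstone, as in the "New" column on slide 118; the second call runs after the grace period and drops it. Keeping the tombstone for a while is what lets replicas that missed the delete learn about it.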
  • 119. Today. Cluster Data Model Node
  • 120. Papers.
        • Cassandra - A Decentralized Structured Storage System (Lakshman et al.).
        • Bigtable: A Distributed Storage System for Structured Data (Chang et al.).
        • Dynamo: Amazon’s Highly Available Key-value Store (DeCandia et al.).
        • Eventually Consistent (Werner Vogels).
        • Epidemic algorithms for replicated database maintenance (Demers et al.).
        • Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services (Gilbert et al.).
        • Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the world wide web (Karger et al.).
        • The φ Accrual Failure Detector (Hayashibara et al.).
  • 121. Aaron Morton @aaronmorton www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
