Cassandra under the hood
Richard Low
rlow@acunu.com
Outline
• What happens when you write?
 • Commit logs
 • Memtables
 • SSTables
• What happens when you read?
 • Point queries
 • Range queries
• Repair and snapshots
[Figure: an example insert, “richard”: { “email”: “rlow@acunu.com” }, with a question mark asking where it ends up]
Why should we care?

• Helps understand performance
• Helps understand the performance implications of
  the data model
• Helps fix things when something goes wrong
• Interesting!
Writes
Insert -> Commit log
Insert -> Memtable -> SSTable { Bloom filter, Index, Data }
Commit log
• Each insert is written to the commit log first
• Stored in insertion order
• Inserts are not acknowledged until written to the
  commit log
• Sync to disk: batch vs periodic
• After a crash, the log can be replayed
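The commit log behaviour listed above can be sketched as an append-only file plus a replay function. This is a minimal illustration, not Cassandra's implementation: the `CommitLog` class, the JSON-lines file format, the `sync_each_write` flag (a rough stand-in for batch vs periodic sync) and the second sample column are all assumptions made for the example.

```python
import json
import os
import tempfile


class CommitLog:
    """Append-only log. Names and file format are illustrative, not Cassandra's."""

    def __init__(self, path, sync_each_write=True):
        self.path = path
        # True roughly models "batch" sync, False roughly models "periodic".
        self.sync_each_write = sync_each_write
        self._file = open(path, "a", encoding="utf-8")

    def append(self, key, columns):
        # Records are appended in arrival order, never sorted or rewritten.
        self._file.write(json.dumps({"key": key, "columns": columns}) + "\n")
        self._file.flush()
        if self.sync_each_write:
            os.fsync(self._file.fileno())  # on disk before we acknowledge

    def sync(self):
        # In a periodic mode, a timer would call this at a fixed interval.
        self._file.flush()
        os.fsync(self._file.fileno())

    def close(self):
        self._file.close()


def replay(path):
    """Rebuild in-memory state after a crash by re-applying every record."""
    memtable = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            memtable.setdefault(record["key"], {}).update(record["columns"])
    return memtable


log_path = os.path.join(tempfile.mkdtemp(), "commitlog.txt")
log = CommitLog(log_path)
log.append("richard", {"email": "rlow@acunu.com"})
log.append("richard", {"company": "Acunu"})  # made-up second column
log.close()

recovered = replay(log_path)
print(recovered)
```

Because the log is only ever appended to, every write to it is sequential, and replaying it in order reproduces the state the memtable held before the crash.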
Memtable

• In-memory store of insertions
• Backed by a ConcurrentSkipListMap
• When it grows too large, it is flushed to disk
• Ensures all writes to disk are sequential
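A rough sketch of the memtable idea follows. Python has no skip list in its standard library, so a sorted key list maintained with `bisect` stands in for the ConcurrentSkipListMap. The `Memtable` class, the `max_entries` threshold and the in-memory `sstables` list are hypothetical names for illustration only.

```python
import bisect


class Memtable:
    """Sorted in-memory store; a sorted list stands in for a skip list."""

    def __init__(self, max_entries=3):
        self.max_entries = max_entries  # hypothetical flush threshold
        self._keys = []                 # always kept in sorted order
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def is_full(self):
        return len(self._keys) >= self.max_entries

    def flush(self):
        """Return all entries in key order and reset the memtable."""
        rows = [(k, self._values[k]) for k in self._keys]
        self._keys, self._values = [], {}
        return rows


memtable = Memtable(max_entries=3)
sstables = []  # each flush produces one sorted, never-modified run
for key, value in [("k_2", "c"), ("k_0", "a"), ("k_1", "b"), ("k_3", "d")]:
    memtable.put(key, value)
    if memtable.is_full():
        sstables.append(memtable.flush())

print(sstables)
```

Inserts arrive in arbitrary key order, but each flush emits one run that is already sorted, which is why the resulting disk write can be a single sequential pass.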
SSTables

• Stores the actual data, sorted by key
• Contains a Bloom filter and an index to help
  find keys
• Read-only once written
Bloom filters
• Probabilistic data structure
• Answers membership queries:
 • ‘Does the set contain x?’
• Can give false positives, never false
  negatives
• Space efficient
• Typical size: 1 byte per key
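To make the membership-query behaviour concrete, here is a small Bloom filter sketch sized at 8 bits (1 byte) per key. The salted SHA-256 hashing scheme and the choice of 5 hash functions are assumptions for the example, not what Cassandra uses.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


keys = [f"k_{i}" for i in range(1000)]
bloom = BloomFilter(num_bits=8 * len(keys), num_hashes=5)  # 1 byte per key
for key in keys:
    bloom.add(key)

no_false_negatives = all(bloom.might_contain(k) for k in keys)
false_positives = sum(bloom.might_contain(f"absent_{i}") for i in range(1000))
print(no_false_negatives, false_positives)
```

Every inserted key is always reported as possibly present. With 8 bits per key and 5 hashes the standard estimate for the false-positive rate is about 2%, so a small number of the absent keys will be reported as possibly present.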
How it works together
                  Memory                      |  Disk
Bloom filter      Index                       |  Data
011010111010010   k_0   -> 0                  |  k_0 ... k_1 ... k_2 ... k_3 ...
                  k_128 -> 4582               |
                  k_256 -> 9242               |
Contains x?       Where is x?                 |  Retrieve x
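The three questions in the diagram (contains x? where is x? retrieve x) can be sketched for a single SSTable as below. This is a simplified model: a Python set stands in for the Bloom filter, the data "file" is a bytes object, and the `key=value` line format and function names are invented for the example. The slide's index samples every 128th key; the sketch samples every 2nd key to stay small.

```python
import bisect


def build_sstable(rows, index_interval=2):
    """rows: (key, value) pairs sorted by key. Returns (key_set, index, data)."""
    data = bytearray()
    index = []  # sampled (key, byte offset) pairs
    for i, (key, value) in enumerate(rows):
        if i % index_interval == 0:
            index.append((key, len(data)))
        data += f"{key}={value}\n".encode("utf-8")
    key_set = {key for key, _ in rows}  # stand-in for the Bloom filter
    return key_set, index, bytes(data)


def lookup(sstable, key):
    key_set, index, data = sstable
    if key not in key_set:                      # Contains x?
        return None
    sampled_keys = [k for k, _ in index]
    slot = bisect.bisect_right(sampled_keys, key) - 1
    if slot < 0:                                # key sorts before the first entry
        return None
    offset = index[slot][1]                     # Where is x?
    for line in data[offset:].decode("utf-8").splitlines():  # Retrieve x
        k, _, v = line.partition("=")
        if k == key:
            return v
        if k > key:
            break                               # passed where the key would be
    return None


rows = [("k_000", "a"), ("k_001", "b"), ("k_002", "c"), ("k_003", "d"), ("k_004", "e")]
sstable = build_sstable(rows)
print(sstable[1])
print(lookup(sstable, "k_003"), lookup(sstable, "k_009"))
```

The sampled index only narrows the search to a region of the data, so the final step still scans forward from the indexed offset until it finds the key or passes it.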
Point queries
[Figure: memtables (k_0, k_1, k_2 -> values) above several SSTables, each with its own index (k_0 -> 0, k_128 -> 4582, k_256 -> 9242) and data file]
1. Query filter
2. Find location
3. Read data
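The three steps can be put together across memtables and several SSTables roughly as follows. The data layout (dicts of `(timestamp, value)` pairs, a set as a stand-in for each Bloom filter) and the rule that the highest timestamp wins are assumptions of this sketch, chosen to keep it short.

```python
def point_query(key, memtables, sstables):
    """Return (newest value or None, number of SSTables actually read).

    memtables: list of dicts mapping key -> (timestamp, value)
    sstables:  list of dicts with "filter" (a set standing in for a Bloom
               filter) and "rows" (dict mapping key -> (timestamp, value))
    """
    candidates = []
    for memtable in memtables:
        if key in memtable:
            candidates.append(memtable[key])
    disk_reads = 0
    for sstable in sstables:
        if key not in sstable["filter"]:   # 1. Query filter
            continue                       # skipped without touching disk
        disk_reads += 1                    # 2. Find location, 3. Read data
        if key in sstable["rows"]:
            candidates.append(sstable["rows"][key])
    newest = max(candidates, key=lambda c: c[0], default=None)
    return (newest[1] if newest else None), disk_reads


memtables = [{"k_1": (30, "new")}]
sstables = [
    {"filter": {"k_0", "k_1"}, "rows": {"k_0": (10, "a"), "k_1": (10, "old")}},
    {"filter": {"k_2"}, "rows": {"k_2": (20, "c")}},
]

print(point_query("k_1", memtables, sstables))
print(point_query("k_9", memtables, sstables))
```

Only SSTables whose filter reports a possible match cost a disk read, which is what keeps a point query cheap even when many SSTables exist.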
Range queries
• Bloom filters are no help (they only answer single-key queries)
• Use the index to locate the relevant portion of each SSTable
• Read the data, merge the results
• Necessary to look in every SSTable data
  file
• Disk I/O proportional to the number of SSTables
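A sketch of why a range query touches every SSTable: each one contributes a sorted slice, and the slices are merged with the newest timestamp winning. The row layout `(key, timestamp, value)`, the half-open range and the use of `bisect` as a stand-in for the index lookup are assumptions of this example.

```python
import bisect
import heapq


def range_query(sstables, start, end):
    """Return (rows for start <= key < end, number of SSTables consulted).

    sstables: list of lists of (key, timestamp, value), each sorted by key.
    """
    slices = []
    for rows in sstables:
        # No filter can rule an SSTable out, so every one is consulted.
        keys = [r[0] for r in rows]
        lo = bisect.bisect_left(keys, start)  # stand-in for the index lookup
        hi = bisect.bisect_left(keys, end)
        slices.append(rows[lo:hi])
    merged = {}
    for key, timestamp, value in heapq.merge(*slices):
        # Rows arrive in key order; keep the newest version of each key.
        if key not in merged or timestamp >= merged[key][0]:
            merged[key] = (timestamp, value)
    return [(k, v) for k, (_, v) in merged.items()], len(slices)


sstables = [
    [("k_0", 1, "a"), ("k_2", 1, "c-old")],
    [("k_1", 2, "b"), ("k_2", 2, "c-new"), ("k_3", 2, "d")],
]
result, consulted = range_query(sstables, "k_1", "k_3")
print(result, consulted)
```

The count of consulted SSTables always equals the number of SSTables, which is the proportionality the slide refers to.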
Compaction

• Merges SSTables
• Removes overwritten values and obsolete
  tombstones
• Improves range query performance
• Major compaction creates one SSTable
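Compaction can be sketched as a merge that keeps only the newest version of each key and discards tombstones that are old enough. The sketch assumes all SSTables are merged at once (as in a major compaction), uses `None` as the tombstone marker, and takes a hypothetical `gc_before` cutoff; none of these names come from Cassandra.

```python
TOMBSTONE = None  # marker for a deleted key in this sketch


def compact(sstables, gc_before):
    """Merge SSTables into one, keeping the newest version of each key.

    sstables:  list of lists of (key, timestamp, value) rows
    gc_before: tombstones with timestamp < gc_before are dropped
               (hypothetical cutoff for what counts as obsolete)
    """
    newest = {}
    for rows in sstables:
        for key, timestamp, value in rows:
            if key not in newest or timestamp > newest[key][0]:
                newest[key] = (timestamp, value)  # later write wins
    output = []
    for key in sorted(newest):
        timestamp, value = newest[key]
        if value is TOMBSTONE and timestamp < gc_before:
            continue  # obsolete tombstone: nothing left for it to shadow
        output.append((key, timestamp, value))
    return output


old = [("k_0", 1, "a"), ("k_1", 1, "b"), ("k_2", 1, "c")]
new = [("k_1", 5, "b2"), ("k_2", 6, TOMBSTONE)]

print(compact([old, new], gc_before=10))
print(compact([old, new], gc_before=3))
```

With the later cutoff the tombstone for `k_2` is dropped along with the value it deleted; with the earlier cutoff the tombstone is still too recent and is carried into the output.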
Write optimised
• All writes are sequential on disk
• Each write is written multiple times during
  compactions
• Bloom filters mean approx. one I/O per
  read
• Avoid a read-modify-write data model
Scaling
• In memory:
 • Buffers
 • Memtables
 • Bloom filters
 • Index
• If there is not enough memory, the performance
  impact is significant
Repair: Merkle Trees
• The tree is compared with those built by the replicas
• Efficient: only hashes are exchanged to find differences
• If differences are found, portions of SSTables are
  streamed
• Requires a full disk scan to build
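The comparison step can be illustrated with a small binary Merkle tree. In this sketch each leaf is the hash of a string standing in for the contents of one key range, the leaf count must be a power of two, and the function names are invented for the example; Cassandra's actual tree layout differs.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data).digest()


def build_tree(leaf_values):
    """Return a list of levels, leaves first and root last.

    len(leaf_values) must be a power of two in this sketch.
    """
    level = [_h(v.encode("utf-8")) for v in leaf_values]
    levels = [level]
    while len(level) > 1:
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(tree_a, tree_b):
    """Walk down from the root, descending only into subtrees that differ."""
    top = len(tree_a) - 1
    suspects = [0] if tree_a[top][0] != tree_b[top][0] else []
    for depth in range(top - 1, -1, -1):
        children = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                if tree_a[depth][child] != tree_b[depth][child]:
                    children.append(child)
        suspects = children
    return suspects


local = build_tree(["r0", "r1", "r2", "r3"])
replica = build_tree(["r0", "r1", "CHANGED", "r3"])

print(differing_leaves(local, replica))
print(differing_leaves(local, local))
```

When the roots match, nothing else is compared; when they differ, only the mismatching branches are followed, so the ranges that need streaming are found without comparing the data itself.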
Snapshot

• For backup, we want a consistent set of SSTables
• nodetool snapshot does this
• Creates hard links to the existing SSTables
• Implies the snapshot holds its own copy of the data after a few
  compactions, once the live SSTables have been rewritten
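The hard-link idea can be tried out with a few lines of Python on a filesystem that supports hard links. The directory layout and the file name below are made up for the example, and the `snapshot` function only imitates the linking step, not what `nodetool snapshot` does in full.

```python
import os
import tempfile


def snapshot(data_dir, snapshot_dir):
    """Hard-link every file in data_dir into snapshot_dir; no data is copied."""
    os.makedirs(snapshot_dir, exist_ok=True)
    for name in os.listdir(data_dir):
        source = os.path.join(data_dir, name)
        if os.path.isfile(source):
            os.link(source, os.path.join(snapshot_dir, name))


root = tempfile.mkdtemp()
data_dir = os.path.join(root, "data")
os.makedirs(data_dir)
sstable_path = os.path.join(data_dir, "sstable-1-Data.db")  # made-up file name
with open(sstable_path, "w", encoding="utf-8") as f:
    f.write("k_0=a\n")

snapshot_dir = os.path.join(root, "snapshots", "backup1")
snapshot(data_dir, snapshot_dir)
linked = os.path.join(snapshot_dir, "sstable-1-Data.db")
same_inode = os.path.samefile(sstable_path, linked)  # both names, one file

# A later compaction deletes the live file; the snapshot keeps the data alive.
os.remove(sstable_path)
with open(linked, encoding="utf-8") as f:
    survived = f.read()

print(same_inode, repr(survived))
```

Taking the snapshot is cheap because both names point at the same file. The disk cost only appears later: once compaction has replaced the live SSTables, the snapshot is the sole owner of the old files and they can no longer be reclaimed.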
Summary
• How writes end up on disk
• How point queries and range queries find
  the data
• Implications
• Repair
• Snapshot

 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 

Cassandra internals

  • 1. Cassandra under the hood Richard Low rlow@acunu.com
  • 2. Outline • What happens when you write? • Commit logs • Memtables • SSTables • What happens when you read? • Point queries • Range queries • Repair and snapshots (Figure: example row “richard”: { “email”: “rlow@acunu.com” })
  • 3. Why should we care? • Helps understand performance • Understand performance implications of data model • Helps to fix it if something goes wrong • Interesting!
  • 5. Writes (2) Insert Commit log Memtable SSTable { Bloom filter, Index, Data }
  • 6. Commit log Insert Commit log Memtable SSTable { Bloom filter, Index, Data }
  • 7. Commit log • Each insert written to commit log first • Stored in insertion order • Inserts not acknowledged until written to commit log • Batch vs periodic sync • In case of crash, can replay
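The commit log behaviour on this slide can be sketched in a few lines. This is a minimal illustration only: the `CommitLog` class, its JSON-lines file format and the `sync_every_write` flag are assumptions made for the example, not Cassandra's actual on-disk format or API. The flag is a loose stand-in for the batch vs periodic trade-off (sync before acknowledging, or acknowledge first and sync later).

```python
import json
import os
import tempfile


class CommitLog:
    """Append-only log of inserts (hypothetical format: one JSON object per line)."""

    def __init__(self, path, sync_every_write=True):
        self.path = path
        # Assumption: True means "fsync before acknowledging", False means
        # "acknowledge after buffering and leave syncing to someone else".
        self.sync_every_write = sync_every_write
        self._file = open(path, "a", encoding="utf-8")

    def append(self, key, value):
        # Records are appended in insertion order, never rewritten in place.
        self._file.write(json.dumps({"key": key, "value": value}) + "\n")
        self._file.flush()
        if self.sync_every_write:
            os.fsync(self._file.fileno())
        return True  # the insert is acknowledged only after the log write

    def close(self):
        self._file.close()

    def replay(self):
        """Rebuild the in-memory state by re-applying the log in order."""
        state = {}
        with open(self.path, encoding="utf-8") as log_file:
            for line in log_file:
                record = json.loads(line)
                state[record["key"]] = record["value"]
        return state


def demo_commit_log():
    with tempfile.TemporaryDirectory() as tmp:
        log = CommitLog(os.path.join(tmp, "commit.log"))
        log.append("richard", {"email": "rlow@acunu.com"})
        log.append("richard", {"email": "richard@example.com"})
        log.close()  # pretend the process crashed and lost its memtable
        return log.replay()
```

Calling `demo_commit_log()` replays both inserts in order, so the later value for `richard` is the one that survives.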
  • 8. Memtable Memtable
  • 9. Memtable • In-memory store of insertions • ConcurrentSkipListMap • When too large, flushed to disk • Ensures all writes to disk are sequential
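A rough sketch of the memtable idea follows. Python has no ConcurrentSkipListMap, so a plain dict that is sorted at flush time stands in for it; the `Memtable` class, the `max_entries` threshold and the in-memory `flushed` list (standing in for files on disk) are all assumptions made for illustration.

```python
class Memtable:
    """In-memory store of inserts that is flushed as one sorted run when full."""

    def __init__(self, max_entries=3):
        self.max_entries = max_entries  # hypothetical "too large" threshold
        self._rows = {}
        self.flushed = []  # each entry stands in for one SSTable written to disk

    def insert(self, key, value):
        self._rows[key] = value
        if len(self._rows) >= self.max_entries:
            self.flush()

    def flush(self):
        if not self._rows:
            return
        # The whole memtable is written out in key order in a single pass,
        # which is what keeps disk writes sequential.
        self.flushed.append(tuple(sorted(self._rows.items())))
        self._rows = {}


memtable = Memtable(max_entries=3)
for key in ["k_2", "k_0", "k_1", "k_3"]:
    memtable.insert(key, "value-of-" + key)
```

After these four inserts, the first three keys have been flushed as one sorted run and `k_3` is still waiting in memory.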
  • 10. SSTable Insert Commit log Memtable SSTable { Bloom filter, Index, Data }
  • 11. SSTables • Store the actual data, sorted by key • Contain a Bloom filter and index to help find keys • Read-only
  • 12. Bloom filters • Probabilistic data structure • Answers membership queries: • ‘Does the set contain x?’ • Can give false positives, never false negatives • Space efficient • Typical size: 1 byte per key
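To make the Bloom filter properties concrete, here is a small sketch. The `BloomFilter` class, the use of SHA-256 to derive bit positions and the choice of five hash functions are assumptions for the example, not what Cassandra uses internally. The filter is sized at 8 bits per key to echo the "1 byte per key" figure on the slide.

```python
import hashlib


class BloomFilter:
    """Bit array plus several hash functions; answers 'might the set contain x?'."""

    def __init__(self, num_bits, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


keys = [f"k_{i}" for i in range(100)]
bloom = BloomFilter(num_bits=8 * len(keys))  # about 1 byte per key
for key in keys:
    bloom.add(key)
```

Every key that was added is reported as possibly present (no false negatives). A key that was never added is usually rejected, but may occasionally be reported as present, which is the false-positive case.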
  • 13. How it works together (figure: the three parts of an SSTable) Bloom filter: 011010111010010 • Index: k_0 -> 0, k_128 -> 4582, k_256 -> 9242 • Data: k_0, k_1, k_2, k_3, ... stored in key order
  • 14. How it works together (same figure, with one question per part) Bloom filter: Contains x? • Index: Where is x? • Data: Retrieve x
  • 15. How it works together (same figure, with the labels Memory and Disk added)
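The three questions from these slides (Contains x? Where is x? Retrieve x) can be sketched as one lookup. Everything here is an assumption for illustration: the `SSTable` class, a plain Python set standing in for the Bloom filter, list positions standing in for byte offsets, and an index interval of 2 rather than the 128 shown in the figure.

```python
import bisect


class SSTable:
    """Sorted, read-only rows with a key filter and a sparse index."""

    def __init__(self, rows, index_interval=2):
        self.rows = sorted(rows.items())   # "data": rows in key order
        self.key_filter = set(rows)        # stand-in for the Bloom filter
        self.index_interval = index_interval
        # Sparse index: every index_interval-th key and its position in the data.
        self.index_keys = [key for key, _ in self.rows[::index_interval]]
        self.index_positions = list(range(0, len(self.rows), index_interval))

    def get(self, key):
        # 1. Contains x? Skip this SSTable if the filter says no.
        if key not in self.key_filter:
            return None
        # 2. Where is x? Find the last index entry with a key <= x.
        slot = bisect.bisect_right(self.index_keys, key) - 1
        if slot < 0:
            return None
        start = self.index_positions[slot]
        # 3. Retrieve x: scan only the block the index pointed at.
        for row_key, value in self.rows[start:start + self.index_interval]:
            if row_key == key:
                return value
        return None  # reachable with a real Bloom filter (false positive)


sstable = SSTable({"k_0": "v0", "k_1": "v1", "k_2": "v2", "k_3": "v3"})
```

With these four rows the sparse index holds only `k_0` and `k_2`, so a lookup of `k_3` lands on the `k_2` entry and scans forward one row.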
  • 16. Point queries (figure: several memtables holding k_0, k_1, k_2, and several SSTables, each with its own index and data)
  • 17. Point queries (same figure) 1. Query filter
  • 18. Point queries (same figure) 1. Query filter 2. Find location
  • 19. Point queries (same figure) 1. Query filter 2. Find location 3. Read data
  • 20. Range queries • Bloom filters useless • Use index to locate portion of SSTable • Read data, merge results • Necessary to look up in every SSTable data file • Disk I/O proportional to #SSTables
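The "read data, merge results" step can be sketched as a merge of sorted runs. The `range_query` function and the `(key, timestamp, value)` row layout are assumptions for the example, and a linear scan stands in for the index seek. Note that every SSTable has to be visited, which is why the cost grows with the number of SSTables.

```python
import heapq


def range_query(sstables, start, end):
    """Return the newest value for each key in [start, end] across all SSTables.

    Each SSTable is a list of (key, timestamp, value) tuples sorted by key.
    """

    def rows_in_range(table):
        # Assumption: a real implementation would seek via the index instead.
        for key, timestamp, value in table:
            if start <= key <= end:
                yield key, timestamp, value

    # heapq.merge keeps the combined stream sorted by (key, timestamp).
    merged = heapq.merge(*(rows_in_range(table) for table in sstables))
    newest = {}
    for key, _timestamp, value in merged:
        newest[key] = value  # later timestamps arrive later and overwrite
    return list(newest.items())


older = [("k_0", 1, "a"), ("k_1", 1, "b"), ("k_3", 1, "d")]
newer = [("k_1", 2, "B"), ("k_2", 2, "c")]
result = range_query([older, newer], "k_1", "k_2")
```

Here `k_1` appears in both SSTables, and the merge keeps the value with the later timestamp.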
  • 21. Compaction • Merges SSTables • Removes overwrites and obsolete tombstones • Improves range query performance • Major compaction creates one SSTable
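A simplified sketch of what compaction does to overwrites and tombstones is shown below. The `compact` function, the use of `None` as a tombstone marker and the `gc_grace` threshold are assumptions for illustration, and real compaction streams through files rather than building a dict in memory.

```python
TOMBSTONE = None  # assumption: a deleted row is marked by a None value


def compact(sstables, now, gc_grace=10):
    """Merge SSTables into one, keeping only the newest version of each key.

    Each SSTable is a list of (key, timestamp, value) tuples.
    Tombstones older than gc_grace are treated as obsolete and dropped.
    """
    newest = {}
    for table in sstables:
        for key, timestamp, value in table:
            if key not in newest or timestamp > newest[key][0]:
                newest[key] = (timestamp, value)  # overwrite wins by timestamp

    compacted = []
    for key in sorted(newest):
        timestamp, value = newest[key]
        if value is TOMBSTONE and now - timestamp > gc_grace:
            continue  # obsolete tombstone: drop the row entirely
        compacted.append((key, timestamp, value))
    return compacted


old_table = [("k_0", 1, "a"), ("k_1", 1, "b"), ("k_2", 1, "c")]
new_table = [("k_0", 6, "A"), ("k_1", 5, TOMBSTONE), ("k_2", 20, TOMBSTONE)]
merged_table = compact([old_table, new_table], now=25)
```

In this run the overwrite of `k_0` replaces the old value, the tombstone for `k_1` is old enough to be removed, and the recent tombstone for `k_2` is kept.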
  • 22. Write optimised • All writes are sequential on disk • Each write is written multiple times during compactions • Bloom filters mean approx. one I/O per read • Avoid a read-modify-write data model
  • 23. Scaling • In memory: • Buffers • Memtables • Bloom filters • Index • If not enough memory, significant performance impact
  • 24. Repair: Merkle Trees • Repair builds a Merkle tree • Compared with replicas • Efficient • If differences are found, portions of SSTables are streamed • Requires full disk scan to build
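The comparison step can be illustrated with a small hash tree. This is a sketch under stated assumptions: `build_merkle` and `differing_ranges` are hypothetical helpers, each leaf is the raw bytes of one key range, and the comparison checks the roots and then the leaves directly instead of descending the tree level by level.

```python
import hashlib


def _hash(data):
    return hashlib.sha256(data).digest()


def build_merkle(leaves):
    """Return the tree as a list of levels: leaf hashes first, root last."""
    level = [_hash(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # duplicate the last hash to pair up
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_ranges(tree_a, tree_b):
    """Indexes of the leaf ranges whose hashes differ between two replicas."""
    if tree_a[-1] == tree_b[-1]:
        return []  # equal roots: the replicas agree, nothing to stream
    return [
        index
        for index, (hash_a, hash_b) in enumerate(zip(tree_a[0], tree_b[0]))
        if hash_a != hash_b
    ]


replica_a = [b"range-0", b"range-1", b"range-2", b"range-3"]
replica_b = [b"range-0", b"range-1-stale", b"range-2", b"range-3"]
out_of_sync = differing_ranges(build_merkle(replica_a), build_merkle(replica_b))
```

Only the root hashes need to be exchanged when replicas agree, which is what makes the comparison efficient. Building each tree still means hashing all of the data, matching the full disk scan noted on the slide.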
  • 25. Snapshot • For backup, want consistent set of SSTables • nodetool snapshot does this • Creates hard links to existing SSTables • Implies the snapshot ends up holding its own copy of the data after a few compactions, once the linked SSTables have been replaced
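The hard-link idea can be tried out with the standard library. This sketch is not how nodetool is implemented: the `snapshot` function, the directory layout and the file name `sstable-1-Data.db` are assumptions for illustration, and it needs a filesystem that supports hard links.

```python
import os
import tempfile


def snapshot(data_dir, snapshot_dir):
    """Hard-link every file in data_dir into snapshot_dir (no data is copied)."""
    os.makedirs(snapshot_dir, exist_ok=True)
    for name in sorted(os.listdir(data_dir)):
        source = os.path.join(data_dir, name)
        if os.path.isfile(source):
            os.link(source, os.path.join(snapshot_dir, name))


def demo_snapshot():
    with tempfile.TemporaryDirectory() as root:
        data_dir = os.path.join(root, "data")
        snapshot_dir = os.path.join(root, "snapshots", "backup-1")
        os.makedirs(data_dir)
        sstable = os.path.join(data_dir, "sstable-1-Data.db")  # hypothetical name
        with open(sstable, "w", encoding="utf-8") as handle:
            handle.write("k_0,k_1,k_2")

        snapshot(data_dir, snapshot_dir)
        linked = os.path.join(snapshot_dir, "sstable-1-Data.db")
        shares_storage = os.path.samefile(sstable, linked)

        os.remove(sstable)  # what happens once compaction replaces the SSTable
        with open(linked, encoding="utf-8") as handle:
            survived = handle.read()
        return shares_storage, survived
```

Right after the snapshot both names point at the same file, so the snapshot costs no extra space. Once the original is removed, the snapshot link is the only thing keeping that data on disk.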
  • 26. Summary • How writes end up on disk • How point queries and range queries find the data • Implications • Repair • Snapshot
