Cassandra
Replication & Consistency

  Benjamin Black, b@b3k.us
        2010-04-28
Dynamo: cluster management, replication, fault tolerance
BigTable: sparse, columnar data model, storage architecture
Cassandra: the overlap of the two
Dynamo-like
 Features
Symmetric, P2P architecture
 No special nodes/SPOFs
Gossip-based cluster management
Distributed hash table for data
placement
 Pluggable partitioning
 Pluggable topology discovery
 Pluggable placement strategies
Tunable, eventual consistency
BigTable-like
  Features
Sparse, “columnar” data model
 Optional, 2-level maps called
 Super Column Families
SSTable disk storage
 Append-only commit log
 Memtable (buffer and sort)
 Immutable SSTable files
Hadoop integration
Topic(s) for Today

    Replication
         &
    Consistency
[1]
Replication
How many copies of each piece
  of data do we want in the
           system?

            N=3
Consistency
     Level
  How many replicas must
respond to declare success?
W=2                 R=2


CL.Options

WRITE
  Level    Description                   Strength
  ZERO     Cross fingers                 weak
  ANY      1st response (including HH)   weak
  ONE      1st response                  weak
  QUORUM   N/2 + 1 replicas              strong
  ALL      All replicas                  strong

READ
  Level    Description                   Strength
  ONE      1st response                  weak
  QUORUM   N/2 + 1 replicas              strong
  ALL      All replicas                  strong
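As a small illustration of the QUORUM row, the sketch below computes N/2 + 1 using integer division. The function name `quorum` is made up for this example and is not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


if __name__ == "__main__":
    # The quorum size depends only on the replication factor N,
    # not on how many nodes are in the cluster.
    for n in (1, 2, 3, 5):
        print(f"N={n} -> QUORUM={quorum(n)}")
```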
A Side Note on CL
Consistency Level is based on Replication Factor (N), not on the
number of nodes in the system.
A Question of
       Time
[Figure: a row made up of columns; each column carries a value and a timestamp]

All columns have a value and a timestamp
Timestamps provided by clients
   usec resolution by convention
Latest timestamp wins
Vector clocks may be introduced in 0.7
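A minimal sketch of "latest timestamp wins", assuming a column is just a (name, value, timestamp) triple. The `Column` type, `now_usec`, and `reconcile` are hypothetical names for illustration, and the tie-breaking choice (keep the first argument) is an assumption, not a statement about Cassandra's behavior.

```python
import time
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    value: bytes
    timestamp: int  # client-supplied, microseconds by convention


def now_usec() -> int:
    """Client-side timestamp in microseconds since the epoch."""
    return int(time.time() * 1_000_000)


def reconcile(a: Column, b: Column) -> Column:
    """Pick the version with the later timestamp.

    Assumption: on an exact tie this sketch simply keeps `a`.
    """
    return a if a.timestamp >= b.timestamp else b


if __name__ == "__main__":
    old = Column("email", b"old@example.com", 1_000)
    new = Column("email", b"new@example.com", 2_000)
    print(reconcile(old, new).value)
```

Because the clients supply the timestamps, a client with a skewed clock can win or lose a conflict regardless of the order in which the writes actually arrived.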
Read Repair




Query all replicas on every read
  Data from one replica
  Checksum/timestamps from all
  others
If there is a mismatch:
  Pull all data and merge
  Write back to out of sync replicas
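The read repair steps above can be sketched as follows. This is a toy model under stated assumptions: each replica is a plain dict mapping a key to a (value, timestamp) pair, every replica holds the key, and an MD5 digest stands in for the checksum. The names `digest` and `read_with_repair` are invented for this example.

```python
import hashlib


def digest(version):
    """Checksum of a (value, timestamp) pair; MD5 is an assumption here."""
    value, timestamp = version
    return hashlib.md5(value + str(timestamp).encode()).hexdigest()


def read_with_repair(replicas, key):
    """replicas: list of dicts mapping key -> (value: bytes, timestamp: int)."""
    # Full data from one replica, digests from all the others.
    data = replicas[0][key]
    other_digests = [digest(r[key]) for r in replicas[1:]]
    if all(d == digest(data) for d in other_digests):
        return data

    # Mismatch: pull all versions and merge (latest timestamp wins).
    winner = max((r[key] for r in replicas), key=lambda v: v[1])

    # Write the merged result back to out-of-sync replicas.
    for r in replicas:
        if r[key] != winner:
            r[key] = winner
    return winner


if __name__ == "__main__":
    replicas = [
        {"k": (b"old", 1)},
        {"k": (b"new", 2)},
        {"k": (b"old", 1)},
    ]
    print(read_with_repair(replicas, "k"))
    print(replicas)
```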
Weak vs. Strong
Weak Consistency (reads)
  Perform repair after returning results

Strong Consistency (reads)
  Perform repair before returning results
R+W>N

  Please imagine this inequality has huge fangs, dripping with the
blood of innocent, enterprise developers so you can best appreciate
                        the terror it inspires.
Our Guarantee
R+W>N guarantees overlap of
  read and write quorums


 W=2                 R=2

           N=3
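To make the overlap guarantee concrete, the brute-force sketch below enumerates every possible read set of size R and write set of size W over N replicas and checks that each pair shares at least one replica. The function name `quorums_always_overlap` is made up for this illustration.

```python
from itertools import combinations


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """True if every R-sized read set intersects every W-sized write set."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


if __name__ == "__main__":
    print(quorums_always_overlap(3, 2, 2))  # R + W = 4 > 3
    print(quorums_always_overlap(3, 1, 1))  # R + W = 2 <= 3
```

For small N the exhaustive check agrees with the inequality: the sets are guaranteed to overlap exactly when R + W > N.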
A Matter of
Perspective
       View
    consistency



                Replica
              consistency
[2]
The Ring
[Figure: a ring with positions 0, 113, 125, 250, 312, and 375 marked, and one arc labeled "range"]
Tokens
A TOKEN is a
partitioner-dependent
element on the ring
                  Each NODE has a
                  single, unique TOKEN

   Each NODE claims a RANGE of
   the ring from its TOKEN to the
   token of the previous node on
   the ring
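A sketch of how a token maps to the node that claims its range, assuming integer tokens and a sorted list of node tokens. The node tokens 0, 125, 250, and 375 are chosen here for illustration, and `owner` is a hypothetical helper, not a Cassandra function.

```python
from bisect import bisect_left


def owner(sorted_node_tokens, key_token):
    """Return the token of the node whose range contains key_token.

    A node claims the range from the previous node's token (exclusive)
    up to its own token (inclusive); past the last token we wrap around.
    """
    i = bisect_left(sorted_node_tokens, key_token)
    return sorted_node_tokens[i % len(sorted_node_tokens)]


if __name__ == "__main__":
    ring = [0, 125, 250, 375]  # assumed node tokens
    for t in (113, 125, 312, 400):
        print(t, "->", owner(ring, t))
```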
Partitioning
    Map from Key Space to Token

RandomPartitioner
  Tokens are integers in the range 0 to 2^127
  MD5(Key) -> Token
  Good: Even key distribution, Bad:
  Inefficient range queries
OrderPreservingPartitioner
  Tokens are UTF8 strings in the range ‘’-∞
  Key -> Token
  Good: Efficient range queries, Bad:
  Uneven key distribution
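The two mappings can be sketched like this. The function names are invented for the example, and reducing the 128-bit MD5 value modulo 2^127 is an assumption made here to stay inside the stated token range; the real partitioner may fold the hash differently.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    """MD5(key) -> integer token in [0, 2**127).

    Assumption: the 128-bit digest is reduced modulo 2**127.
    """
    return int.from_bytes(hashlib.md5(key).digest(), "big") % (2 ** 127)


def order_preserving_token(key: str) -> str:
    """Key -> token: the key itself is the token."""
    return key


if __name__ == "__main__":
    keys = ["alice", "bob", "carol"]
    # Hashing scatters adjacent keys around the ring (even distribution,
    # but a key range is no longer a contiguous token range).
    print(sorted(keys, key=lambda k: random_partitioner_token(k.encode())))
    # Order-preserving tokens keep key order, so range scans stay contiguous.
    print(sorted(keys, key=order_preserving_token))
```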
Snitching
     Map from Nodes to Physical
             Location
EndpointSnitch
  Guess at rack and datacenter based on IP address octets.


DatacenterEndpointSnitch
  Specify IP subnets for racks, grouped per datacenter.


PropertySnitch
  Specify arbitrary mappings from individual IP addresses to
  racks and datacenters.


            Or write your own!
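A rough sketch of two snitch styles. Which octets encode the datacenter and the rack is an assumption made for this example (second octet for datacenter, third for rack), and the names `guess_location`, `PROPERTY_MAP`, and `lookup_location` are hypothetical.

```python
import ipaddress


def guess_location(ip: str):
    """Guess (datacenter, rack) from IP octets.

    Assumption: second octet identifies the datacenter, third the rack.
    """
    octets = ipaddress.IPv4Address(ip).packed
    return (f"dc{octets[1]}", f"rack{octets[2]}")


# Property-style snitch: an explicit per-address mapping (example data only).
PROPERTY_MAP = {
    "10.0.1.5": ("A", "rack1"),
    "10.9.9.9": ("B", "rack2"),
}


def lookup_location(ip: str):
    """Return the configured (datacenter, rack), or None if unmapped."""
    return PROPERTY_MAP.get(ip)


if __name__ == "__main__":
    print(guess_location("10.20.30.40"))
    print(lookup_location("10.0.1.5"))
```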
Placement
  Map from Token Space to Nodes


The first replica is always placed
on the node that claims the
range in which the token falls.

Strategies determine where the
rest of the replicas are placed.
RackUnaware
    Place replicas on the N-1
subsequent nodes around the ring,
       ignoring topology.

[Figure: datacenter A and datacenter B, each with rack 1 and rack 2]
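A sketch of topology-ignoring placement: the first replica goes to the node that claims the token's range, and the other N-1 go to the next nodes around the ring. The function name and the example tokens are assumptions for illustration.

```python
from bisect import bisect_left


def rack_unaware_replicas(sorted_node_tokens, key_token, n):
    """Return the tokens of the nodes holding the N replicas."""
    size = len(sorted_node_tokens)
    # Index of the node that claims the range containing key_token.
    first = bisect_left(sorted_node_tokens, key_token) % size
    # Walk clockwise; never place two replicas on the same node.
    return [sorted_node_tokens[(first + i) % size] for i in range(min(n, size))]


if __name__ == "__main__":
    ring = [0, 125, 250, 375]  # assumed node tokens
    print(rack_unaware_replicas(ring, 312, 3))
```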
RackAware
Place the second replica in another
datacenter, and the remaining N-2
replicas on nodes in other racks in
       the same datacenter.
[Figure: datacenter A and datacenter B, each with rack 1 and rack 2]
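One way to read the rack-aware rule is sketched below, assuming each ring entry is a (token, datacenter, rack) tuple and that candidates are considered in ring order starting from the first replica. This is an interpretation of the slide, not the actual strategy code; it returns fewer than N replicas when the topology cannot satisfy the rule, where a real strategy would need a fallback.

```python
def rack_aware_replicas(ring, first_index, n):
    """ring: list of (token, datacenter, rack) tuples sorted by token."""
    size = len(ring)
    walk = [ring[(first_index + i) % size] for i in range(size)]
    primary = walk[0]
    replicas = [primary]

    # Second replica: first node on the walk in a different datacenter.
    other_dc = next((node for node in walk[1:] if node[1] != primary[1]), None)
    if other_dc is not None and len(replicas) < n:
        replicas.append(other_dc)

    # Remaining replicas: same datacenter as the first, different rack.
    for node in walk[1:]:
        if len(replicas) >= n:
            break
        if node[1] == primary[1] and node[2] != primary[2]:
            replicas.append(node)
    return replicas


if __name__ == "__main__":
    ring = [
        (0, "A", "rack1"),
        (100, "A", "rack2"),
        (200, "B", "rack1"),
        (300, "B", "rack2"),
    ]
    print(rack_aware_replicas(ring, 0, 3))
```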
DatacenterShard
Place M of the N replicas in another
 datacenter, and the remaining N -
 (M + 1) replicas on nodes in other
   racks in the same datacenter.
[Figure: datacenter A and datacenter B, each with rack 1 and rack 2]
Or write your own!
[fin]
Cassandra
  http://cassandra.apache.org
Amazon Dynamo
  http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
Google BigTable
  http://labs.google.com/papers/bigtable.html
Facebook Cassandra
  http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
Thank you!
 Questions?

More Related Content

What's hot

NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovAltinity Ltd
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOAltinity Ltd
 
An overview of Neo4j Internals
An overview of Neo4j InternalsAn overview of Neo4j Internals
An overview of Neo4j InternalsTobias Lindaaker
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftAmazon Web Services
 
Deploying Kafka Streams Applications with Docker and Kubernetes
Deploying Kafka Streams Applications with Docker and KubernetesDeploying Kafka Streams Applications with Docker and Kubernetes
Deploying Kafka Streams Applications with Docker and Kubernetesconfluent
 
Big picture of category theory in scala with deep dive into contravariant and...
Big picture of category theory in scala with deep dive into contravariant and...Big picture of category theory in scala with deep dive into contravariant and...
Big picture of category theory in scala with deep dive into contravariant and...Piotr Paradziński
 
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...Altinity Ltd
 
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...DataStax
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to CassandraGokhan Atil
 
Better than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseBetter than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseAltinity Ltd
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2ScyllaDB
 
Understanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraUnderstanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraDataStax
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...Altinity Ltd
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesAltinity Ltd
 
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016DataStax
 
Fault tolerance in distributed systems
Fault tolerance in distributed systemsFault tolerance in distributed systems
Fault tolerance in distributed systemssumitjain2013
 

What's hot (20)

NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
 
An overview of Neo4j Internals
An overview of Neo4j InternalsAn overview of Neo4j Internals
An overview of Neo4j Internals
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
Deploying Kafka Streams Applications with Docker and Kubernetes
Deploying Kafka Streams Applications with Docker and KubernetesDeploying Kafka Streams Applications with Docker and Kubernetes
Deploying Kafka Streams Applications with Docker and Kubernetes
 
Big picture of category theory in scala with deep dive into contravariant and...
Big picture of category theory in scala with deep dive into contravariant and...Big picture of category theory in scala with deep dive into contravariant and...
Big picture of category theory in scala with deep dive into contravariant and...
 
HDFS Architecture
HDFS ArchitectureHDFS Architecture
HDFS Architecture
 
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
 
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
Better than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseBetter than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouse
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
 
Understanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraUnderstanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache Cassandra
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic Continues
 
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
 
Fault tolerance in distributed systems
Fault tolerance in distributed systemsFault tolerance in distributed systems
Fault tolerance in distributed systems
 

Viewers also liked

Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraDataStax
 
Replication, Durability, and Disaster Recovery
Replication, Durability, and Disaster RecoveryReplication, Durability, and Disaster Recovery
Replication, Durability, and Disaster RecoverySteven Francia
 
Indexing in Cassandra
Indexing in CassandraIndexing in Cassandra
Indexing in CassandraEd Anuff
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)DataStax Academy
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012jbellis
 
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...DataStax Academy
 
User Inspired Management of Scientific Jobs in Grids and Clouds
User Inspired Management of Scientific Jobs in Grids and CloudsUser Inspired Management of Scientific Jobs in Grids and Clouds
User Inspired Management of Scientific Jobs in Grids and CloudsEran Chinthaka Withana
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Eric Evans
 
Lect 07 data replication
Lect 07 data replicationLect 07 data replication
Lect 07 data replicationBilal khan
 
Cassandra: Two data centers and great performance
Cassandra: Two data centers and great performanceCassandra: Two data centers and great performance
Cassandra: Two data centers and great performanceDATAVERSITY
 
IBM InfoSphere Data Replication for Big Data
IBM InfoSphere Data Replication for Big DataIBM InfoSphere Data Replication for Big Data
IBM InfoSphere Data Replication for Big DataIBM Analytics
 
Large partition in Cassandra
Large partition in CassandraLarge partition in Cassandra
Large partition in CassandraShogo Hoshii
 
Cassandra Data Model
Cassandra Data ModelCassandra Data Model
Cassandra Data Modelebenhewitt
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basicsnickmbailey
 
Learning Cassandra
Learning CassandraLearning Cassandra
Learning CassandraDave Gardner
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...DataStax Academy
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache CassandraRobert Stupp
 
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL databaseHBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL databaseEdureka!
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra ExplainedEric Evans
 

Viewers also liked (20)

Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache Cassandra
 
Replication, Durability, and Disaster Recovery
Replication, Durability, and Disaster RecoveryReplication, Durability, and Disaster Recovery
Replication, Durability, and Disaster Recovery
 
Indexing in Cassandra
Indexing in CassandraIndexing in Cassandra
Indexing in Cassandra
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
 
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...
C* Summit 2013: Eventual Consistency != Hopeful Consistency by Christos Kalan...
 
User Inspired Management of Scientific Jobs in Grids and Clouds
User Inspired Management of Scientific Jobs in Grids and CloudsUser Inspired Management of Scientific Jobs in Grids and Clouds
User Inspired Management of Scientific Jobs in Grids and Clouds
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Lect 07 data replication
Lect 07 data replicationLect 07 data replication
Lect 07 data replication
 
Cassandra: Two data centers and great performance
Cassandra: Two data centers and great performanceCassandra: Two data centers and great performance
Cassandra: Two data centers and great performance
 
IBM InfoSphere Data Replication for Big Data
IBM InfoSphere Data Replication for Big DataIBM InfoSphere Data Replication for Big Data
IBM InfoSphere Data Replication for Big Data
 
Large partition in Cassandra
Large partition in CassandraLarge partition in Cassandra
Large partition in Cassandra
 
Cassandra Data Model
Cassandra Data ModelCassandra Data Model
Cassandra Data Model
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
Learning Cassandra
Learning CassandraLearning Cassandra
Learning Cassandra
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL databaseHBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
 

Similar to Introduction to Cassandra: Replication and Consistency

Dynamo: Not Just For Datastores
Dynamo: Not Just For DatastoresDynamo: Not Just For Datastores
Dynamo: Not Just For DatastoresSusan Potter
 
Design Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational DatabasesDesign Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational Databasesguestdfd1ec
 
Design Patterns For Distributed NO-reational databases
Design Patterns For Distributed NO-reational databasesDesign Patterns For Distributed NO-reational databases
Design Patterns For Distributed NO-reational databaseslovingprince58
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for SysadminsNathan Milford
 
Handling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web SystemsHandling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web SystemsVineet Gupta
 
Distributed Coordination
Distributed CoordinationDistributed Coordination
Distributed CoordinationLuis Galárraga
 
Dynamo cassandra
Dynamo cassandraDynamo cassandra
Dynamo cassandraWu Liang
 
Renegotiating the boundary between database latency and consistency
Renegotiating the boundary between database latency  and consistencyRenegotiating the boundary between database latency  and consistency
Renegotiating the boundary between database latency and consistencyScyllaDB
 
Cassandra & Python - Springfield MO User Group
Cassandra & Python - Springfield MO User GroupCassandra & Python - Springfield MO User Group
Cassandra & Python - Springfield MO User GroupAdam Hutson
 
NOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraNOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraFolio3 Software
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overviewSean Murphy
 
Distributed Database Consistency: Architectural Considerations and Tradeoffs
Distributed Database Consistency: Architectural Considerations and TradeoffsDistributed Database Consistency: Architectural Considerations and Tradeoffs
Distributed Database Consistency: Architectural Considerations and TradeoffsScyllaDB
 
Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Boris Yen
 
Talk About Apache Cassandra
Talk About Apache CassandraTalk About Apache Cassandra
Talk About Apache CassandraJacky Chu
 
Basics of Distributed Systems - Distributed Storage
Basics of Distributed Systems - Distributed StorageBasics of Distributed Systems - Distributed Storage
Basics of Distributed Systems - Distributed StorageNilesh Salpe
 
Distribute Key Value Store
Distribute Key Value StoreDistribute Key Value Store
Distribute Key Value StoreSantal Li
 
Distribute key value_store
Distribute key value_storeDistribute key value_store
Distribute key value_storedrewz lin
 
Compilers Are Databases
Compilers Are DatabasesCompilers Are Databases
Compilers Are DatabasesMartin Odersky
 

Similar to Introduction to Cassandra: Replication and Consistency (20)

Dynamo: Not Just For Datastores
Dynamo: Not Just For DatastoresDynamo: Not Just For Datastores
Dynamo: Not Just For Datastores
 
Design Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational DatabasesDesign Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational Databases
 
Design Patterns For Distributed NO-reational databases
Design Patterns For Distributed NO-reational databasesDesign Patterns For Distributed NO-reational databases
Design Patterns For Distributed NO-reational databases
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
 
Handling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web SystemsHandling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web Systems
 
Distributed Coordination
Distributed CoordinationDistributed Coordination
Distributed Coordination
 
Dynamo cassandra
Dynamo cassandraDynamo cassandra
Dynamo cassandra
 
Renegotiating the boundary between database latency and consistency
Renegotiating the boundary between database latency  and consistencyRenegotiating the boundary between database latency  and consistency
Renegotiating the boundary between database latency and consistency
 
NoSql Database
NoSql DatabaseNoSql Database
NoSql Database
 
Cassandra & Python - Springfield MO User Group
Cassandra & Python - Springfield MO User GroupCassandra & Python - Springfield MO User Group
Cassandra & Python - Springfield MO User Group
 
Cassandra
CassandraCassandra
Cassandra
 
NOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraNOSQL Database: Apache Cassandra
NOSQL Database: Apache Cassandra
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
 
Distributed Database Consistency: Architectural Considerations and Tradeoffs
Distributed Database Consistency: Architectural Considerations and TradeoffsDistributed Database Consistency: Architectural Considerations and Tradeoffs
Distributed Database Consistency: Architectural Considerations and Tradeoffs
 
Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011
 
Talk About Apache Cassandra
Talk About Apache CassandraTalk About Apache Cassandra
Talk About Apache Cassandra
 
Basics of Distributed Systems - Distributed Storage
Basics of Distributed Systems - Distributed StorageBasics of Distributed Systems - Distributed Storage
Basics of Distributed Systems - Distributed Storage
 
Distribute Key Value Store
Distribute Key Value StoreDistribute Key Value Store
Distribute Key Value Store
 
Distribute key value_store
Distribute key value_storeDistribute key value_store
Distribute key value_store
 
Compilers Are Databases
Compilers Are DatabasesCompilers Are Databases
Compilers Are Databases
 

Recently uploaded

DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 

Introduction to Cassandra: Replication and Consistency

  • 1. Cassandra: Replication & Consistency. Benjamin Black, b@b3k.us, 2010-04-28
  • 2. Dynamo: cluster management, replication, fault tolerance. BigTable: sparse, columnar data model; storage architecture. Cassandra draws on both.
  • 3. Dynamo-like Features: Symmetric, P2P architecture (no special nodes/SPOFs). Gossip-based cluster management. Distributed hash table for data placement (pluggable partitioning, pluggable topology discovery, pluggable placement strategies). Tunable, eventual consistency.
  • 4. BigTable-like Features: Sparse, “columnar” data model (optional 2-level maps called Super Column Families). SSTable disk storage (append-only commit log, Memtable for buffering and sorting, immutable SSTable files). Hadoop integration.
  • 5. Topic(s) for Today: Replication & Consistency
  • 6. [1]
  • 7. Replication: How many copies of each piece of data do we want in the system? N=3
  • 8. Consistency Level: How many replicas must respond to declare success? W=2? R=2?
  • 9. CL.Options
        Class    Level    WRITE description                       READ description
        WEAK     ZERO     Cross fingers                           -
        WEAK     ANY      1st response (including hinted handoff) -
        WEAK     ONE      1st response                            1st response
        STRONG   QUORUM   N/2 + 1 replicas                        N/2 + 1 replicas
        STRONG   ALL      All replicas                            All replicas
  • 10. A Side Note on CL: Consistency Level is based on Replication Factor (N), not on the number of nodes in the system.
  • 11. A Question of Time: A row holds columns; all columns have a value and a timestamp. Timestamps are provided by clients, with microsecond resolution by convention. Latest timestamp wins. Vector clocks may be introduced in 0.7.
  • 12. Read Repair: Query all replicas on every read: data from one replica, checksum/timestamps from all others. If there is a mismatch: pull all data and merge, then write back to out-of-sync replicas.
  • 13. Weak vs. Strong: Weak Consistency (reads): perform repair after returning results. Strong Consistency (reads): perform repair before returning results.
  • 14. R+W>N: Please imagine this inequality has huge fangs, dripping with the blood of innocent, enterprise developers so you can best appreciate the terror it inspires.
  • 15. Our Guarantee: R+W>N guarantees overlap of read and write quorums. Example: W=2, R=2, N=3.
  • 17. [2]
  • 18. The Ring [diagram: a token ring with labeled positions 0, 113, 125, 250, 312, 375 and one highlighted range]
  • 19. Tokens: A TOKEN is a partitioner-dependent element on the ring. Each NODE has a single, unique TOKEN. Each NODE claims a RANGE of the ring from its TOKEN to the token of the previous node on the ring.
  • 20. Partitioning: Map from Key Space to Token. RandomPartitioner: tokens are integers in the range 0 to 2^127; MD5(Key) -> Token. Good: even key distribution. Bad: inefficient range queries. OrderPreservingPartitioner: tokens are UTF8 strings in the range ‘’ to ∞; Key -> Token. Good: efficient range queries. Bad: uneven key distribution.
  • 21. Snitching: Map from Nodes to Physical Location. EndpointSnitch: guess at rack and datacenter based on IP address octets. DatacenterEndpointSnitch: specify IP subnets for racks, grouped per datacenter. PropertySnitch: specify arbitrary mappings from individual IP addresses to racks and datacenters. Or write your own!
  • 22. Placement: Map from Token Space to Nodes. The first replica is always placed on the node that claims the range in which the token falls. Strategies determine where the rest of the replicas are placed.
  • 23. RackUnaware: Place replicas on the N-1 subsequent nodes around the ring, ignoring topology. [diagram: datacenters A and B, each with rack 1 and rack 2]
  • 24. RackAware: Place the second replica in another datacenter, and the remaining N-2 replicas on nodes in other racks in the same datacenter. [diagram: datacenters A and B, each with rack 1 and rack 2]
  • 25. DatacenterShard: Place M of the N replicas in another datacenter, and the remaining N - (M + 1) replicas on nodes in other racks in the same datacenter. [diagram: datacenters A and B, each with rack 1 and rack 2]
  • 27. [fin]
  • 29. Amazon Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html; Google BigTable: http://labs.google.com/papers/bigtable.html; Facebook Cassandra: http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
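
The QUORUM size from slide 9 and the R+W>N guarantee from slides 14 and 15 can be checked by brute force. The following is a minimal Python sketch, not Cassandra code: the function names `quorum` and `quorums_always_overlap` are hypothetical, and replicas are modeled simply as the integers 0..N-1.

```python
from itertools import combinations


def quorum(n: int) -> int:
    """Replicas needed for QUORUM: N/2 + 1, using integer division."""
    return n // 2 + 1


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """Brute-force check that every possible R-replica read set shares
    at least one replica with every possible W-replica write set."""
    replicas = range(n)  # hypothetical replica ids 0..n-1
    return all(
        set(read_set) & set(write_set)  # empty intersection is falsy
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


if __name__ == "__main__":
    n = 3
    print(quorum(n))                        # QUORUM size for N=3
    print(quorums_always_overlap(n, 2, 2))  # slide 15: W=2, R=2, N=3
    print(quorums_always_overlap(n, 1, 1))  # ONE/ONE: R+W is not > N
```

With N=3, reading and writing at QUORUM always overlap in at least one replica, while reading and writing at ONE can miss each other entirely, which is why slide 9 labels ONE as weak.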

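Slides 19 through 23 describe how a key maps to a token, how a token maps to the node that claims its range, and how RackUnaware adds the next N-1 nodes around the ring. Here is a rough Python sketch of that flow under stated assumptions: the helper names are hypothetical, the MD5 digest is reduced modulo 2^127 as a simplification of RandomPartitioner, and a node's range is assumed to include its own token and exclude the previous node's token (the slides do not say which end is inclusive).

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127  # RandomPartitioner token space from slide 20


def token_for(key: str) -> int:
    """Simplified MD5(Key) -> Token: digest as an integer, mod 2^127."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def primary_index(tokens: list[int], token: int) -> int:
    """Index of the node claiming `token`, given sorted node tokens.

    Assumption: node i claims (tokens[i-1], tokens[i]], wrapping past
    the highest token back to the first node.
    """
    return bisect_left(tokens, token) % len(tokens)


def rack_unaware_replicas(ring: dict[int, str], token: int, n: int) -> list[str]:
    """First replica on the claiming node, then the next N-1 nodes
    around the ring, ignoring topology (slide 23)."""
    tokens = sorted(ring)
    start = primary_index(tokens, token)
    count = min(n, len(tokens))  # cannot place more replicas than nodes
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


if __name__ == "__main__":
    # Hypothetical four-node ring with small tokens for readability.
    ring = {0: "node-a", 125: "node-b", 250: "node-c", 375: "node-d"}
    print(rack_unaware_replicas(ring, 113, 3))
    print(rack_unaware_replicas(ring, token_for("some-row-key"), 3))
```

RackAware and DatacenterShard would replace only the final list comprehension with a topology-aware walk that consults a snitch; the token lookup stays the same.
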
Editor's Notes