Cassandra
Consistency
Quick Overview
Token/DHT
Consistent Hashing
Replication Factor(RF)
Consistency Level(CL)
Hinted Handoff(HH)
A hint is written to the coordinator node when a replica is down
Read Repair(RR)
Background digest query on-read to find and update out-of-date replicas*
* carried out in the background unless CL:ALL
http://www.planetcassandra.org/data-replication-in-nosql-databases-explained/#
Updates (insert, update, delete)
https://uberdev.wordpress.com/2015/11/29/cassandra-developer-certification-study-notes-read-path/
Write Path
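SSTables are immutable: once a memtable has been flushed to disk, the resulting SSTable can no longer be written to. A single partition may span multiple SSTables, but it cannot span multiple nodes.
Partition/Primary Index: the partition keys and the start position of each row in the data file (metadata about the data; an index).
Partition/Index Summary: a sample of the partition index, kept in memory (metadata about the metadata; an index of the index).
Bloom Filter: checks whether a row (partition key) is in an SSTable; if it is not, the SSTable is not read.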
http://docs.datastax.com/en/cassandra/2.2/cassandra/dml/dmlHowDataWritten.html
Read Path
[Figure: read request flow, steps ① to ⑦. Labels: "Memtable", "RowCache", Y/N branches, and at step ⑤ "a pk is found in key cache".]
https://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlAboutReads.html
http://www.datastax.com/dev/blog/maximizing-cache-benefit-with-cassandra
Read Request Flow
Row cache & Key cache
The row cache is not write-through. If a write comes in for the row,
the cache for that row is invalidated and is not cached again until
the row is read. Similarly, if a partition is updated, the entire partition
is evicted from the cache. When the desired partition data is not
found in the row cache, then the Bloom filter is checked.
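The row cache is not written to directly: if a row is updated, its entry in the
row cache is invalidated and removed, and it is not cached again until the next time the row is read.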
A Bloom filter can establish that an SSTable does not contain certain
partition data. A Bloom filter can also find the likelihood that partition
data is stored in an SSTable. However, because the Bloom filter is a
probabilistic function, it can result in false positives. Not all SSTables
identified by the Bloom filter will have data. If the Bloom filter does
not rule out an SSTable, Cassandra checks the partition key cache.
The partition key cache stores a cache of the partition index off-heap.
If a partition key is found in the key cache, the read can go directly to the
compression offset map to find the compressed block on disk that
has the data.
https://2012.nosql-matters.org/cgn/wp-content/uploads/2012/06/Sylvain_Lebresne-Cassandra_Storage_Engine.pdf
Write & Read Example
Compaction
SSTable
Storage
Format
Storage
http://distributeddatastore.blogspot.com/2013/08/cassandra-sstable-storage-format.html
Index.db
Data.db
The index file stores every key (no sampling).
In the MD5 table the keys and values are uniform in size,
so the index file and the data file are about the same size.
Regular Column Tombstone Column
Full Index & Sample Index
Index.db Summary.db
1. Row key length (short/2 bytes)
2. Key (N bytes)
3. Offset in SSTable data file (long/8 bytes)
4. Promoted size (int/4 bytes)
00000000 00 04 72 6f 77 41 00 00 00 00 00 00 00 00 00 00 |..rowA..........|
00000010 00 00 00 04 72 6f 77 42 00 00 00 00 00 00 00 5f |....rowB......._|
00000020 00 00 00 00 00 0a 72 6f 77 45 78 63 6c 75 64 65 |......rowExclude|
00000030 00 00 00 00 00 00 00 be 00 00 00 00 |............|
0000003c
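The entry layout listed above can be checked against the hex dump with a short parser. This is a minimal sketch, assuming big-endian fields (consistent with the `00 04` key lengths in the dump) and assuming every promoted size is 0, as it is here; the function name `parse_index` is illustrative and not part of Cassandra.

```python
import struct

# The 60 bytes (0x3c) from the hex dump above, one entry per line.
INDEX_DB = bytes.fromhex(
    "0004" "726f7741" "0000000000000000" "00000000"              # rowA
    "0004" "726f7742" "000000000000005f" "00000000"              # rowB
    "000a" "726f774578636c756465" "00000000000000be" "00000000"  # rowExclude
)

def parse_index(buf):
    """Return (key, data_file_offset, promoted_size) for each index entry."""
    entries = []
    pos = 0
    while pos < len(buf):
        (key_len,) = struct.unpack_from(">H", buf, pos)   # 2-byte key length
        pos += 2
        key = buf[pos:pos + key_len].decode("ascii")      # N-byte key
        pos += key_len
        # 8-byte offset into Data.db, then 4-byte promoted size
        offset, promoted_size = struct.unpack_from(">qi", buf, pos)
        pos += 12
        entries.append((key, offset, promoted_size))
    return entries

for entry in parse_index(INDEX_DB):
    print(entry)
# ('rowA', 0, 0)
# ('rowB', 95, 0)
# ('rowExclude', 190, 0)
```

The offsets 0x5f and 0xbe are the byte positions in Data.db where rowB and rowExclude start.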
Failure,Error Handling
http://www.datastax.com/dev/blog/cassandra-error-handling-done-right
http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
When a timeout is not a failure
Rapid Read Protection(speculative_retry/dynamic snitch)
https://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlClientRequestsRead.html
http://www.planetcassandra.org/blog/rapid-read-protection-in-cassandra-202/
https://issues.apache.org/jira/browse/CASSANDRA-5932
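1. The client requests data from the coordinator node; the coordinator routes the
request to the best-performing node (replica) and finally returns the result to the client.
This applies to reads only. A read requests the full data from just one replica; then, depending on the consistency level and the read repair chance,
it requests only a checksum (not the data) from other replicas, so choosing the most suitable replica matters.
The dynamic snitch monitors the read performance of the different replicas and picks the best one based on history.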
ALTER TABLE users WITH speculative_retry = '10ms';
ALTER TABLE users WITH speculative_retry = '99percentile';
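Pros: lower read latency when some nodes perform poorly
Cons: generates extra requests, which reduces throughput
Notes:
1) Does not apply at consistency level ALL, because that level already has to read from every replica
2) In small clusters, rapid read protection also lowers throughput; in larger clusters the effect is less noticeable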
Recovering from replica node failure with rapid read protection
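2. If the node the request was routed to fails before returning its response
to the coordinator, the client's request eventually times out.
3. Rapid read protection: lets the coordinator monitor outstanding requests.
When the original replica's response to the read is slower than expected,
the coordinator sends additional requests to the nodes holding other replicas.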
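Nothing should be absolute: never enabling speculative execution is bad, and always enabling it is not a good idea either.
Enable speculative execution for only 90% of requests, so that only 10% of requests go unprotected.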
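The effect of a fixed speculative_retry threshold can be illustrated with a toy latency model. This is a sketch under stated assumptions, not Cassandra's implementation: the function `read_latency_ms` and its parameters are hypothetical, `None` stands for a replica that never answers, and the backup replica's latency is measured from the moment the extra request is sent.

```python
def read_latency_ms(primary_ms, backup_ms, retry_after_ms=None):
    """Latency the coordinator observes for one read, or None on timeout.

    primary_ms     -- response time of the replica chosen first (None = no response)
    backup_ms      -- response time of another replica, counted from the retry
    retry_after_ms -- speculative retry threshold (None = protection disabled)
    """
    if retry_after_ms is None:
        return primary_ms                      # no protection: wait for the primary
    if primary_ms is not None and primary_ms <= retry_after_ms:
        return primary_ms                      # primary answered before the threshold
    speculative = retry_after_ms + backup_ms   # extra request sent at the threshold
    if primary_ms is None:
        return speculative                     # primary failed; the backup saves the read
    return min(primary_ms, speculative)        # whichever replica answers first wins

print(read_latency_ms(5, 4, retry_after_ms=10))      # 5
print(read_latency_ms(200, 4, retry_after_ms=10))    # 14
print(read_latency_ms(None, 4, retry_after_ms=10))   # 14
print(read_latency_ms(None, 4))                      # None
```

The last two calls correspond to steps 2 and 3 above: without protection a failed replica leads to a timeout, with it the read completes shortly after the threshold.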
Data Consistency
Paxos consensus protocol
Lightweight Transaction(CAS)two-phase commit
https://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlAboutDataConsistency.html
Linearizable consistency
Tunable Consistency:
R: the consistency level of read operations
W: the consistency level of write operations
N: the number of replicas
Strong consistency is guaranteed when R + W > N
Eventual consistency occurs when R + W <= N
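The rule above is plain arithmetic and can be checked for common level combinations. A minimal sketch; the helper names `quorum` and `is_strongly_consistent` are illustrative, and R and W are expressed as replica counts rather than level names.

```python
def quorum(n):
    """Number of replicas in a majority of n replicas."""
    return n // 2 + 1

def is_strongly_consistent(r, w, n):
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n

N = 3  # replication factor
print(is_strongly_consistent(quorum(N), quorum(N), N))  # QUORUM/QUORUM: True
print(is_strongly_consistent(1, N, N))                  # read ONE, write ALL: True
print(is_strongly_consistent(1, 1, N))                  # ONE/ONE: False
```

With N = 3, QUORUM is 2, so a QUORUM read and a QUORUM write always share at least one replica.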
Client read or write requests can go to any node in the cluster because all nodes in Cassandra are peers. When a client
connects to a node and issues a read or write request, that node serves as the coordinator for that particular client operation.
The job of the coordinator is to act as a proxy between the client application and the nodes (or replicas) that own the data
being requested. The coordinator determines which nodes in the ring should get the request based on the cluster configured
partitioner and replica placement strategy.
https://www.datadoghq.com/blog/how-to-monitor-cassandra-performance-metrics/
Coordinator
Consistency refers to how up-to-date and synchronized a row of Cassandra data is on all of its replicas.
Using repair operations, Cassandra data will eventually be consistent in all replicas. Repairs work to
decrease the variability in replica data, but at a given time, stale data can be present.
The consistency level determines the number of replicas that need to acknowledge the read or write
operation success to the client application. For read operations, the read consistency level specifies how
many replicas must respond to a read request before returning data to the client application. For write
operations, the write consistency level specifies how many replicas must respond to a write request
before the write is considered successful.
Even at low consistency levels, Cassandra writes to all replicas of the partition key, including replicas in
other data centers. The write consistency level just specifies when the coordinator can report to the client
application that the write operation is considered completed.
If a read operation reveals inconsistency among replicas, Cassandra initiates a read repair to
update the inconsistent data. Write operations will use hinted handoffs to ensure the writes are
completed when replicas are down or otherwise not responsive to the write request.
Typically, a client specifies a consistency level that is less than the replication factor specified by the
keyspace. Another common practice is to write at a consistency level of QUORUM and read at a
consistency level of QUORUM. The choices made depend on the client application's needs, and Cassandra
provides maximum flexibility for application design. There is a tradeoff between operation latency and
consistency: higher consistency incurs higher latency, lower consistency permits lower latency. You can
control latency by tuning consistency.
Consistency Level(CL): How many replicas must respond to declare success?
Hinted Handoff(HH): A hint is written to the coordinator node when a replica is down
Read Repair(RR): Background digest query on-read to find and update out-of-date replicas
https://docs.datastax.com/en/cassandra/2.2/cassandra/dml/dmlAboutDataConsistency.html
Consistency Level
Client
Direct Read
Direct Read
Digest Read
Compare In Memory
Decide Which Latest
What if n4 is newer than n3? Is another direct read issued to n4?
(n4 returned only a digest; getting the full data requires a direct read.)
In this situation, n3 will also pull the newer data from n4.
❓
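Although the replicas are stored on n2, n3 and n4, and n2 can be considered the primary replica,
the coordinator picks the replica on the fastest node based on historical data.
CL=ONE?
Read the data from the node with the lowest load (what if it is not the latest?)
Are replicas compared pairwise, or is the direct read compared against the digest reads?
read_repair_chance takes effect when CL=ONE: only 10% of requests need to perform read repair.
The chance has no effect when CL>ONE, i.e. CL=QUORUM/ALL: every request is repaired as soon as an inconsistency is found.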
read_repair_chance is ignored if the ConsistencyLevel
is greater than ONE and read repair always occurs.
Write=ALL, Read=ONE guarantees strong consistency, while only 10% of requests start a read repair in the background
Read repair means that when a query is made against a given key, we perform a digest query against all the replicas of the key and push the
most recent version to any out-of-date replicas. If a lower ConsistencyLevel than ALL was specified, this is done in the background after
returning the data from the closest replica to the client; otherwise(CL=ALL), it is done before returning the data. This means that in almost all
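cases, at most the first instance of a query will return old data (the first query may receive stale data, but later identical queries see fresh data because the replicas have been repaired).
Read repair mechanism: a query first reads the data from the closest node [1], then sends digest requests to the other nodes; after all replicas have been compared, the replica data with the latest timestamp
is pushed to the out-of-date replicas. The consistency level only changes when read repair happens: at ONE or QUORUM, read repair starts in the background after the data from the closest node [1] has been returned to the client.
At ALL, read repair completes before the data is returned to the client.
Whatever the consistency level, the full data is requested only from the closest node; even if that node's data is not the latest, it is still what gets returned to the client, so stale data may be returned.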
https://wiki.apache.org/cassandra/ReadRepair
https://docs.datastax.com/en/cassandra/2.2/cassandra/dml/dmlClientRequestsRead.html
http://www.datastax.com/dev/blog/common-mistakes-and-misconceptions
There are three types of read requests that a coordinator can send to a replica:
+ A direct read request
+ A digest request
+ A background read repair request
The coordinator node contacts one replica node with a direct read request. Then the coordinator sends a digest request to a number of
replicas determined by the consistency level specified by the client. The digest request checks the data in the replica node to make sure it
is up to date. Then the coordinator sends a digest request to all remaining replicas. If any replica nodes have out of date data, a
background read repair request is sent. Read repair requests ensure that the requested row is made consistent on all replicas.
For a digest request the coordinator first contacts the replicas specified by the consistency level. The coordinator sends these requests to
the replicas that are currently responding the fastest. The nodes contacted respond with a digest of the requested data; if multiple nodes are
contacted, the rows from each replica are compared in memory to see if they are consistent. If they are not, then the replica that has the
most recent data (based on the timestamp) is used by the coordinator to forward the result back to the client. To ensure that all replicas have
the most recent version of the data, read repair is carried out to update out-of-date replicas.
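CL=ONE: direct read from one node, but only 10% of requests trigger a background read repair (of the remaining two replicas).
CL=QUORUM: direct read from one node and a digest read to one other node, which satisfies QUORUM. Once these two nodes are confirmed consistent,
the direct-read data is returned to the client, and a digest read is then sent to the last node (what if that last node is the one with the latest data?).
CL=ALL: direct read from one node and digest reads to the other two; read repair runs to make all nodes consistent, then the direct-read data is returned to the client.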
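The direct read, digest read and timestamp reconciliation described above can be sketched in a few lines. This is a simplified model, not Cassandra's code: replicas are a dict of `(value, timestamp)` cells, the names `digest` and `read` are hypothetical, only the replicas needed for the consistency level are contacted, and the background check of the remaining replicas is left out.

```python
import hashlib

def digest(cell):
    """Hash of a (value, timestamp) cell, standing in for a digest response."""
    value, timestamp = cell
    return hashlib.md5(f"{value}|{timestamp}".encode()).hexdigest()

def read(replicas, contact_order, required):
    """Read with `required` replicas; repair contacted replicas on mismatch."""
    contacted = contact_order[:required]
    data = replicas[contacted[0]]                            # direct read: full data
    digests = [digest(replicas[n]) for n in contacted[1:]]   # digest reads
    if all(d == digest(data) for d in digests):
        return data[0]                                       # replicas agree
    # Mismatch: fetch full data from the contacted replicas; newest timestamp wins.
    newest = max((replicas[n] for n in contacted), key=lambda cell: cell[1])
    for n in contacted:
        if replicas[n] != newest:
            replicas[n] = newest                             # read repair
    return newest[0]

replicas = {"n2": ("old", 1), "n3": ("new", 2), "n4": ("old", 1)}
print(read(replicas, ["n4", "n2", "n3"], required=1))  # old  (CL=ONE can be stale)
print(read(replicas, ["n2", "n3", "n4"], required=2))  # new  (CL=QUORUM, RF=3)
print(replicas["n2"])                                  # ('new', 2) after repair
```

In this model the question raised above has a concrete answer: at CL=ONE nothing is compared before the response, so a stale closest replica yields a stale result, while at QUORUM a digest mismatch forces a full-data comparison first.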
Read & Read Repair
Read repair is not directly related to repair, but both play a role in the overall anti-entropy system in Cassandra. The read_repair_chance setting
originally defaulted to 1. That is, at a consistency level of ONE, for every read, we would check whether the data we just read is consistent
with the other replicas. This was good, because if you ever read stale data, the next time you read the same row you would probably read something
more up to date. The bad part was that every read became RF reads (and typically your RF is set to at least 3), meaning that reads
happen more often and require more IO. In newer versions of Cassandra the default for this value is 0.1, and it is set on a per-column-family basis,
which means 10% of your requests will trigger a background read repair. This is more than enough for typical scenarios.
When data is read to satisfy a query and return a result, all replicas are queried for the data needed. The first replica
node receives a direct read request and supplies the full data to the coordinator. The other
nodes contacted receive a digest request and return a digest, or hash of the data, to the coordinator. A
digest is requested because generally the hash is smaller than the data itself.
A comparison of the digests allows the coordinator to return the most up-to-date data to the query.
(Open questions: can a digest be returned to the client directly? What if the direct read is not the latest? Can a digest be compared with the direct read?) If the digests are
the same for enough replicas to meet the consistency level, the data is returned to the client. If the
consistency level of the read query is ALL, the comparison must be completed before the results are returned; otherwise for all lower
consistency levels, it is done in the background.
The coordinator compares the digests, and if a mismatch is discovered, a request for the full data is sent to the mismatched
nodes. (Open question: is the full data used here the direct-read data, or the data with the latest timestamp among the digests?) The most current data found
in a full data comparison is used to reconcile any inconsistent data on other replicas.
http://docs.datastax.com/en/cassandra/2.2/cassandra/operations/opsRepairNodesTOC.html
http://docs.datastax.com/en/cassandra/2.2/cassandra/operations/opsRepairNodesReadRepair.html
Node repair makes data on a replica consistent with data on other nodes and is important for every Cassandra cluster. Repair is the process
of correcting the inconsistencies so that eventually, all nodes have the same and most up-to-date data.
Repair can occur in the following ways:
✅ Hinted Handoff
During the write path, if a node that should receive data is unavailable, hints are written to the coordinator. When the node comes back online,
the coordinator can hand off the hints so that the node can catch up and write the data.
✅ Read Repair
During the read path, a query acquires data from several nodes. The acquired data from each node is checked against each other node. If a
node has outdated data, the most recent data is written back to the node.
✅ Anti-Entropy Repair
For maintenance purposes or recovery, manually run anti-entropy repair to rectify inconsistencies on any nodes(by nodetool repair).
Repair
Hint TTL: max_hint_window_in_ms = 3 hours
If a node is down for more than 3 hours, no further hints are stored for it
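The hint window rule can be written as a single comparison. A minimal sketch: the function `should_store_hint` is hypothetical, and treating the boundary as inclusive is an assumption rather than documented behaviour.

```python
MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000  # 3 hours, the value quoted above

def should_store_hint(down_since_ms, now_ms, window_ms=MAX_HINT_WINDOW_MS):
    """True while the replica has been down no longer than the hint window."""
    return now_ms - down_since_ms <= window_ms

print(should_store_hint(0, 60 * 1000))               # True: down for one minute
print(should_store_hint(0, MAX_HINT_WINDOW_MS + 1))  # False: window exceeded
```

Writes missed after the window closes are not covered by hints and are left to read repair or anti-entropy repair.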
Tunable consistency
Low latency, low consistency: only lower consistency gives lower latency
High latency, high consistency: higher consistency incurs higher latency
Read
Write
Consistency Example
https://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlClientRequestsWrite.html
The coordinator sends a write request to all replicas that own the row being written. As long as all replica nodes are up and available, they will get the
write regardless of the consistency level specified by the client. The write consistency level determines how many replica nodes must respond with a
success acknowledgment in order for the write to be considered successful. Success means that the data was written to the commit log and the
memtable as described in how data is written.
In a single data center 12 node cluster with a replication factor of 3, an incoming write will go to all 3 nodes that own the requested row. If the write
consistency level specified by the client is ONE, the first node [R1] to complete the write responds back to the coordinator, which then proxies the
success message back to the client [write response]. A consistency level of ONE means that it is possible that 2 of the 3 replicas [R2,R3] could miss
the write if they happened to be down at the time the request was made.
That node [coordinator] forwards the write to all replicas of that row. It responds to the client once it receives write acknowledgments from the number
of nodes specified by the consistency level.
1. If the coordinator cannot write to enough replicas to meet the requested CL, it throws an Unavailable Exception and does not perform any writes.
2. If there are enough replicas available but the required writes don't finish within the timeout window, the coordinator throws a Timeout Exception.
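The two failure cases above differ in when they are detected: one before any write is sent, the other after waiting for acknowledgments. A toy sketch of that decision; the exception classes and `coordinate_write` are hypothetical names for illustration, not the driver's API, and replicas are reduced to counts.

```python
class UnavailableError(Exception):
    """Not enough live replicas to even attempt the write."""

class WriteTimeoutError(Exception):
    """Enough replicas were alive, but too few acknowledged in time."""

def coordinate_write(replicas_up, replicas_acked, required):
    """Decide the outcome of a write at a consistency level needing `required` acks."""
    if replicas_up < required:
        # Case 1: checked up front, so no write is performed at all.
        raise UnavailableError(f"{replicas_up} up, {required} required")
    if replicas_acked < required:
        # Case 2: writes were sent, but acknowledgments did not arrive in time.
        raise WriteTimeoutError(f"{replicas_acked} acked, {required} required")
    return "success"

print(coordinate_write(replicas_up=3, replicas_acked=2, required=2))  # success
```

A timeout therefore does not mean the write was lost: some replicas may already have applied it.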
Write consistency
DC:2, RF:3, CL:QUORUM =>
all data centers, two replicas
In multiple data center deployments, Cassandra
optimizes write performance by choosing one
coordinator node. The coordinator node contacted
by the client application forwards the write request
to each replica node in each of the data centers.
If using a consistency level of LOCAL_ONE or
LOCAL_QUORUM, only the nodes in the same
data center as the coordinator node must respond
to the client request in order for the request to
succeed. This way, geographical latency does not
impact client request response times.
https://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlClientRequestsReadExp.html
DC:1, RF:3, CL:QUORUM=>2
In a single data center cluster with a replication factor of 3, and a read consistency level of QUORUM, 2 of the 3 replicas for the given row
must respond to fulfill the read request. If the contacted replicas have different versions of the row, the replica with the most recent version will
return the requested data [to Client]. In the background, the third replica is checked for consistency with the first two, and if needed, a read
repair is initiated for the out-of-date replicas.
Read consistency
DC:1, RF:3, CL:ONE=>1
In a single data center cluster with a replication factor of 3, and a read consistency level of ONE, the closest replica for the given row is
contacted to fulfill the read request. In the background a read repair is potentially initiated, based on the read_repair_chance setting of the
table, for the other replicas.
In a two data center cluster with a RF=3, and a
read consistency of QUORUM, 4 replicas for the
given row must respond to fulfill the read request.
The 4 replicas can be from any data center. In the
background, the remaining replicas are checked
for consistency with the first four, and if needed,
a read repair is initiated for the out-of-date replicas.
DC:2, RF:3, CL:QUORUM =>
any data center, four replicas
DC:2, RF:3, CL:LOCAL_QUORUM =>
local data center, two replicas
In a multiple data center cluster with a RF=3,
and a read consistency of LOCAL_QUORUM,
2 replicas in the same DC as the coordinator
node for the given row must respond to fulfill
the read request. In the background, the
remaining replicas are checked for consistency
with the first 2, and if needed, a read repair is
initiated for the out-of-date replicas.
DC:2, RF:3, CL:ONE =>
any DC, one replica
In a multiple data center cluster with a RF=3,
and a read consistency of ONE, the closest replica
for the given row, regardless of data center,
is contacted to fulfill the read request. In the
background a read repair is potentially initiated,
based on the read_repair_chance setting of the
table, for the other replicas.
DC:2, RF:3, CL:LOCAL_ONE =>
local data center, one replica
In a multiple data center cluster with a RF=3,
and a read consistency of LOCAL_ONE, the
closest replica for the given row in the same
data center as the coordinator node is
contacted to fulfill the read request. In the
background a read repair is potentially initiated,
based on the read_repair_chance setting of the
table, for the other replicas.
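The replica counts in the read examples above follow from the replication factor per data center. A minimal sketch, assuming RF is given per data center as in the examples; the function `required_replicas` and the data center names are illustrative.

```python
def required_replicas(cl, rf_per_dc, local_dc):
    """Replicas that must respond for a given consistency level."""
    total = sum(rf_per_dc.values())
    if cl in ("ONE", "LOCAL_ONE"):
        return 1
    if cl == "QUORUM":
        return total // 2 + 1                 # majority of all replicas, any DC
    if cl == "LOCAL_QUORUM":
        return rf_per_dc[local_dc] // 2 + 1   # majority within the coordinator's DC
    if cl == "ALL":
        return total
    raise ValueError(f"unsupported consistency level: {cl}")

one_dc = {"dc1": 3}
two_dcs = {"dc1": 3, "dc2": 3}
print(required_replicas("QUORUM", one_dc, "dc1"))         # 2
print(required_replicas("QUORUM", two_dcs, "dc1"))        # 4
print(required_replicas("LOCAL_QUORUM", two_dcs, "dc1"))  # 2
print(required_replicas("LOCAL_ONE", two_dcs, "dc1"))     # 1
```

ONE and LOCAL_ONE both need a single response; they differ only in which data centers the responding replica may come from.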
Bloom Filter
[Figure sequence: a query for key1 asks the Bloom filter in front of each SSTable "Am I here?". A filter answers "No, you're not here!", the query accepts the answer ("OK, I believe you!") and goes on to the next SSTable.]
bloom_filter_fp_chance
Determines the percent chance of the Bloom filter returning a false positive,
that is, reporting that a partition exists in an SSTable when in fact it does not.
False positives are possible; false negatives are not.
If you increase the percent chance of false positives, then you lower memory usage via a smaller filter size at the expense of more disk seeks
due to an increase in false positives.
If you decrease the percent chance of false positives, then you increase memory usage via a larger filter size for the benefit of fewer disk
seeks thanks to fewer false positives.
https://grockdoc.com/cassandra/2.1/articles/tuning-reads-via-the-bloom-filter_88c8f57a-71d0-41ee-b77f-617c64ad4739/
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/compactSubprop.html
False positive matches are possible, but false negatives are not. In other words,
a query returns either “possibly in set” or “definitely not in set”.
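The "possibly in set" versus "definitely not in set" behaviour can be demonstrated with a small Bloom filter. This is an illustrative sketch only: it uses SHA-256 from the standard library and arbitrary sizes, which is an assumption for the example and not the hash function or sizing that Cassandra uses.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key):
        # False means "definitely not in set"; True means "possibly in set".
        return all(self.bits[p] for p in self._positions(key))

sstable_filter = BloomFilter()
for key in ("key1", "key2", "key3"):
    sstable_filter.add(key)

print(sstable_filter.might_contain("key1"))       # True: added keys always match
print(BloomFilter().might_contain("key1"))        # False: an empty filter rules it out
```

Fewer bits per key make the filter smaller but set more overlapping positions, which is the memory versus disk seek tradeoff that bloom_filter_fp_chance controls.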
http://www.datastax.com/dev/blog/improving-compaction-in-cassandra-with-cardinality-estimation
https://issues.apache.org/jira/browse/CASSANDRA-6474
Merkle Tree
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesManualRepair.html
http://www.datastax.com/dev/blog/more-efficient-repairs
JAVA Driver
http://christopher-batey.blogspot.com/2015/02/cassandra-anti-pattern-misuse-of.html
https://www.pythian.com/blog/guide-to-cassandra-thread-pools/
Cassandra consistency

 
Distributed Algorithms
Distributed AlgorithmsDistributed Algorithms
Distributed Algorithms913245857
 
Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applicationsDing Li
 
Couchbase - Yet Another Introduction
Couchbase - Yet Another IntroductionCouchbase - Yet Another Introduction
Couchbase - Yet Another IntroductionKelum Senanayake
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Understanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraUnderstanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraDataStax
 
Talon systems - Distributed multi master replication strategy
Talon systems - Distributed multi master replication strategyTalon systems - Distributed multi master replication strategy
Talon systems - Distributed multi master replication strategySaptarshi Chatterjee
 
RAC - The Savior of DBA
RAC - The Savior of DBARAC - The Savior of DBA
RAC - The Savior of DBANikhil Kumar
 

Similar to Cassandra consistency (20)

Cassandra & Python - Springfield MO User Group
Cassandra & Python - Springfield MO User GroupCassandra & Python - Springfield MO User Group
Cassandra & Python - Springfield MO User Group
 
Cassandra
CassandraCassandra
Cassandra
 
Distribute Key Value Store
Distribute Key Value StoreDistribute Key Value Store
Distribute Key Value Store
 
Distribute key value_store
Distribute key value_storeDistribute key value_store
Distribute key value_store
 
Spinnaker VLDB 2011
Spinnaker VLDB 2011Spinnaker VLDB 2011
Spinnaker VLDB 2011
 
The Apache Cassandra ecosystem
The Apache Cassandra ecosystemThe Apache Cassandra ecosystem
The Apache Cassandra ecosystem
 
Dynamo.ppt
Dynamo.pptDynamo.ppt
Dynamo.ppt
 
Dynamo.ppt
Dynamo.pptDynamo.ppt
Dynamo.ppt
 
NoSql Database
NoSql DatabaseNoSql Database
NoSql Database
 
Cassandra Consistency: Tradeoffs and Limitations
Cassandra Consistency: Tradeoffs and LimitationsCassandra Consistency: Tradeoffs and Limitations
Cassandra Consistency: Tradeoffs and Limitations
 
Distributed Algorithms
Distributed AlgorithmsDistributed Algorithms
Distributed Algorithms
 
Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applications
 
Couchbase - Yet Another Introduction
Couchbase - Yet Another IntroductionCouchbase - Yet Another Introduction
Couchbase - Yet Another Introduction
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Why Cassandra?
Why Cassandra?Why Cassandra?
Why Cassandra?
 
Cassandra no sql ecosystem
Cassandra no sql ecosystemCassandra no sql ecosystem
Cassandra no sql ecosystem
 
Understanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraUnderstanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache Cassandra
 
Talon systems - Distributed multi master replication strategy
Talon systems - Distributed multi master replication strategyTalon systems - Distributed multi master replication strategy
Talon systems - Distributed multi master replication strategy
 
RAC - The Savior of DBA
RAC - The Savior of DBARAC - The Savior of DBA
RAC - The Savior of DBA
 
No sql
No sqlNo sql
No sql
 

Recently uploaded

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Recently uploaded (20)

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Cassandra consistency

  • 7. Hinted Handoff (HH): A hint is written to the coordinator node when a replica is down.
  • 8. Read Repair (RR): Background digest query on read to find and update out-of-date replicas (carried out in the background unless CL is ALL).
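  • 14. SSTables are immutable: once a memtable has been flushed to disk, it can no longer be written to. A single partition may span multiple SSTables, but it cannot span multiple nodes. Partition/Primary Index: the partition keys and the starting position of each row in the data file (metadata about the data; an index). Partition/Index Summary: a sample of the partition index, kept in memory (metadata about the metadata; an index of the index). Bloom Filter: checks whether a row (partition key) is in an SSTable; if it is not, the SSTable is not read. http://docs.datastax.com/en/cassandra/2.2/cassandra/dml/dmlHowDataWritten.html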
  • 21. Read Request Flow: Row cache & Key cache. (Diagram: read request flow, steps ① to ⑦, through the memtable, the row cache (hit Y/N), and the key cache.) https://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlAboutReads.html http://www.datastax.com/dev/blog/maximizing-cache-benefit-with-cassandra
    The row cache is not write-through. If a write comes in for the row, the cache for that row is invalidated and is not cached again until the row is read. Similarly, if a partition is updated, the entire partition is evicted from the cache. When the desired partition data is not found in the row cache, the Bloom filter is checked.
    A Bloom filter can establish that an SSTable does not contain certain partition data. A Bloom filter can also find the likelihood that partition data is stored in an SSTable. However, because the Bloom filter is a probabilistic function, it can result in false positives: not all SSTables identified by the Bloom filter will have data. If the Bloom filter does not rule out an SSTable, Cassandra checks the partition key cache.
    The partition key cache stores a cache of the partition index off-heap. If a partition key is found in the key cache, the read can go directly to the compression offset map to find the compressed block on disk that has the data.
  • 41. Full Index & Sample Index (Index.db, Summary.db). Entry layout: 1. Row key length (short, 2 bytes) 2. Key (N bytes) 3. Offset in SSTable data file (long, 8 bytes) 4. Promoted size (int, 4 bytes). Hex dump:
    00000000 00 04 72 6f 77 41 00 00 00 00 00 00 00 00 00 00 |..rowA..........|
    00000010 00 00 00 04 72 6f 77 42 00 00 00 00 00 00 00 5f |....rowB......._|
    00000020 00 00 00 00 00 0a 72 6f 77 45 78 63 6c 75 64 65 |......rowExclude|
    00000030 00 00 00 00 00 00 00 be 00 00 00 00 |............|
    0000003c
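The entry layout on slide 41 can be checked against its own hex dump. The following is a minimal sketch, assuming the fields are big-endian (the dump shows a key length of `00 04` for the four-byte key `rowA`) and that entries are simply concatenated; `parse_index_entries` is a hypothetical helper name, not part of Cassandra.

```python
import struct

# Hex dump copied from the slide (0x3c = 60 bytes in total).
INDEX_DB_HEX = (
    "00 04 72 6f 77 41 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 04 72 6f 77 42 00 00 00 00 00 00 00 5f "
    "00 00 00 00 00 0a 72 6f 77 45 78 63 6c 75 64 65 "
    "00 00 00 00 00 00 00 be 00 00 00 00"
)


def parse_index_entries(raw):
    """Decode (key, data_file_offset, promoted_size) tuples from raw bytes."""
    entries = []
    pos = 0
    while pos < len(raw):
        # 1. Row key length: unsigned short, 2 bytes.
        (key_len,) = struct.unpack_from(">H", raw, pos)
        pos += 2
        # 2. Key: key_len bytes.
        key = raw[pos:pos + key_len].decode("ascii")
        pos += key_len
        # 3. Offset in the data file (long, 8 bytes), 4. promoted size (int, 4 bytes).
        offset, promoted_size = struct.unpack_from(">qi", raw, pos)
        pos += 12
        entries.append((key, offset, promoted_size))
    return entries


entries = parse_index_entries(bytes.fromhex(INDEX_DB_HEX))
for key, offset, promoted_size in entries:
    print(key, offset, promoted_size)
```

Under these assumptions the dump decodes into three entries, `rowA`, `rowB`, and `rowExclude`, whose data-file offsets are 0, 0x5f, and 0xbe.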
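  • 46. Rapid Read Protection (speculative_retry / dynamic snitch) https://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlClientRequestsRead.html http://www.planetcassandra.org/blog/rapid-read-protection-in-cassandra-202/ https://issues.apache.org/jira/browse/CASSANDRA-5932
    1. The client requests data from the coordinator node, which routes the request to the best-performing replica and finally returns the result to the client.
    This applies to reads only. A read requests the full data from just one replica; depending on the consistency level and the read repair chance, the other replicas are asked only for a checksum (not the data), so choosing the most suitable replica matters. The dynamic snitch monitors the read performance of the replicas and picks the best one based on history.
    ALTER TABLE users WITH speculative_retry = '10ms';
    ALTER TABLE users WITH speculative_retry = '99percentile';
    Pros: lowers read latency when some nodes perform poorly. Cons: generates extra requests, which lowers throughput.
    Notes: 1) It does not apply at consistency level ALL, because that level already has to read from all replicas. 2) In small clusters, rapid read protection also reduces throughput; in large clusters the effect is less noticeable.
    (Figure: Recovering from replica node failure with rapid read protection)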
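The latency trade-off on slide 46 can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Cassandra's implementation: the coordinator waits `speculative_retry_ms` for the first replica and, if no answer has arrived, sends the same read to a second replica and takes whichever answer comes first. The function name and parameters are hypothetical.

```python
def read_latency_ms(primary_ms, backup_ms, speculative_retry_ms=None):
    """Return (observed latency, number of read requests sent).

    primary_ms / backup_ms: response times of the first and second replica.
    speculative_retry_ms: fixed threshold, like speculative_retry = '10ms';
    None models the feature being switched off.
    """
    if speculative_retry_ms is None or primary_ms <= speculative_retry_ms:
        # The first replica answered in time: no extra request is needed.
        return primary_ms, 1
    # The threshold passed: a second request goes out and the faster one wins.
    return min(primary_ms, speculative_retry_ms + backup_ms), 2


print(read_latency_ms(3, 5, 10))      # healthy replica, no retry
print(read_latency_ms(500, 5, 10))    # slow replica, the retry rescues the read
print(read_latency_ms(500, 5, None))  # slow replica, feature off
```

The second call shows both sides of the trade-off listed on the slide: latency drops from 500 ms to 15 ms, at the cost of one additional request.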
  • 49. Data Consistency. Linearizable consistency: Paxos consensus protocol, Lightweight Transaction (CAS), two-phase commit. https://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlAboutDataConsistency.html Tunable consistency: R: the consistency level of read operations; W: the consistency level of write operations; N: the number of replicas. Strong consistency is guaranteed when R + W > N; eventual consistency occurs when R + W <= N.
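The R + W > N rule from slide 49 is easy to check mechanically. The sketch below assumes R and W are expressed as replica counts (for example QUORUM at N = 3 means 2); the helper names are illustrative.

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def is_strongly_consistent(r, w, n):
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n


N = 3
for label, r, w in [
    ("read ONE    / write ONE", 1, 1),
    ("read QUORUM / write QUORUM", quorum(N), quorum(N)),
    ("read ONE    / write ALL", 1, N),
]:
    kind = "strong" if is_strongly_consistent(r, w, N) else "eventual"
    print(f"{label}: R={r} W={w} N={N} -> {kind}")
```

The overlap is what makes the rule work: if R + W > N, at least one replica contacted by a read has acknowledged the latest successful write.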
  • 50. Coordinator. Client read or write requests can go to any node in the cluster because all nodes in Cassandra are peers. When a client connects to a node and issues a read or write request, that node serves as the coordinator for that particular client operation. The job of the coordinator is to act as a proxy between the client application and the nodes (or replicas) that own the data being requested. The coordinator determines which nodes in the ring should get the request based on the cluster's configured partitioner and replica placement strategy. https://www.datadoghq.com/blog/how-to-monitor-cassandra-performance-metrics/
  • 51. Consistency Level. Consistency refers to how up-to-date and synchronized a row of Cassandra data is on all of its replicas. Using repair operations, Cassandra data will eventually be consistent in all replicas. Repairs work to decrease the variability in replica data, but at a given time, stale data can be present.
    The consistency level determines the number of replicas that need to acknowledge the read or write operation success to the client application. For read operations, the read consistency level specifies how many replicas must respond to a read request before returning data to the client application. For write operations, the write consistency level specifies how many replicas must respond to a write request before the write is considered successful. Even at low consistency levels, Cassandra writes to all replicas of the partition key, including replicas in other data centers. The write consistency level just specifies when the coordinator can report to the client application that the write operation is considered completed.
    If a read operation reveals inconsistency among replicas, Cassandra initiates a read repair to update the inconsistent data. Write operations will use hinted handoffs to ensure the writes are completed when replicas are down or otherwise not responsive to the write request.
    Typically, a client specifies a consistency level that is less than the replication factor specified by the keyspace. Another common practice is to write at a consistency level of QUORUM and read at a consistency level of QUORUM. The choices made depend on the client application's needs, and Cassandra provides maximum flexibility for application design.
    There is a tradeoff between operation latency and consistency: higher consistency incurs higher latency, lower consistency permits lower latency. You can control latency by tuning consistency.
    Consistency Level (CL): How many replicas must respond to declare success? Hinted Handoff (HH): A hint is written to the coordinator node when a replica is down. Read Repair (RR): Background digest query on read to find and update out-of-date replicas. https://docs.datastax.com/en/cassandra/2.2/cassandra/dml/dmlAboutDataConsistency.html
  • 58. Direct Read; Digest Read; Compare in Memory; Decide Which Is Latest. What if n4 is newer than n3: is another direct read issued to n4? (n4 returned only a digest, so getting the full data needs a direct read.) In this situation, n3 would also pull the newer data from n4. ❓
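  • 61. CL=ONE? Data is read from the least-loaded node (what if that node does not have the latest data?). Are the replicas compared pairwise, or is the direct read compared with the digest reads? The read_repair_chance setting takes effect when CL=ONE: only 10% of requests perform a read repair. The chance setting has no effect when CL > ONE, that is, at CL=QUORUM or ALL every request that finds an inconsistency triggers a repair. read_repair_chance is ignored if the ConsistencyLevel is greater than ONE and read repair always occurs. Write=ALL with Read=ONE guarantees strong consistency, while only 10% of requests start a read repair in the background.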
  • 62. Read & Read Repair. Read repair means that when a query is made against a given key, we perform a digest query against all the replicas of the key and push the most recent version to any out-of-date replicas. If a lower ConsistencyLevel than ALL was specified, this is done in the background after returning the data from the closest replica to the client; otherwise (CL=ALL), it is done before returning the data. This means that in almost all cases, at most the first instance of a query will return old data (the first query may receive stale data, but later identical queries see fresh data because the replicas have been repaired).
    Read repair mechanism: a query first reads the data from the closest node [1], then sends digest requests to the other nodes; after all replicas have been compared, the data with the newest timestamp is pushed to the out-of-date replicas. The consistency level only changes when read repair happens: at ONE or QUORUM, read repair starts in the background after the data [1] from the closest node has been returned to the client; at ALL, read repair completes before the data is returned to the client. At any consistency level, the full data is requested only from the closest node; even if that node's data is not the latest, it is still what gets returned to the client, so stale data may be returned.
    https://wiki.apache.org/cassandra/ReadRepair https://docs.datastax.com/en/cassandra/2.2/cassandra/dml/dmlClientRequestsRead.html http://www.datastax.com/dev/blog/common-mistakes-and-misconceptions
    There are three types of read requests that a coordinator can send to a replica:
    + A direct read request
    + A digest request
    + A background read repair request
    The coordinator node contacts one replica node with a direct read request. Then the coordinator sends a digest request to a number of replicas determined by the consistency level specified by the client. The digest request checks the data in the replica node to make sure it is up to date. Then the coordinator sends a digest request to all remaining replicas. If any replica nodes have out-of-date data, a background read repair request is sent. Read repair requests ensure that the requested row is made consistent on all replicas.
    For a digest request the coordinator first contacts the replicas specified by the consistency level. The coordinator sends these requests to the replicas that are currently responding the fastest. The nodes contacted respond with a digest of the requested data; if multiple nodes are contacted, the rows from each replica are compared in memory to see if they are consistent. If they are not, then the replica that has the most recent data (based on the timestamp) is used by the coordinator to forward the result back to the client. To ensure that all replicas have the most recent version of the data, read repair is carried out to update out-of-date replicas.
    CL=ONE: direct read from one node, but only 10% of requests trigger a background read repair (on the remaining two replicas).
    CL=QUORUM: direct read from one node and a digest read to a second node, which satisfies QUORUM; once these two nodes are confirmed consistent, the direct-read data is returned to the client, and a digest read is then sent to the last node (what if that last node is the one with the latest data?).
    CL=ALL: direct read from one node and digest reads to the other two; read repair runs to make all nodes consistent, and the direct-read data is returned to the client.
    Read repair is not directly related to repair, but both play a role in the overall anti-entropy system in Cassandra. The read_repair_chance setting originally defaulted to 1. That is, at a consistency level of ONE, for every read, we would check the other replicas to see if the data we just read is consistent with them. This was good, because if you ever read stale data, the next time you read the same row you would probably read something more up to date. The bad part was that every read became RF reads (and typically your RF is set to at least 3), meaning that reads happen more often and require more IO. In newer versions of Cassandra the default for this value is 0.1, and it is set on a per-column-family basis, which means 10% of your requests will trigger a background read repair. This is more than enough for typical scenarios.
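The foreground part of the read path described on slide 62 can be sketched as a small simulation. It is a toy model under stated assumptions, not Cassandra code: `Version`, `digest`, and `coordinator_read` are hypothetical names, MD5 over value and timestamp stands in for the real digest, and the background repair of replicas outside the consistency level is left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: int


def digest(version):
    """Hash of the data: smaller than the data itself, enough to detect mismatches."""
    return hashlib.md5(f"{version.value}:{version.timestamp}".encode()).hexdigest()


def coordinator_read(replicas, closest_first, blocking_for):
    """Read from the first `blocking_for` replicas; return (result, repaired nodes).

    replicas: dict of node name -> Version held by that node.
    closest_first: node names ordered by proximity (the snitch's job).
    """
    contacted = closest_first[:blocking_for]
    data = replicas[contacted[0]]                        # direct read
    digests = [digest(replicas[n]) for n in contacted[1:]]  # digest reads
    if all(d == digest(data) for d in digests):
        return data, []                                  # digests agree
    # Mismatch: fetch full data from the contacted replicas, newest timestamp wins.
    newest = max((replicas[n] for n in contacted), key=lambda v: v.timestamp)
    repaired = [n for n in contacted if replicas[n].timestamp < newest.timestamp]
    for node in repaired:
        replicas[node] = newest                          # read repair write-back
    return newest, repaired


def fresh_cluster():
    return {"n1": Version("old", 1), "n2": Version("new", 2), "n3": Version("old", 1)}


order = ["n1", "n2", "n3"]
print(coordinator_read(fresh_cluster(), order, 1))  # CL=ONE: may return stale data
print(coordinator_read(fresh_cluster(), order, 2))  # CL=QUORUM: newest wins, n1 repaired
```

With `blocking_for=1` the stale value from the closest node is returned, which is the case the slide's notes worry about; with `blocking_for=2` the digest mismatch forces reconciliation before the answer goes back.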
  • 63. Repair. When data is read to satisfy a query and return a result, all replicas are queried for the data needed. The first replica node receives a direct read request and supplies the full data. The other nodes contacted receive a digest request and return a digest, or hash of the data. A digest is requested because generally the hash is smaller than the data itself. A comparison of the digests allows the coordinator to return the most up-to-date data to the query. (Questions: can a digest be returned directly to the client? What if the direct read is not the latest? Can a digest be compared with the direct read?) If the digests are the same for enough replicas to meet the consistency level, the data is returned. If the consistency level of the read query is ALL, the comparison must be completed before the results are returned; otherwise, for all lower consistency levels, it is done in the background. The coordinator compares the digests, and if a mismatch is discovered, a request for the full data is sent to the mismatched nodes. (Question: is that full data the direct-read data, or the data with the newest timestamp among the digests?) The most current data found in a full data comparison is used to reconcile any inconsistent data on other replicas. http://docs.datastax.com/en/cassandra/2.2/cassandra/operations/opsRepairNodesTOC.html http://docs.datastax.com/en/cassandra/2.2/cassandra/operations/opsRepairNodesReadRepair.html
    Node repair makes data on a replica consistent with data on other nodes and is important for every Cassandra cluster. Repair is the process of correcting the inconsistencies so that eventually, all nodes have the same and most up-to-date data. Repair can occur in the following ways:
    ✅ Hinted Handoff: During the write path, if a node that should receive data is unavailable, hints are written to the coordinator. When the node comes back online, the coordinator can hand off the hints so that the node can catch up and write the data.
    ✅ Read Repair: During the read path, a query acquires data from several nodes. The acquired data from each node is checked against each other node. If a node has outdated data, the most recent data is written back to the node.
    ✅ Anti-Entropy Repair: For maintenance purposes or recovery, manually run anti-entropy repair to rectify inconsistencies on any nodes (by nodetool repair).
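Hinted handoff, the first repair path listed on slide 63, can be sketched as follows. This is an illustrative in-memory model only: the `Coordinator` class, its method names, and the tuple layout of a hint are assumptions made for the example and do not mirror Cassandra's storage of hints.

```python
class Coordinator:
    """Toy coordinator that stores hints for replicas that are down."""

    def __init__(self, replica_names):
        self.stores = {name: {} for name in replica_names}  # node -> {key: (value, ts)}
        self.up = {name: True for name in replica_names}
        self.hints = []                                     # (target, key, value, ts)

    def write(self, key, value, timestamp):
        """Write to every live replica; record a hint for each one that is down."""
        acks = 0
        for name, store in self.stores.items():
            if self.up[name]:
                store[key] = (value, timestamp)
                acks += 1
            else:
                self.hints.append((name, key, value, timestamp))
        return acks

    def node_recovered(self, name):
        """Replay the hints addressed to `name`, keeping the newest timestamp."""
        self.up[name] = True
        remaining = []
        for target, key, value, timestamp in self.hints:
            if target != name:
                remaining.append((target, key, value, timestamp))
                continue
            current = self.stores[name].get(key)
            if current is None or current[1] < timestamp:
                self.stores[name][key] = (value, timestamp)
        self.hints = remaining


coordinator = Coordinator(["r1", "r2", "r3"])
coordinator.up["r3"] = False
acks = coordinator.write("user:1", "alice", timestamp=10)
print(acks, coordinator.hints)      # two acks, one hint held for r3
coordinator.node_recovered("r3")
print(coordinator.stores["r3"])     # r3 has caught up
```

The write still succeeds at consistency levels that two acknowledgments satisfy, and the replica that missed it converges once it returns.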
  • 76. Low latency, low consistency: only lower consistency allows lower latency. High latency, high consistency: higher consistency incurs higher latency. (Diagram labels: Read, Write.)
  • 79. Write consistency. https://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlClientRequestsWrite.html The coordinator sends a write request to all replicas that own the row being written. As long as all replica nodes are up and available, they will get the write regardless of the consistency level specified by the client. The write consistency level determines how many replica nodes must respond with a success acknowledgment in order for the write to be considered successful. Success means that the data was written to the commit log and the memtable as described in how data is written.
    In a single data center 12-node cluster with a replication factor of 3, an incoming write will go to all 3 nodes that own the requested row. If the write consistency level specified by the client is ONE, the first node [R1] to complete the write responds back to the coordinator, which then proxies the success message back to the client [write response]. A consistency level of ONE means that it is possible that 2 of the 3 replicas [R2, R3] could miss the write if they happened to be down at the time the request was made.
    That node [coordinator] forwards the write to all replicas of that row. It responds to the client once it receives write acknowledgments from the number of nodes specified by the consistency level. 1. If the coordinator cannot write to enough replicas to meet the requested CL, it throws an Unavailable Exception and does not perform any writes. 2. If there are enough replicas available but the required writes don't finish within the timeout window, the coordinator throws a Timeout Exception.
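The two failure cases at the end of slide 79 differ in when they are detected, which a short sketch makes concrete. Everything here is hypothetical scaffolding: the exception classes, the `coordinate_write` function, and the three replica states (`"ok"`, `"slow"`, `"down"`) are invented for the illustration.

```python
class UnavailableError(Exception):
    """Too few live replicas: rejected up front, nothing is written."""


class WriteTimeoutError(Exception):
    """Enough live replicas, but too few acknowledged within the timeout."""


def coordinate_write(replica_states, required_acks):
    """Return the number of acks, or raise one of the two errors above.

    replica_states: one of "ok" (acks in time), "slow" (alive, acks too late),
    or "down" (known to be unreachable) per replica.
    """
    alive = [state for state in replica_states if state != "down"]
    if len(alive) < required_acks:
        # Checked before any write is sent.
        raise UnavailableError(f"need {required_acks}, only {len(alive)} alive")
    # The write is sent to every live replica regardless of consistency level.
    acks = sum(1 for state in alive if state == "ok")
    if acks < required_acks:
        raise WriteTimeoutError(f"need {required_acks}, got {acks} in time")
    return acks


print(coordinate_write(["ok", "ok", "down"], required_acks=2))  # QUORUM succeeds
for states in (["ok", "down", "down"], ["ok", "slow", "slow"]):
    try:
        coordinate_write(states, required_acks=2)
    except (UnavailableError, WriteTimeoutError) as error:
        print(type(error).__name__, error)
```

One consequence worth noting: after a timeout some replicas may already hold the write, whereas the unavailable case is rejected before anything is written.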
  • 82. DC:2, RF:3, CL:QUORUM => all data centers, two replicas. In multiple data center deployments, Cassandra optimizes write performance by choosing one coordinator node. The coordinator node contacted by the client application forwards the write request to each replica node in all the data centers. If using a consistency level of LOCAL_ONE or LOCAL_QUORUM, only the nodes in the same data center as the coordinator node must respond to the client request in order for the request to succeed. This way, geographical latency does not impact client request response times.
  • 83. Read consistency. https://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlClientRequestsReadExp.html DC:1, RF:3, CL:QUORUM => 2. In a single data center cluster with a replication factor of 3, and a read consistency level of QUORUM, 2 of the 3 replicas for the given row must respond to fulfill the read request. If the contacted replicas have different versions of the row, the replica with the most recent version will return the requested data [to the client]. In the background, the third replica is checked for consistency with the first two, and if needed, a read repair is initiated for the out-of-date replicas.
  • 84. DC:1, RF:3, CL:ONE=>1 In a single data center cluster with a replication factor of 3, and a read consistency level of ONE, the closest replica for the given row is contacted to fulfill the read request. In the background a read repair is potentially initiated, based on the read_repair_chance setting of the table, for the other replicas.
  • 85. In a two data center cluster with a RF=3, and a read consistency of QUORUM, 4 replicas for the given row must respond to fulfill the read request. The 4 replicas can be from any data center. In the background, the remaining replicas are checked for consistency with the first four, and if needed, a read repair is initiated for the out-of-date replicas. DC:2, RF:3, CL:QUORUM=> 任何数据中⼼心,四个副本
  • 86. DC:2, RF:3, CL:LOCAL_QUORUM => local data center, two replicas. In a multiple data center cluster with RF=3 and a read consistency of LOCAL_QUORUM, 2 replicas in the same DC as the coordinator node for the given row must respond to fulfill the read request. In the background, the remaining replicas are checked for consistency with the first 2, and if needed, a read repair is initiated for the out-of-date replicas.
  • 87. DC:2, RF:3, CL:ONE => any DC, one replica. In a multiple data center cluster with RF=3 and a read consistency of ONE, the closest replica for the given row, regardless of data center, is contacted to fulfill the read request. In the background, a read repair is potentially initiated for the other replicas, based on the read_repair_chance setting of the table.
  • 88. DC:2, RF:3, CL:LOCAL_ONE => local data center, one replica. In a multiple data center cluster with RF=3 and a read consistency of LOCAL_ONE, the closest replica for the given row in the same data center as the coordinator node is contacted to fulfill the read request. In the background, a read repair is potentially initiated for the other replicas, based on the read_repair_chance setting of the table.
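The replica counts quoted on slides 83 to 88 follow from the quorum rule (half of the relevant replicas, rounded down, plus one). The sketch below reproduces that arithmetic; `required_responses` and its parameters are illustrative names, not a driver or server API, and only the four consistency levels covered by these slides are handled.

```python
def quorum(replicas: int) -> int:
    """Majority of a replica set: floor(replicas / 2) + 1."""
    return replicas // 2 + 1


def required_responses(consistency_level: str, rf_per_dc: dict, local_dc: str) -> int:
    """Number of replicas that must answer before the request succeeds.

    rf_per_dc maps a data center name to its replication factor;
    local_dc is the data center of the coordinator node.
    """
    if consistency_level in ("ONE", "LOCAL_ONE"):
        return 1
    if consistency_level == "QUORUM":
        # Quorum over all replicas in all data centers.
        return quorum(sum(rf_per_dc.values()))
    if consistency_level == "LOCAL_QUORUM":
        # Quorum over the coordinator's data center only.
        return quorum(rf_per_dc[local_dc])
    raise ValueError(f"consistency level not covered here: {consistency_level}")


single_dc = {"dc1": 3}
two_dcs = {"dc1": 3, "dc2": 3}
print(required_responses("QUORUM", single_dc, "dc1"))      # slide 83
print(required_responses("QUORUM", two_dcs, "dc1"))        # slide 85
print(required_responses("LOCAL_QUORUM", two_dcs, "dc1"))  # slide 86
```

The difference between ONE and LOCAL_ONE (or QUORUM and LOCAL_QUORUM) is not only the count but also which data centers the responding replicas may come from, which this count-only sketch does not model.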
  • 89.
  • 91. [Figure: Bloom filter lookup across SSTables. A query for key1 asks each SSTable's Bloom filter "Am I here?". When a filter answers "No, you're NOT here!", the answer is trusted ("OK, I believe you!") and the read goes on to the next SSTable.]
  • 92. bloom_filter_fp_chance: the false-positive chance is the probability that the Bloom filter reports that a partition exists in an SSTable when in fact it does not. False positive matches are possible, but false negatives are not. In other words, a query returns either “possibly in set” or “definitely not in set”. If you increase the chance of false positives, you lower memory usage via a smaller filter size, at the expense of more disk seeks caused by the extra false positives. If you decrease the chance of false positives, you increase memory usage via a larger filter size, for the benefit of fewer disk seeks. https://grockdoc.com/cassandra/2.1/articles/tuning-reads-via-the-bloom-filter_88c8f57a-71d0-41ee-b77f-617c64ad4739/ http://docs.datastax.com/en/cql/3.1/cql/cql_reference/compactSubprop.html
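The memory versus false-positive trade-off on slide 92 can be seen in a toy Bloom filter. This is a minimal sketch using the textbook sizing formulas (bits per key = -ln p / (ln 2)^2, hash count = bits per key × ln 2) and SHA-256 based double hashing; these are assumptions for illustration and not Cassandra's actual implementation. The class and function names are hypothetical.

```python
import hashlib
import math


def optimal_bits_per_key(fp_chance: float) -> float:
    """Textbook bits-per-key for a target false-positive probability."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


class BloomFilter:
    def __init__(self, expected_keys: int, fp_chance: float):
        bits = expected_keys * optimal_bits_per_key(fp_chance)
        self.num_bits = max(1, math.ceil(bits))
        self.num_hashes = max(1, round(self.num_bits / expected_keys * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: str):
        # Double hashing: derive all bit positions from two 64-bit values.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely not in set"; True means "possibly in set".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


strict = BloomFilter(expected_keys=1000, fp_chance=0.01)
loose = BloomFilter(expected_keys=1000, fp_chance=0.1)
for n in range(1000):
    strict.add(f"key{n}")
    loose.add(f"key{n}")
print(strict.num_bits, loose.num_bits)  # the stricter filter needs more bits
```

Every added key is reported as possibly present by both filters (no false negatives); the looser filter uses less memory but will wrongly report more absent keys as possibly present, which in the read path means more SSTables are checked on disk.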
  • 97.
  • 99.
  • 100.
  • 101.
  • 102.
  • 103.
  • 106.