How does Cassandra 4.0’s performance compare to Cassandra 3.x’s? What’s been fixed in 4.0 and what remains unchanged? Should you upgrade or consider other options?
Join us for a webinar where we’ll answer these questions and more, based on our extensive benchmarks comparing Cassandra 4.0 against Cassandra 3.11. We’ll also share how the new release of Cassandra stacks up against Scylla Open Source. You’ll learn the rationale and results for our head-to-head comparisons, including:
- Throughput under various loads
- Comparison of long-tail (p95, p99) latencies
- Improvements to operations such as compactions
If you are considering upgrading your existing infrastructure from Cassandra 3.11, or if you are considering a new wide column database for a greenfield deployment, this is a session you won’t want to miss!
Presenters
Karol Baryła
Karol is a Junior Software Engineer at ScyllaDB. He often
participates in security CTF competitions as a member of team
"Armia Prezesa" where he solves web security and reverse
engineering tasks. He is currently pursuing an MSc in Computer
Science at the University of Warsaw.
Piotr Grabowski
Piotr is a software engineer working at ScyllaDB. From a young age,
he participated in many competitive programming contests. Piotr
holds a BSc in Computer Science from the University of Warsaw and
is now pursuing an MSc. For the past year, he has worked on Kafka
connectors and the Scylla Java Driver.
About ScyllaDB
+ The Real-Time Big Data Database
+ Drop-in replacement for Apache Cassandra and Amazon DynamoDB
+ Outstanding performance & low tail latency
+ Open Source, Enterprise and Cloud options
+ Founded by the creators of the KVM hypervisor
+ HQs: Palo Alto, CA, USA; Herzelia, Israel; Warsaw, Poland
On July 27th, 2021, the Cassandra team released version 4.0, 6 years after the release of version 3.0.
Let’s see how much Cassandra improved during those 6 years, and how well it holds up against Scylla 4.4 now.
Cassandra 4.0 new features
1. Increased speed and scalability
a. Zero Copy Streaming: streaming data up to 5x faster
b. Up to 25% faster throughput on reads and writes
2. Support for JDK 11
3. New configuration settings, better security and observability
4. Better compression settings (support for Zstd)
5. A shift to a 12-month release cycle
Benchmarked operations
1. Latency at different throughputs
a. Gaussian distribution
b. Disk-intensive distribution
c. Memory-intensive distribution
2. Adding a single new node
3. Doubling the cluster size
4. Replacing a node
Benchmark setup: 3 vs 3 and 4 vs 40
+ 3 vs 3:
  + Cluster nodes: 3x i3.4xlarge (16 vCPU, 122 GiB RAM, up to 10 Gbps network, 2x 1.9 TB NVMe)
  + Loader nodes: 3x c5n.9xlarge (36 vCPU, 96 GiB RAM, up to 50 Gbps network)
+ 4 vs 40:
  + Scylla cluster: 4x i3.metal (72 vCPU, 512 GiB RAM, up to 25 Gbps network, 8x 1.9 TB NVMe)
  + Cassandra cluster: 40x i3.4xlarge (16 vCPU, 122 GiB RAM, up to 10 Gbps network, 2x 1.9 TB NVMe)
  + Loader nodes: 15x c5n.9xlarge (36 vCPU, 96 GiB RAM, up to 50 Gbps network)
+ Java version: JDK 16 (Cassandra 4.0), JDK 8 (Cassandra 3.11)
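To put the 4 vs 40 setup in perspective, the per-node specs listed above can be summed into cluster totals. This is a minimal sketch: the `NodeType` class and `cluster_totals` helper are hypothetical names introduced for illustration, and the storage figure is raw NVMe capacity, not usable data size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    """Per-node hardware, copied from the benchmark setup slide."""
    vcpus: int
    ram_gib: int
    nvme_drives: int
    nvme_tb_each: float


I3_4XLARGE = NodeType(vcpus=16, ram_gib=122, nvme_drives=2, nvme_tb_each=1.9)
I3_METAL = NodeType(vcpus=72, ram_gib=512, nvme_drives=8, nvme_tb_each=1.9)


def cluster_totals(node: NodeType, count: int) -> dict:
    """Multiply per-node resources by the number of nodes in the cluster."""
    return {
        "vcpus": node.vcpus * count,
        "ram_gib": node.ram_gib * count,
        # Raw NVMe capacity in TB, rounded to avoid float noise.
        "nvme_tb": round(node.nvme_drives * node.nvme_tb_each * count, 1),
    }


scylla = cluster_totals(I3_METAL, 4)        # 4x i3.metal
cassandra = cluster_totals(I3_4XLARGE, 40)  # 40x i3.4xlarge

print("Scylla:   ", scylla)
print("Cassandra:", cassandra)
```

Under these numbers the Cassandra cluster has more than twice the vCPUs, RAM and raw NVMe capacity of the Scylla cluster (640 vs 288 vCPUs, 4880 vs 2048 GiB RAM, 152 vs 60.8 TB).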
What causes latency improvements?
+ Cassandra 3 officially supports only Java 8.
+ Cassandra 4 officially supports Java 8 and Java 11.
+ Java 11 introduced ZGC as an experimental feature.
+ ZGC has been considered production-ready since Java 15.
+ We used Java 16 in the benchmarks to take full advantage of ZGC.
+ ZGC has extremely short pauses, which reduces Cassandra’s tail latencies.
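The Java version matters because it changes how ZGC is switched on. The sketch below picks the HotSpot flags for a given JDK major version: while ZGC was experimental (JDK 11 to 14) it also needed `-XX:+UnlockExperimentalVMOptions`, and from JDK 15 on `-XX:+UseZGC` alone is enough. The helper name `zgc_jvm_flags` is hypothetical, and the slides do not state the exact JVM options used in the benchmarks, so treat this as an illustration and not as the benchmark configuration.

```python
from typing import List


def zgc_jvm_flags(jdk_major: int) -> List[str]:
    """Return the HotSpot flags that enable ZGC on the given JDK major version."""
    if jdk_major < 11:
        # ZGC first shipped in JDK 11; Java 8 cannot use it.
        raise ValueError("ZGC is not available before JDK 11")
    if jdk_major < 15:
        # Experimental in JDK 11-14: it must be unlocked explicitly.
        return ["-XX:+UnlockExperimentalVMOptions", "-XX:+UseZGC"]
    # Production-ready from JDK 15 onward.
    return ["-XX:+UseZGC"]


for version in (11, 16):
    print(f"JDK {version}: {' '.join(zgc_jvm_flags(version))}")
```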
Quick Poll
How much data do you have under management in your own transactional database systems?
+ <1 terabyte
+ 1 to 50 terabytes
+ 50 to 100 terabytes
+ >100 terabytes
Summary of results
+ Cassandra 4 has much better tail latencies than Cassandra 3.
+ Scylla performs 3-4 times better than Cassandra when adding/replacing nodes.
+ Scylla adds 25% capacity to a 40 TB optimized cluster 11x faster than Cassandra 4.0.
+ Scylla performs major compaction 32x faster than Cassandra 4.0.
+ Scylla has 2x-5x better throughput than Cassandra 4.0 on the same 3-node cluster.
+ Scylla has 3x-8x better throughput than Cassandra 4.0 on the same 3-node cluster while P99 stays under 10 ms.
+ A 40 TB cluster is 2.5x cheaper with Scylla while providing 42% more throughput under a P99 latency of 10 ms.
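The last point combines two figures, cost and throughput, into one comparison. A rough sketch of the combined effect follows, assuming "2.5x cheaper" means the Scylla cluster costs 1/2.5 of the Cassandra cluster; the function name is hypothetical and the inputs are the two numbers quoted above.

```python
def throughput_per_cost_ratio(cost_advantage: float, throughput_gain_pct: float) -> float:
    """Relative throughput per unit of cost.

    cost_advantage: how many times cheaper the cluster is (2.5 means 1/2.5 of the cost).
    throughput_gain_pct: extra throughput in percent (42 means 1.42x the throughput).
    """
    return cost_advantage * (1 + throughput_gain_pct / 100)


ratio = throughput_per_cost_ratio(2.5, 42)
print(f"Throughput per unit of cost: {ratio:.2f}x")
```

Under that assumption the two figures multiply to roughly 3.55x more throughput per unit of cost.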
Should I upgrade to Scylla or Cassandra 4?
1. Upgrading is hard, so why not upgrade to Scylla right away?
a. Upgrading is problematic anyway: you need to make backups, and you risk downtime.
b. Migrating from Cassandra to Scylla is a bit more involved, but the benefits are worth it.
2. Upgrading to Scylla will save you money in the long run.
3. Scylla offers better performance and lower latencies compared to Cassandra 4.
4. Scylla offers exciting new features:
a. Scylla CDC
b. Kubernetes support with Scylla Operator
c. Scylla Cloud
Experience Scylla for Yourself
Download Scylla Open Source: scylladb.com/download
Learn more: https://university.scylladb.com/