With so many moving parts, how can you be sure that a tweak to your database is really beneficial or just a measurement fluke? And how can you be sure that your benchmark is measuring the right thing?
In this webinar you will learn a set of best practices to improve the quality of your benchmarks, including how to:
+ Evaluate changes to database systems, whether you are a code contributor or a user playing with knobs
+ Determine what to measure for the most accurate results
+ Make benchmarks that are strong, solid, and reliable
We will also cover how to judge whether benchmarks with incredible claims are to be trusted.
The Do’s and Don’ts of Benchmarking Databases
1. The Do’s and Don’ts of Benchmarking Databases
Glauber Costa - Principal Architect
WEBINAR
2. About ScyllaDB
+ Next-generation NoSQL database
+ Drop-in replacement for Cassandra
+ 10X the performance & low tail latency
+ Open source and enterprise editions
+ Founded by the creators of the KVM hypervisor
+ HQs: Palo Alto, CA; Herzelia, Israel
3. Glauber Costa
Glauber Costa is a Principal Architect at ScyllaDB. He splits his time between the engineering department, working on upcoming Scylla features, and helping customers succeed.
Before ScyllaDB, Glauber worked on virtualization in the Linux kernel for 10 years, with contributions ranging from the Xen hypervisor to all sorts of guest functionality and containers.
11. Why benchmark?
Tesla Roadster vs. Yugo
+ 0-60: 1.9s vs. “Yes”
+ Seating: 4 vs. the whole drunk squad, no matter how many they are
+ Price: $200,000 vs. whatever you have in your pocket
12. DO be aware of client-side bottlenecks
+ “I have applied a certain pressure to the Roadster’s gas pedal. It does 30mph”
+ “I have applied a certain pressure to the Yugo’s gas pedal. It does 32mph”
“Conclusion”: the Yugo is faster than the Roadster (but not by much!)
+ If the same client-side limit caps both systems, the benchmark measures the client, not the database
13. DO use standard tools
+ Writing your own benchmark is cool, but what about the bugs?
+ cassandra-stress
+ YCSB
+ ndbench (Netflix)
14. DO understand what you want to measure
+ Are you benchmarking a disk-bound or a CPU-bound load?
  + Sometimes a workload bottlenecks on both, but that is rare
+ Throughput benchmarks
  + A resource needs to be at (or close to) 100% utilization
+ Latency benchmarks
  + Throughput is held constant and below saturation; otherwise the numbers don’t mean much
+ Sizing/cost benchmarks
  + Throughput (and maybe latency requirements) are held constant: how many nodes, or how much $?
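For the latency case, “throughput is held constant” means driving the load open-loop: requests are issued on a fixed schedule, and latency is measured from the *intended* send time, so one slow request cannot hide the delay it inflicts on the ones queued behind it (the coordinated-omission problem). A minimal sketch in Python; `do_request` and `run_fixed_rate` are hypothetical names, standing in for a real client call:

```python
import time
import statistics

def run_fixed_rate(do_request, rate_per_s, duration_s):
    """Open-loop latency benchmark: issue requests at a constant rate and
    measure each latency from the intended send time, not the actual one."""
    interval = 1.0 / rate_per_s
    start = time.perf_counter()
    latencies = []
    for i in range(int(rate_per_s * duration_s)):
        intended = start + i * interval
        delay = intended - time.perf_counter()
        if delay > 0:
            time.sleep(delay)        # wait for this request's slot
        do_request()                 # stand-in for a real database call
        latencies.append(time.perf_counter() - intended)
    return latencies

# Usage with a no-op request, 100 req/s for half a second:
lat = run_fixed_rate(lambda: None, rate_per_s=100, duration_s=0.5)
p99 = statistics.quantiles(lat, n=100)[98]
```

A closed loop (fire the next request only when the previous one returns) measures throughput at saturation, not latency under a fixed load.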
15. Corollary: understand your system
[Figure: throughput vs. latency curve on Intel Optane, annotated with the maximum useful throughput and the latency response]
18. DO look at what you want to measure
+ Familiarize yourself with the database theory of operation
+ Example: Scylla polling, compactions, caching, etc.
+ After you have results, you should be able to explain them
19. DO look at what you want to measure
[Graph annotated: “Throughput benchmark!”]
20. DO look at steady state
+ Common Big Data database workloads have tens or hundreds of TB
+ At least have more data than memory
+ Workloads tend to run for hours, so your benchmark should as well
21. DO look at steady state
+ What’s my throughput here?
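The question only has a meaningful answer once the run has settled. One way to make that concrete, sketched in Python (the function name and the 25% warm-up window are illustrative, not from any tool):

```python
def steady_state_throughput(samples_ops_per_s, warmup_fraction=0.25):
    """Average throughput over the steady-state tail of a run, discarding
    the warm-up window where the cache is still filling and compactions
    have not kicked in yet."""
    skip = int(len(samples_ops_per_s) * warmup_fraction)
    steady = samples_ops_per_s[skip:]
    return sum(steady) / len(steady)

# A run that starts fast (everything fits in cache) and then settles:
samples = [90_000, 80_000, 60_000, 50_000, 50_000, 50_000, 50_000, 50_000]
avg = steady_state_throughput(samples)   # averages only the last 6 samples
```

Reporting the whole-run average here would be inflated by the cached warm-up window.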
22. DON’T run unrealistic workloads
+ “My write latencies are 2500ms for 500-byte writes!”
+ Great for testing (does the system survive?)
+ Most people would have scaled the cluster by then
+ “I get great performance by always reading from the same key”
+ Sure, but who does that?
23. Some examples:
+ Ingestion
  + Ingest as fast as possible for some hours, no timeouts allowed
+ RTB (real-time bidding)
  + Bulk writes or a constant low write rate, lots of reads, strict latency requirements
+ Time series
  + Heavy, constant writes to ever-growing partitions; reads of the latest rows
+ Metadata store
  + Some writes, random reads with good cacheability
+ Analytics
  + Periodic writes, full table scans
24. DON’T share your nodes with the loaders
+ Pushing and pulling data can be expensive!
+ It steals resources from the database
+ Don’t do it with any database, but Scylla is particularly affected due to pinning and polling.
25. But if you DO share your nodes:
+ Statically partition resources
  + taskset, memory reserves
  + In the case of Scylla, use --cpuset
+ Example:
  + taskset -c 0,5-12 cassandra-stress write duration=15m …
  + scylla --cpuset 1-4 …
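The same static partitioning can be applied from inside a loader script on Linux via the standard library’s `os.sched_setaffinity` (the wrapper name below is illustrative):

```python
import os

def pin_to_cpus(cpus):
    """Pin the current process to a fixed CPU set (Linux only), so the
    loader cannot steal cycles from the cores reserved for the database."""
    os.sched_setaffinity(0, set(cpus))   # pid 0 = the calling process

# Pin ourselves to the first CPU we are currently allowed to run on,
# in the same spirit as `taskset -c 0 <loader>`:
first_cpu = min(os.sched_getaffinity(0))
pin_to_cpus({first_cpu})
```

This only constrains the loader; the database side still needs its own reservation (e.g. Scylla’s --cpuset) so the two sets don’t overlap.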
26. DO be pessimistic
+ Unless you can guarantee that your workload always caches well, benchmark cold scenarios as well
  + Disabling the cache is a good way to enforce that (miss rate: 100%)
  + But sometimes just restarting helps
+ What is the minimum amount of resources you will have in the field? (be realistic)
+ What is the maximum load you expect to see?
27. In Summary
1. Define the problem
2. Find the bottleneck
3. Explain the results
4. Optionally, raise the bar
5. goto #2
29. DO be careful with aggregation
+ Summaries are useful, but they hide a lot of information
+ Both those runs have the same load and about the same throughput/latencies
31. DO be careful with aggregation
+ Client 1: 100 requests; 98 of them took 1ms, 2 took 3ms
+ Client 2: 100 requests; 99 of them took 30ms, 1 took 31ms
+ Common mistake: the p99 is avg(3ms, 30ms) → 16.5ms
+ The real p99 is 30ms
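The arithmetic above is easy to check. A minimal sketch using the nearest-rank percentile (the helper is illustrative; real tools such as HdrHistogram merge histograms rather than raw sample lists):

```python
import math

def p99(latencies_ms):
    """Nearest-rank 99th percentile of a list of latency samples."""
    s = sorted(latencies_ms)
    return s[math.ceil(0.99 * len(s)) - 1]

client1 = [1] * 98 + [3] * 2    # 100 requests from client 1
client2 = [30] * 99 + [31]      # 100 requests from client 2

wrong = (p99(client1) + p99(client2)) / 2   # averaging percentiles: 16.5ms
right = p99(client1 + client2)              # merging the raw samples: 30ms
```

Percentiles only compose by merging the underlying samples (or histograms), never by averaging the per-client summaries.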
32. DON’T assume people will just believe you
+ When reporting, be very descriptive with your setup
+ BAD: “Our cluster has a p99 lower than 1ms”
+ GOOD: “We set up 3 nodes, each with 24 Intel i7-7500U CPUs @ 2.70GHz, 512GB RAM, and Samsung SSD 850 PRO 256GB SSDs, with <client_description> as loaders, and here’s the graph of our p99 over time”
33. DO be as fair as possible in comparisons
+ Most other databases require tuning, as they lack Autonomous Operations
+ Unless in a specific “out of the box” benchmark: tune it! (and say how)
+ HORRIBLE: “we installed Cassandra, ran it, and Scylla is 2000x faster”
+ BAD: “we tuned Cassandra, ran it, and Scylla is 10x faster”
+ GOOD: “we tuned Cassandra, and here is how (a link or appendix is fine). After that, we ran Scylla and it has 10x more throughput on the same hardware”