
Solving the Issue of Mysterious Database Benchmarking Results



Benchmarking tremendously helps move the database industry and the database research community forward, especially since all database providers promise high performance and “unlimited” horizontal scalability. However, demonstrating these claims with comparable, transparent and reproducible database benchmarks is a methodological and technical challenge faced by every research paper, whitepaper, technical blog or customer benchmark. Moreover, running database benchmarks in the cloud adds unique challenges, since differences in infrastructure across cloud providers make apples-to-apples comparisons even more difficult.

With benchANT, we address these challenges by providing a fully automated benchmarking platform that produces comprehensive data sets to ensure full transparency and reproducibility of the benchmark results. We apply benchANT in a multi-cloud context to benchmark ScyllaDB and other NoSQL databases using established open source benchmarks. These experiments demonstrate that, unlike many competitors, ScyllaDB is able to keep its performance and scalability promises. The talk covers not only an in-depth discussion of the performance results and their impact on cloud TCO, but also outlines how to specify fair and comparable benchmark scenarios and how to execute them. All discussed benchmarking data is released as open data on GitHub to ensure full transparency and reproducibility.



  1. Solving the Issue of Mysterious Database Benchmarking Results Daniel Seybold, Co-Founder at benchANT
  2. Daniel Seybold ■ PhD in computer science at Ulm University (Germany) ■ Thesis topic: An automation-based approach for reproducible evaluations of distributed DBMS on elastic infrastructures ■ Co-founder of benchANT ■ Responsible for product development at benchANT
  3. Agenda ■ Is database benchmarking still relevant? ■ Challenges of reliable database benchmarks in the cloud ■ Demo: the open database performance ranking ■ Takeaways
  4. Is Database Benchmarking Still Relevant?
  5. Is Database Benchmarking Still Relevant? Recent trends in the database landscape: ■ “the idea of ‘one-size-fits-all’ is over” (Stonebraker, M. et al. “‘One size fits all’: an idea whose time has come and gone.” Making Databases Work: The Pragmatic Wisdom of Michael Stonebraker. 2018) ■ “cloud resources have become the preferred solution to operate databases” (Abadi, D., et al. “The Seattle report on database research.” ACM SIGMOD Record. 2020) ■ “DBaaS reached mainstream and serverless DBaaS might be the future” (Abadi, D., et al. “The Seattle report on database research.” ACM SIGMOD Record. 2022)
  6. Promises of the Database Landscape
  7. How to Verify the Promises?
  8. Is Database Benchmarking Still Relevant? Research perspective from the Seattle Report on Database Research 2022: ■ benchmarks tremendously helped move forward the database industry and the database research community ■ … without the development of appropriate benchmarking and data sets, a fair comparison … will not be feasible … ■ benchmarking in the cloud environment also presents unique challenges, since differences in infrastructure across cloud providers make apples-to-apples comparisons more difficult
  9. Is Database Benchmarking Still Relevant? Industry perspective: ■ “… we recommend measuring the performance of applications to identify appropriate instance types … we also recommend rigorous load/scale testing …” — AWS ■ “… measure everything, assume nothing …” — MongoDB ■ “Benchmarking will help you fail fast and recover fast before it’s too late.” — ScyllaDB ■ “… approach performance problems systematically and do not settle for random googling” — Peter Zaitsev (Percona)
  10. Database Benchmarking Use Cases
  11. Challenges of Reliable Database Benchmarks in the Cloud
  12. Cloud-Centric Benchmarking Process ■ benchmark objective: performance (/costs), scalability, availability ■ select & allocate resources: > 500 AWS EC2 instance types, > 20,000 public cloud instance types ■ deploy & configure database cluster: > 850 databases, > 170 DBaaS providers ■ deploy & execute benchmark: > 10 established NoSQL benchmark suites ■ process benchmark results ➡ a complex and time-consuming process ➡ automation required
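The four-step process on this slide can be sketched as an orchestration skeleton. This is a minimal illustration only: every function name, parameter and return value below is a hypothetical placeholder, not benchANT's actual API; a real pipeline would call cloud provider APIs, deployment tooling and a benchmark suite such as YCSB in each step.

```python
# Sketch of the cloud-centric benchmarking process (all bodies are stubs).

def select_and_allocate_resources(provider, instance_type, count):
    """Step 1: choose a cloud provider and allocate VM instances."""
    return [f"{provider}-{instance_type}-{i}" for i in range(count)]

def deploy_and_configure_database(nodes, database, config):
    """Step 2: install the database on the nodes and apply its configuration."""
    return {"nodes": nodes, "database": database, "config": config}

def deploy_and_execute_benchmark(cluster, workload):
    """Step 3: run the workload against the cluster and collect raw metrics."""
    return {"workload": workload, "throughput_ops_s": 0, "latencies_ms": {}}

def process_results(raw_metrics):
    """Step 4: aggregate raw metrics toward the benchmark objective."""
    return {"objective": "performance", **raw_metrics}

def run_benchmark(provider, instance_type, count, database, config, workload):
    """End-to-end pipeline: the four steps chained, ready for automation."""
    nodes = select_and_allocate_resources(provider, instance_type, count)
    cluster = deploy_and_configure_database(nodes, database, config)
    raw = deploy_and_execute_benchmark(cluster, workload)
    return process_results(raw)
```

Chaining the steps behind one entry point like `run_benchmark` is what makes the process repeatable: the same inputs always trigger the same allocation, deployment and execution sequence.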
  13. Database Benchmark Automation Recent trends in database benchmark automation: ■ ScyllaDB Cluster Tests (software) ■ Creating a Virtuous Cycle in Performance Testing at MongoDB (publication) ■ Flexibench (publication & software) ■ Mowgli (publications & software) ■ benchANT (Benchmarking-as-a-Service)
  14. Remaining Challenges Even with a fully automated benchmarking process (select & allocate resources ➡ deploy & configure database cluster ➡ deploy & execute benchmark ➡ process benchmark results), open questions remain: ■ Which region and which storage type have been used? ■ Which database configuration has been applied? ■ What were the exact workload configurations? ■ How was the system utilized?
  15. Benchmark Meta Data Collection ■ select & allocate resources: cloud provider & instance metadata ■ deploy & configure database cluster: database configuration & system monitoring ■ deploy & execute benchmark: workload configuration & benchmark results ■ process benchmark results: raw & aggregated results ➡ comprehensive data sets are required to ensure transparent and reproducible results
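The meta data categories on this slide could be captured as one structured record per benchmark run. The field names and values below are made up for illustration (they are not benchANT's schema); the point is that the cloud, database, workload and results metadata all travel together with the numbers.

```python
import json

# Illustrative meta data record for a single benchmark run. Keeping all four
# categories next to the results is what answers the "remaining challenges"
# questions (region? storage type? configuration? workload?) after the fact.
run_record = {
    "cloud": {
        "provider": "aws",             # which cloud
        "region": "eu-central-1",      # which region (often omitted in reports)
        "instance_type": "m5.2xlarge",
        "storage_type": "gp3",         # which storage type
    },
    "database": {
        "name": "scylladb",
        "version": "5.1",
        "cluster_size": 3,
        "non_default_config": {},      # which configuration has been applied
    },
    "workload": {
        "suite": "YCSB",
        "workload": "workload-a",
        "record_count": 10_000_000,
        "threads": 128,
    },
    "results": {
        "throughput_ops_s": 120_000,       # aggregated result (example value)
        "raw_metrics_path": "runs/0001/",  # raw results & system monitoring
    },
}

# Serializing the record (e.g. to publish as open data) keeps it reproducible.
print(json.dumps(run_record, indent=2))
```

Publishing such records alongside the raw measurements, as the talk does on GitHub, is what lets a third party re-run the exact same scenario.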
  16. Demo: The Open Database Performance Ranking
  17. The Open Database Performance Ranking Disclaimer: ■ do not make your final database decision based on this ranking, but use it as a starting point for application-specific benchmarks! ■ cloud resource performance volatility is not (yet) covered ■ non-production workloads based on YCSB/sysbench/TSBS ■ default database configuration
  18. The Open Database Performance Ranking https://benchant.com/ranking/database-ranking
  19. Takeaways
  20. Takeaways ■ Database benchmarks can support a broad range of use cases (cloud resource selection, database tuning, database comparisons, …) ■ Running database benchmarks in the cloud is easy; ensuring transparency and reproducibility is hard ■ Reliable and transparent benchmark results need to: ■ be automated ■ include comprehensive meta data to ensure reproducibility
  21. Thank You Stay in touch Daniel Seybold daniel.seybold@benchant.com github.com/benchANT www.linkedin.com/in/seybold-benchant
