Benchmarking has tremendously helped move the database industry and the database research community forward, especially since all database providers promise high performance and “unlimited” horizontal scalability. However, demonstrating these claims with comparable, transparent, and reproducible database benchmarks is a methodological and technical challenge faced by every research paper, whitepaper, technical blog, or customer benchmark. Moreover, running database benchmarks in the cloud adds unique challenges, since differences in infrastructure across cloud providers make apples-to-apples comparisons even more difficult.
With benchANT, we address these challenges by providing a fully automated benchmarking platform that produces comprehensive data sets to ensure full transparency and reproducibility of the benchmark results. We apply benchANT in a multi-cloud context to benchmark ScyllaDB and other NoSQL databases using established open-source benchmarks. These experiments demonstrate that, unlike many competitors, ScyllaDB delivers on its performance and scalability promises. The talk covers not only an in-depth discussion of the performance results and their impact on cloud TCO but also outlines how to specify and execute fair and comparable benchmark scenarios. All discussed benchmarking data is released as open data on GitHub to ensure full transparency and reproducibility.
Solving Mysterious DB Benchmarking Results
1. Solving the Issue of Mysterious Database Benchmarking Results
Daniel Seybold, Co-Founder at benchANT
2. Daniel Seybold
■ PhD in computer science at Ulm University (Germany)
■ Thesis topic: An automation-based approach for reproducible evaluations of distributed DBMS on elastic infrastructures
■ Co-founder of benchANT
■ Responsible for the product development at benchANT
3. Agenda
■ Is database benchmarking still relevant?
■ Challenges of reliable database benchmarks in the cloud
■ Demo: the open database performance ranking
■ Takeaways
5. Is Database Benchmarking Still Relevant?
Recent trends of the database landscape:
■ “the idea of ‘one-size-fits-all’ is over” (Stonebraker, M. et al. “‘One size fits all’ an idea
whose time has come and gone.” Making Databases Work: the Pragmatic Wisdom of
Michael Stonebraker. 2018)
■ “cloud resources have become the preferred solution to operate databases” (Abadi, D., et
al. “The seattle report on database research.” ACM SIGMOD Record. 2020)
■ “DBaaS reached mainstream and serverless DBaaS might be the future” (Abadi, D., et al.
"The seattle report on database research." ACM SIGMOD Record. 2022)
8. Is Database Benchmarking Still Relevant?
Research perspective from the Seattle Report on Database Research 2022
■ benchmarks tremendously helped move forward the database industry and the database
research community
■ … without the development of appropriate benchmarking and data sets, a fair comparison
… will not be feasible…
■ Benchmarking in the cloud environment also presents unique challenges, since differences in infrastructure across cloud providers make apples-to-apples comparisons more difficult.
9. Is Database Benchmarking Still Relevant?
Industry perspective
■ “… we recommend measuring the performance of applications to identify appropriate
instance types … we also recommend rigorous load/scale testing …” – AWS
■ “… measure everything, assume nothing …” — MongoDB
■ “Benchmarking will help you fail fast and recover fast before it’s too late.” — ScyllaDB
■ “... approach performance problems systematically and do not settle for random googling”
— Peter Zaitsev (Percona)
12. Cloud-Centric Benchmarking Process
Pipeline: benchmark objective → select & allocate resources → deploy & configure database cluster → deploy & execute benchmark → process benchmark results
■ benchmark objective: performance (/costs), scalability, availability
■ resource selection: > 20,000 public cloud instance types (> 500 AWS EC2 instance types alone)
■ database deployment: > 850 databases, > 170 DBaaS providers
■ benchmark execution: > 10 established NoSQL benchmark suites
A complex and time-consuming process ➡ automation required
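To make these four stages concrete, here is a minimal Python sketch of such a pipeline. All function names, parameters, and numbers are hypothetical stubs for illustration, not benchANT's actual API; a real implementation would call cloud provisioning, configuration management, and benchmark tooling in each stage.

```python
# Minimal sketch of the four-stage pipeline; all names are hypothetical
# stubs, not benchANT's actual API.

def allocate_resources(provider, instance_type, nodes):
    """Stage 1: provision cloud VMs; returns placeholder addresses."""
    return [f"{provider}-{instance_type}-node{i}" for i in range(nodes)]

def deploy_database(hosts, engine, config):
    """Stage 2: install and configure the database cluster (stubbed)."""
    print(f"deploying {engine} ({config}) on {len(hosts)} nodes")

def run_benchmark(hosts, suite, workload):
    """Stage 3: execute the benchmark suite; returns illustrative numbers."""
    return {"throughput_ops_s": 120_000, "p99_latency_ms": 4.2}

def process_results(raw, objective):
    """Stage 4: aggregate raw measurements against the benchmark objective."""
    return {"objective": objective, "summary": raw}

def benchmark_pipeline(objective):
    hosts = allocate_resources(objective["provider"],
                               objective["instance_type"], nodes=3)
    deploy_database(hosts, objective["engine"], objective["db_config"])
    raw = run_benchmark(hosts, objective["suite"], objective["workload"])
    return process_results(raw, objective["goal"])

print(benchmark_pipeline({
    "provider": "aws", "instance_type": "m5.xlarge",
    "engine": "scylladb", "db_config": {"nodes": 3},
    "suite": "YCSB", "workload": "A", "goal": "performance",
}))
```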
13. Database Benchmark Automation
Recent trends in database benchmark automation:
■ ScyllaDB Cluster Tests (software)
■ Creating a Virtuous Cycle in Performance Testing at MongoDB (publication)
■ Flexibench (publication & software)
■ Mowgli (publications & software)
■ benchANT (Benchmarking-as-a-Service)
14. Remaining Challenges
Even with a fully automated benchmarking process, open questions remain (a hypothetical metadata record answering them is sketched below):
■ Which region and which storage type were used?
■ Which database configuration was applied, and why?
■ What were the exact workload configurations?
■ How was the system utilized during the benchmark?
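To illustrate, a benchmark result that answers these questions could ship with a metadata record like the following sketch; all field names and values are purely illustrative, not benchANT's actual data model.

```python
# Hypothetical metadata record answering the questions above;
# all field names and values are illustrative, not benchANT's data model.
benchmark_metadata = {
    "cloud": {
        "provider": "aws",
        "region": "eu-central-1",       # which region?
        "instance_type": "m5.xlarge",
        "storage_type": "gp3",          # which storage type?
    },
    "database": {
        "engine": "scylladb",
        "version": "5.2",
        "config": {"replication_factor": 3},   # which configuration?
        "config_rationale": "default RF for a 3-node cluster",  # and why?
    },
    "workload": {
        "suite": "YCSB",
        "workload": "A",
        "threads": 64,                  # exact workload configuration
    },
    "utilization": {
        "avg_cpu_percent": 72,          # how was the system utilized?
        "max_disk_iops": 15_000,
    },
}
```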
15. Benchmark Metadata Collection
Metadata to collect at each pipeline stage:
■ select & allocate resources → cloud provider & instance metadata
■ deploy & configure database cluster → database configuration & system monitoring
■ deploy & execute benchmark → workload configuration & benchmark results
■ process benchmark results → raw & aggregated results
Comprehensive data sets are required to ensure transparent and reproducible results.
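For the resource-selection stage, much of the cloud metadata can be captured programmatically on the machine under test. Below is a minimal sketch for AWS using the EC2 instance metadata service (IMDSv2); it only works when executed on an EC2 instance, and the choice of fields is an illustrative assumption.

```python
# Capture basic EC2 instance facts for the benchmark metadata record
# via the instance metadata service (IMDSv2); EC2-only.
import urllib.request

IMDS = "http://169.254.169.254/latest"

def imds_get(path, token):
    """Fetch one metadata field, authenticated with the IMDSv2 token."""
    req = urllib.request.Request(
        f"{IMDS}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token})
    return urllib.request.urlopen(req, timeout=2).read().decode()

def collect_instance_metadata():
    # IMDSv2: request a session token first, then query metadata paths.
    token_req = urllib.request.Request(
        f"{IMDS}/api/token", method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"})
    token = urllib.request.urlopen(token_req, timeout=2).read().decode()
    return {
        "instance_type": imds_get("instance-type", token),
        "availability_zone": imds_get("placement/availability-zone", token),
        "ami_id": imds_get("ami-id", token),
    }

if __name__ == "__main__":
    print(collect_instance_metadata())
```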
17. The Open Database Performance Ranking
Disclaimer:
■ do not make your final database decision based on this ranking, but use it as a starting point for application-specific benchmarks!
■ cloud resource performance volatility is not (yet) covered
■ non-production workloads based on YCSB/sysbench/TSBS (an example YCSB workload definition is sketched below)
■ default database configuration
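For context on the workload disclaimer above, YCSB's core workload A (the classic 50/50 read/update mix with a Zipfian request distribution) is defined by a handful of properties. The sketch below writes them out as a workload file so the exact configuration stays auditable; the values mirror the defaults shipped with recent YCSB releases, where the tiny record and operation counts are placeholders that real runs scale up.

```python
# YCSB core workload A properties (defaults as shipped in recent YCSB
# releases); written to a file that the ycsb CLI consumes via -P.
ycsb_workload_a = {
    "workload": "site.ycsb.workloads.CoreWorkload",
    "recordcount": 1000,        # default placeholder; scaled up in real runs
    "operationcount": 1000,
    "readallfields": "true",
    "readproportion": 0.5,      # 50% reads
    "updateproportion": 0.5,    # 50% updates
    "scanproportion": 0,
    "insertproportion": 0,
    "requestdistribution": "zipfian",
}

with open("workloada.properties", "w") as f:
    for key, value in ycsb_workload_a.items():
        f.write(f"{key}={value}\n")
```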
18. The Open Database Performance Ranking
https://benchant.com/ranking/database-ranking
20. Takeaways
■ Database benchmarks can support a broad range of use cases (cloud resource selection,
database tuning, database comparisons, …)
■ Running database benchmarks in the cloud is easy; ensuring transparency and reproducibility is hard
■ Reliable and transparent benchmark results require:
■ a fully automated benchmarking process
■ comprehensive metadata to ensure reproducibility
21. Thank You
Stay in touch
Daniel Seybold
daniel.seybold@benchant.com
github.com/benchANT
www.linkedin.com/in/seybold-benchant