
Solving the Issue of Mysterious Database Benchmarking Results



Benchmarking tremendously helps move the database industry and the database research community forward, especially since all database providers promise high performance and “unlimited” horizontal scalability. However, demonstrating these claims with comparable, transparent and reproducible database benchmarks is a methodological and technical challenge faced by every research paper, whitepaper, technical blog or customer benchmark. Moreover, running database benchmarks in the cloud adds unique challenges, since differences in infrastructure across cloud providers make apples-to-apples comparisons even more difficult.

With benchANT, we address these challenges by providing a fully automated benchmarking platform that produces comprehensive data sets to ensure full transparency and reproducibility of the benchmark results. We apply benchANT in a multi-cloud context to benchmark ScyllaDB and other NoSQL databases using established open source benchmarks. These experiments demonstrate that, unlike many competitors, ScyllaDB is able to keep its performance and scalability promises. The talk covers not only an in-depth discussion of the performance results and their impact on cloud TCO, but also outlines how to specify fair and comparable benchmark scenarios and how to execute them. All discussed benchmarking data is released as open data on GitHub to ensure full transparency and reproducibility.



  1. Solving the Issue of Mysterious Database Benchmarking Results Daniel Seybold, Co-Founder at benchANT
  2. Daniel Seybold ■ PhD in computer science at Ulm University (Germany) ■ Thesis topic: An automation-based approach for reproducible evaluations of distributed DBMS on elastic infrastructures ■ Co-founder of benchANT ■ Responsible for product development at benchANT
  3. Agenda ■ Is database benchmarking still relevant? ■ Challenges of reliable database benchmarks in the cloud ■ Demo: the open database performance ranking ■ Takeaways
  4. Is Database Benchmarking Still Relevant?
  5. Is Database Benchmarking Still Relevant? Recent trends in the database landscape: ■ “the idea of ‘one-size-fits-all’ is over” (Stonebraker, M. et al. “‘One size fits all’: an idea whose time has come and gone.” Making Databases Work: The Pragmatic Wisdom of Michael Stonebraker. 2018) ■ “cloud resources have become the preferred solution to operate databases” (Abadi, D., et al. “The Seattle report on database research.” ACM SIGMOD Record. 2020) ■ “DBaaS reached mainstream and serverless DBaaS might be the future” (Abadi, D., et al. “The Seattle report on database research.” ACM SIGMOD Record. 2022)
  6. Promises of the Database Landscape
  7. How to Verify the Promises?
  8. Is Database Benchmarking Still Relevant? Research perspective from the Seattle Report on Database Research 2022: ■ benchmarks tremendously helped move forward the database industry and the database research community ■ … without the development of appropriate benchmarking and data sets, a fair comparison … will not be feasible … ■ benchmarking in the cloud environment also presents unique challenges, since differences in infrastructure across cloud providers make apples-to-apples comparisons more difficult
  9. Is Database Benchmarking Still Relevant? Industry perspective: ■ “… we recommend measuring the performance of applications to identify appropriate instance types … we also recommend rigorous load/scale testing …” — AWS ■ “… measure everything, assume nothing …” — MongoDB ■ “Benchmarking will help you fail fast and recover fast before it’s too late.” — ScyllaDB ■ “… approach performance problems systematically and do not settle for random googling” — Peter Zaitsev (Percona)
  10. Database Benchmarking Use Cases
  11. Challenges of Reliable Database Benchmarks in the Cloud
  12. Cloud-Centric Benchmarking Process ■ benchmark objective: performance (/costs), scalability, availability ■ select & allocate resources: > 500 AWS EC2 instance types, > 20,000 public cloud instance types ■ deploy & configure database cluster: > 850 databases, > 170 DBaaS providers ■ deploy & execute benchmark: > 10 established NoSQL benchmark suites ■ process benchmark results ➡ a complex and time-consuming process ➡ automation required
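The four-step process on this slide can be sketched as an orchestration skeleton. This is a minimal illustration only: every function name, parameter and return value below is a hypothetical placeholder, not benchANT's actual API; a real pipeline would call cloud provider APIs, deployment tooling and a benchmark suite such as YCSB in each step.

```python
# Sketch of the cloud-centric benchmarking process (all bodies are stubs).

def select_and_allocate_resources(provider, instance_type, count):
    """Step 1: choose a cloud provider and allocate VM instances."""
    return [f"{provider}-{instance_type}-{i}" for i in range(count)]

def deploy_and_configure_database(nodes, database, config):
    """Step 2: install the database on the nodes and apply its configuration."""
    return {"nodes": nodes, "database": database, "config": config}

def deploy_and_execute_benchmark(cluster, workload):
    """Step 3: run the workload against the cluster and collect raw metrics."""
    return {"workload": workload, "throughput_ops_s": 0, "latencies_ms": {}}

def process_results(raw_metrics):
    """Step 4: aggregate raw metrics toward the benchmark objective."""
    return {"objective": "performance", **raw_metrics}

def run_benchmark(provider, instance_type, count, database, config, workload):
    """End-to-end pipeline: the four steps chained, ready for automation."""
    nodes = select_and_allocate_resources(provider, instance_type, count)
    cluster = deploy_and_configure_database(nodes, database, config)
    raw = deploy_and_execute_benchmark(cluster, workload)
    return process_results(raw)
```

Chaining the steps behind one entry point like `run_benchmark` is what makes the process repeatable: the same inputs always trigger the same allocation, deployment and execution sequence.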
  13. Database Benchmark Automation Recent trends in database benchmark automation: ■ ScyllaDB Cluster Tests (software) ■ Creating a Virtuous Cycle in Performance Testing at MongoDB (publication) ■ Flexibench (publication & software) ■ Mowgli (publications & software) ■ benchANT (Benchmarking-as-a-Service)
  14. Remaining Challenges Even with a fully automated benchmarking process (select & allocate resources ➡ deploy & configure database cluster ➡ deploy & execute benchmark ➡ process benchmark results), open questions remain: ■ Which region and which storage type have been used? ■ Which database configuration has been applied? ■ What were the exact workload configurations? ■ How was the system utilized?
  15. Benchmark Meta Data Collection ■ select & allocate resources: cloud provider & instance metadata ■ deploy & configure database cluster: database configuration & system monitoring ■ deploy & execute benchmark: workload configuration & benchmark results ■ process benchmark results: raw & aggregated results ➡ comprehensive data sets are required to ensure transparent and reproducible results
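The meta data categories on this slide could be captured as one structured record per benchmark run. The field names and values below are made up for illustration (they are not benchANT's schema); the point is that the cloud, database, workload and results metadata all travel together with the numbers.

```python
import json

# Illustrative meta data record for a single benchmark run. Keeping all four
# categories next to the results is what answers the "remaining challenges"
# questions (region? storage type? configuration? workload?) after the fact.
run_record = {
    "cloud": {
        "provider": "aws",             # which cloud
        "region": "eu-central-1",      # which region (often omitted in reports)
        "instance_type": "m5.2xlarge",
        "storage_type": "gp3",         # which storage type
    },
    "database": {
        "name": "scylladb",
        "version": "5.1",
        "cluster_size": 3,
        "non_default_config": {},      # which configuration has been applied
    },
    "workload": {
        "suite": "YCSB",
        "workload": "workload-a",
        "record_count": 10_000_000,
        "threads": 128,
    },
    "results": {
        "throughput_ops_s": 120_000,       # aggregated result (example value)
        "raw_metrics_path": "runs/0001/",  # raw results & system monitoring
    },
}

# Serializing the record (e.g. to publish as open data) keeps it reproducible.
print(json.dumps(run_record, indent=2))
```

Publishing such records alongside the raw measurements, as the talk does on GitHub, is what lets a third party re-run the exact same scenario.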
  16. Demo: The Open Database Performance Ranking
  17. The Open Database Performance Ranking Disclaimer: ■ do not make your final database decision based on this ranking, but use it as a starting point for application-specific benchmarks! ■ cloud resource performance volatility is not (yet) covered ■ non-production workloads based on YCSB/sysbench/TSBS ■ default database configuration
  18. The Open Database Performance Ranking https://benchant.com/ranking/database-ranking
  19. Takeaways
  20. Takeaways ■ Database benchmarks can support a broad range of use cases (cloud resource selection, database tuning, database comparisons, …) ■ Running database benchmarks in the cloud is easy; ensuring transparency and reproducibility is hard ■ Reliable and transparent benchmark results need to: ■ be automated ■ include comprehensive meta data to ensure reproducibility
  21. Thank You Stay in touch Daniel Seybold daniel.seybold@benchant.com github.com/benchANT www.linkedin.com/in/seybold-benchant
