Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016

This presentation walks through key considerations for planning and running load tests to ensure your Cassandra application will meet your expected scaling requirements. It also walks through examples of using the cassandra-stress tool to construct load tests for real-life application scenarios.

About the Speaker
Ben Slater, Chief Product Officer, Instaclustr

Instaclustr provides Cassandra and Spark as a managed service in the cloud. As Chief Product Officer, Ben is charged with steering Instaclustr's development roadmap, managing product engineering, and overseeing the production support and consulting teams. Ben has over 20 years of experience in systems development, including time as lead architect for the product that is now Oracle Policy Automation and over 10 years as a solution architect and project manager for Accenture.

Published in: Software

  1. Load Testing Cassandra Applications (Ben Slater, Instaclustr)
  2. Introduction
     • Ben Slater, Chief Product Officer, Instaclustr
     • Cassandra + Spark Managed Service, Support, Consulting
     • 20+ years experience as a developer, architect and dev/dev-ops team lead
     • DataStax MVP for Apache Cassandra
     © DataStax, All Rights Reserved.
  3. Load Testing Cassandra Applications
     1. Load testing background
     2. Cassandra-specific considerations
     3. cassandra-stress walkthrough
  4. Why Load Test?
     • Benchmarking to compare configurations
     • Prove ability to handle forecast peak application load
     • Prove application stability under sustained application load
     • Establish parameters for capacity forecasting models
  5. Planning a Load Test
     • Need to understand or estimate:
        • peak minute/10 minutes/hour/day in terms of reads/writes per second (and types of reads/writes)
        • data demographics
        • production hardware configuration
     • Evaluate options for load generation:
        • drive load through the application
        • drive load through a custom harness
        • cassandra-stress
        • other options:
           • JMeter with the Cassandra plug-in
           • YCSB
     • Test environment sizing:
        • ideally, full production size
        • 50% or 30% of production size is probably acceptable for large environments (assuming a good-practice data model)
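
The sizing guidance on the planning slide can be turned into simple arithmetic. The following is a minimal sketch, not part of the talk: it assumes throughput scales roughly linearly with cluster size (which the slide only suggests for a good-practice data model), and the `PeakLoad` type, the `scaled_target` helper and the peak figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeakLoad:
    """Estimated production peak, in operations per second."""
    reads_per_sec: float
    writes_per_sec: float


def scaled_target(peak: PeakLoad, env_fraction: float) -> PeakLoad:
    """Scale a production peak down to a smaller test environment.

    ASSUMPTION: throughput scales linearly with environment size.
    Treat the result as a starting point, not a guarantee.
    """
    if not 0 < env_fraction <= 1:
        raise ValueError("env_fraction must be in (0, 1]")
    return PeakLoad(
        reads_per_sec=peak.reads_per_sec * env_fraction,
        writes_per_sec=peak.writes_per_sec * env_fraction,
    )


# Hypothetical production estimate: 20k reads/s and 8k writes/s at peak.
production_peak = PeakLoad(reads_per_sec=20_000, writes_per_sec=8_000)

# Target load for a test environment half the size of production.
print(scaled_target(production_peak, env_fraction=0.5))
```

Under the linearity assumption, a test cluster at 50% of production size would need to sustain half of the forecast peak to support the same conclusion.
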
  6. Executing a Load Test
     • Record everything!
     • Ensure load client is not a bottleneck
     • Understand natural variance between tests
     • Make sure you understand the bottleneck in the system under load
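
One way to get a feel for natural variance between tests is to repeat an identical run several times and summarize the spread before comparing configurations. This is a minimal sketch under that assumption; the function names, the throughput figures and the two-standard-deviation threshold are illustrative choices, not something prescribed by the talk.

```python
import statistics


def summarize_runs(ops_per_sec):
    """Return mean, sample standard deviation and coefficient of variation
    for the throughput of repeated, identical test runs."""
    if len(ops_per_sec) < 2:
        raise ValueError("need at least two runs to estimate variance")
    mean = statistics.mean(ops_per_sec)
    stdev = statistics.stdev(ops_per_sec)
    return {"mean": mean, "stdev": stdev, "cv": stdev / mean}


def exceeds_noise(baseline_runs, candidate, k=2.0):
    """True if a candidate result differs from the baseline mean by more
    than k baseline standard deviations (k=2.0 is an arbitrary choice)."""
    summary = summarize_runs(baseline_runs)
    return abs(candidate - summary["mean"]) > k * summary["stdev"]


# Hypothetical throughput (ops/s) from three identical baseline runs.
baseline = [9_000, 10_000, 11_000]
print(summarize_runs(baseline))

# A later run with a changed configuration: is the difference outside the noise?
print(exceeds_noise(baseline, candidate=12_500))
```

A difference that stays inside the run-to-run spread is weak evidence that a configuration change had any effect.
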
  7. Cassandra-specific considerations
     • Background operations:
        • compactions
        • repairs
     • Data conditions:
        • tombstones
        • skewed partitions
        • cache hit rates (including OS cache)
     • Non-scaling or poorly scaling operations:
        • secondary indexes
        • logged batches
        • multi-partition queries
        • UDFs/UDAs?
  8. cassandra-stress
     • Stress tool provided with Cassandra
     • Able to simulate many application scenarios (although still not a perfect substitute for testing via your application)
     • Supports basic read/write/mixed commands, plus more sophisticated and custom testing via YAML configuration
     • Can even graph your results
     • Currently one table at a time, but watch CASSANDRA-8780
  9. cassandra-stress yaml file walkthrough (1)

         #
         # Keyspace name and create CQL
         #
         keyspace: stressexample
         keyspace_definition: |
           CREATE KEYSPACE stressexample WITH replication = {'class': 'NetworkTopologyStrategy', 'AWS_VPC_US_WEST_2': '2'};
         #
         # Table name and create CQL
         #
         table: eventsrawtest
         table_definition: |
           CREATE TABLE eventsrawtest (
             host text,
             bucket_time text,
             service text,
             time timestamp,
             metric double,
             state text,
             PRIMARY KEY ((host, bucket_time, service), time)
           ) WITH CLUSTERING ORDER BY (time DESC)

  10. cassandra-stress yaml file walkthrough (2)

         #
         # Meta information for generating data
         #
         columnspec:
           - name: host
             size: fixed(32) # In chars, no. of chars of UUID
             population: uniform(1..600) # About 600 hosts with equal events per host
           - name: bucket_time
             size: fixed(18)
             population: seq(1..288) # 288 potential buckets
           - name: service
             size: uniform(10..100)
             population: gaussian(1000..2000) # 1000 - 2000 metrics per host
           - name: time
             cluster: fixed(15)

  11. cassandra-stress yaml file walkthrough (3)

         #
         # Specs for insert queries
         #
         insert:
           partitions: fixed(1) # 1 partition per batch
           batchtype: UNLOGGED # use unlogged batches
           select: fixed(10)/10 # chance of skipping a row when generating inserts
         #
         # Read queries to run against the schema
         #
         queries:
           pull-for-rollup:
             cql: select * from eventsrawtest where host = ? and service = ? and bucket_time = ?
             fields: samerow
           get-a-value:
             cql: select * from eventsrawtest where host = ? and service = ? and bucket_time = ? and time = ?
             fields: multirow

  12. Misc cassandra-stress tips
     • Use -rate threads= or throttle= to control the level of load generated
     • When using the write, read or mixed commands (simple tests), beware that n= (or duration=) impacts default population generation
     • Use a sequence distribution for the initial base data load
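
The tips above can be combined with the YAML profile from the walkthrough into a single invocation. The following is a rough sketch of a helper that assembles such a command line; the `stress_command` function, the `stress.yaml` file name and the node address are hypothetical, and the exact option spelling (for example `throttle=` versus the `limit=` used by older releases) varies between Cassandra versions, so it is worth checking `cassandra-stress help -rate` on your installation.

```python
import shlex


def stress_command(profile, ops, n=None, duration=None,
                   threads=None, throttle=None, pop_seq=None, nodes=None):
    """Build an argument list for `cassandra-stress user ...`.

    ops      -- mapping of operation name to weight, e.g. {"insert": 1}
    n        -- total number of operations (exclusive with duration)
    duration -- run length such as "10m" (exclusive with n)
    throttle -- target operations per second
    pop_seq  -- (start, end) for a sequential population, e.g. a base data load
    """
    if (n is None) == (duration is None):
        raise ValueError("specify exactly one of n or duration")
    ops_arg = "ops(" + ",".join(f"{name}={w}" for name, w in ops.items()) + ")"
    cmd = ["cassandra-stress", "user", f"profile={profile}", ops_arg]
    cmd.append(f"n={n}" if n is not None else f"duration={duration}")
    rate = []
    if threads is not None:
        rate.append(f"threads={threads}")
    if throttle is not None:
        rate.append(f"throttle={throttle}/s")
    if rate:
        cmd += ["-rate", *rate]
    if pop_seq is not None:
        cmd += ["-pop", f"seq={pop_seq[0]}..{pop_seq[1]}"]
    if nodes:
        cmd += ["-node", ",".join(nodes)]
    return cmd


# Base data load: inserts only, sequential population, fixed thread count.
load = stress_command("stress.yaml", {"insert": 1}, n=1_000_000,
                      threads=50, pop_seq=(1, 1_000_000), nodes=["10.0.0.1"])
print(shlex.join(load))

# Mixed run using the query names defined in the profile, throttled to 5000 ops/s.
mixed = stress_command("stress.yaml", {"insert": 1, "pull-for-rollup": 2},
                       duration="10m", threads=50, throttle=5000,
                       nodes=["10.0.0.1"])
print(shlex.join(mixed))
```

Building the argument list in code keeps the parameters of each run explicit, which also helps with the earlier advice to record everything.
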
  13. Questions?
     Blogs:
     • Part 1: http://bit.ly/stressblog1
     • Part 2: http://bit.ly/stressblog2
     • Part 3: http://bit.ly/stressblog3
     • (One or two more to come …)
     Thanks for attending! Have a beer with the Instaclustr Tech Team at 7:30 PM, The Market Room, Hilton.