SlideShare a Scribd company logo
1 of 56
Download to read offline
Cassandra Drivers
instaclustr.com
@Instaclustr
Who am I and what do I do?
• Ben Bromhead
• Co-founder and CTO of Instaclustr -> www.instaclustr.com
• Instaclustr provides Cassandra-as-a-Service in the cloud
• Currently in AWS, Azure and IBM Softlayer
• We currently manage 400+ nodes
What this talk will cover
• Driver basics
• Sync vs Async
• Driver connection policies and tuning
The driver
• The Cassandra driver contains the logic for connecting to
Cassandra and running queries in a fast and efficient manner
• Focus on the Datastax Open Source drivers:
The driver
• Java
• .NET (C#)
• C/C++
• Python
• Node.js
• Ruby
• PHP
Cassandra Drivers
• All have a similar architecture that consists of:
• Session & pool management
• Chainable policies for managing failure and performance
• Sync vs Async queries
• Failover & Retry
• Tracing
Cassandra Drivers
A basic example in Java:
Cluster	
  cluster	
  =	
  Cluster.builder()	
  
	
  	
  	
  	
  .addContactPoints("52.89.183.67")	
  
	
  	
  	
  	
  .withPort(9042)	
  
	
  	
  	
  	
  .build();	
  
Session	
  session	
  =	
  cluster.newSession();	
  
session.execute("SELECT	
  *	
  FROM	
  foo…");
Cassandra Drivers
A basic example in Python:
cluster	
  =	
  Cluster(contact_points=["52.89.183.67"],	
  port=9042)	
  
session	
  =	
  cluster.connect()	
  
rows	
  =	
  session.execute("SELECT	
  name,	
  age,	
  email	
  FROM	
  users")
Cassandra Drivers
A basic example in Ruby:
cluster	
  =	
  Cassandra.cluster(	
  
	
  	
  	
  	
  :hosts	
  =>	
  ["52.89.183.67",	
  "52.89.99.88",	
  "54.69.217.141"],
	
  	
  	
  	
  :datacenter	
  =>	
  'AWS_VPC_US_WEST_2'	
  
)	
  
session	
  =	
  cluster.connect()	
  
rows	
  =	
  session.execute("SELECT	
  name,	
  age,	
  email	
  FROM	
  users")
Cassandra Drivers
• Architecture makes the driver similar across languages
• What happens under the hood?
• Cluster object creates configuration (auth, load balancing, contact
points).
• Session object holds the thread pool and manages connections.
• Session object authenticates and maintains connections.
• Session can be shared and is threadsafe!
Different ways of querying
• Synchronous:
session.execute("SELECT	
  *	
  FROM	
  foo..”);
• Asynchronous:
ResultSetFuture	
  result	
  =	
  session.executeAsync("SELECT	
  *	
  FROM	
  
foo..”);	
  
result.get();
How do these perform?
Operations
0
7500
15000
22500
30000
Read Sync Write Sync Read Async Write Async
Op/s
How do these perform?
Latency
0
20
40
60
80
Read Sync Write Sync Read Async Write Async
ms
Different ways of querying
Prepared Statements:
PreparedStatement	
  statement	
  =	
  getSession().prepare(	
  
	
  	
  	
  	
  	
  	
  "INSERT	
  INTO	
  simplex.songs	
  "	
  +	
  
	
  	
  	
  	
  	
  	
  "(id,	
  title,	
  album,	
  artist,	
  tags)	
  "	
  +	
  
	
  	
  	
  	
  	
  	
  "VALUES	
  (?,	
  ?,	
  ?,	
  ?,	
  ?);");
Different ways of querying
boundStatement	
  =	
  new	
  BoundStatement(statement);	
  
getSession().execute(boundStatement.bind(	
  
	
  	
  	
  	
  	
  	
  UUID.fromString("2cc9ccb7-­‐6221-­‐4ccb-­‐8387-­‐f22b6a1b354d"),	
  
	
  	
  	
  	
  	
  	
  UUID.fromString("756716f7-­‐2e54-­‐4715-­‐9f00-­‐91dcbea6cf50"),	
  
	
  	
  	
  	
  	
  	
  "La	
  Petite	
  Tonkinoise",	
  
	
  	
  	
  	
  	
  	
  "Bye	
  Bye	
  Blackbird",	
  
	
  	
  	
  	
  	
  	
  "Joséphine	
  Baker")	
  );
Drivers and consistency
• Within the different ways of querying Cassandra you can also adjust
the consistency level per query.
• Lets have a quick consistency refresh
A brief intro to tuneable consistency
• Cassandra is considered to be a db that favours Availability and
Partition Tolerance.
• Let’s you change those characteristics per query to suit your
application requirement
Two consistency levers
• Consistency level - How many acknowledgements/responses from
replicas before a query is considered a success.
• Replication Factor (RF) - How many copies of a record do I store.
Two consistency levers
• Consistency level - Chosen by the client at query time
• Replication Factor (RF) - Determined client on schema definition
Consistency Levels
• ALL - Every replica
• *QUORUM - (EACH_QUORUM, QUORUM, LOCAL_QUORUM)
• Numbered - (ONE, TWO, THREE, LOCAL_ONE)
• *SERIAL - (SERIAL, LOCAL_SERIAL)
• ANY
What does it all mean
• At the client level (your application) you have total control
• Define implicit and explicit failure handling
• Isolate queries to a single geography
• Trade consistency for latency (a decision is better than no
decision)
How does it all work?
Write
CL:QUORUM
RF:3
1
2
3
4
partition_key: a
How does it all work?
Write
CL:QUORUM
RF:3
1
2
3
4
partition_key: a
How does it all work?
Write
CL:QUORUM
RF:3
1
2
3
4
partition_key: a
How does it all work?
Write
CL:QUORUM
RF:3
1
2
3
4
partition_key: a
How does it all work?
How does CL impact Op/s ?
Operations
0
7500
15000
22500
30000
Read Sync Write Sync Read Async Write Async
ONE QUORUM ALL
How does CL impact latency ?
Latency
0
30
60
90
120
Read Sync Write Sync Read Async Write Async
ONE QUORUM ALL
What happens when something goes wrong?
Write
CL:QUORUM
RF:3
1
2
3
4
partition_key: b
How does it all work?
Write
CL:QUORUM
RF:3
1
2
3
4
partition_key: b
How does it all work?
Write
CL:QUORUM
RF:3
1
2
3
4
partition_key: b
How does it all work?
Write
CL:QUORUM
RF:3
1
2
3
4
partition_key: b
How does it all work?
✓✓
Required responses:
floor(3 * 0.5) + 1 = 2
Write
CL:QUORUM
RF:3
1
2
3
4
partition_key: b
How does it all work?
✓✓
Success!
How does an outage impact Op/s ?
Operations
0
7500
15000
22500
30000
Read Sync Write Sync Read Async Write Async
ONE QUORUM ALL
How does an outage impact latency ?
Latency
0
25
50
75
100
Read Sync Write Sync Read Async Write Async
ONE QUORUM ALL
We are now have a replica that is not
consistent
• Anti-entropy repair (only guaranteed way to make things consistent)
• Hinted handoff
• Read repair
We are now have a replica that is not
consistent
• Anti-entropy repair (only guaranteed way to make things consistent)
• Hinted handoff - lets cover this quickly
• Read repair
What is hinted handoff ?
• A performance optimisation for “catching up” nodes who missed
writes.
What isn’t hinted handoff ?
• A consistent distribution mechanism
Write
CL:QUORUM
RF:3
partition_key: b
1
2
3
4
How does it all work?
Write
CL:QUORUM
RF:3
partition_key: b
1
2
3
4
How does it all work?
How does hinted handoff work?
1
2
3
4
host / key A B
1 ✔ ✔
2 ?
3 ✔ ✔
…
✔
How does hinted handoff work?
partition_key: b
1
2
3
4
How does hinted handoff work?
partition_key: b
1
2
3
4
Gossip: 2 is now UP
Node 1: I have stored hints for 2
How does hinted handoff work?
partition_key: b
1
2
3
4
Some things to keep in mind
• Cassandra will only store hints for a certain period of time, set by
max_hint_window_in_ms. 3 hours by default
• Hints are not a reliable delivery mechanism
• Hint replay will cause counters to overcoat
• CF of ANY will cause a hint to be stored even if no replicas are
available. Sometimes called extreme availability… also called who
knows where and if your data is safe?
Hinted handoff performance
• Causes the same volume of writes to occur in a cluster with
reduced capacity (local write amplification on the co-ordinator
node)
• Hints are written to system.hints, each replica has hints stored in a
single partition.
• Hints use TTLs and tombstones.. the hint table is actually a queue!
• When cassandra starts compacting or throwing tombstone
warnings on the system.hints table… things are bad
Hinted handoff performance
• Rewritten in Cassandra 3.0 (in beta now)
• Takes a commitlog approach:
• No compaction
• no TTL
• no tombstones
• no memtables
How does this relate to the driver?
• With a node outage the “latency” on the down node becomes
hours/days until it becomes consistent
• Cassandra itself takes over the client portion of ensuring the write
makes it to the node that was down.
• You can control whether C* handles this (via repair, HH etc) or
whether your application controls this (have your client receive an
exception instead).
Driver policies
• Cassandra driver policies allow you to control failure
• Cassandra driver policies allow you to control how the driver routes
requests
• This can reduce your latency and/or increase op/s (in some cases)
Retry Policy
• Default Retry Policy
• Downgrading Consistency Retry Policy
• Fall through Retry Policy
• Logging Retry Policy
Load Balancing Policy
• Round Robin
• DC Aware
• TokenAware
• LatencyAware
Driver policies impact latency ?
Latency
0
0.3
0.6
0.9
1.2
Read Sync Write Sync
Round Robin Token Aware Latency Aware
Last but not least
• Use one Cluster instance per (physical) cluster (per application
lifetime)
• Use at most one Session per keyspace, or use a single Session and
explicitly specify the keyspace in your queries
• If you execute a statement more than once, consider using a
PreparedStatement
• You can reduce the number of network roundtrips and also have
atomic operations by using Batches
Thank you!
Questions?

More Related Content

What's hot

Highly Available MySQL/PHP Applications with mysqlnd
Highly Available MySQL/PHP Applications with mysqlndHighly Available MySQL/PHP Applications with mysqlnd
Highly Available MySQL/PHP Applications with mysqlndJervin Real
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaJiangjie Qin
 
How Yelp does Service Discovery
How Yelp does Service DiscoveryHow Yelp does Service Discovery
How Yelp does Service DiscoveryJohn Billings
 
Scaling with sync_replication using Galera and EC2
Scaling with sync_replication using Galera and EC2Scaling with sync_replication using Galera and EC2
Scaling with sync_replication using Galera and EC2Marco Tusa
 
Docker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsDocker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsFederico Michele Facca
 
Load Balancing with Nginx
Load Balancing with NginxLoad Balancing with Nginx
Load Balancing with NginxMarian Marinov
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
 
Troubleshooting RabbitMQ and services that use it
Troubleshooting RabbitMQ and services that use itTroubleshooting RabbitMQ and services that use it
Troubleshooting RabbitMQ and services that use itMichael Klishin
 
Redis trouble shooting_eng
Redis trouble shooting_engRedis trouble shooting_eng
Redis trouble shooting_engDaeMyung Kang
 
Tuning the Kernel for Varnish Cache
Tuning the Kernel for Varnish CacheTuning the Kernel for Varnish Cache
Tuning the Kernel for Varnish CachePer Buer
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathYahoo Developer Network
 
HighLoad Solutions On MySQL / Xiaobin Lin (Alibaba)
HighLoad Solutions On MySQL / Xiaobin Lin (Alibaba)HighLoad Solutions On MySQL / Xiaobin Lin (Alibaba)
HighLoad Solutions On MySQL / Xiaobin Lin (Alibaba)Ontico
 
Nginx Internals
Nginx InternalsNginx Internals
Nginx InternalsJoshua Zhu
 
Consul - service discovery and others
Consul - service discovery and othersConsul - service discovery and others
Consul - service discovery and othersWalter Liu
 
PostgreSQL: Welcome To Total Security
PostgreSQL: Welcome To Total SecurityPostgreSQL: Welcome To Total Security
PostgreSQL: Welcome To Total SecurityRobert Bernier
 
Varnish Cache 4.0 / Redpill Linpro breakfast in Oslo
Varnish Cache 4.0 / Redpill Linpro breakfast in OsloVarnish Cache 4.0 / Redpill Linpro breakfast in Oslo
Varnish Cache 4.0 / Redpill Linpro breakfast in OsloPer Buer
 

What's hot (20)

Highly Available MySQL/PHP Applications with mysqlnd
Highly Available MySQL/PHP Applications with mysqlndHighly Available MySQL/PHP Applications with mysqlnd
Highly Available MySQL/PHP Applications with mysqlnd
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
 
Redis ndc2013
Redis ndc2013Redis ndc2013
Redis ndc2013
 
How Yelp does Service Discovery
How Yelp does Service DiscoveryHow Yelp does Service Discovery
How Yelp does Service Discovery
 
Shootout at the AWS Corral
Shootout at the AWS CorralShootout at the AWS Corral
Shootout at the AWS Corral
 
Scaling with sync_replication using Galera and EC2
Scaling with sync_replication using Galera and EC2Scaling with sync_replication using Galera and EC2
Scaling with sync_replication using Galera and EC2
 
Docker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsDocker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platforms
 
Load Balancing with Nginx
Load Balancing with NginxLoad Balancing with Nginx
Load Balancing with Nginx
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Troubleshooting RabbitMQ and services that use it
Troubleshooting RabbitMQ and services that use itTroubleshooting RabbitMQ and services that use it
Troubleshooting RabbitMQ and services that use it
 
Redis trouble shooting_eng
Redis trouble shooting_engRedis trouble shooting_eng
Redis trouble shooting_eng
 
Fail over fail_back
Fail over fail_backFail over fail_back
Fail over fail_back
 
Tuning the Kernel for Varnish Cache
Tuning the Kernel for Varnish CacheTuning the Kernel for Varnish Cache
Tuning the Kernel for Varnish Cache
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
 
HighLoad Solutions On MySQL / Xiaobin Lin (Alibaba)
HighLoad Solutions On MySQL / Xiaobin Lin (Alibaba)HighLoad Solutions On MySQL / Xiaobin Lin (Alibaba)
HighLoad Solutions On MySQL / Xiaobin Lin (Alibaba)
 
Nginx Internals
Nginx InternalsNginx Internals
Nginx Internals
 
Consul - service discovery and others
Consul - service discovery and othersConsul - service discovery and others
Consul - service discovery and others
 
How to monitor NGINX
How to monitor NGINXHow to monitor NGINX
How to monitor NGINX
 
PostgreSQL: Welcome To Total Security
PostgreSQL: Welcome To Total SecurityPostgreSQL: Welcome To Total Security
PostgreSQL: Welcome To Total Security
 
Varnish Cache 4.0 / Redpill Linpro breakfast in Oslo
Varnish Cache 4.0 / Redpill Linpro breakfast in OsloVarnish Cache 4.0 / Redpill Linpro breakfast in Oslo
Varnish Cache 4.0 / Redpill Linpro breakfast in Oslo
 

Similar to Cassandra Driver Performance Tuning

AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...Amazon Web Services
 
Right-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual MachineRight-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual Machineheraflux
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACKristofferson A
 
Three Perspectives on Measuring Latency
Three Perspectives on Measuring LatencyThree Perspectives on Measuring Latency
Three Perspectives on Measuring LatencyScyllaDB
 
Where Django Caching Bust at the Seams
Where Django Caching Bust at the SeamsWhere Django Caching Bust at the Seams
Where Django Caching Bust at the SeamsConcentric Sky
 
Aerospike Go Language Client
Aerospike Go Language ClientAerospike Go Language Client
Aerospike Go Language ClientSayyaparaju Sunil
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
Cassandra serving netflix @ scale
Cassandra serving netflix @ scaleCassandra serving netflix @ scale
Cassandra serving netflix @ scaleVinay Kumar Chella
 
Real-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and StormReal-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and StormJohn Georgiadis
 
(ARC310) Solving Amazon's Catalog Contention With Amazon Kinesis
(ARC310) Solving Amazon's Catalog Contention With Amazon Kinesis(ARC310) Solving Amazon's Catalog Contention With Amazon Kinesis
(ARC310) Solving Amazon's Catalog Contention With Amazon KinesisAmazon Web Services
 
The impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves GoelevenThe impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves GoelevenParticular Software
 
KoprowskiT - SQLBITS X - 2am a disaster just began
KoprowskiT - SQLBITS X - 2am a disaster just beganKoprowskiT - SQLBITS X - 2am a disaster just began
KoprowskiT - SQLBITS X - 2am a disaster just beganTobias Koprowski
 
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...DataStax
 
Database and Public Endpoints redundancy on Azure
Database and Public Endpoints redundancy on AzureDatabase and Public Endpoints redundancy on Azure
Database and Public Endpoints redundancy on AzureRadu Vunvulea
 
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...confluent
 
Performance Oriented Design
Performance Oriented DesignPerformance Oriented Design
Performance Oriented DesignRodrigo Campos
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...DataStax Academy
 

Similar to Cassandra Driver Performance Tuning (20)

AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
 
Right-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual MachineRight-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual Machine
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
 
Three Perspectives on Measuring Latency
Three Perspectives on Measuring LatencyThree Perspectives on Measuring Latency
Three Perspectives on Measuring Latency
 
Scalable Web Apps
Scalable Web AppsScalable Web Apps
Scalable Web Apps
 
Where Django Caching Bust at the Seams
Where Django Caching Bust at the SeamsWhere Django Caching Bust at the Seams
Where Django Caching Bust at the Seams
 
Aerospike Go Language Client
Aerospike Go Language ClientAerospike Go Language Client
Aerospike Go Language Client
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Cassandra serving netflix @ scale
Cassandra serving netflix @ scaleCassandra serving netflix @ scale
Cassandra serving netflix @ scale
 
Release it! - Takeaways
Release it! - TakeawaysRelease it! - Takeaways
Release it! - Takeaways
 
Real-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and StormReal-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and Storm
 
(ARC310) Solving Amazon's Catalog Contention With Amazon Kinesis
(ARC310) Solving Amazon's Catalog Contention With Amazon Kinesis(ARC310) Solving Amazon's Catalog Contention With Amazon Kinesis
(ARC310) Solving Amazon's Catalog Contention With Amazon Kinesis
 
The impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves GoelevenThe impact of cloud NSBCon NY by Yves Goeleven
The impact of cloud NSBCon NY by Yves Goeleven
 
KoprowskiT - SQLBITS X - 2am a disaster just began
KoprowskiT - SQLBITS X - 2am a disaster just beganKoprowskiT - SQLBITS X - 2am a disaster just began
KoprowskiT - SQLBITS X - 2am a disaster just began
 
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
 
Database and Public Endpoints redundancy on Azure
Database and Public Endpoints redundancy on AzureDatabase and Public Endpoints redundancy on Azure
Database and Public Endpoints redundancy on Azure
 
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
 
Performance Oriented Design
Performance Oriented DesignPerformance Oriented Design
Performance Oriented Design
 
L6.sp17.pptx
L6.sp17.pptxL6.sp17.pptx
L6.sp17.pptx
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
 

Recently uploaded

Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 

Recently uploaded (20)

Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 

Cassandra Driver Performance Tuning

  • 2. Who am I and what do I do? • Ben Bromhead • Co-founder and CTO of Instaclustr -> www.instaclustr.com • Instaclustr provides Cassandra-as-a-Service in the cloud • Currently in AWS, Azure and IBM Softlayer • We currently manage 400+ nodes
  • 3. What this talk will cover • Driver basics • Sync vs Async • Driver connection policies and tuning
  • 4. The driver • The Cassandra driver contains the logic for connecting to Cassandra and running queries in a fast and efficient manner • Focus on the Datastax Open Source drivers:
  • 5. The driver • Java • .NET (C#) • C/C++ • Python • Node.js • Ruby • PHP
  • 6. Cassandra Drivers • All have a similar architecture that consists of: • Session & pool management • Chainable policies for managing failure and performance • Sync vs Async queries • Failover & Retry • Tracing
  • 7. Cassandra Drivers A basic example in Java: Cluster  cluster  =  Cluster.builder()          .addContactPoints("52.89.183.67")          .withPort(9042)          .build();   Session  session  =  cluster.newSession();   session.execute("SELECT  *  FROM  foo…");
  • 8. Cassandra Drivers A basic example in Python: cluster  =  Cluster(contact_points=["52.89.183.67"],  port=9042)   session  =  cluster.connect()   rows  =  session.execute("SELECT  name,  age,  email  FROM  users")
  • 9. Cassandra Drivers A basic example in Ruby: cluster  =  Cassandra.cluster(          :hosts  =>  ["52.89.183.67",  "52.89.99.88",  "54.69.217.141"],        :datacenter  =>  'AWS_VPC_US_WEST_2'   )   session  =  cluster.connect()   rows  =  session.execute("SELECT  name,  age,  email  FROM  users")
  • 10. Cassandra Drivers • Architecture makes the driver similar across languages • What happens under the hood? • Cluster object creates configuration (auth, load balancing, contact points). • Session object holds the thread pool and manages connections. • Session object authenticates and maintains connections. • Session can be shared and is threadsafe!
  • 11. Different ways of querying • Synchronous: session.execute("SELECT  *  FROM  foo..”); • Asynchronous: ResultSetFuture  result  =  session.executeAsync("SELECT  *  FROM   foo..”);   result.get();
  • 12. How do these perform? Operations 0 7500 15000 22500 30000 Read Sync Write Sync Read Async Write Async Op/s
  • 13. How do these perform? Latency 0 20 40 60 80 Read Sync Write Sync Read Async Write Async ms
  • 14. Different ways of querying Prepared Statements: PreparedStatement  statement  =  getSession().prepare(              "INSERT  INTO  simplex.songs  "  +              "(id,  title,  album,  artist,  tags)  "  +              "VALUES  (?,  ?,  ?,  ?,  ?);");
  • 15. Different ways of querying boundStatement  =  new  BoundStatement(statement);   getSession().execute(boundStatement.bind(              UUID.fromString("2cc9ccb7-­‐6221-­‐4ccb-­‐8387-­‐f22b6a1b354d"),              UUID.fromString("756716f7-­‐2e54-­‐4715-­‐9f00-­‐91dcbea6cf50"),              "La  Petite  Tonkinoise",              "Bye  Bye  Blackbird",              "Joséphine  Baker")  );
  • 16. Drivers and consistency • Within the different ways of querying Cassandra you can also adjust the consistency level per query. • Lets have a quick consistency refresh
  • 17. A brief intro to tuneable consistency • Cassandra is considered to be a db that favours Availability and Partition Tolerance. • Let’s you change those characteristics per query to suit your application requirement
  • 18. Two consistency levers • Consistency level - How many acknowledgements/responses from replicas before a query is considered a success. • Replication Factor (RF) - How many copies of a record do I store.
  • 19. Two consistency levers • Consistency level - Chosen by the client at query time • Replication Factor (RF) - Determined client on schema definition
  • 20. Consistency Levels • ALL - Every replica • *QUORUM - (EACH_QUORUM, QUORUM, LOCAL_QUORUM) • Numbered - (ONE, TWO, THREE, LOCAL_ONE) • *SERIAL - (SERIAL, LOCAL_SERIAL) • ANY
  • 21. What does it all mean • At the client level (your application) you have total control • Define implicit and explicit failure handling • Isolate queries to a single geography • Trade consistency for latency (a decision is better than no decision)
  • 22. How does it all work?
  • 27. How does CL impact Op/s ? Operations 0 7500 15000 22500 30000 Read Sync Write Sync Read Async Write Async ONE QUORUM ALL
  • 28. How does CL impact latency ? Latency 0 30 60 90 120 Read Sync Write Sync Read Async Write Async ONE QUORUM ALL
  • 29. What happens when something goes wrong?
  • 33. Write CL:QUORUM RF:3 1 2 3 4 partition_key: b How does it all work? ✓✓ Required responses: floor(3 * 0.5) + 1 = 2
  • 35. How does an outage impact Op/s ? Operations 0 7500 15000 22500 30000 Read Sync Write Sync Read Async Write Async ONE QUORUM ALL
  • 36. How does an outage impact latency ? Latency 0 25 50 75 100 Read Sync Write Sync Read Async Write Async ONE QUORUM ALL
  • 37. We are now have a replica that is not consistent • Anti-entropy repair (only guaranteed way to make things consistent) • Hinted handoff • Read repair
  • 38. We are now have a replica that is not consistent • Anti-entropy repair (only guaranteed way to make things consistent) • Hinted handoff - lets cover this quickly • Read repair
  • 39. What is hinted handoff ? • A performance optimisation for “catching up” nodes who missed writes.
  • 40. What isn’t hinted handoff ? • A consistent distribution mechanism
  • 43. How does hinted handoff work? 1 2 3 4 host / key A B 1 ✔ ✔ 2 ? 3 ✔ ✔ … ✔
  • 44. How does hinted handoff work? partition_key: b 1 2 3 4
  • 45. How does hinted handoff work? partition_key: b 1 2 3 4 Gossip: 2 is now UP Node 1: I have stored hints for 2
  • 46. How does hinted handoff work? partition_key: b 1 2 3 4
  • 47. Some things to keep in mind • Cassandra will only store hints for a certain period of time, set by max_hint_window_in_ms. 3 hours by default • Hints are not a reliable delivery mechanism • Hint replay will cause counters to overcoat • CF of ANY will cause a hint to be stored even if no replicas are available. Sometimes called extreme availability… also called who knows where and if your data is safe?
  • 48. Hinted handoff performance • Causes the same volume of writes to occur in a cluster with reduced capacity (local write amplification on the co-ordinator node) • Hints are written to system.hints, each replica has hints stored in a single partition. • Hints use TTLs and tombstones.. the hint table is actually a queue! • When cassandra starts compacting or throwing tombstone warnings on the system.hints table… things are bad
  • 49. Hinted handoff performance • Rewritten in Cassandra 3.0 (in beta now) • Takes a commitlog approach: • No compaction • no TTL • no tombstones • no memtables
  • 50. How does this relate to the driver? • With a node outage the “latency” on the down node becomes hours/days until it becomes consistent • Cassandra itself takes over the client portion of ensuring the write makes it to the node that was down. • You can control whether C* handles this (via repair, HH etc) or whether your application controls this (have your client receive an exception instead).
  • 51. Driver policies • Cassandra driver policies allow you to control failure • Cassandra driver policies allow you to control how the driver routes requests • This can reduce your latency and/or increase op/s (in some cases)
  • 52. Retry Policy • Default Retry Policy • Downgrading Consistency Retry Policy • Fall through Retry Policy • Logging Retry Policy
  • 53. Load Balancing Policy • Round Robin • DC Aware • TokenAware • LatencyAware
  • 54. Driver policies impact latency ? Latency 0 0.3 0.6 0.9 1.2 Read Sync Write Sync Round Robin Token Aware Latency Aware
  • 55. Last but not least • Use one Cluster instance per (physical) cluster (per application lifetime) • Use at most one Session per keyspace, or use a single Session and explicitly specify the keyspace in your queries • If you execute a statement more than once, consider using a PreparedStatement • You can reduce the number of network roundtrips and also have atomic operations by using Batches