4. Thrift Drivers
Problems?
● Backwards & forwards compatibility
● Too many connections
● No standard interface
● Thrift overhead
● Cluster state must be polled
5. Problems & Solutions
Backwards/Forwards Compatibility
● Possible with Thrift, but easier with a
query language (CQL)
● Separately versioned query language
and protocol
8. Problems & Solutions
No Standard Interface
● Query language
● Standard policies for load balancing,
connection management, and
retries/failure handling
● More similar to standard RDBMS drivers
12. New Driver API
Sync/Async Operations
future = session.execute_async("SELECT * FROM foo")
…
result = future.result()
13. New Driver API
Sync/Async Operations
session.execute_async(query).add_callbacks(
callback=process_data,
errback=log_error
)
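The two styles above (block on result(), or register a callback and an errback) can be tried without a cluster. The following is a minimal sketch using only the standard library: fake_execute and add_callbacks are hypothetical stand-ins written for this example, not part of the driver, and a ThreadPoolExecutor plays the role of the session.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_execute(query):
    """Hypothetical stand-in for a query; no Cassandra cluster is involved."""
    if "bad" in query:
        raise ValueError("invalid query: " + query)
    return [("row", 1), ("row", 2)]


def add_callbacks(future, callback, errback):
    """Route a finished future to callback(result) or errback(exception)."""
    def _dispatch(done):
        exc = done.exception()
        if exc is not None:
            errback(exc)
        else:
            callback(done.result())
    future.add_done_callback(_dispatch)


processed, errors = [], []

with ThreadPoolExecutor(max_workers=2) as pool:
    ok = pool.submit(fake_execute, "SELECT * FROM foo")
    bad = pool.submit(fake_execute, "bad query")
    # Callback style: nothing blocks here; handlers run when each future ends.
    add_callbacks(ok, processed.append, errors.append)
    add_callbacks(bad, processed.append, errors.append)
    # Blocking style: result() waits for the rows, as on the first slide.
    rows = ok.result()
```

Leaving the with block waits for both futures, so by then processed holds one result set and errors holds one exception.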
14. New Driver Architecture
Connection Pooling
● Min/max conns per remote, local nodes
● Use least busy conn
● Open and close conns as needed
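The pooling rules above can be sketched roughly as follows. This is an illustration only: Connection, HostPool, and busy_threshold are hypothetical names, and the real driver's thresholds and bookkeeping may differ.

```python
class Connection:
    """Hypothetical connection that tracks its in-flight requests."""

    def __init__(self, name):
        self.name = name
        self.in_flight = 0


class HostPool:
    """Per-host pool that grows and shrinks between min_conns and max_conns."""

    def __init__(self, min_conns, max_conns, busy_threshold):
        self.min_conns = min_conns
        self.max_conns = max_conns
        self.busy_threshold = busy_threshold  # assumed saturation limit
        self.conns = [Connection(f"conn-{i}") for i in range(min_conns)]

    def borrow(self):
        # Use the least busy connection.
        conn = min(self.conns, key=lambda c: c.in_flight)
        # Open a new one if even the least busy connection is saturated.
        if conn.in_flight >= self.busy_threshold and len(self.conns) < self.max_conns:
            conn = Connection(f"conn-{len(self.conns)}")
            self.conns.append(conn)
        conn.in_flight += 1
        return conn

    def release(self, conn):
        conn.in_flight -= 1
        # Close idle connections until the pool is back at its minimum size.
        if conn.in_flight == 0 and len(self.conns) > self.min_conns:
            self.conns.remove(conn)
```

Separate pools with different min/max settings for local and remote nodes would cover the first bullet.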
15. New Driver Architecture
What happens during Operations?
● Nodes to query are picked by
LoadBalancingPolicy
● Failures are handled by RetryPolicy
● On errors, nodes are marked down by
ConvictionPolicy
18. New Driver Architecture
Retry Policies
● Operation type
● Consistency level
● Number (and type) of responses
● Type of failure
● Retry, raise error, or ignore error
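A retry policy takes the inputs listed above and returns one of the three outcomes. The sketch below is loosely modelled on the conservative behaviour the slides describe (retry at most once, in a small number of cases). The function names, parameters, and the best_effort flag are assumptions made for this example, not the driver's actual interface.

```python
from enum import Enum


class Decision(Enum):
    RETRY = "retry"
    RETHROW = "rethrow"
    IGNORE = "ignore"


def on_read_timeout(required, received, data_retrieved, retry_num):
    """Retry once if enough replicas answered but none returned the data."""
    if retry_num > 0:
        return Decision.RETHROW
    if received >= required and not data_retrieved:
        return Decision.RETRY
    return Decision.RETHROW


def on_write_timeout(write_type, retry_num):
    """Retry once only when the timed-out write was to the batch log."""
    if retry_num == 0 and write_type == "BATCH_LOG":
        return Decision.RETRY
    return Decision.RETHROW


def on_unavailable(best_effort):
    """Hypothetical rule: swallow the error for writes marked best effort."""
    return Decision.IGNORE if best_effort else Decision.RETHROW
```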
19. New Driver Architecture
What happens during Operations?
● Nodes to query are picked by
LoadBalancingPolicy
● Failures are handled by RetryPolicy
● On errors, nodes are marked down by
ConvictionPolicy, reconnect with
ReconnectionPolicy
21. New Driver Architecture
Policy Defaults
● RoundRobin load balancing (not token
or DC aware)
● Retry at most once (in a small number
of cases)
● Mark node down after one failure
● Exponential backoff on reconnection
attempts
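Two of these defaults are easy to illustrate with generators. This is a sketch of the general techniques, not the driver's code: round_robin_plans and exponential_backoff are hypothetical helper names, and the base and maximum delays are arbitrary.

```python
import itertools


def round_robin_plans(hosts):
    """Yield query plans; each plan starts one host later than the last."""
    for start in itertools.cycle(range(len(hosts))):
        yield hosts[start:] + hosts[:start]


def exponential_backoff(base_delay, max_delay):
    """Yield reconnection delays that double until capped at max_delay."""
    delay = base_delay
    while True:
        yield min(delay, max_delay)
        delay = min(delay * 2, max_delay)


plans = round_robin_plans(["node1", "node2", "node3"])
first_plan = next(plans)   # ['node1', 'node2', 'node3']
second_plan = next(plans)  # ['node2', 'node3', 'node1']

delays = list(itertools.islice(exponential_backoff(1.0, 10.0), 6))
# delays == [1.0, 2.0, 4.0, 8.0, 10.0, 10.0]
```

Each plan lists every host, so a failed request can move on to the next host in the same plan. Nothing here looks at tokens or data centers, which matches the "not token or DC aware" caveat.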
22. New Driver Architecture
Prepared Statements
● Prepared against all nodes
● Cache
● Re-preparation
prepared = session.prepare("SELECT foo FROM bar WHERE id=?")
result = session.execute(prepared, [user_id1])
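The three bullets (prepare on all nodes, cache, re-prepare) can be sketched with in-memory stand-ins. FakeNode and PreparingSession are hypothetical classes written for this example. They only illustrate the idea that a node which has lost a statement gets it re-prepared transparently before the retry.

```python
class FakeNode:
    """Hypothetical node that only executes statements it has prepared."""

    def __init__(self):
        self.prepared = set()

    def prepare(self, query):
        self.prepared.add(query)

    def execute(self, query, params):
        if query not in self.prepared:
            raise KeyError("unprepared: " + query)
        return (query, tuple(params))


class PreparingSession:
    def __init__(self, nodes):
        self.nodes = nodes
        self.cache = set()  # statements already prepared cluster-wide

    def prepare(self, query):
        # Prepare against all nodes once, then serve repeats from the cache.
        if query not in self.cache:
            for node in self.nodes:
                node.prepare(query)
            self.cache.add(query)
        return query

    def execute(self, node, query, params):
        try:
            return node.execute(query, params)
        except KeyError:
            # The node lost the statement (e.g. a restart): re-prepare, retry.
            node.prepare(query)
            return node.execute(query, params)
```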
23. New Driver Architecture
Control Connection
● Listens for pushed updates to cluster
state and schema
● Marks nodes up and down
● Auto discovers nodes in cluster
● Updates schema metadata
24. New Driver Architecture
Metrics
● Count timeouts, connection errors, and
other errors
● Open connection stats
● Operation latency histogram
25. New Driver Architecture
Cursors
● No more manual paging over large
queries
● Works across multiple nodes
  ● Paging state provided by client
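Automatic paging can be pictured as an iterator that asks for the next page only when the current one is exhausted. In this sketch, fetch_page is a hypothetical server call and the paging state is simply a row offset. The real paging state is an opaque value, but the principle is the same: because the client sends it back with each request, any node can serve the next page.

```python
def fetch_page(rows, page_size, paging_state):
    """Hypothetical server call: one page plus the state for the next page."""
    start = paging_state or 0
    end = start + page_size
    next_state = end if end < len(rows) else None
    return rows[start:end], next_state


def iterate_rows(rows, page_size):
    """Yield every row, fetching pages lazily as the caller iterates."""
    paging_state = None
    while True:
        page, paging_state = fetch_page(rows, page_size, paging_state)
        yield from page
        if paging_state is None:
            break


all_rows = list(iterate_rows(list(range(7)), page_size=3))
# all_rows == [0, 1, 2, 3, 4, 5, 6], fetched as three pages
```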
27. New Driver Architecture
Languages Supported
● Java – 1.0 released in Spring 2013
  ● Simple object mapper under development
● C# – 1.0 released in Summer 2013
  ● LINQ integration
● Python – Beta since Summer 2013, 1.0 coming soon
  ● Basic mapper available through cqlengine
● C++ – Currently in Alpha state
● Ruby, JS, PHP – planned, but no development so far
Maintainer of pycassa, phpcassa, and telephus. I've also worked on Hector and clj-hector a bit.
The DataStax Python driver was originally ported from the DataStax Java driver.