4. VLDB benchmark (RWS)
[Chart: throughput (ops/sec, 0 to 80,000) against number of nodes (0 to 12), comparing Cassandra, MySQL, HBase, and Redis]
#CASSANDRAEU
7. New core value
•Massive scalability
•High performance
•Reliability/Availability
•Ease of use
CREATE TABLE users (
id uuid PRIMARY KEY,
name text,
state text,
birth_date int
);
CREATE INDEX ON users(state);
SELECT * FROM users
WHERE state = 'Texas'
AND birth_date > 1950;
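As a rough illustration of how a query like the one above can be answered, the sketch below looks up candidates in an index on state and then filters them on birth_date. It is a toy model in Python, not Cassandra's implementation; the rows, the integer ids (standing in for uuids), and the function name are invented for the example.

```python
from collections import defaultdict

# Hypothetical rows; integer ids stand in for uuids.
users = {
    1: {"name": "alice", "state": "Texas", "birth_date": 1975},
    2: {"name": "bob", "state": "Texas", "birth_date": 1940},
    3: {"name": "carol", "state": "Ohio", "birth_date": 1980},
}

# Toy secondary index: state -> ids of the rows holding that state.
state_index = defaultdict(list)
for user_id, row in users.items():
    state_index[row["state"]].append(user_id)

def users_in_state_born_after(state, year):
    """Use the index for the equality test, then filter the candidates."""
    candidates = state_index.get(state, [])
    return [users[i]["name"] for i in candidates if users[i]["birth_date"] > year]

texans_after_1950 = users_in_state_born_after("Texas", 1950)
```

Only the state predicate is served by the index here; the birth_date comparison is applied to each candidate row afterwards.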
13. Lightweight transactions
Session 1: SELECT * FROM users WHERE username = 'jbellis'
Session 2: SELECT * FROM users WHERE username = 'jbellis'
Session 1: [empty resultset]
Session 2: [empty resultset]
Session 1: INSERT INTO users (...) VALUES ('jbellis', ...)
Session 2: INSERT INTO users (...) VALUES ('jbellis', ...)
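The race on this slide can be reproduced with a plain dictionary standing in for the users table. This is a minimal sketch with hypothetical helper names, not Cassandra client code: both sessions read before either writes, so both conclude the username is free.

```python
# Toy stand-in for the users table: username -> session that wrote it.
users = {}

def select_user(username):
    """Return the stored row, or None for an empty result set."""
    return users.get(username)

def insert_user(username, owner):
    """Unconditional write: the last writer silently wins."""
    users[username] = owner

# Both sessions read before either one writes.
seen_by_session_1 = select_user("jbellis")
seen_by_session_2 = select_user("jbellis")

# Each session believes the name is free and inserts.
if seen_by_session_1 is None:
    insert_user("jbellis", "session 1")
if seen_by_session_2 is None:
    insert_user("jbellis", "session 2")

both_saw_empty = seen_by_session_1 is None and seen_by_session_2 is None
final_owner = users["jbellis"]
```

Neither session is told that its write collided with the other, which is the gap lightweight transactions are meant to close.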
14. Paxos
•All operations are quorum-based
•Each replica sends information about unfinished operations to the leader during prepare
•Paxos Made Simple
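The two bullets about quorums and unfinished operations can be sketched as a single-decree Paxos round. This is a simplified in-memory model covering only the prepare and accept phases, with invented class and function names; it is not Cassandra's code, and it ignores networking, durability, and the commit step.

```python
class Replica:
    """Acceptor state for one Paxos instance, kept in memory for the sketch."""

    def __init__(self):
        self.promised = 0      # highest ballot this replica has promised
        self.accepted = None   # (ballot, value) accepted but possibly unfinished

    def prepare(self, ballot):
        # Promise to ignore older ballots and report any unfinished proposal.
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot, value):
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False


def quorum(replica_count):
    return replica_count // 2 + 1


def propose(replicas, ballot, value):
    """Run prepare then accept; return the chosen value, or None on failure."""
    replies = [r.prepare(ballot) for r in replicas]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < quorum(len(replicas)):
        return None
    # If a replica reported an unfinished proposal, finish the newest one
    # instead of pushing our own value.
    unfinished = [acc for acc in promises if acc is not None]
    if unfinished:
        value = max(unfinished, key=lambda acc: acc[0])[1]
    acks = sum(r.accept(ballot, value) for r in replicas)
    return value if acks >= quorum(len(replicas)) else None


replicas = [Replica() for _ in range(3)]
# An earlier proposer reached only one replica before stalling.
replicas[0].prepare(1)
replicas[0].accept(1, "session 1")
chosen = propose(replicas, 2, "session 2")
stale = propose(replicas, 1, "late proposer")
```

In this run the second proposer learns about the unfinished value during prepare and completes it, so `chosen` is the earlier value rather than its own; the proposer with the older ballot is rejected.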
15. Details
•4 round trips vs 1 for normal updates
•Paxos state is durable
•Immediate consistency with no leader election or failover
•ConsistencyLevel.SERIAL
•http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
16. Use with caution
•Great for 1% of your application
•Eventual consistency is your friend
• http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis
17. Syntax
INSERT INTO USERS (username, email, ...)
VALUES ('jbellis', 'jbellis@datastax.com', ... )
IF NOT EXISTS;
UPDATE USERS
SET email = 'jonathan@datastax.com', ...
WHERE username = 'jbellis'
IF email = 'jbellis@datastax.com';
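The semantics of the two conditional statements above can be modelled as compare-and-set operations that report whether they were applied. The sketch below is a single-process analogy with hypothetical names; a lock stands in for the Paxos round, and nothing here is driver or server code.

```python
import threading

class UsersTable:
    """In-memory analogy for conditional writes on a users table."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()  # stands in for the Paxos round

    def insert_if_not_exists(self, username, email):
        """Analogue of INSERT ... IF NOT EXISTS; returns whether it applied."""
        with self._lock:
            if username in self._rows:
                return False
            self._rows[username] = {"email": email}
            return True

    def update_email_if(self, username, new_email, expected_email):
        """Analogue of UPDATE ... IF email = expected; returns whether it applied."""
        with self._lock:
            row = self._rows.get(username)
            if row is None or row["email"] != expected_email:
                return False
            row["email"] = new_email
            return True

    def email_of(self, username):
        with self._lock:
            return self._rows[username]["email"]


table = UsersTable()
first = table.insert_if_not_exists("jbellis", "jbellis@datastax.com")
second = table.insert_if_not_exists("jbellis", "someone@example.com")
updated = table.update_email_if("jbellis", "jonathan@datastax.com", "jbellis@datastax.com")
stale = table.update_email_if("jbellis", "other@example.com", "jbellis@datastax.com")
```

The second insert and the stale update are refused because their conditions no longer hold, so the losing session finds out that it lost instead of overwriting the winner.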
26. Other CQL improvements
•SELECT DISTINCT pk
•CREATE TABLE IF NOT EXISTS table
•SELECT ... AS
• SELECT event_id, dateOf(created_at) AS creation_date
27. Other CQL improvements
•SELECT DISTINCT pk
•CREATE TABLE IF NOT EXISTS table
•SELECT ... AS
• SELECT event_id, dateOf(created_at) AS creation_date
•ALTER TABLE DROP column
34. Read path (per sstable)
[Diagram: per-sstable read path. In memory: bloom filter, partition key cache, partition summary, compression offsets. On disk: partition index, data.]
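The components in this diagram can be strung together as a lookup routine. The following is a toy model with invented names, not Cassandra's implementation: a Python set stands in for the bloom filter (so it never gives false positives), lists stand in for the on-disk files, and compression offsets are left out.

```python
import bisect

class SSTable:
    """Toy sstable: sorted keys, an index of positions, and a data area."""

    def __init__(self, rows, summary_interval=2):
        keys = sorted(rows)
        self.data = [rows[k] for k in keys]                    # "disk": data file
        self.index = [(k, pos) for pos, k in enumerate(keys)]  # "disk": partition index
        self.bloom = set(keys)                                 # memory: bloom filter stand-in
        # memory: every Nth index entry, used to narrow the index scan
        self.summary = [(k, i) for i, (k, _) in enumerate(self.index)][::summary_interval]
        self.key_cache = {}                                    # memory: key -> data position
        self.index_entries_scanned = 0

    def read(self, key):
        if key not in self.bloom:          # definitely absent: skip this sstable
            return None
        pos = self.key_cache.get(key)      # cache hit skips summary and index
        if pos is None:
            pos = self._search_index(key)
            if pos is None:
                return None
            self.key_cache[key] = pos
        return self.data[pos]

    def _search_index(self, key):
        summary_keys = [k for k, _ in self.summary]
        slot = bisect.bisect_right(summary_keys, key) - 1
        if slot < 0:
            return None
        start = self.summary[slot][1]
        for k, pos in self.index[start:]:  # short scan from the sampled entry
            self.index_entries_scanned += 1
            if k == key:
                return pos
            if k > key:
                return None
        return None


table = SSTable({"a": 1, "b": 2, "c": 3, "d": 4, "e": 5})
first_read = table.read("d")
scanned_after_first = table.index_entries_scanned
second_read = table.read("d")
scanned_after_second = table.index_entries_scanned
missing = table.read("z")
```

The second read of the same key is served from the key cache, so the index scan counter does not move.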
35. Off heap in 2.0
Partition key bloom filter
1-2GB per billion partitions
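As a sanity check on that figure, the textbook sizing formula for an optimally configured bloom filter gives bits per key as -ln(p) / (ln 2)^2 for a false-positive rate p. The sketch below applies it to one billion keys; the two false-positive rates are assumptions chosen for illustration, and no claim is made that Cassandra sizes its filters exactly this way.

```python
import math

def bloom_filter_bytes(num_keys, false_positive_rate):
    """Size of an optimally configured bloom filter, in bytes."""
    bits_per_key = -math.log(false_positive_rate) / (math.log(2) ** 2)
    return num_keys * bits_per_key / 8

one_billion = 1_000_000_000
at_one_percent = bloom_filter_bytes(one_billion, 0.01)
at_tenth_percent = bloom_filter_bytes(one_billion, 0.001)
```

Under these assumptions the results come out at roughly 1.2 GB and 1.8 GB, which sits inside the range quoted on the slide.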
36. Off heap in 2.0
Compression metadata
~1-3GB per TB compressed
37. Off heap in 2.0
Partition index summary
(depends on rows per partition)
44. User defined types
CREATE TYPE address (
street text,
city text,
zip_code int,
phones set<text>
);
CREATE TABLE users (
id uuid PRIMARY KEY,
name text,
addresses map<text, address>
);
SELECT id, name, addresses.city, addresses.phones FROM users;
 id       | name    | addresses.city | addresses.phones
----------+---------+----------------+--------------------------
 63bf691f | jbellis | Austin         | {'512-4567', '512-9999'}
45. Collection indexing
CREATE TABLE songs (
id uuid PRIMARY KEY,
artist text,
album text,
title text,
data blob,
tags set<text>
);
CREATE INDEX song_tags_idx ON songs(tags);
SELECT * FROM songs WHERE tags CONTAINS 'blues';
 id       | album         | artist            | tags                  | title
----------+---------------+-------------------+-----------------------+------------------
 5027b27e | Country Blues | Lightnin' Hopkins | {'acoustic', 'blues'} | Worrying My Mind
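One way to picture an index over a set column is as an inverted index with one entry per element, so a song with two tags is reachable from both. The sketch below is a conceptual model only, not how Cassandra stores the index; the second song is a hypothetical row added so the lookup has something to exclude.

```python
from collections import defaultdict

songs = {
    "5027b27e": {"title": "Worrying My Mind", "tags": {"acoustic", "blues"}},
    # Hypothetical second row, not from the slide.
    "hypothetical-id": {"title": "Example Tune", "tags": {"jazz"}},
}

def build_tag_index(rows):
    """Map each tag to the ids of every song whose tag set contains it."""
    index = defaultdict(set)
    for song_id, row in rows.items():
        for tag in row["tags"]:
            index[tag].add(song_id)
    return index

song_tags_idx = build_tag_index(songs)
blues_titles = sorted(songs[i]["title"] for i in song_tags_idx.get("blues", set()))
```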