7. New Core Value
*Massive scalability
*High performance
*Reliability / Availability
*Ease of use
CREATE TABLE users (
id uuid PRIMARY KEY,
name text,
state text,
birth_date int
);
CREATE INDEX ON users(state);
SELECT * FROM users
WHERE state = 'Texas'
AND birth_date > 1950;
8. CQL is working
"Coming from a relational database background we found
the transition to Cassandra to be very straightforward. There are a
few simple key concepts one must grasp at first but ever since it's
been smooth sailing for us."
Boris Wolf, Comcast
*Key concepts?
*The next Top Data Model (Tomorrow, 11:00, Festival)
*The State of CQL (Tomorrow, 3:10, Marina)
9. 1.2 for Developers
*CQL3
Thrift compatibility
Collections
Data dictionary
Auth support
Hadoop support
Native drivers
*Tracing
*Atomic batches
12. Collections
CREATE TABLE users (
id uuid PRIMARY KEY,
name text,
state text,
birth_date int
);
CREATE TABLE users_addresses (
user_id uuid REFERENCES users,
email text
);
SELECT *
FROM users NATURAL JOIN users_addresses;
13. Collections
[The schema and query from slide 12, crossed out: CQL has no foreign keys or joins]
14. Collections
CREATE TABLE users (
id uuid PRIMARY KEY,
name text,
state text,
birth_date int,
email_addresses set<text>
);
15. Collections
UPDATE users
SET email_addresses = email_addresses +
{'jbellis@gmail.com', 'jbellis@datastax.com'};
CREATE TABLE users (
id uuid PRIMARY KEY,
name text,
state text,
birth_date int,
email_addresses set<text>
);
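The set update above can be read as a set union. A minimal Python model of that reading follows, assuming nothing about how Cassandra stores collections; `add_emails` is a hypothetical helper used only for illustration. It shows that adding elements to a set gives the same result when repeated or reordered, which is what makes such an update safe to retry.

```python
def add_emails(current: frozenset, new_emails: set) -> frozenset:
    """Model of `SET email_addresses = email_addresses + {...}` as a set union."""
    return current | frozenset(new_emails)


if __name__ == "__main__":
    start = frozenset()
    once = add_emails(start, {"jbellis@gmail.com", "jbellis@datastax.com"})
    # Applying the same update again changes nothing (idempotent).
    twice = add_emails(once, {"jbellis@gmail.com", "jbellis@datastax.com"})
    print(once == twice)
    # Applying two updates in either order gives the same set (commutative).
    a_then_b = add_emails(add_emails(start, {"jbellis@gmail.com"}), {"jbellis@datastax.com"})
    b_then_a = add_emails(add_emails(start, {"jbellis@datastax.com"}), {"jbellis@gmail.com"})
    print(a_then_b == b_then_a)
```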
16. Data Dictionary
cqlsh:system> use system;
cqlsh:system> select columnfamily_name from schema_columnfamilies
where keyspace_name = 'system';
columnfamily_name
-----------------------
batchlog
hints
local
peer_events
peers
schema_columnfamilies
schema_columns
schema_keyspaces
36. Off Heap in 1.2+
*Partition key bloom filter
1-2GB per billion partitions
[Diagram: read-path structures laid out across Memory and Disk: Data, Partition summary, Bloom filter, Partition index, Compression offsets, Partition key cache]
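The 1-2GB figure is consistent with standard Bloom filter sizing. The sketch below assumes an optimally sized filter, for which bits per key = -ln(p) / (ln 2)^2, where p is the false-positive rate. The rates 0.01 and 0.001 are illustrative inputs chosen here, not values stated on the slide, and the function names are hypothetical.

```python
import math


def bloom_bits_per_key(fp_rate: float) -> float:
    """Bits per key for an optimally sized Bloom filter at the given false-positive rate."""
    return -math.log(fp_rate) / (math.log(2) ** 2)


def bloom_size_gb(num_keys: int, fp_rate: float) -> float:
    """Total filter size in GB (10^9 bytes) for num_keys keys."""
    return num_keys * bloom_bits_per_key(fp_rate) / 8 / 1e9


if __name__ == "__main__":
    for p in (0.01, 0.001):
        size = bloom_size_gb(10**9, p)
        print(f"fp_rate={p}: about {size:.1f} GB per billion partitions")
```

Under these assumptions the two rates give roughly 1.2 GB and 1.8 GB per billion partition keys, which brackets the range quoted above.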
37. Off Heap in 1.2+
*Compression metadata
~1-3GB per TB compressed
38. Not off Heap until 2.0
*Partition index summary
(Size cut in ~half in 1.2.5+)
40. DSE 3.1
*Cassandra 1.2 shipping in DataStax Enterprise 3.1 on June 30
*Updated with CQL and composite column support for Hive and Solr
*Includes Solr 4.3
50. Removed in 2.0
*Token range bisection on bootstrap
*Supercolumns (only internally)
public List<ColumnOrSuperColumn> get_slice(...)
*Disk compatibility for < 1.2.5
*Network compatibility for < 1.2
51. New in 2.0
*CAS (Compare-and-set = lightweight transactions)
*Eager retries
*Improved compaction
*Triggers (experimental)
*CQL cursors
52. CAS: The Problem
Session 1
SELECT * FROM users
WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (...)
VALUES ('jbellis', ...)
Session 2
SELECT * FROM users
WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (...)
VALUES ('jbellis', ...)
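The race between the two sessions can be reproduced with a toy in-memory model. This is only a sketch: a Python dict stands in for the users table, the unconditional INSERT is modeled as an upsert, and the helper names are hypothetical. Both sessions read before either writes, both see an empty result, and the second write silently replaces the first.

```python
def select_user(table: dict, username: str):
    """Model of SELECT ... WHERE username = ?: returns the row, or None if absent."""
    return table.get(username)


def insert_user(table: dict, username: str, owner: str) -> None:
    """Model of an unconditional INSERT: an upsert that overwrites without complaint."""
    table[username] = {"owner": owner}


def run_interleaved():
    users = {}
    # Both sessions run their SELECT before either INSERT happens.
    s1_sees_free = select_user(users, "jbellis") is None
    s2_sees_free = select_user(users, "jbellis") is None
    if s1_sees_free:
        insert_user(users, "jbellis", "session 1")
    if s2_sees_free:
        insert_user(users, "jbellis", "session 2")
    return s1_sees_free, s2_sees_free, users


if __name__ == "__main__":
    s1, s2, users = run_interleaved()
    # Both sessions believe they claimed the name; only the last write survives.
    print(s1, s2, users["jbellis"]["owner"])
```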
56. Why Locking Doesn’t Work
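[Diagram: Client (locks) sends a request to the Coordinator, which sends an internal request to a Replica. The internal request fails (X); the Coordinator stores a hint and returns a timeout response to the Client.]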
57. Paxos
*All operations are quorum-based
*Each replica sends information about unfinished operations to the leader during prepare
*Paxos Made Simple
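The two bullets about quorums and unfinished operations can be illustrated with a textbook single-decree Paxos round. This is a minimal in-memory sketch, not Cassandra's implementation: the `Acceptor` class, `propose` function, and integer ballots are assumptions made for illustration, and every acceptor is assumed reachable.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0     # highest ballot this replica has promised to honor
        self.accepted = None  # (ballot, value) of the last accepted proposal, if any

    def prepare(self, ballot):
        """Promise to ignore lower ballots and report any unfinished proposal."""
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot, value):
        """Accept the proposal unless a higher ballot has been promised since."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False


def quorum(n):
    """Majority size: any two quorums of n replicas share at least one member."""
    return n // 2 + 1


def propose(acceptors, ballot, value):
    """Run one prepare/accept round; return the accepted value, or None without a quorum."""
    promises = [a.prepare(ballot) for a in acceptors]
    granted = [acc for ok, acc in promises if ok]
    if len(granted) < quorum(len(acceptors)):
        return None
    # If any replica reported an unfinished proposal, finish the newest one instead.
    in_progress = [acc for acc in granted if acc is not None]
    if in_progress:
        value = max(in_progress, key=lambda bv: bv[0])[1]
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum(len(acceptors)) else None


if __name__ == "__main__":
    replicas = [Acceptor() for _ in range(3)]
    # A first leader prepares ballot 1 everywhere, but only one replica accepts "A".
    for r in replicas:
        r.prepare(1)
    replicas[0].accept(1, "A")
    # A new leader with ballot 2 learns about "A" during prepare and completes it.
    print(propose(replicas, 2, "B"))
    # A stale leader reusing ballot 1 gets no promises.
    print(propose(replicas, 1, "C"))
```

In the demo, the second leader wanted to write "B" but ends up completing "A", because one replica reported it as unfinished during the prepare phase.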
58. CAS Details
*3 round trips vs 1 for normal updates
*Paxos state is durable
*Immediate consistency with no leader election or failover
*ConsistencyLevel.SERIAL
59. Use with Caution
*Great for 1% of your application
*Eventual consistency is your friend
Eventual Consistency != Hopeful Consistency (Today, 1:30, Golden Gate)
60. Using CAS
UPDATE USERS
SET email = 'jonathan@datastax.com', ...
WHERE username = 'jbellis'
IF email = 'jbellis@datastax.com';
INSERT INTO USERS (username, email, ...)
VALUES ('jbellis', 'jbellis@datastax.com', ... )
IF NOT EXISTS;
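The semantics of the two conditional statements can be sketched with a toy single-process model. This is an illustration under stated assumptions, not how Cassandra executes them: a `threading.Lock` stands in for the Paxos round, and `UsersTable` with its methods is hypothetical. Each method returns whether the write was applied, so that exactly one of two racing sessions wins.

```python
import threading


class UsersTable:
    """Toy model of conditional writes; the lock stands in for the Paxos round."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert_if_not_exists(self, username, email):
        """Model of INSERT ... IF NOT EXISTS: applied only when the row is absent."""
        with self._lock:
            if username in self._rows:
                return False
            self._rows[username] = {"email": email}
            return True

    def update_email_if(self, username, expected, new):
        """Model of UPDATE ... IF email = ?: applied only when the current value matches."""
        with self._lock:
            row = self._rows.get(username)
            if row is None or row["email"] != expected:
                return False
            row["email"] = new
            return True


def race(table, sessions=2):
    """Have several sessions try to claim the same username concurrently."""
    results = []

    def attempt(i):
        results.append(table.insert_if_not_exists("jbellis", f"session{i}@example.com"))

    threads = [threading.Thread(target=attempt, args=(i,)) for i in range(sessions)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(sorted(race(UsersTable())))  # one attempt applied, one rejected
```

The session whose write is rejected learns so from the return value and can pick another name, which is the check that the plain SELECT-then-INSERT sequence cannot provide.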