3. Hash Partitioning
Nodes: A = 0, B = 1, C = 2, D = 3 (N = 4)
Client:
hash("hello") mod 4 = 2
hash("world") mod 4 = 0
hash("bye") mod 4 = 3
Difficult to add/remove nodes
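The "difficult to add/remove nodes" point can be made concrete with a small sketch. It assumes MD5 as the hash function and a fifth node named E, neither of which comes from the slides; `stable_hash`, `node_for` and `moved_fraction` are illustrative names.

```python
import hashlib

def stable_hash(key):
    """Deterministic integer hash of a key (MD5 is an assumption, chosen only for stability)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

def node_for(key, nodes):
    """Pick the node whose index equals hash(key) mod N."""
    return nodes[stable_hash(key) % len(nodes)]

def moved_fraction(keys, old_nodes, new_nodes):
    """Fraction of keys whose owner changes when the node list changes."""
    moved = sum(1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes))
    return moved / len(keys)

keys = [f"key-{i}" for i in range(10_000)]
four = ["A", "B", "C", "D"]
five = four + ["E"]  # hypothetical fifth node

fraction = moved_fraction(keys, four, five)
print(f"{fraction:.0%} of keys change owner when going from 4 to 5 nodes")
```

A key keeps its owner only when its hash gives the same remainder mod 4 and mod 5, which holds for roughly one key in five, so most of the data has to move whenever N changes.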
10. Consistent Hashing / Random Tokens
Hash function with range 0..Max
E.g.: Address Space 0..Max = 0..65535
Ring marks (clockwise from the top): 0 / 65535, 16384, 32768, 49152
token A = 33015
token B = 8915
token C = 31541
token D = 40927
Client:
hash("hello") = 13209
hash("world") = 36551
hash("bye") = 60912
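A minimal sketch of the ring lookup, using the tokens and hash values from the slides. The ownership rule (a key belongs to the first node clockwise whose token is greater than or equal to the key's hash, wrapping at Max) is an assumption based on Cassandra's usual convention; the slides do not spell it out. `build_ring` and `owner` are illustrative names.

```python
import bisect

# Tokens from the slides (example address space 0..65535).
TOKENS = {"A": 33015, "B": 8915, "C": 31541, "D": 40927}

def build_ring(tokens):
    """Return (sorted token list, parallel list of node names)."""
    pairs = sorted((token, node) for node, token in tokens.items())
    return [t for t, _ in pairs], [n for _, n in pairs]

def owner(key_hash, ring_tokens, ring_nodes):
    """First node clockwise whose token is >= key_hash, wrapping to the lowest token."""
    i = bisect.bisect_left(ring_tokens, key_hash)
    return ring_nodes[i % len(ring_nodes)]

ring_tokens, ring_nodes = build_ring(TOKENS)

# Hash values are copied from the slides rather than computed.
for key, key_hash in [("hello", 13209), ("world", 36551), ("bye", 60912)]:
    print(key, "->", owner(key_hash, ring_tokens, ring_nodes))
```

Under that assumption "hello" lands on C, "world" on D, and "bye" wraps around to B. Adding a node only splits one range, so only the keys in that range change owner.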
21. Consistent Hashing / Virtual Nodes
Address space 0..65535
4 Virtual Nodes, Random Tokens
token A.1 = ...
token A.2 = ...
token A.3 = ...
token A.4 = ...
token B.1 = ...
(Ring diagram: the virtual nodes of A, B, C and D are interleaved around the ring.)
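The following sketch illustrates virtual nodes with four random tokens per physical node, in the same 0..65535 example space. The fixed seed, the sampling scheme and the function names are assumptions made for repeatability; Cassandra's own token allocation is not reproduced here.

```python
import bisect
import random

MAX_TOKEN = 65535     # example address space from the slides
VNODES_PER_NODE = 4   # "4 Virtual Nodes" on the slide

def assign_tokens(nodes, vnodes=VNODES_PER_NODE, seed=42):
    """Give every physical node `vnodes` distinct random tokens; returns sorted (token, node) pairs."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    tokens = rng.sample(range(MAX_TOKEN + 1), vnodes * len(nodes))
    return sorted((tokens[i * vnodes + j], node)
                  for i, node in enumerate(nodes) for j in range(vnodes))

def owner(key_hash, ring):
    """Physical node behind the first virtual node clockwise with token >= key_hash."""
    tokens = [t for t, _ in ring]
    return ring[bisect.bisect_left(tokens, key_hash) % len(ring)][1]

ring = assign_tokens(["A", "B", "C", "D"])
ring_without_d = [(t, n) for t, n in ring if n != "D"]

# Sample the address space and collect the hashes whose owner changes when D leaves.
moved = [h for h in range(0, MAX_TOKEN + 1, 97)
         if owner(h, ring) != owner(h, ring_without_d)]
print(len(moved), "sampled hashes change owner when D is removed")
```

Only hashes that D owned are reassigned, and because D's virtual nodes are scattered around the ring, that load tends to spread over several remaining nodes rather than falling on a single neighbour.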
23. Token Generation
#> cassandra/tools/bin/token-generator
Token Generator Interactive Mode
--------------------------------
How many datacenters will participate in this Cassandra cluster? 1
How many nodes are in datacenter #1? 5
DC #1:
  Node #1: 0
  Node #2: 34028236692093846346337460743176821145
  Node #3: 68056473384187692692674921486353642290
  Node #4: 102084710076281539039012382229530463435
  Node #5: 136112946768375385385349842972707284580
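The tokens above are evenly spaced over the RandomPartitioner range 0..2^127. The sketch below reproduces them with the formula i * floor(2^127 / N); the formula is inferred from the output shown on the slide, not taken from the tool's source, and `initial_tokens` is an illustrative name.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner tokens fall in 0 .. 2**127

def initial_tokens(node_count):
    """Evenly spaced tokens: node i (0-based) gets i * floor(RING_SIZE / node_count)."""
    step = RING_SIZE // node_count
    return [i * step for i in range(node_count)]

for number, token in enumerate(initial_tokens(5), start=1):
    print(f"Node #{number}: {token}")
```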
24. Partitioning Strategy
RandomPartitioner (consistent hashing)
ByteOrderedPartitioner
Cassandra Documentation from DataStax:
"Unless absolutely required by your application, DataStax strongly recommends against using the ordered partitioner"
35. CQL (Cassandra Query Language)
CREATE KEYSPACE demo
  WITH strategy_class = 'SimpleStrategy'
  AND strategy_options:replication_factor = 3;
CREATE TABLE users (
  login varchar PRIMARY KEY,
  name varchar,
  password varchar,
  country varchar,
  state varchar
);
CREATE INDEX users_country ON users(country);
CREATE INDEX users_state ON users(state);
36. CQL (Cassandra Query Language)
INSERT INTO users (login, name, country, state)
  VALUES ('alankay', 'Alan Kay', 'US', 'CA');
SELECT * FROM users WHERE login = 'alankay';
SELECT * FROM users WHERE country = 'US' AND state = 'CA';
39. Tunable Consistency
Any (Only for Write)
One, Two, Three
Quorum, Local Quorum
ALL, Each Quorum
SELECT * FROM users USING CONSISTENCY QUORUM WHERE ...
INSERT INTO users (id, name, ..) VALUES (...) USING CONSISTENCY QUORUM
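As a small worked example of what QUORUM means for a given replication factor, the sketch below computes the quorum size and checks whether a read level and a write level are guaranteed to overlap on at least one replica (R + W > RF). The function names are illustrative.

```python
def quorum(replication_factor):
    """Replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

def overlaps(read_replicas, write_replicas, replication_factor):
    """True when every read set must share a replica with every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor

rf = 3
print("QUORUM for RF=3:", quorum(rf))
print("QUORUM read + QUORUM write overlap:", overlaps(quorum(rf), quorum(rf), rf))
print("ONE read + ONE write overlap:", overlaps(1, 1, rf))
```

With RF=3, QUORUM is 2, so a QUORUM write followed by a QUORUM read always touches at least one replica that has the latest value, while ONE/ONE gives no such guarantee.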
54. Hinted Handoff Writes
Ring of nodes 0-8; the Client writes A through a Coordinator node
Replicas: R1 = node 2, R2 = node 3 (Down), R3 = node 4
Hints stored for down replicas (Hint 3:A, Hint 3:B)
If consistency level = ANY, always writable
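The write path on these slides can be sketched as a toy in-memory model. Everything here is a simplification: the class names, the `required_acks` parameter (0 stands in for consistency level ANY) and keeping hints on the coordinator are assumptions for illustration, not Cassandra's implementation.

```python
class Node:
    """A replica with an in-memory store and an up/down flag."""
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}

class Coordinator:
    """Forwards writes to replicas and keeps hints for replicas that are down."""
    def __init__(self):
        self.hints = []  # list of (target node, key, value)

    def write(self, replicas, key, value, required_acks):
        acks = 0
        for node in replicas:
            if node.up:
                node.data[key] = value
                acks += 1
            else:
                self.hints.append((node, key, value))  # remember the missed write
        # required_acks = 0 models ANY: the write succeeds even if only hints were stored.
        return acks >= required_acks

    def replay_hints(self):
        """Deliver stored hints to replicas that are back up; keep the rest."""
        pending = []
        for node, key, value in self.hints:
            if node.up:
                node.data[key] = value
            else:
                pending.append((node, key, value))
        self.hints = pending

r1, r2, r3 = Node("2"), Node("3"), Node("4")
r2.up = False                      # node 3 is down, as on the slides
coordinator = Coordinator()
ok = coordinator.write([r1, r2, r3], "A", "value-a", required_acks=2)
print("write accepted:", ok, "| hints stored:", len(coordinator.hints))

r2.up = True                       # node 3 comes back
coordinator.replay_hints()
print("node 3 now has A:", "A" in r2.data)
```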
61. Anti-Entropy / Read Repair
KeySpace with RF=3
Ring of nodes 0-8; the Client reads through a Coordinator node
Replicas: R1 = node 2, R2 = node 3, R3 = node 4
Coordinator sends a Query to one replica and a DigestQuery to each of the other two
read_repair_chance by column family
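A minimal sketch of the read path on these slides: one full query, digest queries to the other replicas, and a repair when the digests disagree. Replicas are modelled as plain dicts of `(value, timestamp)` pairs, the key is assumed to exist on every replica, and `read_repair_chance` is not modelled; all names are illustrative.

```python
import hashlib

def digest(value, timestamp):
    """Short fingerprint of a (value, timestamp) pair, standing in for a DigestQuery reply."""
    return hashlib.md5(f"{timestamp}:{value}".encode("utf-8")).hexdigest()

def read_with_repair(replicas, key):
    """Read `key` from replica dicts holding (value, timestamp); repair stale copies."""
    full_value, full_ts = replicas[0][key]             # full Query to the first replica
    digests = [digest(*r[key]) for r in replicas[1:]]  # DigestQuery to the others
    if all(d == digest(full_value, full_ts) for d in digests):
        return full_value                              # replicas agree, nothing to repair
    # Mismatch: compare full data from every replica and keep the newest timestamp.
    newest = max((r[key] for r in replicas), key=lambda pair: pair[1])
    for r in replicas:
        if r[key] != newest:
            r[key] = newest                            # write the winning version back
    return newest[0]

r1 = {"hello": ("v2", 200)}
r2 = {"hello": ("v1", 100)}   # stale replica
r3 = {"hello": ("v2", 200)}
print(read_with_repair([r1, r2, r3], "hello"))
print(r2["hello"])
```

After the call the stale replica holds the same `(value, timestamp)` pair as the others, which is the anti-entropy effect the slide title refers to.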
79. Replica Placement
SimpleStrategy (adjacent nodes)
CREATE KEYSPACE demo
  WITH strategy_class = 'SimpleStrategy'
  AND strategy_options:replication_factor = 3;
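"Adjacent nodes" can be sketched as: find the node that owns the key's hash, then continue clockwise until the replication factor is reached. The example reuses the tokens from the consistent hashing slides and assumes the usual rule that a node owns hashes up to and including its token; `simple_strategy_replicas` is an illustrative name.

```python
import bisect

# Tokens from the consistent hashing slides.
TOKENS = {"A": 33015, "B": 8915, "C": 31541, "D": 40927}

def simple_strategy_replicas(key_hash, tokens, replication_factor):
    """Owner of key_hash plus the next nodes clockwise, replication_factor nodes in total."""
    ring = sorted((token, node) for node, token in tokens.items())
    start = bisect.bisect_left([t for t, _ in ring], key_hash)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

print(simple_strategy_replicas(13209, TOKENS, 3))  # hash("hello") from the slides
print(simple_strategy_replicas(60912, TOKENS, 3))  # hash("bye") from the slides
```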
80. Replica Placement
NetworkTopologyStrategy (replication by datacenter)
CREATE KEYSPACE demo
  WITH strategy_class = 'NetworkTopologyStrategy'
  AND strategy_options:DC1 = 3
  AND strategy_options:DC2 = 2;
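Replication by datacenter can be approximated by walking the ring clockwise and taking nodes until each datacenter has its configured number of replicas. This is a deliberately reduced sketch: rack awareness is ignored, and the six-node ring, its tokens and the node names are hypothetical.

```python
import bisect

def nts_replicas(key_hash, ring, dc_of, replicas_per_dc):
    """Walk the ring clockwise, taking nodes until each datacenter has its quota.

    ring: sorted list of (token, node); dc_of: node -> datacenter name;
    replicas_per_dc: datacenter -> replica count. Racks are ignored here.
    """
    start = bisect.bisect_left([t for t, _ in ring], key_hash)
    remaining = dict(replicas_per_dc)
    chosen = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        dc = dc_of[node]
        if remaining.get(dc, 0) > 0:
            chosen.append(node)
            remaining[dc] -= 1
    return chosen

# Hypothetical ring: nodes alternate between two datacenters.
ring = [(0, "n1"), (10, "n2"), (20, "n3"), (30, "n4"), (40, "n5"), (50, "n6")]
dc_of = {"n1": "DC1", "n2": "DC2", "n3": "DC1", "n4": "DC2", "n5": "DC1", "n6": "DC2"}

print(nts_replicas(15, ring, dc_of, {"DC1": 3, "DC2": 2}))
```

With DC1 = 3 and DC2 = 2, as in the keyspace definition on the slide, the walk stops adding DC2 nodes once two have been chosen and keeps going until DC1 has three.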
81. Topology Discovery
SimpleSnitch
(single datacenter)
EC2Snitch
(region as datacenter, availability zone as rack)
PropertyFileSnitch
(cassandra-topology.properties)
RackInferringSnitch
(10.DataCenter.Rack.Node)
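The RackInferringSnitch convention on the slide (second octet is the datacenter, third the rack, fourth the node) can be illustrated in a few lines. `infer_location` is an illustrative helper and the address is made up.

```python
def infer_location(ip_address):
    """Map an IPv4 address to (datacenter, rack, node) using octets 2, 3 and 4."""
    octets = ip_address.split(".")
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {ip_address!r}")
    _, datacenter, rack, node = octets
    return datacenter, rack, node

print(infer_location("10.20.30.40"))
```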
84. Wide Rows
(Composite Primary Key)
CREATE TABLE metrics (
  name text,
  day int,
  value counter,
  PRIMARY KEY (name, day)
);
UPDATE metrics SET value = value + 1
  WHERE name = 'google.com' AND day = 20121201;
SELECT * FROM metrics
  WHERE day > 20121201 AND day < 20121205
  AND name = 'google.com';
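One way to picture the wide row behind the `metrics` table: the first key column selects a partition, and within it the cells stay sorted by the second key column, so a range over `day` is a contiguous slice. The class below is a toy in-memory model under that assumption, not Cassandra's storage format; `WideRowTable` and its methods are illustrative names.

```python
import bisect
from collections import defaultdict

class WideRowTable:
    """Rows grouped by partition key; cells inside a partition stay sorted by clustering key."""
    def __init__(self):
        self._partitions = defaultdict(list)  # name -> sorted list of (day, counter value)

    def increment(self, name, day, delta=1):
        """Counterpart of UPDATE ... SET value = value + delta for one (name, day)."""
        cells = self._partitions[name]
        i = bisect.bisect_left(cells, (day,))
        if i < len(cells) and cells[i][0] == day:
            cells[i] = (day, cells[i][1] + delta)
        else:
            cells.insert(i, (day, delta))

    def slice(self, name, after_day, before_day):
        """Cells with after_day < day < before_day, read from a single partition."""
        cells = self._partitions.get(name, [])
        lo = bisect.bisect_right(cells, (after_day, float("inf")))
        hi = bisect.bisect_left(cells, (before_day,))
        return cells[lo:hi]

metrics = WideRowTable()
for day in (20121201, 20121202, 20121202, 20121204, 20121205):
    metrics.increment("google.com", day)

print(metrics.slice("google.com", 20121201, 20121205))
```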
85. Wide Rows
(Composite Primary Key)
CREATE TABLE tweets (
  tweet_id uuid PRIMARY KEY,
  author varchar,
  body varchar
);
CREATE TABLE timeline (
  user_id varchar,
  tweet_id uuid,  // uuid with time as prefix (timeuuid)
  author varchar,
  body varchar,
  PRIMARY KEY (user_id, tweet_id)
);
86. Atomic Batches (1.2+)
BEGIN BATCH USING CONSISTENCY QUORUM
  INSERT INTO tweets (user_id, tweet_id, author, body)
    VALUES ('alankay', ..., 'alan kay', '...')
  INSERT INTO timeline (user_id, tweet_id, author, body)
    VALUES ('other', '...', 'alankay', '...')
APPLY BATCH
CREATE TABLE batchlog (
  id uuid PRIMARY KEY,
  written_at timestamp,
  data blob
)
87. Collections / Sets (1.2+)
CREATE TABLE users (
  login text PRIMARY KEY,
  name text,
  emails set<text>
);
INSERT INTO users (login, name, emails)
  VALUES ('alankay', 'Alan Kay', { 'alan@kay.com' });
UPDATE users SET emails = emails + { 'a@b.com' }
  WHERE login = 'alankay';
88. Collections / Maps (1.2+)
CREATE TABLE users (
  login text PRIMARY KEY,
  name text,
  social_ids map<text, text>
);
INSERT INTO users (login, name, social_ids)
  VALUES ('alankay', 'Alan Kay', { 'twitter' : 'alankay' });
UPDATE users SET social_ids['google'] = '+alankay'
  WHERE login = 'alankay';
89. Collections / Lists (1.2+)
CREATE TABLE users (
  login text PRIMARY KEY,
  name text,
  creditcards list<text>
);
INSERT INTO users (login, name, creditcards)
  VALUES ('alankay', 'Alan Kay', [ '1234-' ]);
UPDATE users SET creditcards = creditcards + [ '2345-' ]
  WHERE login = 'alankay';
90. Cassandra Clients
Shells: Cassandra-CLI, CQLSH
Drivers: Java: CQL / JDBC
Mappings: Java: Apache Gora; Java: Kundera (JPA)
High Level APIs:
  Java: Hector Client API
  Java: Astyanax (Netflix)
  Scala: Cassie (Twitter)
  Python: PyCassa Client API
  PHP: PhpCassa Client API
Low Level: Thrift (multi language)
91. Thanks,
Fernando Rodriguez Olivera
twitter: @frodriguez
mail: frodriguez <at> gmail.com
website: nosqlessentials.com
Next course (Spanish only):
Hadoop/HBase/Cassandra/MongoDB
Buenos Aires, 18/19 Dec 2012:
Registration: nosqlessentials.com