Cassandra
Integrating Cassandra into your project

Tuesday 12 November 2013
Maurits Lawende
• Work at Dutch Open Projects (DOP) since 2007
• Development and technical design for challenging Drupal sites
• Development of SaaS solutions in PHP & NodeJS

ToDoToDay
• Data versus information
• History and usage of Cassandra
• How to use Cassandra
• Developments

Data versus information
Celko, J. (1999). Data and databases

SQL is designed for information
DBMS knows how to use your data

SQL is designed for flexibility
Not even a single line on scalability

SQL
nearly 40 years of experience

SQL
Never designed for scalability

Alexa top 10
• Google (BigTable)
• Facebook (MySQL)
• YouTube (MySQL)
• Yahoo
• Baidu (HyperTable)
• Wikipedia (MySQL)
• QQ.com
• LinkedIn (Voldemort)
• Live.com
• Twitter (MySQL)

Cassandra users
• Facebook (+ Redis & HBase & MySQL)
• Twitter (+ MySQL)
• Reddit (+ Postgres)
• Digg (+ Redis)
• Bit.ly (+ MongoDB)
• Netflix

Jeff Hammerbacher left Facebook in 2008

Back to basics
Don’t think SQL

Key/value store
Evolved towards tables

Just data
• No joins
• Limited sorting capabilities
• No aggregation, grouping, subqueries whatsoever

Schemaless
• Fixed column families (not tables), but
• Dynamic column names

Operations in Cassandra 1.0
• CREATE KEYSPACE name
• USE name
• CREATE COLUMN FAMILY name
• DROP KEYSPACE name
• DROP COLUMN FAMILY name

Operations in Cassandra 1.0
• SET columnfamily['row']['column'] = 'value';
• GET columnfamily['row']
• LIST columnfamily
• DEL columnfamily['row']
• DEL columnfamily['row']['column']

Operations in Cassandra 1.0
• post['uuid']['title'] = 'First post!';
• user['mau']['firstname'] = 'Maurits';
• user['mau']['lastname'] = 'Lawende';

Resulting rows, sorted by rowkey, columnname (all ascending):

post
rowkey | title
uuid   | First post!

user
rowkey | firstname | lastname
mau    | Maurits   | Lawende

Operations in Cassandra 1.0
• post['uuid']['title'] = 'First post!';
• post['uuid']['user'] = 'mau';
• user['mau']['firstname'] = 'Maurits';

How to get a list of blogs by "mau"?

WHERE user = 'mau'
Bad Request: No indexed columns present in by-columns clause with Equal operator
(sequential scans are rejected)

WHERE user = 'mau'
ORDER BY date DESC
LIMIT 10
Bad Request: Order by is currently only supported on the clustered columns of the PRIMARY KEY
Bad Request: ORDER BY is only supported when the partition key is restricted by an EQ or an IN.
(only possible when user and date are in the primary key)

Predictable performance
No performance degradation after data growth

Operations in Cassandra 1.0
• post['uuid']['title'] = 'First post!';
• post['uuid']['user'] = 'mau';
• user['mau']['firstname'] = 'Maurits';
• user['mau']['post001'] = 'uuid';
• user['mau']['post002'] = 'uuid';

(post001, post002: any order and limit)
(fetching the posts by uuid is a join: no uuid IN (...) or ORs)

Operations in Cassandra 1.0
• post['uuid']['title'] = 'First post!';
• user['mau']['firstname'] = 'Maurits';
• user['mau']['post001:uuid'] = 'First post!';
• user['mau']['post002:uuid'] = 'Second post!';

only one query required to get user profile with latest posts

Limits: row key 64 KB, column name 64 KB, column value 2 GB, 2 billion cells per row

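The denormalized user row can be pictured as a map from row key to columns kept sorted by name. Below is a minimal Python sketch of that idea, assuming an in-memory stand-in: the `ColumnFamily` class and its `set` and `slice` methods are hypothetical names for illustration, not a Cassandra client API.

```python
from bisect import insort


class ColumnFamily:
    """Toy in-memory stand-in for a column family (hypothetical, not a real client)."""

    def __init__(self):
        # row key -> {"names": sorted list of column names, "values": name -> value}
        self.rows = {}

    def set(self, row, column, value):
        entry = self.rows.setdefault(row, {"names": [], "values": {}})
        if column not in entry["values"]:
            insort(entry["names"], column)  # columns stay sorted by name
        entry["values"][column] = value

    def slice(self, row, prefix, limit, reverse=False):
        """Return up to `limit` (column, value) pairs whose name starts with `prefix`."""
        entry = self.rows.get(row, {"names": [], "values": {}})
        names = [n for n in entry["names"] if n.startswith(prefix)]
        if reverse:
            names.reverse()
        return [(n, entry["values"][n]) for n in names[:limit]]


user = ColumnFamily()
user.set("mau", "firstname", "Maurits")
user.set("mau", "post001:uuid", "First post!")
user.set("mau", "post002:uuid", "Second post!")

# One lookup on the row key returns the latest posts, newest first.
latest = user.slice("mau", "post", limit=10, reverse=True)
```

Because the post titles are copied into the user row, no second lookup in `post` is needed; the price is that every title change must be written twice.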
Beauty?
• Dirty in the SQL world, but;
• It’s a best practice in Big Data
• Don’t think of it as a relational database
• No strict rules on how to use it, just push it to the limits

Each row is a snapshot of data meant to satisfy a given query, sort of like a materialized view.

Storage in a cluster

Cluster structures
• Master-slave
• Master-master
• Sharding
• HDFS / GlusterFS
• HyperTable
• Dynamo

No master or single point of failure
Every node is (nearly) identical

Distribution and replication
[Figure: token ring running from 0 to 2^127]

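A rough Python sketch of how a ring from 0 to 2^127 can map a row key to its replicas. It makes several simplifying assumptions: the MD5-based `token_for`, the evenly spaced node tokens, the node names, and the "next nodes clockwise" replica rule are illustrative, not Cassandra's exact partitioner or replication strategy.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127


def token_for(key: str) -> int:
    """Map a row key to a token in [0, 2**127) (simplified assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def replicas_for(key: str, ring: list, replication_factor: int) -> list:
    """`ring` is a list of (token, node) pairs sorted by token.

    The first node whose token is >= the key's token owns the key; the
    following nodes clockwise hold the extra replicas.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token_for(key)) % len(ring)  # wrap past the last token
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical four-node cluster with evenly spaced tokens.
nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = [(i * RING_SIZE // len(nodes), name) for i, name in enumerate(nodes)]
owners = replicas_for("mau", ring, replication_factor=3)
```

Any node can compute `owners` locally from the ring, which is why a client can connect to any node and no master is required.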
Client can connect to any node

Seed nodes
• Required for bootstrapping nodes
• Define 2 or 3 seed nodes per cluster

Extending the ring
• Assign a token for new node
• Configure seed node host
• Start Cassandra on new node

Consistency

Writing data
• Hinted handoff
• Write to commit log
• Write in memory
• Write to disk (together with timestamp)

Write consistency
• Choose from ANY, ONE, TWO, THREE, QUORUM, ALL
• QUORUM = floor((replication factor / 2) + 1)

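The QUORUM formula is easy to check with a few replication factors. A small Python sketch (the function name `quorum` is chosen here for illustration):

```python
def quorum(replication_factor: int) -> int:
    """QUORUM = floor((replication factor / 2) + 1), using integer division."""
    return replication_factor // 2 + 1


for rf in (1, 2, 3, 5):
    # rf - quorum(rf) replicas may be down while QUORUM still succeeds
    print(rf, quorum(rf), rf - quorum(rf))
```

With a replication factor of 3, QUORUM needs 2 replicas and tolerates 1 unavailable replica; with a replication factor of 2, QUORUM needs both replicas.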
Read consistency
• Choose from ONE, TWO, THREE, QUORUM, ALL
• Most recent copy is returned

Read repair
• Compares data with 2 other replicas in the background
• Fixes inconsistent and missing data
• At 10% of all reads

Node repair
• Gradually compares all data in nodes with replicas
• Required in conjunction with read repair to fix ‘forgotten deletes’

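"Most recent copy is returned" and read repair can be sketched together in a few lines of Python. This is a simplified model, assuming each replica is a plain dict of key -> (timestamp, value) and that the repair runs inline; in Cassandra the repair happens in the background, and `read_with_repair` is a hypothetical name.

```python
def read_with_repair(replicas, key):
    """Return the value with the highest timestamp and fix stale or missing copies.

    `replicas` is a list of dicts mapping key -> (timestamp, value).
    """
    copies = [replica[key] for replica in replicas if key in replica]
    if not copies:
        return None
    newest = max(copies, key=lambda copy: copy[0])  # most recent timestamp wins
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair inconsistent or missing data
    return newest[1]


replica_a = {"mau": (100, "Maurits")}
replica_b = {"mau": (250, "Maurits L.")}
replica_c = {}  # this replica missed the write entirely

value = read_with_repair([replica_a, replica_b, replica_c], "mau")
```

After the call all three replicas hold the copy written at timestamp 250, which is why the write path stores a timestamp with every value.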
ACID properties
• Atomic; completed successfully or entirely rolled back
• Consistent; transactions never invalidate the database state
• Isolated; transactions are processed sequentially
• Durable; completed actions are persistent

CAP theorem
Impossible to achieve all three:
• Consistency
• Availability
• Partition tolerance

Eventual consistency
Not guaranteed to be consistent, but becomes consistent later

Eventual consistency
• Best effort
• Configurable consistency level, but no transaction support
• Consistency is not always more important than speed and scalability (doesn’t require locking)

Surrogate keys
Say bye to sequences
• not consistent across cluster
• counters are for counting

Native support for UUIDs
f47ac10b-58cc-4372-a567-0e02b2c3d479

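Instead of asking the database for the next sequence number, the application can generate the key itself. A minimal Python sketch using the standard `uuid` module; the variable names are illustrative:

```python
import uuid

# Random (version 4) UUID: any application server can create one
# without coordinating with the cluster.
post_id = uuid.uuid4()

# Time-based (version 1) UUID: embeds a timestamp.
event_id = uuid.uuid1()

# Canonical 36-character text form, usable as a row key.
row_key = str(post_id)

# The example UUID from the slide parses as a version 4 UUID.
slide_example = uuid.UUID("f47ac10b-58cc-4372-a567-0e02b2c3d479")
```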
Cassandra 1.2
• No longer schemaless
• Introduced CQL3
• No wide tables anymore

Collections
• Lists
• Maps
• Sets

Lists
• user['mau']['posts'] = 'uuid';
• CREATE TABLE user (
    username text PRIMARY KEY,
    posts list<uuid>
  );
• UPDATE user SET posts = posts + [f47ac10b-58cc-4372-a567-0e02b2c3d479] WHERE username = 'mau';  -- append
• UPDATE user SET posts = [f47ac10b-58cc-4372-a567-0e02b2c3d479] + posts WHERE username = 'mau';  -- prepend

Sets
• CREATE TABLE user (
    username text PRIMARY KEY,
    emails set<text>
  );
• UPDATE user SET emails = emails + {'mail@example.com'} WHERE username = 'mau';

Maps
• CREATE TABLE user (
    username text PRIMARY KEY,
    attending map<timestamp,text>
  );
• UPDATE user SET attending['2013-11-12'] = 'PHPMeetup' WHERE username = 'mau';
• DELETE attending['2013-12-05'] FROM user WHERE username = 'mau';

Limits on collections
• 64K (no size check in CQL: SET list = list + ['...'])
• Whole collection loaded in memory when reading / writing
• Not an alternative to wide tables!

Wide tables in CQL3
• CREATE TABLE tweets (
    tweet_id uuid PRIMARY KEY,
    author varchar,
    body varchar
  );
• CREATE TABLE timeline (
    user_id varchar,
    tweet_id uuid,
    author varchar,
    body varchar,
    PRIMARY KEY (user_id, tweet_id)
  );

timeline rows:
user_id | uuid:author | uuid:body
mau     | anne        | Tweet from Anne
mike    | david       | Tweet from David

For schemaless lovers:
CREATE TABLE name (
  rowkey varchar,
  columnname varchar,
  value blob,
  PRIMARY KEY (rowkey, columnname)
);

Secondary index
• CREATE INDEX name ON table (column);
• High memory usage when used with high cardinality

Iteration
• SELECT * FROM users

Iteration
• SELECT * FROM users LIMIT 10 OFFSET 100
unpredictable performance

Iteration
• SELECT * FROM users
• SELECT token(username), username, country, age FROM users
  WHERE token(username) > 23947239 LIMIT 10
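
The token-based query can drive a paging loop: remember the token of the last row and ask for the rows after it. A Python sketch of that loop against an in-memory list; `token` is a hypothetical stand-in for the CQL `token()` function (real values depend on the partitioner), and `select_page` only emulates the query.

```python
import hashlib


def token(username: str) -> int:
    """Hypothetical stand-in for CQL token(); not the real partitioner."""
    digest = hashlib.md5(username.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def select_page(users, after_token, limit):
    """Emulates: SELECT ... FROM users WHERE token(username) > after_token LIMIT limit."""
    rows = sorted((token(name), name) for name in users)
    matching = [row for row in rows if after_token is None or row[0] > after_token]
    return matching[:limit]


def iterate_all(users, page_size):
    """Visit every row once by continuing after the last token seen."""
    last_token, seen = None, []
    while True:
        page = select_page(users, last_token, page_size)
        if not page:
            return seen
        seen.extend(name for _, name in page)
        last_token = page[-1][0]


users = ["mau", "mike", "anne", "david", "eve"]
visited = iterate_all(users, page_size=2)
```

Each page costs about the same no matter how far the iteration has progressed, unlike an OFFSET, which would have to skip all earlier rows first. The sketch assumes no two keys share a token.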

Queries are always controlled by one node
Even if data from 100 nodes is involved

MapReduce
Or just ‘MapRed’

MapReduce
• array_map
• array_reduce

map()
• Processes a subset of the data
• array_map(function($v) { return strtoupper($v); }, array('a', 'b'))

reduce()
• Merge results from the mapping function
• array_reduce(array(1, 2, 3), function($a, $b) { return $a + $b; });

MapReduce
[Figure: many parallel map() calls whose outputs are merged step by step into a single result]

Wordcount
$data = array('red green blue', 'orange blue', 'purple green');
// map: count the words in each line
$data = array_map(function($v) {
  $words = array();
  foreach (explode(' ', $v) as $word)
    $words[$word] = isset($words[$word]) ? $words[$word] + 1 : 1;
  return $words;
}, $data);
// reduce: merge the per-line counts into one array
$data = array_reduce($data, function($a, $b) {
  foreach ($a as $word => $count)
    $b[$word] = isset($b[$word]) ? $b[$word] + $count : $count;
  return $b;
}, array());
// Result (key order may differ):
// array('red' => 1, 'green' => 2, 'blue' => 2, 'orange' => 1, 'purple' => 1)

ORDER BY value LIMIT 5
$data = array(array(4,5,2), array(62,35,1), array(74,56,2,34));
// map: keep the 5 smallest values of each subset
$data = array_map(function($v) {
  sort($v);
  return array_slice($v, 0, 5);
}, $data);
// reduce: merge two partial results and keep the 5 smallest
$data = array_reduce($data, function($a, $b) {
  $v = array_merge($a, $b);
  sort($v);
  return array_slice($v, 0, 5);
}, array());
// Result: array(1, 2, 2, 4, 5)

Remember
• Getting information is a bumpy road in big data
• Use MapRed to transform data into information

MapReduce
• No native support in Cassandra
• MapReduce possible with Hadoop (requires Java programming)

Pig
input_lines = LOAD '/tmp/my-copy-of-all-pages-on-internet' AS (line:chararray);
words = FOREACH input_lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
filtered_words = FILTER words BY word MATCHES '\\w+';
word_groups = GROUP filtered_words BY word;
word_count = FOREACH word_groups GENERATE COUNT(filtered_words) AS count, group AS word;
ordered_word_count = ORDER word_count BY count DESC;
STORE ordered_word_count INTO '/tmp/number-of-words-on-internet';

Hive
SELECT v['ip'], COUNT(1) AS cnt FROM www_access
GROUP BY v['ip']
ORDER BY cnt DESC LIMIT 30

Pig and Hive
• Using MapReduce
• No(t very) predictable performance
• Good for analysis

Hack your own
• Not too difficult
• Data can be split into subsets by filtering on tokens
• Application must run on all MapRed nodes
• Probably better performance than Pig / Hive
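
A home-grown MapReduce along these lines could look like the Python sketch below, which reuses the wordcount data. It assumes a small 32-bit token space and an MD5-based `token` helper purely for illustration; in a real deployment each worker would read its own token range from the cluster instead of from a dict.

```python
import hashlib
from collections import Counter
from functools import reduce

TOKEN_SPACE = 2 ** 32  # small hypothetical token space for the sketch


def token(key: str) -> int:
    """Hypothetical token function; not the real partitioner."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def split_by_token(rows: dict, workers: int) -> list:
    """Give each worker the rows whose token falls in its equal-sized range."""
    subsets = [dict() for _ in range(workers)]
    for key, text in rows.items():
        subsets[token(key) * workers // TOKEN_SPACE][key] = text
    return subsets


def map_subset(subset: dict) -> Counter:
    """map(): count the words within one subset of the data."""
    counts = Counter()
    for text in subset.values():
        counts.update(text.split())
    return counts


rows = {"doc1": "red green blue", "doc2": "orange blue", "doc3": "purple green"}
subsets = split_by_token(rows, workers=3)
partials = [map_subset(subset) for subset in subsets]     # would run on separate nodes
totals = reduce(lambda a, b: a + b, partials, Counter())  # reduce(): merge the results
```

The result does not depend on how the rows are distributed over the workers, which is what makes the map step safe to run in parallel.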

Interfaces / protocols
• Thrift
• Binary protocol (1.2+)
• Gossip (internode communication)

Thrift
• Something like SOAP in a binary format
• Tool which generates libraries based on definition files
• Supports many languages (incl. PHP, JS, NodeJS, C, Java, Python, Ruby, ...)
• Also used by HyperTable, HBase, Accumulo and ElasticSearch
• Sole interface before 1.2
• No support for collections

Binary protocol
• Recommended protocol for Cassandra 1.2
• Few client libraries available
• No binary connectors were available for PHP
https://github.com/mauritsl/php-cassandra

php-cassandra
require('lib/cassandra/Cassandra.php');
use Cassandra\Connection as Cassandra;
$connection = new Cassandra('localhost', 'keyspace');
$rows = $connection->query('SELECT * FROM user');
foreach ($rows as $row) {
  print $row->firstname;
  print $row->listfield[0];
}
$rows->count();
$rows->getColumns();

Scaling applications

Rule 1:
Don’t ask for NoSQL drivers for a CMS

Cassandra does not fit all
(same story for every NoSQL solution)

Every page (or API call) should only require a few queries (if not one)

Static versus Dynamic data
• Static: information that doesn’t change very often
  • E.g.: translations
  • May go in a RDBMS or local storage (files?)
• Dynamic: many changes
  • Changes must be visible on all nodes
  • Use Cassandra

Local versus Global data
• Logging
  • Separate logs per node
• Cache
  • Sometimes no need to share cache between nodes
• Statistics
  • Can be kept local for a limited time
• Sessions
  • Dependent on session stickiness

Caching
• Memcache is recommended for local cache
• Cassandra can be used for global cache
  • Has a TTL feature
  • INSERT INTO ... (...) VALUES (...) USING TTL 86400
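
What a TTL of 86400 seconds (one day) means for a cache entry can be modelled in a few lines of Python. This is an in-memory illustration only: the `TTLCache` class, its methods, and the injected clock are hypothetical, and Cassandra expires the data on the server side.

```python
import time


class TTLCache:
    """Toy cache whose entries expire after a time-to-live (illustration only)."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so expiry can be shown without waiting
        self._items = {}     # key -> (value, expiry time)

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        item = self._items.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries disappear on their own
            return None
        return value


now = [0.0]  # fake clock, in seconds
cache = TTLCache(clock=lambda: now[0])
cache.set("page:/home", "<html>...</html>", ttl_seconds=86400)

fresh = cache.get("page:/home")    # still within the TTL
now[0] = 86400.0                   # one day later
expired = cache.get("page:/home")  # entry is gone
```

The application never has to delete cache entries itself; it only chooses how long each entry may live.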

What about files?
• Use Hadoop Distributed File System (HDFS) or GlusterFS
• Or use Cassandra

What about files?
• Split files in chunks to avoid hotspots and save the heap
• Not uncommon to have files in Cassandra
  • github.com/Netflix/astyanax
• GBs are ok, but do not store TBs
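
Splitting a file into chunks can be sketched as follows. The 64 KB chunk size, the `file_id:index` key format, and the function names are assumptions made for this Python illustration; they are not taken from Astyanax or any other library.

```python
CHUNK_SIZE = 64 * 1024  # hypothetical chunk size in bytes


def split_file(file_id: str, data: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Return {row key: chunk}; every chunk gets its own row key."""
    return {
        f"{file_id}:{index:06d}": data[offset:offset + chunk_size]
        for index, offset in enumerate(range(0, len(data), chunk_size))
    }


def join_file(file_id: str, chunks: dict) -> bytes:
    """Reassemble a file from its chunks, ordered by the zero-padded index."""
    prefix = f"{file_id}:"
    keys = sorted(key for key in chunks if key.startswith(prefix))
    return b"".join(chunks[key] for key in keys)


payload = bytes(range(256)) * 1000  # 256,000 bytes of sample data
chunks = split_file("report.pdf", payload)
restored = join_file("report.pdf", chunks)
```

Because each chunk has its own row key, the chunks spread over the ring instead of landing on one node, and no single read or write has to hold the whole file in memory.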

Maximum size of cluster?
• No satisfactory answer
• Probably more dependent on network equipment
  • Rack awareness helps here
• Facebook: 150 node cluster, 50TB data (2010)
• Easou: 400 node cluster, 300TB data (300 million images)

Minimum size of a cluster?
• Can run on a single node
• 4GB RAM recommended
• Runs fine on 1GB RAM
“hot data” should fit in RAM

Installing Cassandra
• Install JDK
  • Oracle Java recommended but OpenJDK works ok
• Add Cassandra repository
• apt-get install cassandra
• Set listen and seed address (IP address of node and seed)
• (Re)start Cassandra

Last words...

Data versus information
Data structure is naturally responsive for information
predictable performance

History and usage
Jeff Hammerbacher

How to use it
Schema design, CQL3 and limits

Developments
CQL3 and binary protocol

Thank you!

Questions?
Escalando una PHP App con DB sharding - PHP ConferenceEscalando una PHP App con DB sharding - PHP Conference
Escalando una PHP App con DB sharding - PHP Conference
 
NATO IST Symposium 2013
NATO IST Symposium 2013NATO IST Symposium 2013
NATO IST Symposium 2013
 
Active Record Introduction - 3
Active Record Introduction - 3Active Record Introduction - 3
Active Record Introduction - 3
 
elasticsearch basics workshop
elasticsearch basics workshopelasticsearch basics workshop
elasticsearch basics workshop
 
Rapid Home Provisioning
Rapid Home ProvisioningRapid Home Provisioning
Rapid Home Provisioning
 
MYSQLCLONE Introduction
MYSQLCLONE IntroductionMYSQLCLONE Introduction
MYSQLCLONE Introduction
 

Kürzlich hochgeladen

Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 

Kürzlich hochgeladen (20)

Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 

Cassandra - PHP

  • 1. Cassandra: Integrating Cassandra into your project (Tuesday 12 November 2013)
  • 2. Maurits Lawende: Work at Dutch Open Projects (DOP) since 2007; Development and technical design for challenging Drupal sites; Development of SaaS solutions in PHP & NodeJS
  • 3. ToDoToDay: Data versus information; History and usage of Cassandra; How to use Cassandra; Developments
  • 4. Data versus information. Celko, J. (1999). Data and databases
  • 5. SQL is designed for information: the DBMS knows how to use your data
  • 6. SQL is designed for flexibility: not even a single line on scalability
  • 7. SQL: nearly 40 years of experience
  • 8. SQL: never designed for scalability
  • 9. Alexa top 10: Google; Facebook; YouTube; Yahoo; Baidu; Wikipedia; QQ.com; LinkedIn; Live.com; Twitter
  • 10. Alexa top 10: Google (BigTable); Facebook (MySQL); YouTube (MySQL); Yahoo; Baidu (HyperTable); Wikipedia (MySQL); QQ.com; LinkedIn (Voldemort); Live.com; Twitter (MySQL)
  • 11. Cassandra users: Facebook (+ Redis & HBase & MySQL); Twitter (+ MySQL); Reddit (+ Postgres); Digg (+ Redis); Bit.ly (+ MongoDB); Netflix
  • 12. Cassandra users: Facebook (+ Redis & HBase & MySQL); Twitter (+ MySQL); Reddit (+ Postgres); Digg (+ Redis); Bit.ly (+ MongoDB); Netflix. Jeff Hammerbacher
  • 13. Cassandra users: Facebook (+ Redis & HBase & MySQL); Twitter (+ MySQL); Reddit (+ Postgres); Digg (+ Redis); Bit.ly (+ MongoDB); Netflix. Jeff Hammerbacher left Facebook in 2008
  • 14. Back to basics: don't think SQL
  • 15. Key/value store: evolved towards tables
  • 16. Just data: No joins; Limited sorting capabilities; No aggregation, grouping, or subqueries whatsoever
  • 17. Schemaless: fixed column families (not tables), but dynamic column names
  • 18. Operations in Cassandra 1.0: CREATE KEYSPACE name; USE name; CREATE COLUMN FAMILY name; DROP KEYSPACE name; DROP COLUMN FAMILY name
  • 19. Operations in Cassandra 1.0: SET columnfamily['row']['column'] = 'value'; GET columnfamily['row']; LIST columnfamily; DEL columnfamily['row']; DEL columnfamily['row']['column']
  • 20. Operations in Cassandra 1.0: post['uuid']['title'] = 'First post!'; user['mau']['firstname'] = 'Maurits'; user['mau']['lastname'] = 'Lawende';
  • 21. Operations in Cassandra 1.0: post['uuid']['title'] = 'First post!'; user['mau']['firstname'] = 'Maurits'; user['mau']['lastname'] = 'Lawende'; Resulting layout: column family post, row uuid: title = First post!; column family user, row mau: firstname = Maurits, lastname = Lawende
  • 22. Operations in Cassandra 1.0, sorted by rowkey, columnname (all ascending): post['uuid']['title'] = 'First post!'; user['mau']['firstname'] = 'Maurits'; user['mau']['lastname'] = 'Lawende';
  • 23. Operations in Cassandra 1.0: post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits';
  • 24. Operations in Cassandra 1.0. How to get a list of blogs by "mau"? post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits';
  • 25. Operations in Cassandra 1.0. How to get a list of blogs by "mau"? post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits'; Query: WHERE user = 'mau'
  • 26. Operations in Cassandra 1.0. How to get a list of blogs by "mau"? post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits'; Query: WHERE user = 'mau'. Result: "Bad Request: No indexed columns present in by-columns clause with Equal operator"
  • 27. Operations in Cassandra 1.0. How to get a list of blogs by "mau"? post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits'; Query: WHERE user = 'mau'. Result: "Bad Request: No indexed columns present in by-columns clause with Equal operator". Sequential scans are rejected
  • 28. Operations in Cassandra 1.0. How to get a list of blogs by "mau"? post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits'; Query: WHERE user = 'mau'. Results: "Bad Request: No indexed columns present in by-columns clause with Equal operator"; "Bad Request: Order by is currently only supported on the clustered columns of the PRIMARY KEY"
  • 29. Operations in Cassandra 1.0. How to get a list of blogs by "mau"? post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits'; Query: WHERE user = 'mau'. Results: "Bad Request: No indexed columns present in by-columns clause with Equal operator"; "Bad Request: Order by is currently only supported on the clustered columns of the PRIMARY KEY"; "Bad Request: ORDER BY is only supported when the partition key is restricted by an EQ or an IN."
  • 30. Operations in Cassandra 1.0. How to get a list of blogs by "mau"? post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits'; Query: WHERE user = 'mau' ORDER BY date DESC LIMIT 10
  • 31. Operations in Cassandra 1.0. How to get a list of blogs by "mau"? post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits'; Query: WHERE user = 'mau' ORDER BY date DESC LIMIT 10. Only possible when user and date are in the primary key
  • 32. Predictable performance: no performance degradation after data growth
  • 33. Operations in Cassandra 1.0: post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits'; user['mau']['post001'] = 'uuid'; user['mau']['post002'] = 'uuid';
  • 34. Operations in Cassandra 1.0: post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'mau'; user['mau']['firstname'] = 'Maurits'; user['mau']['post001'] = 'uuid'; user['mau']['post002'] = 'uuid'; Annotation: any order and limit
  • 35. Operations in Cassandra 1.0: post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'uuid'; user['mau']['firstname'] = 'Maurits'; user['mau']['post001'] = 'uuid'; user['mau']['post002'] = 'uuid'; Annotation: join
  • 36. Operations in Cassandra 1.0: post['uuid']['title'] = 'First post!'; post['uuid']['user'] = 'uuid'; user['mau']['firstname'] = 'Maurits'; user['mau']['post001'] = 'uuid'; user['mau']['post002'] = 'uuid'; Annotations: join; no uuid IN (...) or ORs
  • 37. Operations in Cassandra 1.0: post['uuid']['title'] = 'First post!'; user['mau']['firstname'] = 'Maurits'; user['mau']['post001:uuid'] = 'First post!'; user['mau']['post002:uuid'] = 'Second post!';
  • 38. Operations in Cassandra 1.0: post['uuid']['title'] = 'First post!'; user['mau']['firstname'] = 'Maurits'; user['mau']['post001:uuid'] = 'First post!'; user['mau']['post002:uuid'] = 'Second post!'; Only one query required to get user profile with latest posts
  • 39. Operations in Cassandra 1.0: post['uuid']['title'] = 'First post!'; user['mau']['firstname'] = 'Maurits'; user['mau']['post001:uuid'] = 'First post!'; user['mau']['post002:uuid'] = 'Second post!'; Limits shown: 64 KB (row key); 64 KB (column name); 2 GB (column value); 2 billion cells (per row)
  • 40. Beauty? Dirty in the SQL world, but it's a best practice in Big Data; Don't think of it as a relational database; No strict rules on how to use it, just push it to the limits
  • 42. Each row is a snapshot of data meant to satisfy a given query, sort of like a materialized view.
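
The denormalised layout from slides 37 to 42 can be imitated in plain PHP. This is only a minimal in-memory sketch, not Cassandra code: the `$user` array, the `column_slice()` helper and the `uuid-a` / `uuid-b` placeholders are assumptions made for illustration. It shows why a single row read is enough once the post titles are copied into the user row under sortable column names.

```php
<?php
// Hypothetical in-memory stand-in for the "user" column family: row key => columns.
$user = array(
    'mau' => array(
        'firstname'      => 'Maurits',
        'post002:uuid-b' => 'Second post!',
        'post001:uuid-a' => 'First post!',
    ),
);

// Columns within a row are kept sorted by column name; ksort() imitates that here.
ksort($user['mau'], SORT_STRING);

// Return the columns of one row whose names start with $prefix (a column slice).
function column_slice(array $row, $prefix) {
    $slice = array();
    foreach ($row as $name => $value) {
        if (strpos($name, $prefix) === 0) {
            $slice[$name] = $value;
        }
    }
    return $slice;
}

// One row read yields both the profile and the posts, already in order.
$posts = column_slice($user['mau'], 'post');
```

The price of this layout is that every new post has to be written twice: once to the post row and once to the author's row.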
  • 43. Storage in a cluster
  • 48. HDFS / GlusterFS
  • 51. No master or single point of failure: every node is (nearly) identical
  • 52. Distribution and replication: token ring running from 0 to 2^127
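
How a key ends up on a set of nodes can be sketched with a toy ring. Everything here is an assumption for illustration: the ring runs from 0 to 99 instead of 0 to 2^127, `crc32()` replaces the real partitioner hash, and `replicas_for()` simply takes the owning node plus the next nodes clockwise, which is the simplest placement strategy and ignores racks and data centers.

```php
<?php
// Assumption: a toy ring of 0..99 and crc32 as hash (non-negative on 64-bit PHP).
function token_for($key) {
    return crc32($key) % 100;
}

// Return the names of the $rf nodes responsible for $token.
// $ring maps node name => token assigned to that node.
function replicas_for($token, array $ring, $rf) {
    asort($ring);                       // walk the ring in token order
    $names  = array_keys($ring);
    $tokens = array_values($ring);
    $start  = 0;                        // past the last token: wrap to the first node
    foreach ($tokens as $i => $t) {
        if ($token <= $t) {
            $start = $i;
            break;
        }
    }
    $result = array();
    $count  = min($rf, count($names));
    for ($i = 0; $i < $count; $i++) {
        $result[] = $names[($start + $i) % count($names)];
    }
    return $result;
}

$ring = array('node-a' => 0, 'node-b' => 25, 'node-c' => 50, 'node-d' => 75);
```

With this ring, `replicas_for(token_for('mau'), $ring, 3)` returns three consecutive node names, and any node can compute the same answer without asking a master.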
  • 58. Client can connect to any node
  • 59. Seed nodes: Required for bootstrapping nodes; Define 2 or 3 seed nodes per cluster
  • 60. Extending the ring: Assign a token for new node; Configure seed node host; Start Cassandra on new node
  • 61. Extending the ring: Assign a token for new node; Configure seed node host; Start Cassandra on new node
  • 63. Writing data: Hinted handoff; Write to commit log; Write in memory; Write to disk (together with timestamp)
  • 64. Write consistency: Choose from ANY, ONE, TWO, THREE, QUORUM, ALL; QUORUM = floor((replication factor / 2) + 1)
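
The quorum formula is easy to check with a few replication factors. The helper names below are hypothetical. The second function encodes the usual rule of thumb that a read sees the latest successful write when the number of replicas read plus the number of replicas written exceeds the replication factor.

```php
<?php
// QUORUM = floor(replication factor / 2) + 1
function quorum($replicationFactor) {
    return intdiv($replicationFactor, 2) + 1;
}

// True when every read set must share at least one replica with every write set.
function read_overlaps_write($readReplicas, $writeReplicas, $replicationFactor) {
    return $readReplicas + $writeReplicas > $replicationFactor;
}
```

For a replication factor of 3, `quorum(3)` is 2, so QUORUM reads combined with QUORUM writes overlap, while ONE reads combined with ONE writes do not.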
  • 65. Read consistency: Choose from ONE, TWO, THREE, QUORUM, ALL; Most recent copy is returned
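
"Most recent copy is returned" relies on the timestamp stored with every write. A minimal sketch of that reconciliation step, assuming each replica answers with a value and a timestamp (the array shape and the function name are made up for this example):

```php
<?php
// Each replica response is array('value' => ..., 'timestamp' => ...).
// Returns the response with the highest timestamp, or null for no responses.
function most_recent(array $responses) {
    $latest = null;
    foreach ($responses as $response) {
        if ($latest === null || $response['timestamp'] > $latest['timestamp']) {
            $latest = $response;
        }
    }
    return $latest;
}

$responses = array(
    array('value' => 'Maurits',  'timestamp' => 1384250000),
    array('value' => 'Maurits L', 'timestamp' => 1384250500),
    array('value' => 'Maurits',  'timestamp' => 1384250000),
);
$winner = most_recent($responses);
```

The stale replicas found this way are the ones a repair would bring up to date.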
  • 66. Read repair: Compares data with 2 other replicas in the background; Fixes inconsistent and missing data; At 10% of all reads
  • 67. Node repair: Gradually compares all data in nodes with replicas; Required in conjunction with read repair to fix 'forgotten deletes'
  • 68. ACID properties: Atomic: completed successfully or entirely rolled back; Consistent: transactions never invalidate the database state; Isolated: transactions are processed sequentially; Durable: completed actions are persistent
  • 69. CAP theorem: impossible to achieve all three: Consistency; Availability; Partition tolerance
  • 70. Eventual consistency: not guaranteed to be consistent, but becomes consistent later
  • 71. Eventual consistency: Best effort; Consistency is not always more important than speed and scalability (doesn't require locking); Configurable consistency level, but no transaction support
  • 72. Surrogate keys: say bye to sequences
  • 73. Surrogate keys: say bye to sequences (not consistent across cluster)
  • 74. Surrogate keys: say bye to sequences (not consistent across cluster; counters are for counting)
  • 75. Surrogate keys: say bye to sequences (not consistent across cluster; counters are for counting). Native support for UUIDs: f47ac10b-58cc-4372-a567-0e02b2c3d479
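
A surrogate key that needs no cluster-wide sequence can be generated in the application. The sketch below builds a random (version 4) UUID with PHP 7's `random_bytes()`; the function name `uuid_v4()` is an assumption, and a project may prefer an existing UUID library instead.

```php
<?php
// Generate a random (version 4) UUID in the canonical 8-4-4-4-12 hex format.
function uuid_v4() {
    $bytes = random_bytes(16);
    // Set the four version bits to 0100 (version 4).
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    // Set the two variant bits to 10 (RFC 4122 variant).
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);
    // 32 hex characters split into eight groups of four.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

$id = uuid_v4();
```

Because every node or web server can generate such keys independently, no coordination is needed at insert time.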
  • 77. Cassandra 1.2: No longer schemaless; Introduced CQL3; No wide tables anymore
  • 79. Lists: user['mau']['posts'] = 'uuid'; CREATE TABLE user ( username text PRIMARY KEY, posts list<uuid> ); UPDATE user SET posts = posts + ['uuid']; UPDATE user SET posts = ['uuid'] + posts
  • 80. Set: CREATE TABLE user ( username text PRIMARY KEY, emails set<text> ); UPDATE user SET emails = emails + {'mail@example.com'}
  • 81. Maps: CREATE TABLE user ( username text PRIMARY KEY, attending map<timestamp,text> ); UPDATE user SET attending['2013-11-12'] = 'PHPMeetup'; DELETE attending['2013-12-05'] FROM user
  • 82. Limits on collections: 64K; Whole collection loaded in memory when reading / writing; Not an alternative to wide tables!
  • 83. Limits on collections: 64K (no size check in CQL: SET list = list + ['...']); Whole collection loaded in memory when reading / writing; Not an alternative to wide tables!
  • 84. Wide tables in CQL3: CREATE TABLE tweets ( tweet_id uuid PRIMARY KEY, author varchar, body varchar ); CREATE TABLE timeline ( user_id varchar, tweet_id uuid, author varchar, body varchar, PRIMARY KEY (user_id, tweet_id) )
  • 85. Wide tables in CQL3: CREATE TABLE tweets ( tweet_id uuid PRIMARY KEY, author varchar, body varchar ); CREATE TABLE timeline ( user_id varchar, tweet_id uuid, author varchar, body varchar, PRIMARY KEY (user_id, tweet_id) ) Resulting rows: user_id mau: uuid:author = anne, uuid:body = Tweet from Anne; user_id mike: uuid:author = david, uuid:body = Tweet from David
  • 86. Wide tables in CQL3: CREATE TABLE tweets ( tweet_id uuid PRIMARY KEY, author varchar, body varchar ); CREATE TABLE timeline ( user_id varchar, tweet_id uuid, author varchar, body varchar, PRIMARY KEY (user_id, tweet_id) ) Resulting rows: user_id mau: uuid:author = anne, uuid:body = Tweet from Anne; user_id mike: uuid:author = david, uuid:body = Tweet from David. For schemaless lovers: CREATE TABLE name ( rowkey varchar, columnname varchar, value blob, PRIMARY KEY (rowkey, columnname) );
  • 87. Secondary index: CREATE INDEX name ON table (column); High memory usage when used with high cardinality
  • 88. Iteration: SELECT * FROM users
  • 89. Iteration: SELECT * FROM users LIMIT 10 OFFSET 100 (unpredictable performance)
  • 90. Iteration: SELECT * FROM users; SELECT token(username), username, country, age FROM user
  • 91. Iteration: SELECT * FROM users; SELECT token(username), username, country, age FROM user WHERE token(username) > 23947239 LIMIT 10
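
The token-based iteration from slide 91 boils down to "remember the last token, ask for the next page after it". The sketch below simulates that loop against an in-memory list instead of a cluster. `fetch_page()` is a hypothetical stand-in for the `WHERE token(username) > ? LIMIT ?` query, and the token values are invented.

```php
<?php
// Hypothetical stand-in for "SELECT ... WHERE token(username) > ? LIMIT ?".
// $rows is a list of array('token' => int, 'username' => string), sorted by token.
function fetch_page(array $rows, $afterToken, $limit) {
    $page = array();
    foreach ($rows as $row) {
        if ($afterToken === null || $row['token'] > $afterToken) {
            $page[] = $row;
            if (count($page) === $limit) {
                break;
            }
        }
    }
    return $page;
}

// Walk over all rows page by page, carrying the last seen token forward.
function iterate_all(array $rows, $pageSize) {
    $usernames = array();
    $lastToken = null;
    while (true) {
        $page = fetch_page($rows, $lastToken, $pageSize);
        if (count($page) === 0) {
            break;
        }
        foreach ($page as $row) {
            $usernames[] = $row['username'];
        }
        $lastToken = $page[count($page) - 1]['token'];
    }
    return $usernames;
}

$rows = array(
    array('token' => -50, 'username' => 'anne'),
    array('token' => 3,   'username' => 'mike'),
    array('token' => 17,  'username' => 'mau'),
    array('token' => 42,  'username' => 'david'),
    array('token' => 99,  'username' => 'eve'),
);
$all = iterate_all($rows, 2);
```

Each page costs about the same regardless of how far the iteration has progressed, which is the point of paging by token rather than by offset. Note that rows come back in token order, not alphabetical order.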
  • 92. Queries are always controlled by one node
  • 93. Queries are always controlled by one node, even if data from 100 nodes is involved
  • 95. MapReduce: array_map; array_reduce
  • 96. map(): Processes a subset of the data; array_map(function($v) { return strtoupper($v); }, array('a', 'b'))
  • 97. reduce(): Merge results from the mapping function; array_reduce(array(1, 2, 3), function($a, $b) { return $a + $b; });
  • 108. Wordcount: $data = array('red green blue', 'orange blue', 'purple green'); $data = array_map(function($v) { $words = array(); foreach (explode(' ', $v) as $word) $words[$word] = isset($words[$word]) ? $words[$word] + 1 : 1; return $words; }, $data); $data = array_reduce($data, function($a, $b) { foreach ($a as $word => $count) $b[$word] = isset($b[$word]) ? $b[$word] + $count : $count; return $b; }, array()); Result: array('red' => 1, 'green' => 2, 'blue' => 2, 'orange' => 1, 'purple' => 1)
  • 109. ORDER BY value LIMIT 5: $data = array(array(4,5,2), array(62,35,1), array(74,56,2,34)); $data = array_map(function($v) { sort($v); return array_slice($v, 0, 5); }, $data); $data = array_reduce($data, function($a, $b) { $v = array_merge($a, $b); sort($v); return array_slice($v, 0, 5); }, array()); Result: array(1, 2, 2, 4, 5)
  • 110. Remember: Getting information is a bumpy road in big data; Use MapRed to transform data into information
  • 111. MapReduce: No native support in Cassandra; MapReduce possible with Hadoop (requires Java programming)
  • 112. Pig: input_lines = LOAD '/tmp/my-copy-of-all-pages-on-internet' AS (line:chararray); words = FOREACH input_lines GENERATE FLATTEN(TOKENIZE(line)) AS word; filtered_words = FILTER words BY word MATCHES '\\w+'; word_groups = GROUP filtered_words BY word; word_count = FOREACH word_groups GENERATE COUNT(filtered_words) AS count, group AS word; ordered_word_count = ORDER word_count BY count DESC; STORE ordered_word_count INTO '/tmp/number-of-words-on-internet';
  • 113. Hive: SELECT v['ip'], COUNT(1) AS cnt FROM www_access GROUP BY v['ip'] ORDER BY cnt DESC LIMIT 30
  • 114. Pig and Hive: Using MapReduce; No(t very) predictable performance; Good for analysis
  • 115. Hack your own: Not too difficult; Data can be split into subsets by filtering on tokens; Application must run on all MapRed nodes; Probably better performance than Pig / Hive
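
Splitting the data "by filtering on tokens" can start with something as small as dividing the token range into equal parts, one per worker. This is a sketch under assumptions: the function name is hypothetical, the range 0 to 99 stands in for the real token range, and `$parts` is assumed to be no larger than the number of tokens in the range.

```php
<?php
// Split the inclusive range [$min, $max] into $parts contiguous sub-ranges.
// The last sub-range absorbs the remainder when the range does not divide evenly.
function split_token_range($min, $max, $parts) {
    $ranges = array();
    $size   = intdiv($max - $min + 1, $parts);
    $start  = $min;
    for ($i = 0; $i < $parts; $i++) {
        $end      = ($i === $parts - 1) ? $max : $start + $size - 1;
        $ranges[] = array($start, $end);
        $start    = $end + 1;
    }
    return $ranges;
}

$ranges = split_token_range(0, 99, 4);
```

Each worker would then run its map step only over rows whose token falls inside its own sub-range, and a single reduce step merges the partial results, as in the word count example.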
  • 117. Interfaces / protocols: Thrift; Binary protocol (1.2+); Gossip (internode communication)
  • 118. Thrift: Something like SOAP in a binary format; Tool which generates libraries based on definition files; Supports many languages (incl. PHP, JS, NodeJS, C, Java, Python, Ruby, ...); Also used by HyperTable, HBase, Accumulo and ElasticSearch; Sole interface before 1.2
  • 119. Thrift: No support for collections
  • 120. Binary protocol: Recommended protocol for Cassandra 1.2; Few client libraries available; No binary connectors were available for PHP: https://github.com/mauritsl/php-cassandra
  • 121. php-cassandra: require('lib/cassandra/Cassandra.php'); use Cassandra\Connection as Cassandra; $connection = new Cassandra('localhost', 'keyspace'); $rows = $connection->query('SELECT * FROM user'); foreach ($rows as $row) { print $row->firstname; print $row->listfield[0]; } $rows->count(); $rows->getColumns();
  • 123. Rule 1: Don't ask for NoSQL drivers for a CMS
  • 124. Cassandra does not fit all (same story for every NoSQL solution)
  • 125. Every page (or API call) should only require a few queries (if not one)
  • 126. Static versus dynamic data. Static: information that doesn't change very often, e.g. translations; may go in an RDBMS or local storage (files?). Dynamic: many changes; changes must be visible on all nodes; use Cassandra
  • 127. Local versus global data. Logging: separate logs per node. Cache: sometimes no need to share cache between nodes. Statistics: can be kept local for a limited time
  • 128. Local versus global data. Sessions: dependent on session stickiness
  • 129. Caching: Memcache is recommended for local cache; Cassandra can be used for global cache; Has a TTL feature: INSERT INTO ... (...) VALUES (...) USING TTL 86400
  • 130. What about files? Use Hadoop Distributed File System (HDFS) or GlusterFS
  • 131. What about files? Use Hadoop Distributed File System (HDFS) or GlusterFS; Or use Cassandra
  • 132. What about files? Split files in chunks to avoid hotspots and save the heap; Not uncommon to have files in Cassandra: github.com/Netflix/astyanax; GBs are OK, but do not store TBs
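
Splitting a file into chunks before storing it could look like the sketch below. The row shape (`file_id`, `chunk`, `data`) and both function names are assumptions for illustration. In a real table the file id and chunk number would presumably form the key, so that the chunks of one large file are spread over several nodes rather than landing on one.

```php
<?php
// Split $contents into rows holding at most $chunkSize bytes each.
function chunk_file($fileId, $contents, $chunkSize) {
    $rows = array();
    foreach (str_split($contents, $chunkSize) as $index => $chunk) {
        $rows[] = array('file_id' => $fileId, 'chunk' => $index, 'data' => $chunk);
    }
    return $rows;
}

// Reassemble a file from its chunk rows, whatever order they arrive in.
function join_chunks(array $rows) {
    usort($rows, function ($a, $b) {
        return $a['chunk'] - $b['chunk'];
    });
    $contents = '';
    foreach ($rows as $row) {
        $contents .= $row['data'];
    }
    return $contents;
}

$rows     = chunk_file('file-1', 'abcdefghij', 4);
$restored = join_chunks(array_reverse($rows));
```

Reading chunk by chunk also keeps memory use bounded, because only one chunk has to be held at a time.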
  • 133. Maximum size of cluster? No satisfactory answer; Probably more dependent on network equipment; Rack awareness helps here; Facebook: 150 node cluster, 50TB data (2010); Easou: 400 node cluster, 300TB data (300 million images)
  • 134. Minimum size of a cluster? Can run on a single node; 4GB RAM recommended; Runs fine on 1GB RAM
  • 135. Minimum size of a cluster? Can run on a single node; 4GB RAM recommended; Runs fine on 1GB RAM; "Hot data" should fit in RAM
  • 136. Installing Cassandra: Install JDK (Oracle Java recommended but OpenJDK works ok); Add Cassandra repository; apt-get install cassandra; Set listen and seed address (IP address of node and seed); (Re)start Cassandra
  • 137. Last words...
  • 138. Data versus information: data structure is naturally responsive for information
  • 139. Data versus information: data structure is naturally responsive for information; predictable performance
  • 140. History and usage: Jeff Hammerbacher
  • 141. How to use it: schema design, CQL3 and limits
  • 142. Developments: CQL3 and binary protocol
  • 143. Thank you!