#CASSANDRA13
Colin Charles | colin@mariadb.org | SkySQL Ab | http://mariadb.org/
@bytebot on Twitter | http://bytebot.net/blog/
MariaDB and Cassandra Interoperability
whoami
*Work on MariaDB today
*Formerly of MySQL AB (USD$1bil acquisition -> Sun Microsystems)
*Worked on The Fedora Project & OpenOffice.org previously
*Monty Program Ab is a major sponsor of MariaDB
*SkySQL & Monty Program Ab merge
*MariaDB governed by MariaDB Foundation
What we will discuss today...
*What is MariaDB?
*MariaDB Architecture
*The Cassandra Storage Engine (CassandraSE)
*Data & Command Mapping
*Use Cases
*Benchmarks
*Conclusions
What is MariaDB?
*Community developed, feature enhanced, backward compatible MySQL
*Drop-in replacement for MySQL
*Shipped in many Linux distributions as a default
*Enhanced features: threadpool, table elimination, optimizer changes
(subqueries materialize!), group commit in the replication binary log,
HandlerSocket, SphinxSE, multi-source replication, dynamic columns
MariaDB/MySQL and NoSQL
*HandlerSocket
*memcached access to InnoDB
*Hadoop Applier
*LevelDB Storage Engine
*Cassandra Storage Engine
*CONNECT Engine
Dynamic Columns
*Store a different set of columns for every row in the table
*Basically a blob with handling functions (GET, CREATE, ADD, DELETE,
EXISTS, LIST, JSON)
*Dynamic columns can be nested
*You can request rows in JSON format
*You can now name dynamic columns as well
INSERT INTO tbl SET
dyncol_blob=COLUMN_CREATE("column_name", "value");
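As a rough mental model only (this is not how MariaDB actually encodes the blob), the handling functions listed above behave like operations on a per-row dictionary. The Python sketch below uses hypothetical helper names chosen to mirror COLUMN_CREATE, COLUMN_ADD, COLUMN_DELETE, COLUMN_GET, COLUMN_EXISTS and COLUMN_LIST; JSON text stands in for the real packed format.

```python
import json

# Toy model: each "blob" is JSON text holding that row's own set of columns.
# The function names are illustrative stand-ins, not MariaDB's implementation.

def column_create(**cols):
    """Pack name/value pairs into one blob."""
    return json.dumps(cols, sort_keys=True)

def column_add(blob, **cols):
    """Return a new blob with the given columns added or overwritten."""
    data = json.loads(blob)
    data.update(cols)
    return json.dumps(data, sort_keys=True)

def column_delete(blob, name):
    """Return a new blob without the named column (no error if absent)."""
    data = json.loads(blob)
    data.pop(name, None)
    return json.dumps(data, sort_keys=True)

def column_get(blob, name):
    """Return the column's value, or None if this row does not have it."""
    return json.loads(blob).get(name)

def column_exists(blob, name):
    return name in json.loads(blob)

def column_list(blob):
    return sorted(json.loads(blob))

# Two rows of the same table can carry different column sets.
row1 = column_create(color="red", size=42)
row2 = column_create(weight=1.5)
row1 = column_add(row1, material="wool")
row1 = column_delete(row1, "size")
```

The point of the sketch is only that the set of columns is a property of each row's blob, not of the table definition.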
Cassandra background
*Distributed key/value store (limited range scan support), optionally flexible schema (pre-defined “static” columns, ad-hoc dynamic columns), automatic sharding/replication, eventual consistency
*Column families are like “tables”
*Row key -> column mapping
*Supercolumns are not supported
CQL at work
cqlsh> CREATE KEYSPACE mariadbtest
... WITH REPLICATION ={'class':'SimpleStrategy','replication_factor':1};
cqlsh> use mariadbtest;
cqlsh:mariadbtest> create columnfamily cf1 ( pk varchar primary key, data1 varchar, data2 bigint ) with compact storage;
cqlsh:mariadbtest> insert into cf1 (pk, data1,data2) values ('row1', 'data-in-cassandra', 1234);
cqlsh:mariadbtest> select * from cf1;
 pk   | data1             | data2
------+-------------------+-------
 row1 | data-in-cassandra |  1234
cqlsh:mariadbtest> select * from cf1 where pk='row1';
 pk   | data1             | data2
------+-------------------+-------
 row1 | data-in-cassandra |  1234
cqlsh:mariadbtest> select * from cf1 where data2=1234;
Bad Request: No indexed columns present in by-columns clause with Equal operator
cqlsh:mariadbtest> select * from cf1 where pk='row1' or pk='row2';
Bad Request: line 1:34 missing EOF at 'or'
CQL
*Looks like SQL at first glance
*No joins or subqueries
*No GROUP BY; ORDER BY must be able to use available indexes
*WHERE clause must represent an index lookup
*Simple goal of the Cassandra Storage Engine? Provide a “view” of
Cassandra’s data from MariaDB
Getting started
*Get MariaDB 10.0.2 from https://downloads.mariadb.org/
*Load the Cassandra plugin
- From SQL:
MariaDB [(none)]> install plugin cassandra soname 'ha_cassandra.so';
- Or start it from my.cnf
[mysqld]
...
plugin-load=ha_cassandra.so
Is everything ok?
*Check to see that it is loaded - SHOW PLUGINS
MariaDB [(none)]> show plugins;
+-----------+--------+----------------+-----------------+---------+
| Name      | Status | Type           | Library         | License |
+-----------+--------+----------------+-----------------+---------+
...
| CASSANDRA | ACTIVE | STORAGE ENGINE | ha_cassandra.so | GPL     |
+-----------+--------+----------------+-----------------+---------+
Create an SQL table which is a view of a column family
MariaDB [test]> set global cassandra_default_thrift_host='10.196.2.113';
MariaDB [test]> create table t2 (pk varchar(36) primary key,
-> data1 varchar(60),
-> data2 bigint
-> ) engine=cassandra
-> keyspace='mariadbtest'
-> thrift_host='10.196.2.113'
-> column_family='cf1';
*thrift_host can be set per-table
*@@cassandra_default_thrift_host lets you re-point the table to a different node dynamically, without changing the table DDL when the Cassandra IP changes
Potential issues
*SELinux blocks the connection
ERROR 1429 (HY000): Unable to connect to foreign data source:
connect() failed: Permission denied [1]
*Disable SELinux: echo 0 > /selinux/enforce
*Cassandra 1.2 with Column Families (CFs) without “COMPACT STORAGE”
ERROR 1429 (HY000): Unable to connect to foreign data source:
Column family cf1 not found in keyspace mariadbtest
*Change in Cassandra 1.2 that broke Pig as well; we’ll update this soon
Accessing Cassandra data from MariaDB
*Get data from Cassandra
MariaDB [test]> select * from t2;
+------+-------------------+-------+
| pk   | data1             | data2 |
+------+-------------------+-------+
| row1 | data-in-cassandra |  1234 |
+------+-------------------+-------+
*Insert data into Cassandra
MariaDB [test]> insert into t2 values ('row2','data-from-mariadb', 123);
*Ensure Cassandra sees inserted data
cqlsh:mariadbtest> select * from cf1;
 pk   | data1             | data2
------+-------------------+-------
 row1 | data-in-cassandra |  1234
 row2 | data-from-mariadb |   123
Data mapping between Cassandra and SQL
create table tbl (
pk varchar(36) primary key,
data1 varchar(60),
data2 bigint
) engine=cassandra keyspace='ks1' column_family='cf1'
*MariaDB table represents Cassandra’s Column Family
- can use any table name, column_family=... specifies CF
*Table must have a primary key
- name/type must match Cassandra’s rowkey
*Columns map to Cassandra’s static columns
- name must be same as in Cassandra, datatypes must match, can be subset of CF’s columns
Datatype mapping
Cassandra    MariaDB
blob         BLOB, VARBINARY(n)
ascii        BLOB, VARCHAR(n), use charset=latin1
text         BLOB, VARCHAR(n), use charset=utf8
varint       VARBINARY(n)
int          INT
bigint       BIGINT, TINY, SHORT
uuid         CHAR(36) (text in MariaDB)
timestamp    TIMESTAMP (second), TIMESTAMP(6) (microsecond), BIGINT
boolean      BOOL
float        FLOAT
double       DOUBLE
decimal      VARBINARY(n)
counter      BIGINT
Dynamic columns revisited
*Cassandra supports “dynamic column families”; their ad-hoc columns can be accessed through a dynamic columns blob
create table tbl
(
rowkey type PRIMARY KEY,
column1 type,
...
dynamic_cols blob DYNAMIC_COLUMN_STORAGE=yes
) engine=cassandra keyspace=... column_family=...;
insert into tbl values (1, column_create('col1', 1, 'col2', 'value-2'));
select rowkey, column_get(dynamic_cols, 'uuidcol' as char) from tbl;
All data mapping is safe
*CassandraSE will refuse incorrect mappings (throw errors)
create table t3 (pk varchar(60) primary key, no_such_field int)
engine=cassandra `keyspace`='mariadbtest' `column_family`='cf1';
ERROR 1928 (HY000): Internal error: 'Field `no_such_field` could not be mapped to any field in Cassandra'
create table t3 (pk varchar(60) primary key, data1 double)
engine=cassandra `keyspace`='mariadbtest' `column_family`='cf1';
ERROR 1928 (HY000): Internal error: 'Failed to map column data1 to datatype org.apache.cassandra.db.marshal.UTF8Type'
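The kind of check behind these errors can be sketched in a few lines of Python. This is an illustration built from the “Datatype mapping” slide, not CassandraSE's actual code: the table `ALLOWED`, the function names, and the error class are hypothetical, lengths and charsets are ignored, and the error texts only loosely imitate the real messages. `CF1` describes the `cf1` column family from the earlier cqlsh session (CQL `varchar` is an alias for `text`).

```python
# Allowed MariaDB base types per Cassandra type, transcribed from the
# "Datatype mapping" slide (length and charset details are ignored here).
ALLOWED = {
    "blob": {"BLOB", "VARBINARY"},
    "ascii": {"BLOB", "VARCHAR"},
    "text": {"BLOB", "VARCHAR"},
    "varint": {"VARBINARY"},
    "int": {"INT"},
    "bigint": {"BIGINT", "TINY", "SHORT"},
    "uuid": {"CHAR"},
    "timestamp": {"TIMESTAMP", "BIGINT"},
    "boolean": {"BOOL"},
    "float": {"FLOAT"},
    "double": {"DOUBLE"},
    "decimal": {"VARBINARY"},
    "counter": {"BIGINT"},
}

# Column family from the cqlsh example: column name -> Cassandra type.
CF1 = {"pk": "text", "data1": "text", "data2": "bigint"}


class MappingError(Exception):
    """Raised when a table definition cannot be mapped onto the column family."""


def base_type(sql_type):
    """Strip length/precision and normalise case: 'varchar(60)' -> 'VARCHAR'."""
    return sql_type.split("(")[0].strip().upper()


def check_mapping(table_columns, cf_columns):
    """Validate a dict of SQL column name -> SQL type against a column family.

    The SQL table may list only a subset of the column family's columns,
    but every listed column must exist there with a compatible type.
    """
    for name, sql_type in table_columns.items():
        if name not in cf_columns:
            raise MappingError(
                f"Field `{name}` could not be mapped to any field in Cassandra")
        cass_type = cf_columns[name]
        if base_type(sql_type) not in ALLOWED[cass_type]:
            raise MappingError(
                f"Failed to map column {name} to datatype {cass_type}")
```

With these definitions, `check_mapping({"pk": "varchar(36)", "data2": "bigint"}, CF1)` passes (a subset of columns is fine), while the two `t3` definitions above would raise `MappingError`.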
Command Mapping
*Cassandra commands
- PUT (upsert)
- GET (performs a scan)
- DELETE (if exists)
*SQL commands
- SELECT -> GET/Scan
- INSERT -> PUT (upsert)
- UPDATE/DELETE -> read/write
SELECT command mapping
*MariaDB has an SQL interpreter
*CassandraSE supports lookups and scans
*Can now do:
- arbitrary WHERE clauses
- JOINs between Cassandra tables and MariaDB tables (BKA supported)
Batched Key Access is fast!
select max(l_extendedprice) from orders, lineitem where
o_orderdate between $DATE1 and $DATE2 and
l_orderkey=o_orderkey
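The idea behind Batched Key Access can be illustrated with a toy model: instead of one remote lookup per outer row, join keys are collected into batches and each batch is fetched in a single multi-key request. The Python sketch below is an assumption-laden illustration, not MariaDB's implementation; `RemoteStore`, `join_one_by_one` and `join_batched` are hypothetical names, and a dict stands in for the remote column family.

```python
class RemoteStore:
    """Stand-in for a remote key/value store; counts simulated round trips."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.round_trips = 0

    def get(self, key):
        # One round trip per key.
        self.round_trips += 1
        return self.rows.get(key)

    def multi_get(self, keys):
        # One round trip for a whole batch of keys.
        self.round_trips += 1
        return {k: self.rows[k] for k in keys if k in self.rows}


def join_one_by_one(outer_rows, store):
    """Plain nested-loop join: a separate lookup for every outer row."""
    out = []
    for key, payload in outer_rows:
        match = store.get(key)
        if match is not None:
            out.append((key, payload, match))
    return out


def join_batched(outer_rows, store, batch_size=3):
    """Batched join: gather keys, fetch each batch with one request."""
    out = []
    for start in range(0, len(outer_rows), batch_size):
        batch = outer_rows[start:start + batch_size]
        found = store.multi_get({key for key, _ in batch})
        for key, payload in batch:
            if key in found:
                out.append((key, payload, found[key]))
    return out
```

For five outer rows and a batch size of three, the one-by-one join makes five simulated round trips and the batched join makes two, while both return the same matches. Network latency per request is what makes the difference matter against a remote store.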
DML command mapping
*No SQL semantics
- INSERT overwrites rows
- UPDATE reads, then writes (have you updated what you read?)
- DELETE reads, then writes (can’t be sure if/what you’ve deleted)
*CassandraSE doesn’t make it SQL!
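A toy model makes these caveats concrete. The Python sketch below is an illustration under simplifying assumptions (a single in-memory dict, last write wins, no timestamps or replication); `ColumnFamily`, `sql_insert` and `sql_update` are hypothetical names, not part of CassandraSE.

```python
class ColumnFamily:
    """Toy key -> row store with last-write-wins PUT semantics."""

    def __init__(self):
        self.rows = {}

    def put(self, key, columns):
        # Upsert: creates the row or silently overwrites the given columns.
        self.rows.setdefault(key, {}).update(columns)

    def get(self, key):
        row = self.rows.get(key)
        return dict(row) if row is not None else None

    def delete(self, key):
        # Succeeds whether or not the row existed.
        self.rows.pop(key, None)


def sql_insert(cf, key, columns):
    """INSERT maps to PUT, so there is no duplicate-key error."""
    cf.put(key, columns)


def sql_update(cf, key, change, between_read_and_write=None):
    """UPDATE maps to a read followed by an unconditional write."""
    row = cf.get(key)                  # step 1: read
    if row is None:
        return 0
    if between_read_and_write:         # hook simulating another writer
        between_read_and_write()
    cf.put(key, change(row))           # step 2: write back what we computed
    return 1


cf = ColumnFamily()
sql_insert(cf, "row1", {"hits": 1})
sql_insert(cf, "row1", {"hits": 10})   # second INSERT overwrites the first

# Another writer sets hits=100 between our read and our write;
# our write is computed from the stale value 10 and wins.
sql_update(cf, "row1",
           lambda row: {"hits": row["hits"] + 1},
           between_read_and_write=lambda: cf.put("row1", {"hits": 100}))
```

After this run the row holds `hits = 11`: the second INSERT replaced the first without complaint, and the concurrent write of 100 was lost, which is the “have you updated what you read?” problem.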
CassandraSE use cases
*Collect massive amounts of data like web page hits
*Collect massive amounts of data from sensors
*Updates are non-conflicting
- keyed by UUIDs, timestamps
*Reads are served with one lookup
*Good for certain kinds of data (though moving from SQL entirely may be difficult)
Access Cassandra data from SQL
*Send an update to Cassandra
- be a sensor
*Get a piece of data from Cassandra
- This webpage was last viewed by...
- Last known position of this user was...
- You are user number n of n-thousands...
From MariaDB...
*Want a table that is:
- auto-replicated
- fault-tolerant
- very fast
*Get Cassandra and create a CassandraSE table
CassandraSE non-use cases
*Huge joins that sift through all the data?
- use Pig
*Bulk data transfer to/from Cassandra Cluster?
- use Sqoop
*A replacement for InnoDB?
- remember, no full SQL semantics; InnoDB is useful for myriad reasons
A tiny benchmark
*One table
*Amazon EC2 environment
- m1.large nodes
- ephemeral disks
*Stream of single-line INSERTs
*Tried InnoDB & CassandraSE
*No tuning
A tiny benchmark II
*InnoDB with tuning, same setup as before
Conclusions
*CassandraSE can be used to peek at data in Cassandra from MariaDB
*It is not a replacement for Pig/Hive
*It is really easy to set up & use
Roadmap
*Do you want support for:
- fast counter column updates?
- awareness/discovery of Cassandra cluster topology?
- secondary indexes?
- ... ?
Resources
*https://kb.askmonty.org/en/cassandrase/
*http://wiki.apache.org/cassandra/DataModel
*http://cassandra.apache.org/
*http://www.datastax.com/docs/1.1/ddl/column_family
THANK YOU
Colin Charles | colin@mariadb.org | SkySQL Ab | http://mariadb.org/
@bytebot on Twitter | http://bytebot.net/blog/
Cassandra SE internals
*Developed against Cassandra 1.1
*Uses Thrift API
- cannot stream CQL resultset in 1.1
- cannot use secondary indexes
*Only supports AllowAllAuthenticator
*In Cassandra 1.2
- “CQL Binary Protocol” with streaming
- CASSANDRA-5234: Thrift can only read CFs “WITH COMPACT STORAGE”

 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 

Recently uploaded (20)

The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 

C* Summit 2013: Can't we all just get along? MariaDB and Cassandra by Colin Charles

  • 1. #CASSANDRA13 Colin Charles | colin@mariadb.org | SkySQL Ab | http://mariadb.org/ @bytebot on Twitter | http://bytebot.net/blog/ MariaDB and Cassandra Interoperability
  • 2. #CASSANDRA13 whoami *Work on MariaDB today *Formerly of MySQL AB (USD$1bil acquisition -> Sun Microsystems) *Worked on The Fedora Project & OpenOffice.org previously *Monty Program Ab is a major sponsor of MariaDB *SkySQL & Monty Program Ab merge *MariaDB governed by MariaDB Foundation
  • 3. #CASSANDRA13 What we will discuss today... *What is MariaDB? *MariaDB Architecture *The Cassandra Storage Engine (CassandraSE) *Data & Command Mapping *Use Cases *Benchmarks *Conclusions
  • 4. #CASSANDRA13 What is MariaDB? *Community developed, feature enhanced, backward compatible MySQL *Drop-in replacement to MySQL *Shipped in many Linux distributions as a default *Enhanced features: threadpool, table elimination, optimizer changes (subqueries materialize!), group commit in the replication binary log, HandlerSocket, SphinxSE, multi-source replication, dynamic columns
  • 6. #CASSANDRA13 MariaDB/MySQL and NoSQL *HandlerSocket *memcached access to InnoDB *Hadoop Applier *LevelDB Storage Engine *Cassandra Storage Engine *CONNECT Engine
  • 7. #CASSANDRA13 Dynamic Columns *Store a different set of columns for every row in the table *Basically a blob with handling functions (GET, CREATE, ADD, DELETE, EXISTS, LIST, JSON) *Dynamic columns can be nested *You can request rows in JSON format *You can now name dynamic columns as well INSERT INTO tbl SET dyncol_blob=COLUMN_CREATE("column_name", "value");
  • 8. #CASSANDRA13 Cassandra background *Distributed key/value store (limited range scan support), optionally flexible schema (pre-defined “static” columns, ad-hoc dynamic columns), automatic sharding/ replication, eventual consistency *Column families are like “tables” *Row key -> column mapping *Supercolumns are not supported
  • 9. #CASSANDRA13 CQL at work
      cqlsh> CREATE KEYSPACE mariadbtest
         ... WITH REPLICATION = {'class':'SimpleStrategy','replication_factor':1};
      cqlsh> use mariadbtest;
      cqlsh:mariadbtest> create columnfamily cf1 ( pk varchar primary key, data1 varchar, data2 bigint ) with compact storage;
      cqlsh:mariadbtest> insert into cf1 (pk, data1, data2) values ('row1', 'data-in-cassandra', 1234);
      cqlsh:mariadbtest> select * from cf1;
       pk   | data1             | data2
      ------+-------------------+-------
       row1 | data-in-cassandra |  1234
      cqlsh:mariadbtest> select * from cf1 where pk='row1';
       pk   | data1             | data2
      ------+-------------------+-------
       row1 | data-in-cassandra |  1234
      cqlsh:mariadbtest> select * from cf1 where data2=1234;
      Bad Request: No indexed columns present in by-columns clause with Equal operator
      cqlsh:mariadbtest> select * from cf1 where pk='row1' or pk='row2';
      Bad Request: line 1:34 missing EOF at 'or'
  • 10. #CASSANDRA13 CQL *Looks like SQL at first glance *No joins or subqueries *No GROUP BY, ORDER BY must be able to use available indexes *WHERE clause must represent an index lookup *Simple goal of the Cassandra Storage Engine? Provide a “view” of Cassandra’s data from MariaDB
  • 11. #CASSANDRA13 Getting started
      *Get MariaDB 10.0.2 from https://downloads.mariadb.org/
      *Load the Cassandra plugin
        - From SQL:
          MariaDB [(none)]> install plugin cassandra soname 'ha_cassandra.so';
        - Or start it from my.cnf
          [mysqld]
          ...
          plugin-load=ha_cassandra.so
  • 12. #CASSANDRA13 Is everything ok?
      *Check to see that it is loaded - SHOW PLUGINS
      MariaDB [(none)]> show plugins;
      +--------------------+--------+-----------------+-----------------+---------+
      | Name               | Status | Type            | Library         | License |
      +--------------------+--------+-----------------+-----------------+---------+
      ...
      | CASSANDRA          | ACTIVE | STORAGE ENGINE  | ha_cassandra.so | GPL     |
      +--------------------+--------+-----------------+-----------------+---------+
  • 13. #CASSANDRA13 Create an SQL table which is a view of a column family
      MariaDB [test]> set global cassandra_default_thrift_host='10.196.2.113';
      MariaDB [test]> create table t2 (pk varchar(36) primary key,
          ->   data1 varchar(60),
          ->   data2 bigint
          -> ) engine=cassandra
          ->   keyspace='mariadbtest'
          ->   thrift_host='10.196.2.113'
          ->   column_family='cf1';
      *thrift_host can be set per-table
      *@@cassandra_default_thrift_host lets you re-point the table to a different node dynamically, without changing the table DDL when the Cassandra IP changes
  • 14. #CASSANDRA13 Potential issues *SELinux blocks the connection ERROR 1429 (HY000): Unable to connect to foreign data source: connect() failed: Permission denied [1] *Disable SELinux: echo 0 > /selinux/enforce *Cassandra 1.2 with Column Families (CFs) without “COMPACT STORAGE” ERROR 1429 (HY000): Unable to connect to foreign data source: Column family cf1 not found in keyspace mariadbtest *Change in Cassandra 1.2 that broke Pig as well; we’ll update this soon
  • 15. #CASSANDRA13 Accessing Cassandra data from MariaDB
      *Get data from Cassandra
      MariaDB [test]> select * from t2;
      +------+-------------------+-------+
      | pk   | data1             | data2 |
      +------+-------------------+-------+
      | row1 | data-in-cassandra |  1234 |
      +------+-------------------+-------+
      *Insert data into Cassandra
      MariaDB [test]> insert into t2 values ('row2','data-from-mariadb', 123);
      *Ensure Cassandra sees inserted data
      cqlsh:mariadbtest> select * from cf1;
       pk   | data1             | data2
      ------+-------------------+-------
       row1 | data-in-cassandra |  1234
       row2 | data-from-mariadb |   123
  • 16. #CASSANDRA13 Data mapping between Cassandra and SQL create table tbl ( pk varchar(36) primary key, data1 varchar(60), data2 bigint ) engine=cassandra keyspace='ks1' column_family='cf1' *MariaDB table represents Cassandra’s Column Family - can use any table name, column_family=... specifies CF
  • 17. #CASSANDRA13 Data mapping between Cassandra and SQL create table tbl ( pk varchar(36) primary key, data1 varchar(60), data2 bigint ) engine=cassandra keyspace='ks1' column_family='cf1' *MariaDB table represents Cassandra’s Column Family - can use any table name, column_family=... specifies CF *Table must have a primary key - name/type must match Cassandra’s rowkey
  • 18. #CASSANDRA13 Data mapping between Cassandra and SQL create table tbl ( pk varchar(36) primary key, data1 varchar(60), data2 bigint ) engine=cassandra keyspace='ks1' column_family='cf1' *MariaDB table represents Cassandra’s Column Family - can use any table name, column_family=... specifies CF *Table must have a primary key - name/type must match Cassandra’s rowkey *Columns map to Cassandra’s static columns - name must be same as in Cassandra, datatypes must match, can be subset of CF’s columns
  • 19. #CASSANDRA13 Datatype mapping
      Cassandra    MariaDB
      blob         BLOB, VARBINARY(n)
      ascii        BLOB, VARCHAR(n), use charset=latin1
      text         BLOB, VARCHAR(n), use charset=utf8
      varint       VARBINARY(n)
      int          INT
      bigint       BIGINT, TINY, SHORT
      uuid         CHAR(36) (text in MariaDB)
      timestamp    TIMESTAMP (second), TIMESTAMP(6) (microsecond), BIGINT
      boolean      BOOL
      float        FLOAT
      double       DOUBLE
      decimal      VARBINARY(n)
      counter      BIGINT
  • 20. #CASSANDRA13 Dynamic columns revisited
      *Cassandra supports “dynamic column families”, so ad-hoc columns can be accessed
      create table tbl (
        rowkey type PRIMARY KEY,
        column1 type,
        ...
        dynamic_cols blob DYNAMIC_COLUMN_STORAGE=yes
      ) engine=cassandra keyspace=... column_family=...;
      insert into tbl values (1, column_create('col1', 1, 'col2', 'value-2'));
      select rowkey, column_get(dynamic_cols, 'uuidcol' as char) from tbl;
  • 21. #CASSANDRA13 All data mapping is safe
      *CassandraSE will refuse incorrect mappings (throw errors)
      create table t3 (pk varchar(60) primary key, no_such_field int)
        engine=cassandra `keyspace`='mariadbtest' `column_family`='cf1';
      ERROR 1928 (HY000): Internal error: 'Field `no_such_field` could not be mapped to any field in Cassandra'
      create table t3 (pk varchar(60) primary key, data1 double)
        engine=cassandra `keyspace`='mariadbtest' `column_family`='cf1';
      ERROR 1928 (HY000): Internal error: 'Failed to map column data1 to datatype org.apache.cassandra.db.marshal.UTF8Type'
  • 22. #CASSANDRA13 Command Mapping *Cassandra commands - PUT (upsert) - GET (performs a scan) - DELETE (if exists) *SQL commands - SELECT -> GET/Scan - INSERT -> PUT (upsert) - UPDATE/DELETE -> read/write
  • 23. #CASSANDRA13 SELECT command mapping *MariaDB has an SQL interpreter *CassandraSE supports lookups and scans *Can now do: - arbitrary WHERE clauses - JOINs between Cassandra tables and MariaDB tables (BKA supported)
  • 24. #CASSANDRA13 Batched Key Access is fast! select max(l_extendedprice) from orders, lineitem where o_orderdate between $DATE1 and $DATE2 and l_orderkey=o_orderkey
  • 25. #CASSANDRA13 DML command mapping *No SQL semantics - INSERT overwrites rows - UPDATE reads, then writes (have you updated what you read?) - DELETE reads, then writes (can’t be sure if/what you’ve deleted) *CassandraSE doesn’t make it SQL!
  • 26. #CASSANDRA13 CassandraSE use cases *Collect massive amounts of data like web page hits *Collect massive amounts of data from sensors *Updates are non-conflicting - keyed by UUIDs, timestamps *Reads are served with one lookup *Good for certain kinds of data (though moving from SQL entirely may be difficult)
  • 27. #CASSANDRA13 Access Cassandra data from SQL *Send an update to Cassandra - be a sensor *Get a piece of data from Cassandra - This webpage was last viewed by... - Last known position of this user was... - You are user number n of n-thousands...
  • 28. #CASSANDRA13 From MariaDB... *Want a table that is: - auto-replicated - fault-tolerant - very fast *Get Cassandra and create a CassandraSE table
  • 29. #CASSANDRA13 CassandraSE non-use cases *Huge joins that sift through all the data? - use Pig *Bulk data transfer to/from a Cassandra cluster? - use Sqoop *A replacement for InnoDB? - remember there are no full SQL semantics; InnoDB is useful for myriad reasons
  • 30. #CASSANDRA13 A tiny benchmark *One table *Amazon EC2 environment - m1.large nodes - ephemeral disks *Stream of single-line INSERTs *Tried InnoDB & CassandraSE *No tuning
  • 31. #CASSANDRA13 A tiny benchmark II *InnoDB with tuning, same setup as before
  • 32. #CASSANDRA13 Conclusions *CassandraSE can be used to peek at data in Cassandra from MariaDB *It is not a replacement for Pig/Hive *It is really easy to set up & use
  • 33. #CASSANDRA13 Roadmap *Do you want support for: - fast counter column updates? - awareness/discovery of Cassandra cluster topology? - secondary indexes? - ... ?
  • 35. #CASSANDRA13 THANK YOU Colin Charles | colin@mariadb.org | SkySQL Ab | http://mariadb.org/ @bytebot on Twitter | http://bytebot.net/blog/
  • 36. #CASSANDRA13 Cassandra SE internals *Developed against Cassandra 1.1 *Uses Thrift API - cannot stream CQL resultset in 1.1 - cannot use secondary indexes *Only supports AllowAllAuthenticator *In Cassandra 1.2 - “CQL Binary Protocol” with streaming - CASSANDRA-5234: Thrift can only read CFs “WITH COMPACT STORAGE”
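
The command mapping on slides 22 and 25 (INSERT is an upsert, UPDATE and DELETE read first and then write) can be illustrated with a toy model. This is a minimal Python sketch, not CassandraSE code: `ColumnFamily`, `sql_insert`, `sql_update` and `sql_delete` are hypothetical names invented here, and an in-memory dict stands in for a Cassandra column family.

```python
class ColumnFamily:
    """Toy stand-in for a column family: row key -> {column name: value}."""

    def __init__(self):
        self.rows = {}

    def put(self, key, columns):
        # Upsert: creates the row if absent, otherwise overwrites the given
        # columns. It never fails because the key already exists.
        self.rows.setdefault(key, {}).update(columns)

    def get(self, key):
        # Returns a copy of the row, or None when the key is unknown.
        row = self.rows.get(key)
        return dict(row) if row is not None else None

    def delete(self, key):
        # Deleting a missing key is not an error.
        self.rows.pop(key, None)


def sql_insert(cf, key, columns):
    # INSERT -> PUT: silently overwrites an existing row (no duplicate-key error).
    cf.put(key, columns)


def sql_update(cf, key, changes):
    # UPDATE -> read, then write. Nothing ties the two steps together, so
    # another writer could change the row between them.
    if cf.get(key) is None:
        return 0
    cf.put(key, changes)
    return 1


def sql_delete(cf, key):
    # DELETE -> read, then write; the reported count is only as good as the
    # earlier read.
    existed = cf.get(key) is not None
    cf.delete(key)
    return 1 if existed else 0
```

In this model, inserting `'row2'` twice leaves only the second set of values, which is the "INSERT overwrites rows" behaviour the slides warn about; the affected-row counts returned by the update and delete helpers are part of the illustration, not a claim about what the real engine reports.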