SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Downloaden Sie, um offline zu lesen
Developing with Cassandra
In memory Key-Value:
Redis
CouchBase
File system
BerkeleyDB
Distributed file system
HDFS
Velocity
Volume
Variety
In memory NewSQL /
domain specific
VoltDB
Traditional
[SQL] RDBMS
MongoDB
Distributed DBMS
Cassandra
HBase
CAP theorem:
• Consistency
• Availability
• Partition tolerance
- pick two
Eventual consistency
Transactional?
• No
Choose the DB
Why select Cassandra?
• [relatively] easy to setup
• [relatively] easy to use
• ~zero routine ops
• it works (!!) as promised:
o real-time replication
o node/site failure recovery
o zero load writes
o double of nodes = double of speed
Because Cassandra is Fast!
But needs some time to deliver
• 12'000 WPS on a laptop
• ~0.1 / 1 ms constant latency for writes/reads
Check References:
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf
Scalability
Good For:
• log-like data
TTL helps
• massive writes
1M WPS enough?
• simple real-time analytics
Not So Good For:
• dump of junk
(consider HDFS)
• OLAP
(depends on "O")
Good and Not So Good
Distributed DBMS
Just DBMS - closed monolithic solution
o not a platform to run custom code (as MongoDB);
o not an extension (as HBase);
o highly optimized
No-master, eventually consistent
NoSQL
Data model - Key-Value
http://cassandra.apache.org
Apache Cassandra
Developed at Facebook for Inbox search
Released to open source in 2008
In use:
• Netflix - main non-content data store~500 Cassandra nodes (2012)
• eBay - recommendation system"dozens of nodes", 200 TB storage (2012)
• Twitter - tweet analysis100 + TB of data
• More clients: (http://www.datastax.com/cassandrausers)
History
1.0 - October 2011
1.1 - April 2012
1.2 - January 2013
2.0 - expected this summer (2013)
June 26 2013 - 158 bugs, 89 worth to notice
Sperasoft Experience:
• hit 1 bug in production (stability issue)
• hit 1 bug in QA (in a crafted case)
Mature & Agile
Apache .tar.gz and Debian
packageshttp://cassandra.apache.org/download/
DataStax DSC - Cassandra + OpsCenter
http://planetcassandra.org/Download/DataStaxCommunityEdition
Embedded – for funct. tests on Java apps
Maven
Documentation
http://wiki.apache.org/cassandra/
http://www.datastax.com/docs
Distributions
http://www.slideshare.net/planetcassandra/5-andy-cobley-raspberry-pi
CPU: ARM 700 MHz
RAM: 500 MB
Storage: SD card
Price: $25
200 WPS!
What Hardware?
Bare metal
CPU: 8 cores (4 works too)
RAM: 16 - 64 GB (min 8 GB)
Storage: rotating disks 3 - 5 TB total (SSD better)
VM works too, but...
Storage: local disks, avoid NAS
More on Hardware
Software Yes & No (production)
?
Software for Development: All Yes
Plus more: Java, Python, C#, PHP, Ruby, Clojure, Go, R, ...
KeyspaceKeyspace
Column
family
Column
family
~ RDBMS DB / Schema
~ RDBMS table
Key1
Key2
Value
Column
Row
Clustering key
Partitioning key Map < ... , Map < ... , ... > >
Data Model
Node 3Node 2Node 1
CFCFCF
1
2
3
Client
Parallel reads,writes
1
2
3
Data on Discs
https://github.com/datastax/java-driver
Client API Options
Thrift RPC Native protocol + CQL3
Apache Thrift Custom protocol
Synchronous Asynchronous
Schema-less Static schema
Store & Forward Cursors promised in 2.0
API for any language Java; Python, C# coming
Cryptic API JDBC-like API
Supported yet Going forward
• Forget RDB design principles
• Forget abstract data model
- shape data for queries
• No joins - materialized views
• Data duplication - OK
• Remember eventual consistency
• Queries are precious
• Use right data types - timestamp, uuid
Why? Because NoSQL is a low level tool for high optimization.
Data Modeling for NoSQL
Do & Don't
Patterns and Anti-patterns
Query
Wait
Read
Query
Wait
Read
Parallel it:
slo-o-ow
Sequential Read
CREATE TABLE timeline (
event uuid,
timestamp timeuuid,
...
PRIMARY KEY (event, timestamp)
);
event:date
timestamp
...
CREATE TABLE timeline (
event uuid,
date long,
timestamp timeuuid,
...
PRIMARY KEY ((event, date), timestamp)
);
event
timestamp
...
Still bad - need sharding
http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
Long rows - Cassandra handle 2G columns, but... slo-o-ow
Timeline
Insert = Update = Delete
A B C D1
Y1
Z1
1
A Y Z1
a b c d
UPDATE ... SET b = 'Y' WHERE id = 1
INSERT INTO ... SET (id, c) values (1, 'Z')
DELETE d FROM ... WHERE id = 1
SELECT * FROM ... WHERE id = 1
have to fetch 4 rows
slo-o-ow
Plan Data Immutable
http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
Q
Queue: INSERT INTO ...
SET( name, enqueued_at, payload )
VALUES ( 'Q', now(), ... )
Dequeue: DELETE payload FROM ...
WHERE name = 'Q'
AND enqueued_at = ...
Pick the next: SELECT * FROM ...
WHERE id = 1 LIMIT 1
*Q
*Q
*Q
*Q
Q
Q
*? ? ?Q
have to fetch 4 rowsslo-o-ow
Queue
Remember - eventual consistency.
Concurrent updates
=> wrong count
SELECT count(*)
FROM ... WHERE .... ;
Full scan over the selection
=>
Default 10'000 rows limit
=> wrong count
Have an integer column and increment it
CREATE TABLE count_table (
id
uuid,
value counter,
PRIMARY KEY (id)
);
...
UPDATE count_table
SET value = value + 1
WHERE id = ... ;
Counter column family
http://www.datastax.com/documentation/cassandr
a/1.2/cassandra/cql_using/use_counter_t.html
slo-o-ow
mess
mess
How Many?
CREATE TABLE blob (
id uuid,
data blob,
PRIMARY KEY (id)
);
id
chunk_no
data
CREATE TABLE blob (
id uuid,
chunk_no int,
data blob,
PRIMARY KEY (id, chunk_no)
);
id data
http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage
http://wiki.apache.org/cassandra/CassandraLimitations
OutOfMemory
Blobs
Unbounded Queries
Result set
Client
OutOfMemory
OutOfMemory
Cursors. Planned to 2.0.
https://issues.apache.org/jira/browse/CASSANDRA-4415
C* clients use RPC yet
Cassandra
slo-o-ow
Column names - data... yet
Keep them short.
Node 3Node 1
CF
3
Client
2
Node 3Node 2
Why??
slo-o-ow
hot spots
Limit Client to a node
Helpful Links
Sperasoft @ slideshare: http://www.slideshare.net/Sperasoft
Sperasoft @ speakerdeck: https://speakerdeck.com/sperasoft
Sperasoft @ github: https://github.com/Sperasoft/Workshop

Weitere ähnliche Inhalte

Was ist angesagt?

DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax Academy
 
Scaling Cassandra for Big Data
Scaling Cassandra for Big DataScaling Cassandra for Big Data
Scaling Cassandra for Big DataDataStax Academy
 
Tuning tips for Apache Spark Jobs
Tuning tips for Apache Spark JobsTuning tips for Apache Spark Jobs
Tuning tips for Apache Spark JobsSamir Bessalah
 
Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015Cosmin Lehene
 
(SDD403) Amazon RDS for MySQL Deep Dive | AWS re:Invent 2014
(SDD403) Amazon RDS for MySQL Deep Dive | AWS re:Invent 2014(SDD403) Amazon RDS for MySQL Deep Dive | AWS re:Invent 2014
(SDD403) Amazon RDS for MySQL Deep Dive | AWS re:Invent 2014Amazon Web Services
 
Understanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpUnderstanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpDataStax
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyDataStax Academy
 
C* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonC* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonDataStax Academy
 
Debugging & Tuning in Spark
Debugging & Tuning in SparkDebugging & Tuning in Spark
Debugging & Tuning in SparkShiao-An Yuan
 
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...DataStax
 
Ceph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake SolutionCeph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake SolutionKaran Singh
 
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...DataStax Academy
 
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSCassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSDataStax Academy
 
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!ScyllaDB
 
(SDD409) Amazon RDS for PostgreSQL Deep Dive | AWS re:Invent 2014
(SDD409) Amazon RDS for PostgreSQL Deep Dive | AWS re:Invent 2014(SDD409) Amazon RDS for PostgreSQL Deep Dive | AWS re:Invent 2014
(SDD409) Amazon RDS for PostgreSQL Deep Dive | AWS re:Invent 2014Amazon Web Services
 
AWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data AnalyticsAWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data AnalyticsKeeyong Han
 
Scylla Summit 2018: What's New in Scylla Manager?
Scylla Summit 2018: What's New in Scylla Manager?Scylla Summit 2018: What's New in Scylla Manager?
Scylla Summit 2018: What's New in Scylla Manager?ScyllaDB
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0J.B. Langston
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuRedis Labs
 
Automation of Hadoop cluster operations in Arm Treasure Data
Automation of Hadoop cluster operations in Arm Treasure DataAutomation of Hadoop cluster operations in Arm Treasure Data
Automation of Hadoop cluster operations in Arm Treasure DataYan Wang
 

Was ist angesagt? (20)

DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The Sequel
 
Scaling Cassandra for Big Data
Scaling Cassandra for Big DataScaling Cassandra for Big Data
Scaling Cassandra for Big Data
 
Tuning tips for Apache Spark Jobs
Tuning tips for Apache Spark JobsTuning tips for Apache Spark Jobs
Tuning tips for Apache Spark Jobs
 
Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015
 
(SDD403) Amazon RDS for MySQL Deep Dive | AWS re:Invent 2014
(SDD403) Amazon RDS for MySQL Deep Dive | AWS re:Invent 2014(SDD403) Amazon RDS for MySQL Deep Dive | AWS re:Invent 2014
(SDD403) Amazon RDS for MySQL Deep Dive | AWS re:Invent 2014
 
Understanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpUnderstanding DSE Search by Matt Stump
Understanding DSE Search by Matt Stump
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al Tobey
 
C* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonC* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick Branson
 
Debugging & Tuning in Spark
Debugging & Tuning in SparkDebugging & Tuning in Spark
Debugging & Tuning in Spark
 
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
 
Ceph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake SolutionCeph Object Storage Performance Secrets and Ceph Data Lake Solution
Ceph Object Storage Performance Secrets and Ceph Data Lake Solution
 
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
 
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWSCassandra Summit 2014: Performance Tuning Cassandra in AWS
Cassandra Summit 2014: Performance Tuning Cassandra in AWS
 
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
Scylla Summit 2018: Keeping Your Latency SLAs No Matter What!
 
(SDD409) Amazon RDS for PostgreSQL Deep Dive | AWS re:Invent 2014
(SDD409) Amazon RDS for PostgreSQL Deep Dive | AWS re:Invent 2014(SDD409) Amazon RDS for PostgreSQL Deep Dive | AWS re:Invent 2014
(SDD409) Amazon RDS for PostgreSQL Deep Dive | AWS re:Invent 2014
 
AWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data AnalyticsAWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data Analytics
 
Scylla Summit 2018: What's New in Scylla Manager?
Scylla Summit 2018: What's New in Scylla Manager?Scylla Summit 2018: What's New in Scylla Manager?
Scylla Summit 2018: What's New in Scylla Manager?
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
 
Automation of Hadoop cluster operations in Arm Treasure Data
Automation of Hadoop cluster operations in Arm Treasure DataAutomation of Hadoop cluster operations in Arm Treasure Data
Automation of Hadoop cluster operations in Arm Treasure Data
 

Andere mochten auch

BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraVictor Coustenoble
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataVictor Coustenoble
 
Real-time Cassandra
Real-time CassandraReal-time Cassandra
Real-time CassandraAcunu
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012jbellis
 
Cassandra DataTables Using RESTful API
Cassandra DataTables Using RESTful APICassandra DataTables Using RESTful API
Cassandra DataTables Using RESTful APISimran Kedia
 
Using Cassandra with your Web Application
Using Cassandra with your Web ApplicationUsing Cassandra with your Web Application
Using Cassandra with your Web Applicationsupertom
 
Cassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisCassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisDuyhai Doan
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparisonshsedghi
 
Application Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a ServiceApplication Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a ServiceWSO2
 
Node.js and Cassandra
Node.js and CassandraNode.js and Cassandra
Node.js and CassandraStratio
 
NodeJS : Communication and Round Robin Way
NodeJS : Communication and Round Robin WayNodeJS : Communication and Round Robin Way
NodeJS : Communication and Round Robin WayEdureka!
 
69 claves para conocer Big Data
69 claves para conocer Big Data69 claves para conocer Big Data
69 claves para conocer Big DataStratebi
 

Andere mochten auch (12)

BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache Cassandra
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
 
Real-time Cassandra
Real-time CassandraReal-time Cassandra
Real-time Cassandra
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
 
Cassandra DataTables Using RESTful API
Cassandra DataTables Using RESTful APICassandra DataTables Using RESTful API
Cassandra DataTables Using RESTful API
 
Using Cassandra with your Web Application
Using Cassandra with your Web ApplicationUsing Cassandra with your Web Application
Using Cassandra with your Web Application
 
Cassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisCassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS Paris
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparison
 
Application Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a ServiceApplication Development with Apache Cassandra as a Service
Application Development with Apache Cassandra as a Service
 
Node.js and Cassandra
Node.js and CassandraNode.js and Cassandra
Node.js and Cassandra
 
NodeJS : Communication and Round Robin Way
NodeJS : Communication and Round Robin WayNodeJS : Communication and Round Robin Way
NodeJS : Communication and Round Robin Way
 
69 claves para conocer Big Data
69 claves para conocer Big Data69 claves para conocer Big Data
69 claves para conocer Big Data
 

Ähnlich wie Developing with Cassandra

Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at PollfishPollfish
 
High concurrency,
Low latency analytics
using Spark/Kudu
 High concurrency,
Low latency analytics
using Spark/Kudu High concurrency,
Low latency analytics
using Spark/Kudu
High concurrency,
Low latency analytics
using Spark/KuduChris George
 
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them AllScylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them AllScyllaDB
 
Quick trip around the Cosmos - Things every astronaut supposed to know
Quick trip around the Cosmos - Things every astronaut supposed to knowQuick trip around the Cosmos - Things every astronaut supposed to know
Quick trip around the Cosmos - Things every astronaut supposed to knowRafał Hryniewski
 
ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016Tzach Livyatan
 
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKIntroduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKSkills Matter
 
What's new in Jewel and Beyond
What's new in Jewel and BeyondWhat's new in Jewel and Beyond
What's new in Jewel and BeyondSage Weil
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle CoherenceBen Stopford
 
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...DataStax Academy
 
Puppet and Apache CloudStack
Puppet and Apache CloudStackPuppet and Apache CloudStack
Puppet and Apache CloudStackPuppet
 
Infrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStackInfrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStackke4qqq
 
Postgres the hardway
Postgres the hardwayPostgres the hardway
Postgres the hardwayDave Pitts
 
OSv at Cassandra Summit
OSv at Cassandra SummitOSv at Cassandra Summit
OSv at Cassandra SummitDon Marti
 
Reusable, composable, battle-tested Terraform modules
Reusable, composable, battle-tested Terraform modulesReusable, composable, battle-tested Terraform modules
Reusable, composable, battle-tested Terraform modulesYevgeniy Brikman
 
Under The Hood Of A Shard-Per-Core Database Architecture
Under The Hood Of A Shard-Per-Core Database ArchitectureUnder The Hood Of A Shard-Per-Core Database Architecture
Under The Hood Of A Shard-Per-Core Database ArchitectureScyllaDB
 
OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"Giivee The
 
High-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and JavaHigh-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and Javasunnygleason
 
3 Dundee-Spark Overview for C* developers
3 Dundee-Spark Overview for C* developers3 Dundee-Spark Overview for C* developers
3 Dundee-Spark Overview for C* developersChristopher Batey
 

Ähnlich wie Developing with Cassandra (20)

Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
 
High concurrency,
Low latency analytics
using Spark/Kudu
 High concurrency,
Low latency analytics
using Spark/Kudu High concurrency,
Low latency analytics
using Spark/Kudu
High concurrency,
Low latency analytics
using Spark/Kudu
 
Making KVS 10x Scalable
Making KVS 10x ScalableMaking KVS 10x Scalable
Making KVS 10x Scalable
 
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them AllScylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
 
Quick trip around the Cosmos - Things every astronaut supposed to know
Quick trip around the Cosmos - Things every astronaut supposed to knowQuick trip around the Cosmos - Things every astronaut supposed to know
Quick trip around the Cosmos - Things every astronaut supposed to know
 
ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016
 
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKIntroduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
 
What's new in Jewel and Beyond
What's new in Jewel and BeyondWhat's new in Jewel and Beyond
What's new in Jewel and Beyond
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
 
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
Cassandra Summit 2014: Down with Tweaking! Removing Tunable Complexity for Ca...
 
Puppet and Apache CloudStack
Puppet and Apache CloudStackPuppet and Apache CloudStack
Puppet and Apache CloudStack
 
Infrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStackInfrastructure as code with Puppet and Apache CloudStack
Infrastructure as code with Puppet and Apache CloudStack
 
Postgres the hardway
Postgres the hardwayPostgres the hardway
Postgres the hardway
 
OSv at Cassandra Summit
OSv at Cassandra SummitOSv at Cassandra Summit
OSv at Cassandra Summit
 
Reusable, composable, battle-tested Terraform modules
Reusable, composable, battle-tested Terraform modulesReusable, composable, battle-tested Terraform modules
Reusable, composable, battle-tested Terraform modules
 
Under The Hood Of A Shard-Per-Core Database Architecture
Under The Hood Of A Shard-Per-Core Database ArchitectureUnder The Hood Of A Shard-Per-Core Database Architecture
Under The Hood Of A Shard-Per-Core Database Architecture
 
OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"
 
High-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and JavaHigh-Performance Storage Services with HailDB and Java
High-Performance Storage Services with HailDB and Java
 
3 Dundee-Spark Overview for C* developers
3 Dundee-Spark Overview for C* developers3 Dundee-Spark Overview for C* developers
3 Dundee-Spark Overview for C* developers
 

Mehr von Sperasoft

особенности работы с Locomotion в Unreal Engine 4
особенности работы с Locomotion в Unreal Engine 4особенности работы с Locomotion в Unreal Engine 4
особенности работы с Locomotion в Unreal Engine 4Sperasoft
 
концепт и архитектура геймплея в Creach: The Depleted World
концепт и архитектура геймплея в Creach: The Depleted Worldконцепт и архитектура геймплея в Creach: The Depleted World
концепт и архитектура геймплея в Creach: The Depleted WorldSperasoft
 
Опыт разработки VR игры для UE4
Опыт разработки VR игры для UE4Опыт разработки VR игры для UE4
Опыт разработки VR игры для UE4Sperasoft
 
Организация работы с UE4 в команде до 20 человек
Организация работы с UE4 в команде до 20 человек Организация работы с UE4 в команде до 20 человек
Организация работы с UE4 в команде до 20 человек Sperasoft
 
Gameplay Tags
Gameplay TagsGameplay Tags
Gameplay TagsSperasoft
 
Data Driven Gameplay in UE4
Data Driven Gameplay in UE4Data Driven Gameplay in UE4
Data Driven Gameplay in UE4Sperasoft
 
Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks Sperasoft
 
The theory of relational databases
The theory of relational databasesThe theory of relational databases
The theory of relational databasesSperasoft
 
Automated layout testing using Galen Framework
Automated layout testing using Galen FrameworkAutomated layout testing using Galen Framework
Automated layout testing using Galen FrameworkSperasoft
 
Sperasoft talks: Android Security Threats
Sperasoft talks: Android Security ThreatsSperasoft talks: Android Security Threats
Sperasoft talks: Android Security ThreatsSperasoft
 
Sperasoft Talks: RxJava Functional Reactive Programming on Android
Sperasoft Talks: RxJava Functional Reactive Programming on AndroidSperasoft Talks: RxJava Functional Reactive Programming on Android
Sperasoft Talks: RxJava Functional Reactive Programming on AndroidSperasoft
 
Sperasoft‬ talks j point 2015
Sperasoft‬ talks j point 2015Sperasoft‬ talks j point 2015
Sperasoft‬ talks j point 2015Sperasoft
 
Effective Мeetings
Effective МeetingsEffective Мeetings
Effective МeetingsSperasoft
 
Unreal Engine 4 Introduction
Unreal Engine 4 IntroductionUnreal Engine 4 Introduction
Unreal Engine 4 IntroductionSperasoft
 
JIRA Development
JIRA DevelopmentJIRA Development
JIRA DevelopmentSperasoft
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchSperasoft
 
MOBILE DEVELOPMENT with HTML, CSS and JS
MOBILE DEVELOPMENT with HTML, CSS and JSMOBILE DEVELOPMENT with HTML, CSS and JS
MOBILE DEVELOPMENT with HTML, CSS and JSSperasoft
 
Quick Intro Into Kanban
Quick Intro Into KanbanQuick Intro Into Kanban
Quick Intro Into KanbanSperasoft
 
ECMAScript 6 Review
ECMAScript 6 ReviewECMAScript 6 Review
ECMAScript 6 ReviewSperasoft
 
Console Development in 15 minutes
Console Development in 15 minutesConsole Development in 15 minutes
Console Development in 15 minutesSperasoft
 

Mehr von Sperasoft (20)

особенности работы с Locomotion в Unreal Engine 4
особенности работы с Locomotion в Unreal Engine 4особенности работы с Locomotion в Unreal Engine 4
особенности работы с Locomotion в Unreal Engine 4
 
концепт и архитектура геймплея в Creach: The Depleted World
концепт и архитектура геймплея в Creach: The Depleted Worldконцепт и архитектура геймплея в Creach: The Depleted World
концепт и архитектура геймплея в Creach: The Depleted World
 
Опыт разработки VR игры для UE4
Опыт разработки VR игры для UE4Опыт разработки VR игры для UE4
Опыт разработки VR игры для UE4
 
Организация работы с UE4 в команде до 20 человек
Организация работы с UE4 в команде до 20 человек Организация работы с UE4 в команде до 20 человек
Организация работы с UE4 в команде до 20 человек
 
Gameplay Tags
Gameplay TagsGameplay Tags
Gameplay Tags
 
Data Driven Gameplay in UE4
Data Driven Gameplay in UE4Data Driven Gameplay in UE4
Data Driven Gameplay in UE4
 
Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks Code and Memory Optimisation Tricks
Code and Memory Optimisation Tricks
 
The theory of relational databases
The theory of relational databasesThe theory of relational databases
The theory of relational databases
 
Automated layout testing using Galen Framework
Automated layout testing using Galen FrameworkAutomated layout testing using Galen Framework
Automated layout testing using Galen Framework
 
Sperasoft talks: Android Security Threats
Sperasoft talks: Android Security ThreatsSperasoft talks: Android Security Threats
Sperasoft talks: Android Security Threats
 
Sperasoft Talks: RxJava Functional Reactive Programming on Android
Sperasoft Talks: RxJava Functional Reactive Programming on AndroidSperasoft Talks: RxJava Functional Reactive Programming on Android
Sperasoft Talks: RxJava Functional Reactive Programming on Android
 
Sperasoft‬ talks j point 2015
Sperasoft‬ talks j point 2015Sperasoft‬ talks j point 2015
Sperasoft‬ talks j point 2015
 
Effective Мeetings
Effective МeetingsEffective Мeetings
Effective Мeetings
 
Unreal Engine 4 Introduction
Unreal Engine 4 IntroductionUnreal Engine 4 Introduction
Unreal Engine 4 Introduction
 
JIRA Development
JIRA DevelopmentJIRA Development
JIRA Development
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
MOBILE DEVELOPMENT with HTML, CSS and JS
MOBILE DEVELOPMENT with HTML, CSS and JSMOBILE DEVELOPMENT with HTML, CSS and JS
MOBILE DEVELOPMENT with HTML, CSS and JS
 
Quick Intro Into Kanban
Quick Intro Into KanbanQuick Intro Into Kanban
Quick Intro Into Kanban
 
ECMAScript 6 Review
ECMAScript 6 ReviewECMAScript 6 Review
ECMAScript 6 Review
 
Console Development in 15 minutes
Console Development in 15 minutesConsole Development in 15 minutes
Console Development in 15 minutes
 

Kürzlich hochgeladen

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 

Kürzlich hochgeladen (20)

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 

Developing with Cassandra

  • 2. In memory Key-Value: Redis CouchBase File system BerkeleyDB Distributed file system HDFS Velocity Volume Variety In memory NewSQL / domain specific VoltDB Traditional [SQL] RDBMS MongoDB Distributed DBMS Cassandra HBase CAP theorem: • Consistency • Availability • Partition tolerance - pick two Eventual consistency Transactional? • No Choose the DB
  • 3. Why select Cassandra? • [relatively] easy to setup • [relatively] easy to use • ~zero routine ops • it works (!!) as promised: o real-time replication o node/site failure recovery o zero load writes o double of nodes = double of speed
  • 4. Because Cassandra is Fast! But needs some time to deliver • 12'000 WPS on a laptop • ~0.1 / 1 ms constant latency for writes/reads
  • 6. Good For: • log-like data TTL helps • massive writes 1M WPS enough? • simple real-time analytics Not So Good For: • dump of junk (consider HDFS) • OLAP (depends on "O") Good and Not So Good
  • 7. Distributed DBMS Just DBMS - closed monolithic solution o not a platform to run custom code (as MongoDB); o not an extension (as HBase); o highly optimized No-master, eventually consistent NoSQL Data model - Key-Value http://cassandra.apache.org Apache Cassandra
  • 8. Developed at Facebook for Inbox search Released to open source in 2008 In use: • Netflix - main non-content data store~500 Cassandra nodes (2012) • eBay - recommendation system"dozens of nodes", 200 TB storage (2012) • Twitter - tweet analysis100 + TB of data • More clients: (http://www.datastax.com/cassandrausers) History
  • 9. 1.0 - October 2011 1.1 - April 2012 1.2 - January 2013 2.0 - expected this summer (2013) June 26 2013 - 158 bugs, 89 worth to notice Sperasoft Experience: • hit 1 bug in production (stability issue) • hit 1 bug in QA (in a crafted case) Mature & Agile
  • 10. Apache .tar.gz and Debian packageshttp://cassandra.apache.org/download/ DataStax DSC - Cassandra + OpsCenter http://planetcassandra.org/Download/DataStaxCommunityEdition Embedded – for funct. tests on Java apps Maven Documentation http://wiki.apache.org/cassandra/ http://www.datastax.com/docs Distributions
  • 11. http://www.slideshare.net/planetcassandra/5-andy-cobley-raspberry-pi CPU: ARM 700 MHz RAM: 500 MB Storage: SD card Price: $25 200 WPS! What Hardware?
  • 12. Bare metal CPU: 8 cores (4 works too) RAM: 16 - 64 GB (min 8 GB) Storage: rotating disks 3 - 5 TB total (SSD better) VM works too, but... Storage: local disks, avoid NAS More on Hardware
  • 13. Software Yes & No (production) ?
  • 14. Software for Development: All Yes Plus more: Java, Python, C#, PHP, Ruby, Clojure, Go, R, ...
  • 15. KeyspaceKeyspace Column family Column family ~ RDBMS DB / Schema ~ RDBMS table Key1 Key2 Value Column Row Clustering key Partitioning key Map < ... , Map < ... , ... > > Data Model
  • 16. Node 3Node 2Node 1 CFCFCF 1 2 3 Client Parallel reads,writes 1 2 3 Data on Discs
  • 17. https://github.com/datastax/java-driver Client API Options Thrift RPC Native protocol + CQL3 Apache Thrift Custom protocol Synchronous Asynchronous Schema-less Static schema Store & Forward Cursors promised in 2.0 API for any language Java; Python, C# coming Cryptic API JDBC-like API Supported yet Going forward
  • 18. • Forget RDB design principles • Forget abstract data model - shape data for queries • No joins - materialized views • Data duplication - OK • Remember eventual consistency • Queries are precious • Use right data types - timestamp, uuid Why? Because NoSQL is a low level tool for high optimization. Data Modeling for NoSQL
  • 19. Do & Don't Patterns and Anti-patterns
  • 21. CREATE TABLE timeline ( event uuid, timestamp timeuuid, ... PRIMARY KEY (event, timestamp) ); event:date timestamp ... CREATE TABLE timeline ( event uuid, date long, timestamp timeuuid, ... PRIMARY KEY ((event, date), timestamp) ); event timestamp ... Still bad - need sharding http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra Long rows - Cassandra handle 2G columns, but... slo-o-ow Timeline
  • 22. Insert = Update = Delete A B C D1 Y1 Z1 1 A Y Z1 a b c d UPDATE ... SET b = 'Y' WHERE id = 1 INSERT INTO ... SET (id, c) values (1, 'Z') DELETE d FROM ... WHERE id = 1 SELECT * FROM ... WHERE id = 1 have to fetch 4 rows slo-o-ow Plan Data Immutable
  • 23. http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets Q Queue: INSERT INTO ... SET( name, enqueued_at, payload ) VALUES ( 'Q', now(), ... ) Dequeue: DELETE payload FROM ... WHERE name = 'Q' AND enqueued_at = ... Pick the next: SELECT * FROM ... WHERE id = 1 LIMIT 1 *Q *Q *Q *Q Q Q *? ? ?Q have to fetch 4 rowsslo-o-ow Queue
  • 24. Remember - eventual consistency. Concurrent updates => wrong count SELECT count(*) FROM ... WHERE .... ; Full scan over the selection => Default 10'000 rows limit => wrong count Have an integer column and increment it CREATE TABLE count_table ( id uuid, value counter, PRIMARY KEY (id) ); ... UPDATE count_table SET value = value + 1 WHERE id = ... ; Counter column family http://www.datastax.com/documentation/cassandr a/1.2/cassandra/cql_using/use_counter_t.html slo-o-ow mess mess How Many?
  • 25. CREATE TABLE blob ( id uuid, data blob, PRIMARY KEY (id) ); id chunk_no data CREATE TABLE blob ( id uuid, chunk_no int, data blob, PRIMARY KEY (id, chunk_no) ); id data http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage http://wiki.apache.org/cassandra/CassandraLimitations OutOfMemory Blobs
  • 26. Unbounded Queries Result set Client OutOfMemory OutOfMemory Cursors. Planned to 2.0. https://issues.apache.org/jira/browse/CASSANDRA-4415 C* clients use RPC yet Cassandra slo-o-ow
  • 27. Column names - data... yet Keep them short.
  • 28. Node 3Node 1 CF 3 Client 2 Node 3Node 2 Why?? slo-o-ow hot spots Limit Client to a node
  • 29. Helpful Links Sperasoft @ slideshare: http://www.slideshare.net/Sperasoft Sperasoft @ speakerdeck: https://speakerdeck.com/sperasoft Sperasoft @ github: https://github.com/Sperasoft/Workshop