Cassandra nice use-cases and worst anti-patterns 
DuyHai DOAN, Technical Advocate 
@doanduyhai
Agenda!
Anti-patterns
• Queue-like designs
• CQL null values
• Intensive update on same column
• Design around dynamic schema
Nice use-cases
• Rate-limiting
• Anti Fraud
• Account validation
• Sensor data timeseries
Worst anti-patterns! 
Queue-like designs! 
CQL null! 
Intensive update on same column! 
Design around dynamic schema! 
Failure level! 
☠ 
☠☠ 
☠☠☠ 
☠☠☠☠
Queue-like designs!
Adding new message ☞ 1 physical insert
Consuming message = deleting it ☞ 1 physical insert (tombstone)
Transactional queue = re-inserting messages ☞ physical insert * <many>
Queue-like designs!
FIFO queue (physical cells written ☞ live queue content; a repeated letter is the tombstone for that message)
A ☞ { A }
A B ☞ { A, B }
A B C ☞ { A, B, C }
A B C A ☞ { B, C }
A B C A D ☞ { B, C, D }
A B C A D B ☞ { C, D }
A B C A D B C ☞ { D }
FIFO queue, worst case
A A A A A A A A A A ☞ { }
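The FIFO walk-through above can be replayed with a small sketch. This is a toy in-memory model, not Cassandra code: the `replay` function and the `("put", …)` / `("take", …)` operation names are made up for illustration. It only counts how many physical cells pile up compared with what a reader actually gets back.

```python
# Toy model of a queue stored in a log-structured table (illustration only).
# Every enqueue and every dequeue appends one physical cell; a dequeue
# appends a tombstone instead of removing anything in place.

def replay(operations):
    """Return (physical_cells, live_messages) after applying the operations.

    operations: list of ("put", msg) or ("take", msg) tuples.
    """
    physical_cells = []   # what stays on disk until compaction purges it
    live = []             # what a reader actually gets back
    for op, msg in operations:
        if op == "put":
            physical_cells.append(("value", msg))
            live.append(msg)
        elif op == "take":
            physical_cells.append(("tombstone", msg))
            live.remove(msg)
        else:
            raise ValueError(f"unknown operation: {op}")
    return physical_cells, live


# Same sequence as the slides: put A, B, C, take A, put D, take B, take C
ops = [("put", "A"), ("put", "B"), ("put", "C"), ("take", "A"),
       ("put", "D"), ("take", "B"), ("take", "C")]
cells, live = replay(ops)

# Worst case: the same message is put and taken over and over
worst_cells, worst_live = replay([("put", "A"), ("take", "A")] * 5)
```

In the worst case the reader has to skip ten physical cells to find out that the queue is empty.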
Failure level! 
☠☠☠
CQL null semantics!
Reading null value means
• value does not exist (has never been created)
• value deleted (tombstone)
SELECT age FROM users WHERE login = 'ddoan'; → NULL
CQL null semantics!
Writing null means
• delete value (creating tombstone)
• even though it does not exist
UPDATE users SET age = NULL WHERE login = 'ddoan';
CQL null semantics!
Seen in production: prepared statement
UPDATE users SET
    age = ?,
    …
    geo_location = ?,
    mood = ?,
    …
WHERE login = ?;
Seen in production: bound statement
preparedStatement.bind(33, …, null, null, null, …);
null ☞ tombstone creation on each update …
login | age | name     | geo_loc | mood | status
jdoe  | 33  | John DOE | ✗       | ✗    | ✗
(✗ = tombstone)
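One possible way around the tombstones above is to build the UPDATE from the columns that actually carry a value, so that no null is ever bound. The sketch below is only an illustration under that assumption: `build_update` is a hypothetical helper, not a driver API, and it only produces the CQL string and the parameter list.

```python
def build_update(table, key_column, key_value, columns):
    """Build an UPDATE that only sets the columns whose value is not None.

    Returns (cql, params), or (None, []) when there is nothing to update.
    Table and column names are assumed to come from trusted code, not from
    user input, because they are interpolated into the statement.
    """
    assignments = []
    params = []
    for name, value in columns.items():
        if value is None:
            continue              # skipping the column avoids a tombstone
        assignments.append(f"{name} = ?")
        params.append(value)
    if not assignments:
        return None, []
    cql = f"UPDATE {table} SET {', '.join(assignments)} WHERE {key_column} = ?;"
    params.append(key_value)
    return cql, params


cql, params = build_update(
    "users", "login", "jdoe",
    {"age": 33, "geo_location": None, "mood": None},
)
```

The trade-off is that each distinct combination of columns yields a different statement to prepare, so this fits best when the number of combinations stays small.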
Failure level! 
☠
Intensive update! 
Context 
• small start-up 
• cloud-based video recording & alarm 
• internet of things (sensor) 
• 10 updates/sec for some sensors
Intensive update on same column!
Data model
CREATE TABLE sensor_data (
    sensor_id bigint,
    value double,
    PRIMARY KEY(sensor_id));
sensor_id ☞ value = 45.0034
Intensive update on same column!
Updates
UPDATE sensor_data SET value = 45.0034 WHERE sensor_id = …;
UPDATE sensor_data SET value = 47.4182 WHERE sensor_id = …;
UPDATE sensor_data SET value = 48.0300 WHERE sensor_id = …;
On disk, one physical column per update:
sensor_id ☞ value (t1) = 45.0034
sensor_id ☞ value (t13) = 47.4182
sensor_id ☞ value (t36) = 48.0300
Intensive update on same column!
Read
SELECT value FROM sensor_data WHERE sensor_id = …;
read N physical columns, only 1 useful …
sensor_id ☞ value (t1) = 45.0034
sensor_id ☞ value (t13) = 47.4182
sensor_id ☞ value (t36) = 48.0300
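The read cost above can be sketched with a toy model. It assumes, for illustration only, that each update was flushed to a different SSTable; the `read_latest` function and the dictionaries standing in for SSTables are made up and are not Cassandra internals.

```python
def read_latest(sstables, sensor_id):
    """Merge every on-disk version of the cell and keep the newest one.

    sstables: list of dicts mapping sensor_id -> (write_timestamp, value).
    Returns (value, versions_scanned).
    """
    versions_scanned = 0
    newest = None
    for sstable in sstables:
        cell = sstable.get(sensor_id)
        if cell is None:
            continue
        versions_scanned += 1
        # last write wins: keep the cell with the highest write timestamp
        if newest is None or cell[0] > newest[0]:
            newest = cell
    value = newest[1] if newest else None
    return value, versions_scanned


# Three updates of the same column, each in its own (hypothetical) SSTable
sstables = [
    {42: (1, 45.0034)},
    {42: (13, 47.4182)},
    {42: (36, 48.0300)},
]
value, scanned = read_latest(sstables, 42)
```

Three versions are scanned to return a single value; with 10 updates per second, the number of versions to merge grows quickly unless compaction keeps up.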
Intensive update on same column!
Solution 1: leveled compaction! (if your I/O can keep up)
Before compaction:
sensor_id ☞ value (t1) = 45.0034
sensor_id ☞ value (t13) = 47.4182
sensor_id ☞ value (t36) = 48.0300
After compaction:
sensor_id ☞ value (t36) = 48.0300
Intensive update on same column!
Solution 2: reversed timeseries & DateTiered compaction strategy
CREATE TABLE sensor_data (
    sensor_id bigint,
    date timestamp,
    sensor_value double,
    PRIMARY KEY((sensor_id), date))
WITH CLUSTERING ORDER BY (date DESC);
Intensive update on same column!
SELECT sensor_value FROM sensor_data WHERE sensor_id = … LIMIT 1;
sensor_id ☞ date3 (t3) = 48.0300 | date2 (t2) = 47.4182 | date1 (t1) = 45.0034 | …
Data cleaning by configuration (max_sstable_age_days)
Failure level! 
☠☠
Design around dynamic schema!
Customer emergency call
• 3 nodes cluster almost full
• impossible to scale out
• 4th node in JOINING state for 1 week
• disk space is filling up, production at risk!
After investigation
• 4th node in JOINING state because streaming is stalled
• NPE in logs
Cassandra source-code to the rescue
Design around dynamic schema!
public class CompressedStreamReader extends StreamReader
{
    …
    @Override
    public SSTableWriter read(ReadableByteChannel channel) throws IOException
    {
        …
        Pair<String, String> kscf = Schema.instance.getCF(cfId);
        // NPE here: kscf is presumably null because the table behind cfId no longer exists
        ColumnFamilyStore cfs = Keyspace.open(kscf.left).getColumnFamilyStore(kscf.right);
        …
Design around dynamic schema! 
The truth is 
• the devs dynamically drop & recreate table every day 
• dynamic schema is in the core of their design 
Example: 
DROP TABLE catalog_127_20140613; 
CREATE TABLE catalog_127_20140614( … );
Design around dynamic schema!
Failure sequence
(diagram: ring of nodes n1, n2, n3 and n4)
• step 1: every node holds table catalog_x_y
• step 2: table catalog_x_z is created on every node
• step 3: catalog_x_y is dropped, only catalog_x_z remains ☞ catalog_x_y ????
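The failure sequence above can be mimicked with a toy schema registry. This is not Cassandra code: the `Schema` class, `receive_stream`, the numeric table ids and the `shop` keyspace are all hypothetical. It only illustrates why a stream that still refers to a dropped table has nothing left to resolve.

```python
class Schema:
    """Toy schema registry: table id -> (keyspace, table name)."""

    def __init__(self):
        self._tables = {}

    def create_table(self, cf_id, keyspace, name):
        self._tables[cf_id] = (keyspace, name)

    def drop_table(self, cf_id):
        self._tables.pop(cf_id, None)

    def get_cf(self, cf_id):
        # Returns None for an unknown id, like a lookup on a dropped table
        return self._tables.get(cf_id)


def receive_stream(schema, cf_id):
    """Resolve the target table of an incoming stream, failing loudly."""
    kscf = schema.get_cf(cf_id)
    if kscf is None:
        raise LookupError(f"table id {cf_id} is no longer in the schema")
    keyspace, name = kscf
    return f"{keyspace}.{name}"


schema = Schema()
schema.create_table(1, "shop", "catalog_127_20140613")
pending_stream = 1                      # the joining node starts streaming table 1
schema.drop_table(1)                    # daily DROP TABLE happens meanwhile
schema.create_table(2, "shop", "catalog_127_20140614")

try:
    receive_stream(schema, pending_stream)
    outcome = "streamed"
except LookupError as error:
    outcome = f"stalled: {error}"
```

The sketch raises an explicit error where the slide shows an NPE; either way the stream cannot complete once the table it targets has disappeared.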
Design around dynamic schema!
Consequences
• joining node always got stuck
• ☞ cannot extend cluster
• ☞ changing code takes time
• ☞ production in danger (no space left)
• ☞ sacrifice analytics data to survive
Design around dynamic schema! 
Nutshell 
• dynamic schema change as normal operations is not recommended 
• concurrent schema AND topology change is an anti-pattern
Failure level! 
☠☠☠☠
Q & A
Nice Examples! 
Rate limiting! 
Anti Fraud! 
Account Validation! 
Sensor Data Timeseries!
Rate limiting!
Start-up company, reset password feature
1) /password/reset
2) SMS sent to the phone number with a random token A0F83E63DB935465CE73DFE….
3) /password/new/<token>/<password>
Rate limiting!
Problem 1
• account created with premium phone number
• /password/reset x 100
« money, money, money, give money, in the richman’s world » $$$
Rate limiting!
Problem 2
• massive hack
• 10⁶ /password/reset calls from few accounts
• SMS messages are cheap
• ☞ but not at the 10⁶ per user per day scale
Rate limiting!
Solution
• premium phone number ☞ Google libphonenumber
• massive hack ☞ rate limiting with Cassandra
Cassandra Time To Live! 
Time to live 
• built-in feature 
• insert data with a TTL in sec 
• expires server-side automatically 
• ☞ use as sliding-window
Rate limiting in action!
Implementation
• threshold = max 3 reset password per sliding 24h
• when /password/reset called
• check threshold
• reached ☞ error message/ignore
• not reached ☞ log the attempt with TTL = 86400
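The check-then-log steps above can be sketched in memory. This is an illustration only: `ResetPasswordLimiter` is a hypothetical class that imitates a TTL with expiry timestamps; in a real deployment the attempts would presumably be rows inserted with `USING TTL 86400` and counted with a SELECT, in a table whose name and layout the slides do not give.

```python
import time


class ResetPasswordLimiter:
    """Sliding-window limiter: at most `threshold` attempts per TTL window."""

    def __init__(self, threshold=3, ttl_seconds=86400, clock=time.time):
        self.threshold = threshold
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._attempts = {}   # login -> list of expiry times of logged attempts

    def _live_attempts(self, login):
        """Drop expired attempts, like the server does when a TTL elapses."""
        now = self.clock()
        live = [expiry for expiry in self._attempts.get(login, []) if expiry > now]
        self._attempts[login] = live
        return live

    def allow(self, login):
        """Return True and log the attempt, or False if the threshold is reached."""
        live = self._live_attempts(login)
        if len(live) >= self.threshold:
            return False                                  # reached: ignore
        live.append(self.clock() + self.ttl_seconds)      # log with TTL
        return True


# Usage with a fake clock so the window can be moved by hand
now = [0]
limiter = ResetPasswordLimiter(clock=lambda: now[0])
first_day = [limiter.allow("jdoe") for _ in range(4)]
now[0] = 86400                        # 24h later, the first attempts expired
next_day = limiter.allow("jdoe")
```

Because each attempt expires on its own, the window slides instead of resetting at a fixed hour.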
Rate limiting 
demo
Anti Fraud!
Real story
• many special offers available
• 30 mins international calls (50 countries)
• unlimited land-line calls to 5 countries
• …
• each offer has a duration (week/month/year)
• only one offer active at a time
Cassandra TTL
• check for existing offer before
SELECT count(*) FROM user_special_offer WHERE login = 'jdoe';
• then grant new offer
INSERT INTO user_special_offer(login, offer_code, …)
VALUES('jdoe', '30_mins_international', …)
USING TTL <offer_duration>;
Account Validation!
Requirement
• user creates new account
• send SMS/email link with token to validate account
• 10 days to validate
How to ?
• create account with 10 days TTL
INSERT INTO users(login, name, age)
VALUES('jdoe', 'John DOE', 33)
USING TTL 864000;
• create random token for validation with 10 days TTL
INSERT INTO account_validation(token, login, name, age)
VALUES('A0F83E63DB935465CE73DFE…', 'jdoe', 'John DOE', 33)
USING TTL 864000;
On token validation
• check token exists & retrieve user details
SELECT login, name, age FROM account_validation
WHERE token = 'A0F83E63DB935465CE73DFE…';
• re-insert durably user details without TTL
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
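The whole validation flow can be sketched with an in-memory stand-in for the two tables. `AccountStore` and its methods are hypothetical names; TTL expiry is imitated with timestamps, and the token format is an arbitrary choice for the example.

```python
import secrets
import time


class AccountStore:
    """In-memory stand-in for the users and account_validation tables."""

    def __init__(self, ttl_seconds=864000, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.users = {}    # login -> (details, expires_at or None)
        self.tokens = {}   # token -> (login, details, expires_at)

    def register(self, login, details):
        """Create the account and its validation token, both with a TTL."""
        expires_at = self.clock() + self.ttl_seconds
        token = secrets.token_hex(16).upper()
        self.users[login] = (details, expires_at)
        self.tokens[token] = (login, details, expires_at)
        return token

    def validate(self, token):
        """Re-insert the user without TTL if the token is still alive."""
        entry = self.tokens.get(token)
        if entry is None or entry[2] <= self.clock():
            return False
        login, details, _ = entry
        self.users[login] = (details, None)   # no TTL: durable from now on
        return True

    def exists(self, login):
        """True if the account is durable or its TTL has not elapsed yet."""
        entry = self.users.get(login)
        if entry is None:
            return False
        _, expires_at = entry
        return expires_at is None or expires_at > self.clock()
```

Accounts that are never validated simply disappear when their TTL elapses, so no cleanup job is needed.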
Sensor Data Timeseries!
Requirements
• lots of sensors (10³ – 10⁶)
• medium to high insertion rate (0.1 – 10/sec)
• keep good load balancing
• fast read & write
Bucketing!
CREATE TABLE sensor_data (
    sensor_id text,
    date timestamp,
    raw_data blob,
    PRIMARY KEY(sensor_id, date));
sensor_id ☞ date1 = blob1 | date2 = blob2 | date3 = blob3 | date4 = blob4 | …
Bucketing!
Problems:
• limit of 2 × 10⁹ physical columns
• bad load balancing (1 sensor = 1 node)
• wide row spans over many files
Bucketing!
Idea:
• composite partition key: sensor_id:date_bucket
• tunable date granularity: per hour/per day/per month …
CREATE TABLE sensor_data (
    sensor_id text,
    date_bucket int, // format YYYYMMddHH for hourly buckets
    date timestamp,
    raw_data blob,
    PRIMARY KEY((sensor_id, date_bucket), date));
Buckets
sensor_id:2014091014 ☞ date1 = blob1 | date2 = blob2 | date3 = blob3 | date4 = blob4 | …
sensor_id:2014091015 ☞ date11 = blob11 | date12 = blob12 | date13 = blob13 | date14 = blob14 | …
Bucketing!
Advantage:
• distribute load: 1 bucket = 1 node
• limit partition width (max x columns per bucket)
Bucketing!
But how can I select raw data between 14:45 and 15:10 ?
• 14:45 → ? in bucket sensor_id:2014091014
• 15:00 → 15:10 in bucket sensor_id:2014091015
Bucketing!
Solution
• use IN clause on partition key component
• with range condition on date column
☞ date column should be monotonic function (increasing/decreasing)
SELECT * FROM sensor_data WHERE sensor_id = 'xxx'
AND date_bucket IN (2014091014, 2014091015)
AND date >= '2014-09-10 14:45:00.000'
AND date <= '2014-09-10 15:10:00.000';
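The values for the IN clause above have to be computed client-side from the time range. A minimal sketch, assuming hourly buckets encoded as YYYYMMddHH integers as in the example buckets; `hourly_buckets` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def hourly_buckets(start, end):
    """List every hourly bucket (YYYYMMddHH as int) touched by [start, end]."""
    if end < start:
        raise ValueError("end must not be before start")
    # Round the start down to the beginning of its hour
    current = start.replace(minute=0, second=0, microsecond=0)
    buckets = []
    while current <= end:
        buckets.append(int(current.strftime("%Y%m%d%H")))
        current += timedelta(hours=1)
    return buckets


buckets = hourly_buckets(
    datetime(2014, 9, 10, 14, 45),
    datetime(2014, 9, 10, 15, 10),
)
```

Here `buckets` holds the two values used in the query above, 2014091014 and 2014091015. For a daily or monthly granularity only the rounding and the format string would change.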
Bucketing Caveats!
IN clause on partition key is not a silver bullet!
• use sparingly
• keep cardinality low (≤ 5)
(diagram: a single coordinator among n1 … n8 fetches both sensor_id:2014091014 and sensor_id:2014091015)
• prefer parallel async queries
• ease of query vs perf
(diagram: the async client queries the nodes owning sensor_id:2014091014 and sensor_id:2014091015 directly)
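The parallel async alternative can be sketched with `asyncio`. In this illustration `fetch_bucket` is a placeholder that returns fake rows; in real code it would issue one single-partition query per bucket through whatever async API the driver in use offers.

```python
import asyncio


async def fetch_bucket(sensor_id, bucket, start, end):
    """Stand-in for one single-partition query (hypothetical, returns fake rows)."""
    await asyncio.sleep(0)            # yield control, as a network call would
    return [(sensor_id, bucket)]


async def fetch_range(sensor_id, buckets, start, end):
    """Query every bucket concurrently and merge the results in bucket order."""
    tasks = [fetch_bucket(sensor_id, bucket, start, end) for bucket in buckets]
    per_bucket = await asyncio.gather(*tasks)   # results keep the order of tasks
    return [row for rows in per_bucket for row in rows]


rows = asyncio.run(
    fetch_range("xxx", [2014091014, 2014091015],
                "2014-09-10 14:45:00.000", "2014-09-10 15:10:00.000")
)
```

Each query targets one partition, so no single coordinator has to gather all buckets; the price is a little more client code than a single IN query.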
Q & A
Cassandra developers!
Rule n°1
If you don’t know, ask for help
(me, Cassandra ML, PlanetCassandra, stackoverflow, …)
Rule n°2
Do not blind-guess troubleshooting alone in production
(ask for help, see rule n°1)
Rule n°3
Share with the community
(your best use-cases … and worst failures)
http://planetcassandra.org/
Thank You 
@doanduyhai 
duy_hai.doan@datastax.com

More Related Content

What's hot

RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetch
RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetchRedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetch
RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetchRedis Labs
 
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...DataStax
 
Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...Databricks
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLScyllaDB
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayDataStax Academy
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Markus Höfer
 
Scaling Twitter
Scaling TwitterScaling Twitter
Scaling TwitterBlaine
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
cassandra
cassandracassandra
cassandraAkash R
 
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...ScyllaDB
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & FeaturesDataStax Academy
 
ETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk LoadingETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk Loadingalex_araujo
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookThe Hive
 
C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
C* Summit 2013: How Not to Use Cassandra by Axel LiljencrantzC* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
C* Summit 2013: How Not to Use Cassandra by Axel LiljencrantzDataStax Academy
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark Summit
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersCloudera, Inc.
 
PostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLPostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLCockroachDB
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeFlink Forward
 
Troubleshooting redis
Troubleshooting redisTroubleshooting redis
Troubleshooting redisDaeMyung Kang
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.Taras Matyashovsky
 

What's hot (20)

RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetch
RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetchRedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetch
RedisConf17 - Internet Archive - Preventing Cache Stampede with Redis and XFetch
 
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
 
Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQL
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016
 
Scaling Twitter
Scaling TwitterScaling Twitter
Scaling Twitter
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
cassandra
cassandracassandra
cassandra
 
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
ETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk LoadingETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk Loading
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
 
C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
C* Summit 2013: How Not to Use Cassandra by Axel LiljencrantzC* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
 
PostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLPostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQL
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
 
Troubleshooting redis
Troubleshooting redisTroubleshooting redis
Troubleshooting redis
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.
 

Viewers also liked

strangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patternsstrangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patternsMatthew Dennis
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-PatternsMatthew Dennis
 
Cassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestCassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestDuyhai Doan
 
Datastax enterprise presentation
Datastax enterprise presentationDatastax enterprise presentation
Datastax enterprise presentationDuyhai Doan
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandraPatrick McFadin
 
Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"Ontico
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandraAxel Liljencrantz
 
Cassandra rapid prototyping with achilles
Cassandra rapid prototyping with achillesCassandra rapid prototyping with achilles
Cassandra rapid prototyping with achillesDuyhai Doan
 
Cassandra java libraries
Cassandra java librariesCassandra java libraries
Cassandra java librariesDuyhai Doan
 
Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Apekshit Sharma
 
Achilles presentation
Achilles presentationAchilles presentation
Achilles presentationDuyhai Doan
 
Cassandra Drivers and Tools
Cassandra Drivers and ToolsCassandra Drivers and Tools
Cassandra Drivers and ToolsDuyhai Doan
 
Cassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingCassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingMatthew Dennis
 
Effective cassandra development with achilles
Effective cassandra development with achillesEffective cassandra development with achilles
Effective cassandra development with achillesDuyhai Doan
 
Cassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisCassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisDuyhai Doan
 
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...NoSQLmatters
 
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...DataStax
 
DZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling WebinarDZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling WebinarMatthew Dennis
 

Viewers also liked (20)

strangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patternsstrangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patterns
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-Patterns
 
Cassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestCassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapest
 
Datastax enterprise presentation
Datastax enterprise presentationDatastax enterprise presentation
Datastax enterprise presentation
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
 
Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"
 
Introduction to HBase
Introduction to HBaseIntroduction to HBase
Introduction to HBase
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandra
 
Cassandra rapid prototyping with achilles
Cassandra rapid prototyping with achillesCassandra rapid prototyping with achilles
Cassandra rapid prototyping with achilles
 
Cassandra java libraries
Cassandra java librariesCassandra java libraries
Cassandra java libraries
 
Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015
 
Achilles presentation
Achilles presentationAchilles presentation
Achilles presentation
 
Cassandra Drivers and Tools
Cassandra Drivers and ToolsCassandra Drivers and Tools
Cassandra Drivers and Tools
 
Cassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingCassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data Modeling
 
Effective cassandra development with achilles
Effective cassandra development with achillesEffective cassandra development with achilles
Effective cassandra development with achilles
 
Cassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisCassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS Paris
 
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
 
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
 
DZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling WebinarDZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling Webinar
 
Apache Cassandra and Go
Apache Cassandra and GoApache Cassandra and Go
Apache Cassandra and Go
 

Similar to Cassandra nice use cases and worst anti patterns

Cassandra nice use cases and worst anti patterns no sql-matters barcelona
Cassandra nice use cases and worst anti patterns no sql-matters barcelonaCassandra nice use cases and worst anti patterns no sql-matters barcelona
Cassandra nice use cases and worst anti patterns no sql-matters barcelonaDuyhai Doan
 
Introduction to Cassandra & Data model
Introduction to Cassandra & Data modelIntroduction to Cassandra & Data model
Introduction to Cassandra & Data modelDuyhai Doan
 
Cassandra introduction 2016
Cassandra introduction 2016Cassandra introduction 2016
Cassandra introduction 2016Duyhai Doan
 
Cassandra for the ops dos and donts
Cassandra for the ops   dos and dontsCassandra for the ops   dos and donts
Cassandra for the ops dos and dontsDuyhai Doan
 
KillrChat presentation
KillrChat presentationKillrChat presentation
KillrChat presentationDuyhai Doan
 
Cassandra introduction @ NantesJUG
Cassandra introduction @ NantesJUGCassandra introduction @ NantesJUG
Cassandra introduction @ NantesJUGDuyhai Doan
 
Cassandra introduction at FinishJUG
Cassandra introduction at FinishJUGCassandra introduction at FinishJUG
Cassandra introduction at FinishJUGDuyhai Doan
 
Cassandra drivers and libraries
Cassandra drivers and librariesCassandra drivers and libraries
Cassandra drivers and librariesDuyhai Doan
 
Cassandra introduction mars jug
Cassandra introduction mars jugCassandra introduction mars jug
Cassandra introduction mars jugDuyhai Doan
 
Cassandra data structures and algorithms
Cassandra data structures and algorithmsCassandra data structures and algorithms
Cassandra data structures and algorithmsDuyhai Doan
 
Libon cassandra summiteu2014
Libon cassandra summiteu2014Libon cassandra summiteu2014
Libon cassandra summiteu2014Duyhai Doan
 
Cassandra introduction @ ParisJUG
Cassandra introduction @ ParisJUGCassandra introduction @ ParisJUG
Cassandra introduction @ ParisJUGDuyhai Doan
 
KillrChat: Building Your First Application in Apache Cassandra (English)
KillrChat: Building Your First Application in Apache Cassandra (English)KillrChat: Building Your First Application in Apache Cassandra (English)
KillrChat: Building Your First Application in Apache Cassandra (English)DataStax Academy
 
KillrChat Data Modeling
KillrChat Data ModelingKillrChat Data Modeling
KillrChat Data ModelingDuyhai Doan
 
Understanding hd wallets design and implementation
Understanding hd wallets  design and implementationUnderstanding hd wallets  design and implementation
Understanding hd wallets design and implementationArcBlock
 
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016Duyhai Doan
 
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valleyPatrick McFadin
 
Cassandra Summit 2014: Real Data Models of Silicon Valley
Cassandra Summit 2014: Real Data Models of Silicon ValleyCassandra Summit 2014: Real Data Models of Silicon Valley
Cassandra Summit 2014: Real Data Models of Silicon ValleyDataStax Academy
 
Sasi, cassandra on full text search ride
Sasi, cassandra on full text search rideSasi, cassandra on full text search ride
Sasi, cassandra on full text search rideDuyhai Doan
 

Cassandra nice use cases and worst anti patterns

  • 1. Cassandra nice use-cases and worst anti-patterns. DuyHai DOAN, Technical Advocate, @doanduyhai
  • 2. Agenda, anti-patterns: • Queue-like designs • CQL null values • Intensive update on same column • Design around dynamic schema
  • 3. Agenda, nice use-cases: • Rate limiting • Anti Fraud • Account validation • Sensor data timeseries
  • 4. Worst anti-patterns: • Queue-like designs • CQL null • Intensive update on same column • Design around dynamic schema
  • 5. Failure level: ☠ ☠☠ ☠☠☠ ☠☠☠☠
  • 6. Queue-like designs: Adding a new message ☞ 1 physical insert
  • 7. Queue-like designs: Adding a new message ☞ 1 physical insert. Consuming a message = deleting it ☞ 1 physical insert (tombstone)
  • 8. Queue-like designs: Adding a new message ☞ 1 physical insert. Consuming a message = deleting it ☞ 1 physical insert (tombstone). Transactional queue = re-inserting messages ☞ physical insert * <many>
  • 9. Queue-like designs, FIFO queue: physical cells A; live queue { A }
  • 10. Queue-like designs, FIFO queue: physical cells A, B; live queue { A, B }
  • 11. Queue-like designs, FIFO queue: physical cells A, B, C; live queue { A, B, C }
  • 12. Queue-like designs, FIFO queue: physical cells A, B, C, A (tombstone); live queue { B, C }
  • 13. Queue-like designs, FIFO queue: physical cells A, B, C, A (tombstone), D; live queue { B, C, D }
  • 14. Queue-like designs, FIFO queue: physical cells A, B, C, A (tombstone), D, B (tombstone); live queue { C, D }
  • 15. Queue-like designs, FIFO queue: physical cells A, B, C, A (tombstone), D, B (tombstone), C (tombstone); live queue { D }
  • 16. Queue-like designs, FIFO queue, worst case: ten physical cells for A (inserts and tombstones); live queue { }
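
The cost model from the queue slides can be sketched in a few lines. This is a pure-Python simulation, not Cassandra code: the function name `simulate_queue` and the operation tuples are made up for illustration, and the model assumes that every enqueue and every dequeue costs exactly one physical write, as the slides state.

```python
from collections import deque


def simulate_queue(operations):
    """Replay queue operations and count the physical writes they cause.

    Assumption taken from the slides: an enqueue is one physical insert and
    a dequeue writes one tombstone, which is also one physical insert.
    Returns (live_messages, physical_cells).
    """
    live = deque()
    physical_cells = 0
    for op, message in operations:
        if op == "enqueue":
            live.append(message)
        elif op == "dequeue":
            live.popleft()  # the consumed message is shadowed by a tombstone
        else:
            raise ValueError(f"unknown operation: {op}")
        physical_cells += 1
    return list(live), physical_cells


# Worst case: message A is produced and consumed five times in a row.
worst_case = [("enqueue", "A"), ("dequeue", None)] * 5
live, cells = simulate_queue(worst_case)
print(live, cells)
```

The queue ends up empty while ten cells remain on disk. A read has to skip all of them until compaction is allowed to purge the tombstones.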
  • 18. CQL null semantics: Reading a null value means • the value does not exist (has never been created) • or the value was deleted (tombstone). SELECT age FROM users WHERE login = 'ddoan'; → NULL
  • 19. CQL null semantics: Writing null means • deleting the value (creating a tombstone) • even if the value does not exist. UPDATE users SET age = NULL WHERE login = 'ddoan';
  • 20. CQL null semantics: Seen in production, prepared statement: UPDATE users SET age = ?, … geo_location = ?, mood = ?, … WHERE login = ?;
  • 21. CQL null semantics: Seen in production, bound statement: preparedStatement.bind(33, …, null, null, null, …); null ☞ tombstone creation on each update. Row jdoe: age = 33, name = John DOE, geo_loc = ☒, mood = ☒, status = ☒, … (☒ marks a tombstone)
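
One way to avoid binding null, sketched under assumptions: build the UPDATE from only the columns that actually carry a value. The helper `build_update` is hypothetical and only assembles the statement text and its bound values; it does not talk to a cluster, and the column names are the ones shown on the slides.

```python
def build_update(table, key_column, key_value, columns):
    """Build an UPDATE that sets only the columns that have a value.

    Columns whose value is None are left out of the statement, so no
    tombstone is written for them. Returns (cql_text, bound_values), or
    (None, []) when there is nothing to update.
    """
    present = {name: value for name, value in columns.items() if value is not None}
    if not present:
        return None, []
    assignments = ", ".join(f"{name} = ?" for name in present)
    cql = f"UPDATE {table} SET {assignments} WHERE {key_column} = ?"
    return cql, list(present.values()) + [key_value]


cql, values = build_update(
    "users", "login", "jdoe",
    {"age": 33, "name": "John DOE", "geo_loc": None, "mood": None, "status": None},
)
print(cql)
print(values)
```

The trade-off is that each distinct column combination becomes its own prepared statement. Newer driver and protocol versions may also allow leaving a bound variable unset instead; check the driver documentation before relying on that.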
  • 23. Intensive update: Context • small start-up • cloud-based video recording & alarm • internet of things (sensors) • 10 updates/sec for some sensors
  • 24. Intensive update on same column: Data model CREATE TABLE sensor_data ( sensor_id bigint, value double, PRIMARY KEY(sensor_id)); Row sensor_id: value = 45.0034
  • 25. Intensive update on same column: Updates UPDATE sensor_data SET value = 45.0034 WHERE sensor_id = …; UPDATE sensor_data SET value = 47.4182 WHERE sensor_id = …; UPDATE sensor_data SET value = 48.0300 WHERE sensor_id = …; Physical cells for sensor_id: value (t1) = 45.0034, value (t13) = 47.4182, value (t36) = 48.0300
  • 26. Intensive update on same column: Read SELECT value FROM sensor_data WHERE sensor_id = …; ☞ reads N physical columns, only 1 useful. Physical cells for sensor_id: value (t1) = 45.0034, value (t13) = 47.4182, value (t36) = 48.0300
  • 27. Intensive update on same column: Solution 1: leveled compaction (if your I/O can keep up). Cells value (t1) = 45.0034, value (t13) = 47.4182 and value (t36) = 48.0300 are compacted into the single cell value (t36) = 48.0300
  • 28. Intensive update on same column: Solution 2: reversed timeseries & DateTiered compaction strategy CREATE TABLE sensor_data ( sensor_id bigint, date timestamp, sensor_value double, PRIMARY KEY((sensor_id), date)) WITH CLUSTERING ORDER BY (date DESC);
  • 29. Intensive update on same column: SELECT sensor_value FROM sensor_data WHERE sensor_id = … LIMIT 1; Partition sensor_id: ... date3 (t3) = 48.0300, date2 (t2) = 47.4182, date1 (t1) = 45.0034, … Data cleaning by configuration (max_sstable_age_days)
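
The read amplification described in the slides on intensive updates can be illustrated with a toy model. This is a simulation only: each "SSTable" is a plain dict holding one `(timestamp, value)` version of the cell, and `read_latest` is a hypothetical helper, not a Cassandra API.

```python
def read_latest(sstables, sensor_id):
    """Merge every stored version of a cell and keep the newest one.

    Each SSTable is modelled as a dict: sensor_id -> (timestamp, value).
    Returns (value, versions_read) so the wasted reads are visible.
    """
    versions = [table[sensor_id] for table in sstables if sensor_id in table]
    if not versions:
        return None, 0
    newest = max(versions, key=lambda cell: cell[0])  # last write wins
    return newest[1], len(versions)


# Three updates of the same column landed in three different SSTables.
sstables = [
    {"sensor-1": (1, 45.0034)},
    {"sensor-1": (13, 47.4182)},
    {"sensor-1": (36, 48.0300)},
]
value, versions_read = read_latest(sstables, "sensor-1")
print(value, versions_read)
```

Three versions are read to return one value. Both solutions on the slides attack that ratio: compaction reduces the number of stored versions, and the reversed timeseries puts the newest value first so that LIMIT 1 can stop early.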
  • 31. Design around dynamic schema: Customer emergency call • 3-node cluster almost full • impossible to scale out • 4th node in JOINING state for 1 week • disk space is filling up, production at risk!
  • 32. Design around dynamic schema: After investigation • 4th node in JOINING state because streaming is stalled • NPE in logs
  • 33. Design around dynamic schema: After investigation • 4th node in JOINING state because streaming is stalled • NPE in logs. Cassandra source code to the rescue
  • 34. Design around dynamic schema: public class CompressedStreamReader extends StreamReader { … @Override public SSTableWriter read(ReadableByteChannel channel) throws IOException { … Pair<String, String> kscf = Schema.instance.getCF(cfId); ColumnFamilyStore cfs = Keyspace.open(kscf.left).getColumnFamilyStore(kscf.right); ☞ NPE here
  • 35. Design around dynamic schema: The truth is • the devs dynamically drop & recreate tables every day • dynamic schema is at the core of their design. Example: DROP TABLE catalog_127_20140613; CREATE TABLE catalog_127_20140614( … );
  • 36. Design around dynamic schema: Failure sequence (diagram): nodes n1, n2, n3 and the joining node n4 each hold table catalog_x_y
  • 37. Design around dynamic schema: Failure sequence (diagram): table catalog_x_z is created on n1, n2, n3 and n4 next to catalog_x_y
  • 38. Design around dynamic schema: Failure sequence (diagram): catalog_x_y is no longer known (????), only catalog_x_z remains on n1, n2, n3 and n4
  • 39. Design around dynamic schema: Consequences • the joining node always got stuck • → cannot extend the cluster • → changing code takes time • → production in danger (no space left) • → sacrifice analytics data to survive
  • 40. Design around dynamic schema: Nutshell • dynamic schema change as a normal operation is not recommended • concurrent schema AND topology change is an anti-pattern
  • 41. Failure level: ☠☠☠☠
  • 42. Q & A
  • 43. Nice examples: • Rate limiting • Anti Fraud • Account validation • Sensor data timeseries
  • 44. Rate limiting: Start-up company, reset password feature 1) /password/reset (phone number) 2) SMS with random token A0F83E63DB935465CE73DFE…. 3) /password/new/<token>/<password>
  • 45. Rate limiting: Problem 1 • account created with a premium phone number
  • 46. Rate limiting: Problem 1 • account created with a premium phone number • /password/reset x 100
  • 47. Rate limiting: « money, money, money, give money, in the richman’s world » $$$
  • 48. Rate limiting: Problem 2 • massive hack
  • 49. Rate limiting: Problem 2 • massive hack • 10^6 /password/reset calls from a few accounts
  • 50. Rate limiting: Problem 2 • massive hack • 10^6 /password/reset calls from a few accounts • SMS messages are cheap
  • 51. Rate limiting: Problem 2 • ☞ but not at the scale of 10^6 per user per day
  • 52. Rate limiting: Solution • premium phone number ☞ Google libphonenumber
  • 53. Rate limiting: Solution • premium phone number ☞ Google libphonenumber • massive hack ☞ rate limiting with Cassandra
  • 54. Cassandra Time To Live: • built-in feature • insert data with a TTL in seconds • expires server-side automatically • ☞ use as a sliding window
  • 55. Rate limiting in action: Implementation • threshold = max 3 password resets per sliding 24h
  • 56. Rate limiting in action: Implementation • when /password/reset is called • check the threshold • reached ☞ error message/ignore • not reached ☞ log the attempt with TTL = 86400
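
The check-then-log flow of the rate-limiting slides can be sketched as follows. This is an in-memory simulation under stated assumptions: `TtlStore` is a made-up stand-in for a table whose rows expire after their TTL, and `try_password_reset` is a hypothetical handler. Only the threshold (3) and the TTL (86400 seconds) come from the slides.

```python
class TtlStore:
    """In-memory stand-in for a table whose rows expire after a TTL."""

    def __init__(self):
        self._rows = {}  # key -> list of expiry times, in seconds

    def insert(self, key, now, ttl):
        self._rows.setdefault(key, []).append(now + ttl)

    def count(self, key, now):
        # Drop expired rows, as the server would, then count the rest.
        alive = [expiry for expiry in self._rows.get(key, []) if expiry > now]
        self._rows[key] = alive
        return len(alive)


MAX_RESETS = 3          # threshold from the slides
WINDOW_SECONDS = 86400  # sliding 24h window, used as the TTL


def try_password_reset(store, login, now):
    """Return True if the reset is allowed, logging the attempt with a TTL."""
    if store.count(login, now) >= MAX_RESETS:
        return False  # threshold reached: error message or ignore
    store.insert(login, now, WINDOW_SECONDS)
    return True


store = TtlStore()
results = [try_password_reset(store, "jdoe", t) for t in (0, 10, 20, 30)]
print(results)
```

The fourth attempt is rejected, and a new one is accepted only once the oldest logged attempt has expired. The check and the insert are two separate steps, so concurrent requests could slightly overshoot the threshold; for this use case that is usually acceptable.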
  • 58. Anti Fraud: Real story • many special offers available • 30 mins of international calls (50 countries) • unlimited land-line calls to 5 countries • …
  • 59. Anti Fraud: Real story • each offer has a duration (week/month/year) • only one offer active at a time
  • 60. Anti Fraud: Cassandra TTL • first check for an existing offer: SELECT count(*) FROM user_special_offer WHERE login = 'jdoe';
  • 61. Anti Fraud: Cassandra TTL • then grant the new offer: INSERT INTO user_special_offer(login, offer_code, …) VALUES('jdoe', '30_mins_international', …) USING TTL <offer_duration>;
  • 62. Account Validation: Requirement • user creates a new account • an SMS/email link with a token is sent to validate the account • 10 days to validate
  • 63. Account Validation: How to? • create the account with a 10-day TTL: INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33) USING TTL 864000;
  • 64. Account Validation: How to? • create a random token for validation with a 10-day TTL: INSERT INTO account_validation(token, login, name, age) VALUES('A0F83E63DB935465CE73DFE…', 'jdoe', 'John DOE', 33) USING TTL 864000;
  • 65. Account Validation: On token validation • check that the token exists & retrieve the user details: SELECT login, name, age FROM account_validation WHERE token = 'A0F83E63DB935465CE73DFE…'; • re-insert the user details durably, without TTL: INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
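
The TTL value used in the account-validation slides is simply ten days expressed in seconds. The sketch below checks that arithmetic and shows one possible way to generate a random token. The token length and the helper name `new_validation_token` are assumptions; the slides do not say how the token is produced.

```python
import secrets
from datetime import timedelta

# 10 days * 24 h * 3600 s = 864000 s, the value used in USING TTL.
VALIDATION_TTL = int(timedelta(days=10).total_seconds())


def new_validation_token(num_bytes=20):
    """Return a random hexadecimal token (length is an assumption)."""
    return secrets.token_hex(num_bytes).upper()


token = new_validation_token()
print(VALIDATION_TTL, len(token))
```

Because the account row and the token row carry the same TTL, an account that is never validated disappears on its own, with no cleanup job.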
  • 66. Sensor Data Timeseries: Requirements • lots of sensors (10^3 – 10^6) • medium to high insertion rate (0.1 – 10/sec) • keep good load balancing • fast read & write
  • 67. Bucketing: CREATE TABLE sensor_data ( sensor_id text, date timestamp, raw_data blob, PRIMARY KEY(sensor_id, date)); Partition sensor_id: date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, …
  • 68. Bucketing: Problems: • limit of 2 × 10^9 physical columns • bad load balancing (1 sensor = 1 node) • a wide row spans many files. Partition sensor_id: date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, …
  • 69. Bucketing: Idea: • composite partition key: sensor_id:date_bucket • tunable date granularity: per hour/per day/per month … CREATE TABLE sensor_data ( sensor_id text, date_bucket int, /* format YYYYMMdd */ date timestamp, raw_data blob, PRIMARY KEY((sensor_id, date_bucket), date));
  • 70. Bucketing: Idea: • composite partition key: sensor_id:date_bucket • tunable date granularity: per hour/per day/per month … Buckets: partition sensor_id:2014091014: date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, …; partition sensor_id:2014091015: date11 = blob11, date12 = blob12, date13 = blob13, date14 = blob14, …
  • 71. Bucketing: Advantages: • distribute load: 1 bucket = 1 node • limit partition width (max x columns per bucket). Buckets: partition sensor_id:2014091014: date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, …; partition sensor_id:2014091015: date11 = blob11, date12 = blob12, date13 = blob13, date14 = blob14, …
  • 72. Bucketing: But how can I select raw data between 14:45 and 15:10? 14:45 → ? in partition sensor_id:2014091014, then 15:00 → 15:10 in partition sensor_id:2014091015
  • 73. Bucketing: Solution • use an IN clause on the partition key component • with a range condition on the date column ☞ the date column should be a monotonic function (increasing/decreasing). SELECT * FROM sensor_data WHERE sensor_id = xxx AND date_bucket IN (2014091014, 2014091015) AND date >= '2014-09-10 14:45:00.000' AND date <= '2014-09-10 15:10:00.000'
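
The bucket list for the IN clause has to be computed by the client. A minimal sketch, assuming hourly buckets in the YYYYMMddHH form shown in the bucket diagrams and naive (timezone-free) timestamps; the helper names `hour_bucket` and `buckets_between` are made up for illustration.

```python
from datetime import datetime, timedelta


def hour_bucket(ts):
    """Return the hourly bucket of a timestamp as an int, YYYYMMddHH."""
    return int(ts.strftime("%Y%m%d%H"))


def buckets_between(start, end):
    """List every hourly bucket touched by the closed interval [start, end]."""
    if end < start:
        raise ValueError("end must not precede start")
    current = start.replace(minute=0, second=0, microsecond=0)
    buckets = []
    while current <= end:
        buckets.append(hour_bucket(current))
        current += timedelta(hours=1)
    return buckets


start = datetime(2014, 9, 10, 14, 45)
end = datetime(2014, 9, 10, 15, 10)
print(buckets_between(start, end))  # [2014091014, 2014091015]
```

The same function also shows how quickly the list grows: a one-day range at hourly granularity already yields 24 or 25 partitions, which is well above what an IN clause should carry.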
  • 74. Bucketing Caveats: An IN clause on the partition key is not a silver bullet! • use sparingly • keep cardinality low (≤ 5). Diagram: in a ring of nodes n1 to n8, a single coordinator has to fetch partitions sensor_id:2014091014 and sensor_id:2014091015 from different nodes
  • 75. Bucketing Caveats: An IN clause on the partition key is not a silver bullet! • use sparingly • keep cardinality low (≤ 5) • prefer parallel async queries • ease of query vs performance. Diagram: in a ring of nodes n1 to n8, an async client queries partitions sensor_id:2014091014 and sensor_id:2014091015 separately
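
The "parallel async queries" alternative can be sketched with a thread pool: one single-partition query per bucket, merged on the client. Everything here is illustrative. `fetch_buckets` is a hypothetical helper, and `fake_query` stands in for a real driver call, which would normally use the driver's own asynchronous API rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_buckets(query_one_bucket, sensor_id, buckets, max_workers=4):
    """Run one single-partition query per bucket concurrently.

    query_one_bucket(sensor_id, bucket) must return a list of rows.
    Results are merged in bucket order, whatever the completion order.
    """
    if not buckets:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(query_one_bucket, sensor_id, b) for b in buckets]
        rows = []
        for future in futures:
            rows.extend(future.result())
    return rows


def fake_query(sensor_id, bucket):
    # Stand-in for a real query against one partition (assumption).
    return [(sensor_id, bucket, n) for n in range(2)]


rows = fetch_buckets(fake_query, "sensor-1", [2014091014, 2014091015])
print(rows)
```

Each request then goes to a replica that owns the partition, instead of forcing one coordinator to gather every bucket. The cost is more client code, which is the "ease of query vs performance" trade-off named on the slide.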
  • 76. Q & A
  • 77. Cassandra developers: Rule n°1: If you don’t know, ask for help (me, the Cassandra mailing list, PlanetCassandra, Stack Overflow, …)
  • 78. Cassandra developers: Rule n°2: Do not troubleshoot by blind guessing, alone, in production (ask for help, see rule n°1)
  • 79. Cassandra developers: Rule n°3: Share with the community (your best use-cases … and worst failures) http://planetcassandra.org/
  • 80. Thank you. @doanduyhai duy_hai.doan@datastax.com