June 19, 2013
#Cassandra13
Axel Liljencrantz
liljencrantz@spotify.com
How not to use Cassandra
The Spotify backend
•  Around 3000 servers in 3 datacenters
•  Volumes
o  We have ~ 12 soccer fields of music
o  Streaming ~ 4 Wikipedias/second
o  ~ 24 000 000 active users
The Spotify backend
•  Specialized software powering Spotify
o  ~ 70 services
o  Mostly Python, some Java
o  Small, simple services, each responsible for a single task
Storage needs
•  Used to be a pure PostgreSQL shop
•  Postgres is awesome, but...
o  Poor cross-site replication support
o  Write master failure requires manual intervention
o  Sharding throws most relational advantages out the
window
Cassandra @ Spotify
•  We started using Cassandra ~2 years ago
•  About a dozen services use it by now
•  Back then, there was little information about how to
design efficient, scalable storage schemas for
Cassandra
•  So we screwed up
•  A lot
How to misconfigure Cassandra
Read repair
•  Repairs inconsistencies left over from outages, as part of regular read operations
•  With RR, every read requests hash digests from all replicas
•  Result is still returned as soon as enough nodes have
replied
•  If there is a mismatch, perform a repair
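The read-repair flow above can be sketched as a toy simulation. This is a minimal sketch, not Cassandra's implementation: `Replica` and `read_with_repair` are hypothetical names, each value is assumed to be stored as a `(timestamp, value)` pair, and the repair runs synchronously here, whereas the slides say the result is returned as soon as enough nodes have replied.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica: key -> (timestamp, value)."""
    name: str
    data: dict = field(default_factory=dict)

    def read(self, key):
        return self.data.get(key)

    def digest(self, key):
        # Hash of the stored cell; replicas that agree produce equal digests.
        return hashlib.md5(repr(self.data.get(key)).encode()).hexdigest()


def read_with_repair(replicas, key):
    """Return the newest cell and the names of the replicas that were repaired."""
    digests = {r.digest(key) for r in replicas}
    cells = [r.read(key) for r in replicas if r.read(key) is not None]
    newest = max(cells) if cells else None  # tuples compare by timestamp first
    repaired = []
    if len(digests) > 1 and newest is not None:  # digest mismatch: repair
        for r in replicas:
            if r.read(key) != newest:
                r.data[key] = newest
                repaired.append(r.name)
    return newest, repaired
```

Note that every replica is asked for a digest on every read, which is why the cross-datacenter behaviour on the next slide matters.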
Read repair
•  Useful factoid: Read repair is performed across all data
centers
•  So in a multi-DC setup, all reads will result in requests being
sent to every data center
•  We've made this mistake a bunch of times
•  New in 1.1: dclocal_read_repair_chance
Row cache
•  Cassandra can be configured to cache entire data rows in
RAM
•  Intended as a memcache alternative
•  Let's enable it. What's the worst that could happen, right?
Row cache
NO!
•  Only stores full rows
•  All cache misses are silently promoted to full row slices
•  All writes invalidate entire row
•  Don't use unless you understand all use cases
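The three pitfalls above can be made concrete with a toy model. This is only a sketch under stated assumptions: `RowCachedStore` is a hypothetical class, not a Cassandra API, and "disk" is just a Python dict whose reads are counted.

```python
class RowCachedStore:
    """Toy model of a whole-row cache in front of slow storage (hypothetical)."""

    def __init__(self, rows):
        self.rows = rows          # row key -> {column: value}, stands in for disk
        self.cache = {}           # row key -> full copy of the row
        self.columns_loaded = 0   # columns read from "disk" so far

    def get(self, row_key, column):
        if row_key not in self.cache:
            # Cache miss: a request for one column becomes a full row slice.
            row = dict(self.rows.get(row_key, {}))
            self.columns_loaded += len(row)
            self.cache[row_key] = row
        return self.cache[row_key].get(column)

    def put(self, row_key, column, value):
        self.rows.setdefault(row_key, {})[column] = value
        # Any write invalidates the entire cached row.
        self.cache.pop(row_key, None)
```

With a 1000-column row, a single-column read after every write reloads all 1000 columns, which is the situation the "NO!" is warning about.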
Compression
•  Cassandra supports transparent compression of all data
•  Compression algorithm (snappy) is super fast
•  So you can just enable it and everything will be better, right?
•  NO!
•  Compression disables a bunch of fast paths, slowing down
fast reads
How to misuse Cassandra
Performance worse over time
•  A freshly loaded Cassandra cluster is usually snappy
•  But when you keep writing to the same columns over and over for a
long time, performance goes down
•  We've seen clusters where reads touch a dozen SSTables
on average
•  nodetool cfhistograms is your friend
Performance worse over time
•  CASSANDRA-5514
•  Every SSTable stores its first and last column
•  Time-series-like data is effectively partitioned across SSTables
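A rough sketch of why recording the first and last column helps: a slice query can skip any SSTable whose column range cannot overlap the requested range. The `SSTable` class and `sstables_for_slice` function below are hypothetical illustrations, not Cassandra code, and column names are assumed to be sortable timestamps.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    """Hypothetical SSTable holding the columns of one row."""
    columns: dict  # column name (e.g. a timestamp) -> value

    @property
    def first(self):
        return min(self.columns)

    @property
    def last(self):
        return max(self.columns)


def sstables_for_slice(sstables, start, end):
    """Keep only SSTables whose [first, last] range overlaps [start, end]."""
    return [t for t in sstables if t.first <= end and t.last >= start]
```

An SSTable containing only columns 0 and 29 still has to be read for a slice of 12 to 15, so the optimization pays off mainly when each SSTable covers a narrow time range.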
Few cross-continent clusters
•  Few Cassandra users run cross-continent clusters
•  We are kind of on our own when it comes to some problems
•  CASSANDRA-5148
•  Disable TCP nodelay
•  Reduced packet count by 20 %
How not to upgrade Cassandra
•  Very few total cluster outages
o  Clusters have been up and running since the early
0.7 days and have been through rolling upgrades, expansions,
full hardware replacements etc.
•  Never lost any data!
o  No matter how spectacularly Cassandra fails, it has
never written bad data
o  Immutable SSTables FTW
Upgrade from 0.7 to 0.8
•  This was the first big upgrade we did, 0.7.4 ⇾ 0.8.6
•  Everyone claimed rolling upgrade would work
o  It did not
•  One would expect 0.8.6 to have this fixed
•  Patched Cassandra and rolled it a day later
•  Takeaways:
o  ALWAYS try rolling upgrades in a testing environment
o  Don't believe what people on the Internet tell you
Upgrade 0.8 to 1.0
•  We tried upgrading in test env, worked fine
•  Worked fine in production...
•  Except the last cluster
•  All data gone
•  Many keys per SSTable ⇾ corrupt bloom filters
•  Made Cassandra think it didn't have any keys
•  Scrub data ⇾ fixed
•  Takeaway: ALWAYS test upgrades using production data
Upgrading 1.0 to 1.1
•  After the previous upgrades, we did all the tests with
production data and everything worked fine...
•  Until we redid it in production, and we had reports of
missing rows
•  Scrub ⇾ restart made them reappear
•  This was in December; we have not been able to reproduce it
•  PEBKAC?
•  Takeaway: ?
How not to deal with large clusters
Coordinator
•  Coordinator performs partitioning, passes on request to
the right nodes
•  Merges all responses
What happens if one node is slow?
Many reasons for temporary slowness:
•  Bad RAID battery
•  Sudden bursts of compaction/repair
•  Bursty load
•  Net hiccup
•  Major GC
•  Reality
What happens if one node is slow?
•  Coordinator has a request queue
•  If a node goes down completely, gossip will notice
quickly and drop the node
•  But what happens if a node is just super slow?
What happens if one node is slow?
•  Gossip doesn't react quickly to slow nodes
•  The request queue for the coordinator on every node in
the cluster fills up
•  And the entire cluster stops accepting requests
•  No single point of failure?
What happens if one node is slow?
•  Solution: Partitioner awareness in client
•  At most 3 nodes go down, instead of the entire cluster
•  Available in Astyanax
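Partitioner awareness in the client can be sketched as follows: the client hashes the key itself and talks directly to one of that key's replicas, so a slow node should only stall requests for the data it holds. This is a simplified illustration, not Astyanax's API; `TokenAwareRouter`, the MD5-based tokens and the replication factor of 3 are assumptions made for the example.

```python
import bisect
import hashlib


class TokenAwareRouter:
    """Toy token ring; node names and replication factor are made up."""

    def __init__(self, nodes, replication_factor=3):
        self.rf = replication_factor
        self.ring = sorted((self._token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    @staticmethod
    def _token(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def replicas(self, key):
        """Walk the ring clockwise from the key's token, taking rf nodes."""
        start = bisect.bisect_right(self.tokens, self._token(key))
        count = min(self.rf, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def coordinator_for(self, key, slow=()):
        """Prefer a healthy replica as coordinator; None if all are slow."""
        for node in self.replicas(key):
            if node not in slow:
                return node
        return None
```

A client using such a router never sends a request through a node that does not own the key, so one slow node cannot fill the request queue of every coordinator in the cluster.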
How not to delete data
Deleting data
How is data deleted?
•  SSTables are immutable, we can't remove the data
•  Cassandra creates tombstones for deleted data
•  Tombstones are versioned the same way as any other
write
How not to delete data
Do tombstones ever go away?
•  During compactions, tombstones can get merged into
SSTables that hold the original data, making the
tombstones redundant
•  Once a tombstone is the only value for a specific
column, the tombstone can go away
•  Still need grace time to handle node downtime
How not to delete data
•  Tombstones can only be deleted once all non-tombstone values have been deleted
•  If you're using SizeTiered compaction, 'old' rows will
rarely get deleted
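The purge rule described on these slides can be sketched as a toy compaction. It is a simplification under stated assumptions, not Cassandra's actual logic: each SSTable is modelled as a dict of `column -> (timestamp, value)`, `TOMBSTONE` is a hypothetical marker, and a tombstone is dropped only when the grace time has passed and no SSTable outside the compaction still holds that column.

```python
TOMBSTONE = object()  # hypothetical marker for a deleted column


def compact(selected, others, now, gc_grace):
    """Merge the selected SSTables; 'others' are SSTables left out of this run."""
    merged = {}
    for table in selected:
        for column, cell in table.items():
            # Newest timestamp wins, tombstones included.
            if column not in merged or cell[0] > merged[column][0]:
                merged[column] = cell
    result = {}
    for column, (timestamp, value) in merged.items():
        if value is TOMBSTONE:
            expired = now - timestamp > gc_grace
            still_shadowing = any(column in table for table in others)
            if expired and not still_shadowing:
                continue  # safe to drop the tombstone
        result[column] = (timestamp, value)
    return result
```

If the original value lives in an SSTable that is rarely picked for compaction, `still_shadowing` stays true and the tombstone survives every minor compaction.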
How not to delete data
•  Tombstones are a problem even when using leveled
compaction
•  In theory, 90 % of all rows should live in a single
SSTable
•  In production, we've found that 20 - 50 % of all reads hit
more than one SSTable
•  Frequently updated columns will exist in many levels,
causing tombstones to stick around
How not to delete data
•  Deletions are messy
•  Unless you perform major compactions, tombstones will
rarely get deleted from «popular» rows
•  Avoid schemas that delete data!
TTL:ed data
•  Cassandra supports TTL:ed data
•  Once TTL:ed data expires, it should just be compacted
away, right?
•  We know we don't need the data anymore, no need for
a tombstone, so it should be fast, right?
•  Noooooo...
•  (Overwritten data could theoretically bounce back)
TTL:ed data
•  CASSANDRA-5228
•  Drop entire sstables when all columns are expired
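The core check behind dropping a whole SSTable might look like the sketch below. It is an illustration only: `droppable_sstables` is a hypothetical helper, each SSTable is modelled as a list of `(column, expires_at)` pairs, and the sketch deliberately ignores the harder part hinted at above, namely making sure that dropped data does not let older, overwritten values bounce back.

```python
def droppable_sstables(sstables, now):
    """Return the indexes of SSTables in which every column has expired."""
    return [
        index
        for index, cells in enumerate(sstables)
        if cells and max(expires_at for _, expires_at in cells) <= now
    ]
```

Only the latest expiry time per SSTable is needed for this check, so it can be answered from per-SSTable metadata without reading the data itself.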
The Playlist service
Our most complex service
•  ~1 billion playlists
•  40 000 reads per second
•  22 TB of compressed data
The Playlist service
Our old playlist system had many problems:
•  Stored data across hundreds of millions of files, making
the backup process really slow
•  Home brewed replication model that didn't work very
well
•  Frequent downtimes, huge scalability problems
•  Perfect test case for Cassandra!
Playlist data model
•  Every playlist is a revisioned object
•  Think of it like a distributed versioning system
•  Allows concurrent modification on multiple offline clients
•  We even have an automatic merge conflict resolver that
works really well!
•  "That's actually a really useful feature", said no one ever
Playlist data model
•  Sequence of changes
•  The changes are the authoritative data
•  Everything else is optimization
•  Cassandra is pretty neat for storing this kind of stuff
•  Can use consistency level ONE safely
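The idea that the changes are authoritative and everything else is derived can be sketched like this. The change format (`op`, `index`, `tracks`, `count`) is invented for the example and is not Spotify's actual schema.

```python
def apply_change(tracks, change):
    """Apply one change to a track list and return the new list."""
    index = change["index"]
    if change["op"] == "add":
        return tracks[:index] + change["tracks"] + tracks[index:]
    if change["op"] == "remove":
        return tracks[:index] + tracks[index + change["count"]:]
    raise ValueError(f"unknown op: {change['op']}")


def materialize_head(changes):
    """Replay the full change sequence to rebuild the current playlist."""
    tracks = []
    for change in changes:
        tracks = apply_change(tracks, change)
    return tracks
```

A stored HEAD is then just a cached result of `materialize_head`, which can always be rebuilt from the change sequence.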
Tombstone hell
Noticed that HEAD requests took several seconds for some
lists
Easy to reproduce in cassandra-cli
•  get playlist_head[utf8('spotify:user...')];
•  1-15 seconds latency - should be < 0.1 s
Copied the HEAD SSTables to a development machine for
investigation
The Cassandra tool sstable2json showed that the row contained
600 000 tombstones!
Tombstone hell
We expected tombstones would be deleted after 30 days
•  Nope, all tombstones from the past 1.5 years were still there
Revelation: Rows existing in 4+ SSTables never have
tombstones deleted during minor compactions
•  Frequently updated lists exist in nearly all SSTables
Solution:
Major compaction (CF size cut in half)
Zombie tombstones
•  Ran major compaction manually on all nodes over a
few days
•  All seemed well...
•  But a week later, the same lists took several seconds
again‽‽‽
Repair vs major compactions
A repair between the major compactions "resurrected" the
tombstones :(
New solution:
•  Repairs during Monday-Friday
•  Major compaction Saturday-Sunday
A (by now) well-known Cassandra anti-pattern:
Don't use Cassandra to store queues
Cassandra counters
•  There are lots of places in the Spotify UI where we
count things
o  # of followers of a playlist
o  # of followers of an artist
o  # of times a song has been played
•  Cassandra has a feature called distributed counters that
sounds suitable
•  Is this awesome?
Cassandra counters
•  They've actually worked reasonably well for us.
Lessons
•  There are still various esoteric problems with large scale
Cassandra installations
•  Debugging them is interesting
•  If you agree with the above statements, you should
totally come work with us
Lessons
•  Cassandra read performance is heavily dependent on
the temporal patterns of your writes
•  Cassandra is initially snappy, but various write patterns
make read performance slowly decrease
•  Super hard to perform realistic benchmarks
Lessons
•  Avoid repeatedly writing data to the same row over very
long spans of time
•  If you're working at scale, you'll need to know how
Cassandra works under the hood
•  nodetool cfhistograms is your friend
Questions?
C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz

Weitere ähnliche Inhalte

Was ist angesagt?

Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationDataWorks Summit
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersCloudera, Inc.
 
InfluxDB IOx Tech Talks: The Impossible Dream: Easy-to-Use, Super Fast Softw...
InfluxDB IOx Tech Talks: The Impossible Dream:  Easy-to-Use, Super Fast Softw...InfluxDB IOx Tech Talks: The Impossible Dream:  Easy-to-Use, Super Fast Softw...
InfluxDB IOx Tech Talks: The Impossible Dream: Easy-to-Use, Super Fast Softw...InfluxData
 
YugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on KubernetesYugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on KubernetesDoKC
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)DataStax Academy
 
Ozone: scaling HDFS to trillions of objects
Ozone: scaling HDFS to trillions of objectsOzone: scaling HDFS to trillions of objects
Ozone: scaling HDFS to trillions of objectsDataWorks Summit
 
SSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLSSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLYoshinori Matsunobu
 
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...InfluxData
 
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby NodeHadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby NodeErik Krogen
 
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...DataStax
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
 
Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query...
Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query...Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query...
Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query...InfluxData
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScyllaDB
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 
Improving Data Locality for Spark Jobs on Kubernetes Using Alluxio
Improving Data Locality for Spark Jobs on Kubernetes Using AlluxioImproving Data Locality for Spark Jobs on Kubernetes Using Alluxio
Improving Data Locality for Spark Jobs on Kubernetes Using AlluxioAlluxio, Inc.
 
MariaDB Performance Tuning and Optimization
MariaDB Performance Tuning and OptimizationMariaDB Performance Tuning and Optimization
MariaDB Performance Tuning and OptimizationMariaDB plc
 

Was ist angesagt? (20)

Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
 
InfluxDB IOx Tech Talks: The Impossible Dream: Easy-to-Use, Super Fast Softw...
InfluxDB IOx Tech Talks: The Impossible Dream:  Easy-to-Use, Super Fast Softw...InfluxDB IOx Tech Talks: The Impossible Dream:  Easy-to-Use, Super Fast Softw...
InfluxDB IOx Tech Talks: The Impossible Dream: Easy-to-Use, Super Fast Softw...
 
YugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on KubernetesYugabyteDB - Distributed SQL Database on Kubernetes
YugabyteDB - Distributed SQL Database on Kubernetes
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)
 
Cassandra compaction
Cassandra compactionCassandra compaction
Cassandra compaction
 
Ozone: scaling HDFS to trillions of objects
Ozone: scaling HDFS to trillions of objectsOzone: scaling HDFS to trillions of objects
Ozone: scaling HDFS to trillions of objects
 
SSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLSSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQL
 
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
 
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby NodeHadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
Hadoop Meetup Jan 2019 - HDFS Scalability and Consistent Reads from Standby Node
 
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query...
Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query...Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query...
Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query...
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1
 
Improving Data Locality for Spark Jobs on Kubernetes Using Alluxio
Improving Data Locality for Spark Jobs on Kubernetes Using AlluxioImproving Data Locality for Spark Jobs on Kubernetes Using Alluxio
Improving Data Locality for Spark Jobs on Kubernetes Using Alluxio
 
MariaDB Performance Tuning and Optimization
MariaDB Performance Tuning and OptimizationMariaDB Performance Tuning and Optimization
MariaDB Performance Tuning and Optimization
 

Andere mochten auch

Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandraAxel Liljencrantz
 
Cassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsCassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsDave Gardner
 
Cassandra Data Model
Cassandra Data ModelCassandra Data Model
Cassandra Data Modelebenhewitt
 
Learning Cassandra
Learning CassandraLearning Cassandra
Learning CassandraDave Gardner
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra Nikiforos Botis
 
Developers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQLDevelopers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQLRyu Kobayashi
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed DatabaseEric Evans
 
Migration from MySQL to Cassandra for millions of active users
Migration from MySQL to Cassandra for millions of active usersMigration from MySQL to Cassandra for millions of active users
Migration from MySQL to Cassandra for millions of active usersAndrey Panasyuk
 
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL databaseHBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL databaseEdureka!
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Eric Evans
 
Cassandra does what ? Code Mania 2012
Cassandra does what ? Code Mania 2012Cassandra does what ? Code Mania 2012
Cassandra does what ? Code Mania 2012aaronmorton
 
Debunking Common Myths of Cassandra Backup
Debunking Common Myths of Cassandra BackupDebunking Common Myths of Cassandra Backup
Debunking Common Myths of Cassandra BackupImanis Data
 
So you think you know REST - DPC11
So you think you know REST - DPC11So you think you know REST - DPC11
So you think you know REST - DPC11Evert Pot
 
Cassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-FelixCassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-FelixAcunu
 
Cassandra Instalacion y Utilizacion
Cassandra Instalacion y UtilizacionCassandra Instalacion y Utilizacion
Cassandra Instalacion y UtilizacionLeandro Carrera
 
Cassandra Day SV 2014: Basic Operations with Apache Cassandra
Cassandra Day SV 2014: Basic Operations with Apache CassandraCassandra Day SV 2014: Basic Operations with Apache Cassandra
Cassandra Day SV 2014: Basic Operations with Apache CassandraDataStax Academy
 
High performance queues with Cassandra
High performance queues with CassandraHigh performance queues with Cassandra
High performance queues with CassandraMikalai Alimenkou
 

Andere mochten auch (20)

Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandra
 
Tombstones and Compaction
Tombstones and CompactionTombstones and Compaction
Tombstones and Compaction
 
Cassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsCassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patterns
 
Cassandra Data Model
Cassandra Data ModelCassandra Data Model
Cassandra Data Model
 
Learning Cassandra
Learning CassandraLearning Cassandra
Learning Cassandra
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
 
Developers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQLDevelopers summit cassandraで見るNoSQL
Developers summit cassandraで見るNoSQL
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
 
Migration from MySQL to Cassandra for millions of active users
Migration from MySQL to Cassandra for millions of active usersMigration from MySQL to Cassandra for millions of active users
Migration from MySQL to Cassandra for millions of active users
 
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL databaseHBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
HBase Vs Cassandra Vs MongoDB - Choosing the right NoSQL database
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 
Cassandra does what ? Code Mania 2012
Cassandra does what ? Code Mania 2012Cassandra does what ? Code Mania 2012
Cassandra does what ? Code Mania 2012
 
Debunking Common Myths of Cassandra Backup
Debunking Common Myths of Cassandra BackupDebunking Common Myths of Cassandra Backup
Debunking Common Myths of Cassandra Backup
 
So you think you know REST - DPC11
So you think you know REST - DPC11So you think you know REST - DPC11
So you think you know REST - DPC11
 
Cassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-FelixCassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
Cassandra EU 2012 - Storage Internals by Nicolas Favre-Felix
 
Cassandra Instalacion y Utilizacion
Cassandra Instalacion y UtilizacionCassandra Instalacion y Utilizacion
Cassandra Instalacion y Utilizacion
 
Cassandra Day SV 2014: Basic Operations with Apache Cassandra
Cassandra Day SV 2014: Basic Operations with Apache CassandraCassandra Day SV 2014: Basic Operations with Apache Cassandra
Cassandra Day SV 2014: Basic Operations with Apache Cassandra
 
Cassandra design patterns
Cassandra design patternsCassandra design patterns
Cassandra design patterns
 
High performance queues with Cassandra
High performance queues with CassandraHigh performance queues with Cassandra
High performance queues with Cassandra
 

Ähnlich wie C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz

From 100s to 100s of Millions
From 100s to 100s of MillionsFrom 100s to 100s of Millions
From 100s to 100s of MillionsErik Onnen
 
Riak at Posterous
Riak at PosterousRiak at Posterous
Riak at Posterouscapotej
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core ConceptsJon Haddad
 
Hindsight is 20/20: MySQL to Cassandra
Hindsight is 20/20: MySQL to CassandraHindsight is 20/20: MySQL to Cassandra
Hindsight is 20/20: MySQL to CassandraMichael Kjellman
 
C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael Kjellman
C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael KjellmanC* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael Kjellman
C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael KjellmanDataStax Academy
 
Cassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day TorontoCassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day TorontoJon Haddad
 
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya KosmodemianskyPostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya KosmodemianskyPostgreSQL-Consulting
 
LJC: Fault tolerance with Apache Cassandra
LJC: Fault tolerance with Apache CassandraLJC: Fault tolerance with Apache Cassandra
LJC: Fault tolerance with Apache CassandraChristopher Batey
 
London devops logging
London devops loggingLondon devops logging
London devops loggingTomas Doran
 
Using Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series WorkloadsUsing Time Window Compaction Strategy For Time Series Workloads
Using Time Window Compaction Strategy For Time Series WorkloadsJeff Jirsa
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to CassandraJon Haddad
 
Webinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionWebinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionDataStax Academy
 
Webinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionWebinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionDataStax Academy
 
Database Expert Q&A from 2600hz and Cloudant
Database Expert Q&A from 2600hz and CloudantDatabase Expert Q&A from 2600hz and Cloudant
Database Expert Q&A from 2600hz and CloudantJoshua Goldbard
 
Cassandra from the trenches: migrating Netflix (update)
Cassandra from the trenches: migrating Netflix (update)Cassandra from the trenches: migrating Netflix (update)
Cassandra from the trenches: migrating Netflix (update)Jason Brown
 
Austin NoSQL 2011-07-06
Austin NoSQL 2011-07-06Austin NoSQL 2011-07-06
Austin NoSQL 2011-07-06jimbojsb
 
Frontera распределенный робот для обхода веба в больших объемах / Александр С...
Frontera распределенный робот для обхода веба в больших объемах / Александр С...Frontera распределенный робот для обхода веба в больших объемах / Александр С...
Frontera распределенный робот для обхода веба в больших объемах / Александр С...Ontico
 

C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz

  • 1. June 19, 2013 #Cassandra13 Axel Liljencrantz liljencrantz@spotify.com How not to use Cassandra
  • 3. #Cassandra13 The Spotify backend •  Around 3000 servers in 3 datacenters •  Volumes o  We have ~ 12 soccer fields of music o  Streaming ~ 4 Wikipedias/second o  ~ 24 000 000 active users
  • 4. #Cassandra13 The Spotify backend •  Specialized software powering Spotify o  ~ 70 services o  Mostly Python, some Java o  Small, simple services responsible for single task
  • 5. #Cassandra13 Storage needs •  Used to be a pure PostgreSQL shop •  Postgres is awesome, but... o  Poor cross-site replication support o  Write master failure requires manual intervention o  Sharding throws most relational advantages out the window
  • 7. #Cassandra13 Cassandra @ Spotify •  We started using Cassandra ~2 years ago •  About a dozen services use it by now •  Back then, there was little information about how to design efficient, scalable storage schemas for Cassandra •  So we screwed up •  A lot
  • 9. #Cassandra13 Read repair •  Repair from outages during regular read operation •  With RR, all reads request hash digests from all nodes •  Result is still returned as soon as enough nodes have replied •  If there is a mismatch, perform a repair
  • 10. #Cassandra13 Read repair •  Useful factoid: Read repair is performed across all data centers •  So in a multi-DC setup, all reads will result in requests being sent to every data center •  We've made this mistake a bunch of times •  New in 1.1: dclocal_read_repair
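The cost described on the two read repair slides can be seen in a toy model that counts which replicas receive a request for one read. This is only a sketch: the cluster layout, the `replicas_contacted` helper and its `dc_local` flag are illustrative assumptions, not Cassandra code. `dc_local=True` stands in for the DC-local read repair option mentioned on the slide.

```python
# Toy model (not Cassandra code): which replicas get a request for one read.

def replicas_contacted(replicas_by_dc, local_dc, read_repair, dc_local):
    """Return the set of replicas that receive a data or digest request.

    replicas_by_dc: dict mapping data center name -> list of replica names.
    read_repair:    True if this read was selected for read repair.
    dc_local:       True to restrict read repair to the coordinator's DC.
    """
    local = replicas_by_dc[local_dc]
    if not read_repair:
        # Assumption: consistency level ONE, so one local replica is enough.
        return {local[0]}
    if dc_local:
        return set(local)
    # Global read repair: every replica in every data center is contacted.
    return {r for replicas in replicas_by_dc.values() for r in replicas}


cluster = {
    "dc1": ["dc1-a", "dc1-b", "dc1-c"],
    "dc2": ["dc2-a", "dc2-b", "dc2-c"],
    "dc3": ["dc3-a", "dc3-b", "dc3-c"],
}

global_rr = replicas_contacted(cluster, "dc1", read_repair=True, dc_local=False)
local_rr = replicas_contacted(cluster, "dc1", read_repair=True, dc_local=True)
print(len(global_rr), len(local_rr))
```

In this model a globally repaired read touches all nine replicas, six of them across a WAN link, while the DC-local variant stays within the three local ones.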
  • 11. #Cassandra13 Row cache •  Cassandra can be configured to cache entire data rows in RAM •  Intended as a memcache alternative •  Let's enable it. What's the worst that could happen, right?
  • 12. #Cassandra13 Row cache NO! •  Only stores full rows •  All cache misses are silently promoted to full row slices •  All writes invalidate entire row •  Don't use unless you understand all use cases
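The three properties on this slide are easy to underestimate, so here is a minimal sketch of a cache that behaves the same way. `ToyRowCache` and the example row are hypothetical and only mimic the behaviour listed above; this is not how Cassandra implements its cache.

```python
# Toy model (not Cassandra code) of the row cache behaviour on the slide.

class ToyRowCache:
    def __init__(self, storage):
        self.storage = storage      # dict: row key -> dict of column -> value
        self.cache = {}             # row key -> full copy of the row
        self.columns_loaded = 0     # columns read from storage on misses

    def read(self, row_key, column):
        if row_key not in self.cache:
            # A miss for one column is promoted to a read of the full row.
            row = dict(self.storage[row_key])
            self.columns_loaded += len(row)
            self.cache[row_key] = row
        return self.cache[row_key][column]

    def write(self, row_key, column, value):
        self.storage.setdefault(row_key, {})[column] = value
        # Any write throws away the cached copy of the entire row.
        self.cache.pop(row_key, None)


storage = {"playlist:1": {f"track{i}": i for i in range(10_000)}}
cache = ToyRowCache(storage)

cache.read("playlist:1", "track7")          # miss: loads 10 000 columns
cache.read("playlist:1", "track8")          # hit: loads nothing
cache.write("playlist:1", "track9", -1)     # invalidates the whole row
cache.read("playlist:1", "track7")          # miss again: 10 000 more columns
print(cache.columns_loaded)
```

With wide rows that are written often, almost every single-column read ends up paying for the full row.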
  • 14. #Cassandra13 Compression •  Cassandra supports transparent compression of all data •  Compression algorithm (snappy) is super fast •  So you can just enable it and everything will be better, right? •  NO! •  Compression disables a bunch of fast paths, slowing down fast reads
  • 16. #Cassandra13 Performance worse over time •  A freshly loaded Cassandra cluster is usually snappy •  But when you keep writing to the same columns over a long time, performance goes down •  We've seen clusters where reads touch a dozen SSTables on average •  nodetool cfhistograms is your friend
  • 17. #Cassandra13 Performance worse over time •  CASSANDRA-5514 •  Every SSTable stores first/last column of SSTable •  Time series-like data is effectively partitioned
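To illustrate why recording the first and last column of each SSTable helps time-series-like data, the sketch below filters SSTables by column range before a slice read. It is a simplified model of the idea on the slide, not the implementation behind the ticket; `ToySSTable` and `sstables_to_read` are hypothetical names.

```python
# Toy model (not Cassandra code): skip SSTables by their first/last column.
from dataclasses import dataclass


@dataclass
class ToySSTable:
    name: str
    first_column: int   # smallest column name stored, e.g. a timestamp
    last_column: int    # largest column name stored


def sstables_to_read(sstables, slice_start, slice_end):
    """Return the SSTables whose column range overlaps the requested slice."""
    return [
        t for t in sstables
        if t.first_column <= slice_end and t.last_column >= slice_start
    ]


# Time-series-like writes: each flush covers its own window of column names.
time_series = [ToySSTable(f"ts-{i}", i * 100, i * 100 + 99) for i in range(12)]

# The same columns rewritten over a long time: every SSTable spans everything.
rewritten = [ToySSTable(f"rw-{i}", 0, 1199) for i in range(12)]

print(len(sstables_to_read(time_series, 250, 260)))
print(len(sstables_to_read(rewritten, 250, 260)))
```

The time-series layout needs one SSTable for the slice, while the rewritten layout still has to consult all twelve.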
  • 18. #Cassandra13 Few cross-continent clusters •  Few cross-continent Cassandra users •  We are kind of on our own when it comes to some problems •  CASSANDRA-5148 •  Disable TCP nodelay •  Reduced packet count by 20 %
  • 19. #Cassandra13 How not to upgrade Cassandra
  • 20. #Cassandra13 How not to upgrade Cassandra •  Very few total cluster outages o  Clusters have been up and running since the early 0.7 days, been rolling upgraded, expanded, full hardware replacements etc. •  Never lost any data! o  No matter how spectacularly Cassandra fails, it has never written bad data o  Immutable SSTables FTW
  • 21. #Cassandra13 Upgrade from 0.7 to 0.8 •  This was the first big upgrade we did, 0.7.4 ⇾ 0.8.6 •  Everyone claimed rolling upgrade would work o  It did not •  One would expect 0.8.6 to have this fixed •  Patched Cassandra and rolled it a day later •  Takeaways: o  ALWAYS try rolling upgrades in a testing environment o  Don't believe what people on the Internet tell you
  • 23. #Cassandra13 Upgrade 0.8 to 1.0 •  We tried upgrading in test env, worked fine •  Worked fine in production... •  Except the last cluster •  All data gone •  Many keys per SSTable ⇾ corrupt bloom filters •  Made Cassandra think it didn't have any keys •  Scrub data ⇾ fixed •  Takeaway: ALWAYS test upgrades using production data
  • 24. #Cassandra13 Upgrading 1.0 to 1.1 •  After the previous upgrades, we did all the tests with production data and everything worked fine... •  Until we redid it in production, and we had reports of missing rows •  Scrub ⇾ restart made them reappear •  This was in December, have not been able to reproduce •  PEBKAC? •  Takeaway: ?
  • 25. #Cassandra13 How not to deal with large clusters
  • 26. #Cassandra13 Coordinator •  Coordinator performs partitioning, passes on request to the right nodes •  Merges all responses
  • 27. #Cassandra13 What happens if one node is slow?
  • 28. #Cassandra13 What happens if one node is slow? Many reasons for temporary slowness: •  Bad raid battery •  Sudden bursts of compaction/repair •  Bursty load •  Net hiccup •  Major GC •  Reality
  • 29. #Cassandra13 What happens if one node is slow? •  Coordinator has a request queue •  If a node goes down completely, gossip will notice quickly and drop the node •  But what happens if a node is just super slow?
  • 31. #Cassandra13 What happens if one node is slow? •  Gossip doesn't react quickly to slow nodes •  The request queue for the coordinator on every node in the cluster fills up •  And the entire cluster stops accepting requests •  No single point of failure?
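A back-of-the-envelope calculation shows how a single slow node can fill a coordinator's request queue. The sketch applies Little's law (requests in flight = arrival rate × average latency). Every number in it, including the queue capacity `QUEUE_SLOTS`, is made up for illustration and is not a Cassandra default.

```python
# Back-of-the-envelope model (not Cassandra code) using Little's law:
# requests in flight = arrival rate * average time each request is held.

def in_flight(requests_per_s, slow_fraction, fast_latency_s, slow_latency_s):
    """Average number of requests a coordinator is holding at once."""
    average_latency = (
        (1 - slow_fraction) * fast_latency_s + slow_fraction * slow_latency_s
    )
    return requests_per_s * average_latency


QUEUE_SLOTS = 128          # hypothetical coordinator queue capacity

healthy = in_flight(5000, 0.10, 0.002, 0.002)   # every node answers in 2 ms
degraded = in_flight(5000, 0.10, 0.002, 1.0)    # one request in ten takes 1 s

print(healthy, healthy > QUEUE_SLOTS)
print(degraded, degraded > QUEUE_SLOTS)
```

Under these assumptions the healthy coordinator holds about 10 requests at a time, while the degraded one would need room for roughly 500. Because every coordinator forwards some requests to the slow node, all of them hit the limit together.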
  • 32. #Cassandra13 What happens if one node is slow? •  Solution: Partitioner awareness in client •  Max 3 nodes go down •  Available in Astyanax
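Partitioner awareness means the client computes which nodes own a key and sends the request straight to one of them, so a slow node only affects requests for data it holds. The sketch below is a minimal token ring under stated assumptions: one MD5-derived token per node and a replication factor of 3. `ToyRing` is hypothetical and is not the Astyanax API.

```python
# Toy token ring (not a real driver): route each request to a replica directly.
import bisect
import hashlib


class ToyRing:
    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Assumption: one token per node, derived from an MD5 of its name.
        self.ring = sorted((self._token(n), n) for n in nodes)
        self.tokens = [token for token, _ in self.ring]

    @staticmethod
    def _token(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas(self, row_key):
        """Nodes owning row_key: the first node at or after the key's token,
        then the next ones clockwise, wrapping around the ring."""
        start = bisect.bisect_left(self.tokens, self._token(row_key))
        return [
            self.ring[(start + i) % len(self.ring)][1]
            for i in range(self.replication_factor)
        ]


ring = ToyRing([f"node{i}" for i in range(12)])
owners = ring.replicas("spotify:user:example:playlist:1")
print(owners)   # the client sends the request straight to one of these
```

With a replication factor of 3, a slow node can only be reached through the three replicas that share its key ranges, which matches the "max 3 nodes" bound on the slide.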
  • 33. #Cassandra13 How not to delete data
  • 34. #Cassandra13 Deleting data How is data deleted? •  SSTables are immutable, we can't remove the data •  Cassandra creates tombstones for deleted data •  Tombstones are versioned the same way as any other write
  • 35. #Cassandra13 How not to delete data Do tombstones ever go away? •  During compactions, tombstones can get merged into SSTables that hold the original data, making the tombstones redundant •  Once a tombstone is the only value for a specific column, the tombstone can go away •  Still need grace time to handle node downtime
  • 36. #Cassandra13 How not to delete data •  Tombstones can only be deleted once all non-tombstone values have been deleted •  If you're using SizeTiered compaction, 'old' rows will rarely get deleted
  • 37. #Cassandra13 How not to delete data •  Tombstones are a problem even when using levelled compaction •  In theory, 90 % of all rows should live in a single SSTable •  In production, we've found that 20 - 50 % of all reads hit more than one SSTable •  Frequently updated columns will exist in many levels, causing tombstones to stick around
  • 38. #Cassandra13 How not to delete data •  Deletions are messy •  Unless you perform major compactions, tombstones will rarely get deleted from «popular» rows •  Avoid schemas that delete data!
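The rule from the preceding slides can be sketched in a few lines: a tombstone may only be dropped once it is older than the grace period and no SSTable outside the current compaction still holds the same key. The `compact` function, the dict-based SSTables and the integer timestamps are simplifying assumptions, not Cassandra's actual compaction code.

```python
# Toy model (not Cassandra code) of tombstone purging during compaction.
TOMBSTONE = object()   # marker value standing in for a deletion


def compact(selected, others, now, gc_grace):
    """Merge the `selected` SSTables into one, keeping the newest cell per key.

    Each SSTable is a dict: key -> (timestamp, value).
    A tombstone is dropped only if it is older than gc_grace AND no SSTable
    outside this compaction (`others`) still holds data for the same key.
    """
    merged = {}
    for table in selected:
        for key, cell in table.items():
            if key not in merged or cell[0] > merged[key][0]:
                merged[key] = cell

    result = {}
    for key, (timestamp, value) in merged.items():
        expired = value is TOMBSTONE and now - timestamp > gc_grace
        present_elsewhere = any(key in table for table in others)
        if expired and not present_elsewhere:
            continue            # safe to forget the deletion entirely
        result[key] = (timestamp, value)
    return result


old = {"track1": (1, "a"), "track2": (1, "b")}
mid = {"track1": (5, TOMBSTONE)}
new = {"track2": (9, TOMBSTONE)}

# Minor compaction of `mid` and `new`: `old` still holds both keys,
# so both tombstones must be kept.
minor = compact([mid, new], others=[old], now=100, gc_grace=10)

# Major compaction of everything: nothing is left outside, tombstones go away.
major = compact([old, mid, new], others=[], now=100, gc_grace=10)

print(len(minor), len(major))
```

The minor compaction keeps both tombstones and the major compaction keeps none. That is why rows spread over many SSTables accumulate tombstones until a major compaction runs.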
  • 40. #Cassandra13 TTL:ed data •  Cassandra supports TTL:ed data •  Once TTL:ed data expires, it should just be compacted away, right? •  We know we don't need the data anymore, no need for a tombstone, so it should be fast, right? •  Noooooo... •  (Overwritten data could theoretically bounce back)
  • 41. #Cassandra13 TTL:ed data •  CASSANDRA-5228 •  Drop entire sstables when all columns are expired
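The idea of dropping whole SSTables can be sketched as follows, assuming each SSTable knows the expiry time of its longest-lived column. This toy version deliberately ignores the grace period and the "bounce back" concern from the previous slide, both of which a real implementation has to handle. `split_expired` is a hypothetical helper.

```python
# Toy model (not Cassandra code): drop SSTables whose columns have all expired.

def split_expired(sstables, now):
    """Split SSTables into (kept, dropped) by their newest expiry time.

    Each SSTable is a list of (column_name, expires_at) pairs. A table can be
    discarded without reading it once even its longest-lived column has expired.
    """
    kept, dropped = [], []
    for table in sstables:
        newest_expiry = max(expires_at for _, expires_at in table)
        (dropped if newest_expiry <= now else kept).append(table)
    return kept, dropped


sstables = [
    [("a", 10), ("b", 20)],     # everything expired by t=50
    [("c", 40), ("d", 90)],     # "d" is still live at t=50
]
kept, dropped = split_expired(sstables, now=50)
print(len(kept), len(dropped))
```

Dropping a file this way costs almost nothing compared with rewriting it column by column during compaction.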
  • 42. #Cassandra13 The Playlist service Our most complex service •  ~1 billion playlists •  40 000 reads per second •  22 TB of compressed data
  • 44. #Cassandra13 The Playlist service Our old playlist system had many problems: •  Stored data across hundreds of millions of files, making backup process really slow. •  Home brewed replication model that didn't work very well •  Frequent downtimes, huge scalability problems •  Perfect test case for Cassandra!
  • 46. #Cassandra13 Playlist data model •  Every playlist is a revisioned object •  Think of it like a distributed versioning system •  Allows concurrent modification on multiple offlined clients •  We even have an automatic merge conflict resolver that works really well! •  That's actually a really useful feature said no one ever
  • 47. #Cassandra13 Playlist data model •  Sequence of changes •  The changes are the authoritative data •  Everything else is optimization •  Cassandra pretty neat for storing this kind of stuff •  Can use consistency level ONE safely
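A minimal sketch of "the changes are the authoritative data": the current state of a playlist is whatever you get by replaying its change log, and anything cached on top can be rebuilt. The operation names `add` and `remove` and the tuple layout are hypothetical, since the slides do not show the actual change format.

```python
# Toy model of a playlist stored as a sequence of changes. The operation
# names ("add", "remove") and tuple layout are hypothetical.

def replay(changes):
    """Rebuild the current track list by applying every change in order."""
    tracks = []
    for op, track in changes:
        if op == "add":
            tracks.append(track)
        elif op == "remove" and track in tracks:
            tracks.remove(track)
    return tracks


changes = [
    ("add", "track:1"),
    ("add", "track:2"),
    ("remove", "track:1"),
    ("add", "track:3"),
]

head = replay(changes)   # a cached "head" is just an optimization
print(head)
```

Since each change is only ever appended, a stale read returns at worst an older but still valid revision, which fits the slide's point about consistency level ONE.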
  • 49. #Cassandra13 Tombstone hell Noticed that HEAD requests took several seconds for some lists Easy to reproduce in cassandra-cli •  get playlist_head[utf8('spotify:user...')]; •  1-15 seconds latency - should be < 0.1 s Copy head SSTables to development machine for investigation Cassandra tool sstabletojson showed that the row contained 600 000 tombstones!
  • 50. #Cassandra13 Tombstone hell We expected tombstones would be deleted after 30 days •  Nope, all tombstones since 1.5 years ago were there Revelation: Rows existing in 4+ SSTables never have tombstones deleted during minor compactions •  Frequently updated lists exist in nearly all SSTables Solution: Major compaction (CF size cut in half)
  • 51. #Cassandra13 Zombie tombstones •  Ran major compaction manually on all nodes over a few days. •  All seemed well... •  But a week later, the same lists took several seconds again‽‽‽
  • 52. #Cassandra13 Repair vs major compactions A repair between the major compactions "resurrected" the tombstones :( New solution: •  Repairs during Monday-Friday •  Major compaction Saturday-Sunday A (by now) well-known Cassandra anti-pattern: Don't use Cassandra to store queues
  • 53. #Cassandra13 Cassandra counters •  There are lots of places in the Spotify UI where we count things •  # of followers of a playlist •  # of followers of an artist •  # of times a song has been played •  Cassandra has a feature called distributed counters that sounds suitable •  Is this awesome?
  • 54. #Cassandra13 Cassandra counters •  They've actually worked reasonably well for us.
  • 55. #Cassandra13 Lessons •  There are still various esoteric problems with large scale Cassandra installations •  Debugging them is interesting •  If you agree with the above statements, you should totally come work with us
  • 56. #Cassandra13 Lessons •  Cassandra read performance is heavily dependent on the temporal patterns of your writes •  Cassandra is initially snappy, but various write patterns make read performance slowly decrease •  Super hard to perform realistic benchmarks
  • 57. #Cassandra13 Lessons •  Avoid repeatedly writing data to the same row over very long spans of time •  If you're working at scale, you'll need to know how Cassandra works under the hood •  nodetool cfhistograms is your friend