This talk covers scaling Cassandra to a fast-growing user base. Alex and Isaias will cover new best practices and how to work with the strengths and weaknesses of Cassandra at large scale. They will discuss how to adapt to bottlenecks while providing a rich feature set to the PlayStation community.
11. Social Network and Apps
• Spotify
• Facebook
• Twitter
• YouTube
• Ustream
• Nico nico
• …
12. The San Francisco Office
• The majority of our services use Cassandra
• We started with 0 customers and currently have 60 million active on PS3 and PS4
• Growth rate over Christmas was 20% on PS4!
• All of our clusters have a double-digit number of nodes
13. Some Stats
• Dozens of Gb per second in data transfer
• 100s of TB of raw data
• Millions of reads and writes per second
• Complex functionality on APIs
15. • Started with 3 developers
• 60+ Column Families
• Thrift / Astyanax
• What’s New
• Players Met
• Recent Activities
Profile
Live Details
• Title news feed
• In-game posts
Activity Feed
16. Challenges
• Data Distribution
• Volatile Data
• Performance
• Real time privacy
• Data Retention
• Unnecessary reads
• Optimize for data size transfer
• Avoid tombstone hell, adjust gc_period and compaction threshold
• Test compaction strategy, consistency level, etc.
• Optimize for reads
• Design with TTL in mind
• Avoid denormalization
Activity Feed
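The advice on volatile data, tombstones, and TTLs can be illustrated with a toy model. This is a sketch of the idea only, not Cassandra's storage engine: it assumes the slide's `gc_period` refers to the table option `gc_grace_seconds`, and the `ToyTable` class and its methods are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Cell:
    value: Optional[str]        # None marks an explicit delete (tombstone)
    written_at: int             # logical clock, in seconds
    ttl: Optional[int] = None   # seconds to live; None means no expiry


class ToyTable:
    """Toy model: deleted or expired cells linger until the grace period ends."""

    def __init__(self, gc_grace_seconds: int):
        self.gc_grace_seconds = gc_grace_seconds
        self.cells: Dict[str, Cell] = {}

    def write(self, key: str, value: str, now: int, ttl: Optional[int] = None) -> None:
        self.cells[key] = Cell(value, now, ttl)

    def delete(self, key: str, now: int) -> None:
        # A delete does not free space; it writes a tombstone.
        self.cells[key] = Cell(None, now)

    def _dead_since(self, cell: Cell) -> Optional[int]:
        if cell.value is None:
            return cell.written_at
        if cell.ttl is not None:
            return cell.written_at + cell.ttl
        return None

    def _is_dead(self, cell: Cell, now: int) -> bool:
        dead = self._dead_since(cell)
        return dead is not None and now >= dead

    def read(self, key: str, now: int) -> Optional[str]:
        cell = self.cells.get(key)
        if cell is None or self._is_dead(cell, now):
            return None
        return cell.value

    def tombstones(self, now: int) -> int:
        # Dead cells still have to be scanned by reads until they are purged.
        return sum(1 for c in self.cells.values() if self._is_dead(c, now))

    def compact(self, now: int) -> int:
        # Purge only cells that have been dead for at least the grace period.
        purged = [
            k for k, c in self.cells.items()
            if self._is_dead(c, now)
            and now - self._dead_since(c) >= self.gc_grace_seconds
        ]
        for k in purged:
            del self.cells[k]
        return len(purged)
```

In this model a shorter grace period shrinks the window in which dead cells pile up, which is the trade-off the bullets point at; in a real cluster the grace period also has to leave enough time for repairs to propagate deletes.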
17.
18. Why aggregate?
• A single read for any user returns all of that user's stories
• Condensed stories
• Paging + Real time privacy = Blocks
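One way to read "Paging + Real time privacy = Blocks" is that each user's aggregated feed is stored as fixed-size blocks, a page is one block, and privacy is checked when the block is read rather than when it is written. The sketch below is an in-memory illustration under that assumption; `AggregatedFeed`, `BLOCK_SIZE`, and the `can_see` callback are hypothetical names, not the speakers' implementation.

```python
from typing import Callable, Dict, List, Optional, Tuple

Story = Dict[str, str]  # e.g. {"author": "alice", "text": "..."}
BLOCK_SIZE = 3          # stories per block; illustrative value only


class AggregatedFeed:
    """Per-user feed stored as blocks, newest block first."""

    def __init__(self) -> None:
        self.blocks: Dict[str, List[List[Story]]] = {}

    def append(self, user: str, story: Story) -> None:
        blocks = self.blocks.setdefault(user, [])
        # Start a new block once the newest one is full.
        if not blocks or len(blocks[0]) >= BLOCK_SIZE:
            blocks.insert(0, [])
        blocks[0].insert(0, story)

    def read_page(
        self,
        user: str,
        block_index: int,
        can_see: Callable[[str, str], bool],
    ) -> Tuple[List[Story], Optional[int]]:
        """Read one block and filter it with the privacy rules as of now."""
        blocks = self.blocks.get(user, [])
        if block_index >= len(blocks):
            return [], None
        visible = [s for s in blocks[block_index] if can_see(user, s["author"])]
        has_more = block_index + 1 < len(blocks)
        return visible, (block_index + 1 if has_more else None)
```

Because the filter runs at read time, a privacy change takes effect on the next read without rewriting stored blocks; the cost is that a page may contain fewer visible stories than the block size.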
19. Vnodes
• Very unstable when we launched
• Flapping when adding new nodes
• Easy to manage
• Our current strategy
• Over time stabilized
21. • 4 developers
• 20+ Tables
• CQL / DataStax
• Community Wall
• Now Playing
• Community Members
Communities
22. Challenges
• IN clause & Astyanax → Use DataStax
• Small dataset could kill the cluster → Use something else to store small datasets
• Volatile data → Adjust gc_period
• Multi-level reads → Don’t do that!
• Counters → Use them when they can be inaccurate
Communities
23. Cassandra + Cache = $$$
• Communities was the first strong user of Redis
• Most features offload significant work from Cassandra to Ehcache and Redis
• Activity Feed caches stories in Redis
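Offloading reads from Cassandra to a cache is commonly done with a cache-aside pattern. The sketch below is a minimal illustration, assuming that pattern; the slides do not say which pattern was used. A plain dictionary stands in for Redis or Ehcache, and `CacheAside` and `load_from_db` are hypothetical names.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Serve reads from a cache and fall back to the database on a miss."""

    def __init__(
        self,
        load_from_db: Callable[[str], Optional[str]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_from_db          # stand-in for a Cassandra read
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry time, cached value); stand-in for Redis or Ehcache
        self._entries: Dict[str, Tuple[float, Optional[str]]] = {}
        self.db_reads = 0                  # counts reads that reached the database

    def get(self, key: str) -> Optional[str]:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]                  # fresh cache hit, no database read
        self.db_reads += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        # Call after a write so the next read fetches the new value.
        self._entries.pop(key, None)
```

With a hot key, only the first read in each TTL window reaches the database; the price is that readers may see a stale value until the entry expires or is invalidated.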
We started this project with only 3 developers, actually two and a half, because one of them was shared with other projects. This is a project that currently has