Ooyala has been using Apache Cassandra since version 0.4. Their data ingest volume has exploded since then, and Cassandra has scaled along with it. In this webinar, Al shares operational lessons on how to manage, tune, and scale Cassandra in a production environment.
Speaker: Al Tobey, Tech Lead, Compute and Data Services at Ooyala
Al Tobey is Tech Lead of the Compute and Data services team at Ooyala. His team develops and operates Ooyala's internal big data platform, consisting of Apache Cassandra, Hadoop, and internally developed tools. When not in front of a computer, Al is a father, husband, and trombonist.
Cassandra Community Webinar | Practice Makes Perfect: Extreme Cassandra Optimization
1. PRACTICE MAKES PERFECT:
EXTREME CASSANDRA OPTIMIZATION
@AlTobey
Tech Lead, Compute and Data Services
#CASSANDRA
Thursday, August 8, 13
2. Outline
⁍ About me / Ooyala
⁍ How not to manage your Cassandra clusters
⁍ Make it suck less
⁍ How to be a heuristician
⁍ Tools of the trade
⁍ More Settings
⁍ Show & Tell
3. @AlTobey
⁍ Tech Lead, Compute and Data Services at Ooyala, Inc.
⁍ C&D team is #devops: 3 ops, 3 eng, me
⁍ C&D team is #bdaas: Big Data as a Service
⁍ ~100 Cassandra nodes, expanding quickly
⁍ Obligatory: we’re hiring
4. Ooyala
⁍ Founded in 2007
⁍ 230+ employees globally
⁍ 200M unique users, 110+ countries
⁍ Over 1 billion videos played per month
⁍ Over 2 billion analytic events per day
5. Ooyala & Cassandra
Ooyala has been using Cassandra since v0.4
Use cases:
⁍ Analytics data (real-time and batch)
⁍ Highly available K/V store
⁍ Time series data
⁍ Play head tracking (cross-device resume)
⁍ Machine Learning Data
7. Avoiding read-modify-write
cassandra13_drinks column family
memTable
Albert Tuesday 6 Wednesday 0
Evan Tuesday 0 Wednesday 0
Frank Tuesday 3 Wednesday 3
Kelvin Tuesday 0 Wednesday 0
Krzysztof Tuesday 0 Wednesday 0
Phillip Tuesday 12 Wednesday 0
8. Avoiding read-modify-write
cassandra13_drinks column family
memTable
Al Tuesday 2 Wednesday 0
Phillip Tuesday 0 Wednesday 1
ssTable
Albert Tuesday 6 Wednesday 0
Evan Tuesday 0 Wednesday 0
Frank Tuesday 3 Wednesday 3
Kelvin Tuesday 0 Wednesday 0
Krzysztof Tuesday 0 Wednesday 0
Phillip Tuesday 12 Wednesday 0
9. Avoiding read-modify-write
cassandra13_drinks column family
memTable
Albert Tuesday 22 Wednesday 0
ssTable
Albert Tuesday 2 Wednesday 0
Phillip Tuesday 0 Wednesday 1
ssTable
Albert Tuesday 6 Wednesday 0
Evan Tuesday 0 Wednesday 0
Frank Tuesday 3 Wednesday 3
Kelvin Tuesday 0 Wednesday 0
Krzysztof Tuesday 0 Wednesday 0
Phillip Tuesday 12 Wednesday 0
10. Avoiding read-modify-write
cassandra13_drinks column family
ssTable
Albert Tuesday 22 Wednesday 0
Evan Tuesday 0 Wednesday 0
Frank Tuesday 3 Wednesday 3
Kelvin Tuesday 0 Wednesday 0
Krzysztof Tuesday 0 Wednesday 0
Phillip Tuesday 0 Wednesday 1
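The sequence on slides 7 to 10 can be sketched in a few lines of Python: writes land in new tables without reading the old values, and the newest value for each column wins when tables are merged. This is only an illustration under stated assumptions. Real Cassandra resolves conflicts by per-column write timestamp, whereas here the order of the tables (oldest first) stands in for timestamps. The names `OLD_SSTABLE`, `NEW_SSTABLE`, `MEMTABLE`, and `merge_tables` are hypothetical, and Albert's first row is read as "Tuesday 6".

```python
# Each table maps row key -> {column: value}, mirroring slide 9.
# Assumption: list order (oldest first) stands in for write timestamps.
OLD_SSTABLE = {
    "Albert": {"Tuesday": 6, "Wednesday": 0},
    "Evan": {"Tuesday": 0, "Wednesday": 0},
    "Frank": {"Tuesday": 3, "Wednesday": 3},
    "Kelvin": {"Tuesday": 0, "Wednesday": 0},
    "Krzysztof": {"Tuesday": 0, "Wednesday": 0},
    "Phillip": {"Tuesday": 12, "Wednesday": 0},
}
NEW_SSTABLE = {
    "Albert": {"Tuesday": 2, "Wednesday": 0},
    "Phillip": {"Tuesday": 0, "Wednesday": 1},
}
MEMTABLE = {"Albert": {"Tuesday": 22, "Wednesday": 0}}


def merge_tables(tables_oldest_first):
    """Merge tables so that the newest value for each column wins."""
    merged = {}
    for table in tables_oldest_first:
        for row_key, columns in table.items():
            # Later tables overwrite earlier values column by column.
            merged.setdefault(row_key, {}).update(columns)
    return merged


compacted = merge_tables([OLD_SSTABLE, NEW_SSTABLE, MEMTABLE])
for row_key in sorted(compacted):
    print(row_key, compacted[row_key])
```

No write in this sketch ever reads an existing value first. The merge happens only on read or compaction, which is the point of the slides.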
11. 2011: 0.6 ➜ 0.8
⁍ Migration is still a largely unsolved problem
⁍ Wrote a tool in Scala to scrub data and write via Thrift
⁍ Rebuilt indexes - faster than copying
[Diagram labels: hadoop, cassandra, GlusterFS P2P, cassandra, Thrift, Scala Map/Reduce]
12. Changes: 0.6 ➜ 0.8
⁍ Cassandra 0.8
⁍ 24GiB heap
⁍ Sun Java 1.6 update
⁍ Linux 2.6.36
⁍ XFS on MD RAID5
⁍ Disabled swap, or at least set vm.swappiness=1
14. System Changes: Apache 1.0 ➜ DSE 3.0
⁍ DSE 3.0 installed via apt packages
⁍ Unchanged: heap, distro
⁍ Ran much faster this time!
⁍ Mistake: Moved to MD RAID 0
Fix: RAID10 or RAID5, MD, ZFS, or btrfs
⁍ Mistake: Running on Ubuntu Lucid
Fix: Ubuntu Precise
16. 2013: Datacenter Move
⁍ 36 nodes ➜ lots more nodes
⁍ As usual, no downtime!
[Diagram labels: DSE 3.1, DSE 3.1, replication]
17. Coming Soon for Cassandra at Ooyala
Upcoming use cases:
⁍ Store every event from our players at full resolution
⁍ Cache code for our Spark job server
⁍ AMPLab Tachyon backend?
19. There’s more to tuning than performance:
⁍ Security
⁍ Cost of Goods Sold
⁍ Operations / support
⁍ Developer happiness
⁍ Physical capacity (cpu/memory/network/disk)
⁍ Reliability / Resilience
⁍ Compromise
20. I am not a scientist ... heuristician?
⁍ I’d love to be more scientific, but production comes first
⁍ Sometimes you have to make educated guesses
⁍ It’s not as difficult as it’s made out to be
⁍ Your brain is great at heuristics. Trust it.
⁍ Concentrate on bottlenecks
⁍ Make incremental changes
⁍ Read Malcolm Gladwell’s “Blink”
21. The OODA Loop
Observe, Orient, Decide, Act:
⁍ Observe the system in production under load
⁍ Make small, safe changes
⁍ Observe
⁍ Commit or Revert
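The loop on this slide (observe, make one small change, observe again, commit or revert) might be sketched as follows. This is a toy illustration, not a tool from the talk. The function `try_change`, the `system` dict, and the `p99_ms` metric are all hypothetical, and the sketch assumes a single metric where lower is better.

```python
def try_change(observe, apply_change, revert_change):
    """Observe, apply one small change, observe again, then commit or revert.

    Assumes `observe` returns a single number where lower is better
    (for example a latency percentile).
    """
    before = observe()
    apply_change()
    after = observe()
    if after < before:
        return "commit", before, after
    # The change did not help, so put the system back the way it was.
    revert_change()
    return "revert", before, after


# Stand-in for a production system: a dict holding one observed metric.
system = {"p99_ms": 12.0}

decision, before, after = try_change(
    observe=lambda: system["p99_ms"],
    apply_change=lambda: system.update(p99_ms=9.0),
    revert_change=lambda: system.update(p99_ms=12.0),
)
print(decision, before, after)
```

In practice the observation step would span real production load over time and more than one metric, which is why the talk stresses that no single number tells the whole story.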
22. Testing Shiny Things
⁍ Like kernels
⁍ And Linux distributions
⁍ And ZFS
⁍ And btrfs
⁍ And JVMs & parameters
⁍ Test them in production!
31. nodetool ring
10.10.10.10 Analytics rack1 Up Normal 47.73 MB 1.72% 1012046694721756637024691720378965
10.10.10.10 Analytics rack1 Up Normal 63.94 MB 0.86% 1026714038123521225967078556906197
10.10.10.10 Analytics rack1 Up Normal 85.73 MB 0.86% 1041381381525285814909465393433428
10.10.10.10 Analytics rack1 Up Normal 47.87 MB 0.86% 1056048724927050403851852229960659
10.10.10.10 Analytics rack1 Up Normal 39.73 MB 0.86% 1070716068328814992794239066487891
10.10.10.10 Analytics rack1 Up Normal 40.74 MB 1.75% 1100423945662575060114582859200003
10.10.10.10 Analytics rack1 Up Normal 40.08 MB 2.20% 1137814208669076757916163680305794
10.10.10.10 Analytics rack1 Up Normal 56.19 MB 3.45% 1196501513956187970179620530735245
10.10.10.10 Analytics rack1 Up Normal 214.88 MB 11.62% 1394248867770897155613247921498720
10.10.10.10 Analytics rack1 Up Normal 214.29 MB 2.45% 1435882108713996181107000284314407
10.10.10.10 Analytics rack1 Up Normal 158.49 MB 1.76% 1465773686249280216901752503449044
10.10.10.10 Analytics rack1 Up Normal 40.3 MB 0.92% 1481401683578223483181070489250370
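One way to read output like this is to look for token ranges that own a disproportionate share of the ring, such as the 11.62% entry above. The sketch below parses a few of the lines from the slide and flags any range that owns more than twice the median. The nine-column layout is taken from the slide and may differ between Cassandra versions. The names `parse_ring` and `hot_ranges` and the factor of 2 are assumptions for illustration.

```python
from statistics import median

# Four lines copied from the slide's nodetool ring output.
RING_SAMPLE = """\
10.10.10.10 Analytics rack1 Up Normal 47.73 MB 1.72% 1012046694721756637024691720378965
10.10.10.10 Analytics rack1 Up Normal 63.94 MB 0.86% 1026714038123521225967078556906197
10.10.10.10 Analytics rack1 Up Normal 214.88 MB 11.62% 1394248867770897155613247921498720
10.10.10.10 Analytics rack1 Up Normal 214.29 MB 2.45% 1435882108713996181107000284314407
"""


def parse_ring(text):
    """Split each nine-column ring line into a small dict."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 9:
            continue  # skip anything that does not match the slide's layout
        address, dc, rack, status, state, load, unit, owns, token = fields
        rows.append({
            "address": address,
            "status": status,
            "load": f"{load} {unit}",
            "owns_pct": float(owns.rstrip("%")),
            "token": token,
        })
    return rows


def hot_ranges(rows, factor=2.0):
    """Return rows that own more than `factor` times the median share."""
    typical = median(row["owns_pct"] for row in rows)
    return [row for row in rows if row["owns_pct"] > factor * typical]


rows = parse_ring(RING_SAMPLE)
hot = hot_ranges(rows)
for row in hot:
    print(row["token"], row["owns_pct"], row["load"])
```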
32. nodetool cfstats
Keyspace: gostress
Read Count: 0
Read Latency: NaN ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Column Family: stressful
SSTable count: 1
Space used (live): 32981239
Space used (total): 32981239
Number of Keys (estimate): 128
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 0
Read Latency: NaN ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Bloom Filter False Positives: 0
Bloom Filter False Ratio: 0.00000
Bloom Filter Space Used: 336
Compacted row minimum size: 7007507
Compacted row maximum size: 8409007
Compacted row mean size: 8409007
Could be using a lot of heap
Controllable by sstable_size_in_mb
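The raw numbers in this output are easier to reason about once converted to friendlier units. The following sketch parses a few of the "name: value" lines from the slide and converts sizes to MiB. It assumes the size fields are reported in bytes. The helper names `parse_stats` and `to_mib` are hypothetical.

```python
# A subset of the column family lines from the slide.
CFSTATS_SAMPLE = """\
SSTable count: 1
Space used (live): 32981239
Number of Keys (estimate): 128
Bloom Filter Space Used: 336
Compacted row minimum size: 7007507
Compacted row maximum size: 8409007
Compacted row mean size: 8409007
"""


def parse_stats(text):
    """Collect the integer-valued 'name: value' lines into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def to_mib(size_in_bytes):
    """Convert a byte count to MiB (assumes the input really is bytes)."""
    return size_in_bytes / (1024 * 1024)


stats = parse_stats(CFSTATS_SAMPLE)
for name in ("Space used (live)", "Compacted row mean size"):
    print(f"{name}: {to_mib(stats[name]):.2f} MiB")
```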
38. JVM Args
-Xmx8G leave it alone
-Xms8G leave it alone
-Xmn1200M 100MiB * nCPU
-Xss180k should be fine
-XX:+UseNUMA
numactl --interleave
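The young generation rule on this slide (100MiB per CPU) is simple arithmetic, and the slide's value of 1200M matches what the rule gives for 12 CPUs. A small sketch, assuming the rule is applied literally. The function names `young_gen_flag` and `jvm_args` are hypothetical and not part of Cassandra's own startup scripts.

```python
import os


def young_gen_flag(n_cpu, mib_per_cpu=100):
    """Build the -Xmn flag from the slide's rule of 100MiB per CPU."""
    return f"-Xmn{n_cpu * mib_per_cpu}M"


def jvm_args(n_cpu):
    """Assemble the flags shown on the slide, with -Xmn derived from n_cpu."""
    return [
        "-Xmx8G",  # slide: leave it alone
        "-Xms8G",  # slide: leave it alone
        young_gen_flag(n_cpu),
        "-Xss180k",
        "-XX:+UseNUMA",
    ]


print(young_gen_flag(12))
# On the current machine; os.cpu_count() can return None, so fall back to 1.
print(" ".join(jvm_args(os.cpu_count() or 1)))
```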
39. cgroups
Provides fine-grained control over Linux resources
⁍ Makes the Linux scheduler better
⁍ Lets you manage systems under extreme load
⁍ Useful on all Linux machines
⁍ Can choose between determinism and flexibility
42. Successful Experiment: ZFS on Linux
zpool create data raidz /dev/sd[c-h]
zfs create data/cassandra
zfs set compression=lzjb data/cassandra
zfs set atime=off data/cassandra
zfs set logbias=throughput data/cassandra
43. Conclusions
⁍ Tuning is multi-dimensional
⁍ Production load is your most important benchmark
⁍ Lean on Cassandra, experiment!
⁍ No one metric tells the whole story