7. X-FACTOR: THE RESULTS
Over 1 Million app downloads
Over 260 Million boos/claps
Massive peak loads on CTA
8. BUT SURELY COUNTING IS EASY?
Need real time results
How many boos?
How many claps?
Rate of boos
Rate of claps
Design for scale
Goal of handling 10K per second coming into our servers
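On a single process, the four numbers above (two totals, two rates) really are easy. A minimal sketch, assuming a one-second sliding window; the `RateCounter` name and the injectable `clock` parameter are illustrative, not from the talk:

```python
import time
from collections import deque


class RateCounter:
    """Running total plus a sliding-window rate for one event type."""

    def __init__(self, window_seconds=1.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock          # injectable so the sketch can be tested
        self.total = 0
        self._events = deque()      # timestamps of events still in the window

    def hit(self):
        now = self.clock()
        self.total += 1
        self._events.append(now)
        self._trim(now)

    def rate(self):
        """Events per second over the last `window` seconds."""
        self._trim(self.clock())
        return len(self._events) / self.window

    def _trim(self, now):
        # Drop timestamps that have fallen out of the window.
        while self._events and self._events[0] <= now - self.window:
            self._events.popleft()


boos, claps = RateCounter(), RateCounter()
boos.hit()
claps.hit()
claps.hit()
```

This only works while every increment lands in the same process. Once the traffic is spread over many web servers, the totals and rates have to be combined somewhere, which is where the rest of the talk starts.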
9. DISTRIBUTED COUNTING
“Hey, my CPU can do 22305 MIPS!”
“Stick it in Memcache!”
“How about Redis?”
“OK, how about sharding?”
“Well, I hear Cassandra 0.8 has counters”
22. MEMCACHE CAN’T COUNT PART 3
EC2 limits
Single Memcache server runs out of network I/O
What then?
Redis?
Benchmarked on EC2
m1.large -> m1.large, 28K INCR/s
Network I/O limited
Can’t horizontally scale
23. SHARDED COUNTERS
Implemented 2 level cache on web tier (https://gist.github.com/953524)
But a counter is more complicated
Sharded counter
Store (count, delta, timestamp) locally
Store count in L2 cache
Increment changes local delta
Push deltas to central every N seconds & refresh count
Eventually consistent
Maybe....unless something crashes
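The scheme above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the linked gist: `CentralStore` is a hypothetical in-memory stand-in for the shared backend, the L2 cache layer is folded into it for brevity, and `flush_interval` plays the role of "N seconds".

```python
import time


class CentralStore:
    """Stand-in for the shared counter backend (hypothetical, in-memory)."""

    def __init__(self):
        self.counts = {}

    def add(self, key, delta):
        # Apply a delta and return the new global count.
        self.counts[key] = self.counts.get(key, 0) + delta
        return self.counts[key]


class LocalCounter:
    """Per-web-server view of one counter: (count, delta, timestamp)."""

    def __init__(self, key, central, flush_interval=5.0, clock=time.monotonic):
        self.key = key
        self.central = central
        self.flush_interval = flush_interval  # the "N seconds" on the slide
        self.clock = clock
        self.count = central.add(key, 0)      # last known global count
        self.delta = 0                        # local increments not yet pushed
        self.timestamp = clock()              # time of the last flush

    def incr(self, n=1):
        self.delta += n
        if self.clock() - self.timestamp >= self.flush_interval:
            self.flush()

    def flush(self):
        # Push the pending delta and refresh the cached global count.
        # If the process dies before this runs, the pending delta is lost.
        self.count = self.central.add(self.key, self.delta)
        self.delta = 0
        self.timestamp = self.clock()

    def value(self):
        # Eventually consistent: stale global count plus our own pending delta.
        return self.count + self.delta
```

Each server sees its own increments immediately and everyone else's only after the next flush. The "unless something crashes" caveat shows up in `flush()`: any delta still held locally when a server dies never reaches the central count.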
24. CASSANDRA HAS COUNTERS
New feature in Cassandra 0.8
Special column type - CounterColumnType as the validator
Distributed 64 bit counter, with eventual consistency
CL.ONE writes recommended to avoid implicit reads impacting performance
Reads tot up values from replicas to give value
Simple functionality
incr()/decr(), get()
29. CAN CASSANDRA COUNT?
Yes, But....
Performance can suck
Switch off replicate_on_write, tune RF & cluster size
Not scalable
Scales as function of RF up to 4 nodes
Above that ... you’re out of luck
Best we achieved is ~10K/s increments to single counter value
What do you do if an operation fails?
31. CASSANDRA - MAKE IT COUNT *FASTER*
Recommendation (from Cassandra committers...):
SHARD YOUR COUNTERS
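One way to read that recommendation is to spread a single logical counter over several counter keys, so that no one key has to absorb the full write rate. A minimal sketch, assuming a dict-like backend; the `ShardedCounter` class and the `name:shard` key scheme are hypothetical, and a real store would use its own atomic increment instead of the read-modify-write shown here:

```python
import random


class ShardedCounter:
    """One logical counter spread over several backend counter keys."""

    def __init__(self, backend, name, num_shards):
        self.backend = backend      # any dict-like store of key -> int (assumed)
        self.name = name
        self.num_shards = num_shards

    def _shard_key(self, shard):
        return f"{self.name}:{shard}"

    def incr(self, n=1):
        # Pick a shard at random so writes spread evenly across keys.
        # With a real backend this line would be its atomic increment call.
        key = self._shard_key(random.randrange(self.num_shards))
        self.backend[key] = self.backend.get(key, 0) + n

    def get(self):
        # A read costs one lookup per shard and sums the partial counts.
        return sum(self.backend.get(self._shard_key(s), 0)
                   for s in range(self.num_shards))


backend = {}
claps = ShardedCounter(backend, "claps", num_shards=4)
for _ in range(1000):
    claps.incr()
total = claps.get()
```

The trade-off is that writes get cheaper per key while reads get more expensive, since every read touches all shards. The shard count is a tuning knob: roughly the target write rate divided by what a single counter sustains, rounded up.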
35. CONCLUSION
Counting is easy.....
Unless you want to do it really, really fast
If you’re inside the I/O limits for a single box, all is peachy
Above that, there are no good off-the-shelf answers
36. ANY QUESTIONS?
We’re hiring - if you’re interested in helping us count, get in touch!
malcolm@tellybug.com
@malcolmbox
Editor's notes
Redis - what if you need 30K/s?
Reveal - “shard your counters”
1+1+1 = 2 - eventual consistency. Cache consistency.
Write-only DB - Cassandra bug where get_range() wasn’t returning all the data in the DB.
Single box - failures?
Cassandra counters don’t scale :(