Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Â
Cassandra at eBay - Cassandra Summit 2012
1. August 8, 2012
Cassandra at eBay
Time left: 29m 59s
Jay Patel
Architect, Platform Systems
@pateljay3001
2. eBay Marketplaces
ï§ 97 million active buyers and sellers
ï§ 200+ million items
ï§ 2 billion page views each day
ï§ 80 billion database calls each day
ï§ 5+ petabytes of site storage capacity
ï§ 80+ petabytes of analytics storage capacity
2
3. How do we scale databases?
ï§ Shard
â Patterns: Modulus, lookup-based, range, etc.
â Application sees only logical shard/database
ï§ Replicate
â Disaster recovery, read availability/scalability
ï§ Big NOs
â No transactions
â No joins
â No referential integrity constraints
3
4. We like Cassandra
ï§ Multi-datacenter (active-active) ï§ Write performance
ï§ Availability - No SPOF ï§ Distributed counters
ï§ Scalability ï§ Hadoop support
We also utilize MongoDB & HBase
4
5. Are we replacing RDBMS with NoSQL?
Not at all! But, complementing.
ï§ Some use cases donât fit well - sparse data, big data, schema
optional, real-time analytics, âŠ
ï§ Many use cases donât need top-tier set-ups - logging, tracking, âŠ
5
6. A glimpse on our Cassandra deployment
ï§ Dozens of nodes across multiple clusters
ï§ 200 TB+ storage provisioned
ï§ 400M+ writes & 100M+ reads per day, and growing
ï§ QA, LnP, and multiple Production clusters
6
7. Use Cases on Cassandra
Social Signals on eBay product & item pages
Hunch taste graph for eBay users & items
Time series use cases (many):
ï§ Mobile notification logging and tracking
ï§ Tracking for fraud detection
ï§ SOA request/response payload logging
ï§ RedLaser server logs and analytics
7
9. Manage signals via âYour Favoritesâ
Whole page is
served by
Cassandra
9
10. Why Cassandra for Social Signals?
ï§ Need scalable counters
ï§ Need real (or near) time analytics on collected social data
ï§ Need good write performance
ï§ Reads are not latency sensitive
10
11. Deployment
User request has no datacenter affinity
Non-sticky load balancing
Topology - NTS Data is backed up periodically
RF - 2:2 to protect against human or
Read CL - ONE software error
Write CL â ONE
11
15. Yes, eventual consistency!
One scenario that produces duplicate signals in UserLike CF:
1. Signal
2. De-signal (1st operation is not propagated to all replica)
3. Signal, again (1st operation is not propagated yet!)
So, whatâs the solution? LaterâŠ
15
16. Social Signals, next phase: Real-time Analytics
ï§ Most signaled or popular items per affinity groups (category, etc.)
ï§ Aggregated item count per affinity group
Example affinity group
16
17. Initial Data Model for real-time analytics
Items in an affinitygroup
is physically stored
sorted by their signal
count
Update counters for both individual item
and all the affinity groups that item
belongs to
20. Graph in Cassandra
Event consumers listen for site events (sell/bid/buy/watch) & populate graph in Cassandra
ï§ 30 million+ writes daily ï§ Batch-oriented reads
ï§ 14 billion+ edges already (for taste vector updates)
20
21. ï§ Mobile notification logging and tracking
ï§ Tracking for fraud detection
ï§ SOA request/response payload logging
ï§ RedLaser server logs and analytics
21
24. Thatâs all about the use cases..
Remember the duplicate problem in Use Case #1?
Letâs see some options we considered to solve thisâŠ
24
25. Option 1 â Make âLikeâ idempotent for UserLike
ï§ Remove time (timeuuid) from the composite column name:
ï§ Multiple signal operations are now Idempotent
ï§ No need to read before de-signaling (deleting)
X Need timeuuid for ordering!
Already have a user with more than 1300 signals 25
26. Option 2 â Use strong consistency
ï§ Local Quorum
â Wonât help us. User requests are not geo-load balanced
(no DC affinity).
ï§ Quorum
â Wonât survive during partition between DCs (or, one of the
DC is down). Also, adds additional latency.
X Need to survive!
26
27. Option 3 â Adapt to eventual consistency
If desire survival!
27
http://www.strangecosmos.com/content/item/101254.html
28. Adjustments to eventual consistency
De-signal steps:
â Donât check whether item is already signaled by a user, or not
â Read all (duplicate) signals from UserLike_unordered (new CF to avoid reading
whole row from UserLike)
â Delete those signals from UserLike_unordered and UserLike
Still, can get duplicate signals or false positives as there is a âread before deleteâ.
To shield further, do ârepair on readâ. Not a full story!
28
29. Lessons & Best Practices
âą Choose proper Replication Factor and Consistency Level.
â They alter latency, availability, durability, consistency and cost.
â Cassandra supports tunable consistency, but remember strong consistency is not free.
âą Consider all overheads in capacity planning.
â Replicas, compaction, secondary indexes, etc.
âą De-normalize and duplicate for read performance.
â But donât de-normalize if you donât need to.
âą Many ways to model data in Cassandra.
â The best way depends on your use case and query patterns.
More on http://ebaytechblog.com?p=1308