5. ElasticInbox 1000 ft view
MTA → elasticinbox nodes (load-balancing, share-nothing)
Message metadata → Metadata Store (Cassandra cluster)
Original message → Blob Store (OpenStack, AWS S3, others)
6. Why Cassandra?
Horizontal Scalability
High Availability: no SPOF, automatic replication
Flexible schema
Counters
Email storage does more writes than reads
(spam, sent mail, notifications, mailing lists, unread email, ...)
7. Why not Cassandra for BLOBs?
Thrift does not support streaming
Value has to fit into memory
Default max Thrift frame size is 5MB
Possible solution: split large files into 1MB chunks
Less than 2% of emails >1MB (in our case)
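The chunking idea above can be sketched roughly as follows. This only illustrates splitting a message into 1MB pieces and putting it back together; the function names and the (chunk_index, bytes) layout are assumptions made for the example, not ElasticInbox code.

```python
from typing import Iterator, List, Tuple

CHUNK_SIZE = 1024 * 1024  # 1MB, the chunk size suggested on the slide


def split_into_chunks(blob: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk_index, chunk_bytes) pairs, each at most chunk_size long."""
    for index, offset in enumerate(range(0, len(blob), chunk_size)):
        yield index, blob[offset:offset + chunk_size]


def reassemble(chunks: List[Tuple[int, bytes]]) -> bytes:
    """Rebuild the original blob; chunks may arrive in any order."""
    return b"".join(data for _, data in sorted(chunks))
```

Each chunk would be written as its own column value, so no single value exceeds the frame size, at the cost of extra reads to rebuild a large message.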
8. Why not Cassandra for BLOBs?
Wasted RAM / JVM Heap
200 x 5MB messages R/W = 1GB RAM
Wasted disk space
When RF=3, disk space = 6 × data
1TB data → 6TB storage required!
Wasted CPU
More CPU used during compactions
Leveled Compaction Strategy?
New (1.0+), less wasted storage but more I/O.
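The RAM and disk figures above can be worked through as a quick calculation. The slides do not say where the factor of 6 comes from; the sketch below assumes it is RF × 2, with the 2× being free-space headroom for compaction. That breakdown and the function names are assumptions.

```python
MB = 1
GB = 1000 * MB  # decimal units, matching the slide's round numbers
TB = 1000 * GB


def heap_for_inflight(messages: int, message_size_mb: int) -> int:
    """Heap (in MB) held by messages that are being read or written at once."""
    return messages * message_size_mb * MB


def disk_required(data_mb: int, replication_factor: int, compaction_headroom: int = 2) -> int:
    """Disk (in MB) for the replicas plus the assumed compaction headroom."""
    return data_mb * replication_factor * compaction_headroom


heap = heap_for_inflight(200, 5)
disk = disk_required(1 * TB, replication_factor=3)
print(heap / GB, "GB heap")  # prints: 1.0 GB heap
print(disk / TB, "TB disk")  # prints: 6.0 TB disk
```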
12. BLOB Stores for BLOBs
BLOB Stores are designed for storing BLOBs
Can store an unlimited number of objects in a single container.
AWS S3, OpenStack Object Store, and 15 others supported (thanks @jclouds!).
40%-50% more space efficient than BLOBs in Cassandra (w/ RF=3: 1TB → 3.5TB, rather than 6TB).
Cons: much slower than Cassandra (no memtable).
13. Polyglot Persistence
Martin Fowler: “any decent sized enterprise will
have a variety of different data storage
technologies for different kinds of data”
Martin Fowler, 16 Nov 2011
Don't take the example in the diagram too seriously.
17. Data Model
NoSQL data model is driven by data access patterns:
Email is immutable
Mostly, very recent messages are accessed and updated
But sometimes, access patterns are driven by the NoSQL data model:
Synergy between programming model and data model
Were some Gmail features driven by Bigtable limitations?
Labels instead of folders
No custom sorting, only by time
Other examples: “More” pagination
26. Data Model ‒ Accounts
Column Family
Reserved Labels: 0 = All Mails, 1 = Inbox, 2 = Drafts, ...
"Accounts" {
"user@elasticinbox.com" {
"label:0" : "all",
"label:1" : "inbox",
"label:2" : "drafts",
"label:230": "Custom Label",
...
}
}
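As a rough illustration of how this row might be read and updated, here is an in-memory stand-in for the column family. The dict layout mirrors the example above; the helper names `get_labels` and `add_label` and the label ID 231 in the test are hypothetical.

```python
from typing import Dict

# In-memory stand-in for the "Accounts" column family:
# row key (account) -> {column name: column value}
accounts: Dict[str, Dict[str, str]] = {
    "user@elasticinbox.com": {
        "label:0": "all",
        "label:1": "inbox",
        "label:2": "drafts",
        "label:230": "Custom Label",
    }
}

LABEL_PREFIX = "label:"


def get_labels(account: str) -> Dict[int, str]:
    """Return {label_id: label_name} parsed from the account's 'label:<id>' columns."""
    row = accounts.get(account, {})
    return {
        int(column[len(LABEL_PREFIX):]): name
        for column, name in row.items()
        if column.startswith(LABEL_PREFIX)
    }


def add_label(account: str, label_id: int, name: str) -> None:
    """Add or rename a label by writing a single column."""
    accounts.setdefault(account, {})[f"{LABEL_PREFIX}{label_id}"] = name
```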
27. Data Model ‒ IndexLabels
Column Family
Composite Key : Account + Label ID
Messages ordered by time
"IndexLabels" {
"user@elasticinbox.com:0" { # All Mails
"550e8400-e29b-41d4-a716-446655440000" : null,
"892e8300-e29b-41d4-a716-446655440000" : null,
"a0232400-e29b-41d4-a716-446655440000" : null,
...
}
"user@elasticinbox.com:1" { # Inbox
"550e8400-e29b-41d4-a716-446655440000" : null,
"892e8300-e29b-41d4-a716-446655440000" : null,
"a0232400-e29b-41d4-a716-446655440000" : null,
...
}
}
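A minimal sketch of this index, assuming message IDs are time-based UUIDs: each row is keyed by "account:label_id" and holds message IDs in time order, so listing a label newest-first and fetching the next page ("More" pagination) are both simple slices. The in-memory dict and the function names are assumptions; the sort only roughly imitates a time-based column comparator.

```python
import uuid
from collections import defaultdict
from typing import Dict, List, Optional

# In-memory stand-in for "IndexLabels": row key -> message IDs, oldest first.
index_labels: Dict[str, List[uuid.UUID]] = defaultdict(list)


def row_key(account: str, label_id: int) -> str:
    """Composite row key: account + label ID."""
    return f"{account}:{label_id}"


def sort_key(message_id: uuid.UUID):
    """Order by embedded timestamp first, raw bytes as a tie-breaker."""
    return (message_id.time, message_id.bytes)


def add_message(account: str, label_id: int, message_id: uuid.UUID) -> None:
    row = index_labels[row_key(account, label_id)]
    row.append(message_id)
    row.sort(key=sort_key)


def newest(account: str, label_id: int, limit: int,
           older_than: Optional[uuid.UUID] = None) -> List[uuid.UUID]:
    """Return up to `limit` message IDs, newest first, optionally starting
    after a previously seen ID (time-based pagination)."""
    row = index_labels.get(row_key(account, label_id), [])
    if older_than is not None:
        row = [m for m in row if sort_key(m) < sort_key(older_than)]
    return row[::-1][:limit]
```

A second page is requested by passing the last ID of the first page as `older_than`, which is why sorting only by time fits this layout well.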
28. Data Model ‒ MessageMetadata
SuperColumn Family
Stores message metadata and pre-parsed contents
Message headers, body and attachment info
TimeUUID as unique Message ID, ordered by time
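To show why a TimeUUID can double as a time-ordered message ID, the sketch below recovers the creation time from a version 1 UUID using Python's standard `uuid` module. The helper name `timeuuid_to_datetime` is made up for this example.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUID timestamps count 100 ns intervals since 1582-10-15 (RFC 4122).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def timeuuid_to_datetime(message_id: uuid.UUID) -> datetime:
    """Return the UTC time embedded in a time-based (version 1) UUID."""
    if message_id.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    # Drop the last decimal digit: 100 ns units -> microseconds.
    return GREGORIAN_EPOCH + timedelta(microseconds=message_id.time // 10)


message_id = uuid.uuid1()  # time-based UUID generated now
print(message_id, timeuuid_to_datetime(message_id).isoformat())
```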
31. Data Model ‒ Counters
SuperColumn Family
All of an account's counters are on the same node
"Counters" {
"user@elasticinbox.com" {
"l:0" {
"total_bytes" : 18239090,
"total_msg" : 394,
"new_msg" : 12
}
"l:1" {
"total_msg" : 144,
"new_msg" : 10
}
...
}
}
Non-atomic counters: it's easy to miscount
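One way a counter can drift is sketched below. The slides only say that counters are non-atomic and easy to miscount; the specific failure shown here (an increment that is applied but whose acknowledgement is lost, followed by a client retry) is an assumed scenario, and `FlakyCounterStore` is a toy class, not a Cassandra API.

```python
from collections import defaultdict


class FlakyCounterStore:
    """Toy counter store whose first increment is applied but reported as timed out."""

    def __init__(self):
        self.counters = defaultdict(int)
        self._fail_next = True

    def increment(self, key: str, delta: int) -> None:
        self.counters[key] += delta  # the write lands on the server...
        if self._fail_next:
            self._fail_next = False
            # ...but the client never receives the acknowledgement
            raise TimeoutError("no acknowledgement received")


def increment_with_retry(store: FlakyCounterStore, key: str, delta: int, attempts: int = 2) -> None:
    """Retry on timeout. Increments are not idempotent, so a retry can double count."""
    for attempt in range(attempts):
        try:
            store.increment(key, delta)
            return
        except TimeoutError:
            if attempt == attempts - 1:
                raise


store = FlakyCounterStore()
increment_with_retry(store, "user@elasticinbox.com/l:1/new_msg", 1)
print(store.counters["user@elasticinbox.com/l:1/new_msg"])  # prints 2, although one message arrived
```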
33. ElasticInbox in Production
In production since Nov 2011
~200K accounts, 30M+ messages
4 node cluster, RF=3, Cassandra 0.8.x
Each 1TB of raw mail = 70GB in Cassandra
Metadata + LZF-compressed email text/html body
34. ElasticInbox in Production
Cassandra load: 40 requests per second per node
Cassandra latency: 10ms read average, 0.02ms write
Write to Read ratio:
CF Name          W:R Ratio
MessageMetadata  3:1
IndexLabels      2:1
Accounts         1:50
Counters         2:3
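The ratios in the table can also be read as the share of requests that are writes. The short calculation below only converts the figures above; the dictionary and function names are illustrative.

```python
from fractions import Fraction

# (writes, reads) per column family, taken from the table above
WRITE_READ_RATIOS = {
    "MessageMetadata": (3, 1),
    "IndexLabels": (2, 1),
    "Accounts": (1, 50),
    "Counters": (2, 3),
}


def write_share(writes: int, reads: int) -> Fraction:
    """Fraction of all requests that are writes for a given W:R ratio."""
    return Fraction(writes, writes + reads)


for cf_name, (writes, reads) in WRITE_READ_RATIOS.items():
    print(f"{cf_name:<16} {float(write_share(writes, reads)):.0%} writes")
```

Read this way, MessageMetadata and IndexLabels are write-heavy, while Accounts is almost entirely reads.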
35. Future work
Performance improvements (may involve minor schema changes)
Full-text search (preferably on top of Cassandra)
POP3 and IMAP
Built-in filtering rules
Message threads / conversations