27. ElephantDB serving data flow
[Diagram: shards stored on the distributed filesystem flow to the ElephantDB servers; each server is shown with its own subset of the shards]
28. ElephantDB serving
• Each server in the “ring” serves a subset of the data
• Download shards, open, serve
29. Terminology
• “Domain”: Related set of key/value pairs
• “Shard”: Subset of a domain
• “Ring”: Cluster of servers that work together to serve the same set of domains
• “Local persistence”: Regular key/value database that implements a shard (like Berkeley DB)
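The terms above can be pictured with a small sketch. It is illustrative only and not ElephantDB's own code: the names `Domain`, `Ring`, `shard_for` and `server_for` are hypothetical, a plain dict stands in for a local persistence such as Berkeley DB, CRC32 is an arbitrary choice of hash, and the round-robin placement of shards on servers is an assumption.

```python
import zlib
from dataclasses import dataclass, field


def shard_for(key: bytes, num_shards: int) -> int:
    """Assign a key to a shard by hashing it (CRC32 is an illustrative choice)."""
    return zlib.crc32(key) % num_shards


@dataclass
class Domain:
    """A related set of key/value pairs, split into a fixed number of shards."""
    name: str
    num_shards: int
    # One plain dict per shard stands in for a local persistence.
    shards: list = field(default_factory=list)

    def __post_init__(self):
        if not self.shards:
            self.shards = [{} for _ in range(self.num_shards)]

    def put(self, key: bytes, value: bytes) -> None:
        self.shards[shard_for(key, self.num_shards)][key] = value


@dataclass
class Ring:
    """A cluster of servers that together serve the same set of domains."""
    servers: list  # hypothetical host names

    def server_for(self, shard: int) -> str:
        # Assumed placement rule: shards are spread round-robin over servers.
        return self.servers[shard % len(self.servers)]

    def get(self, domain: Domain, key: bytes):
        """Return (host that would serve the key, value or None)."""
        shard = shard_for(key, domain.num_shards)
        return self.server_for(shard), domain.shards[shard].get(key)
```

A lookup such as `Ring(["s0", "s1"]).get(domain, b"some-key")` hashes the key to a shard, picks the server assumed to hold that shard, and reads the value from that shard only.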
35. MapReduce flow
• (key, value) → hash mod → (shard, key, value)
• Group by shard → (shard, list<key, value>)
• Stream into LP → local persistence on LFS
• Upload → shard on DFS
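A minimal single-process sketch of this flow, under stated assumptions: `build_shards` is a hypothetical name, SQLite stands in for the local persistence (the slides name Berkeley DB as an example), CRC32 stands in for the hash, and copying a file into a directory stands in for the upload to the distributed filesystem. A real job would run the grouping as a MapReduce shuffle rather than in memory.

```python
import os
import shutil
import sqlite3
import zlib
from collections import defaultdict


def build_shards(pairs, num_shards, local_dir, dfs_dir):
    """Index (key, value) pairs into one file per shard and 'upload' each file."""
    # Map step: (key, value) -> (shard, key, value) via hash mod.
    tagged = ((zlib.crc32(k) % num_shards, k, v) for k, v in pairs)

    # Shuffle step: group by shard -> (shard, list<key, value>).
    grouped = defaultdict(list)
    for shard, k, v in tagged:
        grouped[shard].append((k, v))

    # Reduce step: stream each group into a local persistence on the local
    # filesystem, then copy the finished file to the "DFS" directory.
    os.makedirs(local_dir, exist_ok=True)
    os.makedirs(dfs_dir, exist_ok=True)
    uploaded = {}
    for shard in range(num_shards):
        local_path = os.path.join(local_dir, f"shard-{shard}.db")
        db = sqlite3.connect(local_path)
        db.execute("CREATE TABLE IF NOT EXISTS kv (k BLOB PRIMARY KEY, v BLOB)")
        db.executemany("INSERT OR REPLACE INTO kv VALUES (?, ?)",
                       grouped.get(shard, []))
        db.commit()
        db.close()
        # shutil.copy returns the path of the copied file.
        uploaded[shard] = shutil.copy(local_path, dfs_dir)
    return uploaded
```

Every shard gets a file even when no key hashes to it, so a server can open all shards of a domain without special cases.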
36. Incremental ElephantDB
• (key, value) → hash mod → (shard, key, value)
• Group by shard → (shard, list<key, value>)
• Download old shard on DFS → local persistence on LFS
• Stream into LP
• Upload → new shard on DFS
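The incremental variant can be sketched the same way. Again this is an assumption-laden illustration, not ElephantDB's implementation: `update_shards` is a hypothetical name, SQLite stands in for the local persistence, directory copies stand in for DFS download and upload, and the merge rule (a new value overwrites the old value for the same key) is assumed for simplicity.

```python
import os
import shutil
import sqlite3
import zlib
from collections import defaultdict


def update_shards(new_pairs, num_shards, old_dfs_dir, local_dir, new_dfs_dir):
    """Apply new (key, value) pairs on top of existing shards as a new version."""
    # (key, value) -> (shard, key, value) -> (shard, list<key, value>)
    grouped = defaultdict(list)
    for k, v in new_pairs:
        grouped[zlib.crc32(k) % num_shards].append((k, v))

    os.makedirs(local_dir, exist_ok=True)
    os.makedirs(new_dfs_dir, exist_ok=True)
    for shard in range(num_shards):
        name = f"shard-{shard}.db"
        local_path = os.path.join(local_dir, name)
        old_path = os.path.join(old_dfs_dir, name)
        # Download the old shard so that existing keys need no reindexing.
        if os.path.exists(old_path):
            shutil.copy(old_path, local_path)
        # Stream only the new pairs into the local persistence.
        db = sqlite3.connect(local_path)
        db.execute("CREATE TABLE IF NOT EXISTS kv (k BLOB PRIMARY KEY, v BLOB)")
        db.executemany("INSERT OR REPLACE INTO kv VALUES (?, ?)",
                       grouped.get(shard, []))
        db.commit()
        db.close()
        # Upload the result as the new shard; the old version stays untouched.
        shutil.copy(local_path, os.path.join(new_dfs_dir, name))
```

The work done per update is proportional to the number of new pairs plus the cost of copying each shard, rather than to reindexing every key in the domain.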
37. Incremental ElephantDB
• Avoids reindexing the domain from scratch
• Massive performance benefits in creating/updating versions
• The ElephantDB ring still has to download the domain from scratch