The document summarizes how Accumulo can scale to support large clusters storing petabytes of data. It discusses how Accumulo maintains low administrative effort and scan latency as the data size scales up. Key techniques for scaling Accumulo include distributing writes across all servers, designing schemas to minimize the number of scans needed, and using temporal or binned keys to parallelize writes. The document also provides estimates for planning Accumulo clusters capable of ingesting millions of entries per second and storing data in the petabyte range.
48. Group for Locality
Key Value
userA:name Bob
userA:age 43
userB:name Annie
userB:age 32
userC:name Fred
userC:age 29
userD:name Joe
userD:age 59
Key Value
userA:name Bob
userA:age 43
userA:account $30
userB:name Annie
userB:age 32
userB:account $25
userC:name Joe
userC:age 59
RowID Col Value
af362de4 name Annie
af362de4 age 32
af362de4 account $25
c48e2ade name Joe
c48e2ade age 59
e2e4dac4 name Bob
e2e4dac4 age 43
e2e4dac4 account $30
Still fairly uniform writes
49. Group for Locality
RowID Col Value
af362de4 name Annie
af362de4 age 32
af362de4 account $25
c48e2ade name Joe
c48e2ade age 59
e2e4dac4 name Bob
e2e4dac4 age 43
e2e4dac4 account $30
1 x 3-entry scan on 1 server
get(userA)
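A minimal Python sketch of the grouping idea, modelling the table as a sorted list of (row ID, column, value) entries rather than using the Accumulo client API. The `SortedTable` class and the `row_id` helper (which hashes a user name to 8 hex characters) are assumptions for illustration; the slides do not say how IDs such as af362de4 were derived.

```python
import bisect
import hashlib


def row_id(user: str) -> str:
    """Hypothetical row-ID scheme: first 8 hex chars of an MD5 digest."""
    return hashlib.md5(user.encode("utf-8")).hexdigest()[:8]


class SortedTable:
    """Toy stand-in for a sorted key-value table (not the Accumulo API)."""

    def __init__(self):
        self._entries = []  # kept sorted by (row, column, value)

    def put(self, row, col, value):
        bisect.insort(self._entries, (row, col, value))

    def scan_row(self, row):
        """One contiguous range scan returning every column of one row."""
        start = bisect.bisect_left(self._entries, (row,))
        result = []
        for r, col, value in self._entries[start:]:
            if r != row:
                break  # left the row: the scan is finished
            result.append((col, value))
        return result


users = {
    "userA": {"name": "Bob", "age": "43", "account": "$30"},
    "userB": {"name": "Annie", "age": "32", "account": "$25"},
    "userC": {"name": "Joe", "age": "59"},
}
table = SortedTable()
for user, cols in users.items():
    for col, value in cols.items():
        table.put(row_id(user), col, value)

# All of userA's columns sit next to each other, so one scan fetches them.
user_a = table.scan_row(row_id("userA"))
print(user_a)
```

Hashing the user name spreads rows across the key space, which keeps writes fairly uniform, while the shared row ID keeps one user's columns adjacent.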
57. Binned Temporal Keys
Key Value
20140101 44
20140102 22
20140103 23
RowID Col Value
0_20140101 44
1_20140102 22
2_20140103 23
Uniform Writes
58. Binned Temporal Keys
Key Value
20140101 44
20140102 22
20140103 23
20140104 25
20140105 31
20140106 27
RowID Col Value
0_20140101 44
0_20140104 25
1_20140102 22
1_20140105 31
2_20140103 23
2_20140106 27
Uniform Writes
59. Binned Temporal Keys
Key Value
20140101 44
20140102 22
20140103 23
20140104 25
20140105 31
20140106 27
20140107 25
20140108 17
RowID Col Value
0_20140101 44
0_20140104 25
0_20140107 25
1_20140102 22
1_20140105 31
1_20140108 17
2_20140103 23
2_20140106 27
Uniform Writes
60. Binned Temporal Keys
RowID Col Value
0_20140101 44
0_20140104 25
0_20140107 25
1_20140102 22
1_20140105 31
1_20140108 17
2_20140103 23
2_20140106 27
One scan per bin
get(20140101 to 201404)
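The binning shown on these slides can be sketched in Python as follows. The bin count of 3 matches the slides; the `EPOCH` reference day and the rule "day offset modulo number of bins" are assumptions chosen so that the output reproduces the row IDs shown (a hash of the key would serve equally well). The helper names are hypothetical.

```python
from datetime import date

NUM_BINS = 3              # three bins, as on the slides
EPOCH = date(2014, 1, 1)  # assumed reference day so 20140101 falls in bin 0


def binned_row_id(day: date) -> str:
    """Prepend a bin number so consecutive days land in different bins."""
    bin_id = (day - EPOCH).days % NUM_BINS
    return f"{bin_id}_{day:%Y%m%d}"


def scan_ranges(start: date, end: date):
    """A time-range query becomes one (start_row, end_row) range per bin."""
    return [
        (f"{b}_{start:%Y%m%d}", f"{b}_{end:%Y%m%d}")
        for b in range(NUM_BINS)
    ]


rows = [binned_row_id(date(2014, 1, d)) for d in range(1, 9)]
print(sorted(rows))
print(scan_ranges(date(2014, 1, 2), date(2014, 1, 7)))
```

The trade-off is visible in `scan_ranges`: writes spread over `NUM_BINS` key ranges, but every time-range read costs `NUM_BINS` scans.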
62. Keys to Scaling
• Key design is critical
• Group data under common row IDs to reduce scans
• Prepend bins to row IDs to increase write parallelism
63. Splits
• Pre-split or organic splits
• When going from dev to production, ingest a representative sample, obtain its split points, and use them to pre-split the larger system
• Hundreds or thousands of tablets per server is ok
• Want at least one tablet per server
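One way to derive pre-split points from a representative sample, sketched in plain Python. The `split_points` function and the evenly spaced quantile rule are assumptions for illustration, not something the slides prescribe; in a real deployment the resulting row IDs would be handed to Accumulo's `TableOperations.addSplits`.

```python
def split_points(sample_row_ids, num_tablets):
    """Pick up to num_tablets - 1 row IDs that divide the sorted sample evenly."""
    if num_tablets < 2:
        return []
    rows = sorted(set(sample_row_ids))
    if not rows:
        return []
    step = len(rows) / num_tablets
    points = [rows[int(i * step)] for i in range(1, num_tablets)]
    return sorted(set(points))  # drop repeats if the sample is small


# Hypothetical sample of 100 row IDs, split for 4 tablets.
sample = [f"{i:04d}" for i in range(100)]
print(split_points(sample, 4))
```

Because the points come from observed keys, tablets start out roughly equal in size only if the sample reflects the production key distribution.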
64. Effect of Compression
• Similar sorted keys compress well
• Because compression shrinks what is stored, a table may need more raw data than you think before it auto-splits
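A small Python experiment illustrating the first bullet, using `zlib` as a stand-in compressor (an assumption; it is not a claim about the codec a given Accumulo table uses). The key formats are made up for the comparison.

```python
import hashlib
import zlib


def compressed_ratio(keys):
    """Compressed size divided by raw size for a newline-joined key list."""
    raw = "\n".join(keys).encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)


# Sorted keys that share long prefixes with their neighbours.
similar_keys = [f"user{i:08d}" for i in range(10_000)]

# Sorted keys with little in common: 8 hex chars of an MD5 digest.
hashed_keys = sorted(
    hashlib.md5(str(i).encode("utf-8")).hexdigest()[:8] for i in range(10_000)
)

similar_ratio = compressed_ratio(similar_keys)
hashed_ratio = compressed_ratio(hashed_keys)
print(f"similar keys: {similar_ratio:.2f}  hashed keys: {hashed_ratio:.2f}")
```

The lower the ratio, the more raw entries fit under a given on-disk size, which is why a size-based split can arrive later than a raw-byte estimate suggests.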
68. Update - Overwrite
• Performance same as insert
• Ignore (don’t read) existing value
• Accumulo’s Versioning Iterator does the overwrite
69. Update - Overwrite
RowID Col Value
af362de4 name Annie
af362de4 age 32
af362de4 account $25
c48e2ade name Joe
c48e2ade age 59
e2e4dac4 name Bob
e2e4dac4 age 43
e2e4dac4 account $30
userB:age -> 34
70. Update - Overwrite
RowID Col Value
af362de4 name Annie
af362de4 age 34
af362de4 account $25
c48e2ade name Joe
c48e2ade age 59
e2e4dac4 name Bob
e2e4dac4 age 43
e2e4dac4 account $30
userB:age -> 34
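The overwrite behaviour can be modelled in a few lines of Python. This is a toy model of "newest version wins", not the Versioning Iterator itself; the `put` and `scan_latest` helpers and the counter used as a timestamp are assumptions.

```python
import itertools

_clock = itertools.count(1)  # stand-in for write timestamps


def put(log, row, col, value):
    """Blind write: append a new version without reading the old one."""
    log.append((row, col, next(_clock), value))


def scan_latest(log):
    """Return only the newest version of each (row, column)."""
    latest = {}
    for row, col, ts, value in log:
        if (row, col) not in latest or ts > latest[(row, col)][0]:
            latest[(row, col)] = (ts, value)
    return {key: value for key, (_, value) in sorted(latest.items())}


log = []
put(log, "af362de4", "name", "Annie")
put(log, "af362de4", "age", "32")
put(log, "af362de4", "age", "34")  # the update is just another insert

visible = scan_latest(log)
print(visible)
```

Since `put` never reads, an update costs the same as an insert; the older version is filtered out when the data is read.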
71. Update - Combine
• Things like X = X + 1
• Normally one would have to read the old value to do this, but Accumulo Iterators allow multiple inserts to be combined at scan time or at compaction
• Performance is same as inserts
72. Update - Combine
RowID Col Value
af362de4 name Annie
af362de4 age 34
af362de4 account $25
c48e2ade name Joe
c48e2ade age 59
e2e4dac4 name Bob
e2e4dac4 age 43
e2e4dac4 account $30
userB:account -> +10
73. Update - Combine
RowID Col Value
af362de4 name Annie
af362de4 age 34
af362de4 account $25
af362de4 account $10
c48e2ade name Joe
c48e2ade age 59
e2e4dac4 name Bob
e2e4dac4 age 43
e2e4dac4 account $30
userB:account -> +10
74. Update - Combine
RowID Col Value
af362de4 name Annie
af362de4 age 34
af362de4 account $25
af362de4 account $10
c48e2ade name Joe
c48e2ade age 59
e2e4dac4 name Bob
e2e4dac4 age 43
e2e4dac4 account $30
getAccount(userB)
$35
75. Update - Combine
After compaction
RowID Col Value
af362de4 name Annie
af362de4 age 34
af362de4 account $35
c48e2ade name Joe
c48e2ade age 59
e2e4dac4 name Bob
e2e4dac4 age 43
e2e4dac4 account $30
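The combine sequence above, sketched in Python with integer dollar amounts. It mimics what a summing combiner does (Accumulo ships one as `SummingCombiner`) but is a plain-Python model; `scan_sum` and `compact` are hypothetical names.

```python
from collections import defaultdict


def scan_sum(entries):
    """Scan-time combine: add up every entry sharing a (row, column) key."""
    totals = defaultdict(int)
    for row, col, amount in entries:
        totals[(row, col)] += amount
    return dict(totals)


def compact(entries):
    """Compaction: rewrite the entries with one combined value per key."""
    return [(row, col, amount)
            for (row, col), amount in sorted(scan_sum(entries).items())]


entries = [
    ("af362de4", "account", 25),
    ("e2e4dac4", "account", 30),
]
entries.append(("af362de4", "account", 10))  # userB:account -> +10, no read

balance = scan_sum(entries)[("af362de4", "account")]
compacted = compact(entries)
print(balance, compacted)
```

The scan and the compaction give the same answer; compaction only reduces how many entries have to be stored and read.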
76. Update - Complex
• Some updates require looking at more data than Iterators have access to, such as multiple rows
• These require reading the data out in order to write the new value
• Performance will be much slower
77. Update - Complex
userC:account = getBalance(userA) + getBalance(userB)
RowID Col Value
af362de4 name Annie
af362de4 age 34
af362de4 account $35
c48e2ade name Joe
c48e2ade age 59
c48e2ade account $40
e2e4dac4 name Bob
e2e4dac4 age 43
e2e4dac4 account $30
35+30 = 65
78. Update - Complex
userC:account = getBalance(userA) + getBalance(userB)
RowID Col Value
af362de4 name Annie
af362de4 age 34
af362de4 account $35
c48e2ade name Joe
c48e2ade age 59
c48e2ade account $65
e2e4dac4 name Bob
e2e4dac4 age 43
e2e4dac4 account $30
35+30 = 65
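A sketch of the read-then-write pattern from this example, using a Python dict as a stand-in for the table. The `get_balance` and `update_from_sources` helpers are hypothetical; in a real client each read would be a scan against a possibly different server.

```python
def get_balance(table, row):
    """Read one account value (a separate lookup per row)."""
    return table[(row, "account")]


def update_from_sources(table, target_row, source_rows):
    """Read every source row, then write the computed value to the target."""
    total = sum(get_balance(table, row) for row in source_rows)
    table[(target_row, "account")] = total
    return total


table = {
    ("af362de4", "account"): 35,  # userB
    ("c48e2ade", "account"): 40,  # userC
    ("e2e4dac4", "account"): 30,  # userA
}
new_value = update_from_sources(table, "c48e2ade", ["e2e4dac4", "af362de4"])
print(new_value)
```

Unlike the overwrite and combine cases, the write cannot be issued until the reads return, which is where the extra latency comes from.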
81. Model for Ingest Rates
$A = 0.85^{\log_2 N} \times N \times S$
N - Number of machines
S - Single Server throughput (entries / second)
A - Aggregate Cluster throughput (entries / second)
Expect 85% increase in write rate
when doubling the size of the cluster
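The model can be evaluated directly. In the Python sketch below, the single-server rate of 100,000 entries per second is a made-up figure for illustration, and `aggregate_rate` is a hypothetical name. Note that, as the formula is written, each doubling of N multiplies A by 2 × 0.85 = 1.7.

```python
import math

EFFICIENCY = 0.85  # per-doubling factor from the model


def aggregate_rate(n_machines: int, single_rate: float) -> float:
    """A = 0.85^(log2 N) * N * S"""
    return EFFICIENCY ** math.log2(n_machines) * n_machines * single_rate


SINGLE_RATE = 100_000  # assumed entries/second for one server
for n in (1, 2, 4, 8, 16):
    print(n, round(aggregate_rate(n, SINGLE_RATE)))
```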
82. Estimating Machines Required
$N = 2^{\log_2(A/S) / 0.7655347}$
N - Number of machines
S - Single Server throughput (entries / second)
A - Target Aggregate throughput (entries / second)
Expect 85% increase in write rate
when doubling the size of the cluster
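The constant 0.7655347 is 1 + log2(0.85), which follows from solving the ingest model for N. A Python sketch of the estimate is below; the target of 1,000,000 entries per second and the single-server rate of 100,000 are made-up inputs, and the function names are hypothetical.

```python
import math

EXPONENT = 1 + math.log2(0.85)  # about 0.7655347


def estimate_machines(target_rate: float, single_rate: float) -> float:
    """N = 2^(log2(A/S) / 0.7655347), as a fractional machine count."""
    return 2 ** (math.log2(target_rate / single_rate) / EXPONENT)


def machines_required(target_rate: float, single_rate: float) -> int:
    """Round the estimate up to whole machines."""
    return math.ceil(estimate_machines(target_rate, single_rate))


estimate = estimate_machines(1_000_000, 100_000)
print(f"{estimate:.1f} -> {machines_required(1_000_000, 100_000)} machines")
```

With these inputs a tenfold throughput target needs roughly twenty machines rather than ten, because per-server throughput falls as the cluster grows.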