2. About me
Software Engineer at Sematext International
http://blog.sematext.com/author/abaranau
@abaranau
http://github.com/sematext (abaranau)
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
3. Plan
Problem background: what? why?
Going real-time with append-only updates
approach: how?
Open-source implementation: how exactly?
Q&A
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
4. Background: our services
Systems Monitoring Service (Solr, HBase, ...)
Search Analytics Service
data collector Reports
Data
100
75
data collector Analytics & 50
Storage 25
data collector 0
2007 2008 2009 2010
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
5. Background: Report Example
Search engine (Solr) request latency
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
6. Background: Report Example
HBase flush operations
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
7. Background: requirements
High volume of input data
Multiple filters/dimensions
Interactive (fast) reports
Show wide range of data intervals
Real-time data changes visibility
No sampling, accurate data needed
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
8. Background: serve raw data?
simply storing all data points doesn’t work
to show 1-year worth of data points collected every second
31,536,000 points have to be fetched
pre-aggregation (at least partial) needed
Data Analytics & Storage Reports
aggregated data
input data
data processing
(pre-aggregating)
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
9. Background: pre-aggregation
OLAP-like Solution
aggregation rules
* filters/dimensions
* time range granularities aggregated
* ... value
aggregated
input data processing value
item
logic
aggregated
value
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
10. Background: pre-aggregation
Simplified Example aggregated record groups
by minute
minute: 22214701
aggregation rules value: 30.0
* by sensor ...
* by minute/day by day
day: 2012-04-26
input data item value: 10.5
time: 1332882078 processing ...
sensor: sensor55
value: 80.0 logic by minute & sensor
minute: 22214701
sensor: sensor55
cpu: 70.3
...
...
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
11. Background: RMW updates are slow
more dimensions/filters -> greater output data vs input data
ratio
individual ready-modify-write (Get+Put) operations are slow
and not efficient (10-20+ times slower than only Puts)
sensor1 sensor2
... <...>
sensor2
value:15.0 ... value:41.0
input
Get Put Get Put Get Put
... sensor1 sensor2
... reports
<...> avg : 28.7 Get/Scan sensor2
min: 15.0
storage max: 41.0
(HBase)
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
12. Background: improve updates
Using in-place increment operations? Not fast
enough and not flexible...
Buffering input records on the way in and
writing in small batches? Doesn’t scale and
possible loss of data...
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
13. Background: batch updates
More efficient data processing: multiple
updates processed at once, not individually
Decreases aggregation output (per input
record)
Reliable, no data loss
Using “dictionary records” helps to reduce
number of Get operations
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
14. Background: batch updates
“Dictionary Records”
Using data de-normalization to reduce random Get
operations while doing “Get+Put” updates:
Keep compound records which hold data of
multiple “normal” records that are usually
updated together
N Get+Put operations replaced with M (Get+Put)
and N Put operations, where M << N
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
15. Background: batch updates
Not real-time
If done frequently (closer to real-time), still
a lot of costly Get+Put update operations
Bad (any?) rollback support
Handling of failures of tasks which partially
wrote data to HBase is complex
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
16. Going Real-time
with
Append-based Updates
Sunday, May 20, 12
17. Append-only: main goals
Increase record update throughput
Process updates more efficiently: reduce
operations number and resources usage
Ideally, apply high volume of incoming data
changes in real-time
Add ability to roll back changes
Handle well high update peaks
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
18. Append-only: how?
1. Replace read-modify-write (Get+Put) operations
at write time with simple append-only writes (Put)
2. Defer processing of updates to periodic jobs
3. Perform processing of updates on the fly only
if user asks for data earlier than updates are
processed.
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
19. Append-only: writing updates
1 Replace update (Get+Put) operations at write time
with simple append-only writes (Put)
sensor1 sensor2 sensor2
... ... value:15.0 ... value:41.0 input
Put Put Put
... sensor1 sensor2
...
<...> avg : 22.7
max: 31.0
sensor1
<...>
...
... sensor2
value: 15.0
storage sensor2
value: 41.0
(HBase) ... Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
20. Append-only: writing updates
2 Defer processing of updates to periodic jobs
processing updates with MR job
... sensor1 sensor2
...
<...> avg : 22.7 ... sensor1 sensor2
...
max: 31.0 <...> avg : 23.4
sensor1
<...>
... max: 41.0
... sensor2
value: 15.0
sensor2
value: 41.0
...
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
21. Append-only: writing updates
3 Perform aggregations on the fly if user asks
for data earlier than updates are processed
... sensor1 sensor2
... reports
<...> avg : 22.7 sensor1 ...
max: 31.0
sensor1
<...>
...
... sensor2
value: 15.0
sensor2
storage value: 41.0
...
sensor2
avg : 23.4
max: 41.0
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
22. Append-only: benefits
High update throughput
Real-time updates visibility
Efficient updates processing
Handling high peaks of update operations
Ability to roll back any range of changes
Automatically handling failures of tasks which
only partially updated data (e.g. in MR jobs)
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
23. 1/6
Append-only: high update throughput
Avoid Get+Put operations upon writing
Use only Put operations (i.e. insert new
records only) which is very fast in HBase
Process updates when flushing client-side
buffer to reduce the number of actual
writes
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
24. 2/6
Append-only: real-time updates
Increased update throughput allows to apply
updates in real-time
User always sees the latest data changes
Updates processed on the fly during Get or
Scan can be stored back right away
Periodic updates processing helps avoid doing
a lot of work during reads, making reading
very fast
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
25. 3/6
Append-only: efficient updates
To apply N changes:
N Get+Put operations replaced with
N Puts and 1 Scan (shared) + 1 Put operation
Applying N changes at once is much more
efficient than performing N individual changes
Especially when updated value is complex (like bitmaps),
takes time to load in memory
Skip compacting if too few records to process
Avoid a lot of redundant Get operations when
large portion of operations - inserting new data
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
26. 4/6
Append-only: high peaks handling
Actual updates do not happen at write time
Merge is deferred to periodic jobs, which can
be scheduled to run at off-peak time (nights/
week-ends)
Merge speed is not critical, doesn’t affect the
visibility of changes
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
27. 5/6
Append-only: rollback
Rollbacks are easy when updates were not
processed yet (not merged)
To preserve rollback ability after they are
processed (and result is written back), updates
can be compacted into groups
written at: processing updates
9:00 ... sensor2
...
... ... sensor2 ...
...
sensor2
...
...
10:00 sensor2 sensor2
... ...
sensor2
...
...
11:00 sensor2 sensor2
... ...
... ...
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
28. 5/6
Append-only: rollback
Example:
* keep all-time avg value for sensor
* data collected every 10 second for 30 days
Solution:
* perform periodic compactions every 4 hours
* compact groups based on 1-hour interval
Result:
At any point of time there are no more than
24 * 30 + 4 * 60 * 6 = 2160 non-compacted
records that needs to be processed on the fly
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
29. 6/6
Append-only: idempotency
Using append-only approach helps recover from
failed tasks which write data to HBase
without rolling back partial updates
avoids applying duplicate updates
fixes task failure with simple restart of task
Note: new task should write records with same row
keys as failed one
easy, esp. given that input data is likely to be same
Very convenient when writing from MapReduce
Updates processing periodic jobs are also idempotent
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
30. Append-only: cons
Processing on the fly makes reading slower
Looking for data to compact (during periodic
compactions) may be inefficient
Increased amount of stored data depending
on use-case (in 0.92+)
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
31. Append-only + Batch?
Works very well together, batch approach
benefits from:
increased update throughput
automatic task failures handling
rollback ability
Use when HBase cluster cannot cope with
processing updates in real-time or update
operations are bottleneck in your batch
We use it ;)
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
33. HBaseHUT: Overview
Simple
Easy to integrate into existing projects
Packed as a singe jar to be added to HBase client
classpath (also add it to RegionServer classpath to
benefit from server-side optimizations)
Supports native HBase API: HBaseHUT classes
implement native HBase interfaces
Apache License, v2.0
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
34. HBaseHUT: Overview
Processing of updates on-the-fly (behind
ResultScanner interface)
Allows storing back processed Result
Can use CPs to process updates on server-side
Periodic processing of updates with Scan or
MapReduce job
Including processing updates in groups based on write ts
Rolling back changes with MapReduce job
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
35. HBaseHUT vs(?) OpenTSDB
“vs” is wrong, they are simply different things
OpenTSDB is a time-series database
HBaseHUT is a library which implements append-only
updates approach to be used in your project
OpenTSDB uses “serve raw data” approach (with
storage improvements), limited to handling
numeric values
HBaseHUT is meant for (but not limited to)
“serve aggregated data” approach, works with
any data
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
36. HBaseHUT: API overview
Writing data:
Put put = new Put(HutPut.adjustRow(rowKey));
// ...
hTable.put(put);
Reading data:
Scan scan = new Scan(startKey, stopKey);
ResultScanner resultScanner =
new HutResultScanner(hTable.getScanner(scan),
updateProcessor);
for (Result current : resultScanner) {...}
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
37. HBaseHUT: API overview
Example UpdateProcessor:
public class MaxFunction extends UpdateProcessor {
// ... constructor & utility methods
@Override
public void process(Iterable<Result> records,
UpdateProcessingResult result) {
Double maxVal = null;
for (Result record : records) {
double val = getValue(record);
if (maxVal == null || maxVal < val) {
maxVal = val;
}
}
result.add(colfam, qual, Bytes.toBytes(maxVal));
}
}
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
38. HBaseHUT: how we use it
Data Analytics & Storage Reports
aggregated data
HBaseHUT
HBaseHUT
input initial data
data processing
HBase
HBaseHUT
HBaseHUT
periodic
MapReduce
updates
jobs
processing
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
39. HBaseHUT: Next Steps
Wider CPs (HBase 0.92+) utilization
Process updates during memstore flush
Make use of Append operation (HBase 0.94+)
Integrate with asynchbase lib
Reduce storage overhead from adjusting
row keys
etc.
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12
40. Qs?
http://github.com/sematext/HBaseHUT
http://blog.sematext.com
@abaranau
http://github.com/sematext (abaranau)
http://sematext.com, we are hiring! ;)
Alex Baranau, Sematext International, 2012
Sunday, May 20, 12