For Map/Reduce programmers used to HDFS, the mutability of HBase tables poses new challenges: data can change over the duration of a job, multiple jobs can write concurrently, writes take effect immediately, and it is not trivial to clean up partial writes. Revision Manager introduces atomic commits and point-in-time consistent snapshots over a table, guaranteeing repeatable reads and protection from partial writes. Revision Manager is optimized for a relatively small number of concurrent write jobs, which is typical within Hadoop clusters. This session will discuss the implementation of Revision Manager using ZooKeeper and coprocessors, and the extra care taken to ensure security in multi-tenant clusters. Revision Manager is available as part of the HBase storage handler in HCatalog, but can easily be used stand-alone with little coding effort.
4. Mutable Data
Partial writes in the midst of failures
[Diagram: Write_Job writes to Table1 via Map1 (C1=1), Map2 (C2=2), and Map3 (C3=?); Map3's write has not completed, leaving a partial write]
5. Mutable Data
Partial writes in the midst of failures
[Diagram: the same Write_Job, now with a concurrent Read_Job scanning Table1 while Map3's write (C3=?) is still incomplete, so the reader observes a partial write]
6. Revision Manager
Optimized for batch processing
› Large number of writes (i.e. data ingestion, batch updates)
Cross-row write transactions within a table
Coprocessor Endpoint
› Leverages HBase security
ZooKeeper for persistence
› Table revision information
Experimental feature in HCatalog 0.4
8. API
For reads
› RevisionManager.createSnapshot(tableName)
› SnapshotFilter.filter(result)
For writes
› RevisionManager.beginWriteTransaction(table, families)
› RevisionManager.commitWriteTransaction(transaction)
› RevisionManager.abortWriteTransaction(transaction)
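The API above can be illustrated with a minimal in-memory sketch. This is an assumption-laden simplification: the real RevisionManager runs as a coprocessor endpoint and persists its state in ZooKeeper, and the class and field names here are illustrative, not the actual HCatalog code.

```java
import java.util.*;

// Simplified in-memory sketch of the Revision Manager API from the slides.
// The real implementation persists revision state in ZooKeeper; names and
// fields here are illustrative assumptions, not the actual HCatalog classes.
public class RevisionManagerSketch {

    public static class Transaction {
        public final long revision;          // monotonically increasing number
        public final List<String> families;  // column families being written to
        Transaction(long revision, List<String> families) {
            this.revision = revision;
            this.families = families;
        }
    }

    public static class TableSnapshot {
        public final long latestCommitted;   // latest committed revision
        public final Set<Long> aborted;      // list of aborted revisions
        TableSnapshot(long latestCommitted, Set<Long> aborted) {
            this.latestCommitted = latestCommitted;
            this.aborted = aborted;
        }
    }

    private long nextRevision = 1;
    private long latestCommitted = 0;
    private final Set<Long> aborted = new HashSet<>();

    public Transaction beginWriteTransaction(String table, List<String> families) {
        return new Transaction(nextRevision++, families);
    }

    public void commitWriteTransaction(Transaction t) {
        latestCommitted = Math.max(latestCommitted, t.revision);
    }

    public void abortWriteTransaction(Transaction t) {
        aborted.add(t.revision);  // joins the table's aborted list
    }

    public TableSnapshot createSnapshot(String tableName) {
        return new TableSnapshot(latestCommitted, new HashSet<>(aborted));
    }
}
```

A reader would pass the TableSnapshot to the SnapshotFilter at scan time; a writer would tag every Put with its Transaction's revision number.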
9. Concepts
Revision
› Monotonically increasing number
› All “Puts” of a job are written with the same revision number as the cell version
TableSnapshot
› Point-in-time consistent view of a table
› Used for reading
› Latest committed revision
› List of aborted revisions
› Upper bound on visible revision per CF
Transaction
› Write transaction
› Revision Number
› List of column families being written to
12. Change After Commit
Revisions are only visible after commit
› A job cannot see its own writes
Aborted revisions are added to a table's aborted list
Timed-out revisions are aborted
14. Precedence Preservation
Snapshot Isolation
› Transaction is aborted when a write conflict is detected
Conflicts
› Concurrent transactions to the same column family
› Inefficient to abort, so they are resolved at read time
• For every CF:
– Find min_rev = min(active_revisions)
– Only return the revision closest to min_rev
• min_rev is what's stored in a snapshot
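The read-time resolution above can be sketched for a single column family. This is a simplified model under stated assumptions: "closest revision to min_rev" is taken to mean the largest cell version strictly below min_rev (anything at or above min_rev might belong to a still-active transaction), and all names are illustrative.

```java
import java.util.*;

// Sketch of read-time conflict resolution for one column family.
// min_rev is the smallest revision still active (uncommitted) when the
// snapshot was taken; only cell versions below min_rev are safe to return,
// and of those we return the closest (largest) one.
public class PrecedenceSketch {

    // Smallest active revision; stored in the snapshot per column family.
    public static long minRev(Collection<Long> activeRevisions) {
        return activeRevisions.isEmpty() ? Long.MAX_VALUE
                                         : Collections.min(activeRevisions);
    }

    // Given all versions of a cell (revision numbers), return the visible
    // one: the largest revision strictly below min_rev, or -1 if none.
    public static long visibleVersion(Collection<Long> cellRevisions, long minRev) {
        long best = -1;
        for (long rev : cellRevisions) {
            if (rev < minRev && rev > best) best = rev;
        }
        return best;
    }
}
```

With active revisions {5, 7}, min_rev is 5, so a cell with versions {2, 4, 6} resolves to version 4: revision 6 is hidden because the earlier transaction 5 has not committed yet.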
15. Precedence Preservation
[Timeline: Write_Job1 begins transaction t1, writes CellA=2 and CellB=2, then commits; Write_Job2 begins t2 concurrently, writes CellA=3, then commits; Read_Job1 takes Snapshot1 and reads CellA=1, CellB=1 — the changes are not visible due to t1]
* CellA and CellB are members of the same column family
16. Snapshot Filter
Consumes TableSnapshot
Read time filtering
› Aborted revisions
› Revisions written after snapshot was taken
› Conflicting/Blocked revisions
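The three filtering rules can be condensed into a single visibility predicate. This is a hedged sketch: the real SnapshotFilter operates on HBase Result objects, and the parameter names (mirroring the snapshot fields listed in the Concepts slide) are assumptions.

```java
import java.util.*;

// Sketch of the SnapshotFilter visibility check for one cell revision.
// A revision is filtered out when it was aborted, written after the
// snapshot was taken, or conflicting/blocked (above the per-CF upper
// bound on visible revisions). Names are illustrative assumptions.
public class SnapshotFilterSketch {

    public static boolean isVisible(long revision,
                                    long latestCommitted,  // from TableSnapshot
                                    Set<Long> aborted,     // from TableSnapshot
                                    long cfUpperBound) {   // per-CF visible bound
        if (aborted.contains(revision)) return false;      // aborted revision
        if (revision > latestCommitted) return false;      // written after snapshot
        if (revision > cfUpperBound) return false;         // conflicting/blocked
        return true;
    }
}
```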
17. Flow - Read
User/Client
› RevisionManager.createSnapshot()
• TableSnapshot instance is serialized into JobConf
RecordReader
› Using SnapshotFilter.filter(result)
18. Flow - Read
[Sequence diagram: SnapshotRecordReader.next(key, value) loops — calling ScannerIterator.next() for the next result and SnapshotFilter.filter(result) — while result != null and filtered == null; the first result that survives filtering is returned as the next record]
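The loop in the diagram can be sketched as follows, using plain strings as stand-ins for HBase Result objects and a function as a stand-in for the real SnapshotFilter; the structure, not the types, is the point.

```java
import java.util.*;
import java.util.function.Function;

// Sketch of the SnapshotRecordReader loop: keep pulling results from the
// scanner iterator and passing them through the snapshot filter until a
// result survives filtering (or the scanner is exhausted).
public class ReadLoopSketch {

    public static String nextRecord(Iterator<String> scanner,
                                    Function<String, String> filter) {
        String filtered = null;
        // Loop while there is a next result and nothing passed the filter,
        // i.e. while result != null and filtered == null in the diagram.
        while (filtered == null && scanner.hasNext()) {
            String result = scanner.next();
            filtered = filter.apply(result);  // null means "filtered out"
        }
        return filtered;  // next visible record, or null when exhausted
    }
}
```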
19. Flow - Write
User/Client
› HBaseOutputFormat.checkOutputSpecs(FileSystem, JobConf)
• Write transaction is started by calling RevisionManager.beginWriteTransaction(table, families)
• Transaction instance is serialized into JobConf
RecordWriter
› Puts make use of the revision number as the version
OutputCommitter
› OutputCommitter.commitJob(JobContext)
• RevisionManager.commitWriteTransaction(Transaction)
› OutputCommitter.abortJob(JobContext)
• RevisionManager.abortWriteTransaction(Transaction)
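The whole write flow can be condensed into a small in-memory simulation. Everything here is an illustrative stand-in for the HBaseOutputFormat / OutputCommitter / RevisionManager interplay described above, not the actual classes.

```java
import java.util.*;

// Sketch of the write flow: a transaction is begun before the job runs,
// every Put uses the transaction's revision number as the cell version,
// and the committer either commits or aborts the whole revision.
public class WriteFlowSketch {

    static long nextRevision = 1;
    static long latestCommitted = 0;
    static final Set<Long> aborted = new HashSet<>();
    // cell -> (revision -> value): each cell keeps one value per revision.
    static final Map<String, Map<Long, String>> table = new HashMap<>();

    // checkOutputSpecs: begin the write transaction for the job.
    static long beginWriteTransaction() { return nextRevision++; }

    // RecordWriter: each Put uses the revision number as the cell version.
    static void put(long revision, String cell, String value) {
        table.computeIfAbsent(cell, k -> new HashMap<>()).put(revision, value);
    }

    // OutputCommitter.commitJob -> commitWriteTransaction
    static void commit(long revision) {
        latestCommitted = Math.max(latestCommitted, revision);
    }

    // OutputCommitter.abortJob -> abortWriteTransaction; the data stays in
    // the table but the revision is filtered out at read time.
    static void abort(long revision) { aborted.add(revision); }

    // Read path: latest committed, non-aborted version of a cell.
    static String read(String cell) {
        Map<Long, String> versions = table.getOrDefault(cell, Map.of());
        long best = -1;
        for (long rev : versions.keySet()) {
            if (rev <= latestCommitted && !aborted.contains(rev) && rev > best) {
                best = rev;
            }
        }
        return best == -1 ? null : versions.get(best);
    }
}
```

Note the design point the slides make: an aborted job's Puts are never physically rolled back here; they are simply never visible, which is why compaction of aborted transactions appears under future work.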
20. Usage
With HCatalog, Revision Manager is used under the covers.
Work is being done to decouple HCatalog from HBaseInputFormat/HBaseOutputFormat
Other frameworks can make use of the RevisionManager API
21. Usage: HCatalog
Create Table
hcat -e "create table my_table(key string, gpa string) STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler' TBLPROPERTIES ('hbase.columns.mapping'=':key,info:gpa');"
Using Pig
A = LOAD 'table1' USING org.apache.hcatalog.pig.HCatLoader();
STORE A INTO 'table1' USING org.apache.hcatalog.pig.HCatStorer();
Using MapReduce
HCatInputFormat.setInput(job,…)
HCatOutputFormat.setOutput(job,…)
22. Future Work
Compaction of aborted transactions
Server-side filtering using HBase Filters
Compatibility with Hive