For Map/Reduce programmers used to HDFS, the mutability of HBase tables poses new challenges: data can change over the duration of a job, multiple jobs can write concurrently, writes take effect immediately, and it is not trivial to clean up partial writes. Revision Manager introduces atomic commits and point-in-time consistent snapshots over a table, guaranteeing repeatable reads and protection from partial writes. Revision Manager is optimized for a relatively small number of concurrent write jobs, which is typical within Hadoop clusters. This session will discuss the implementation of Revision Manager using ZooKeeper and coprocessors, and the extra care taken to ensure security in multi-tenant clusters. Revision Manager is available as part of the HBase storage handler in HCatalog, but can easily be used stand-alone with little coding effort.
4. Mutable Data
Partial writes in the midst of failures
[Diagram: Write_Job writes to Table1 via Map1 (C1=1), Map2 (C2=2), and Map3 (C3=?); Map3's write has not completed, leaving a partial write]
5. Mutable Data
Partial writes in the midst of failures
[Diagram: the same Write_Job, now with a concurrent Read_Job scanning Table1 while Map3's write (C3=?) is still incomplete, so the reader observes a partial write]
6. Revision Manager
Optimized for batch processing
› Large number of writes (i.e. data ingestion, batch updates)
Cross-row write transactions within a table
Coprocessor Endpoint
› Leverages HBase security
ZooKeeper for persistence
› Table revision information
Experimental feature in HCatalog 0.4
8. API
For reads
› RevisionManager.createSnapshot(tableName)
› SnapshotFilter.filter(result)
For writes
› RevisionManager.beginWriteTransaction(table, families)
› RevisionManager.commitWriteTransaction(transaction)
› RevisionManager.abortWriteTransaction(transaction)
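The API above can be illustrated with a minimal in-memory sketch. This is an assumption-laden simplification: the real RevisionManager runs as a coprocessor endpoint and persists its state in ZooKeeper, and the class and field names here are illustrative, not the actual HCatalog code.

```java
import java.util.*;

// Simplified in-memory sketch of the Revision Manager API from the slides.
// The real implementation persists revision state in ZooKeeper; names and
// fields here are illustrative assumptions, not the actual HCatalog classes.
public class RevisionManagerSketch {

    public static class Transaction {
        public final long revision;          // monotonically increasing number
        public final List<String> families;  // column families being written to
        Transaction(long revision, List<String> families) {
            this.revision = revision;
            this.families = families;
        }
    }

    public static class TableSnapshot {
        public final long latestCommitted;   // latest committed revision
        public final Set<Long> aborted;      // list of aborted revisions
        TableSnapshot(long latestCommitted, Set<Long> aborted) {
            this.latestCommitted = latestCommitted;
            this.aborted = aborted;
        }
    }

    private long nextRevision = 1;
    private long latestCommitted = 0;
    private final Set<Long> aborted = new HashSet<>();

    public Transaction beginWriteTransaction(String table, List<String> families) {
        return new Transaction(nextRevision++, families);
    }

    public void commitWriteTransaction(Transaction t) {
        latestCommitted = Math.max(latestCommitted, t.revision);
    }

    public void abortWriteTransaction(Transaction t) {
        aborted.add(t.revision);  // joins the table's aborted list
    }

    public TableSnapshot createSnapshot(String tableName) {
        return new TableSnapshot(latestCommitted, new HashSet<>(aborted));
    }
}
```

A reader would pass the TableSnapshot to the SnapshotFilter at scan time; a writer would tag every Put with its Transaction's revision number.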
9. Concepts
Revision
› Monotonically increasing number
› All “Puts” of a job are written with the same revision number as the cell version
TableSnapshot
› Point-in-time consistent view of a table
› Used for reading
› Latest committed revision
› List of aborted revisions
› Upper bound on visible revision per CF
Transaction
› Write transaction
› Revision Number
› List of column families being written to
12. Change After Commit
Revisions are only visible after commit
› A job cannot see its own writes
Aborted revisions are added to a table's aborted list
Timed-out revisions are aborted
14. Precedence Preservation
Snapshot Isolation
› Transaction is aborted when a write conflict is detected
Conflicts
› Concurrent transactions to the same column family
› Inefficient to abort, so they are resolved at read time
• For every CF:
– Find min_rev = min(active_revisions)
– Only return the revision closest to min_rev
• min_rev is what's stored in a snapshot
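The read-time resolution above can be sketched for a single column family. This is a simplified model under stated assumptions: "closest revision to min_rev" is taken to mean the largest cell version strictly below min_rev (anything at or above min_rev might belong to a still-active transaction), and all names are illustrative.

```java
import java.util.*;

// Sketch of read-time conflict resolution for one column family.
// min_rev is the smallest revision still active (uncommitted) when the
// snapshot was taken; only cell versions below min_rev are safe to return,
// and of those we return the closest (largest) one.
public class PrecedenceSketch {

    // Smallest active revision; stored in the snapshot per column family.
    public static long minRev(Collection<Long> activeRevisions) {
        return activeRevisions.isEmpty() ? Long.MAX_VALUE
                                         : Collections.min(activeRevisions);
    }

    // Given all versions of a cell (revision numbers), return the visible
    // one: the largest revision strictly below min_rev, or -1 if none.
    public static long visibleVersion(Collection<Long> cellRevisions, long minRev) {
        long best = -1;
        for (long rev : cellRevisions) {
            if (rev < minRev && rev > best) best = rev;
        }
        return best;
    }
}
```

With active revisions {5, 7}, min_rev is 5, so a cell with versions {2, 4, 6} resolves to version 4: revision 6 is hidden because the earlier transaction 5 has not committed yet.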
15. Precedence Preservation
[Timeline: Write_Job1 begins transaction t1, writes CellA=2 and CellB=2, then commits; Write_Job2 begins t2 concurrently, writes CellA=3, then commits; Read_Job1 takes Snapshot1 and reads CellA=1, CellB=1 — the changes are not visible due to t1]
* CellA and CellB are members of the same column family
16. Snapshot Filter
Consumes TableSnapshot
Read time filtering
› Aborted revisions
› Revisions written after snapshot was taken
› Conflicting/Blocked revisions
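The three filtering rules can be condensed into a single visibility predicate. This is a hedged sketch: the real SnapshotFilter operates on HBase Result objects, and the parameter names (mirroring the snapshot fields listed in the Concepts slide) are assumptions.

```java
import java.util.*;

// Sketch of the SnapshotFilter visibility check for one cell revision.
// A revision is filtered out when it was aborted, written after the
// snapshot was taken, or conflicting/blocked (above the per-CF upper
// bound on visible revisions). Names are illustrative assumptions.
public class SnapshotFilterSketch {

    public static boolean isVisible(long revision,
                                    long latestCommitted,  // from TableSnapshot
                                    Set<Long> aborted,     // from TableSnapshot
                                    long cfUpperBound) {   // per-CF visible bound
        if (aborted.contains(revision)) return false;      // aborted revision
        if (revision > latestCommitted) return false;      // written after snapshot
        if (revision > cfUpperBound) return false;         // conflicting/blocked
        return true;
    }
}
```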
17. Flow - Read
User/Client
› RevisionManager.createSnapshot()
• TableSnapshot instance is serialized into JobConf
RecordReader
› Using SnapshotFilter.filter(result)
18. Flow - Read
[Sequence diagram: SnapshotRecordReader.next(key, value) loops — calling ScannerIterator.next() for the next result and SnapshotFilter.filter(result) — while result != null and filtered == null; the first result that survives filtering is returned as the next record]
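The loop in the diagram can be sketched as follows, using plain strings as stand-ins for HBase Result objects and a function as a stand-in for the real SnapshotFilter; the structure, not the types, is the point.

```java
import java.util.*;
import java.util.function.Function;

// Sketch of the SnapshotRecordReader loop: keep pulling results from the
// scanner iterator and passing them through the snapshot filter until a
// result survives filtering (or the scanner is exhausted).
public class ReadLoopSketch {

    public static String nextRecord(Iterator<String> scanner,
                                    Function<String, String> filter) {
        String filtered = null;
        // Loop while there is a next result and nothing passed the filter,
        // i.e. while result != null and filtered == null in the diagram.
        while (filtered == null && scanner.hasNext()) {
            String result = scanner.next();
            filtered = filter.apply(result);  // null means "filtered out"
        }
        return filtered;  // next visible record, or null when exhausted
    }
}
```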
19. Flow - Write
User/Client
› HBaseOutputFormat.checkOutputSpecs(FileSystem, JobConf)
• Write transaction is started by calling RevisionManager.beginWriteTransaction(table, families)
• Transaction instance is serialized into JobConf
RecordWriter
› Puts make use of the revision number as the version
OutputCommitter
› OutputCommitter.commitJob(JobContext)
• RevisionManager.commitWriteTransaction(Transaction)
› OutputCommitter.abortJob(JobContext)
• RevisionManager.abortWriteTransaction(Transaction)
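The whole write flow can be condensed into a small in-memory simulation. Everything here is an illustrative stand-in for the HBaseOutputFormat / OutputCommitter / RevisionManager interplay described above, not the actual classes.

```java
import java.util.*;

// Sketch of the write flow: a transaction is begun before the job runs,
// every Put uses the transaction's revision number as the cell version,
// and the committer either commits or aborts the whole revision.
public class WriteFlowSketch {

    static long nextRevision = 1;
    static long latestCommitted = 0;
    static final Set<Long> aborted = new HashSet<>();
    // cell -> (revision -> value): each cell keeps one value per revision.
    static final Map<String, Map<Long, String>> table = new HashMap<>();

    // checkOutputSpecs: begin the write transaction for the job.
    static long beginWriteTransaction() { return nextRevision++; }

    // RecordWriter: each Put uses the revision number as the cell version.
    static void put(long revision, String cell, String value) {
        table.computeIfAbsent(cell, k -> new HashMap<>()).put(revision, value);
    }

    // OutputCommitter.commitJob -> commitWriteTransaction
    static void commit(long revision) {
        latestCommitted = Math.max(latestCommitted, revision);
    }

    // OutputCommitter.abortJob -> abortWriteTransaction; the data stays in
    // the table but the revision is filtered out at read time.
    static void abort(long revision) { aborted.add(revision); }

    // Read path: latest committed, non-aborted version of a cell.
    static String read(String cell) {
        Map<Long, String> versions = table.getOrDefault(cell, Map.of());
        long best = -1;
        for (long rev : versions.keySet()) {
            if (rev <= latestCommitted && !aborted.contains(rev) && rev > best) {
                best = rev;
            }
        }
        return best == -1 ? null : versions.get(best);
    }
}
```

Note the design point the slides make: an aborted job's Puts are never physically rolled back here; they are simply never visible, which is why compaction of aborted transactions appears under future work.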
20. Usage
With HCatalog, Revision Manager is used under the covers.
Work is being done to decouple HCatalog from HBaseInputFormat/HBaseOutputFormat
Other frameworks can make use of the RevisionManager API
21. Usage: HCatalog
Create Table
hcat -e "create table my_table(key string, gpa string) STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler' TBLPROPERTIES ('hbase.columns.mapping'=':key,info:gpa');"
Using Pig
A = LOAD 'table1' USING org.apache.hcatalog.pig.HCatLoader();
STORE A INTO 'table1' USING org.apache.hcatalog.pig.HCatStorer();
Using MapReduce
HCatInputFormat.setInput(job,…)
HCatOutputFormat.setOutput(job,…)
22. Future Work
Compaction of aborted transactions
Server-side filtering using HBase Filters
Compatibility with Hive