HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable. Yahoo! has long used HBase in isolated, one-off deployments. A multi-tenant platform now makes HBase capabilities available to all of our grid customers. We will give a brief overview of HBase and how it works (several of you asked for back-to-basics talks), then spend most of our time on multi-tenancy with HBase.
Presenter(s):
Francis Christopher Liu, Software Engineer, Yahoo! and PPMC Member, Apache HCatalog
Vandana Ayyalasomayajula, Software Engineer, Yahoo! and PPMC Member, Apache HCatalog
April 2013 HUG: HBase as a Service at Yahoo!
1. HBase as a Service at Yahoo!
Bay Area HUG Presentation
Francis Liu
Vandana Ayyalasomayajula
April 17, 2013
2. HBase Overview
Yahoo! Presentation, Confidential
Apache HBase is an open-source, Bigtable-like, distributed, scalable, consistent, random-access key-value store built on Apache Hadoop.
HBase Data Model
Column Family: Info
Rowkey | Email | Age | Password
Alice | alice@wonderland.com | 23 |
Bob | bob@myworld.com | 25 | Iambob
Eve | hithere@getintouch.com | 30 | nice1pass
The table is lexicographically sorted on rowkeys.
Cells: each cell has multiple versions, represented by timestamps (ts1 = 1, ts2 = 2, where ts2 > ts1). The figure also shows the cell values "trickedyou" and "newpassword" as additional versions.
Identify your data (cell value) in the HBase table by [1] rowkey, [2] column family, [3] column qualifier, [4] timestamp/version.
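The four-part cell address on this slide can be illustrated with a small in-memory model. This is only a sketch in plain Python, not the HBase client API: the class `TinyTable` and its methods are hypothetical names, and the choice of which row receives the "newpassword" version is an assumption, since the slide does not say.

```python
class TinyTable:
    """Toy model of the slide's data model (not the HBase client API)."""

    def __init__(self):
        # (rowkey, column family, column qualifier) -> {timestamp: value}
        self._cells = {}

    def put(self, rowkey, family, qualifier, timestamp, value):
        key = (rowkey, family, qualifier)
        self._cells.setdefault(key, {})[timestamp] = value

    def get(self, rowkey, family, qualifier, timestamp=None):
        versions = self._cells.get((rowkey, family, qualifier))
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)  # default to the newest version
        return versions.get(timestamp)

    def rowkeys(self):
        # Rows come back in sorted rowkey order, as on the slide.
        return sorted({rowkey for rowkey, _, _ in self._cells})


table = TinyTable()
table.put("Eve", "Info", "Password", 1, "nice1pass")
# Assumption: the slide does not state which row got the newer value.
table.put("Eve", "Info", "Password", 2, "newpassword")
table.put("Bob", "Info", "Email", 1, "bob@myworld.com")
table.put("Alice", "Info", "Age", 1, "23")

latest = table.get("Eve", "Info", "Password")
older = table.get("Eve", "Info", "Password", timestamp=1)
```

Here `latest` is the value written at the higher timestamp and `older` is the value written at ts1, which mirrors the "ts2 > ts1" note in the figure.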
3. HBase Distributed Mode
Table T1 (rowkey, role):
Andy Arch
Brad Arch
Dheeraj Ops
Eleanor PgM
Francis Dev
Govind Dev
Rajiv Ops
Sumeet PM
Vandana Dev
Table T1 is split into three regions R1, R2, R3.
Each region is served by a RegionServer collocated with the DataNode. In the figure, RS1 hosts T1R1 and RS2 hosts T1R2 and T1R3.
Example: find Sumeet's role with HBase.
Client contacts ZooKeeper, a separate cluster of ZK nodes, to retrieve the RS hosting the -ROOT- region (row / meta region).
Client then queries the .META. server that has the row key "Sumeet" (row / table region).
Figure labels: Client, ZooKeeper, -ROOT-, M1, M2, RS1, RS2, RS3, T1R1, T1R2, T1R3.
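The lookup on this slide ends with mapping a rowkey to the region whose key range contains it. A minimal sketch of that last step, using a sorted list of region start keys and binary search, is shown here. The split points ("Eleanor" and "Rajiv") and the names `REGIONS` and `locate` are assumptions for illustration; the slide gives only the region-to-server assignment (T1R1 on RS1, T1R2 and T1R3 on RS2).

```python
import bisect

# (start key, region, hosting RegionServer), sorted by start key.
# Assumption: the split points are illustrative, not taken from the slide.
REGIONS = [
    ("", "T1R1", "RS1"),
    ("Eleanor", "T1R2", "RS2"),
    ("Rajiv", "T1R3", "RS2"),
]
START_KEYS = [start for start, _, _ in REGIONS]


def locate(rowkey):
    """Return (region, server) for the region whose range holds rowkey."""
    # The owning region is the last one whose start key is <= rowkey.
    index = bisect.bisect_right(START_KEYS, rowkey) - 1
    _, region, server = REGIONS[index]
    return region, server


region, server = locate("Sumeet")
```

Because rowkeys are sorted lexicographically, a range lookup like this needs only the start keys, which is the kind of row-to-region mapping the meta lookup hands back to the client.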
8. Dimension Store Use Case
Figure components: HBase, HDFS, MapReduce, Hive, Pig. Data sets: Clickstream, Ad Campaign.
9. Incremental Processing Use Cases
Figure stages: Collection, Off-stage processing, On-stage. Components: Collector, HDFS, HBase, MapReduce, Storm, Serving, Store, Search. Inputs: Events, Files. Path labels: Slow, Fast.
10. Hadoop at Yahoo!
§ Hosted Multi-tenant Service
§ Security
§ Job Queues
§ HDFS Quota
11. HBase at Yahoo!
§ Hosted Multi-tenant Service
§ Security
§ Isolated Deployment
§ Region Server Group
§ Namespace
12. Security
§ Authentication
§ Kerberos (users, processes)
§ Delegation Token (MapReduce, YARN, etc)
§ Authorization
§ HBase ACLs (Read, Write, Create, Admin)
§ Grant permissions to User or Unix Group
§ ACL for Table, Column Family or Column
§ Only Global Admin can create/drop tables
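The authorization rules listed on this slide can be sketched as a small permission check. This is a toy model, not HBase's AccessController: the class `AclStore`, its methods, and the rule that a grant on a broader scope (table) also covers narrower scopes (column family, column) are assumptions made for illustration. Groups are written with an `@` prefix, as in HBase shell grants.

```python
from dataclasses import dataclass, field


@dataclass
class AclStore:
    """Toy ACL model: actions are 'R', 'W', 'C', 'A'."""
    global_admins: set = field(default_factory=set)
    # (subject, table, family, qualifier) -> set of granted actions
    grants: dict = field(default_factory=dict)

    def grant(self, subject, actions, table, family=None, qualifier=None):
        key = (subject, table, family, qualifier)
        self.grants.setdefault(key, set()).update(actions)

    def is_allowed(self, user, groups, action, table,
                   family=None, qualifier=None):
        subjects = [user] + ["@" + group for group in groups]
        # Assumption: a broader grant also covers the narrower scopes.
        scopes = [(table, None, None)]
        if family is not None:
            scopes.append((table, family, None))
            if qualifier is not None:
                scopes.append((table, family, qualifier))
        return any(
            action in self.grants.get((subject,) + scope, set())
            for subject in subjects
            for scope in scopes
        )

    def can_create_or_drop_table(self, user):
        # Per the slide, only a global admin may create or drop tables.
        return user in self.global_admins


acls = AclStore(global_admins={"hbase_admin"})
acls.grant("alice", "RW", "t1")                   # table-level grant
acls.grant("@ops", "R", "t1", "info", "email")    # column-level grant
```

With these grants, `alice` can read or write anywhere in `t1`, while members of the Unix group `ops` can read only the `info:email` column.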
14. Region Server Groups
§ Member Region Servers
§ Member Tables
§ Resource Isolation
§ Flexibility with configuration
Group Foo: Region Server 1…4, with member tables Table1 and Table2
Group Bar: Region Server 5…8, with member tables Table3 and Table4
In the figure, RS1 through RS4 each serve Table1 and Table2, and RS5 through RS8 each serve Table3 and Table4.
15. Region Server Groups
§ group_add
§ group_remove
§ group_move_servers
§ group_move_tables
§ create … { … CONFIGURATION=>{'hbase.rsgroup.name'=>'my_group'}}
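The effect of the group commands above can be sketched with a toy membership model. This is not the HBase implementation: the class `RSGroups`, the existence of a "default" group holding unassigned servers and tables, and the rule that a group must be empty before removal are all assumptions made for illustration.

```python
class RSGroups:
    """Toy model of region server group membership."""

    def __init__(self, servers):
        # Assumption: servers and tables start in a "default" group.
        self.servers = {"default": set(servers)}
        self.tables = {"default": set()}

    def group_add(self, name):
        if name in self.servers:
            raise ValueError("group already exists: " + name)
        self.servers[name] = set()
        self.tables[name] = set()

    def group_remove(self, name):
        # Assumption: only empty, non-default groups can be removed.
        if name == "default" or self.servers[name] or self.tables[name]:
            raise ValueError("group must be empty and not default")
        del self.servers[name]
        del self.tables[name]

    def group_move_servers(self, target, servers):
        self._move(self.servers, target, servers)

    def group_move_tables(self, target, tables):
        self._move(self.tables, target, tables)

    @staticmethod
    def _move(membership, target, items):
        if target not in membership:
            raise KeyError(target)
        items = set(items)
        for members in membership.values():
            members -= items          # each item belongs to one group only
        membership[target] |= items

    def servers_for(self, table):
        """Servers allowed to host regions of the given table."""
        for group, tables in self.tables.items():
            if table in tables:
                return sorted(self.servers[group])
        return sorted(self.servers["default"])


groups = RSGroups("RS%d" % i for i in range(1, 9))
groups.group_add("foo")
groups.group_move_servers("foo", ["RS1", "RS2", "RS3", "RS4"])
groups.group_move_tables("foo", ["Table1", "Table2"])
```

After these calls, regions of Table1 and Table2 are confined to RS1 through RS4, which is the resource isolation shown for Group Foo on the previous slide.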
17. Namespace
§ Analogous to Database
§ Table Name: <table namespace>.<table qualifier>
§ e.g. my_ns.my_table
§ Reserved namespaces
§ Default – tables with no explicit namespace
§ System – tables are guaranteed to be assigned prior to user tables
§ Table Path: /<hbaseRoot>/data/<namespace>/<tableName>
§ /hbase/data/my_ns/my_ns.my_table
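The naming and path rules on this slide can be sketched as two small helpers. The function names are hypothetical, and the lowercase spelling "default" for the reserved Default namespace is an assumption; the `.` separator and the path layout are taken from the slide.

```python
DEFAULT_NAMESPACE = "default"  # assumed spelling of the reserved namespace


def split_table_name(table_name):
    """Split '<namespace>.<qualifier>' into its two parts."""
    namespace, separator, qualifier = table_name.partition(".")
    if not separator:
        # No explicit namespace: the table falls into the default one.
        return DEFAULT_NAMESPACE, table_name
    return namespace, qualifier


def table_path(hbase_root, table_name):
    """Build /<hbaseRoot>/data/<namespace>/<tableName> as on the slide."""
    namespace, _ = split_table_name(table_name)
    return "/".join([hbase_root.rstrip("/"), "data", namespace, table_name])


path = table_path("/hbase", "my_ns.my_table")
```

For the slide's example, `path` is `/hbase/data/my_ns/my_ns.my_table`.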
18. Namespace + Security + Group + Quota
§ Tables
§ Namespace ACL
§ Default Region Server Group
§ Quota
§ Max Tables
§ Max Regions
Figure: a Namespace ties together Group, Tables, Quota, and ACL.
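The Max Tables and Max Regions quotas can be sketched as a check performed when a tenant creates a table. The class names, the `create_table` method, and the decision to reject a request outright when it would exceed either limit are assumptions for illustration, not the HBase implementation.

```python
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """Raised when a request would exceed a namespace quota."""


@dataclass
class Namespace:
    name: str
    max_tables: int
    max_regions: int
    default_group: str = "default"   # default region server group
    tables: dict = field(default_factory=dict)  # qualifier -> region count

    def create_table(self, qualifier, regions=1):
        if len(self.tables) + 1 > self.max_tables:
            raise QuotaExceeded("max tables reached in " + self.name)
        if sum(self.tables.values()) + regions > self.max_regions:
            raise QuotaExceeded("max regions reached in " + self.name)
        self.tables[qualifier] = regions
        return self.name + "." + qualifier


ns = Namespace("my_ns", max_tables=2, max_regions=10, default_group="foo")
full_name = ns.create_table("my_table", regions=4)
```

A second table with 6 regions would still fit in `my_ns`; one with 7 regions, or a third table of any size, would raise `QuotaExceeded`.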
20. Conclusion
§ HBase enables new processing paradigms (vs HDFS)
§ Namespaces provide tenants with a project space
§ Region Server Groups guarantee Isolation
§ Namespace quotas limit the use of shared resources
§ Namespace ACLs help with project-level administration