2. What is HBase
• Non-relational, distributed database
• Column-oriented
• Multi-dimensional
• High availability
• High performance
• Open source
3. Implementation
• HBase is modeled with an HBase master node orchestrating a cluster of one or more region server slaves.
• The HBase master is responsible for bootstrapping a virgin install, for assigning regions to registered region servers, and for recovering from region server failures. The master node is lightly loaded.
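The master's two duties named above (assigning regions and recovering from region server failures) can be pictured with a toy model. This is only an illustrative sketch: the round-robin policy, the function names assign_regions and recover_failure, and the server and region names are all assumptions made for the example, not HBase's actual assignment or balancing logic.

```python
def assign_regions(regions, servers):
    """Spread regions over the registered servers round-robin (toy policy)."""
    if not servers:
        raise ValueError("no registered region servers")
    assignment = {server: [] for server in servers}
    for i, region in enumerate(regions):
        assignment[servers[i % len(servers)]].append(region)
    return assignment


def recover_failure(assignment, failed):
    """Return a new assignment with the failed server's regions moved to survivors."""
    orphaned = assignment.get(failed, [])
    survivors = {s: list(r) for s, r in assignment.items() if s != failed}
    if not survivors:
        raise ValueError("no surviving region servers")
    for region in orphaned:
        # Hand each orphaned region to the currently least-loaded survivor.
        target = min(survivors, key=lambda s: len(survivors[s]))
        survivors[target].append(region)
    return survivors


# Hypothetical cluster: six regions on three region servers.
regions = ["region-%d" % i for i in range(6)]
servers = ["rs1", "rs2", "rs3"]
assignment = assign_regions(regions, servers)
after = recover_failure(assignment, "rs2")
```

In this sketch the master only keeps a small map of which server holds which region, and the region servers handle the data itself. That split is consistent with the slide's remark that the master is lightly loaded.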
4.
5. Installation
• Download a stable release from an Apache Download Mirror and unpack it on your local filesystem.
• You first need to tell HBase where Java is located on your system, i.e. through the JAVA_HOME environment variable.
• You can set the Java installation that HBase uses by editing HBase's conf/hbase-env.sh and specifying JAVA_HOME there.
6. Installation (continued)
• Add the HBase binary directory to your command-line path. For example:
  % export HBASE_HOME=/home/hbase/hbase-x.y.z
  % export PATH=$PATH:$HBASE_HOME/bin
• To get the list of HBase options, type: % hbase
• To start a temporary instance of HBase, type: % start-hbase.sh
• To launch the HBase shell, type: % hbase shell
7. Difference Between Hadoop/HDFS and HBase
• HDFS is a distributed file system that is well suited for the storage of large files.
• HBase, on the other hand, is built on top of HDFS and provides fast record lookups (and updates) for large tables.
• HDFS is based on the Google File System (GFS).
• HBase is a distributed (uses HDFS for storage), column-oriented, multi-dimensional (versions) storage system.
8. HBase is NOT
• A SQL database: no joins, no query engine, no datatypes, no (damn) SQL
• No schema
• No DBA needed
10. Storage Model
• Column-oriented database (column families)
• A table consists of rows, each of which has a primary key (row key)
• Each row may have any number of columns
• The table schema defines only column families (a column family can have any number of columns)
• Each cell value has a timestamp
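The storage model above can be pictured as a nested map: row key, then column (family:qualifier), then timestamp, then value. The following is a minimal in-memory sketch of that idea. The class name ToyTable and its methods are hypothetical, chosen for illustration; this is not the HBase client API.

```python
import time


class ToyTable:
    """Toy model: row key -> "family:qualifier" -> timestamp -> value."""

    def __init__(self, name, families):
        self.name = name
        self.families = set(families)  # the only thing the schema fixes
        self.rows = {}

    def put(self, row, column, value, ts=None):
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError("unknown column family: " + family)
        if ts is None:
            ts = int(time.time() * 1000)  # default to the current time in ms
        self.rows.setdefault(row, {}).setdefault(column, {})[ts] = value

    def versions(self, row, column):
        """All (timestamp, value) pairs for a cell, newest first."""
        cells = self.rows.get(row, {}).get(column, {})
        return sorted(cells.items(), reverse=True)

    def get(self, row, column):
        """Latest value of a cell, or None if the cell does not exist."""
        history = self.versions(row, column)
        return history[0][1] if history else None


t = ToyTable("test", ["cf"])
t.put("row1", "cf:a", "value1", ts=1)
t.put("row1", "cf:a", "value1-new", ts=2)   # second version of the same cell
t.put("row1", "cf:anything", "x", ts=1)     # qualifiers need no declaration
latest = t.get("row1", "cf:a")
history = t.versions("row1", "cf:a")
```

Only the column family is checked against the schema. Qualifiers are created on write, and each write adds a timestamped version rather than overwriting the cell.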
11. HBase shell
hbase(main):003:0> create 'test', 'cf'
0 row(s) in 1.2200 seconds
hbase(main):004:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0560 seconds
hbase(main):005:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0370 seconds
hbase(main):006:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0450 seconds
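After the three puts above, the table holds three rows. HBase keeps rows sorted by row key, and a scan walks them in that order, with the start row inclusive and the stop row exclusive. The sketch below imitates that behavior on a plain dictionary. The function scan and the dictionary layout are assumptions for illustration, and Python string ordering stands in for HBase's byte-wise ordering (the two agree for ASCII keys like these).

```python
def scan(rows, start_row=None, stop_row=None):
    """Yield (row_key, columns) in row-key order; start inclusive, stop exclusive."""
    for key in sorted(rows):
        if start_row is not None and key < start_row:
            continue
        if stop_row is not None and key >= stop_row:
            break
        yield key, rows[key]


# The same cells written by the shell session, inserted out of order on purpose.
rows = {
    "row3": {"cf:c": "value3"},
    "row1": {"cf:a": "value1"},
    "row2": {"cf:b": "value2"},
}

all_keys = [key for key, _ in scan(rows)]
tail_keys = [key for key, _ in scan(rows, start_row="row2")]
only_row2 = list(scan(rows, start_row="row2", stop_row="row3"))
```

Because rows come back in key order regardless of insertion order, the choice of row key determines which records sit next to each other and can be read with a single range scan.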
12. HBase vs RDBMS
                  RDBMS                          HBase
Data layout       Row-oriented                   Column family oriented
Query language    SQL                            Get/put/scan/etc *
Security          Authentication/Authorization   Work in progress
Max data size     TBs                            Hundreds of PBs
Read/write        1000s queries/second           Millions of queries/second
13. Use Cases
• Facebook Analytics
  • Real-time counters of URLs shared, preferred links
• Twitter
  • 25 TB of messages every month
• Mozilla
  • Stores crash reports, 2.5 million per day