Apache Phoenix and HBase: Past, Present and Future of SQL over HBase

© Hortonworks Inc. 2011 – 2014. All Rights Reserved
Enis Soztutar (enis@hortonworks.com)
Ankit Singhal (asinghal@hortonworks.com)

About Me
Enis Soztutar
Committer and PMC member in Apache HBase, Phoenix, and Hadoop
HBase/Phoenix team @Hortonworks
Twitter @enissoz
Disclaimer: Not a SQL expert!

Outline
PART I – The Past (a.k.a. All the existing stuff)
• Phoenix the basics
• Architecture
• Overview of existing Phoenix features
PART II – The Present (a.k.a. All the recent stuff)
• Look at recent releases
• Transactions
• Phoenix Query Server
• Other features
PART III – The Future (a.k.a. All the upcoming stuff)
• Calcite integration
• Phoenix – Hive

Part I – The Past
All the existing stuff!

Obligatory Slide - Who uses Phoenix

Phoenix – The Basics
• Hope everybody is familiar with HBase
• Otherwise you are in the wrong talk!
• What is wrong with pure-HBase?
• HBase is a powerful, flexible and extensible “engine”
• Too low level
• Have to write Java code to do anything!
• Phoenix is a relational layer over HBase
• Also described as a SQL-Skin
• Looking more and more like a generic SQL engine
• Why not Hive / Spark SQL / other SQL-over-Hadoop
• OLTP versus OLAP
• As fast as HBase, 1 ms query, 10K-1M qps

Why SQL?

From CDK Global slides: https://phoenix.apache.org/presentations/StrataHadoopWorld.pdf

HBase Architecture
[Diagram: an Application with an embedded HBase client, a ZooKeeper Quorum, and three DataNodes that each host a RegionServer]
RegionServer 1: T:foo, region:c; T:bar, region:14; T:foo, region:d
RegionServer 2: T:foo, region:a; T:bar, region:54; T:foo, region:t
RegionServer 3: T:bar, region:32; T:foo, region:k

Phoenix Architecture
[Diagram: an Application with the Phoenix client / JDBC on top of the HBase client, a ZooKeeper Quorum, and three DataNodes that each host a RegionServer with a Phoenix RPC endpoint (px)]
RegionServer 1: T:foo, region:c; T:bar, region:14; T:foo, region:d
RegionServer 2: T:foo, region:c; T:bar, region:54; T:foo, region:t
RegionServer 3: T:SYSTEM.CATALOG; T:bar, region:32; T:foo, region:k

Phoenix Goodies
SQL DataTypes
Schemas / DDL / HBase table properties
Composite Types (Composite Primary Key)
Map existing HBase tables
Write from HBase, read from Phoenix
Salting
Parallel Scan
Skip scan
Filter push down
Statistics Collection / Guideposts

DDL Example
CREATE TABLE IF NOT EXISTS METRIC_RECORD (
METRIC_NAME VARCHAR,
HOSTNAME VARCHAR,
SERVER_TIME UNSIGNED_LONG NOT NULL,
METRIC_VALUE DOUBLE,
…
CONSTRAINT pk PRIMARY KEY (METRIC_NAME, HOSTNAME,
SERVER_TIME))
DATA_BLOCK_ENCODING='FAST_DIFF', TTL=604800,
COMPRESSION='SNAPPY'
SPLIT ON ('a', 'k', 'm');

METRIC_NAME                    HOSTNAME               SERVER_TIME  METRIC_VALUE
Regionserver.readRequestCount  cn011.hortonworks.com  1396743589   92045759
Regionserver.readRequestCount  cn011.hortonworks.com  1396767589   93051916
Regionserver.readRequestCount  cn011.hortonworks.com  …            …
Regionserver.readRequestCount  cn012.hortonworks.com  1396743589
…                              …                      …            …
Regionserver.wal.bytesWritten  cn011.hortonworks.com
Regionserver.wal.bytesWritten  …                      …            …
HBase row key (defines the sort order): METRIC_NAME, HOSTNAME, SERVER_TIME
Other columns: METRIC_VALUE
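The sort order shown in the table can be illustrated with a small sketch. It uses Python tuples as a stand-in for the composite row key (Phoenix actually encodes the key as bytes), and the two small METRIC_VALUE numbers and the prefix_scan helper are made up for illustration.

```python
from bisect import bisect_left

# Sample rows keyed by (METRIC_NAME, HOSTNAME, SERVER_TIME) -> METRIC_VALUE.
# The values 1.0 and 2.0 are hypothetical placeholders.
rows = {
    ("Regionserver.wal.bytesWritten", "cn011.hortonworks.com", 1396743589): 1.0,
    ("Regionserver.readRequestCount", "cn012.hortonworks.com", 1396743589): 2.0,
    ("Regionserver.readRequestCount", "cn011.hortonworks.com", 1396767589): 93051916.0,
    ("Regionserver.readRequestCount", "cn011.hortonworks.com", 1396743589): 92045759.0,
}

# Keys are kept sorted, the way an HBase table is sorted by row key.
sorted_keys = sorted(rows)


def prefix_scan(metric_name, hostname):
    """Return all keys that share the (METRIC_NAME, HOSTNAME) prefix, in key order."""
    prefix = (metric_name, hostname)
    start = bisect_left(sorted_keys, prefix)  # seek to the first candidate key
    result = []
    for key in sorted_keys[start:]:
        if key[:2] != prefix:
            break  # keys are sorted, so the prefix range has ended
        result.append(key)
    return result
```

Because rows with the same leading key columns sit next to each other, a lookup by METRIC_NAME and HOSTNAME becomes one seek plus a short sequential read.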

Parallel Scan
SELECT * FROM METRIC_RECORD;
CLIENT 4-CHUNK PARALLEL 1-WAY FULL SCAN OVER METRIC_RECORD
[Diagram: the Client sends four scans in parallel, one to each of Region1 to Region4, hosted on RS1, RS2 and RS3]
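A rough sketch of the idea behind the parallel scan, not Phoenix's actual client code: the client splits the work into one chunk per region, scans the chunks concurrently, and concatenates the results in chunk order. The region contents and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical regions, each holding a sorted slice of the key space.
regions = [
    ["a1", "a2"],
    ["k1", "k2", "k3"],
    ["m1"],
    ["t1", "t2"],
]


def scan_region(region_rows):
    """Stand-in for one scan RPC against a single region."""
    return list(region_rows)


def parallel_full_scan(all_regions):
    """Issue one scan per chunk concurrently, then concatenate in chunk order."""
    with ThreadPoolExecutor(max_workers=len(all_regions)) as pool:
        # Executor.map returns results in input order, so key order is preserved.
        chunks = list(pool.map(scan_region, all_regions))
    return [row for chunk in chunks for row in chunk]
```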

Filter push down
SELECT * FROM METRIC_RECORD
WHERE SERVER_TIME > NOW() - 7;
CLIENT 4-CHUNK PARALLEL 1-WAY FULL SCAN OVER METRIC_RECORD
SERVER FILTER BY SERVER_TIME > DATE '2016-04-06 09:09:05.978'
[Diagram: the Client sends four scans, one to each of Region1 to Region4, hosted on RS1, RS2 and RS3; the filter is applied server-side]
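As a minimal illustration of why pushing the filter to the server helps, the sketch below counts how many rows cross the "network" when the predicate is evaluated inside each region. The data, the cutoff and the function names are assumptions for the example.

```python
# Hypothetical regions of (HOSTNAME, SERVER_TIME) rows.
regions = [
    [("host1", 100), ("host1", 900)],
    [("host2", 200), ("host2", 950)],
]


def server_scan(region_rows, predicate):
    """Runs on the region server: only matching rows are sent back."""
    return [row for row in region_rows if predicate(row)]


def query(all_regions, cutoff):
    """Client side: collect filtered rows and count how many were transferred."""
    results = []
    transferred = 0
    for region_rows in all_regions:
        matched = server_scan(region_rows, lambda row: row[1] > cutoff)
        transferred += len(matched)
        results.extend(matched)
    return results, transferred


results, transferred = query(regions, cutoff=800)
```

Every region is still scanned in full (the plan says FULL SCAN), but only the two matching rows are returned to the client instead of all four.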

Skip Scan
WHERE METRIC_NAME LIKE 'abc%'
AND HOSTNAME IN ('host1', 'host2');
CLIENT 1-CHUNK PARALLEL 1-WAY SKIP SCAN ON 2 RANGES OVER METRIC_RECORD ['abc','host1'] - ['abd','host2']
[Diagram: the Client runs a skip scan over Region1 to Region4, hosted on RS1, RS2 and RS3]
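The skip scan idea can be sketched as follows: instead of reading every key in the METRIC_NAME range, the scanner seeks directly to the next key that could still match. This is a simplified model over (METRIC_NAME, HOSTNAME) tuples with a made-up skip_scan function, not Phoenix's filter implementation.

```python
from bisect import bisect_left


def skip_scan(sorted_keys, metric_lo, metric_hi, hosts):
    """Return keys with metric_lo <= metric < metric_hi and host in hosts.

    Non-matching stretches are skipped by seeking (binary search) to the
    next candidate key instead of reading them one by one.
    """
    wanted = sorted(hosts)
    out = []
    pos = bisect_left(sorted_keys, (metric_lo,))  # initial seek
    while pos < len(sorted_keys):
        metric, host = sorted_keys[pos]
        if metric >= metric_hi:
            break  # past the end of the metric range
        if host in wanted:
            out.append(sorted_keys[pos])
            pos += 1
            continue
        later_hosts = [h for h in wanted if h > host]
        if later_hosts:
            target = (metric, later_hosts[0])  # next wanted host, same metric
        else:
            target = (metric + "\x00",)  # smallest key after this metric
        pos = bisect_left(sorted_keys, target)  # seek forward
    return out


keys = sorted([
    ("abb", "host1"), ("abc1", "host0"), ("abc1", "host1"), ("abc1", "host15"),
    ("abc1", "host2"), ("abc1", "host3"), ("abc2", "host2"), ("abd", "host1"),
])
matches = skip_scan(keys, "abc", "abd", {"host1", "host2"})
```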

TopN
WHERE SERVER_TIME > NOW() - 7
ORDER BY HOSTNAME LIMIT 5;
CLIENT 4-CHUNK PARALLEL 4-WAY FULL SCAN OVER METRIC_RECORD
SERVER FILTER BY SERVER_TIME > …
SERVER TOP 5 ROWS SORTED BY [HOSTNAME]
CLIENT MERGE SORT
[Diagram: the Client sends four scans, one to each of Region1 to Region4, hosted on RS1, RS2 and RS3; each scan sorts by HOSTNAME and returns only 5 rows]
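The two plan steps, SERVER TOP 5 ROWS and CLIENT MERGE SORT, can be approximated like this. It is a sketch with hypothetical rows and helper names; each region returns at most n sorted rows and the client merges the sorted partial results.

```python
import heapq


def server_top_n(region_rows, n, key):
    """Runs per region: sort locally and return at most n rows."""
    return heapq.nsmallest(n, region_rows, key=key)


def client_merge_sort(partials, n, key):
    """Merge the already sorted partial results and keep the first n rows."""
    merged = heapq.merge(*partials, key=key)
    return [row for _, row in zip(range(n), merged)]


def by_host(row):
    return row[0]


# Hypothetical (HOSTNAME, SERVER_TIME) rows spread over three regions.
regions = [
    [("h5", 1), ("h1", 2)],
    [("h3", 3), ("h2", 4)],
    [("h4", 5), ("h0", 6)],
]
partials = [server_top_n(r, 2, by_host) for r in regions]
top2 = client_merge_sort(partials, 2, by_host)
```

No region ever ships more than n rows, so the client merges at most n rows per region regardless of table size.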

Aggregation
SELECT METRIC_NAME, HOSTNAME,
AVG(METRIC_VALUE)
FROM METRIC_RECORD
WHERE SERVER_TIME > NOW() - 7
GROUP BY METRIC_NAME, HOSTNAME
ORDER BY METRIC_NAME, HOSTNAME;
CLIENT 4-CHUNK PARALLEL 1-WAY FULL SCAN OVER METRIC_RECORD
SERVER FILTER BY SERVER_TIME > …
SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [METRIC_NAME, HOSTNAME]
CLIENT MERGE SORT
[Diagram: the Client sends four scans, one to each of Region1 to Region4, hosted on RS1, RS2 and RS3; each scan returns only data aggregated by METRIC_NAME, HOSTNAME]
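One way to picture server-side aggregation for AVG is with partial (sum, count) pairs per group that the client then merges. The sketch below assumes this decomposition for illustration; the rows and function names are hypothetical.

```python
from collections import defaultdict


def server_aggregate(region_rows):
    """Runs per region: partial (sum, count) per (METRIC_NAME, HOSTNAME) group."""
    partial = defaultdict(lambda: [0.0, 0])
    for metric, host, value in region_rows:
        acc = partial[(metric, host)]
        acc[0] += value
        acc[1] += 1
    return dict(partial)


def client_merge(partials):
    """Combine partial aggregates and emit (group, average) in group order."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for group, (total, count) in partial.items():
            totals[group][0] += total
            totals[group][1] += count
    return [(group, total / count) for group, (total, count) in sorted(totals.items())]


# Hypothetical (METRIC_NAME, HOSTNAME, METRIC_VALUE) rows in two regions.
regions = [
    [("m", "h1", 1.0), ("m", "h1", 3.0)],
    [("m", "h1", 5.0), ("m", "h2", 4.0)],
]
averages = client_merge([server_aggregate(r) for r in regions])
```

Shipping sums and counts rather than per-region averages keeps the result correct when a group spans regions: averaging the two regional averages for h1 (2.0 and 5.0) would give 3.5 instead of 3.0.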

Joins and subqueries in Phoenix
Grammar
• Inner, Left, Right, Full outer join, Cross join
• Semi-join / Anti-join
Algorithms
• Hash-join, sort-merge join
• Hash-join table is computed and pushed to each regionserver from client
Optimizations
• Predicate push-down
• PK-to-FK join optimization
• Global index with missing columns
• Correlated query rewrite
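The hash-join bullet above (the hash table is computed on the client and pushed to each regionserver) can be sketched as a broadcast inner join. The tables, column positions and function names here are assumptions for illustration only.

```python
def build_hash_table(small_rows, key_index):
    """Built once on the client from the smaller side of the join."""
    table = {}
    for row in small_rows:
        table.setdefault(row[key_index], []).append(row)
    return table


def region_probe(region_rows, key_index, hash_table):
    """Runs per region: probe the shipped hash table with local rows."""
    return [
        big + small
        for big in region_rows
        for small in hash_table.get(big[key_index], [])
    ]


def broadcast_hash_join(regions, small_rows, big_key, small_key):
    hash_table = build_hash_table(small_rows, small_key)
    joined = []
    for region_rows in regions:  # the same table is sent to every region server
        joined.extend(region_probe(region_rows, big_key, hash_table))
    return joined


# Hypothetical data: orders (order_id, customer_id) joined to customers (id, name).
order_regions = [[("o1", "c1")], [("o2", "c2"), ("o3", "c9")]]
customers = [("c1", "Ann"), ("c2", "Bob")]
joined = broadcast_hash_join(order_regions, customers, big_key=1, small_key=0)
```

The approach works well while the hashed side fits in memory on each region server, which is consistent with the later remark that very big joins are not a strength.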

Joins and subqueries in Phoenix
Phoenix can execute most of the TPC-H queries!
No nested loop join
With Calcite support, more improvements soon
No statistics-guided join selection yet
Not very good at executing very big joins
• No generic YARN / Tez execution layer
• But Hive / Spark support for generic DAG execution

Secondary Indexes
HBase table is a sorted map
• Everything in HBase is sorted in primary key order
• Full or partial scans in sort order are very efficient in HBase
• Sort data differently with secondary index dimensions
Two types
• Global index
• Local index
Query
• Indexes are “covered”
• Indexes are automatically selected from queries
• Only covered columns are returned from index without going back to data table

Global and Local Index
Global Index
• A single instance for all table data in a different sort order
• A different HBase table per index
• Optimized for read-heavy use cases
• Can be one edit “behind” actual primary data
• Indexes on transactional tables have ACID guarantees
• Different consistency / durability for mutable / immutable tables
Local Index
• Multiple mini-instances per region
• Uses same HBase table, different column family
• Optimized for write-heavy use cases
• Atomic commit and visibility (coming soon)
• Queries have to ask all regions for relevant data from index
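The read-side difference between the two index types can be shown with a toy model: a global index is one sorted structure over all data, while a local index is one small sorted structure per region, so a lookup has to visit every region. The Region class and the query functions are hypothetical, not Phoenix classes.

```python
from bisect import bisect_left


class Region:
    """A region holding pk -> indexed value, plus its own local index."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.local_index = sorted((value, pk) for pk, value in self.rows.items())


def lookup(index, value):
    """Return the primary keys stored under value in a sorted (value, pk) list."""
    pos = bisect_left(index, (value,))
    pks = []
    while pos < len(index) and index[pos][0] == value:
        pks.append(index[pos][1])
        pos += 1
    return pks


def build_global_index(regions):
    """One index over all table data, kept separately from the regions."""
    return sorted(entry for region in regions for entry in region.local_index)


def query_global(global_index, value):
    return lookup(global_index, value), 1  # one index consulted


def query_local(regions, value):
    pks = []
    for region in regions:  # every region must be asked
        pks.extend(lookup(region.local_index, value))
    return sorted(pks), len(regions)


regions = [Region({"k1": "red", "k2": "blue"}), Region({"k3": "red"})]
global_index = build_global_index(regions)
```

The write side is the mirror image: a local index entry lives next to its data row, while a global index entry may live on a different server, which is why the slide calls global indexes read-optimized and local indexes write-optimized.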

Part II – The Present
All the recent stuff!

Release Note Highlights
4.4
• Functional Indexes
• UDFs
• Query Server
• UNION ALL
• MR Index Build
• Spark Integration
• Date built-in functions
4.5
• Client-side per-statement metrics
• SELECT without FROM
• ALTER TABLE with VIEWS
• Math and Array built-in functions

Release Note Highlights
4.6
• ROW_TIMESTAMP for HBase native timestamps
• Support for correlate variable
• Support for un-nesting arrays
• Web-app for visualizing trace info (alpha)
4.7
• Transaction support
• Enhanced secondary index consistency guarantees
• Statistics improvements
• Perf improvements

Row Timestamps
A pseudo-column for HBase native timestamps (versions)
Enables setting and querying cell timestamps
Perfect for time-series use cases
• Combine with FIFO / Date Tiered Compaction policies
• And HBase scan file pruning based on min-max ts for very efficient scans
CREATE TABLE METRICS_TABLE (
CREATED_DATE DATE NOT NULL,
METRIC_ID CHAR(15) NOT NULL, METRIC_VALUE BIGINT
CONSTRAINT PK PRIMARY KEY(CREATED_DATE ROW_TIMESTAMP,
METRIC_ID)) SALT_BUCKETS = 8;

Transactions
Uses Tephra
Snapshot isolation semantics
Completely optional.
• Can be enabled per-table (TRANSACTIONAL=true)
• Transactional and non-transactional tables can live side by side
Transactions see their own uncommitted data
Released in 4.7, will GA in 5.0
Optimistic Concurrency Control
• No locking for rows
• Transactions have to roll back and undo their writes in case of conflict
• Cost of conflict is higher
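A minimal in-memory sketch of optimistic concurrency control with snapshot reads follows. It is not Tephra's API: the class names are invented, and for brevity writes are buffered until commit rather than written and then undone on conflict as the slide describes.

```python
class ConflictError(Exception):
    """Raised when another transaction committed a write to the same row first."""


class TxManager:
    def __init__(self):
        self.clock = 0
        self.versions = {}  # row key -> list of (commit_ts, value)

    def begin(self):
        self.clock += 1
        return Transaction(self, self.clock)

    def read(self, key, snapshot_ts):
        """Latest value committed at or before the snapshot timestamp."""
        visible = [v for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None

    def commit(self, tx):
        # Conflict check happens only now; no row was ever locked.
        for key in tx.writes:
            history = self.versions.get(key, [])
            if history and history[-1][0] > tx.start_ts:
                raise ConflictError(key)
        self.clock += 1
        for key, value in tx.writes.items():
            self.versions.setdefault(key, []).append((self.clock, value))


class Transaction:
    def __init__(self, manager, start_ts):
        self.manager = manager
        self.start_ts = start_ts
        self.writes = {}

    def put(self, key, value):
        self.writes[key] = value

    def get(self, key):
        if key in self.writes:
            return self.writes[key]  # a transaction sees its own uncommitted data
        return self.manager.read(key, self.start_ts)
```

With two transactions that both write the same row, the first commit succeeds and the second raises ConflictError; the loser pays the full cost of redoing its work, which is the "cost of conflict is higher" trade-off.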

Tephra Architecture
[Diagram: a Tephra / HBase Client on top of the HBase client, RegionServer 1, RegionServer 2 and RegionServer 3, a ZooKeeper Quorum, and two Tephra Trx Managers (one active, one standby)]

Transaction Lifecycle
From Tephra presentation: http://www.slideshare.net/alexbaranau/transactions-over-hbase

Phoenix Query Server
Similar to HBase REST Server / Hive Server 2
Built on top of Calcite’s Avatica Server with Phoenix bindings
Embeds a Phoenix thick client inside
No client side sorting / join!
Protobuf-3.0 over HTTP protocol
Has a (thin) JDBC driver
Allows ODBC driver for Phoenix

Phoenix architecture revisited (thick client)
[Diagram: an Application with an HBase client; RegionServer 1, RegionServer 2 and RegionServer 3, each hosting T:foo, region:d and a Phoenix RPC endpoint (px)]

Phoenix Query Server (thin client)
[Diagram: an Application with the Phoenix thin client / JDBC; three HBase clients; RegionServer 1, RegionServer 2 and RegionServer 3, each hosting T:foo, region:d and a Phoenix RPC endpoint (px)]

Other new features (4.8+)
Shaded client by default. No more library dependency problems!
Phoenix schema mapping to HBase namespace
• Allows using isolation and security features of HBase namespaces
• Standard SQL syntax:
CREATE SCHEMA FOO;
USE FOO;
LIMIT / OFFSET
• We already had LIMIT. Now we have OFFSET
• Together with row value constructors, covers most cursor use cases
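The two paging styles mentioned above can be compared in a short sketch. Python tuple comparison stands in for a row value constructor predicate of the form (a, b) > (?, ?); the rows and function names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical rows, sorted by their (METRIC_NAME, HOSTNAME) key.
rows = sorted([
    ("m1", "h1"), ("m1", "h2"), ("m2", "h1"), ("m2", "h2"), ("m3", "h1"),
])


def page_with_offset(sorted_rows, limit, offset):
    """LIMIT / OFFSET style: skip offset rows, then return limit rows."""
    return sorted_rows[offset:offset + limit]


def page_after(sorted_rows, limit, last_key=None):
    """Row value constructor style: return rows whose key is greater than last_key."""
    start = 0 if last_key is None else bisect_right(sorted_rows, last_key)
    return sorted_rows[start:start + limit]
```

OFFSET has to step over the skipped rows on every request, while the key-based form can seek straight to the row after the last one seen.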

Part III – The Future
All the upcoming stuff!

Local Index
• Local Index re-implemented
• Instead of a different table, now local index data is kept within the same data table
• Local index data goes into a different column family
• Index and data are committed together atomically without external transactions
• Bunch of stability improvements with region splits and merges

Calcite Integration
Calcite is a framework for:
• Query parser
• Compiler
• Planner
• Cost based optimizer
SQL-92 compliant
Based on relational algebra
Cost based optimizer with default rules + pluggable rules per-backend
Used by Hive / Drill / Kylin / Samza, etc.

Calcite Integration

Phoenix - Hive integration
Hive is a very rich and generic execution engine
Uses Tez + YARN to execute arbitrary DAG
Hive integration enables big joins and other Hive features
Phoenix DDL with HiveQL
Data insert / update / delete (DML) with HiveQL
Predicate pushdown, salting, partitioning, partition pruning, etc
Can use secondary indexes as well since it uses Phoenix compiler
https://issues.apache.org/jira/browse/PHOENIX-2743

Future<Phoenix>
JSON support
TPC-H / Microstrategy / Tableau queries
Sqoop integration
Support Omid based transactions
Dogfooding within the Hadoop-ecosystem
• Ambari Metrics Service (AMS) uses Phoenix
• YARN will soon use HBase / Phoenix (ATS)
STRUCT type
Improvements to cost based optimization
Security and other HBase features used from Phoenix
See https://phoenix.apache.org/roadmap.html

Further Reference
Even more info on https://phoenix.apache.org
• New Features: https://phoenix.apache.org/recent.html
• Roadmap: https://phoenix.apache.org/roadmap.html
Get involved in mailing lists
• user@phoenix.apache.org
• dev@phoenix.apache.org

Thanks
Q & A
