Page1 © Hortonworks Inc. 2014
Hive ACID
Hive Streaming & SQL Insert/Update/Delete
Raj Bains, Fall 2015
Page2 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Hive – Single tool for all SQL use cases
Data sources: OLTP, ERP, and CRM systems; unstructured documents and emails; clickstream; server logs; sentiment and web data; sensor/machine data; geolocation.
Workloads served by Hive SQL: interactive analytics, batch reports / deep analytics, and ETL / ELT.
Page3 © Hortonworks Inc. 2014
Hive: Batch to Sub-Second
• Hive 0.10 – Batch processing.
• Hive 0.13 – Human interactive (~10 seconds). Stinger Initiative: vectorized SQL engine, Tez execution engine, ORC columnar format, cost-based optimizer, faster map joins. 52x average query speedup (7.8 days to 9.3 hours).
• Hive 0.14 – Human interactive (~5 seconds). 3x average query speedup.
• Hive 1.2 – Human interactive (~5 seconds).
• Hive 2.0 – Sub-second. Stinger.Next Initiative: LLAP in-memory cache, LLAP resident process, new metastore for compile, vectorization improvements. Significant query speedup.
(Speedups measured using the TPC-DS benchmark.)
Page4 © Hortonworks Inc. 2014
Transaction Overview
Page5 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Transaction Use Cases
• Reporting with Analytics (YES)
• Reporting on data with occasional updates
• Corrections to the fact tables, evolving dimension tables
• Low concurrency updates, low TPS
• SQL INSERT / UPDATE / DELETE Support
• Operational Reporting (Next)
• High throughput ingest from operational (OLTP) database
• Periodic inserts every 5-30 minutes
• Bulk updates and deletes are not supported
• SQL Merge
• Requires tool support and changes to Hive's transaction implementation
• Operational (OLTP) Database (NO)
• Small transactions, each doing single-row inserts
• High Concurrency - Hundreds to thousands of connections
[Diagrams: an OLTP system replicating into Hive for operational reporting; analytics with occasional modifications running in Hive; a high-concurrency OLTP workload, which Hive is not suited for.]
Page6 © Hortonworks Inc. 2014
Transaction Use Cases
• Streaming Ingest
• Use Hive Streaming API
• Designed for tools
[Diagram: streaming tools append data into Hive, where analytics run on it.]
Page7 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Transaction Compactions
1. Original file – a task reads the latest read-optimized ORCFile.
2. Edits made – a task reads the read-optimized ORCFile and merges in the delta file containing the edits.
3. Edits merged – a task reads the updated, merged read-optimized ORCFile.
The Hive ACID Compactor periodically merges the delta files in the background. Because HDFS files cannot be modified in place, edits are written to separate delta files rather than into the base ORCFile.
Page8 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
More About Compaction
• Minor compaction (triggered at ~10%, local) joins several delta files into a single delta file.
• Major compaction (triggered at ~10%, global) merges the delta files into the read-optimized base ORCFile.
Compactions can also be requested manually, as sketched below.
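A minimal sketch of requesting a compaction explicitly (normally the compactor triggers compactions automatically based on the thresholds configured later in this deck); the table name web_events and the partition value are hypothetical:

ALTER TABLE web_events PARTITION (ds = '2015-08-25') COMPACT 'minor';
ALTER TABLE web_events PARTITION (ds = '2015-08-25') COMPACT 'major';
-- progress can be monitored with:
SHOW COMPACTIONS;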
Page9 © Hortonworks Inc. 2014
Setting up Transactions in Ambari 2.1
Page10 © Hortonworks Inc. 2014
Transaction Internals
Page11 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Compaction: Scheduling
[Architecture diagram: clients (HiveCLI; Beeline and BI tools such as MicroStrategy over JDBC/ODBC) talk to HiveServer 2 (driver.compile, driver.execute), which runs on the Tez execution layer; the Hive Metastore holds the schema definitions (db, Table1/Schema1, Table2/Schema2) backed by the Hive Metastore DB.]
The Metastore schedules compaction jobs as the owner of the HDFS data, in that owner's default queue.
Note: you have to set up impersonation so the Metastore can run as the owner of the data.
Compaction jobs – workload management and usability limitations:
• Can I schedule compaction in the queue I want?
• Can I easily figure out whether compaction is taking up N% of my cluster resources?
Page12 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Locking and Concurrency
• Hive follows snapshot isolation:
• Writers do traditional two-phase locking (and write newer versions of the data)
• Readers read against the latest version number as of when the query arrived
• Readers and writers do not block each other:
• Writers can write newer versions
• Readers read a consistent view based on the version number
• Locking is done at the table level and the partition level:
• If the partition cannot be determined, table-level locking is used
• Two transactions trying to update the same table/partition will block behind one another
Page13 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Transactions Implementation
• INSERTS
• Inserts write delta files instead of appending rows to new base files
• Requires the full list of columns:
• INSERT INTO TABLE T VALUES (1, 2, 3) – OK
• UPDATES
• UPDATE T SET name = 'fred' WHERE name = 'freddy';
• Writes the new values to the delta file, complete with transaction information
• A new UPDATE privilege was added to authorization
• Updated columns are passed to the authorizer; SELECT privileges are also required
• DELETES
• DELETE FROM T WHERE name = 'freddy';
• Writes the deleted values to the delta file, complete with transaction information
• The DELETE privilege already existed in authorization; SELECT privileges are also required
A complete worked example follows.
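A minimal end-to-end sketch of these statements; the table t and its columns are hypothetical, and the DDL anticipates the restrictions listed on the next slide (ORC storage, buckets, and the transactional table property):

CREATE TABLE t (a INT, name STRING, c INT)
CLUSTERED BY (a) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ("transactional" = "true");

INSERT INTO TABLE t VALUES (1, 'freddy', 10);      -- written as a delta file
UPDATE t SET name = 'fred' WHERE name = 'freddy';  -- rewritten internally as an insert of new row versions
DELETE FROM t WHERE name = 'fred';                 -- deleted row IDs recorded in a delta file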
Page14 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Transaction Gotchas – Update Example
• Update Implementation
• User statement
• UPDATE T SET name = 'fred' WHERE name = 'freddy';
• Rewritten as
• INSERT INTO T SELECT ROW__ID, a, 'fred', c FROM T WHERE name = 'freddy' (assuming T has columns a, b, c)
• ROW__ID is row-identifier information from AcidInputFormat
• Some consequences
• Subqueries are disallowed only on the right-hand side of SET; they are fine in the WHERE clause:
• BAD: update T set name = (select name from popular_names order by name limit 1);
• OK: update T set popular = true where name in (select name from popular_names)
Page15 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Restrictions
OVERALL – ACID is V1; it ships with restrictions, and usability improvements are still to come.
• The table must be declared with the transactional property
• The table must be bucketed
INSERTS
• Still need to spell out the "TABLE" and "PARTITION" keywords (fix TODO):
• INSERT INTO TABLE T
• INSERT INTO TABLE T PARTITION(ds = 'today') VALUES ...
• INSERT OVERWRITE is disallowed
UPDATES
• Cannot update the partition column or the bucket column (this would require a delete plus an insert, and there is no easy way to do that at the moment)
• Expressions on the right side of SET must be supported as projections by Hive, so no subqueries
Compactions
• Workload management still needs to be built
Page16 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Transaction Performance
• Insert – no scans; a single partition/bucket write
• Update – full table scan unless the partition key is in the WHERE clause (see the sketch below)
• Delete – full table scan unless the partition key is in the WHERE clause
• Updating or deleting multiple rows of values in a single statement produces a single table scan
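Including the partition key in the WHERE clause lets Hive prune partitions instead of scanning the whole table. A sketch using the user_info table defined later in the deck (partitioned by ds); the lastname value and user_id are made up:

-- scans every partition of the table:
UPDATE user_info SET lastname = 'Smith' WHERE user_id = 12345;

-- scans only the ds = '2015-08-25' partition:
UPDATE user_info SET lastname = 'Smith' WHERE ds = '2015-08-25' AND user_id = 12345;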
Page17 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Other commands and Settings
• SHOW TRANSACTIONS
• SHOW TRANSACTIONS is for use by administrators when Hive transactions are being used. It returns a list of all currently
open and aborted transactions in the system, including this information:
• transaction ID
• transaction state
• user who started the transaction
• machine where the transaction was started
• SHOW COMPACTIONS
• SHOW COMPACTIONS returns a list of all tables and partitions currently being compacted or scheduled for compaction when
Hive transactions are being used, including this information:
• database name
• table name
• partition name (if the table is partitioned)
• whether it is a major or minor compaction
• the state the compaction is in, which can be:
• "initiated" – waiting in the queue to be compacted
• "working" – being compacted
• "ready for cleaning" – the compaction has been done and the old files are scheduled to be cleaned
• thread ID of the worker thread doing the compaction (only if in "working" state)
• the time at which the compaction started (only if in "working" or "ready for cleaning" state)
• SHOW LOCKS
• SHOW LOCKS <table_name>;
• SHOW LOCKS <table_name> EXTENDED;
• SHOW LOCKS <table_name> PARTITION (<partition_desc>);
• SHOW LOCKS <table_name> PARTITION (<partition_desc>) EXTENDED;
Page18 © Hortonworks Inc. 2014 HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Configuration
Configuration keys, values, and notes:
• hive.txn.manager
  Default: org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager. For transactions: org.apache.hadoop.hive.ql.lockmgr.DbTxnManager. DummyTxnManager replicates pre-Hive-0.13 behavior and provides no transactions.
• hive.txn.timeout
  Default: 300. Time in seconds after which transactions are declared aborted if the client has not sent a heartbeat.
• hive.txn.max.open.batch
  Default: 1000. Maximum number of transactions that can be fetched in one call to open_txns(). Controls how many transactions streaming agents such as Flume or Storm open simultaneously.
• hive.compactor.initiator.on
  Default: false. To turn on transactions: true (for exactly one instance of the Thrift metastore service). Whether to run the initiator and cleaner threads on this metastore instance.
• hive.compactor.worker.threads
  Default: 0. To turn on transactions: > 0 on at least one instance of the Thrift metastore service. How many compactor worker threads to run on this metastore instance.
• hive.compactor.worker.timeout
  Default: 86400. Time in seconds after which a compaction job will be declared failed and re-queued.
• hive.compactor.check.interval
  Default: 300. Time in seconds between checks to see whether any tables or partitions need to be compacted.
• hive.compactor.delta.num.threshold
  Default: 10. Number of delta directories in a table or partition that will trigger a minor compaction.
• hive.compactor.delta.pct.threshold
  Default: 0.1. Fractional size of the delta files relative to the base that will trigger a major compaction (1 = 100%, so the default 0.1 = 10%).
• hive.compactor.abortedtxn.threshold
  Default: 1000. Number of aborted transactions involving a given table or partition that will trigger a major compaction.
• hive.enforce.bucketing: true
• hive.exec.dynamic.partition.mode: nonstrict
• hive.support.concurrency: true
An example of applying these settings follows.
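A sketch of how these keys are typically applied; in practice Ambari writes them into hive-site.xml, and the session-level SET form below is shown only for illustration:

-- client / HiveServer2 side:
SET hive.support.concurrency = true;
SET hive.enforce.bucketing = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- metastore side (hive-site.xml; initiator on exactly one metastore instance):
--   hive.compactor.initiator.on = true
--   hive.compactor.worker.threads = 1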
Page19 © Hortonworks Inc. 2014
Streaming Solution Goals
What business goals are required?
Page20 © Hortonworks Inc. 2014
Background - Hive Partitions and Buckets
CREATE TABLE user_info(
user_id BIGINT,
firstname STRING,
lastname STRING)
PARTITIONED BY(ds STRING)
CLUSTERED BY(user_id) INTO 255 BUCKETS;
Partitions
• All rows in a partition have the same value of the partition key
• A partition is a directory
Buckets
• Within every partition, rows are distributed into buckets using a hash function:
• hash_fn(bucketing_column) mod num_buckets
[Diagram: Table → Partitions date-0, 2015-08-25, …, date-N; each partition is divided into Buckets 1 through 255.]
Page21 © Hortonworks Inc. 2014
Hive Transactions and Buckets – Physical Structure
CREATE TABLE user_info(
user_id BIGINT,
firstname STRING,
lastname STRING)
PARTITIONED BY(ds STRING)
CLUSTERED BY(user_id) INTO 255 BUCKETS
STORED AS ORC
TBLPROPERTIES ("transactional"="true");

insert into table user_info values (,,), (,,), (,,), (,,), (,,);  -- schematic; a filled-in example follows
[Diagram: Table → Partition 2015-08-25 containing Delta1 (Buckets 1…255) and Delta2 (Buckets 1…255); the other partitions (…, date-N) have the same layout.]
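The (,,) placeholders above stand for concrete rows. A filled-in version, with made-up names and an explicit target partition, might look like this; each such insert creates a new delta directory under the partition:

INSERT INTO TABLE user_info PARTITION (ds = '2015-08-25')
VALUES (1, 'Ada', 'Lovelace'),
       (2, 'Alan', 'Turing'),
       (3, 'Grace', 'Hopper');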
Page22 © Hortonworks Inc. 2014
Example Streaming Solution
[Diagram: Kafka → Storm topology of Hive Bolts → Hive, via the Hive Streaming API. Streaming analytics run inside the Storm topology; SQL analytics run in Hive.]
Page23 © Hortonworks Inc. 2014
Streaming into Hive without Hive Streaming
[Diagram: Hive Bolts writing directly into Hive partitions without the Hive Streaming API.]
Page24 © Hortonworks Inc. 2014
What are your Streaming Sink goals?
Write Throughput
• Determines the hardware cost of the solution
• (events * size) per node
Ingest-to-Query Latency
• How long after ingest is the data available for query?
Query Speed
• What query speed is needed when reading the data?
• How do I want to lay out the data to achieve that?
Data Quality Guarantees
• What are the semantics of the solution: at least once, or exactly once?
Page25 © Hortonworks Inc. 2014
Hive Streaming
Streaming API using ACID + ORC
Page26 © Hortonworks Inc. 2014
Hive Streaming API Basics
[Diagram: Storm Hive Bolt → Hive Streaming API → Hive Metastore and HDFS.]
The call sequence looks roughly like this (a sketch close to the org.apache.hive.hcatalog.streaming API):

HiveEndPoint endPoint = new HiveEndPoint(metastoreURI, db, table, partitions);
StreamingConnection conn = endPoint.newConnection(true);
TransactionBatch txnBatch = conn.fetchTransactionBatch(10, writer);
while (txnBatch.remainingTransactions() > 0) {
  txnBatch.beginNextTransaction();
  txnBatch.write(row);   // write batchSize rows per transaction
  txnBatch.commit();
}
txnBatch.close();

Performance note: fewer Metastore calls are better, so use large transaction batches.
Page27 © Hortonworks Inc. 2014
ORC File Basics
• Row data is indexed in row groups of 10K rows each
• A stripe consists of roughly 500K rows at ~500 bytes per row
Read Considerations
When reading a subset of the data – the common case – you want to push the predicate down and skip over most of the data by looking at the metadata:
• Indexes cover most primitive types
• Min/max statistics are most effective for sorted data
• Bloom filters (Hive 1.2+) are effective even for non-sorted data (see the sketch below)
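Bloom filters are opted into per column when the ORC table is created. A minimal sketch, assuming a hypothetical clicks table whose user_id column is high-cardinality and unsorted:

CREATE TABLE clicks (user_id BIGINT, url STRING, ts TIMESTAMP)
STORED AS ORC
TBLPROPERTIES (
  "orc.bloom.filter.columns" = "user_id",
  "orc.bloom.filter.fpp" = "0.05"   -- false-positive probability
);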
Page28 © Hortonworks Inc. 2014
Hive Streaming + ORC: Writes
• Bolts write delta files
• Multiple bolts/threads can write to a bucket, but each writes a separate delta file
• One thread per bucket yields larger files
• Rows are uniquely identified by <Bucket, Txn, Id>
[Diagram: several Hive Bolts writing delta files into Buckets 1–4 of a partition, alongside the base file.]
Page29 © Hortonworks Inc. 2014
Hive Streaming + ORC: Reads
• The mappers read the buckets and merge the delta files at read time
• Sorting within files on keys helps the merge go fast
• If there is more than one mapper per bucket:
• they can split the base file,
• but each will have to merge all delta files
[Diagram: multiple map tasks reading the base file and delta files of Buckets 1–4 within a partition.]
Page30 © Hortonworks Inc. 2014
Hive Streaming + ORC: Compactions
Note: there are no base files until the first major compaction.
• Minor compaction joins multiple delta files into one
• Major compaction merges the delta files with the base file
[Diagram: a partition's Buckets 1–4 shown in three stages – many delta files plus a base file, a single delta file per bucket plus a base file after minor compaction, and a base file only after major compaction.]
Page31 © Hortonworks Inc. 2014
Storm Overview
Page32 © Hortonworks Inc. 2014
Storm Overview
[Diagram: Kafka Topic 1, Partitions 0–2 → Kafka Spouts → Bolts (analytics) → Hive Bolts → Hive (HDFS).]
Page33 © Hortonworks Inc. 2014
Storm Guarantees and Constraints
[Diagram: Kafka Topic 1 partitions → Kafka Spouts → Bolts → Hive Bolts → Hive (HDFS), with state held in memory and ACKs flowing back from Hive.]
• Storm would like to guarantee that data is delivered to Hive exactly once
• State needs to be preserved until the data is acknowledged by Hive; otherwise the Kafka data is replayed
• This means large inserts can cause the in-memory state to fill up memory
Page34 © Hortonworks Inc. 2014
Balancing Constraints
Page35 © Hortonworks Inc. 2014
Balancing Storm and Hive Preferences
• Storm wants to hold on to minimal state
• Smaller writes make data visible to readers sooner
• Hive prefers larger writes
• Write throughput is better with larger writes; however, compaction can help consolidate smaller writes
Page36 © Hortonworks Inc. 2014
Questions?
Speaker Notes

INSERT implementation
• INSERT INTO TABLE T VALUES (1, 2, 3) – OK (assuming T has three int columns)
• Implemented as: dump the values to a temp table, then rewrite the insert as INSERT INTO T SELECT * FROM temp_table;
UPDATE implementation
• Implemented by rewriting the query to: INSERT INTO T SELECT ROW__ID, a, 'fred', c FROM T WHERE name = 'freddy' (assuming T has columns a, b, c)
• ROW__ID is row-identifier information from AcidInputFormat
• FileSinkOperator is informed this is an update rather than a standard insert
DELETE implementation
• Implemented by rewriting the query to: INSERT INTO T SELECT ROW__ID FROM T WHERE name = 'freddy'
• ROW__ID is row-identifier information from AcidInputFormat
• FileSinkOperator is informed this is a delete rather than a standard insert