Telecom companies currently store their data in databases or data warehouses, process it through ETL pipelines, and run statistics and analysis with OLAP tools or data mining engines. However, with the data explosion that has accompanied the spread of smartphones, traditional stores such as the DB and DW are no longer sufficient to cope with this "Big Data". As an alternative, storing data in Hadoop and performing ETL and ad-hoc queries with Hive is being adopted, with China Mobile cited as the most prominent example. So far, though, Hive has mainly been adopted by new projects, where the barrier to applying a new data model and HQL is low. Replacing an existing database with Hadoop and Hive is far harder when a large number of tables and SQL queries already exist. NexR is migrating a telecom company's data from Oracle to Hadoop and converting many existing Oracle SQL queries into Hive HQL. Although HQL's syntax is close to ANSI SQL, it lacks many basic functions and barely supports Oracle analytic functions such as rank(), which are heavily used in statistical analysis. Differences in data-type semantics, such as the handling of NULL values, are a further obstacle. In this presentation we share our experience converting Oracle SQL to Hive HQL and developing the missing functions with MapReduce, and we introduce several ideas and experiments for improving Hive performance.
http://sdec.kr/
44. What is Hive?
• A system for managing and querying structured data
built on top of Hadoop
• Map-Reduce for execution
• HDFS for storage
• Metadata in an RDBMS
• Key Building Principles
• SQL is a familiar language
• Extensibility - Types, Functions, Formats, Scripts
• Performance
49. public class CallCountMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {

  private final IntWritable one = new IntWritable(1);
  private Text word = new Text();

  public void map(LongWritable key, Text value,
      OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    String line = value.toString();
    StringTokenizer itr = new StringTokenizer(line.toLowerCase());
    word.set(itr.nextToken());
    output.collect(word, one);
  }
}
51. public class CallCountReducer extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {

  public void reduce(Text key, Iterator<IntWritable> values,
      OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get(); // accumulate the count for this key
    }
    output.collect(key, new IntWritable(sum));
  }
}
53. public class CallCount {

  public static void main(String[] args) {
    JobClient client = new JobClient();
    JobConf conf = new JobConf(CallCount.class);

    // specify output types
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    // specify input and output dirs
    FileInputFormat.addInputPath(conf, new Path("input"));
    FileOutputFormat.setOutputPath(conf, new Path("output"));

    // specify a mapper
    conf.setMapperClass(CallCountMapper.class);

    // specify a reducer (also used as a combiner)
    conf.setReducerClass(CallCountReducer.class);
    conf.setCombinerClass(CallCountReducer.class);

    client.setConf(conf);
    try {
      JobClient.runJob(conf);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}
57. History of Hive
• Hive development cycle is fast and the developer community is growing rapidly
• Product release cycle is accelerating
• Project started 03/08, 0.3.0 04/09, 0.4.0 12/09, 0.5.0 02/10, 0.6.0 10/10, 0.7.0 03/11, 0.7.1 06/11
60. Who uses Hive?
http://wiki.apache.org/hadoop/Hive/PoweredBy
68. Hive Use Cases
• Report and ad-hoc query
• Log analysis
• Social graph analysis
• Data mining and analysis
• Machine learning
• Dataset cleaning
• Data warehouse
69. Hive Architecture
[diagram: a UI submits DDL/HQL to the Driver; the Compiler, backed by the MetaStore through an ORM layer, turns the query into work for the Execution Engine, which runs it on Hadoop and returns the result through the Driver]
75. Hive Internal
[diagram: clients (Web UI, Hive CLI, JDBC, Thrift API) issue browse, query, and DDL requests; the Hive QL Parser, Plan, and Optimizer produce Tasks; ExecMapper/ExecReducer run operator trees (TSOperator, SELOperator, FSOperator) as MapReduce jobs, with extension points for user scripts, UDF/UDAF (substr, sum, average), SerDe, Input/OutputFormat, and StorageHandlers over HDFS, RCFile, DB, HBase; table metadata lives in the MetaStore]
94. Plan
Select col1,col2 From tab1 Where col3 > 5
QB:
TOK_FROM → TableScanOperator
TOK_WHERE → FilterOperator
TOK_SELECT → SelectOperator
TOK_DESTINATION → FileSinkOperator
98. Optimizer
Select col1,col2 From tab1 Where col3 > 5
tab1 {col1, col2, col3, col4, col5, col6, col7}
TableScanOperator → FilterOperator → SelectOperator → FileSinkOperator
Only col1, col2, and col3 are referenced, so the optimizer can prune the remaining columns from the table scan.
138. Data Model
Hive Entity    | Sample    | HDFS Location
Table          | Log       | /hive/Log
Partition      | time=hour | /hive/Log/time=1h
Bucket         | phone-num | /wh/Log/time=1h/part-$hash(phone-num)
External Table | customer  | /app/meta/dir (arbitrary location)
139. Data Model
[diagram: the MetaStore DB records each Table's data location, partitioning info, and bucketing info, which map onto HDFS paths: /hive/Log (table) → /hive/Log/time=1h (partition) → /hive/Log/time=1h/part-0001 (bucket)]
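The entities above map onto DDL roughly as follows. This is only a sketch: the column names and the bucket count are illustrative assumptions, not from the deck; the partition key `time` and bucketing column `phone-num` are the ones shown in the table.

```sql
-- Sketch only: columns and the bucket count (32) are assumptions.
CREATE TABLE Log (phone_num STRING, duration INT)
PARTITIONED BY (time STRING)               -- e.g. /hive/Log/time=1h
CLUSTERED BY (phone_num) INTO 32 BUCKETS;  -- part-$hash(phone-num) files

-- External table: Hive tracks only metadata; the data stays at an
-- arbitrary location such as the /app/meta/dir shown above.
CREATE EXTERNAL TABLE customer (id STRING, name STRING)
LOCATION '/app/meta/dir';
```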
158. Range Operator
BETWEEN ~ AND ~
-- Oracle
SELECT * FROM Employee WHERE salary BETWEEN 100 AND 500;
-- Hive rewrite
SELECT * FROM Employee WHERE salary >= 100 AND salary <= 500;
-- Hive with a custom BETWEEN UDF
SELECT * FROM Employee WHERE BETWEEN(salary, 100, 500);
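Hive at the time had no BETWEEN operator, so the last form calls a custom UDF. A minimal sketch of the comparison such a UDF's evaluate() performs; the class name and shape are illustrative, and the real Hive plumbing (extending org.apache.hadoop.hive.ql.exec.UDF and using Writable argument types) is omitted.

```java
// Sketch of the logic behind a custom BETWEEN(value, lo, hi) UDF.
// A real Hive UDF would extend org.apache.hadoop.hive.ql.exec.UDF.
public class Between {
    public boolean evaluate(int value, int lo, int hi) {
        // BETWEEN is inclusive on both ends,
        // matching salary >= 100 AND salary <= 500
        return value >= lo && value <= hi;
    }
}
```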
163. IN / EXISTS Clause
IN / EXISTS SubQuery
SELECT * FROM Employee e WHERE e.DeptNo IN (SELECT d.DeptNo FROM Dept d)
SELECT * FROM Employee e WHERE EXISTS (SELECT 1 FROM Dept d WHERE e.DeptNo = d.DeptNo)
SELECT * FROM Employee e LEFT SEMI JOIN Dept d ON (e.DeptNo = d.DeptNo)
167. NOT IN Clause
NOT IN SubQuery
SELECT * FROM Employee e WHERE e.DeptNo NOT IN (SELECT d.DeptNo FROM Dept d)
SELECT e.* FROM Employee e
LEFT OUTER JOIN Dept d ON (e.DeptNo = d.DeptNo)
WHERE d.DeptNo IS NULL
171. NOT EXISTS Clause
NOT EXISTS SubQuery
SELECT * FROM Employee e WHERE
NOT EXISTS (SELECT 1 FROM Dept d WHERE e.DeptNo = d.DeptNo)
SELECT e.* FROM Employee e
LEFT OUTER JOIN Dept d ON (e.DeptNo = d.DeptNo)
WHERE d.DeptNo IS NULL
178. LIKE Clause
LIKE / NOT LIKE
SELECT * FROM Employee e WHERE name LIKE '%steve'
SELECT * FROM Employee e WHERE name NOT LIKE '%steve'
SELECT e.* FROM Employee e WHERE name LIKE '%steve'
SELECT e.* FROM Employee e WHERE NOT name LIKE '%steve'
190. JOIN Operator (3/4)
LEFT OUTER JOIN
SELECT * FROM Emp, Dept
WHERE Emp.deptNo = Dept.deptNo(+)
SELECT * FROM Emp
LEFT OUTER JOIN Dept ON Emp.deptNo = Dept.deptNo
194. JOIN Operator (4/4)
RIGHT OUTER JOIN
SELECT * FROM Emp, Dept
WHERE Emp.deptNo(+) = Dept.deptNo
SELECT * FROM Emp
RIGHT OUTER JOIN Dept ON Emp.deptNo = Dept.deptNo
198. Condition Function
CASE
CASE expr WHEN cond1 THEN r1 [WHEN cond2 THEN r2]* [ELSE r] END
236. Analytic Function
• Joins, WHERE, and GROUP BY clauses are performed first
• The analytic functions are then applied to that result set
• Finally, the ORDER BY clause is processed
237. Analytic Function
Rank salary in dept
name | dept       | salary
-----|------------|-------
a    | Research   | 100
b    | Research   | 100
c    | Sales      | 200
d    | Sales      | 300
e    | Research   | 50
f    | Accounting | 200
g    | Accounting | 300
h    | Accounting | 400
i    | Research   | 10
239. Analytic Function
[diagram sequence across slides 239-246: the emp rows are read by several map tasks; DISTRIBUTE BY dept routes all rows of the same dept to the same reducer; SORT BY dept, salary orders the rows within each reducer; RANK(dept, salary) then assigns sequential numbers within each dept, e.g. Sales → (c, 200, 1), (d, 300, 2); Accounting → (f, 200, 1), (g, 300, 2), (h, 400, 3)]
251. Analytic Function
RANK
-- Oracle
SELECT name, dept, salary, RANK() OVER (PARTITION BY dept
ORDER BY salary DESC) FROM emp
-- Hive: RANK(arg1, arg2) - Custom UDF
SELECT e.name, e.dept, e.salary, RANK(e.dept, e.salary)
FROM (SELECT name, dept, salary FROM emp
      DISTRIBUTE BY dept SORT BY dept, salary DESC) e
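The custom RANK works because the inner query hands each reducer its rows grouped by dept (DISTRIBUTE BY) and sorted (SORT BY). A minimal sketch of the stateful logic such a UDF wraps; the class shape is illustrative, since a real Hive UDF extends org.apache.hadoop.hive.ql.exec.UDF and takes Writable arguments. The counter restarts whenever the key changes and increments otherwise.

```java
// Sketch of the stateful rank logic: assumes rows arrive grouped and
// sorted per key (dept), as DISTRIBUTE BY ... SORT BY ... guarantees.
public class Rank {
    private String lastKey = null;
    private int rank = 0;

    public int evaluate(String key) {
        if (!key.equals(lastKey)) {
            rank = 0;      // new group: restart numbering at 1
            lastKey = key;
        }
        return ++rank;     // sequential number within the group
    }
}
```

Note this numbers every row, so ties get distinct values (as in the slide's example), behaving like Oracle's ROW_NUMBER() rather than a tie-aware RANK().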
265. Future Work
• HiveQL SQL Compliance
• HIVE-282 - IN statement for WHERE clauses
• HIVE-192 - Add TIMESTAMP column type
• HIVE-1269 - Support Date/Datetime/Time/Timestamp Primitive Types
• Analytic Function
• HIVE-896 - Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive
• HIVE-952 - Support analytic NTILE function
• Optimization
• HIVE-1694 - Accelerate GROUP BY execution using indexes
• HIVE-482 - Optimize Group By + Order By with the same keys
270. Hive
A system for managing and querying structured data built on top of Hadoop
Oracle 2 Hive
• data model
• ANSI-SQL
• built-in function / custom UDF
• analytic function