The Hive SQL Compilation Process

chenchun@meituan.com

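Monday, 30 December,

Contents
1. How MapReduce implements Join, Group By, and Distinct
2. How SQL is converted into MapReduce
(1) Antlr && ASTTree
(2) QueryBlock, the basic building unit of SQL
(3) Operator, the logical operator
(4) The logical optimizer
(5) Converting the OperatorTree into MapReduce jobs
(6) The physical optimizer and how MapJoin works
3. How to read a Hive execution plan
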
Join
select u.name, o.orderid from order o join user u on o.uid = u.uid;

Input tables:

user               order
uid  name          uid  orderid
1    apple         1    1001
2    orange        1    1002
                   2    1003

Map: the join column uid becomes the key; the value is tagged with its source table (1 = user, 2 = order).

user                   order
key  value             key  value
1    <1,apple>         1    <2,1001>
2    <1,orange>        1    <2,1002>
                       2    <2,1003>

Shuffle / Sort, then Reduce:

key  value          =>   name    orderid
1    <1,apple>           apple   1001
1    <2,1001>            apple   1002
1    <2,1002>

key  value          =>   name    orderid
2    <1,orange>          orange  1003
2    <2,1003>
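
The join slide can be replayed in a few lines of Python. This is only a single-process sketch of the idea, not Hive code: the function names map_phase, shuffle, and reduce_phase are made up for illustration, and the tag values (1 = user, 2 = order) are the ones shown on the slide.

```python
from collections import defaultdict

USER = [(1, "apple"), (2, "orange")]            # (uid, name)
ORDER = [(1, 1001), (1, 1002), (2, 1003)]       # (uid, orderid)

def map_phase():
    """Emit (join key, (table tag, payload)): tag 1 = user, tag 2 = order."""
    for uid, name in USER:
        yield uid, (1, name)
    for uid, orderid in ORDER:
        yield uid, (2, orderid)

def shuffle(pairs):
    """Group values by key; sorting puts tag 1 rows before tag 2 rows."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sorted(values) for key, values in sorted(groups.items())}

def reduce_phase(groups):
    """Pair every user row with every order row that shares the key."""
    for values in groups.values():
        names = [payload for tag, payload in values if tag == 1]
        orderids = [payload for tag, payload in values if tag == 2]
        for name in names:
            for orderid in orderids:
                yield name, orderid

result = list(reduce_phase(shuffle(map_phase())))
print(result)
```

The tag is what lets the reducer tell which table a value came from once rows from both tables arrive under the same key.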
Group By
select rank, isonline, count(*) from city group by rank, isonline;

Input (table city, read as two splits):

split 1             split 2
rank  isonline      rank  isonline
A     1             A     1
A     1             B     0

Map: the group-by columns form the key; each mapper pre-aggregates its own split.

split 1             split 2
key     value       key     value
<A,1>   2           <A,1>   1
                    <B,0>   1

Shuffle / Sort, then Reduce:

key     value   =>   rank  isonline  value
<A,1>   2            A     1         3
<A,1>   1

key     value   =>   rank  isonline  value
<B,0>   1            B     0         1
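
A minimal sketch of the same Group By flow, assuming each split is aggregated on its own before the partial counts are summed. The helper names map_side_aggregate and reduce_counts are hypothetical; Python's Counter stands in for the hash table a mapper would keep.

```python
from collections import Counter

SPLITS = [
    [("A", 1), ("A", 1)],    # split 1: (rank, isonline)
    [("A", 1), ("B", 0)],    # split 2
]

def map_side_aggregate(rows):
    """Hash-aggregate one split: count rows per (rank, isonline) key."""
    return Counter(rows)

def reduce_counts(partials):
    """Sum the partial counts that reach the reducers for each key."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)

partials = [map_side_aggregate(split) for split in SPLITS]
result = reduce_counts(partials)
print(result)
```

Pre-aggregating in the mapper means split 1 ships one record (<A,1>, 2) instead of two.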
Distinct
select dealid, count(distinct uid) num from order group by dealid;

Input (table order, read as two splits):

split 1           split 2
uid  dealid       uid  dealid
1    1001         1    1002
2    1002         1    1002
2    1001         2    1001

Map: the key is <dealid, uid>; the partition key is dealid alone.

split 1                              split 2
key        value  partition key      key        value  partition key
<1001,1>   1      1001               <1002,1>   1      1002
<1002,2>   1      1002               <1001,2>   1      1001
<1001,2>   1      1001

Shuffle / Sort, then Reduce:

key        value   =>   dealid  num
<1001,1>   1            1001    2
<1001,2>   1
<1001,2>   1

key        value   =>   dealid  num
<1002,1>   2            1002    2
<1002,2>   1
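
The single-distinct case can be sketched as follows. The sketch assumes the reducer sees keys sorted by (dealid, uid) and counts a uid only when it differs from the previous one; it skips map-side aggregation, and the function names are illustrative only.

```python
from itertools import groupby

ROWS = [(1, 1001), (2, 1002), (2, 1001),     # (uid, dealid), split 1
        (1, 1002), (1, 1002), (2, 1001)]     # split 2

def map_phase(rows):
    """Emit the composite key (dealid, uid); dealid alone is the partition key."""
    return [(dealid, uid) for uid, dealid in rows]

def reduce_phase(sorted_keys):
    """Count a uid only when it differs from the previous uid of the same dealid."""
    result = {}
    for dealid, keys in groupby(sorted_keys, key=lambda k: k[0]):
        last_uid, num = None, 0
        for _, uid in keys:
            if uid != last_uid:
                num += 1
                last_uid = uid
        result[dealid] = num
    return result

result = reduce_phase(sorted(map_phase(ROWS)))
print(result)
```

Because the sort brings equal uids together, the reducer needs one remembered value per group rather than a set of all uids seen.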
Distinct
select dealid, count(distinct uid), count(distinct date) from order group by dealid;

Input (table order):

uid  dealid  date
1    1001    1101
2    1001    1101
2    1001    1102

Approach 1: emit <dealid, uid, date> as the key.

key              value  partition key
<1001,1,1101>    1      1001
<1001,2,1101>    1      1001
<1001,2,1102>    1      1001

With this layout the reducer has to deduplicate uid and date separately in memory.

Approach 2: emit one row per distinct column, with <dealid, tag, column value> as the key (tag 0 = uid, tag 1 = date).

input row        key              value  partition key
1  1001  1101    <1001,0,1>       1      1001
                 <1001,1,1101>    1      1001
2  1001  1101    <1001,0,2>       1      1001
                 <1001,1,1101>    1      1001
2  1001  1102    <1001,0,2>       1      1001
                 <1001,1,1102>    1      1001

The reducer only needs to track lastDealid, lastTag, lastuid, and lastDate.
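
Approach 2 can be sketched like this, assuming tag 0 marks uid and tag 1 marks date as in the keys on the slide. The sketch keeps the whole previous key in one variable, `last`, instead of the four separate fields named above; everything else about it (function names, the result layout) is made up for illustration.

```python
ROWS = [(1, 1001, 1101), (2, 1001, 1101), (2, 1001, 1102)]   # (uid, dealid, date)

def map_phase(rows):
    """Emit one key per distinct column: (dealid, tag, value)."""
    for uid, dealid, date in rows:
        yield dealid, 0, uid       # tag 0 = uid
        yield dealid, 1, date      # tag 1 = date

def reduce_phase(sorted_keys):
    """Count distinct values per (dealid, tag) by comparing with the last key only."""
    counts = {}
    last = None
    for key in sorted_keys:
        dealid, tag, _ = key
        if key != last:
            per_deal = counts.setdefault(dealid, [0, 0])
            per_deal[tag] += 1
            last = key
    return counts

result = reduce_phase(sorted(map_phase(ROWS)))
print(result)   # maps dealid to [count(distinct uid), count(distinct date)]
```

The price of the constant-memory reducer is visible in map_phase: every input row is emitted once per distinct column, so more data is shuffled.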

Compile Workflow

Hive QL
  -> Parser             -> AST Tree
  -> Semantic Analyzer  -> QB
  -> Logical Plan Gen   -> Operator Tree
  -> Logical Optimizer  -> Operator Tree
  -> Physical Plan Gen  -> Task Tree
  -> Physical Optimizer -> Task Tree

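Antlr
• Antlr is a language-recognition tool.
• It can be used to build domain-specific languages.
• You only write a grammar file that defines the lexical and syntactic rewrite rules; Antlr then handles lexical analysis, syntax analysis, semantic analysis, and intermediate code generation.

AST Tree
To process an expression further, for example to evaluate its result, Antlr offers two options. First, embed actions (code fragments) directly in the grammar file. Second, use Antlr's abstract syntax tree grammar to convert the user's input into an intermediate representation, the abstract syntax tree, during parsing, and do the computation later while walking that tree.
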
Example SQL
[Figure: the example SQL statement; not preserved in this transcript.]

Sub Query
[Figure: the sub queries of the example SQL, labeled 1 and 2; not preserved in this transcript.]

From => AST (1.1)
Select => AST (1.2)
Where => AST (1.3)
[Figures: the AST generated for each clause; not preserved in this transcript.]

QueryBlock
• QueryBlock: the basic building unit of a SQL statement. It has three parts: input source, computation, and output.
• Generating QueryBlocks from the AST Tree means finding every basic unit in the abstract syntax tree and the relationships between the units. One QB object is created per basic unit, and the unit's different operations become different attributes of that QB object.

QueryBlock
Annotations on the QueryBlock class diagram (the diagram itself is not preserved in this transcript):
• the mapping between table names and aliases
• sub queries
• QBExpr is meant to express the relationships between QBs, but currently only Union is implemented
• Join ASTTree
• key='insclause-i', value=ASTTree
• records the metadata of the source tables

AST Tree => QB
Pre-order traversal of the AST Tree in SemanticAnalyzer#doPhase1. The example yields two query blocks, QB1 and QB2.

1. TOK_QUERY > create a QB object and recurse into the child nodes
2. TOK_FROM > QB#aliasToTabs.put(alias, tabname); QB#aliases.put(alias, tabname);
   QBParseInfo#aliasToSrc.put(alias.toLowerCase(), ast);
3. TOK_INSERT > recurse into the child nodes
4. TOK_DESTINATION > QBParseInfo#nameToDest.put("insclause-i", astnode)
5. TOK_SELECT > QBParseInfo#destToSelExpr.put("insclause-i", astnode);
   destToAggregationExprs.put("insclause-i", astnode);
   destToDistinctFuncExprs.put("insclause-i", astnode);
6. TOK_WHERE > QBParseInfo#destToWhereExpr.put("insclause-i", ast);
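
The per-token bookkeeping of doPhase1 can be imitated on a toy tree. Everything in this sketch is an assumption made for illustration: the AST is a nested (token, payload, children) tuple rather than Hive's ASTNode, the QB class merges QB and QBParseInfo into one object, the table and column names are invented, and sub queries are not handled.

```python
class QB:
    """Toy stand-in for Hive's QB and QBParseInfo (field names follow the slide)."""
    def __init__(self):
        self.alias_to_tabs = {}
        self.name_to_dest = {}
        self.dest_to_sel_expr = {}
        self.dest_to_where_expr = {}

def do_phase1(node, qb=None, dest="insclause-0"):
    """Pre-order walk: record what the current token contributes, then recurse."""
    token, payload, children = node
    if token == "TOK_QUERY":
        qb = QB()                                   # one QB per query block
    elif token == "TOK_FROM":
        alias, tabname = payload
        qb.alias_to_tabs[alias.lower()] = tabname
    elif token == "TOK_DESTINATION":
        qb.name_to_dest[dest] = payload
    elif token == "TOK_SELECT":
        qb.dest_to_sel_expr[dest] = payload
    elif token == "TOK_WHERE":
        qb.dest_to_where_expr[dest] = payload
    for child in children:                          # TOK_INSERT only recurses
        do_phase1(child, qb, dest)
    return qb

# Hypothetical AST for: select u.name from user u where u.uid = 1
AST = ("TOK_QUERY", None, [
    ("TOK_FROM", ("u", "user"), []),
    ("TOK_INSERT", None, [
        ("TOK_DESTINATION", "TOK_TMP_FILE", []),
        ("TOK_SELECT", ["u.name"], []),
        ("TOK_WHERE", "u.uid = 1", []),
    ]),
])

qb = do_phase1(AST)
print(qb.alias_to_tabs, qb.dest_to_where_expr)
```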
Operator
• A logical operator performs one specific function in the Map phase or the Reduce phase.
• Common operators include TableScanOperator, SelectOperator, FilterOperator, JoinOperator, GroupByOperator, and ReduceSinkOperator.
• The Map phase and the Reduce phase each consist of one OperatorTree.
• Computation is streamed: once an operator has finished processing a row, it passes the row to its childOperator.
• Some operators are terminal operators (TerminalOperator) that mark the end of a Map/Reduce phase. For example, FileSinkOperator writes data to a file and marks the end of the current phase.
• ReduceSinkOperator can only appear in the Map phase. It serializes the Map-side columns into the Reduce key/value and the partition key.

Operator
• RowSchema describes the operator's output columns.
• InputObjInspector and outputObjInspector parse the input and output columns.
• After a row has been processed by an operator, Hive renumbers its columns; colExprMap is used by the LogicalOptimizer to trace column names back.
• Every parameter an operator needs at run time is stored in its OperatorDesc. The OperatorDesc is serialized to HDFS before the job is submitted, then read from HDFS and deserialized before the MR task runs.
• The Map-phase OperatorTree is stored on HDFS at Job.getConf("hive.exec.plan") + "/map.xml".
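
The streaming behaviour described above, where each operator handles one row and hands it to its child, can be sketched with a few toy classes. These classes only borrow the idea; their names and methods (then, process, forward) are illustrative and are not Hive's operator API.

```python
class Operator:
    """Base operator: process one row, then forward it to the child operators."""
    def __init__(self):
        self.children = []

    def then(self, child):
        self.children.append(child)
        return child

    def process(self, row):
        self.forward(row)

    def forward(self, row):
        for child in self.children:
            child.process(row)

class Filter(Operator):
    def __init__(self, predicate):
        super().__init__()
        self.predicate = predicate

    def process(self, row):
        if self.predicate(row):          # rows that fail are dropped here
            self.forward(row)

class Select(Operator):
    def __init__(self, columns):
        super().__init__()
        self.columns = columns

    def process(self, row):
        self.forward({c: row[c] for c in self.columns})

class Sink(Operator):
    """Terminal operator: collects rows instead of forwarding them."""
    def __init__(self):
        super().__init__()
        self.rows = []

    def process(self, row):
        self.rows.append(row)

scan = Operator()                        # stands in for a table scan
sink = scan.then(Filter(lambda r: r["uid"] == 1)).then(Select(["orderid"])).then(Sink())
for row in [{"uid": 1, "orderid": 1001}, {"uid": 2, "orderid": 1003}]:
    scan.process(row)
print(sink.rows)
```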
QB => Operator Tree
In-order traversal of the QB in SemanticAnalyzer#genPlan(QB qb)

SemanticAnalyzer#genPlan
1. QB#aliasToSubq => recursive call to genPlan()
2. QB#aliasToTabs => TableScanOperator
3. QBParseInfo#joinExpr => QBJoinTree => ReduceSinkOperator + JoinOperator
SemanticAnalyzer#genBodyPlan
4. QBParseInfo#destToWhereExpr => FilterOperator
5. QBParseInfo#destToGroupby => ReduceSinkOperator + GroupByOperator
6. QBParseInfo#destToOrderby => ReduceSinkOperator + ExtractOperator
7. ...
QB2 : aliasToTabs => TableScanOperator

QB#aliasToTabs {du=dim.user, c=detail.usersequence_client, p=fact.orderpayment}
TableScanOperator("dim.user") TS[0]
TableScanOperator("detail.usersequence_client") TS[1]
TableScanOperator("fact.orderpayment") TS[2]
QBJoinTree
[Figure: the QBJoinTree class; not preserved in this transcript.]

QB2 : QBParseInfo#joinExpr => QBJoinTree
A pre-order traversal of joinExpr generates the QBJoinTree.

Step 1:

    p
   / \
  c   p

Step 2:

      base
      /  \
     p    du
    / \
   c   p
QB2 : QBJoinTree => RS + JOIN
Pre-order traversal of the QBJoinTree
TS=TableScanOperator RS=ReduceSinkOperator JOIN=JoinOperator

QBJoinTree:

      base
      /  \
     p    du
    / \
   c   p

Resulting OperatorTree:

TS[c]    TS[p]
  |        |
RS[3]    RS[4]
    \    /
   JOIN[5]    TS[du]
      |          |
    RS[6]      RS[7]
        \      /
        JOIN[8]
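
The operator numbering in the tree above can be reproduced with a short recursive sketch. It assumes the join tree is a nested pair of aliases, that the three TableScanOperators already exist, and that new operators are numbered from 3 upward; gen_join_plan is a hypothetical name, not the Hive method. The sketch builds both inputs of a join before the join itself, so that the operators a join reads from already exist.

```python
import itertools

def gen_join_plan(tree, scans, next_id, edges):
    """Return the operator producing the rows of `tree`, appending edges as it goes.

    A leaf is a table alias; an inner node is a (left, right) pair.
    """
    if isinstance(tree, str):
        return scans[tree]
    left = gen_join_plan(tree[0], scans, next_id, edges)
    right = gen_join_plan(tree[1], scans, next_id, edges)
    rs_left = f"RS[{next(next_id)}]"       # one ReduceSink per join input
    rs_right = f"RS[{next(next_id)}]"
    join = f"JOIN[{next(next_id)}]"
    edges += [(left, rs_left), (right, rs_right), (rs_left, join), (rs_right, join)]
    return join

SCANS = {"c": "TS[c]", "p": "TS[p]", "du": "TS[du]"}
JOIN_TREE = (("c", "p"), "du")             # base -> (p -> (c, p), du)

edges = []
root = gen_join_plan(JOIN_TREE, SCANS, itertools.count(3), edges)
for parent, child in edges:
    print(parent, "->", child)
```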
QB2 : genBodyPlan
QBParseInfo#destToWhereExpr > FilterOperator
FIL=FilterOperator SEL=SelectOperator

TS[c]    TS[p]
  |        |
RS[3]    RS[4]
    \    /
   JOIN[5]    TS[du]
      |          |
    RS[6]      RS[7]
        \      /
        JOIN[8]
           |
        FIL[9]
           |
        SEL[10]
QB1 : genBodyPlan
QBParseInfo#destToGroupby > ReduceSinkOperator + GroupByOperator
GBY=GroupByOperator

JOIN[8]      (the QB2 operators above JOIN[8] are unchanged)
   |
FIL[9]
   |
SEL[10]
   |
SEL[11]
   |
GBY[12]      hash-mode aggregation (HashMode AGGR)
   |
RS[13]
   |
GBY[14]
QB1 : genPostGroupByBodyPlan
FS=FileSinkOperator

QB2:

TS[c]    TS[p]
  |        |
RS[3]    RS[4]
    \    /
   JOIN[5]    TS[du]
      |          |
    RS[6]      RS[7]
        \      /
        JOIN[8]
           |
        FIL[9]
           |
        SEL[10]

QB1:

SEL[11]
   |
GBY[12]
   |
RS[13]
   |
GBY[14]
   |
SEL[15]
   |
SEL[16]
   |
FS[17]

Logical Optimizer
Transforms the OperatorTree. The optimizers serve two goals:
1) make one job do as much as possible / merge jobs
2) reduce the amount of shuffled data, or even skip the Reduce phase

Goal  Name                      Purpose
2)    PredicatePushDown         Predicate pushdown
      ColumnPruner              Column pruning
2)    GroupByOptimizer          Map-side aggregation
1)    ReduceSinkDeDuplication   Merges reduces in a linear OperatorTree that have the same partition/sort key
1)    CorrelationOptimizer      Uses correlations within the query to merge correlated jobs (HIVE-2206)
2)    SimpleFetchOptimizer      Optimizes aggregate queries that have no GroupBy expression
2)    MapJoinProcessor          MapJoin (a hint must be provided)
2)    BucketMapJoinOptimizer    BucketMapJoin
PredicatePushDown
Evaluates predicates as early as possible (QB2).

Before:

TS[c]    TS[p]
  |        |
RS[3]    RS[4]
    \    /
   JOIN[5]    TS[du]
      |          |
    RS[6]      RS[7]
        \      /
        JOIN[8]
           |
        FIL[9]
           |
        SEL[10]

After:

         TS[p]
           |
TS[c]   FIL[18]
  |        |
RS[3]    RS[4]
    \    /
   JOIN[5]    TS[du]
      |          |
    RS[6]      RS[7]
        \      /
        JOIN[8]
           |
        SEL[10]
NonBlockingOpDeDupProc
Merges SEL-SEL or FIL-FIL into a single operator (QB1).

Before       After
SEL[11]
   |
GBY[12]      GBY[12]
   |            |
RS[13]       RS[13]
   |            |
GBY[14]      GBY[14]
   |            |
SEL[15]      SEL[15]
   |            |
SEL[16]      FS[17]
   |
FS[17]
ReduceSinkDeDuplication
Merges two ReduceSinkOperators that are connected in a linear chain.
from (select key, value from src group by key, value) s select s.key group by s.key;

Before (two stages):

Stage-1        Stage-2
TS             TS
|              |
SEL            RS    (cRS)
|              |
GBY            GBY
|              |
RS    (pRS)    SEL
|              |
GBY            FS
|
SEL
|
GBY
|
FS

       key          partition key
pRS    key, value   key, value
cRS    key          key

Conditions:
• the pRS key fully contains the cRS key, in the same sort order
• the pRS partition key fully contains the cRS partition key
• do the two jobs use the same number of reducers (numReduce)?

After (one stage):

TS
|
SEL
|
GBY
|
RS      key: key, value; partition key: key
|
GBY
|
SEL
|
GBY
|
FS
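
The two containment conditions can be written down as a small check. This is a sketch under stated assumptions, not the Hive implementation: "fully contains, in the same sort order" is read here as "the cRS key is a leading prefix of the pRS key with matching sort directions", each ReduceSink is a plain dict, and the "+" marks for ascending order are an invented notation.

```python
def is_prefix(shorter, longer):
    """True if `shorter` equals the leading elements of `longer`."""
    return list(longer[:len(shorter)]) == list(shorter)

def can_merge(p_rs, c_rs):
    """Check the key and partition-key conditions between parent and child RS."""
    keys_ok = is_prefix(c_rs["key"], p_rs["key"])
    order_ok = is_prefix(c_rs["order"], p_rs["order"])
    partition_ok = set(c_rs["partition"]) <= set(p_rs["partition"])
    return keys_ok and order_ok and partition_ok

def merge(p_rs, c_rs):
    """Merged RS as on the slide: sort on the pRS key, partition on the cRS key."""
    return {"key": p_rs["key"], "order": p_rs["order"], "partition": c_rs["partition"]}

p_rs = {"key": ["key", "value"], "order": "++", "partition": ["key", "value"]}
c_rs = {"key": ["key"], "order": "+", "partition": ["key"]}

if can_merge(p_rs, c_rs):
    print(merge(p_rs, c_rs))
```

Partitioning on key alone still sends all rows of one key to one reducer, and sorting on (key, value) keeps both group-by levels contiguous, which is why one shuffle can serve both aggregations.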
MapReduceCompiler
• Generate a MoveTask for the output table.
• Walk depth-first downward from one of the root nodes of the OperatorTree.
• ReduceSinkOperator marks the boundary between Map and Reduce, and the boundary between jobs.
• Walk the other root nodes; when a JoinOperator is reached, merge the MapReduceTasks.
• Generate a StatTask to update the metadata.
• Cut the operator links between Map and Reduce.
R0 gen MoveTask & Fetch Task

QB1:

GBY[12]
   |
RS[13]
   |
GBY[14]
   |
SEL[15]
   |
FS[17]

MapredLockWork[Stage-0]
Stage-0: Move Operator
Begin Walk

QB2:

         TS[p]
           |
TS[c]   FIL[18]
  |        |
RS[3]    RS[4]
    \    /
   JOIN[5]    TS[du]
      |          |
    RS[6]      RS[7]
        \      /
        JOIN[8]
           |
        SEL[10]

1. toWalk[] {TS[c], TS[du], TS[p]}   opStack {}
2. toWalk[] {TS[c], TS[du]}          opStack {TS[p]}
R1 GenMRTableScan1
toWalk[] {TS[du], TS[c]} opStack {TS[p]}
"".join([t + "%" for t in opStack]) matches "TS%"

Result on the QB2 tree: TS[p] is labeled Stage-1 MapRedTask.
R2 GenMRRedSink1
toWalk[] {TS[du], TS[c]} opStack {TS[p], FIL[18], RS[4]}
re.search("TS%.*RS%", "".join([t + "%" for t in opStack]))
TS[p]
|
TS[c] FIL[18]
|
|
RS[3] RS[4]

/
JOIN[5] TS[du]
|
|
RS[6]
RS[7]

/
JOIN[8]
|
SEL[10]

TS[p]
Stage-1 MapTask
|
TS[c] FIL[18]
|
|
RS[3] RS[4]

/
JOIN[5] TS[du]
|
|
RS[6]
RS[7]

/
JOIN[8]
|
Stage-1 ReduceTask
SEL[10]

Stage-1 MapTask
Parser

Semantic
Analyzer

Logical
Plan Gen.
51

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
R3 GenMRRedSink2
toWalk[] {TS[du], TS[c]} opStack {TS[p], FIL[18], RS[4], JOIN[5], RS[6]}
re.search("RS%.*RS%", "".join([t + "%" for t in opStack]))
TS[p]
|
TS[c] FIL[18]
|
|
RS[3] RS[4]

/
JOIN[5] TS[du]
|
|
RS[6]
RS[7]

/
JOIN[8]
|
SEL[10]

Stage-1 MapTask

TS[p]
|
TS[c] FIL[18]
|
|
RS[3] RS[4]

/
JOIN[5] TS[du]
|
|
RS[6]
RS[7]

/
JOIN[8]
|
SEL[10]

Stage-1

Stage-1 ReduceTask
Parser

Semantic
Analyzer

splitPlan

Logical
Plan Gen.

Logical
Optimizer

Physical
Plan Gen.

TS[p]
|
FIL[18]
|
RS[4]
/
JOIN[5]
|
FS[19]

TS[20]
|
RS[6]

JOIN[8]
|
SEL[10]

Intermediate data is materialized
and stored in a temporary HDFS file
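
The splitPlan step shown on this slide can be sketched as follows. This is a minimal illustration, not Hive's implementation: the operator tree is simplified to a linear list of labels, and `split_plan`, `rs_index`, and `next_id` are hypothetical names introduced here. The chain is cut in front of a second ReduceSink, the upstream job gets a FileSink that writes the temporary file, and the downstream job gets a TableScan that reads it back.

```python
def split_plan(chain, rs_index, next_id):
    """Cut a linear operator chain in front of the ReduceSink at rs_index.

    Returns (upstream, downstream). The upstream job ends in a new FileSink
    (FS) writing a temporary file; the downstream job starts with a new
    TableScan (TS) reading that file. Operator ids are assumed to be handed
    out sequentially starting at next_id.
    """
    if not chain[rs_index].startswith("RS["):
        raise ValueError("can only split in front of a ReduceSink")
    file_sink = f"FS[{next_id}]"
    table_scan = f"TS[{next_id + 1}]"
    return chain[:rs_index] + [file_sink], [table_scan] + chain[rs_index:]


chain = ["TS[p]", "FIL[18]", "RS[4]", "JOIN[5]", "RS[6]", "JOIN[8]", "SEL[10]"]
stage1, stage2 = split_plan(chain, rs_index=4, next_id=19)
print(stage1)
print(stage2)
```

With the ids on the slide, the first job ends in FS[19] and the second starts at TS[20].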

Stage-2

52

Monday, 30 December,

MR[Stage-1] MR[Stage-2]

Physical
Optimizer
R3 GenMRRedSink2
toWalk[] {TS[du], TS[c]} opStack {TS[p], FIL[18], RS[4], JOIN[5], RS[6],
JOIN[8], SEL[10], GBY[12], RS[13]}
re.search("RS%.*RS%", "".join([t + "%" for t in opStack]))
TS[20]
|
RS[6]

JOIN[8]
|
SEL[10]
|
GBY[12]
|
RS[13]
Stage-2
|
GBY[14]
|
SEL[15]
|
FS[17]
Parser

TS[20]
Stage-2
|
RS[6]

JOIN[8]
|
splitPlan
SEL[10]
|
GBY[12]
|
RS[13]
|
GBY[14]
|
SEL[15]
|
Stage-3
FS[17]
Semantic
Analyzer

Logical
Plan Gen.
53

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

MR[Stage-2]

MR[Stage-3]

TS[20]
|
RS[6]

JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

TS[22]
|
RS[13]
|
GBY[14]
|
SEL[15]
|
FS[17]

Physical
Optimizer
R4 GenMRFileSink1
toWalk[] {TS[du], TS[c]} opStack {TS[p], FIL[18], RS[4], JOIN[5], RS[6],
JOIN[8], SEL[10], GBY[12], RS[13], GBY[14], SEL[15], FS[17]}
re.search("FS%", "".join([t + "%" for t in opStack]))

MR[Stage-1]
|
MR[Stage-2]
|
MR[Stage-3]

Parser

MoveWork[Stage-0]

Semantic
Analyzer

Logical
Plan Gen.
54

Monday, 30 December,

MR[Stage-1]
|
MR[Stage-2]
|
MR[Stage-3]
|
MoveWork[Stage-0]
|
StatsWork[Stage-4]
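
The four rules used during the walk (R1 to R4) can be sketched as regular expressions tried against the operator stack. This is a simplified illustration under stated assumptions: the rule names and patterns come from the slides, while `signature`, `match_rule`, and the "shortest matching suffix wins" policy are introduced here and may differ from what the real dispatcher does.

```python
import re

# Rule names and patterns as they appear on the slides.
RULES = [
    ("R1 GenMRTableScan1", r"TS%"),
    ("R2 GenMRRedSink1", r"TS%.*RS%"),
    ("R3 GenMRRedSink2", r"RS%.*RS%"),
    ("R4 GenMRFileSink1", r"FS%"),
]


def signature(op_stack):
    """Join operator type names with '%': ['TS[p]', 'RS[4]'] -> 'TS%RS%'."""
    return "".join(op.split("[")[0] + "%" for op in op_stack)


def match_rule(op_stack):
    """Return the rule that matches the shortest suffix of the stack, or None.

    Assumption: the shortest matching suffix wins.
    """
    for start in range(len(op_stack) - 1, -1, -1):
        suffix = signature(op_stack[start:])
        for name, pattern in RULES:
            if re.fullmatch(pattern, suffix):
                return name
    return None


print(match_rule(["TS[p]"]))
print(match_rule(["TS[p]", "FIL[18]", "RS[4]"]))
print(match_rule(["TS[p]", "FIL[18]", "RS[4]", "JOIN[5]", "RS[6]"]))
```

Matching on suffixes is what lets the second ReduceSink on a path fire R3 rather than R2, even though the stack still begins with a TableScan.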

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
Begin Walk
TS[du]
|
RS[7]
/
JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

Parser

Semantic
Analyzer

opStack.clear()

Logical
Plan Gen.
55

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
Begin Walk
TS[du]
|
RS[7]
/
JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

Parser

Semantic
Analyzer

toWalk[] {TS[c], TS[du]}
opStack {}

Logical
Plan Gen.
56

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
R1 GenMRTableScan1
toWalk[] {TS[c]} opStack {TS[du]}
re.search("TS%", "".join([t + "%" for t in opStack]))
TS[du] Stage-5 MapTask
|
RS[7]
/
JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

TS[du]
|
RS[7]
/
JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

Parser

Semantic
Analyzer

Logical
Plan Gen.
57

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
R2 GenMRRedSink1
toWalk[] {TS[c]} opStack {TS[du], RS[7]}
re.search("TS%.*RS%", "".join([t + "%" for t in opStack]))
Stage-5 MapTask
TS[du] Stage-5 MapTask
|
RS[7]
/
JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

TS[du]
|
RS[7]
/
JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

MR[Stage-2]

MR[Stage-2]

+

TS[20]
|
RS[6]

JOIN[8]
|
SEL[10]

merge map
work

Stage-5 ReduceTask

Parser

Semantic
Analyzer

Logical
Plan Gen.
58

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer

TS[20] TS[du]
|
|
RS[6] RS[7]

/
JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]
Begin Walk

TS[c]
|
RS[3]

JOIN[5]
|
FS[19]

Parser

opStack.clear()

Semantic
Analyzer

Logical
Plan Gen.
59

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
Begin Walk

TS[c]
|
RS[3]

JOIN[5]
|
FS[19]

Parser

toWalk[] {TS[c]}
opStack {}

Semantic
Analyzer

Logical
Plan Gen.
60

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
R1 GenMRTableScan1
toWalk[] {} opStack {TS[c]}
re.search("TS%", "".join([t + "%" for t in opStack]))
Stage-6 MapRedTask

TS[c]
|
RS[3]

JOIN[5]
|
FS[19]

Parser

Semantic
Analyzer

TS[c]
|
RS[3]

JOIN[5]
|
FS[19]

Logical
Plan Gen.
61

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
R2 GenMRRedSink1
toWalk[] {} opStack {TS[c], RS[3]}
re.search("TS%.*RS%", "".join([t + "%" for t in opStack]))
Stage-6 MapRedTask

MR[Stage-1]

Stage-6 MapWork

TS[c]
|
RS[3]

JOIN[5]
|
FS[19]

TS[c]
|
RS[3]

JOIN[5]
|
FS[19]

+

TS[p]
|
FIL[18]
|
RS[4]
/
JOIN[5]
|
FS[19]

MR[Stage-1]

merge map
work

Stage-6 RedWork

Parser

Semantic
Analyzer

Logical
Plan Gen.
62

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer

TS[p]
|
TS[c] FIL[18]
|
|
RS[3] RS[4]

/
JOIN[5]
|
FS[19]
breakTaskTree
MR[Stage-1]

MR[Stage-2]

MR[Stage-3]

MR[Stage-1]

MR[Stage-2]

MR[Stage-3]

TS[p]
|
TS[c] FIL[18]
|
|
RS[3] RS[4]

/
JOIN[5]
|
FS[19]

TS[20] TS[du]
|
|
RS[6] RS[7]

/
JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

TS[22]
|
RS[13]
|
GBY[14]
|
SEL[15]
|
FS[17]

TS[p]
|
TS[c] FIL[18]
|
|
RS[3] RS[4]

TS[20] TS[du]
|
|
RS[6] RS[7]

TS[22]
|
RS[13]

Parser

Semantic
Analyzer

Logical
Plan Gen.
63

Monday, 30 December,

JOIN[5]
|
FS[19]

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer

JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

GBY[14]
|
SEL[15]
|
FS[17]

map

reduce
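
The map/reduce division produced by breakTaskTree can be sketched as below. This is an illustration only: each stage is simplified to one linear path of operator labels, and `break_task_tree` is a hypothetical helper. Everything up to and including the ReduceSink runs in the map phase, and everything after it runs in the reduce phase.

```python
def break_task_tree(chain):
    """Split one operator path into (map_ops, reduce_ops) at its ReduceSink.

    A path without any ReduceSink is treated as map-only.
    """
    for i, op in enumerate(chain):
        if op.startswith("RS["):
            return chain[: i + 1], chain[i + 1:]
    return chain, []


stage3 = ["TS[22]", "RS[13]", "GBY[14]", "SEL[15]", "FS[17]"]
map_ops, reduce_ops = break_task_tree(stage3)
print(map_ops)
print(reduce_ops)
```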
Logical Plan => Physical Plan
TS[p]
|
TS[c] FIL[18]
|
|
RS[3] RS[4]

/
JOIN[5] TS[du]
|
|
RS[6]
RS[7]

/
JOIN[8]
|
SEL[10]
|
GBY[12]
|
RS[13]
|
GBY[14]
|
SEL[15]
|
FS[17]
Parser

Semantic
Analyzer

MR[Stage-1]

MR[Stage-3]

TS[p]
|
TS[c] FIL[18]
|
|
RS[3] RS[4]

TS[20] TS[du]
|
|
RS[6] RS[7]

TS[22]
|
RS[13]

JOIN[5]
|
FS[19]

Logical
Plan Gen.
64

Monday, 30 December,

MR[Stage-2]

Logical
Optimizer

JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

Physical
Plan Gen.

GBY[14]
|
SEL[15]
|
FS[17]

Physical
Optimizer

MR[Stage-1]
JOIN[5]
|
MR[Stage-2]
JOIN[8] GBY[12]
|
MR[Stage-3]
GBY[14]
|
MoveWork[Stage-0]
|
StatsWork[Stage-4]
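Contents
1. Basic principles of implementing Join, Group By, and Distinct with MapReduce
2. How SQL is converted into MapReduce
(1) Antlr && ASTTree
(2) QueryBlock, the basic building unit of SQL
(3) Logical operators (Operator)
(4) Logical optimizer
(5) Converting the OperatorTree into MapReduce Jobs
(6) Physical optimizer
3. Hive execution plans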
Physical Optimizer
Name                                   Purpose
CommonJoinResolver + MapJoinResolver   MapJoin
SortMergeJoinResolver                  Works with buckets; similar to merge sort
SamplingOptimizer                      Parallel order by
Vectorizer                             HIVE-4160

Parser

Semantic
Analyzer

Logical
Plan Gen.
66

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
MapJoin
Small Table Data -> MapReduce Local Task -> HashTable Files -> Upload files to DC (Distributed Cache)
MapJoin Task: each Mapper reads Records from the Big Table Data and joins them against the HashTable Files taken from the Distributed Cache
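
The MapJoin flow above can be sketched in a few lines. This is a minimal single-process illustration, not Hive code: `build_hash_table` stands in for the local task that turns the small table into a hash table, and `map_join` stands in for a Mapper streaming big-table records against it. Both names are hypothetical, and the sample rows reuse the user and order tables from the Join slides.

```python
from collections import defaultdict


def build_hash_table(small_rows, key):
    """Local task: index the small table by join key."""
    table = defaultdict(list)
    for row in small_rows:
        table[row[key]].append(row)
    return table


def map_join(big_rows, hash_table, key):
    """Mapper: probe the hash table for every big-table record; no shuffle, no reducer."""
    for row in big_rows:
        for match in hash_table.get(row[key], []):
            yield {**match, **row}


users = [{"uid": 1, "name": "apple"}, {"uid": 2, "name": "orange"}]
orders = [
    {"uid": 1, "orderid": 1001},
    {"uid": 1, "orderid": 1002},
    {"uid": 2, "orderid": 1003},
]

hash_table = build_hash_table(users, "uid")
for joined in map_join(orders, hash_table, "uid"):
    print(joined["name"], joined["orderid"])
```

Because the join finishes inside the map phase, the job needs no shuffle or reduce phase, which is why the small table must fit in memory.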
CommonJoinResolver
Task A

Conditional Task
Memory Bound
Run as a Backup
Task

MapJoin
LocalTask

CommonJoinTask
MapJoinTask
Task C
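
The "Memory Bound" and "Run as a Backup Task" labels describe a fallback, which can be sketched like this. It is an illustration under assumptions: `MemoryBoundExceeded` and `run_conditional_task` are hypothetical names, and the tasks are modeled as plain callables rather than Hive tasks.

```python
class MemoryBoundExceeded(Exception):
    """Hypothetical signal: the local hash table grew past its memory bound."""


def run_conditional_task(map_join_task, common_join_task):
    """Try the map join first; run the common join as the backup task."""
    try:
        return "MapJoinTask", map_join_task()
    except MemoryBoundExceeded:
        return "CommonJoinTask", common_join_task()


def too_big():
    raise MemoryBoundExceeded("small table did not fit in memory")


print(run_conditional_task(too_big, lambda: "joined rows"))
```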

Parser

Semantic
Analyzer

Logical
Plan Gen.
68

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
CommonJoinResolver

MR[Stage-1]
JOIN[5]
|
MR[Stage-2]
JOIN[8] GBY[12]
|
MR[Stage-3]
GBY[14]
|
MoveWork[Stage-0]
|
StatsWork[Stage-4]

Parser

Semantic
Analyzer

Logical
Plan Gen.
69

Monday, 30 December,

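• Traverse the Task Tree depth-first
• Find each JoinOperator and compare the data sizes of its left and right tables
• Small table + big table => MapJoinTask
• Small/big table + intermediate table => ConditionalTask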
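
The decision described in these bullets can be sketched as a small function. This is an illustration only: `choose_join_task` and `small_table_limit` are hypothetical names, an intermediate table is modeled as a size of `None` (unknown at compile time), and the case of two large tables is not covered by the slide, so keeping the common join there is an assumption.

```python
def choose_join_task(left_size, right_size, small_table_limit):
    """Pick a task type from the sizes (in bytes) of the two join inputs.

    A size of None means an intermediate result whose size is only known
    at runtime.
    """
    if left_size is None or right_size is None:
        # Size unknown at compile time: defer the choice to runtime.
        return "ConditionalTask"
    if min(left_size, right_size) <= small_table_limit:
        return "MapJoinTask"
    # Assumption: two large tables keep the ordinary reduce-side join.
    return "CommonJoinTask"


print(choose_join_task(10_000, 5_000_000_000, small_table_limit=25_000_000))
print(choose_join_task(None, 10_000, small_table_limit=25_000_000))
```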

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
CommonJoinResolver
MR[Stage-2]
big table
TS[20] TS[du]
|
|
RS[6] RS[7] deepCopy
JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

Parser

TS[23] TS[25]
|
|
RS[24] RS[26]

TS[23] TS[25] Map Only MR

/
MAPJOIN[44]
|
SEL[35]
|
GBY[36]
|
FS[37]

JOIN[34]
|
SEL[35]
|
GBY[36]
|
FS[37]

Semantic
Analyzer

Logical
Plan Gen.
70

Monday, 30 December,

MRTask[Stage-7]
FetchWork[$INTNAME] LocalWork

MR[Stage-7]

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
CommonJoinResolver
MRTask[Stage-8]
FetchWork[du] LocalWork

MR[Stage-2]
big table
TS[20] TS[du]
|
|
RS[6] RS[7] deepCopy

...

JOIN[8]
|
SEL[10]
|
GBY[12]
|
FS[21]

Parser

Semantic
Analyzer

Logical
Plan Gen.
71

Monday, 30 December,

TS[45] TS[47]

/
MAPJOIN[66]
|
SEL[57]
|
GBY[36]
|
FS[37]

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer

Map Only MR
CommonJoinResolver
MR[Stage-10]
MAPJOIN
|
ConditionalTask[Stage-9]
/
|

MR[Stage-7] MR[Stage-8] MR[Stage-2]
MAPJOIN
MAPJOIN JOIN
(which of these runs is decided at runtime)

|
/

|
/
MR[Stage-3]
|
MoveWork[Stage-0]
|
StatsWork[Stage-4]

MR[Stage-1]
JOIN[5]
|
MR[Stage-2]
JOIN[8] GBY[12]
|
MR[Stage-3]
GBY[14]
|
MoveWork[Stage-0]
|
StatsWork[Stage-4]

Parser

Semantic
Analyzer

Logical
Plan Gen.
72

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
MapJoinResolver
• Traverse the Task Tree and split every MapReduceTask that has local work into two Tasks
MRTask[Stage-13]
FetchWork[c]
HashTableSinkOperator
|
MRTask[Stage-10]
MRWork

MRTask[Stage-10]
FetchWork[c]
MRWork

Parser

Semantic
Analyzer

Logical
Plan Gen.
73

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
MapJoinResolver
Lock[Stage-13]
|
MR[Stage-10]
MAPJOIN
|
ConditionalTask[Stage-9]
/
|

Lock[Stage-11] Lock[Stage-12] 
|
|
|
MR[Stage-7] MR[Stage-8] MR[Stage-2]
MAPJOIN
MAPJOIN
JOIN

|
/

|
/

|
/
MR[Stage-3]
|
MoveWork[Stage-0]
|
StatsWork[Stage-4]

MR[Stage-10]
MAPJOIN
|
ConditionalTask[Stage-9]
/
|

MR[Stage-7] MR[Stage-8] MR[Stage-2]
MAPJOIN
MAPJOIN JOIN

|
/

|
/
MR[Stage-3]
|
MoveWork[Stage-0]
|
StatsWork[Stage-4]

Parser

Semantic
Analyzer

Logical
Plan Gen.
74

Monday, 30 December,

Logical
Optimizer

Physical
Plan Gen.

Physical
Optimizer
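Review
The SQL translation process
1. Antlr defines the SQL grammar rules and performs lexical and syntactic analysis, turning the SQL into an abstract syntax tree (AST Tree)
2. Traverse the AST Tree and abstract out QueryBlock, the basic building unit of a query
3. Traverse the QueryBlock and translate it into the execution logic, the OperatorTree
4. The logical optimizer transforms the OperatorTree, merging ReduceSink operators to reduce the amount of shuffled data
5. Traverse the OperatorTree and translate it into MapReduce tasks
6. The physical optimizer transforms the MapReduce tasks, generating a Conditional Task that checks at runtime whether the join can be converted to a MapJoin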
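Contents
1. Basic principles of implementing Join, Group By, and Distinct with MapReduce
2. How SQL is converted into MapReduce
(1) Antlr && ASTTree
(2) QueryBlock, the basic building unit of SQL
(3) Logical operators (Operator)
(4) Logical optimizer
(5) Converting the OperatorTree into MapReduce Jobs
(6) Physical optimizer: how MapJoin works
3. Hive execution plans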
Execution plan

• AST (abstract syntax tree)
• Stage Dependency
• MapReduce Plan
Stage Dependency
Stage-11 depends on stages: Stage-14 , consists of Stage-15, Stage-16, Stage-4
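Stage-11 is a ConditionalTask that may execute one of Stage-15, Stage-16, or Stage-4.
Currently a ConditionalTask only arises when the decision to convert a join to a
MapJoin is made at execution time. Stage-4 is the common join; Stage-15 and Stage-16
are the two possible MapJoin variants.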
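
A line of this form can be taken apart mechanically. The sketch below is a hypothetical helper written for the exact format quoted on this slide; it is not part of Hive, and other plan lines may need different handling.

```python
import re


def stage_ids(text):
    """Return every 'Stage-N' token in the text, in order."""
    return re.findall(r"Stage-\d+", text)


def parse_dependency(line):
    """Split a 'depends on stages ... consists of ...' line into its parts."""
    stage, rest = line.split(" depends on stages: ", 1)
    parents, _, children = rest.partition(" consists of ")
    return {
        "stage": stage.strip(),
        "depends_on": stage_ids(parents),
        "consists_of": stage_ids(children),  # non-empty for a ConditionalTask
    }


line = "Stage-11 depends on stages: Stage-14 , consists of Stage-15, Stage-16, Stage-4"
print(parse_dependency(line))
```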

MapReduce Plan

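• A ReduceSinkOperator can only appear in the Map phase, and it marks the end of the Map phase
• The combined fields form the reduce key and value
• sort order: ascending by id, then ascending by name
• partition key: a hash of the partition key decides which reducer receives each row
• tag: identifies the table, so that a Join can tell which source table a row came from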
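
The roles of key, partition key, and tag can be sketched as follows. This is an illustration, not Hive's ReduceSinkOperator: `reduce_sink` is a hypothetical function, Python's built-in `hash` stands in for the partitioning hash, and the tags 1 and 2 follow the user and order example from the Join slides.

```python
def reduce_sink(rows, key_field, tag, num_reducers):
    """Emit (partition, key, tag, value) for every input row."""
    for row in rows:
        key = row[key_field]
        partition = hash(key) % num_reducers  # the partition key picks the reducer
        value = {k: v for k, v in row.items() if k != key_field}
        yield partition, key, tag, value


users = [{"uid": 1, "name": "apple"}, {"uid": 2, "name": "orange"}]
orders = [
    {"uid": 1, "orderid": 1001},
    {"uid": 1, "orderid": 1002},
    {"uid": 2, "orderid": 1003},
]

emitted = list(reduce_sink(users, "uid", 1, 2)) + list(reduce_sink(orders, "uid", 2, 2))
# Shuffle and sort: within a reducer, rows arrive ordered by key, then by tag.
for record in sorted(emitted, key=lambda r: (r[0], r[1], r[2])):
    print(record)
```

Sorting by tag inside each key means the reducer sees the rows of one table before the rows of the other, which is what lets a join tell its inputs apart.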
MapReduce Plan

• After each Operator finishes its computation, its fields are renamed as _col + i; Map output fields are written as KEY/VALUE._col + i
• KEY._col1:0._col0: the "0." is a tag attached to the distinct field
• mode: the aggregation mode, one of COMPLETE, PARTIAL1, PARTIAL2, PARTIALS, FINAL, HASH, MERGEPARTIAL
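
To make two of these modes concrete, the sketch below runs a two-phase count over the city example from the Group By slides. It is an illustration under assumptions: the function names are hypothetical, and labeling the map-side step HASH and the reduce-side step MERGEPARTIAL reflects how a typical two-stage count plan is usually annotated, not a statement about every plan.

```python
from collections import Counter


def map_side_partial(rows):
    """Map-side GroupBy (HASH-style): aggregate in an in-memory hash table."""
    return Counter((r["rank"], r["isonline"]) for r in rows)


def reduce_side_merge(partials):
    """Reduce-side GroupBy (MERGEPARTIAL-style): merge the partial counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return dict(total)


split1 = [{"rank": "A", "isonline": 1}, {"rank": "A", "isonline": 1}]
split2 = [{"rank": "A", "isonline": 1}, {"rank": "B", "isonline": 0}]

partials = [map_side_partial(split1), map_side_partial(split2)]
print(reduce_side_merge(partials))
```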
MapReduce Plan

• condition expression: the fields that each of the two tables contributes to the join
• Position of Big Table: indicates that the table with tag=1 is the one with the larger data volume
Thanks && QA

Monday, 30 December,

More Related Content

What's hot

Java Input Output (java.io.*)
Java Input Output (java.io.*)Java Input Output (java.io.*)
Java Input Output (java.io.*)Om Ganesh
 
Flink Forward Berlin 2017: Patrick Lucas - Flink in Containerland
Flink Forward Berlin 2017: Patrick Lucas - Flink in ContainerlandFlink Forward Berlin 2017: Patrick Lucas - Flink in Containerland
Flink Forward Berlin 2017: Patrick Lucas - Flink in ContainerlandFlink Forward
 
Python variables and data types.pptx
Python variables and data types.pptxPython variables and data types.pptx
Python variables and data types.pptxAkshayAggarwal79
 
Introduction to Python programming Language
Introduction to Python programming LanguageIntroduction to Python programming Language
Introduction to Python programming LanguageMansiSuthar3
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowJulien Le Dem
 
Function in Python
Function in PythonFunction in Python
Function in PythonYashdev Hada
 
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...Edureka!
 
PLPgSqL- Datatypes, Language structure.pptx
PLPgSqL- Datatypes, Language structure.pptxPLPgSqL- Datatypes, Language structure.pptx
PLPgSqL- Datatypes, Language structure.pptxjohnwick814916
 
Algoritmos e Lógica de Programação
Algoritmos e Lógica de ProgramaçãoAlgoritmos e Lógica de Programação
Algoritmos e Lógica de ProgramaçãoJose Augusto Cintra
 
Getting The Best Performance With PySpark
Getting The Best Performance With PySparkGetting The Best Performance With PySpark
Getting The Best Performance With PySparkSpark Summit
 
Programando para web com python - Introdução a Python
Programando para web com python - Introdução a PythonProgramando para web com python - Introdução a Python
Programando para web com python - Introdução a PythonAlvaro Oliveira
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsdatamantra
 
Python Basics | Python Tutorial | Edureka
Python Basics | Python Tutorial | EdurekaPython Basics | Python Tutorial | Edureka
Python Basics | Python Tutorial | EdurekaEdureka!
 

What's hot (20)

Java Input Output (java.io.*)
Java Input Output (java.io.*)Java Input Output (java.io.*)
Java Input Output (java.io.*)
 
Java lab-manual
Java lab-manualJava lab-manual
Java lab-manual
 
Internal Hive
Internal HiveInternal Hive
Internal Hive
 
Writing Parsers and Compilers with PLY
Writing Parsers and Compilers with PLYWriting Parsers and Compilers with PLY
Writing Parsers and Compilers with PLY
 
Flink Forward Berlin 2017: Patrick Lucas - Flink in Containerland
Flink Forward Berlin 2017: Patrick Lucas - Flink in ContainerlandFlink Forward Berlin 2017: Patrick Lucas - Flink in Containerland
Flink Forward Berlin 2017: Patrick Lucas - Flink in Containerland
 
Python variables and data types.pptx
The Hive SQL Compilation Process