Meet Your Match
Advanced row pattern matching (12c)
Stew Ashton
UKOUG Tech 17
Can you read the following line? If not, please move closer.
It's much better when you can read the code ;)
Advanced usage, not all the syntax
• Reminder of the basics
• Exercises
• Bin fitting
• Positive and negative sequencing
• Hierarchical summaries
– Thanks, Kim Berg Hansen
• Alternatives to inequality joining
– Thanks, Jonathan Lewis
Reminder: the Basics
• To illustrate: table with PAGE column
– Group consecutive pages together
Input:
PAGE
1
2
3
5
Output:
FIRSTPAGE LASTPAGE CNT
1         3        3
5         5        1
Pattern and Matching Rows
• PATTERN
– Uninterrupted series of input rows
– Described as list of conditions (≅ “regular expressions”)
PATTERN (A B*)
"A" : 1 row, "B*" : 0 or more rows, as many as possible
• DEFINE (at least one) row condition
[A undefined = TRUE]
B AS page = PREV(page)+1
• Each series that matches the pattern is a “match”
– "A" and "B" identify the rows that meet their conditions
– There can be unmatched rows between series
Input, Processing, Output
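1. Define input (FROM)
2. Order input (ORDER BY)
3. Process pattern (PATTERN)
4. using defined conditions (DEFINE)
5. Output: rows per match (ONE ROW PER MATCH)
6. Output: columns per row (MEASURES)
7. Go where after match? (AFTER MATCH SKIP)
SELECT *
FROM t                              -- 1. input
MATCH_RECOGNIZE (
  ORDER BY page                     -- 2. order
  MEASURES                          -- 6. columns per row
    A.page as firstpage,
    LAST(page) as lastpage,
    COUNT(*) cnt
  ONE ROW PER MATCH                 -- 5. rows per match
  AFTER MATCH SKIP PAST LAST ROW    -- 7. where to go after a match
  PATTERN (A B*)                    -- 3. pattern
  DEFINE B AS page = PREV(page)+1   -- 4. conditions
);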
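To make the processing order concrete, here is a minimal Python sketch of what PATTERN (A B*) with B AS page = PREV(page)+1 does to the PAGE input. It is an emulation for illustration only, not how the database implements MATCH_RECOGNIZE; the function name group_consecutive is hypothetical.

```python
def group_consecutive(pages):
    """Emulate PATTERN (A B*) with DEFINE B AS page = PREV(page)+1."""
    pages = sorted(pages)                  # ORDER BY page
    matches = []
    i = 0
    while i < len(pages):
        first = last = pages[i]            # A is undefined, so it is always true
        cnt = 1
        # B*: take as many following rows as possible
        while i + cnt < len(pages) and pages[i + cnt] == last + 1:
            last = pages[i + cnt]
            cnt += 1
        matches.append((first, last, cnt)) # ONE ROW PER MATCH with the three measures
        i += cnt                           # AFTER MATCH SKIP PAST LAST ROW
    return matches


print(group_consecutive([1, 2, 3, 5]))     # prints [(1, 3, 3), (5, 5, 1)]
```

The two tuples correspond to the FIRSTPAGE, LASTPAGE, CNT rows of the expected output.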
Which row do we mean?
        DEFINE               ALL ROWS PER MATCH               ONE ROW PER MATCH
pg id   first current last   first current last final last   first current last final last
1  A    1     1       1      1     1       1    3
2  B    1     2       2      1     2       2    3
3  B    1     3       3      1     3       3    3            1     3       3    3
5  B?   1     5       5
Column name by itself = « current » row
• DEFINE: row being evaluated ; ALL ROWS: each row ; ONE ROW: last row
Exercise: what output from this input?
CUST_ID TX_DATE DESCR
C001 2016-01-01 Inquiry
C001 2016-01-01 Inquiry
C001 2016-01-10 Sales
C001 2016-01-21 Repeat Inquiry
C001 2016-02-10 Repeat Inquiry
C001 2016-05-01 Sales
C001 2016-05-06 Sales
C001 2016-06-10 Inquiry 1
C001 2016-09-01 Inquiry 2
C002 2016-02-01 Inquiry 1
C002 2016-02-25 Inquiry 2
C003 2016-02-01 Inquiry 2
C003 2016-02-10 Sales
C003 2016-02-10 Sales
C003 2016-03-10 Inquiry 2
C004 2016-04-15 Sales
select * from t match_recognize(
all rows per match
pattern (a*)
define a as 1=1
);
Add sequence number, starting over after 40 days
select * from t match_recognize(
partition by cust_id
order by tx_date, descr
all rows per match
pattern (a*)
define
a as tx_date <= first(tx_date) + 40
);
select * from t match_recognize(
partition by cust_id
order by tx_date, descr
measures count(*) as seq
all rows per match
pattern (a*)
define
a as tx_date <= first(tx_date) + 40
);
CUST_ID TX_DATE DESCR SEQ
C001 2016-01-01 Inquiry 1
C001 2016-01-01 Inquiry 2
C001 2016-01-10 Sales 3
C001 2016-01-21 Repeat Inquiry 4
C001 2016-02-10 Repeat Inquiry 5
C001 2016-05-01 Sales 1
C001 2016-05-06 Sales 2
C001 2016-06-10 Inquiry 1 3
C001 2016-09-01 Inquiry 2 1
C002 2016-02-01 Inquiry 1 1
C002 2016-02-25 Inquiry 2 2
C003 2016-02-01 Inquiry 2 1
C003 2016-02-10 Sales 2
C003 2016-02-10 Sales 3
C003 2016-03-10 Inquiry 2 4
C004 2016-04-15 Sales 1
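As a cross-check of the SEQ column, the following Python sketch emulates the query: partition by customer, order by date and description, start a new match whenever a row is more than 40 days after the first row of the current match, and number the rows with a running count. It is only an illustration of the logic; the name seq_40_days is hypothetical and just the C001 and C002 rows are copied from the slide.

```python
from datetime import date, timedelta
from itertools import groupby

# (cust_id, tx_date, descr) rows for customers C001 and C002, copied from the slide
ROWS = [
    ("C001", date(2016, 1, 1), "Inquiry"),
    ("C001", date(2016, 1, 1), "Inquiry"),
    ("C001", date(2016, 1, 10), "Sales"),
    ("C001", date(2016, 1, 21), "Repeat Inquiry"),
    ("C001", date(2016, 2, 10), "Repeat Inquiry"),
    ("C001", date(2016, 5, 1), "Sales"),
    ("C001", date(2016, 5, 6), "Sales"),
    ("C001", date(2016, 6, 10), "Inquiry 1"),
    ("C001", date(2016, 9, 1), "Inquiry 2"),
    ("C002", date(2016, 2, 1), "Inquiry 1"),
    ("C002", date(2016, 2, 25), "Inquiry 2"),
]


def seq_40_days(rows):
    """Emulate PATTERN (a*) with a AS tx_date <= FIRST(tx_date) + 40."""
    out = []
    rows = sorted(rows)                            # cust_id, then tx_date, then descr
    for _, partition in groupby(rows, key=lambda r: r[0]):   # PARTITION BY cust_id
        first_date, seq = None, 0
        for cust_id, tx_date, descr in partition:
            if first_date is None or tx_date > first_date + timedelta(days=40):
                first_date, seq = tx_date, 0       # condition fails: a new match starts here
            seq += 1                               # running COUNT(*)
            out.append((cust_id, tx_date, descr, seq))
    return out


for row in seq_40_days(ROWS):
    print(row)
```

Note that 2016-02-10 is exactly 40 days after 2016-01-01, so it still belongs to the first match because the comparison is <=.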
Sequence starts from the first Sale; Inquiries outside the 40 days get SEQ = 0
CUST_ID TX_DATE DESCR SEQ
C001 2016-01-01 Inquiry 0
C001 2016-01-01 Inquiry 0
C001 2016-01-10 Sales 1
C001 2016-01-21 Repeat Inquiry 2
C001 2016-02-10 Repeat Inquiry 3
C001 2016-05-01 Sales 1
C001 2016-05-06 Sales 2
C001 2016-06-10 Inquiry 1 3
C001 2016-09-01 Inquiry 2 0
C002 2016-02-01 Inquiry 1 0
C002 2016-02-25 Inquiry 2 0
C003 2016-02-01 Inquiry 2 0
C003 2016-02-10 Sales 1
C003 2016-02-10 Sales 2
C003 2016-03-10 Inquiry 2 3
C004 2016-04-15 Sales 1
select * from t match_recognize(
partition by cust_id
order by tx_date, descr
measures count(*) - count(inq.*) as seq
all rows per match
pattern (inq* sale1{0,1} more_tx*)
define inq as descr != 'Sales',
sale1 as descr = 'Sales',
more_tx as tx_date <= sale1.tx_date + 40
);
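The three pattern variables can be read as three consecutive loops. The Python sketch below emulates the query for one customer at a time, under the assumption that the rows are already ordered by tx_date, descr. It illustrates the matching logic only; sale_sequences is a hypothetical name and the C001 rows are copied from the slide.

```python
from datetime import date, timedelta

# (tx_date, descr) rows for customer C001, already in ORDER BY tx_date, descr order
C001 = [
    (date(2016, 1, 1), "Inquiry"),
    (date(2016, 1, 1), "Inquiry"),
    (date(2016, 1, 10), "Sales"),
    (date(2016, 1, 21), "Repeat Inquiry"),
    (date(2016, 2, 10), "Repeat Inquiry"),
    (date(2016, 5, 1), "Sales"),
    (date(2016, 5, 6), "Sales"),
    (date(2016, 6, 10), "Inquiry 1"),
    (date(2016, 9, 1), "Inquiry 2"),
]


def sale_sequences(rows):
    """Emulate PATTERN (inq* sale1{0,1} more_tx*) for one partition."""
    out, i, n = [], 0, len(rows)
    while i < n:
        # inq*: leading rows that are not sales; COUNT(*) - COUNT(inq.*) stays 0
        while i < n and rows[i][1] != "Sales":
            out.append(rows[i] + (0,))
            i += 1
        # sale1{0,1}: at most one sale row, which gets sequence 1
        if i < n and rows[i][1] == "Sales":
            sale_date = rows[i][0]
            seq = 1
            out.append(rows[i] + (seq,))
            i += 1
            # more_tx*: any kind of row within 40 days of that sale
            while i < n and rows[i][0] <= sale_date + timedelta(days=40):
                seq += 1
                out.append(rows[i] + (seq,))
                i += 1
    return out


print([seq for _, _, seq in sale_sequences(C001)])   # prints [0, 0, 1, 2, 3, 1, 2, 3, 0]
```

Every row is either a sale or not a sale, so each pass of the outer loop consumes at least one row, which mirrors the fact that the pattern never leaves a row unmatched here.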
Negative sequence for Inquiries within 10 days prior to Sale
select * from t match_recognize(
partition by cust_id
order by tx_date, descr
measures case when classifier() = 'INQ'
and tx_date >=
final first(sale1.tx_date) - 10
then
count(inq.*) - final count(inq.*) - 1
else
count(*) - count(inq.*)
end as seq
all rows per match
pattern (inq* sale1{0,1} more_tx*)
define inq as descr != 'Sales',
sale1 as descr = 'Sales',
more_tx as tx_date <= sale1.tx_date + 40
);
CUST_ID TX_DATE DESCR SEQ
C001 2016-01-01 Inquiry -2
C001 2016-01-01 Inquiry -1
C001 2016-01-10 Sales 1
C001 2016-01-21 Repeat Inquiry 2
C001 2016-02-10 Repeat Inquiry 3
C001 2016-05-01 Sales 1
C001 2016-05-06 Sales 2
C001 2016-06-10 Inquiry 1 3
C001 2016-09-01 Inquiry 2 0
C002 2016-02-01 Inquiry 1 0
C002 2016-02-25 Inquiry 2 0
C003 2016-02-01 Inquiry 2 -1
C003 2016-02-10 Sales 1
C003 2016-02-10 Sales 2
C003 2016-03-10 Inquiry 2 3
C004 2016-04-15 Sales 1
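The FINAL keyword is what makes the negative numbers possible: an inquiry row needs to know the sale date and the total number of inquiries, which are only known once the whole match has been found. The Python sketch below emulates that with two steps per match: first find the match, then compute the measure for each of its rows. It is an illustration under the same assumptions as before (one partition, rows already ordered); signed_sequences is a hypothetical name.

```python
from datetime import date, timedelta

# (tx_date, descr) rows for customer C001, already in ORDER BY tx_date, descr order
C001 = [
    (date(2016, 1, 1), "Inquiry"),
    (date(2016, 1, 1), "Inquiry"),
    (date(2016, 1, 10), "Sales"),
    (date(2016, 1, 21), "Repeat Inquiry"),
    (date(2016, 2, 10), "Repeat Inquiry"),
    (date(2016, 5, 1), "Sales"),
    (date(2016, 5, 6), "Sales"),
    (date(2016, 6, 10), "Inquiry 1"),
    (date(2016, 9, 1), "Inquiry 2"),
]


def signed_sequences(rows):
    """Emulate the CASE measure that uses FINAL FIRST and FINAL COUNT."""
    out, i, n = [], 0, len(rows)
    while i < n:
        start = i
        while i < n and rows[i][1] != "Sales":        # inq*
            i += 1
        inq_total = i - start                         # FINAL COUNT(inq.*)
        sale_date = None                              # FINAL FIRST(sale1.tx_date), None if no sale
        if i < n and rows[i][1] == "Sales":           # sale1{0,1}
            sale_date = rows[i][0]
            i += 1
            while i < n and rows[i][0] <= sale_date + timedelta(days=40):   # more_tx*
                i += 1
        # The match is rows[start:i]; now compute the measure row by row.
        for k in range(start, i):
            pos = k - start + 1                       # running COUNT(*)
            if pos <= inq_total:                      # CLASSIFIER() = 'INQ'
                near = (sale_date is not None
                        and rows[k][0] >= sale_date - timedelta(days=10))
                seq = pos - inq_total - 1 if near else 0
            else:
                seq = pos - inq_total                 # COUNT(*) - COUNT(inq.*)
            out.append(rows[k] + (seq,))
    return out


print([seq for _, _, seq in signed_sequences(C001)])  # prints [-2, -1, 1, 2, 3, 1, 2, 3, 0]
```

When a match has no sale, the sale date is missing, the 10-day test cannot succeed, and the inquiry rows keep the value 0, as for customer C002.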
Hierarchical Summary: get salaries of mgr + subordinates
select level lvl, ename, sal
from scott.emp
start with mgr is null
connect by mgr = prior empno;
LVL ENAME SAL
1 KING 5000
2 JONES 2975
3 SCOTT 3000
4 ADAMS 1100
3 FORD 3000
4 SMITH 800
2 BLAKE 2850
3 ALLEN 1600
3 WARD 1250
3 MARTIN 1250
3 TURNER 1500
3 JAMES 950
2 CLARK 2450
3 MILLER 1300
select * from (
select level lvl, ename, sal
from scott.emp
start with mgr is null
connect by mgr = prior empno
)
match_recognize(
measures a.lvl lvl, a.ename ename,
a.sal sal, sum(sal) as sum_sal
pattern(a b*)
define b as lvl > a.lvl
);
LVL ENAME SAL SUM_SAL
1 KING 5000 29025
select * from (
select level lvl, ename, sal
from scott.emp
start with mgr is null
connect by mgr = prior empno
)
match_recognize(
measures a.lvl lvl, a.ename ename,
a.sal sal, sum(sal) as sum_sal
after match skip past last row -- the default: no rows are left after KING's match
pattern(a b*)
define b as lvl > a.lvl
);
select * from (
select level lvl, ename, sal
from scott.emp
start with mgr is null
connect by mgr = prior empno
)
match_recognize(
measures a.lvl lvl, a.ename ename,
a.sal sal, sum(sal) as sum_sal
after match skip to next row -- every row starts its own match
pattern(a b*)
define b as lvl > a.lvl
);
LVL ENAME SAL SUM_SAL
1 KING 5000 29025
2 JONES 2975 10875
3 SCOTT 3000 4100
4 ADAMS 1100 1100
3 FORD 3000 3800
4 SMITH 800 800
2 BLAKE 2850 9400
3 ALLEN 1600 1600
3 WARD 1250 1250
3 MARTIN 1250 1250
3 TURNER 1500 1500
3 JAMES 950 950
2 CLARK 2450 3750
3 MILLER 1300 1300
http://www.kibeha.dk/2015/07/row-pattern-matching-nested-within.html
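The reason AFTER MATCH SKIP TO NEXT ROW gives one total per employee can be sketched in Python: in the depth-first order produced by CONNECT BY, the subordinates of a row are exactly the following rows with a higher level, up to the first row whose level is not higher. This is an emulation for illustration; subtree_sums is a hypothetical name and the rows are copied from the slide.

```python
# (lvl, ename, sal) in the depth-first order returned by the CONNECT BY query
EMP = [
    (1, "KING", 5000), (2, "JONES", 2975), (3, "SCOTT", 3000), (4, "ADAMS", 1100),
    (3, "FORD", 3000), (4, "SMITH", 800), (2, "BLAKE", 2850), (3, "ALLEN", 1600),
    (3, "WARD", 1250), (3, "MARTIN", 1250), (3, "TURNER", 1500), (3, "JAMES", 950),
    (2, "CLARK", 2450), (3, "MILLER", 1300),
]


def subtree_sums(rows):
    """Emulate PATTERN (a b*), b AS lvl > a.lvl, AFTER MATCH SKIP TO NEXT ROW."""
    out = []
    for i, (lvl, ename, sal) in enumerate(rows):    # every row gets to be "a" once
        total = sal
        for sub_lvl, _, sub_sal in rows[i + 1:]:    # b*: following rows, deeper level
            if sub_lvl <= lvl:
                break                               # first non-subordinate ends the match
            total += sub_sal
        out.append((lvl, ename, sal, total))        # one row per match
    return out


for row in subtree_sums(EMP):
    print(row)
```

With the default AFTER MATCH SKIP PAST LAST ROW, only the first tuple (KING, 29025) would be produced, because KING's match consumes every row.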
Inequality joins
create table t1(id, jd, v) cache
as
select level, level + .1, level
from dual
connect by level <= 20000;
create table t2 cache
as
select * from t1;
• Equality
• Band Join: compare T1.ID to T2.ID + a constant
• Range Join: T1.ID within a T2 range (ID to JD)
• Overlap Join: T1 range (ID to JD) overlaps T2 range
Equality
select t1.id id1, t2.id id2, t1.v v1, t2.v v2
from t1, t2
where t1.id = t2.id
Elapsed: .01 seconds
Band Join
select t1.id id1, t2.id id2, t1.v v1, t2.v v2
from t1, t2
where t1.id between t2.id and t2.id + .1
Elapsed: .04 seconds
• New implementation in 12.2
• Before 12.2, about the same time as range join =>
Range Join
select t1.id id1, t2.id id2, t1.v v1, t2.v v2
from t1, t2
where t1.id between t2.id and t2.jd
Elapsed: 30 seconds
(Equality: .01
Band: .04)
Range Join Execution Plan
-------------------------------------------------------
| Id | Operation | Name | Starts | A-Rows |
-------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 20000 |
| 1 | MERGE JOIN | | 1 | 20000 |
| 2 | SORT JOIN | | 1 | 20000 |
| 3 | TABLE ACCESS FULL | T1 | 1 | 20000 |
|* 4 | FILTER | | 20000 | 20000 |
|* 5 | SORT JOIN | | 20000 | 200M|
| 6 | TABLE ACCESS FULL| T2 | 1 | 20000 |
-------------------------------------------------------
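The 200M at step 5 of the plan can be explained with simple arithmetic: the merge join can only use one side of the range (T1.ID >= T2.ID) to find candidate rows, and the other side is applied afterwards as a filter. With IDs 1 to n in both tables, the first condition alone keeps n(n+1)/2 pairs. The Python sketch below checks that count by brute force on a small n; the function names are hypothetical.

```python
def rows_after_access(n):
    """Pairs (t1, t2) with t1.id >= t2.id when both tables hold ids 1..n."""
    return n * (n + 1) // 2


def count_by_brute_force(n):
    """Count the same pairs the slow way, plus the pairs that survive the filter."""
    ids = range(1, n + 1)
    access = sum(1 for a in ids for b in ids if a >= b)
    # jd = id + .1, as in the CREATE TABLE statement
    filtered = sum(1 for a in ids for b in ids if b <= a <= b + 0.1)
    return access, filtered


print(count_by_brute_force(100))   # prints (5050, 100)
print(rows_after_access(20_000))   # prints 200010000
```

So about 200 million candidate pairs are produced and then filtered down to 20000 rows, which is consistent with the A-Rows column of the plan.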
All possible combinations of T1 and T2 rows (3 rows each), and what the two predicates do to them:
T1.ID T2.ID T2.JD  access (T1.ID >= T2.ID)  filter (T1.ID <= T2.JD)
1     1     1.1    kept                     kept
1     2     2.1    rejected
1     3     3.1    rejected
2     1     1.1    kept                     rejected
2     2     2.1    kept                     kept
2     3     3.1    rejected
3     1     1.1    kept                     rejected
3     2     2.1    kept                     rejected
3     3     3.1    kept                     kept
Predicate information from the plan:
4 - filter("T1"."ID"<="T2"."JD")
5 - access("T1"."ID">="T2"."ID")
Sort all and Match?
Sort the rows of both tables together by ID, T2 first (order shown from left to right).
The T2 range width is now 1.1, so a T2 row can match 2 T1 rows.
Table  T2     T1    T2     T1    T2     T1    T2
ID     1      1     2      2     3      3     4
JD     2.1          3.1          4.1          5.1
Step   Start  Join  match  Join  X
• Start: the T2 row with ID 1 and JD 2.1. Look for following T1 rows with ID <= 2.1. Due to the sort, their IDs must be >= T2.ID.
• T1 row, ID 1: T1.ID < 2.1, so match and output (Join).
• T2 row, ID 2: T2.ID < 2.1, so match, but do not output.
• T1 row, ID 2: T1.ID < 2.1, so match and output (Join).
• T2 row, ID 3: not < 2.1, so the match has ended (X).
• Skip to the next row and look for the next match.
(Almost) All Rows per Match
• PATTERN ( A {- B A -} B)
– The parts of the pattern enclosed between {- and -} are excluded from the output.
– Here only two rows per match will be returned
– More granular than using a WHERE clause
• Alternation: | means OR
– "Alternatives are preferred in the order they are specified."
PATTERN ( A | B ) =
If A condition is true then A, else if B condition is true then B
Range Match
select ID ID1, ID2, JD2
from (
select t2.*, 1 is_t2 from t2
union all
select t1.*, null from t1
)
match_recognize(
order by id, is_t2
measures t2.id id2, t2.jd jd2
all rows per match
after match skip to next row
pattern({-T2-} ( T1 | {-T2-} )* T1)
define T2 as is_t2 = 1 and id < first(t2.jd),
T1 as is_t2 is null and id < first(t2.jd)
);
Elapsed: .12 secs
Equality: .01
Band: .04
Range join: 30.00
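The "sort all and match" idea behind the range match query can be sketched in Python: put the rows of both tables in one list, sort by ID with T2 rows first, and for each T2 row scan forward only while the ID stays below that row's JD. This is an illustration of the technique, not the database's implementation; range_match is a hypothetical name, and the sample values are the ones from the walkthrough (T2 ranges 1.1 wide). It follows the strict "<" of the DEFINE clause.

```python
def range_match(t1_ids, t2_ranges):
    """Emulate pattern({-T2-} ( T1 | {-T2-} )* T1) with AFTER MATCH SKIP TO NEXT ROW."""
    # kind 0 = T2 row, kind 1 = T1 row, so T2 sorts first when IDs are equal
    rows = [(id2, 0, jd2) for id2, jd2 in t2_ranges] + [(id1, 1, None) for id1 in t1_ids]
    rows.sort(key=lambda r: (r[0], r[1]))           # order by id, T2 first
    out = []
    for i, (id2, kind, jd2) in enumerate(rows):
        if kind != 0:
            continue                                # a match has to start on a T2 row
        for next_id, next_kind, _ in rows[i + 1:]:
            if next_id >= jd2:
                break                               # id < first(t2.jd) fails: match ends
            if next_kind == 1:
                out.append((next_id, id2, jd2))     # T1 rows are output, T2 rows excluded
    return out


T1 = [1, 2, 3]
T2 = [(1, 2.1), (2, 3.1), (3, 4.1), (4, 5.1)]
print(range_match(T1, T2))
# prints [(1, 1, 2.1), (2, 1, 2.1), (2, 2, 3.1), (3, 2, 3.1), (3, 3, 4.1)]
```

Each T2 row only looks at the few rows inside its own range instead of at every T1 row with a larger ID, which is where the difference between .12 and 30 seconds comes from.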
Overlap Join
select t1.id id1, t2.id id2, t1.v v1, t2.v v2
from t1, t2
where (t2.id <= t1.id and t1.id < t2.jd)
or (t1.id <= t2.id and t2.id < t1.jd)
Elapsed: 50 seconds
Overlap Join Execution Plan
| 0 | SELECT STATEMENT | | 1 | | 1 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |
| 2 | VIEW | VW...| 1 | 175M| 20000 |
| 3 | UNION-ALL | | 1 | | 20000 |
| 4 | MERGE JOIN | | 1 | 100M| 20000 |
| 5 | SORT JOIN | | 1 | 20000 | 20000 |
| 6 | TABLE ACCESS FULL | T2 | 1 | 20000 | 20000 |
|* 7 | FILTER | | 20000 | | 20000 |
|* 8 | SORT JOIN | | 20000 | 20000 | 200M|
| 9 | TABLE ACCESS FULL| T1 | 1 | 20000 | 20000 |
| 10 | MERGE JOIN | | 1 | 75M| 0 |
| 11 | SORT JOIN | | 1 | 20000 | 20000 |
| 12 | TABLE ACCESS FULL | T2 | 1 | 20000 | 20000 |
|* 13 | FILTER | | 20000 | | 0 |
|* 14 | SORT JOIN | | 20000 | 20000 | 200M|
| 15 | TABLE ACCESS FULL| T1 | 1 | 20000 | 20000 |
Overlap Match
select * from (
select t1.*, 1 table_num from t1
union all
select t2.*, 2 from t2
)
match_recognize(
order by id, jd
all rows per match
after match skip to next row
pattern({-ta-} ( tb | {-x-} )* tb)
define tb as table_num != ta.table_num
and id < first(jd),
x as table_num = ta.table_num
and id < first(jd)
);
Elapsed: .12 secs
Equality: .01
Band: .04
Range match: .12
No need to wait for 18c
Child's play
Solving Problems with pattern matching
• Clear knowledge of input & requirement
– Beware of assumptions
• Identify typical problems and solutions
– Consecutive sequences
– Ad hoc grouping
– Bin fitting
– Ranges
• Visualize the data processing flow
– Output from other rows is not available, input is.
https://stewashton.wordpress.com/
Twitter: @stewashton
Anchors
– ^ matches the position before the first row in the partition.
– $ matches the position after the last row in the partition.
PATTERN(^ A $) = partition must have 1 row
JOIN alternative: CDC compare
T1
PK VAL
1  Same value
2  Delete this
3  Old value
T2
PK VAL
1  Same value
3  New value
4  Insert this
select pk, op, val, oldrid from (
select pk, val, rowid rid from t1
union all
select pk, val, null from t2
)
match_recognize(
partition by pk order by rid
measures classifier() op,
first(rid) oldrid
all rows per match
pattern(^ D $ | ^ I $ | (^ O U $) )
define D as rid is not null,
U as decode(O.val, val, 0, 1) = 1
);
PK OP VAL         OLDRID
2  D  Delete this AAAkdlAAH…MAAB
3  O  Old value   AAAkdlAAH…MAAC
3  U  New value   AAAkdlAAH…MAAC
4  I  Insert this
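The three alternatives of the anchored pattern correspond to three cases per primary key, which the Python sketch below spells out: a key found only in T1 is a delete (D), a key found only in T2 is an insert (I), and a key found in both with different values gives an old row and an updated row (O, U). Keys with identical values match no alternative and produce no output. This is an emulation for illustration; cdc_compare is a hypothetical name and the OLDRID column is left out because rowids have no Python equivalent.

```python
def cdc_compare(t1, t2):
    """Emulate pattern(^ D $ | ^ I $ | (^ O U $)) partitioned by pk."""
    old, new = dict(t1), dict(t2)
    out = []
    for pk in sorted(old.keys() | new.keys()):      # one partition per pk
        if pk not in new:
            out.append((pk, "D", old[pk]))          # single row with a rowid: delete
        elif pk not in old:
            out.append((pk, "I", new[pk]))          # single row without a rowid: insert
        elif old[pk] != new[pk]:
            out.append((pk, "O", old[pk]))          # old row first (it has the rowid)
            out.append((pk, "U", new[pk]))          # then the new value
        # same value in both tables: no alternative matches, nothing is output
    return out


T1 = [(1, "Same value"), (2, "Delete this"), (3, "Old value")]
T2 = [(1, "Same value"), (3, "New value"), (4, "Insert this")]
for row in cdc_compare(T1, T2):
    print(row)
```

The four printed tuples correspond to the PK, OP and VAL columns of the result above.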

Weitere ähnliche Inhalte

Was ist angesagt?

11. Linear Models
11. Linear Models11. Linear Models
11. Linear ModelsFAO
 
mat lab introduction and basics to learn
mat lab introduction and basics to learnmat lab introduction and basics to learn
mat lab introduction and basics to learnpavan373
 
Comparison GUM versus GUM+1
Comparison GUM  versus GUM+1Comparison GUM  versus GUM+1
Comparison GUM versus GUM+1Maurice Maeck
 
Proyecto diseño tablestaca muro milan
Proyecto diseño tablestaca  muro milanProyecto diseño tablestaca  muro milan
Proyecto diseño tablestaca muro milanTATIANAOLIVA1
 
Number series for aptitude preparation
Number series  for  aptitude preparationNumber series  for  aptitude preparation
Number series for aptitude preparationavdheshtripathi2
 
19 prim,kruskal alg. in data structure
19 prim,kruskal alg. in data structure19 prim,kruskal alg. in data structure
19 prim,kruskal alg. in data structureEMEY GUJJAR
 
2.4 mst prim &kruskal demo
2.4 mst  prim &kruskal demo2.4 mst  prim &kruskal demo
2.4 mst prim &kruskal demoKrish_ver2
 
10. Getting Spatial
10. Getting Spatial10. Getting Spatial
10. Getting SpatialFAO
 
Histograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLHistograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLSergey Petrunya
 
Mysqlfunctions
MysqlfunctionsMysqlfunctions
MysqlfunctionsN13M
 
Applied numerical methods lec6
Applied numerical methods lec6Applied numerical methods lec6
Applied numerical methods lec6Yasser Ahmed
 
The Ring programming language version 1.10 book - Part 33 of 212
The Ring programming language version 1.10 book - Part 33 of 212The Ring programming language version 1.10 book - Part 33 of 212
The Ring programming language version 1.10 book - Part 33 of 212Mahmoud Samir Fayed
 
Computer Graphic - Lines, Circles and Ellipse
Computer Graphic - Lines, Circles and EllipseComputer Graphic - Lines, Circles and Ellipse
Computer Graphic - Lines, Circles and Ellipse2013901097
 
The Ring programming language version 1.5.1 book - Part 23 of 180
The Ring programming language version 1.5.1 book - Part 23 of 180The Ring programming language version 1.5.1 book - Part 23 of 180
The Ring programming language version 1.5.1 book - Part 23 of 180Mahmoud Samir Fayed
 

Was ist angesagt? (20)

11. Linear Models
11. Linear Models11. Linear Models
11. Linear Models
 
mat lab introduction and basics to learn
mat lab introduction and basics to learnmat lab introduction and basics to learn
mat lab introduction and basics to learn
 
Comparison GUM versus GUM+1
Comparison GUM  versus GUM+1Comparison GUM  versus GUM+1
Comparison GUM versus GUM+1
 
Proyecto diseño tablestaca muro milan
Proyecto diseño tablestaca  muro milanProyecto diseño tablestaca  muro milan
Proyecto diseño tablestaca muro milan
 
Number series for aptitude preparation
Number series  for  aptitude preparationNumber series  for  aptitude preparation
Number series for aptitude preparation
 
19 prim,kruskal alg. in data structure
19 prim,kruskal alg. in data structure19 prim,kruskal alg. in data structure
19 prim,kruskal alg. in data structure
 
2.4 mst prim &kruskal demo
2.4 mst  prim &kruskal demo2.4 mst  prim &kruskal demo
2.4 mst prim &kruskal demo
 
10. Getting Spatial
10. Getting Spatial10. Getting Spatial
10. Getting Spatial
 
1st and 2nd Semester M Tech: Computer Science and Engineering (Dec-2015; Jan-...
1st and 2nd Semester M Tech: Computer Science and Engineering (Dec-2015; Jan-...1st and 2nd Semester M Tech: Computer Science and Engineering (Dec-2015; Jan-...
1st and 2nd Semester M Tech: Computer Science and Engineering (Dec-2015; Jan-...
 
2020 preTEST3A
2020 preTEST3A2020 preTEST3A
2020 preTEST3A
 
19 primkruskal
19 primkruskal19 primkruskal
19 primkruskal
 
Histograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLHistograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQL
 
Matlab plotting
Matlab plottingMatlab plotting
Matlab plotting
 
Mysqlfunctions
MysqlfunctionsMysqlfunctions
Mysqlfunctions
 
Applied numerical methods lec6
Applied numerical methods lec6Applied numerical methods lec6
Applied numerical methods lec6
 
Regression (II)
Regression (II)Regression (II)
Regression (II)
 
The Ring programming language version 1.10 book - Part 33 of 212
The Ring programming language version 1.10 book - Part 33 of 212The Ring programming language version 1.10 book - Part 33 of 212
The Ring programming language version 1.10 book - Part 33 of 212
 
Test (S) on R
Test (S) on RTest (S) on R
Test (S) on R
 
Computer Graphic - Lines, Circles and Ellipse
Computer Graphic - Lines, Circles and EllipseComputer Graphic - Lines, Circles and Ellipse
Computer Graphic - Lines, Circles and Ellipse
 
The Ring programming language version 1.5.1 book - Part 23 of 180
The Ring programming language version 1.5.1 book - Part 23 of 180The Ring programming language version 1.5.1 book - Part 23 of 180
The Ring programming language version 1.5.1 book - Part 23 of 180
 

Ähnlich wie Advanced row pattern matching

counters and registers
counters and registerscounters and registers
counters and registersMeenaAnusha1
 
counters_and_registers_5 lecture fifth.ppt
counters_and_registers_5 lecture fifth.pptcounters_and_registers_5 lecture fifth.ppt
counters_and_registers_5 lecture fifth.pptImranAhmadAhmad
 
EET107_Chapter 3_SLD(part2.1)-edit1.ppt
EET107_Chapter 3_SLD(part2.1)-edit1.pptEET107_Chapter 3_SLD(part2.1)-edit1.ppt
EET107_Chapter 3_SLD(part2.1)-edit1.pptBeautyKumar1
 
C PROGRAMS - SARASWATHI RAMALINGAM
C PROGRAMS - SARASWATHI RAMALINGAMC PROGRAMS - SARASWATHI RAMALINGAM
C PROGRAMS - SARASWATHI RAMALINGAMSaraswathiRamalingam
 
SQLチューニング総合診療Oracle CloudWorld出張所
SQLチューニング総合診療Oracle CloudWorld出張所SQLチューニング総合診療Oracle CloudWorld出張所
SQLチューニング総合診療Oracle CloudWorld出張所Hiroshi Sekiguchi
 

Ähnlich wie Advanced row pattern matching (8)

counters and registers
counters and registerscounters and registers
counters and registers
 
counters_and_registers_5 lecture fifth.ppt
counters_and_registers_5 lecture fifth.pptcounters_and_registers_5 lecture fifth.ppt
counters_and_registers_5 lecture fifth.ppt
 
dld 01-introduction
dld 01-introductiondld 01-introduction
dld 01-introduction
 
EET107_Chapter 3_SLD(part2.1)-edit1.ppt
EET107_Chapter 3_SLD(part2.1)-edit1.pptEET107_Chapter 3_SLD(part2.1)-edit1.ppt
EET107_Chapter 3_SLD(part2.1)-edit1.ppt
 
Extra assign
Extra assignExtra assign
Extra assign
 
Sql queries
Sql queriesSql queries
Sql queries
 
C PROGRAMS - SARASWATHI RAMALINGAM
C PROGRAMS - SARASWATHI RAMALINGAMC PROGRAMS - SARASWATHI RAMALINGAM
C PROGRAMS - SARASWATHI RAMALINGAM
 
SQLチューニング総合診療Oracle CloudWorld出張所
SQLチューニング総合診療Oracle CloudWorld出張所SQLチューニング総合診療Oracle CloudWorld出張所
SQLチューニング総合診療Oracle CloudWorld出張所
 

Kürzlich hochgeladen

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 


Advanced row pattern matching

  • 1. Meet Your Match Advanced row pattern matching (12c) Stew Ashton UKOUG Tech 17 Can you read the following line? If not, please move closer. It's much better when you can read the code ;)
  • 2. Advanced usage, not all the syntax • Reminder of the basics • Exercises • Bin fitting • Positive and negative sequencing • Hierarchical summaries – Thanks, Kim Berg Hansen • Alternatives to inequality joining – Thanks, Jonathan Lewis 2
  • 3. Reminder: the Basics • To illustrate: table with PAGE column – Group consecutive pages together 3 PAGE 1 2 3 5 FIRSTPAGE LASTPAGE CNT 1 3 3 5 5 1
  • 4. Pattern and Matching Rows • PATTERN – Uninterrupted series of input rows – Described as list of conditions (≅ “regular expressions”) PATTERN (A B*) "A" : 1 row, "B*" : 0 or more rows, as many as possible • DEFINE (at least one) row condition [A undefined = TRUE] B AS page = PREV(page)+1 • Each series that matches the pattern is a “match” – "A" and "B" identify the rows that meet their conditions – There can be unmatched rows between series 4
  • 5. Input, Processing, Output 1. Define input 2. Order input 3. Process pattern 4. using defined conditions 5. Output: rows per match 6. Output: columns per row 7. Go where after match? 5 SELECT * FROM t MATCH_RECOGNIZE ( ORDER BY page MEASURES A.page as firstpage, LAST(page) as lastpage, COUNT(*) cnt ONE ROW PER MATCH AFTER MATCH SKIP PAST LAST ROW PATTERN (A B*) DEFINE B AS page = PREV(page)+1 );
  • 6. Which row do we mean?
        pg  id  | DEFINE               | ALL ROWS PER MATCH               | ONE ROW PER MATCH
                | first  current  last | first  current  last  final last | first  current  last  final last
        1   A   | 1      1        1    | 1      1        1     3          |
        2   B   | 1      2        2    | 1      2        2     3          |
        3   B   | 1      3        3    | 1      3        3     3          | 1      3        3     3
        5   B?  | 1      5        5    |                                  |
    Column name by itself = "current" row
    • DEFINE: row being evaluated; ALL ROWS: each row; ONE ROW: last row
  • 7. Exercise: what output from this input? 7 CUST_ID TX_DATE DESCR C001 2016-01-01 Inquiry C001 2016-01-01 Inquiry C001 2016-01-10 Sales C001 2016-01-21 Repeat Inquiry C001 2016-02-10 Repeat Inquiry C001 2016-05-01 Sales C001 2016-05-06 Sales C001 2016-06-10 Inquiry 1 C001 2016-09-01 Inquiry 2 C002 2016-02-01 Inquiry 1 C002 2016-02-25 Inquiry 2 C003 2016-02-01 Inquiry 2 C003 2016-02-10 Sales C003 2016-02-10 Sales C003 2016-03-10 Inquiry 2 C004 2016-04-15 Sales select * from t match_recognize( all rows per match pattern (a*) define a as 1=1 );
  • 8. Add sequence number, starting over after 40 days 8 CUST_ID TX_DATE DESCR C001 2016-01-01 Inquiry C001 2016-01-01 Inquiry C001 2016-01-10 Sales C001 2016-01-21 Repeat Inquiry C001 2016-02-10 Repeat Inquiry C001 2016-05-01 Sales C001 2016-05-06 Sales C001 2016-06-10 Inquiry 1 C001 2016-09-01 Inquiry 2 C002 2016-02-01 Inquiry 1 C002 2016-02-25 Inquiry 2 C003 2016-02-01 Inquiry 2 C003 2016-02-10 Sales C003 2016-02-10 Sales C003 2016-03-10 Inquiry 2 C004 2016-04-15 Sales select * from t match_recognize( all rows per match pattern (a*) define a as 1=1 );
  • 10. Add sequence number, starting over after 40 days 10 CUST_ID TX_DATE DESCR C001 2016-01-01 Inquiry C001 2016-01-01 Inquiry C001 2016-01-10 Sales C001 2016-01-21 Repeat Inquiry C001 2016-02-10 Repeat Inquiry C001 2016-05-01 Sales C001 2016-05-06 Sales C001 2016-06-10 Inquiry 1 C001 2016-09-01 Inquiry 2 C002 2016-02-01 Inquiry 1 C002 2016-02-25 Inquiry 2 C003 2016-02-01 Inquiry 2 C003 2016-02-10 Sales C003 2016-02-10 Sales C003 2016-03-10 Inquiry 2 C004 2016-04-15 Sales select * from t match_recognize( partition by cust_id order by tx_date, descr all rows per match pattern (a*) define a as ); select * from t match_recognize( partition by cust_id order by tx_date, descr all rows per match pattern (a*) define a as tx_date <= first(tx_date) + 40 ); select * from t match_recognize( partition by cust_id order by tx_date, descr measures count(*) as seq all rows per match pattern (a*) define a as tx_date <= first(tx_date) + 40 );
  • 11. Add sequence number, starting over after 40 days 11 CUST_ID TX_DATE DESCR SEQ C001 2016-01-01 Inquiry 1 C001 2016-01-01 Inquiry 2 C001 2016-01-10 Sales 3 C001 2016-01-21 Repeat Inquiry 4 C001 2016-02-10 Repeat Inquiry 5 C001 2016-05-01 Sales 1 C001 2016-05-06 Sales 2 C001 2016-06-10 Inquiry 1 3 C001 2016-09-01 Inquiry 2 1 C002 2016-02-01 Inquiry 1 1 C002 2016-02-25 Inquiry 2 2 C003 2016-02-01 Inquiry 2 1 C003 2016-02-10 Sales 2 C003 2016-02-10 Sales 3 C003 2016-03-10 Inquiry 2 4 C004 2016-04-15 Sales 1 select * from t match_recognize( partition by cust_id order by tx_date, descr measures count(*) as seq all rows per match pattern (a*) define a as tx_date <= first(tx_date) + 40 );
  • 12. Sequence starts from First Sale, Inquiry outside 40 days = 0 12 CUST_ID TX_DATE DESCR SEQ C001 2016-01-01 Inquiry 1 C001 2016-01-01 Inquiry 2 C001 2016-01-10 Sales 3 C001 2016-01-21 Repeat Inquiry 4 C001 2016-02-10 Repeat Inquiry 5 C001 2016-05-01 Sales 1 C001 2016-05-06 Sales 2 C001 2016-06-10 Inquiry 1 3 C001 2016-09-01 Inquiry 2 1 C002 2016-02-01 Inquiry 1 1 C002 2016-02-25 Inquiry 2 2 C003 2016-02-01 Inquiry 2 1 C003 2016-02-10 Sales 2 C003 2016-02-10 Sales 3 C003 2016-03-10 Inquiry 2 4 C004 2016-04-15 Sales 1 select * from t match_recognize( partition by cust_id order by tx_date, descr measures count(*) as seq all rows per match pattern (a*) define a as tx_date <= first(tx_date) + 40 );
  • 13. Sequence starts from Sale, Inquiry outside 40 days = 0 13 CUST_ID TX_DATE DESCR SEQ C001 2016-01-01 Inquiry 1 C001 2016-01-01 Inquiry 2 C001 2016-01-10 Sales 3 C001 2016-01-21 Repeat Inquiry 4 C001 2016-02-10 Repeat Inquiry 5 C001 2016-05-01 Sales 1 C001 2016-05-06 Sales 2 C001 2016-06-10 Inquiry 1 3 C001 2016-09-01 Inquiry 2 1 C002 2016-02-01 Inquiry 1 1 C002 2016-02-25 Inquiry 2 2 C003 2016-02-01 Inquiry 2 1 C003 2016-02-10 Sales 2 C003 2016-02-10 Sales 3 C003 2016-03-10 Inquiry 2 4 C004 2016-04-15 Sales 1 select * from t match_recognize( partition by cust_id order by tx_date, descr measures count(*) as seq all rows per match pattern (a *) define a as tx_date <= first(tx_date) + 40 );
  • 14. Sequence starts from Sale, Inquiry outside 40 days = 0 CUST_ID TX_DATE DESCR SEQ C001 2016-01-01 Inquiry 1 C001 2016-01-01 Inquiry 2 C001 2016-01-10 Sales 3 C001 2016-01-21 Repeat Inquiry 4 C001 2016-02-10 Repeat Inquiry 5 C001 2016-05-01 Sales 1 C001 2016-05-06 Sales 2 C001 2016-06-10 Inquiry 1 3 C001 2016-09-01 Inquiry 2 1 C002 2016-02-01 Inquiry 1 1 C002 2016-02-25 Inquiry 2 2 C003 2016-02-01 Inquiry 2 1 C003 2016-02-10 Sales 2 C003 2016-02-10 Sales 3 C003 2016-03-10 Inquiry 2 4 C004 2016-04-15 Sales 1 select * from t match_recognize( partition by cust_id order by tx_date, descr measures count(*) - count(inq.*) as seq all rows per match pattern (inq* sale1{0,1} more_tx*) define inq as descr != 'Sales', sale1 as descr = 'Sales', more_tx as tx_date <= sale1.tx_date + 40 );
  • 15. Sequence starts from Sale, Inquiry outside 40 days = 0 15 CUST_ID TX_DATE DESCR SEQ C001 2016-01-01 Inquiry 0 C001 2016-01-01 Inquiry 0 C001 2016-01-10 Sales 1 C001 2016-01-21 Repeat Inquiry 2 C001 2016-02-10 Repeat Inquiry 3 C001 2016-05-01 Sales 1 C001 2016-05-06 Sales 2 C001 2016-06-10 Inquiry 1 3 C001 2016-09-01 Inquiry 2 0 C002 2016-02-01 Inquiry 1 0 C002 2016-02-25 Inquiry 2 0 C003 2016-02-01 Inquiry 2 0 C003 2016-02-10 Sales 1 C003 2016-02-10 Sales 2 C003 2016-03-10 Inquiry 2 3 C004 2016-04-15 Sales 1 select * from t match_recognize( partition by cust_id order by tx_date, descr measures count(*) - count(inq.*) as seq all rows per match pattern (inq* sale1{0,1} more_tx*) define inq as descr != 'Sales', sale1 as descr = 'Sales', more_tx as tx_date <= sale1.tx_date + 40 );
  • 16. Negative sequence for Inquiries within 10 days prior to Sale 16 CUST_ID TX_DATE DESCR SEQ C001 2016-01-01 Inquiry 0 C001 2016-01-01 Inquiry 0 C001 2016-01-10 Sales 1 C001 2016-01-21 Repeat Inquiry 2 C001 2016-02-10 Repeat Inquiry 3 C001 2016-05-01 Sales 1 C001 2016-05-06 Sales 2 C001 2016-06-10 Inquiry 1 3 C001 2016-09-01 Inquiry 2 0 C002 2016-02-01 Inquiry 1 0 C002 2016-02-25 Inquiry 2 0 C003 2016-02-01 Inquiry 2 0 C003 2016-02-10 Sales 1 C003 2016-02-10 Sales 2 C003 2016-03-10 Inquiry 2 3 C004 2016-04-15 Sales 1 select * from t match_recognize( partition by cust_id order by tx_date, descr measures count(*) - count(inq.*) as seq all rows per match pattern (inq* sale1{0,1} more_tx*) define inq as descr != 'Sales', sale1 as descr = 'Sales', more_tx as tx_date <= sale1.tx_date + 40 );
  • 18. Negative sequence for Inquiries within 10 days prior to Sale 18 CUST_ID TX_DATE DESCR SEQ C001 2016-01-01 Inquiry 0 C001 2016-01-01 Inquiry 0 C001 2016-01-10 Sales 1 C001 2016-01-21 Repeat Inquiry 2 C001 2016-02-10 Repeat Inquiry 3 C001 2016-05-01 Sales 1 C001 2016-05-06 Sales 2 C001 2016-06-10 Inquiry 1 3 C001 2016-09-01 Inquiry 2 0 C002 2016-02-01 Inquiry 1 0 C002 2016-02-25 Inquiry 2 0 C003 2016-02-01 Inquiry 2 0 C003 2016-02-10 Sales 1 C003 2016-02-10 Sales 2 C003 2016-03-10 Inquiry 2 3 C004 2016-04-15 Sales 1 select * from t match_recognize( partition by cust_id order by tx_date, descr measures case when classifier() = 'INQ' and tx_date >= final first(sale1.tx_date) - 10 then count(inq.*) - final count(inq.*) - 1 else count(*) - count(inq.*) end as seq all rows per match pattern (inq* sale1{0,1} more_tx*) define inq as descr != 'Sales', sale1 as descr = 'Sales', more_tx as tx_date <= sale1.tx_date + 40 );
  • 19. Negative sequence for Inquiries within 10 days prior to Sale 19 CUST_ID TX_DATE DESCR SEQ C001 2016-01-01 Inquiry -2 C001 2016-01-01 Inquiry -1 C001 2016-01-10 Sales 1 C001 2016-01-21 Repeat Inquiry 2 C001 2016-02-10 Repeat Inquiry 3 C001 2016-05-01 Sales 1 C001 2016-05-06 Sales 2 C001 2016-06-10 Inquiry 1 3 C001 2016-09-01 Inquiry 2 0 C002 2016-02-01 Inquiry 1 0 C002 2016-02-25 Inquiry 2 0 C003 2016-02-01 Inquiry 2 -1 C003 2016-02-10 Sales 1 C003 2016-02-10 Sales 2 C003 2016-03-10 Inquiry 2 3 C004 2016-04-15 Sales 1 select * from t match_recognize( partition by cust_id order by tx_date, descr measures case when classifier() = 'INQ' and tx_date >= final first(sale1.tx_date) - 10 then count(inq.*) - final count(inq.*) - 1 else count(*) - count(inq.*) end as seq all rows per match pattern (inq* sale1{0,1} more_tx*) define inq as descr != 'Sales', sale1 as descr = 'Sales', more_tx as tx_date <= sale1.tx_date + 40 );
  • 20. Hierarchical Summary: get salaries of mgr + subordinates select level lvl, ename, sal from scott.emp start with mgr is null connect by mgr = prior empno; LVL ENAME SAL 1 KING 5000 2 JONES 2975 3 SCOTT 3000 4 ADAMS 1100 3 FORD 3000 4 SMITH 800 2 BLAKE 2850 3 ALLEN 1600 3 WARD 1250 3 MARTIN 1250 3 TURNER 1500 3 JAMES 950 2 CLARK 2450 3 MILLER 1300
  • 21. Hierarchical Summary: get salaries of mgr + subordinates 21 select * from ( select level lvl, ename, sal from scott.emp start with mgr is null connect by mgr = prior empno ) match_recognize( measures a.lvl lvl, a.ename ename, a.sal sal, sum(sal) as sum_sal pattern(a b*) define b as lvl > a.lvl ); LVL ENAME SAL 1 KING 5000 2 JONES 2975 3 SCOTT 3000 4 ADAMS 1100 3 FORD 3000 4 SMITH 800 2 BLAKE 2850 3 ALLEN 1600 3 WARD 1250 3 MARTIN 1250 3 TURNER 1500 3 JAMES 950 2 CLARK 2450 3 MILLER 1300
  • 22. Hierarchical Summary: get salaries of mgr + subordinates 22 LVL ENAME SAL SUM_SAL 1 KING 5000 29025 select * from ( select level lvl, ename, sal from scott.emp start with mgr is null connect by mgr = prior empno ) match_recognize( measures a.lvl lvl, a.ename ename, a.sal sal, sum(sal) as sum_sal pattern(a b*) define b as lvl > a.lvl );
  • 23. Hierarchical Summary: get salaries of mgr + subordinates 23 LVL ENAME SAL SUM_SAL 1 KING 5000 29025 select * from ( select level lvl, ename, sal from scott.emp start with mgr is null connect by mgr = prior empno ) match_recognize( measures a.lvl lvl, a.ename ename, a.sal sal, sum(sal) as sum_sal after match skip past last row pattern(a b*) define b as lvl > a.lvl );
  • 24. Hierarchical Summary: get salaries of mgr + subordinates 24 LVL ENAME SAL SUM_SAL 1 KING 5000 29025 select * from ( select level lvl, ename, sal from scott.emp start with mgr is null connect by mgr = prior empno ) match_recognize( measures a.lvl lvl, a.ename ename, a.sal sal, sum(sal) as sum_sal after match skip to next row pattern(a b*) define b as lvl > a.lvl );
  • 25. Hierarchical Summary: get salaries of mgr + subordinates 25 LVL ENAME SAL SUM_SAL 1 KING 5000 29025 2 JONES 2975 10875 3 SCOTT 3000 4100 4 ADAMS 1100 1100 3 FORD 3000 3800 4 SMITH 800 800 2 BLAKE 2850 9400 3 ALLEN 1600 1600 3 WARD 1250 1250 3 MARTIN 1250 1250 3 TURNER 1500 1500 3 JAMES 950 950 2 CLARK 2450 3750 3 MILLER 1300 1300 select * from ( select level lvl, ename, sal from scott.emp start with mgr is null connect by mgr = prior empno ) match_recognize( measures a.lvl lvl, a.ename ename, a.sal sal, sum(sal) as sum_sal after match skip to next row pattern(a b*) define b as lvl > a.lvl ); http://www.kibeha.dk/2015/07/row-pattern-matching-nested-within.html
  • 26. Inequality joins create table t1(id, jd, v) cache as select level, level + .1, level from dual connect by level <= 20000; create table t2 cache as select * from t1; • Equality • Band Join: compare T1.ID to T2.ID + a constant • Range Join: T1.ID within a T2 range (ID to JD) • Overlap Join: T1 range (ID to JD) overlaps T2 range
  • 27. Equality select t1.id id1, t2.id id2, t1.v v1, t2.v v2 from t1, t2 where t1.id = t2.id Elapsed: .01 seconds
  • 28. Band Join select t1.id id1, t2.id id2, t1.v v1, t2.v v2 from t1, t2 where t1.id between t2.id and t2.id + .1 Elapsed: .04 seconds • New implementation in 12.2 • Before 12.2, about the same time as range join
  • 29. Range Join 29 select t1.id id1, t2.id id2, t1.v v1, t2.v v2 from t1, t2 where t1.id between t2.id and t2.jd Elapsed: 30 seconds (Equality: .01 Band: .04)
  • 30. Range Join Execution Plan 30 ------------------------------------------------------- | Id | Operation | Name | Starts | A-Rows | ------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 20000 | | 1 | MERGE JOIN | | 1 | 20000 | | 2 | SORT JOIN | | 1 | 20000 | | 3 | TABLE ACCESS FULL | T1 | 1 | 20000 | |* 4 | FILTER | | 20000 | 20000 | |* 5 | SORT JOIN | | 20000 | 200M| | 6 | TABLE ACCESS FULL| T2 | 1 | 20000 | -------------------------------------------------------
  • 31. 31 T1 T2 ID ID JD 1 1 1.1 1 2 2.1 1 3 3.1 2 1 1.1 2 2 2.1 2 3 3.1 3 1 1.1 3 2 2.1 3 3 3.1 All possible combinations
  • 32. 32 T1 T2 ID ID JD 1 1 1.1 1 2 2.1 1 3 3.1 2 1 1.1 2 2 2.1 2 3 3.1 3 1 1.1 3 2 2.1 3 3 3.1 5 - access("T1"."ID">="T2"."ID")
  • 33. 33 T1 T2 ID ID JD 1 1 1.1 1 2 2.1 1 3 3.1 2 1 1.1 2 2 2.1 2 3 3.1 3 1 1.1 3 2 2.1 3 3 3.1 4 - filter("T1"."ID"<="T2"."JD") 5 - access("T1"."ID">="T2"."ID")
  • 34. Sort all and Match? 34 T2 T1 T2 T1 T2 T1 T2 1 1 1 2 2.1 2 2 3 3.1 3 3 4 4.1 4 5 5.1 Sort by ID, T2 first. (order shown from left to right) T2 range diff. is now 1.1, so will match 2 T1 rows
  • 35. Sort all and Match? 35 T2 1 1 2 2.1 3 4 5 Start Look for following T1 rows with ID <= 2.1 Due to sort, their IDs must be >= T2.ID
  • 36. Sort all and Match? 36 T2 T1 1 1 1 2 2.1 3 4 5 Start Join T1.ID < 2.1 Due to sort, T1.ID is automatically >= T2.ID
  • 37. Sort all and Match? 37 T2 T1 T2 1 1 1 2 2.1 2 3 3.1 4 5 Start Join match T2.ID < 2.1 So match, but do not output
  • 38. Sort all and Match? 38 T2 T1 T2 T1 1 1 1 2 2.1 2 2 3 3.1 4 5 Start Join match Join T1.ID < 2.1 So match and output
  • 39. Sort all and Match? 39 T2 T1 T2 T1 T2 1 1 1 2 2.1 2 2 3 3.1 3 4 4.1 5 Start Join match Join X Match ended Skip to next
  • 40. Sort all and Match? 40 T2 T1 T2 T1 T2 T1 T2 1 1 1 2 2.1 2 2 3 3.1 3 3 4 4.1 4 5 5.1 Start Join match Join X
  • 41. (Almost) All Rows per Match • PATTERN ( A {- B A -} B) – The parts of the pattern enclosed between {- and -} are excluded from the output. – Here only two rows per match will be returned – More granular than using a WHERE clause • Alternation: | means OR – "Alternatives are preferred in the order they are specified." PATTERN ( A | B ) = If A condition is true then A, else if B condition is true then B 41
  • 42. Range Match 42 select ID ID1, ID2, JD2 from ( select t2.*, 1 is_t2 from t2 union all select t1.*, null from t1 ) match_recognize( order by id, is_t2 measures t2.id id2, t2.jd jd2 all rows per match after match skip to next row pattern({-T2-} ( T1 | {-T2-} )* T1) define T2 as is_t2 = 1 and id < first(t2.jd), T1 as is_t2 is null and id < first(t2.jd) ); Elapsed: .12 secs Equality: .01 Band: .04 Range join: 30.00
  • 43. Overlap Join 43 select t1.id id1, t2.id id2, t1.v v1, t2.v v2 from t1, t2 where (t2.id <= t1.id and t1.id < t2.jd) or (t1.id <= t2.id and t2.id < t1.jd) Elapsed: 50 seconds
  • 44. Overlap Join Execution Plan 44 | 0 | SELECT STATEMENT | | 1 | | 1 | | 1 | SORT AGGREGATE | | 1 | 1 | 1 | | 2 | VIEW | VW...| 1 | 175M| 20000 | | 3 | UNION-ALL | | 1 | | 20000 | | 4 | MERGE JOIN | | 1 | 100M| 20000 | | 5 | SORT JOIN | | 1 | 20000 | 20000 | | 6 | TABLE ACCESS FULL | T2 | 1 | 20000 | 20000 | |* 7 | FILTER | | 20000 | | 20000 | |* 8 | SORT JOIN | | 20000 | 20000 | 200M| | 9 | TABLE ACCESS FULL| T1 | 1 | 20000 | 20000 | | 10 | MERGE JOIN | | 1 | 75M| 0 | | 11 | SORT JOIN | | 1 | 20000 | 20000 | | 12 | TABLE ACCESS FULL | T2 | 1 | 20000 | 20000 | |* 13 | FILTER | | 20000 | | 0 | |* 14 | SORT JOIN | | 20000 | 20000 | 200M| | 15 | TABLE ACCESS FULL| T1 | 1 | 20000 | 20000 |
  • 45. Overlap Match 45 select * from ( select t1.*, 1 table_num from t1 union all select t2.*, 2 from t2 ) match_recognize( order by id, jd all rows per match after match skip to next row pattern({-ta-} ( tb | {-x-} )* tb) define tb as table_num != ta.table_num and id < first(jd), x as table_num = ta.table_num and id < first(jd) ); Elapsed: .12 secs Equality: .01 Band: .04 Range match: .12 No need to wait for 18c
  • 47. Solving Problems with pattern matching • Clear knowledge of input & requirement – Beware of assumptions • Identify typical problems and solutions – Consecutive sequences – Ad hoc grouping – Bin fitting – Ranges • Visualize the data processing flow – Output from other rows is not available, input is. 47
  • 48. Meet Your Match Advanced row pattern matching (12c) Stew Ashton UKOUG Tech 17 https://stewashton.wordpress.com/ Twitter: @stewashton
  • 49. Anchors • Anchors – ^ matches the position before the first row in the partition. – $ matches the position after the last row in the partition PATTERN(^ A $) = partition must have 1 row 49
  • 50. JOIN alternative: CDC compare T1: PK VAL 1 Same value 2 Delete this 3 Old value T2: PK VAL 1 Same value 3 New value 4 Insert this select pk, op, val, oldrid from ( select pk, val, rowid rid from t1 union all select pk, val, null from t2 ) match_recognize( partition by pk order by rid measures classifier() op, first(rid) oldrid all rows per match pattern(^ D $ | ^ I $ | (^ O U $) ) define D as rid is not null, U as decode(O.val, val, 0, 1) = 1 ); PK OP VAL OLDRID 2 D Delete this AAAkdlAAH…MAAB 3 O Old value AAAkdlAAH…MAAC 3 U New value AAAkdlAAH…MAAC 4 I Insert this
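
To see which rows the pattern variables A and B from slides 4 to 6 actually claim, the basic page query can be run with ALL ROWS PER MATCH together with CLASSIFIER() and MATCH_NUMBER(). This is only a minimal sketch, assuming Oracle 12c or later; the inline WITH data and the aliases mno and cls are made up for illustration and do not come from the slides.

```sql
-- Hypothetical inline data standing in for the PAGE table of slide 3.
with t(page) as (
  select 1 from dual union all
  select 2 from dual union all
  select 3 from dual union all
  select 5 from dual
)
select page, mno, cls
from t
match_recognize (
  order by page
  measures
    match_number() as mno,   -- 1 for the first match, 2 for the second
    classifier()   as cls    -- pattern variable that claimed the current row
  all rows per match
  after match skip past last row
  pattern (a b*)
  define b as page = prev(page) + 1
)
order by page;
-- Expected: pages 1, 2, 3 form match 1 (A, B, B); page 5 forms match 2 (A).
```

Page 5 is first tested as B and fails, because PREV(page)+1 is 4; that ends the first match, and page 5 then starts a new match as A. This is the "B?" row in the table on slide 6.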