Cary Millsap Performance Forum Presentation on Oracle Database Tracing Techniques

Cary Millsap
@CaryMillsap
Dallas Oracle Users Group
Oracle Database Forum
5:00p–7:00p Thursday 25 January 2018
© 2015, 2018 Cary Millsap
Performance

Cary Millsap
Method R™

Something I need to say about performance data

Surrogate measures suck.
Utilizations, latencies, samples, hit ratios.
They’re too hard. Too error-prone. And too slow.

The code path for measuring software belongs in your tools, not your brain.

[Image. Credit: hubbelsite.org]

Some requirements for measurement tools
Relevance: Measured durations must match experienced durations.
Determinism: The data interpretation process must be deterministic, unambiguous, devoid of magic.
Integrity: Numbers must reconcile within a report, across drill-downs, across reports, etc.
Quality: There must be no unit-mixing, no chartjunk, no longjumps, etc.

Example
Database          Total Size      Total Storage
----------------- --------------- ---------------
SAD99PS           635.53 GB       1.24 TB
ANGLL             9.15 TB         18.3 TB
FRI_W1            2.14 TB         4.29 TB
DEMO              6.62 TB         13.24 TB
H111D16           7.81 TB         15.63 TB
HAANT             1.1 TB          2.2 TB
FSU               7.41 TB         14.81 TB
BYNANK            2.69 TB         5.38 TB
HDMI7             237.68 GB       476.12 GB
SXXZPP            598.49 GB       1.17 TB
TPAA              1.71 TB         3.43 TB
MAISTERS          823.96 GB       1.61 TB
p17gv_data01.dbf  800.0 GB        1.56 TB

Database         Size (TB) Storage (TB)
---------------- --------- ------------
ANGLL                 9.15        18.30
H111D16               7.81        15.63
FSU                   7.41        14.81
DEMO                  6.62        13.24
BYNANK                2.69         5.38
FRI_W1                2.14         4.29
TPAA                  1.71         3.43
HAANT                 1.10         2.20
MAISTERS               .82         1.61
p17gv_data01.dbf       .80         1.56
SAD99PS                .64         1.24
SXXZPP                 .60         1.17
HDMI7                  .24          .48
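The step from the mixed-unit table to the single-unit, sorted table can be done mechanically. A minimal sketch, assuming decimal units (1 TB = 1,000 GB, which is what the rounded figures in the second table imply); the helper names `to_tb` and `normalize` are illustrative, not from the talk:

```python
# Sketch: convert mixed-unit size strings to one unit (TB) and sort descending.
# Assumption: decimal units, 1 TB = 1000 GB. Helper names are illustrative only.
FACTORS_TO_TB = {"GB": 0.001, "TB": 1.0}

def to_tb(text):
    """Convert a string such as '635.53 GB' to a float number of terabytes."""
    value, unit = text.split()
    return float(value) * FACTORS_TO_TB[unit]

def normalize(rows):
    """rows: (name, size, storage) strings -> tuples in TB, largest size first."""
    converted = [(name, to_tb(size), to_tb(storage)) for name, size, storage in rows]
    return sorted(converted, key=lambda r: r[1], reverse=True)

# Three rows copied from the first table.
rows = [
    ("SAD99PS", "635.53 GB", "1.24 TB"),
    ("ANGLL", "9.15 TB", "18.3 TB"),
    ("HDMI7", "237.68 GB", "476.12 GB"),
]
for name, size, storage in normalize(rows):
    print(f"{name:<16} {size:>9.2f} {storage:>12.2f}")
```

One unit per column and a meaningful sort order are what make the second table readable at a glance.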

Example
$ mrskew v11203_ora_26827.trc --rc=exp.rc --z=.05 --where='$exp_id != 0' --top=5
EXP-ID       DURATION      %  CALLS     MEAN      MIN      MAX
----------- --------- ------ ------ -------- -------- --------
19537        2.212251   3.5%    807 0.002741 0.000000 0.049582
27239        2.112561   3.3%    791 0.002671 0.000000 0.048360
24213        1.927336   3.0%    267 0.007218 0.000000 0.048210
16121        1.450686   2.3%    683 0.002124 0.000000 0.049147
22279        0.997744   1.6%    643 0.001552 0.000000 0.045547
323 others  55.300800  86.4% 13,841 0.003995 0.000000 0.049996
----------- --------- ------ ------ -------- -------- --------
TOTAL (328) 64.001378 100.0% 17,032 0.003758 0.000000 0.049996
$ mrskew v11203_ora_26827.trc --z=.05 --where='$exp_id == 19537'
CALL-NAME                   DURATION      % CALLS     MEAN      MIN      MAX
--------------------------- -------- ------ ----- -------- -------- --------
SQL*Net message from client 2.211694 100.0%   251 0.008812 0.001512 0.049582
SQL*Net message to client   0.000557   0.0%   252 0.000002 0.000001 0.000035
FETCH                       0.000000   0.0%   252 0.000000 0.000000 0.000000
EXEC                        0.000000   0.0%    52 0.000000 0.000000 0.000000
--------------------------- -------- ------ ----- -------- -------- --------
TOTAL (4)                   2.212251 100.0%   807 0.002741 0.000000 0.049582
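These two reports illustrate the Integrity requirement: the rows of each report add up to its TOTAL line, and the drill-down total equals the EXP-ID 19537 row of the first report. A minimal sketch of that check, using figures copied from the reports; the function name `reconciles` and the rounding tolerance are assumptions for illustration:

```python
# Sketch: check that a profile's rows reconcile with its TOTAL line.
# Figures are copied from the two mrskew reports; names are illustrative only.
def reconciles(rows, total, tolerance=5e-7):
    """rows: list of (duration, calls); total: (duration, calls)."""
    duration = sum(d for d, _ in rows)
    calls = sum(c for _, c in rows)
    return abs(duration - total[0]) <= tolerance and calls == total[1]

by_exp_id = [(2.212251, 807), (2.112561, 791), (1.927336, 267),
             (1.450686, 683), (0.997744, 643), (55.300800, 13841)]
drill_down = [(2.211694, 251), (0.000557, 252), (0.0, 252), (0.0, 52)]

print(reconciles(by_exp_id, (64.001378, 17032)))  # rows vs. TOTAL (328)
print(reconciles(drill_down, (2.212251, 807)))    # drill-down vs. EXP-ID 19537 row
```

Both checks print True for the figures shown.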

New ways to control tracing

Better alter session syntax
The old way:
alter session set events '10046 trace name context forever, level 12'
-- Trace this session until you disable.
alter session set events '10046 trace name context off'
The new 11g way:
alter session set events 'sql_trace wait=true, bind=true, plan_stat=all_executions'
-- Trace this session until you disable.
alter session set events 'sql_trace off'

More alter system options
alter system set events 'sql_trace[sql: sql_id=ds9j6z3j9n49k|95hh6uvjgspm2] wait=true, bind=true, plan_stat=adaptive';
-- Trace statements with matching SQL IDs until you disable.
alter system set events 'sql_trace[sql: sql_id=ds9j6z3j9n49k|95hh6uvjgspm2] off';
alter system set events 'sql_trace{process: 9176} wait=true, bind=true, plan_stat=adaptive';
-- Trace the identified process until you disable.
alter system set events 'sql_trace{process: 9176} off';
alter system set events 'sql_trace{process: orapid=75} wait=true, bind=true, plan_stat=adaptive';
-- Trace the identified process until you disable.
alter system set events 'sql_trace{process: orapid=75} off';
alter system set events 'sql_trace{process: pname=smon|p0000|p0003} wait=true, bind=true, plan_stat=adaptive';
-- Trace the named processes until you disable.
alter system set events 'sql_trace{process: pname=smon|p0000|p0003} off';
alter system set events 'sql_trace[sql: sql_id=4cxhkp5ckjs6s]{process:9176} wait=true, bind=true, plan_stat=adaptive';
-- Trace the specified SQL only in the specified process until you disable.
alter system set events 'sql_trace[sql: sql_id=4cxhkp5ckjs6s]{process:9176} off';

Showing pending traces
oradebug setmypid
oradebug event dump system

And, of course…
dbms_session.session_trace_enable(…)
dbms_monitor.session_trace_enable(sid, serial, …)
dbms_monitor.serv_mod_act_trace_enable(serv, mod, act, …)

Measurement intrusion

What is the performance penalty of tracing?

Tracing is unnoticeable if you do it right.

What is “if you do it right”?

You never want anyone to perceive, fairly or not, that tracing is hurting your business.

What is “if you do it right”?
Respect the IOPS capacity of your storage.
Thumb rule: ≤ 5 simultaneous traces.
Make a plan for an immediate emergency trace disable.
Test your tracing strategy.
Trace at the right level.

What is “the right level”?

If you don’t have time to think about it, then use…
sql_trace wait=true, bind=false, plan_stat=adaptive
Using the wrong level can get you into big trouble.

Example: A batch program is too slow. With tracing disabled, the program processes 10,000 rows in 1.222 s. With wait=true,bind=true tracing enabled, the same program takes 29.342 s to process the same 10,000 rows. This is a 24× performance penalty, which is horrifying. With wait=true,bind=false tracing enabled, the same program processes its 10,000 rows in 1.211 s, which is within ±1% of the trace-disabled duration.
Investigation of the trace reveals that nearly 100% of the program’s duration is consumed by a complicated update statement. The statement is parsed only once, which is good, but each EXEC of this statement updates only one row (this, of course, is the root of the problem). The statement’s where clause references 100 placeholders named :b001, :b002, ..., :b100. With bind tracing disabled, we can tell that each EXEC call consumes 0.000 122 s, but with bind tracing enabled, each EXEC call consumes 0.002 921 s. For each call, 95.8% of the traced duration is tracing overhead!

The problem
bind=false    122 μs/row
bind=true   2,921 μs/row
(2921 - 122)/2921 = 95.8% overhead, 2921/122 = 24× penalty
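The two figures measure different things: the overhead is the fraction of the traced duration that tracing itself consumed, and the penalty is the ratio of traced to untraced duration. A minimal sketch of the arithmetic, with an illustrative function name:

```python
# Sketch: overhead and penalty of tracing, from per-row EXEC durations in microseconds.
def tracing_overhead(untraced_us, traced_us):
    """Return (overhead fraction of the traced duration, slowdown ratio)."""
    overhead = (traced_us - untraced_us) / traced_us
    penalty = traced_us / untraced_us
    return overhead, penalty

overhead, penalty = tracing_overhead(untraced_us=122, traced_us=2921)
print(f"{overhead:.1%} overhead, {penalty:.0f}x penalty")  # 95.8% overhead, 24x penalty
```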

Problem is a combination of three factors…
bind=true prints 5n + 1 lines per EXEC call.
Lots of placeholders (n) make 2(5n + 1) large.
Application makes way too many EXEC calls.
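To see how the three factors multiply, the sketch below plugs in the numbers from the batch example (n = 100 placeholders, 10,000 EXEC calls). Reading the factor of 2 as two write calls per trace line is an assumption, taken from the later slide on how sql_trace writes its lines; the function names are illustrative:

```python
# Sketch: trace volume generated by bind=true for one statement.
# Assumption: the factor 2 in 2(5n + 1) is two write calls per trace line.
def bind_lines_per_exec(n):
    """Trace lines printed per EXEC call with n placeholders."""
    return 5 * n + 1

def write_calls_per_exec(n, writes_per_line=2):
    """Write calls issued per EXEC call."""
    return writes_per_line * bind_lines_per_exec(n)

n, execs = 100, 10_000
print(bind_lines_per_exec(n))            # 501 lines per EXEC
print(write_calls_per_exec(n))           # 1002 write calls per EXEC
print(execs * write_calls_per_exec(n))   # 10020000 write calls for the whole run
```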

Solution
Trace with wait=true,bind=false.
Now you know the application makes way too many EXEC calls.
Fix the application (process sets, not rows).
Now the application is faster.
And now you can use wait=true,bind=true.

Don’t use bind=true until you know it’s safe.

Same analysis applies for plan_stat=all_executions.

The Magic of Interposing

sql_trace
FETCH #46996428170184:c=281957,e=293172,p=1,cr=6244,cu=3,mis=0,r=1,dep=0,og=1,plh=4061086508,tim=1400860472606066
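A database call line like this one is a call name, a cursor handle, and a list of name=value fields, where c is the CPU time and e the elapsed time of the call in microseconds. A minimal parsing sketch; the function name and regular expression are illustrative and handle only lines of this all-integer shape:

```python
import re

# Sketch: split a sql_trace database call line into its name=value fields.
# Assumption: every field value is an integer, as in the FETCH line shown.
def parse_dbcall(line):
    """Return (call name, cursor handle, dict of integer fields)."""
    m = re.match(r"(\w+) #(\d+):(.*)", line.strip())
    if m is None:
        raise ValueError("not a database call line")
    name, cursor, rest = m.groups()
    fields = {k: int(v) for k, v in (pair.split("=") for pair in rest.split(","))}
    return name, int(cursor), fields

line = ("FETCH #46996428170184:c=281957,e=293172,p=1,cr=6244,cu=3,"
        "mis=0,r=1,dep=0,og=1,plh=4061086508,tim=1400860472606066")
name, cursor, f = parse_dbcall(line)
print(name, f["c"], f["e"], f["e"] - f["c"])  # FETCH 281957 293172 11215
```

The last figure is the part of the call's elapsed time that c does not account for.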

[Figure: process states and transitions. States: 1 user running, 2 kernel running, 3 ready to run, 4 asleep. Transition labels: sys call or interrupt, return, interrupt, interrupt return, sleep, wakeup, schedule process, context switch permissible. Source: Bach 1986 (31)]

What is c?
User running? Kernel running? Both?
[Figure: the process state diagram from Bach 1986 (31), repeated.]

What is c? User running? Kernel running? Both?
1. Use strace to find out what function Oracle calls to calculate c.
2. Use function interposition to replace that function; make it return what we want.
3. Trace something. See what c is.

A syscall
[Diagram: Oracle calls the OS.]

A syscall with function interposition
[Diagram: Oracle, Magic, OS. The interposed “Magic” layer sits between Oracle and the OS.]

The function...
dbcall(...) {
    e0 = gettime;       /* elapsed-time clock before the call */
    c0 = getrusage;     /* CPU usage before the call */
    /* execute the dbcall here */
    e1 = gettime;
    c1 = getrusage;
    e = e1 - e0;        /* elapsed duration of the call */
    c = c1 - c0;        /* CPU time consumed by the call */
    write(TRC, "%s #%d:c=%d,e=%d,...",nam,cid,c,e,...);
}
Millsap: The Method R Guide to Mastering Oracle Trace Data, 2nd ed., p96
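The same before-and-after pattern can be tried outside the database. The sketch below is an analogy, not Oracle's code: it wraps an ordinary Python function, takes elapsed time from `time.perf_counter` and CPU time from `resource.getrusage` (Unix only), and prints a line in the style of a trace line. Summing user and system CPU time is an assumption here, anticipating the answer the interposition experiment arrives at; `timed_call` and `cpu_seconds` are illustrative names.

```python
import resource
import time

# Sketch of the dbcall timing pattern, applied to an ordinary Python function.
def cpu_seconds():
    """User plus system CPU time consumed so far by this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime

def timed_call(name, func, *args):
    """Run func(*args); report CPU (c) and elapsed (e) time in microseconds."""
    e0, c0 = time.perf_counter(), cpu_seconds()
    result = func(*args)                      # "execute the dbcall here"
    e1, c1 = time.perf_counter(), cpu_seconds()
    e_us = int((e1 - e0) * 1_000_000)
    c_us = int((c1 - c0) * 1_000_000)
    print(f"{name}:c={c_us},e={e_us}")
    return result, c_us, e_us

result, c_us, e_us = timed_call("SUM", sum, range(1_000_000))
```

The printed values vary from run to run; the point is only that c and e come from two different clocks sampled around the same call.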

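The interposition…
Our getrusage calls the real getrusage, but lies about what it returned.
static int seq = 1;
/* Simplified for the slide: real_getrusage stands for the real function, and
   ru_utime and ru_stime are treated as plain numbers (in the real
   struct rusage they are struct timeval values). */
int getrusage(int who, struct rusage *u) {
    int r = real_getrusage(who, u);   /* call the real getrusage */
    u->ru_utime = 1 * seq;            /* then overwrite the user CPU time */
    u->ru_stime = 2 * seq;            /* and the system CPU time */
    ++seq;
    return r;
}
seq=1: c0 = getrusage: utime = 1, stime = 2, utime+stime = 3
seq=2: c1 = getrusage: utime = 2, stime = 4, utime+stime = 6
c1 - c0: utime = 1, stime = 2, utime+stime = 3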

getrusage: c1 - c0: utime = 1, stime = 2, utime+stime = 3
sql_trace: FETCH #31286428452191:c=3,e=42,p=1,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh=8211425821,tim=1400871283103972
Therefore, c = utime+stime. QED
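The experiment works because the fake values make the three hypotheses predict three different numbers. A minimal simulation of that reasoning in Python (a model of the logic, not the interposer itself; the class and function names are illustrative):

```python
# Sketch: why the fake getrusage values distinguish the three hypotheses for c.
class FakeRusage:
    """Mimics the interposed getrusage: call k reports utime = k, stime = 2k."""
    def __init__(self):
        self.seq = 1

    def __call__(self):
        utime, stime = 1 * self.seq, 2 * self.seq
        self.seq += 1
        return utime, stime

HYPOTHESES = {
    "user only": lambda utime, stime: utime,
    "kernel only": lambda utime, stime: stime,
    "user + kernel": lambda utime, stime: utime + stime,
}

def predicted_c(combine):
    """c = c1 - c0 for one database call, under one hypothesis."""
    getrusage = FakeRusage()
    c0 = combine(*getrusage())   # sample before the call
    c1 = combine(*getrusage())   # sample after the call
    return c1 - c0

predictions = {name: predicted_c(f) for name, f in HYPOTHESES.items()}
observed_c = 3  # from the FETCH line: c=3
matches = [name for name, c in predictions.items() if c == observed_c]
print(predictions)  # {'user only': 1, 'kernel only': 2, 'user + kernel': 3}
print(matches)      # ['user + kernel']
```

Choosing different multipliers for utime and stime is what keeps the three predictions distinct.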

More fun with interposing

Mission...
Oracle sql_trace executes two write calls per line.
Let’s fix it.

strace of a sql_trace’d Oracle session
write(10, "PARSE #139778455421104:c=1000,e=1304,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=0,tim=280953708083", 94) = 94
write(10, "\n", 1)

A solution…
1. Interpose all fd = open(path,…) calls:
   if path is *.trc, then set trcfile = fd.
2. Interpose all write(fd,s) calls:
   if (fd == trcfile) then...
     if s doesn’t end with "\n", then save s;
     otherwise, write deferred content and s.
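The deferral logic in these two steps can be modeled in a few lines. The sketch below is a Python model of the logic only; a real interposer would be a shared library wrapping the C open and write functions. The class name, the stand-in `fake_write`, and the trace file path are hypothetical.

```python
# Sketch: coalesce each trace line and its trailing newline into one write.
class TraceWriteCoalescer:
    def __init__(self, real_write):
        self.real_write = real_write  # stand-in for the real write(fd, s)
        self.trcfile = None           # fd of the trace file, once it is opened
        self.deferred = b""           # content saved until a newline arrives

    def on_open(self, path, fd):
        """Step 1: remember the descriptor of any *.trc file."""
        if path.endswith(".trc"):
            self.trcfile = fd

    def write(self, fd, s):
        """Step 2: defer trace-file writes that do not end with a newline."""
        if fd != self.trcfile:
            return self.real_write(fd, s)
        if not s.endswith(b"\n"):
            self.deferred += s
            return len(s)             # report success; nothing written yet
        data, self.deferred = self.deferred + s, b""
        self.real_write(fd, data)     # one real write for the whole line
        return len(s)

calls = []
def fake_write(fd, data):
    calls.append((fd, data))
    return len(data)

shim = TraceWriteCoalescer(fake_write)
shim.on_open("/tmp/example_ora_1.trc", 10)   # hypothetical path and fd
shim.write(10, b"PARSE #1:c=1000,e=1304")
shim.write(10, b"\n")
print(calls)  # [(10, b'PARSE #1:c=1000,e=1304\n')]
```

Returning the caller's own byte count from the deferred write keeps the caller unaware that the data has not yet reached the file.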

Other things you could do…
1. Buffer trace writes (Oracle SR 3-9616334591).
2. Write your own WAIT lines into the trace file that summarize the cost of all the sql_trace write calls.
3. Intercept unnecessary commit calls.
4. Write In-Memory DB or Exadata Storage stats from Oracle shared memory to the sql_trace file.

Discussion
@CaryMillsap
carymillsap.blogspot.com
www.cintra.com
method-r.com
