2. Agenda
1. Intro: Me, Delphix
2. What is DTrace
3. Why DTrace
  - Make the impossible possible
  - Low overhead
4. Where DTrace can be used
5. How DTrace is used
  - Probes
  - Overhead
  - Variables
  - Resources
3. Kyle Hailey
• OEM 10g Performance Monitoring
• Visual SQL Tuning (VST) in DB Optimizer
• Delphix
5. What is DTrace
• A way of tracing the O/S and programs
  - Making the impossible possible
• Your code stays unchanged
  - Optionally add static DTrace probes
• No overhead when off
  - Turning it on dynamically changes the code path
• Low overhead when on
  - 1000s of events per second cause less than 1% overhead
• Event driven
  - Like Oracle trace events 10046 and 10053
7. Where can we trace
• Solaris
• OpenSolaris
• FreeBSD …
• MacOS
• Linux - announced by Oracle
• AIX - working "probevue"
8. What can we trace?
Almost anything
  - All system calls, e.g. "read"
  - All kernel calls, e.g. "biodone"
  - All function calls in a program
  - All DTrace stable providers
    • Example: io:::start
    • Predefined stable probes
    • Non-stable probe names and arguments can change over time
  - Custom probes
    • Write custom probes in programs to trace
10. Event Driven
• DTrace code runs when probes fire in the OS
/usr/sbin/dtrace -n '
#pragma D option quiet
/* Probe (fires in any thread or process): when this happens, then: */
io:::start
{
    /* Take action: print a variable */
    printf(" timestamp %d \n", timestamp);
}'
• Program runs until canceled
$ sudo ./mydtrace.d
timestamp 8135515300287183
timestamp 8135515300328512
timestamp 8135515300346769
^C
11. What are these
What are these probes and variables?
/* io:::start is the probe */
io:::start
{
    /* timestamp is the variable */
    printf(" timestamp %d \n", timestamp);
}'
  - Probes
    • kernel and system calls
    • program function calls
    • predefined by DTrace
  - Variables
    • predefined in DTrace, like timestamp
    • defined by the user
12. How to list Probes?
Two ways to list probes
1. All system and kernel calls
dtrace -l
2. All functions of a process (through the pid provider)
dtrace -ln 'pid[pid]:::entry'
Output will have a 4-part name, colon separated
• provider:module:function:name
13. Kernel vs User Space
[Diagram: "dtrace -l" covers kernel functions and system calls; user-land processes (PIDs 899, 731 and 21 in the figure) are listed per process through the pid provider, e.g. pid21.]
14. dtrace -l
Columns: Provider, Module, Function, Name
$ sudo dtrace -l
 ID  PROVIDER  MODULE  FUNCTION              NAME
  1  dtrace                                  BEGIN
  2  dtrace                                  END
  3  dtrace                                  ERROR
 16  profile                                 tick-1sec
 17  fbt       klmops  lm_find_sysid         entry
 18  fbt       klmops  lm_find_sysid         return
 19  fbt       klmops  gister_share_locally  entry
…
Thousands of lines.
16. Providers: defined interfaces
Instead of tracing a kernel function, which could change between O/S versions, trace a maintained, stable probe
https://wikis.oracle.com/display/DTrace/Providers
  - I/O             io provider
  - CPU             sched provider
  - system calls    syscall provider
  - memory          vminfo provider
  - user processes  pid provider
  - network         tcp provider
Provider definition files are in /usr/lib/dtrace, such as io.d, nfs.d, sched.d, tcp.d
17. Example Network: TCP
What if we wanted to look at TCP receive events?
• Probes have a 4-part name
provider:module:function:name
$ dtrace -l | grep tcp | grep receive
tcp:ip:tcp_input_data:receive
Or look at the wiki
https://wikis.oracle.com/display/DTrace/tcp+Provider
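The 4-part naming can be made concrete with a small Python sketch. This is only an illustration of how a probe description splits into fields; `parse_probe` is a hypothetical helper, not part of DTrace. It assumes what DTrace documents for probe descriptions: an empty field matches everything, and when fewer than four fields are given they are read from the right.

```python
FIELDS = ("provider", "module", "function", "name")

def parse_probe(description):
    """Split a probe description into provider, module, function, name.

    An empty string stands for "match anything". Descriptions with fewer
    than four fields are padded on the left, so "read:entry" is read as
    function "read" and name "entry".
    """
    parts = description.split(":")
    if len(parts) > len(FIELDS):
        raise ValueError("a probe description has at most four fields")
    padded = [""] * (len(FIELDS) - len(parts)) + parts
    return dict(zip(FIELDS, padded))

if __name__ == "__main__":
    print(parse_probe("tcp:ip:tcp_input_data:receive"))
    print(parse_probe("io:::start"))
```

For `io:::start` the module and function fields come back empty, which is why that one description matches the start probe in every module.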
18. Probe arguments: dtrace -lvn
What are the arguments for the probe function
"tcp:ip:tcp_input_data:receive"?
$ dtrace -lvn tcp:ip:tcp_input_data:receive
  ID  PROVIDER  MODULE  FUNCTION        NAME
7301  tcp       ip      tcp_input_data  receive
Argument Types
args[0]: pktinfo_t *
args[1]: csinfo_t *
args[2]: ipinfo_t *
args[3]: tcpsinfo_t *
args[4]: tcpinfo_t *
What is "tcpsinfo_t", for example?
19. Probe Argument definitions
Find out what "tcpsinfo_t" is
Two ways:
1. Stable provider
  - https://wikis.oracle.com/display/DTrace/Providers
  - In our case there is a TCP stable provider
    https://wikis.oracle.com/display/DTrace/tcp+Provider
2. Look at the source code
  - For OpenSolaris see: http://src.illumos.org
  - Otherwise get a copy of the source
    • Load it into Eclipse or similar for easy searching
Let's look up "tcpsinfo_t"
21. src.illumos.org
tcpsinfo_t - a structure with many fields
Example:
string tcps_raddr = remote machine's IP address
22. Creating a Program
• Find out all the machines we are receiving TCP packets from
$ cat tcpreceive.d
#!/usr/sbin/dtrace -s
#pragma D option quiet
/* probe: when a TCP receive happens */
tcp:ip:tcp_input_data:receive
/* action: print the remote address; args[3] is a tcpsinfo_t * */
{ printf(" address %s \n", args[3]->tcps_raddr ); }
$ sudo ./tcpreceive.d
address 127.0.0.1
address 172.16.103.58
address 127.0.0.1
address 172.16.100.187
address 172.16.103.58
address 127.0.0.1
^C
23. Using for TCP Window sizes
ip             usend  ssz    send   recd
172.16.103.58    564  16028   564 \
172.16.103.58    696  16208   132 \
172.16.103.58   1180  16208   484 \
172.16.103.58   1664  16208   484 \
172.16.103.58   2148  16208   484 \
172.16.103.58   2148  16208       /    0
172.16.103.58   1452  16208       /    0
Columns:
  ip     remote machine
  usend  unacknowledged bytes sent
  ssz    send window bytes
  send   bytes sent
  recd   bytes received
If unacknowledged bytes sent goes above the send window, transmissions will be delayed.
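The send-window rule can be written as a one-line comparison. The Python sketch below applies it to the usend and ssz values from the table; `window_headroom` is a hypothetical helper name, and the last sample row is made up to show the delayed case.

```python
def window_headroom(usend, ssz):
    """Bytes that can still be sent before the send window is full.

    A result of zero or less means unacknowledged bytes have reached or
    exceeded the send window, so further transmissions would be delayed.
    """
    return ssz - usend

# (usend, ssz) pairs taken from the table, plus one hypothetical row
samples = [(564, 16028), (2148, 16208), (17000, 16208)]

if __name__ == "__main__":
    for usend, ssz in samples:
        room = window_headroom(usend, ssz)
        state = "ok" if room > 0 else "delayed"
        print(f"usend={usend:6d} ssz={ssz:6d} headroom={room:6d} {state}")
```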
24. Review so far
• DTrace - trace O/S and user programs
• Solaris, and partially on Linux, among others
• Code is event driven; structure:
  - Probe
  - Optional filter
  - Action
• Get all events with "dtrace -l"
• Get event arguments with "dtrace -lnv probe"
• Get argument definitions from the source or the wiki
25. Variables
1. Globals
• Not thread safe
X=1;
A[1]=1;
2. Aggregates
• Thread-safe scalars and arrays
• Special operations: count, average, quantize
@ct = count();
@sm = sum(value);
@sm[type] = sum(value);
@agg = quantize(value);
3. self->var
• Thread-local variable, self->x = value;
4. this->var
• Lightweight variable that lives only for this probe firing
• this->x = value;
27. What is an aggregate?
• Multi-CPU safe variable
• Lightweight
• Array or scalar
• Denoted by @
  - @var = function(value);
  - @var[array_index] = function(value);
• Predefined functions only, such as
  - sum()
  - count()
  - max()
  - quantize()***
• Print out with "printa"
28. Using Aggregates: count()
Which program writes most often?
syscall::write:entry {
@counts[execname] = count();
}
expr        72
sh         291
tee        814
make.bin  2010
Left column: execname (the program, or session). Right column: count of write calls.
https://wikis.oracle.com/display/DTrace/Aggregations
29. Aggregate: quantize()
Get a distribution of all I/O sizes
If the following returns too many rows
$ sudo dtrace -l | grep io
limit the output to specific probes with the "-ln" flag instead:
$ sudo dtrace -ln io:::
  ID  PROVIDER  MODULE   FUNCTION  NAME
6281  io        genunix  biodone   done
6282  io        genunix  biowait   wait-done
6283  io        genunix  biowait   wait-start
7868  io        nfs      nfs_bio   done
7871  io        nfs      nfs_bio   start
30. Aggregate: quantize()
What if we wanted a distribution of all I/O sizes?
bio = block I/O
$ sudo dtrace -ln io:::
  ID  PROVIDER  MODULE   FUNCTION  NAME
6281  io        genunix  biodone   done
6282  io        genunix  biowait   wait-done
6283  io        genunix  biowait   wait-start
7868  io        nfs      nfs_bio   done        (NFS module)
7871  io        nfs      nfs_bio   start       (NFS module)
$ sudo dtrace -lvn io:genunix:biodone:done
  ID  PROVIDER  MODULE   FUNCTION  NAME
6281  io        genunix  biodone   done
Argument Types
args[0]: bufinfo_t *
args[1]: devinfo_t *
args[2]: fileinfo_t
What is bufinfo_t? Sounds like buffer information.
34. Aggregate: iosizes.d with execname
Kernel land I/O
#!/usr/sbin/dtrace -s
#pragma D option quiet
io:::done
/* args[0]->b_bcount is the size of the I/O */
{ @sizes[execname] = quantize(args[0]->b_bcount); }
$ sudo iosizes.d
sched
value --- Distribution -- count
  256 |                   0
  512 |@@@@               6
 1024 |@@@@               6
 2048 |@@@@@@@@@@@@@@@@@@ 31
 4096 |@@@                5
 8192 |@@@@@              9
16384 |@@@@               6
32768 |                   0
^C
Only returns I/O for sched. Why?
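To read the histogram above: quantize() groups values into power-of-two buckets, and each row label is the lower bound of its bucket (the 512 row holds sizes 512 through 1023). The Python sketch below models that bucketing for non-negative integers only. It is an illustration, not DTrace's implementation, and the sample sizes are hypothetical.

```python
from collections import Counter

def quantize_bucket(value):
    """Return the power-of-two bucket label for a non-negative integer.

    The label is the largest power of two that is <= value; zero gets
    its own bucket. Negative values are out of scope for this sketch.
    """
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    if value == 0:
        return 0
    return 1 << (value.bit_length() - 1)

def quantize(values):
    """Count how many values fall into each power-of-two bucket."""
    return Counter(quantize_bucket(v) for v in values)

if __name__ == "__main__":
    sizes = [512, 700, 1024, 2048, 3000, 4096, 8192]  # hypothetical I/O sizes
    histogram = quantize(sizes)
    for bucket in sorted(histogram):
        print(f"{bucket:6d} | {'@' * histogram[bucket]} {histogram[bucket]}")
```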
35. Kernel vs User Space
• I/O is done by the kernel, so we only see "sched"
• User I/O is done via a system call to the kernel
[Diagram: "dtrace -l" covers kernel functions and system calls. I/O is in the kernel, done by sched; user-land programs (PIDs 899, 731 and 21 in the figure) make a system call, "read".]
36. io:::start: kernel, look for user syscall
• Look for the read system call
$ sudo dtrace -l | grep syscall | grep read
5425  syscall  read  entry
5426  syscall  read  return
$ sudo dtrace -lvn syscall::read:entry
  ID  PROVIDER  MODULE  FUNCTION  NAME
5425  syscall           read      entry
Argument Types
None
37. User program system call "read"
arg0 = fd
arg1 = *buf
arg2 = size
The probe lists no typed arguments, so instead of
args[2]->size
use the raw argument
arg2
$ sudo dtrace -lvn syscall::read:entry
Argument Types
None
38. Aggregate Example: readsizes.d
User land I/O
#!/usr/sbin/dtrace -s
#pragma D option quiet
syscall::read:entry
/* arg2 is the size of the I/O */
{ @read_sizes[execname] = quantize(arg2); }
java
value ------------- Distribution ------------- count
 4096 |                                         0
 8192 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2
16384 |                                         0
cat
value ------------- Distribution ------------- count
16384 |                                         0
32768 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1
65536 |                                         0
sshd
value ------------- Distribution ------------- count
 8192 |                                         0
16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 931
32768 |                                         0
39. Built in variables
• pid - process id
• tid - thread id
• execname
• timestamp - nanoseconds
• cwd - current working directory
• Probes:
  - probeprov
  - probemod
  - probefunc
  - probename
40. Built in variable examples
# cat exec.d
#!/usr/sbin/dtrace -s
/* No function name = wild card, all system calls match */
syscall:::entry
/* execname = name of the program executing; probefunc = function that fired */
{ @num[execname, probefunc] = count(); }
dtrace:::END
{ printa(" %-32s %-32s %@8d\n", @num);}
# ./exec.d
dtrace: script './exec.d' matched 236 probes
sleep     stat64          32
vmtoolsd  pollsys         37
java      pollsys         72
java      lwp_cond_wait  180
Columns: execname, function, count
41. Latency
Latency is crucial to performance analysis.
Latency = delta = end_time - start_time
DTrace probes come in pairs
• entry, return
• start, done
Take the time at the beginning and the time at the end, and take the difference
42. Latency: how long does I/O take?
Latency = delta = end_time - start_time
  - start_time: io:::start
  - end_time: io:::done
Array to hold each I/O start time:
• The array needs a unique key for each I/O
• The key could be based on (look these up in the source)
  - device = args[0]->b_edev
  - block = args[0]->b_blkno
Array: tm_start[device,block]=timestamp
44. Other ways of keying start/end
1. We used a global array
  - tm_start[device,block]=timestamp
  - Probably the best general way
2. Some people use arg0
  - tm_start[arg0]=timestamp
  - Not as clear that this is valid
3. Others use
  - self->start = timestamp;
  - This only works if the thread that fires the start probe is the same thread that fires the end probe
    • Doesn't work for io:::start, io:::done
    • Does work for nfs:::start, nfs:::done
45. Tracing vs Profiling
Tracing
• Programs run until ^C
• Can print on every probe firing
• At ^C all unprinted aggregates are printed
Profiling
• Take action every X seconds
• Special probe name
profile:::tick-1sec
Can profile at hz or ns, us, ms, sec
profile:::tick-1     once per second (no suffix means Hz)
profile:::tick-1ms   once per millisecond
46. Latency: output every second
#!/usr/sbin/dtrace -s
#pragma D option quiet
/* start: record the time, keyed by device and block number */
io:::start
{ tm_start[ args[0]->b_edev, args[0]->b_blkno] = timestamp; }
/* end: only if a start time was recorded for this device and block */
io:::done
/ tm_start[ args[0]->b_edev, args[0]->b_blkno] /
{
    this->delta =
        (timestamp - tm_start[args[0]->b_edev,args[0]->b_blkno] );
    @io = quantize(this->delta);
    /* clear the start time */
    tm_start[ args[0]->b_edev, args[0]->b_blkno] = 0;
}
/* every second: print the quantize aggregate, then clear it */
profile:::tick-1sec
{ printa(@io);
  trunc(@io);
}
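The keying logic of the script can be followed step by step in a small Python model. This sketch only mirrors the bookkeeping: the start time is stored under (device, block), the delta is computed on the matching done event, and a done event with no recorded start is skipped, like the predicate does. The event tuples are hypothetical.

```python
def io_latencies(events):
    """Pair start and done events by (device, block) and return the deltas.

    events: iterable of (kind, device, block, timestamp) tuples, where
    kind is "start" or "done" and timestamp is in nanoseconds.
    """
    tm_start = {}   # plays the role of the global array tm_start[device, block]
    deltas = []
    for kind, device, block, timestamp in events:
        key = (device, block)
        if kind == "start":
            tm_start[key] = timestamp
        elif kind == "done" and key in tm_start:
            # pop() both reads the start time and clears the entry
            deltas.append(timestamp - tm_start.pop(key))
    return deltas

if __name__ == "__main__":
    events = [
        ("start", 1, 100, 1000),
        ("start", 1, 200, 1500),
        ("done", 1, 200, 4000),
        ("done", 1, 100, 9000),
        ("done", 2, 7, 9500),   # no matching start: ignored
    ]
    print(io_latencies(events))
```

Keying by (device, block) rather than by thread matters here because, as noted for io:::start and io:::done, the completion may fire in a different thread than the one that issued the I/O.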
47. User Process Tracing
[Diagram: "dtrace -l" covers kernel functions and system calls; user-land processes (PIDs 899, 731 and 21 in the figure) are traced per process through the pid provider, e.g. pid21.]
48. Tracing User Processes
• What can you trace in Oracle?
  - $ ps -ef | grep oracle
  - Get a process id
  - $ dtrace -ln 'pid[process_id]:::entry'
  - Lists the program's functions
• What do these functions do?
  - Read the source code for MySQL
  - Guess if you are on Oracle
  - Some good blogs out there
49. Overhead
User process tracing (from Brendan Gregg)
• Don't worry too much about pid provider probe cost at < 1000 events/sec.
• At > 10,000 events/sec, pid provider probe cost will be noticeable.
• At > 100,000 events/sec, pid provider probe cost may be painful.
User process probes cost 2-15us typically, and could be slower
Kernel and system calls are cheaper to trace
• > 1,000,000 events/sec: 20% impact
For non-CPU workloads the impact may be greater
• TCP tests showed a 50% throughput drop at 160K events/sec
  - 40K interrupts/sec
50. Formatting data
Problem: formatting data is difficult in DTrace
DTrace has printf and printa (for aggregates) but …
• No floating point
• No "if-then-else", no "for-loop"
  - Use the ternary operator instead: type = probename == "op-write-done" ? "W" : "R";
• No way to access an aggregate array by index (e.g. to divide the sum of time by the sum of counts)
Solution: do formatting and calculations in perl
dtrace -n ' … ' | perl -e ' … '
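The same post-processing idea can be sketched in Python instead of perl. The example assumes a hypothetical line format that the D script would print itself, "key total_nanoseconds count", and does the floating-point division that D cannot do. Neither the format nor the `average_latency_ms` name comes from the slides.

```python
def average_latency_ms(lines):
    """Compute average latency in milliseconds per key.

    Each line is expected to hold three whitespace-separated fields:
    key, total time in nanoseconds, and event count. Lines that do not
    fit this (assumed) format, such as headers, are skipped.
    """
    averages = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        key, total_ns, count = fields[0], int(fields[1]), int(fields[2])
        if count > 0:
            averages[key] = total_ns / count / 1e6
    return averages

if __name__ == "__main__":
    sample = ["type total count", "R 12000000 4", "W 10000000 4"]
    for key, avg in average_latency_ms(sample).items():
        print(f"{key} avg {avg:.2f} ms")
```

In a pipeline the function would be fed `sys.stdin` in place of the sample list.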
51. Summary
• Structure
#!/usr/sbin/dtrace -s
name_of_something_to_trace
/ filters /
{ actions }
• List of probes
dtrace -l
• Arguments to probes
dtrace -lnv prov:mod:func:name
• Look up args in the source code: http://src.illumos.org
• Use aggregates (@) - they make DTrace easy
• Google DTrace
  - Find example programs