[INSIGHT OUT 2011] A23 Database I/O Performance: Measuring and Planning
2. Alex Gorbachev
• CTO, The Pythian Group
• Blogger
• OakTable Network member
• Oracle ACE Director
• BattleAgainstAnyGuess.com
• President, Oracle RAC SIG
2 © 2009/2010 Pythian
3. Why Companies Trust Pythian
• Recognized Leader:
• Global industry-leader in remote database administration services and consulting for Oracle,
Oracle Applications, MySQL and SQL Server
• Work with over 150 multinational companies such as Western Union, Fox Interactive Media, and
MDS Inc. to help manage their complex IT deployments
• Expertise:
• One of the world’s largest concentrations of dedicated, full-time DBA expertise.
• Global Reach & Scalability:
• 24/7/365 global remote support for DBA and consulting, systems administration, special
projects or emergency response
4. Why Measure I/O Performance?
Diagnostics & troubleshooting
Proof of impact
Capacity planning and monitoring
Platform validation / acceptance testing
5. Instrumentation:
Storage Stack vs Oracle Database
Oracle DB call ➡ Storage I/O call
1. read block ➡ UNKNOWN
2. read block
3. latch free
4. read block
5. enqueue
6. send result
We can profile a DB call, but we cannot profile an individual storage I/O call.
7. Direct Attached Storage Stack
Illustration from Guttina Srinivas's Blog - http://guttinasrinivas.wordpress.com/
8. Simplified Enterprise Storage Stack
Sample IBM Storage Stack - http://www.ibm.com/developerworks/tivoli/library/t-snaptsm1/index.html
10. Storage stack is too complex and heterogeneous to build an end-to-end I/O profile
11. Sources of I/O Performance Measurements
Database as an application consuming I/O services
MUST HAVE
Drill down into the rest of the I/O stack
ASM
Operating System
Storage arrays
Complementary ...
12. How is I/O Measured in the Database?
• I/O code paths (syscalls) are instrumented - I/O Waits
• timed_statistics=true
• Additional statistics are collected
• IO size, amount, time spent
• Granularity on different levels
• Global, session, datafile, service, module/action
• Stored in SGA as cumulative counters - X$ tables
• Externalized via V$ views
• Snapshots taken by various tools like Statspack, AWR, Snapper, etc.
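Because the SGA counters are cumulative, any per-interval figure is the difference between two snapshots. A minimal sketch of that delta logic (the statistic names are real V$SYSSTAT statistics, but the snapshot values here are made up):

```python
# Illustrative sketch: V$ views expose cumulative counters, so an interval
# measurement is always the delta between two snapshots taken by a tool
# such as Statspack, AWR or Snapper.
def interval_stats(snap_begin, snap_end):
    """Subtract two snapshots of cumulative counters (dicts: name -> value)."""
    return {name: snap_end[name] - snap_begin[name] for name in snap_begin}

snap_1 = {"physical reads": 1_000_000, "physical read total bytes": 8_192_000_000}
snap_2 = {"physical reads": 1_000_500, "physical read total bytes": 8_196_096_000}

delta = interval_stats(snap_1, snap_2)
# 500 reads and 4,096,000 bytes in the interval => 8 KB average read size
avg_read_size = delta["physical read total bytes"] / delta["physical reads"]
```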
13. WHAT Do We Measure?
Response Time
Throughput / Bandwidth
Skew & Patterns
I/O measurements are almost always
aggregate!
14. Reproducible issue?
10046 trace
response time
skew & patterns
15. Mr Tools - The Time-Saver
16. Example Profile: 4+ hour batch job
Wait Event / Syscall DURATION CALLS MEAN MIN MAX
----------------------------- ------------------------ ---------- ----------- ----------- -----------
db file sequential read 11861.295517 81.4% 201940 0.058737 0.000000 5.473023
log file switch (checkpoint.. 1941.262523 13.3% 49 39.617603 0.001443 211.405054
PL/SQL lock timer 764.452061 5.2% 765 0.999284 0.000008 1.003142
log buffer space 0.149762 0.0% 8 0.018720 0.006973 0.030125
undo segment extension 0.126689 0.0% 19 0.006668 0.001265 0.033682
6 others 0.201454 0.0% 14 0.014390 0.000004 0.059468
----------------------------- ------------------------ ---------- ----------- ----------- -----------
TOTAL (11) 14567.488006 100.0% 202795 0.071834 0.000000 211.405054
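A profile like the one above is just a group-sort-percentage pass over the raw wait records. A hypothetical sketch of that computation (the sample waits are invented; real tools such as Method R's mrskew parse the 10046 trace file itself):

```python
from collections import defaultdict

def resource_profile(waits):
    """waits: iterable of (event_name, elapsed_seconds) records, e.g. parsed
    from a 10046 trace. Returns (event, duration, pct, calls, mean) rows
    sorted by total duration, like the profile above."""
    totals = defaultdict(lambda: [0.0, 0])
    for event, ela in waits:
        totals[event][0] += ela   # accumulate duration per event
        totals[event][1] += 1     # count calls per event
    grand = sum(dur for dur, _ in totals.values())
    rows = [(event, dur, 100.0 * dur / grand, calls, dur / calls)
            for event, (dur, calls) in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)

# invented sample: two single-block reads and one log buffer wait
profile = resource_profile([("db file sequential read", 0.008),
                            ("db file sequential read", 0.012),
                            ("log buffer space", 0.020)])
```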
17. I/O Response Time Histogram
Matched event names:
db file sequential read
Options:
group = ''
name = 'db file sequential read'
where = '1'
RANGE {min <= e < max} DURATION CALLS MEAN
----------------------- ------------------------ ---------- -----------
0.000000 0.000001 0.000000 0.0% 14 0.000000
0.000001 0.000010 0.000021 0.0% 8 0.000003
0.000010 0.000100 0.008654 0.0% 180 0.000048
0.000100 0.001000 41.040579 0.3% 86617 0.000474
0.001000 0.010000 201.892556 1.7% 36305 0.005561
0.010000 0.100000 1435.417470 12.1% 66754 0.021503
0.100000 1.000000 3730.265905 31.4% 9059 0.411775
1.000000 10.000000 6452.670332 54.4% 3003 2.148741
10.000000 100.000000 0.000000 0.0% 0
100.000000 1000.000000 0.000000 0.0% 0
1000.000000 Infinity 0.000000 0.0% 0
----------------------- ------------------------ ---------- -----------
TOTAL (8) 11861.295517 100.0% 201940 0.058737
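The RANGE buckets above are powers of ten. A rough sketch of how such a log-scale histogram can be built from raw elapsed times (illustrative only; the sample values are invented):

```python
import math

def rt_histogram(elapsed_times):
    """Bucket elapsed times (seconds) into powers-of-ten ranges, mirroring
    the RANGE {min <= e < max} rows above.
    Returns {bucket_min: (duration, calls)}."""
    buckets = {}
    for e in elapsed_times:
        # floor(log10) picks the power-of-ten bucket; zero-length waits go
        # into the sub-microsecond bucket
        exp = math.floor(math.log10(e)) if e > 0 else -7
        dur, calls = buckets.get(exp, (0.0, 0))
        buckets[exp] = (dur + e, calls + 1)
    return {10.0 ** exp: v for exp, v in sorted(buckets.items())}

# invented sample: three sub-millisecond (cached) reads, one multi-second read
hist = rt_histogram([0.0004, 0.0005, 0.0006, 2.1])
# the single slow call dominates total duration despite being 1 call of 4
```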
18. Datafile Skew?
Matched event names:
db file sequential read
Options:
group = '$p1'
name = 'db file sequential read'
where = '1'
File ID DURATION CALLS MEAN MIN MAX
6 2383.052786 20.1% 40086 0.059449 0.000000 4.825304
10 2131.333101 18.0% 21568 0.098819 0.000029 5.366355
12 2065.204816 17.4% 35353 0.058417 0.000000 5.104831
7 1870.332973 15.8% 32955 0.056754 0.000000 4.954959
11 1711.504204 14.4% 39065 0.043812 0.000000 4.819981
9 1659.888036 14.0% 23735 0.069934 0.000000 5.473023
14 36.206148 0.3% 3141 0.011527 0.000063 4.442775
8 3.532841 0.0% 5877 0.000601 0.000073 0.061977
13 0.193044 0.0% 126 0.001532 0.000343 0.104574
1 0.046855 0.0% 32 0.001464 0.000000 0.022407
3 0.000713 0.0% 2 0.000357 0.000311 0.000402
TOTAL (11) 11861.295517 100.0% 201940 0.058737 0.000000 5.473023
19. Analyzing Datafile Chunks
Matched event names:
db file sequential read
Options:
group = '$p1*1000000000+int($p2*8192/1024/1024)'
name = 'db file sequential read'
where = '$ela>0.1'
File Chunk DURATION CALLS MEAN MIN MAX
------------ ------------------------ ---------- ----------- ----------- -----------
10000008570 175.587622 1.7% 120 1.463230 0.134717 4.373926
6000000381 173.669439 1.7% 119 1.459407 0.107691 3.713161
10000008566 157.199899 1.5% 102 1.541175 0.167078 4.366412
10000008565 147.466754 1.4% 98 1.504763 0.128982 4.538604
6000008641 139.614461 1.4% 90 1.551272 0.127778 4.799470
10000008567 120.733972 1.2% 89 1.356561 0.100613 4.564558
9000008223 107.619815 1.1% 73 1.474244 0.118106 5.473023
10000008563 95.949235 0.9% 72 1.332628 0.115185 3.580435
9000008224 90.483791 0.9% 79 1.145364 0.129597 5.468010
6000006191 86.307121 0.8% 78 1.106502 0.102094 3.876378
4329 others 8888.304128 87.3% 11142 0.797730 0.100035 5.366355
------------ ------------------------ ---------- ----------- ----------- -----------
TOTAL (4339) 10182.936237 100.0% 12062 0.844216 0.100035 5.473023
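The group expression `'$p1*1000000000+int($p2*8192/1024/1024)'` packs the file number (p1) and the 1 MB chunk containing block p2 (at an 8 KB block size) into a single key: file number in the high digits, chunk ordinal in the low digits. The same arithmetic in Python, reproducing the top key in the table above:

```python
BLOCK_SIZE = 8192  # datafile block size in bytes, as in the slide

def chunk_key(file_no, block_no, chunk_mb=1):
    """Pack file# (p1) and the chunk containing block# (p2) into one key,
    mirroring mrskew's group='$p1*1000000000+int($p2*8192/1024/1024)'."""
    chunk = block_no * BLOCK_SIZE // (1024 * 1024 * chunk_mb)
    return file_no * 1_000_000_000 + chunk

# block 1,097,000 of file 10 lands in 1 MB chunk 8570 (the table's top row)
key = chunk_key(10, 1_097_000)   # 10000008570
```

With `chunk_mb=16` the same block yields key 10000000535, the top row of the coarser 16 MB-chunk report on the next slide, which is exactly what dividing the expression by a further 16 does.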
20. Playing with Chunks Size
Matched event names:
db file sequential read
Options:
group = '$p1*1000000000+int($p2*8192/1024/1024/16)'
name = 'db file sequential read'
where = '$ela>0.1'
File Chunk DURATION CALLS MEAN MIN MAX
----------- ------------------------ ---------- ----------- ----------- -----------
10000000535 846.934923 8.3% 633 1.337970 0.100168 4.564558
7000000029 315.398085 3.1% 353 0.893479 0.103097 3.670991
6000000023 280.162428 2.8% 330 0.848977 0.100183 3.713161
12000000171 261.555298 2.6% 268 0.975953 0.103535 4.014043
12000000170 193.130501 1.9% 166 1.163437 0.102184 3.937978
9000000513 175.100649 1.7% 124 1.412102 0.118106 5.473023
7000000157 173.111037 1.7% 160 1.081944 0.102949 4.237775
6000000540 140.663440 1.4% 91 1.545752 0.127778 4.799470
6000000386 130.590608 1.3% 172 0.759248 0.100873 3.876378
11000000156 122.062914 1.2% 135 0.904170 0.100622 3.748086
447 others 7544.226354 74.1% 9630 0.783409 0.100035 5.468010
----------- ------------------------ ---------- ----------- ----------- -----------
TOTAL (457) 10182.936237 100.0% 12062 0.844216 0.100035 5.473023
21. Time Periods Analysis
One minute average IO response time, seconds
(chart: per-minute average I/O response time over minutes 1-244, y-axis 0 to 2.0 seconds)
22. 10046 Trace Is Expensive... NOT!
• 10046 tracing overhead is insignificant
• This sample 4+ hour batch - trace <30 MB with 300K+ lines
• 10x compressed - 3 MB
• 30 batches per night - <1GB of traces
• 10x compressed - 100 MB per night
One month of complete 10046 trace
batch history is only 3GB compressed
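The arithmetic behind the claim, as a quick sanity check (numbers from the bullets above; the 30 nights per month is my assumption):

```python
# Sanity check of the trace-volume estimate above (sizes in MB; the 10x
# compression ratio is the slide's own figure)
trace_per_batch = 30
batches_per_night = 30
compression = 10
nights_per_month = 30  # assumption

raw_per_night = trace_per_batch * batches_per_night         # 900 MB, i.e. <1 GB
compressed_per_night = raw_per_night / compression          # 90 MB, ~100 MB
compressed_per_month_gb = compressed_per_night * nights_per_month / 1024  # ~2.6 GB
```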
23. Storing 3GB of data on Amazon S3
costs less than $1 per month
24. What Does 10046 Not Buy You?
• Throughput
• Doable, but needs quite a lot of traces to enable and process
• No accounting for non-database workload
• No visibility into how each I/O call translates into “real” I/Os
• Real I/Os - requests done by the DB server OS?
• Real I/Os - requests done by a SAN controller?
• Real I/Os - requests served by a disk controller?
• Caching impact
25. Measuring Throughput
• Database
• AWR & Statspack
• Host
• OS tools like sar, iostat, DTrace
• Storage array
• Storage vendor tools like EMC Symmetrix Performance Analyzer (SPA)
26. Average values make sense only if both event arrivals and response times are perfectly randomly distributed
27. Don’t Be Trapped by Averages!
• Averaging response times
• Losing skew info
• Losing I/O call attributes
• Sizes, offsets, data blocks
• Losing scope - what transaction is this I/O request for?
• Reduced time granularity
• Traditional Statspack & AWR snaps are hourly
• sar data is captured every 5 (or 10?) minutes by default
• SAN stats are usually aggregated as coarsely as 1 hour (SPA - 5 minutes?)
28. Choosing the Aggregation Interval
• 24-hour running window
• 95% of transactions should complete within 1 second
• 99% of transactions should complete within 10 seconds
• 10 seconds is the timeout, so 1% of transactions can fail and it’s OK
• 24 hours is 86,400 seconds => 1% is 864 seconds (14.4 min)
• 1-hour intervals => a few minutes of hiccups won’t be noticeable
• 5-minute intervals => significant spikes of I/O response time will likely be noticeable
• But we really want intervals within the typical transaction response times
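To see why interval size matters, here is a small made-up calculation: a 10-minute spike to 100 ms average read latency inside an otherwise healthy hour barely registers in the hourly average:

```python
# Made-up numbers: a steady 5 ms average read latency, a 10-minute spike
# to 100 ms, at a constant 1000 reads per second.
baseline_ms, spike_ms = 5.0, 100.0
reads_per_sec = 1000
spike_sec, hour_sec = 600, 3600

total_wait_ms = reads_per_sec * (baseline_ms * (hour_sec - spike_sec)
                                 + spike_ms * spike_sec)
hourly_avg_ms = total_wait_ms / (reads_per_sec * hour_sec)
# hourly_avg_ms is ~20.8 ms: elevated, but nothing like the 100 ms users saw;
# a 5-minute interval falling inside the spike would report the full 100 ms
five_min_avg_during_spike_ms = spike_ms
```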
29. The Random Arrivals concept applies 100% to I/O calls
Detecting violations of the Random Arrivals rule requires an averaging interval close to the response time
30. Monitoring I/O Performance and SLAs
• How do your transaction SLAs translate into I/O SLAs?
• Percentile requirements
• Commit to response time according to percentile requirements at pre-defined throughput and concurrency levels
• *average* 2,000 IOPS with up to 40 concurrent I/Os
• 99% of I/Os - <10 ms, 99.9% of I/Os - <100 ms
• 1-minute sliding window
• Monitoring such SLAs - must average over 1 minute and collect response time histograms
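A toy percentile check for an SLA like the one above (the limits mirror the slide; the window data and the naive index-based percentile are illustrative, not production code):

```python
def sla_ok(window_ms, p99_limit=10.0, p999_limit=100.0):
    """Naive percentile check for one sliding window of I/O response
    times in milliseconds; the limits mirror the SLA on the slide."""
    xs = sorted(window_ms)
    n = len(xs)
    # index-based percentiles: good enough for an illustration
    p99 = xs[min(n - 1, int(n * 0.99))]
    p999 = xs[min(n - 1, int(n * 0.999))]
    return p99 < p99_limit and p999 < p999_limit

# 1000 fast I/Os pass the 99% limit, but five 150 ms stragglers break 99.9%
ok_all_fast = sla_ok([1.0] * 1000)                   # True
ok_with_tail = sla_ok([1.0] * 1000 + [150.0] * 5)    # False: p99.9 is 150 ms
```

This is exactly why averages alone cannot monitor such an SLA: the tail that breaks the 99.9% limit moves the mean by less than a millisecond.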
31. Importance of Response Time Histograms
• Including histograms in the snapshots adds more color to the averaged measures
• A histogram is an indicator of skew
• They help select the right measurement interval
• Histograms can be built on any value - not just response times
• Histogram of I/O throughput per 5-minute interval to analyze whether we have bursts of I/O activity
• Histograms appeared in Statspack reports in 10g
• Histograms appeared in AWR reports in 11g
32. A Tool to Collect Short Interval Averages
• Requirements:
• 1 minute or less intervals
• Collect system level IO waits and stats
• Collect session level IO waits and stats
• Collect IO response time histograms (system and session)
• Nice to have - per service/module/action granularity
• Production collection example (6 years old)
• Oracle 9i RAC, HP-UX 64 cores
• thousands of DB calls per second, thousands of I/O calls per second
• *All* stats and waits with 1-5 minute snaps and at logoff
• Tanel Poder’s Snapper and Sesspack
33. ASH Data for I/O Measurements?
V$ACTIVE_SESSION_HISTORY
&
DBA_HIST_ACTIVE_SESS_HISTORY
• TIME_WAITED => 11.2 documentation is misleading
• DELTA_TIME
• DELTA_READ_IO_REQUESTS/BYTES
• DELTA_WRITE_IO_REQUESTS/BYTES
34. ASH itself is
misleading for I/O performance
measurements
Sampling tends to hide short waits
invalidating it for any response time
analysis
35. AWR Sources
• DBA_HIST_EVENT_HISTOGRAM
• DBA_HIST_FILEMETRIC_HISTORY *
• DBA_HIST_FILESTATXS
• DBA_HIST_IOSTAT_DETAIL/FILETYPE/FUNCTION
• DBA_HIST_SERVICE_STAT
• DBA_HIST_SESSMETRIC_HISTORY *
• DBA_HIST_SQLSTAT
• DBA_HIST_SYSTEM_EVENT
• DBA_HIST_SYSSTAT
• DBA_HIST_SYSMETRIC_HISTORY *
* These views have granularity of 1 minute
36. AWR Example - DBA_HIST_SYSMETRIC_HISTORY
-- Physical Reads Per Sec
-- Physical Writes Per Sec
-- I/O Requests per Second
-- I/O Megabytes per Second
-- Redo Generated Per Sec
-- Average Synchronous Single-Block Read Latency
SELECT begin_time, ROUND(value,1) v
FROM dba_hist_sysmetric_history
WHERE metric_name=
'Average Synchronous Single-Block Read Latency'
ORDER BY 1;
37. V$SESSION_WAIT_HISTORY?
• The last 10 wait events for each active session.
• Column WAIT_TIME_MICRO
• Amount of time waited (in microseconds)
38. Measuring at the OS Layer
• OS is not really transparent for IO requests
• Has IO requests queues
• Utilizes various I/O schedulers that decide on requests priority
• ASYNC I/O
• Filesystems and buffered I/O
• Impact of CPU scheduling
• Time spent in the OS layer becomes important as we move to SSD and Flash storage
• Difficult to directly associate OS stats with DB stats
39. Measuring at the SAN Layer
• Normally most of IO time is spent on physical disk but...
• Read cache impact
• Write cache impact
• Cache saturation situations
• Abnormal situations like controller/switch failure
• Quality of Service (QoS)
• Flash based storage shifts the balance of time again
• Non-disk component of IO response time becomes more prominent
• Difficult to associate SAN stats with OS & DB stats
• Virtualization kicks in
40. Exadata Storage Cell Measurement
• Replacement of SAN layer
• More than just stats per disk / controller, etc.
• Storage Cell now performs more than just I/O functions
• Much better accountability and association with the database
• Database segment visibility in flash cache
• IORM metrics - category, database, consumer groups
• Flash Cache metrics
• Cumulative and 1 minute aggregates
• Some stats are passed back to the database
• V$SYSSTAT, V$SQL, waits, XML cell stats in V$CELL_STATE
41. Increased Importance of Low Latency Network
• With traditional HDD random access times of 5-10ms
➡ Communication overhead is minimal - less than 10%
• FC storage latencies are a few hundred microseconds
• NFS-mounted storage adds less than 1 ms of latency
• IP stack is heavier on CPU => impact of OS CPU scheduler
• Flash read latency is an order of magnitude shorter
➡ Suddenly an InfiniBand SAN becomes a necessity!
• microsecond latencies
42. Exadata: Flash + InfiniBand = Very Low Latency?
• Let’s check some Exadata 10046 traces...
Matched event names:
cell single block physical read
Options:
group = ''
name = 'cell single block physical read'
where = '1'
RANGE {min <= e < max} DURATION CALLS MEAN
0.000000 0.000001 0.000000 0.0% 0
0.000001 0.000010 0.000000 0.0% 0
0.000010 0.000100 0.000000 0.0% 0
0.000100 0.001000 0.191839 95.5% 310 0.000619
0.001000 0.010000 0.008983 4.5% 3 0.002994
0.010000 0.100000 0.000000 0.0% 0
43. Exadata: Flash + InfiniBand = Very Low Latency?
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sdn 0.50 0.00 188.50 0.00 1512.00 0.00 16.04 0.10 0.51 0.31 5.85
sdo 1.50 0.00 170.50 0.00 1376.00 0.00 16.14 0.14 0.79 0.38 6.40
sdp 2.50 0.00 157.00 0.00 1276.00 0.00 16.25 0.09 0.57 0.41 6.50
sdq 0.50 0.00 173.50 0.00 1392.00 0.00 16.05 0.11 0.62 0.40 7.00
sdr 0.50 0.00 166.50 0.00 1336.00 0.00 16.05 0.07 0.41 0.30 4.95
sds 1.00 0.00 175.50 0.00 1412.00 0.00 16.09 0.08 0.43 0.32 5.60
44. Measuring for Planning:
Aggregate Interval
1. Choose a large-ish interval
2. Analyze histograms - skewed inside the interval?
3. If Yes, reduce the interval
4. Repeat steps 1-3 until ...
a) you either see no skew, or ...
b) the business stops caring about skew inside that interval
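The loop above can be sketched as follows; the `skewed` predicate stands in for "analyze the histograms inside each interval" and is supplied by the analyst (everything here is hypothetical):

```python
def choose_interval(samples, interval_sec, skewed, min_interval_sec=60):
    """Hypothetical sketch of the procedure above: start with a large-ish
    aggregation interval and shrink it until the within-interval histograms
    no longer show skew. `skewed(samples, interval)` is a caller-supplied
    predicate, e.g. 'does any histogram bucket dominate total duration?'."""
    while interval_sec > min_interval_sec and skewed(samples, interval_sec):
        interval_sec //= 2   # halve instead of the slide's unspecified step
    return interval_sec

# toy predicate: pretend skew is visible at any interval longer than 5 minutes
chosen = choose_interval([], 3600, lambda s, i: i > 300)   # 225 seconds
```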
48. Measuring for Planning:
Distinguish Different Kinds of I/O
• Random vs sequential I/O
• If underlying disks are spinning media
• Small vs Large IOs
• Throughput is then measured either in IOPS or MBPS
• Reads vs Writes
• Sometimes can be generalized as the percentage of writes
49. Measuring for Planning:
Business Function Granularity
• Measure I/O at the right granularity
• Ideally per business transaction / function
• Practical - service, session, module/action, SQL
• “System” I/O - LGWR, ARCH, DBWR, etc.
• Indirect association to business transactions
• Helps build more realistic capacity planning models
51. Oracle Database CALIBRATE_IO
DBMS_RESOURCE_MANAGER.CALIBRATE_IO
(<DISKS>, <MAX_LATENCY>, iops, mbps, lat);
• iops - max reads per second (random single-block reads)
• lat - actual average single-block latency at the iops rate
• mbps - max MB/s throughput (large reads)
Limitations:
• simplistic
• read-only
• outputs max values only
• needs a database
• requires ASYNC I/O
52. ORION - ORacle I/O Numbers
• Free tool from Oracle simulating database-like IOs
• No database required
• Same I/O libs / code-path
• Still requires ASYNC I/O
• Very flexible
• Large vs Small IOs; flexible sizes; mixed
• Random vs Sequential I/O patterns; mixed
• Configurable write I/O %
• Can simulate ASM striping layout
53. ORION Example 1: Scalability Anomaly
HP blades
HP Virtual Connect
Flex10
Big NetApp box
100 disks
54. ORION Example 1: Impact of Large IOs
HP blades
HP Virtual Connect
Flex10
Big NetApp box
100 disks
55. ORION Example 1: Write IO Impact
HP blades
HP Virtual Connect
Flex10
Big NetApp box
100 disks
56. ORION Example 2: Initial Run - Failed Expectations
NetApp NAS, 1 Gbit Ethernet, 42 disks
(charts: IOPS and latency vs load level 1-100; read-only run with IOPS axis up to 5,000 and latency axis up to 30 ms; read-write run with latency axis up to 50 ms)
57. ORION Example 2: Tune-Up Results
Switched from Intel to Broadcom NICs
(charts: IOPS and latency vs load level 1-100; after the tune-up, IOPS axes reach 10,000 and 15,000 with latency axes within 12 ms and 8 ms)
60. Presenting measurements:
Visualization is the Key
61. Q&A
Email me - gorbachev@pythian.com
Read my blog - http://www.pythian.com
Follow me on Twitter - @AlexGorbachev
Join Pythian fan club on Facebook & LinkedIn