
Full Table Scan: friend or foe

Session about Full Table Scan details in the Oracle database. It covers both the costing and the execution model, plus trivia and examples.

  1. Full Table Scan: friend or foe? Mauro Pagano
  2. Mauro Pagano
     • Consultant, working with both DBAs and Devs
     • Oracle → Enkitec → Accenture Enkitec Group
     • Database Performance and SQL Tuning
     • Training, Workshops, OUG
     • Free Tools (SQLd360, TUNAs360, Pathfinder)
     • Newbie old fart (thanks Bryn! :-)
  3. Why this session?
     • “FTS is bad” is the biggest myth in SQL Tuning
     • FTS distracts people from the real root cause
     • FTS is often not performed at its best
  4. Friend or foe?
     Want to guess? Come on, try to guess…
     IT DEPENDS :-)
  5. What is a Full Table Scan?
     • SQL execution access method
       – Reads data from the table while applying filters
     • Same mechanics apply to:
       – Full (sub)Partition Scan
       – Index Fast Full Scan
     • Always available regardless of the SQL construct
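     A quick way to confirm the access method is to look at the execution plan; a minimal
     sketch against the T_FTS demo table created later in the deck (slide 10), where the
     plan is expected to show TABLE ACCESS FULL:

       explain plan for
         select owner, count(*) from t_fts group by owner;

       select * from table(dbms_xplan.display);
       -- look for an operation line such as: TABLE ACCESS FULL | T_FTS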
  6. How does it work?
     • The whole segment is read
       – From the segment header up to the (L)HWM
       – Blocks are read regardless of whether they are empty or not
     • Several blocks are read at once
       – This is key to understanding the full potential of FTS
     • Data goes into the SGA => db file scattered read
     • Data goes into the session PGA => direct path read
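     Because everything below the HWM is read, a mostly-empty segment still costs a full
     visit; a small sketch (assuming the T_FTS demo table) comparing the block count the
     optimizer sees with the blocks allocated to the segment:

       -- blocks below the HWM (what a FTS has to read) vs. blocks allocated
       select t.blocks       as blocks_below_hwm,
              t.empty_blocks,
              s.blocks       as allocated_blocks
         from dba_tables   t
         join dba_segments s
           on s.owner = t.owner and s.segment_name = t.table_name
        where t.owner = user and t.table_name = 'T_FTS';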
  7. Why does FTS rock?
     • Can crunch A LOT of data efficiently (*)
       – A couple of orders of magnitude more data than index scans per disk
     • Full Scan (~200 MB/s per disk)
       – Waits for IO seek + latency per (large) chunk
       – Parallelizes well, increasing bandwidth (GB/s)
     • Index Scan (~1.5 MB/s per disk)
       – Waits for IO seek + latency per block
     https://vimeo.com/160371916
  8. Why doesn’t FTS rock?
     • FTS reads ~100x faster than an index
     • Needs to read the whole segment
       – GBs read to return just a few rows
     • Concurrent users share the bandwidth
       – More users, less resource for each
     • An index is faster if the filters select less than ~1% of the data
       – Assuming the data all comes from disk
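     A rough back-of-envelope using the per-disk rates quoted on the previous slide
     (200 MB/s multiblock vs ~1.5 MB/s effective single-block, both assumptions rather
     than measurements) shows why the crossover sits around 1% of the data:

       -- purely illustrative arithmetic for a 10 GB table
       select round(10 * 1024 / 200)        as fts_secs_full_scan,    -- ~51s to scan 10 GB
              round(10 * 1024 * 0.01 / 1.5) as idx_secs_1pct_of_data  -- ~68s to fetch 1% block by block
         from dual;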
  9. What’s the challenge then?
     • If < 1% use an index, otherwise FTS? Easy, right?
     • NO!!!!
     • The Buffer Cache complicates things a lot
       – It saves a large % of disk reads
         • Especially for index blocks, which are touched often
       – The Buffer Cache is transitory in nature
         • No guarantee block X will be there when needed
       – Complex for the CBO to consider caching
         • Algorithms assume each read is a physical one (kind of)
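     To get a feel for how much of a segment is already cached at a given moment, the
     buffer cache can be queried directly; a sketch for the T_FTS demo table (v$bh is
     keyed on the data object id):

       select count(*) as cached_buffers
         from v$bh
        where objd = (select data_object_id
                        from dba_objects
                       where owner = user and object_name = 'T_FTS');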
  10. Demo table

      create table t_fts as select * from dba_objects;

      insert into t_fts select * from t_fts;
      /                            <<repeat a few times>>

      exec dbms_stats.gather_table_stats(user, 'T_FTS');

      TABLE_NAME    NUM_ROWS     BLOCKS PAR
      ----------- ---------- ---------- ---
      T_FTS           827878      15327 NO
  11. FTS CBO costing
      • Amount of work is xxx_TABLES.BLOCKS
      • How much can be read at once?
        – Adjusted by how much longer a multiblock read (mread) takes vs a single-block read (sread)
        – Considering the block size and IOTFRSPEED
      • db_file_multiblock_read_count
        – If set, then obeyed; if not set, then 8 is used
        – If MBRC comes from System Stats, then it is obeyed
      • Total cost is dominated by the IO cost
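      The inputs the CBO has available for this calculation can be checked directly; a
      small sketch:

        -- system statistics used for FTS costing (MBRC, SREADTIM, MREADTIM, IOSEEKTIM, IOTFRSPEED)
        select pname, pval1
          from sys.aux_stats$
         where sname = 'SYSSTATS_MAIN';

        -- is db_file_multiblock_read_count explicitly set?
        select value, isdefault
          from v$parameter
         where name = 'db_file_multiblock_read_count';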
  12. When is it a good idea to use FTS?
      • No hardcoded threshold on the % of data selected
      • Main parameters to consider
        – Table size & scan “capacity”
        – Indexes available and their “quality”
        – Caching not considered (*)
      • The CBO costs it and selects it when it is cheaper
      • Costing formula is simple and solid
        – Enhanced to support In-Memory (not Exadata)
  13. FTS CBO costing – demo table

      No system stats (IOSEEKTIM 10ms and IOTFRSPEED 4096)
      No db_file_multiblock_read_count set
      Using 8 as MBRC

      Table Stats::
        Table: T_FTS    #Rows: 827878    #Blks: 15327
      Access Path: TableScan
        Cost: 4198.59    Cost_io: 4153.00    Cost_cpu: 465137851

      Costing Formula = 1 + (#Blocks/MBRC * mread/sread)
      Doing the math: 1(1) + (15327/8 * (10 + 8*8/4)/12) ~= 4152
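      The Table Stats / Access Path lines shown above come from a CBO trace; a minimal
      sketch of how to capture one for a statement (the output lands in the session
      trace file):

        alter session set events '10053 trace name context forever, level 1';

        explain plan for select * from t_fts;

        alter session set events '10053 trace name context off';

        -- where the trace file for this session was written
        select value from v$diag_info where name = 'Default Trace File';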
  14. FTS execution
      • Oracle tries to maximize the IO size
      • Usually translates into a 1MB read size
      • db_file_multiblock_read_count
        – If set, then obeyed
        – Usually a bad idea
      • Direct path read vs db file scattered read
        – Decision NOT made by the CBO
        – Can increase pressure on the IO subsystem
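      Whether a given scan went through the buffer cache or via direct path is visible
      in the session statistics; a quick sketch for the current session:

        select n.name, s.value
          from v$mystat   s
          join v$statname n on n.statistic# = s.statistic#
         where n.name in ('physical reads cache', 'physical reads direct');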
  15. FTS execution – demo table

      No db_file_multiblock_read_count set

      PARSING IN CURSOR #140422231271976 ... sqlid='3g5guxdgz4drx'
      select * from t_fts
      ...
      WAIT #1..:'direct path read' fnum=6 fdba=328203 bcnt=13
      WAIT #1..:'direct path read' fnum=6 fdba=331408 bcnt=8
      WAIT #1..:'direct path read' fnum=6 fdba=572034 bcnt=126
      WAIT #1..:'direct path read' fnum=6 fdba=596738 bcnt=126
      WAIT #1..:'direct path read' fnum=6 fdba=596994 bcnt=126
      WAIT #1..:'direct path read' fnum=6 fdba=597890 bcnt=126
      WAIT #1..:'direct path read' fnum=6 fdba=599938 bcnt=126
      WAIT #1..:'direct path read' fnum=6 fdba=600834 bcnt=126
      WAIT #1..:'direct path read' fnum=6 fdba=600962 bcnt=126
      WAIT #1..:'direct path read' fnum=6 fdba=601090 bcnt=126
      WAIT #1..:'direct path read' fnum=6 fdba=601218 bcnt=126
      WAIT #1..:'direct path read' fnum=6 fdba=601856 bcnt=128
      WAIT #1..:'direct path read' fnum=6 fdba=601984 bcnt=128
      WAIT #1..:'direct path read' fnum=6 fdba=606848 bcnt=128
      WAIT #1..:'direct path read' fnum=6 fdba=608512 bcnt=128
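      The wait lines above come from a SQL trace with wait events enabled; a sketch of
      how to capture the same thing for your own session:

        exec dbms_monitor.session_trace_enable(waits => true, binds => false);

        select /* traced run */ count(*) from t_fts;

        exec dbms_monitor.session_trace_disable;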
  16. How did the myth start?
      • Many problems lead to FTS when it is not a good idea
        – FTS is the symptom / consequence of the issue, not the cause
        – FTS gets blamed and steals attention from the root cause
      • Some examples that affect the CBO decision:
        – Lack of a necessary index
          • For 1-2% of the data an index is usually faster, thanks to the Buffer Cache
        – Poor quality stats, for example a missing histogram (see the sketch below)
          • CBO unable to determine that specific values are very selective
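      A minimal sketch of addressing the second example: requesting a histogram on the
      skewed STATUS column of the demo table so the CBO can recognize the very selective
      values (the choice of column is just for illustration):

        begin
          dbms_stats.gather_table_stats(
            ownname    => user,
            tabname    => 'T_FTS',
            method_opt => 'for columns status size 254');
        end;
        /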
  17. And if the CBO was right?
      • A legit FTS can still underperform
      • Often not due to the Oracle database itself
      • Storage underperforming
        – Unable to provide data fast enough, small pipe
        – High latency due to high load
      • File system / ASM misunderstanding
        – Optimizations causing unstable (and puzzling) results
        – Caching making TEST vs PROD comparisons incorrect
        – File System vs ASM comparisons assuming they work the same
  18. Use FTS the right way
      • DBAs generally have little control outside Oracle
        – Make sure FTS works at its best “from the DB”
      • Provide proper stats to the CBO
        – So that FTS is selected when needed
        – No need to actively discourage FTS
        – Usually no need to gather System Stats (*)
      • Make sure the IO size is maximized (see the sketch below)
      • Let’s play Trivia! :-)
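      Part of maximizing the IO size is having extents large enough that multiblock
      reads are not cut short at extent boundaries; a quick sketch to check the extent
      size distribution of the demo table:

        select bytes / 1024 as extent_kb, count(*) as extents
          from dba_extents
         where owner = user and segment_name = 'T_FTS'
         group by bytes / 1024
         order by extent_kb;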
  19. Trivia 1 – What’s going on?

      Good execution
      WAIT #140245916217600: nam='db file scattered read' ela= 4834 file#=26 block#=16002 blocks=128
      WAIT #140245916217600: nam='db file scattered read' ela= 4020 file#=26 block#=16130 blocks=128
      WAIT #140245916217600: nam='db file scattered read' ela= 2452 file#=26 block#=16258 blocks=128

      Bad bad execution
      WAIT #140245916165224: nam='db file sequential read' ela= 124 file#=26 block#=16002 blocks=1
      WAIT #140245916165224: nam='db file scattered read' ela= 139 file#=26 block#=16004 blocks=2
      WAIT #140245916165224: nam='db file sequential read' ela= 117 file#=26 block#=16007 blocks=1
      ….<<another 38 waits here>>
      WAIT #140245916165224: nam='db file sequential read' ela= 132 file#=26 block#=16113 blocks=1
      WAIT #140245916165224: nam='db file sequential read' ela= 123 file#=26 block#=16116 blocks=1
      WAIT #140245916165224: nam='db file scattered read' ela= 142 file#=26 block#=16118 blocks=2
      WAIT #140245916165224: nam='db file scattered read' ela= 141 file#=26 block#=16121 blocks=2
      WAIT #140245916165224: nam='db file scattered read' ela= 135 file#=26 block#=16124 blocks=2
  20. Trivia 2 – What’s going on?

      Good execution
      WAIT #139702846026760: nam='db file scattered read' ela= 13508 file#=25 block#=258 blocks=128
      WAIT #139702846026760: nam='db file scattered read' ela= 9016 file#=25 block#=386 blocks=128

      Bad bad execution
      WAIT #139702845969088: nam='db file scattered read' ela= 265 file#=25 block#=306 blocks=8
      WAIT #139702845969088: nam='db file scattered read' ela= 257 file#=25 block#=314 blocks=8
      WAIT #139702845969088: nam='db file scattered read' ela= 259 file#=25 block#=322 blocks=8
      WAIT #139702845969088: nam='db file scattered read' ela= 254 file#=25 block#=330 blocks=8
  21. Trivia 3 – What’s going on?

      Bad bad execution
      WAIT #140029131327704: nam='db file scattered read' file#=26 block#=15618 blocks=128 obj#=74828
      WAIT #140029131327704: nam='db file sequential read' … bytes=8192 obj#=0
      WAIT #140029131327704: nam='db file scattered read' file#=26 block#=15746 blocks=128 obj#=74828
      WAIT #140029131327704: nam='db file sequential read' … bytes=8192 obj#=0
      WAIT #140029131327704: nam='db file scattered read' file#=26 block#=15874 blocks=128 obj#=74828
      WAIT #140029131327704: nam='db file sequential read' … bytes=8192 obj#=0
      WAIT #140029131327704: nam='db file scattered read' file#=26 block#=16002 blocks=128 obj#=74828
      …
  22. Trivia Summary
      • Make sure FTS can go full speed on paper
        – Proper extent size
        – No db_file_multiblock_read_count
        – Helping the buffered vs unbuffered decision, if needed
      • Be familiar with common causes of slowdown (see the sketch below)
        – Chained / migrated rows
        – Heavy access to UNDO when doing consistent reads (CR)
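      Both slowdowns leave a trail in the session statistics; a sketch of what to look
      at for the current session (the statistic names are standard, reading them this
      way is the heuristic from the slide):

        -- 'table fetch continued row'                           -> chained / migrated rows
        -- 'data blocks consistent reads - undo records applied' -> heavy CR work against UNDO
        select n.name, s.value
          from v$mystat   s
          join v$statname n on n.statistic# = s.statistic#
         where n.name in ('table fetch continued row',
                          'data blocks consistent reads - undo records applied');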
  23. Mixed workloads
      • FTS works well with (relatively) large scans
      • Common in analytics, less so in OLTP
        – Not necessarily a bad idea, just less common
      • “Classic OLTP” behaviors negatively affect FTS
        – Heavy concurrency, many reads from UNDO
        – Warm buffer cache, “fill the gaps” reads
      • FTS is more popular for batch-like SQLs
      • Direct path reads help a bit, at the expense of the storage
  24. Real-life case when FTS was desired
  25. Real-life case when FTS was desired

      Index scan is chosen by cost; looking into the stats:
      Formula = blevel + (ix_sel * leaf_blocks) + (ix_sel_with_filters * cluf)

      select table_name, num_rows, blocks, avg_row_len from dba_tables

      TABLE_NAME                   NUM_ROWS     BLOCKS AVG_ROW_LEN
      -------------------------- ---------- ---------- -----------
      ...                           1547082     202847         149

      select index_name, clustering_factor, leaf_blocks from dba_indexes

      INDEX_NAME     CLUSTERING_FACTOR LEAF_BLOCKS
      -------------- ----------------- -----------
      ..._IDX1                 1508791       19508
      ..._IDX2                  774974        5579
      ..._IDX3                   13345        3241   <--
      ..._PK                     16462        3097
      ..._UK1                  1662405        7707
      ..._IDX4                   26671        2814
  26. Real-life case when FTS was desired
      • Shrinking the table dropped its size to 20k blocks
      • FTS cost dropped from 60k to 6k
        – Index scan cost stayed ~the same, 13k
      • Plan switched from an Index Range Scan (IRS) to FTS
      • Elapsed time dropped to 7s (from 60s)
        – The SQL performed 20x less CR (mostly thanks to the shrink)
      • Lesson learned? FTS can be VERY efficient
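      For reference, the kind of maintenance behind this drop is a segment shrink; a
      minimal sketch on a hypothetical table T (the real table name is not in the
      slides; shrink requires an ASSM tablespace):

        alter table t enable row movement;
        alter table t shrink space cascade;   -- also shrinks dependent indexes

        -- refresh the stats so the CBO sees the new, smaller block count
        exec dbms_stats.gather_table_stats(user, 'T');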
  27. FTS improvements
      • FTS made more efficient by engineering
      • Goal is to avoid reading part of the segment
        – Reduce IO and improve response time
        – Can be row-based or column-based (columnar)
      • Skip regions with no data of interest
        – Partition pruning, based on object definition
        – Zone maps, based on data location
        – HCC, based on columns of interest
  28. Where is Oracle going?
      • New products built on the power of FTS
      • A big shift compared to the old mentality
      • Exadata
        – Needs FTS with direct path reads to offload processing
        – Skips regions thanks to Storage Indexes (enhanced in 12.2)
      • In-Memory
        – No indexes defined on IMCUs, only FTS
        – Skips regions thanks to the IM Storage Index
  29. Where is the industry going?
      • Data is growing exponentially
      • Lots of interest in Hadoop / Big Data
        – There is no index in Hadoop
        – Every scan is a FTS with pruning
      • Storage is faster now, enabling more powerful scans
        – NVMe provides GB/s per card
        – Powerful scans move the processing to the data
  30. Summary
      • FTS is a very efficient way of scanning data
      • The CBO determines when to use it
        – Suboptimal uses have other root causes
      • A few things negatively affect FTS
        – You need to know them to alleviate their effect
      • Large scans are getting more popular
        – As software improves to handle more data
        – As hardware improves to scan more, faster
  31. (no text on this slide)
  32. Contact Information
      • http://mauro-pagano.com
        – Email
          • mauro.pagano@gmail.com
        – Free tools to download
          • SQLd360
          • TUNAs360
          • Pathfinder

Editor's Notes

  • <<can show text editor>>

    PARSING IN CURSOR #1…4 ... sqlid='27uhu2q2xuu7r'
    select * from t1

    EXEC #1…4:c=0,e=17,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=361

    WAIT #1…4:'db file sequential read' file#=7 b#=110010 blocks=1
     0x23=PAGETABLE SEGMENT HEADER
    WAIT #1…4:'db file sequential read' file#=7 b#=5249 blocks=1
     0x20=FIRST LEVEL BITMAP BLOCK
    WAIT #1…4:'db file sequential read' file#=7 b#=110009 blocks=1
     0x21=SECOND LEVEL BITMAP BLOCK
    WAIT #1…4:'db file scattered read' file#=7 b#=5376 blocks=2
     0x20=FIRST LEVEL BITMAP BLOCK
     0x20=FIRST LEVEL BITMAP BLOCK
    WAIT #1…4:'direct path read' file num=7 first dba=110011 bcnt=5
     typ: 1 – DATA
  • (*) assuming there is no bottleneck in the config, for example storage oversaturated or not enough “pipe” in between

    More verbose explanation at https://vimeo.com/160371916 at minute 4:30
  • The internal parameters are:
    _db_file_optimizer_read_count
    _db_file_exec_read_count
  • (*) at least not well; there is a parameter to make the CBO “more aware” of caching, but it is rarely used

    Scan capacity is the brute force capacity of reading data
    Index quality is meant as how efficient (for example low CLUF and high selectivity) that index would be
  • (1) _table_scan_cost_plus_one

    https://support.oracle.com/epmos/faces/DocContentDisplay?id=1398860.1
  • https://fritshoogland.wordpress.com/2015/10/14/direct-path-and-buffered-reads-again/
  • Important to notice that only a few reads are consecutive; it is important to make the extent size large enough to maximize the potential of the scan

    Also the read size increased, confirming that SYSTEM extent allocation was used
  • (*) larger topic, but many clients don’t use System Stats anyway
  • Warm buffer cache, buffered reads, aka filling up the gaps

    Direct path reads can help here, assuming the storage can handle that
  • Extent Size or MBRC, can’t tell from the wait events
  • Reads from UNDO
  • --------------------------------------------------------
    DBA Ranges :
    --------------------------------------------------------
    0x01c01500 Length: 64 Offset: 0
    0:Metadata 1:Metadata 2:unformatted 3:unformatted 4:unformatted 5:unformatted 6:unformatted 7:unformatted
    8:unformatted 9:unformatted 10:unformatted 11:unformatted 12:unformatted 13:unformatted 14:unformatted 15:unformatted
    16:unformatted 17:unformatted 18:unformatted 19:unformatted 20:unformatted 21:unformatted 22:unformatted 23:unformatted
    24:unformatted 25:unformatted 26:unformatted 27:unformatted 28:unformatted 29:unformatted 30:unformatted 31:unformatted
    32:unformatted 33:unformatted 34:unformatted 35:unformatted 36:unformatted 37:unformatted 38:unformatted 39:unformatted
    40:unformatted 41:unformatted 42:unformatted 43:unformatted 44:unformatted 45:unformatted 46:unformatted 47:unformatted
    48:unformatted 49:unformatted 50:unformatted 51:unformatted 52:unformatted 53:unformatted 54:unformatted 55:unformatted
    56:unformatted 57:unformatted 58:unformatted 59:unformatted 60:unformatted 61:unformatted 62:unformatted 63:unformatted
  • Real case, client complaining SQL is taking too long

    Symptoms are: SQL takes 60 secs, used to take 10 secs
    Focusing on where the time is going we find out 90% is spent on accessing a single table

    Looking into the stats we noticed the filter at step 51 was selecting ~99% of the data
  • From the stats the CLUF is really low, about 15x smaller than #blocks, which is strange since the CLUF is normally no lower than the number of blocks that actually hold rows; it hints that most of those blocks are empty
    Also the avg row len is 150, so that #blocks value sounds too high

    Checking space, most of the blocks are almost empty:
      unformatted_blocks : 2416     unformatted_bytes : 19791872
      fs1_blocks         : 3        fs1_bytes         : 24576
      fs2_blocks         : 2        fs2_bytes         : 16384
      fs3_blocks         : 3        fs3_bytes         : 24576
      fs4_blocks         : 177881   fs4_bytes         : 1457201152
      full_blocks        : 22530    full_bytes        : 184565760
