A Study on SSD Aware Scan Operation Optimization in PostgreSQL Database


Optimizing PostgreSQL database scan operation for SSD devices

Published in: Software


  1. A Study on SSD Aware Scan Operation Optimization in PostgreSQL Database
  2. SSDs vs Traditional Spinning-Disk HDDs
  3. SSDs
     - Silicon memory chips
     - No moving parts
     - No rotational delay
     - Near-zero seek time
     - Random and sequential block access times are almost the same
  4. But ...
     - The cost models in RDBMSs are based on the characteristics of spinning-disk HDDs.
     - They assume random_block_access_time > sequential_block_access_time.
     - With SSDs this assumption is not valid. Are there opportunities for improvement?
  5. Background information
     - Scan operation: SELECT * FROM table WHERE condition
     - Selectivity
     - Scan operation alternatives in PostgreSQL:
       - Heap scan (sequential scan)
       - Bitmap index scan + bitmap heap scan
       - Index scan
  6. Our Hypothesis
     An index scan on a secondary index can perform better than the other scan operations in databases that run on SSD storage, because on SSDs the random block access cost is almost the same as the sequential block access cost.
  7. Our Hypothesis (Continued)
     SELECT * FROM table WHERE column = val
     - column is indexed (not primary)
     - correlation between primary index and secondary index is zero
  8. Methodology
     - Kingston 8GB Data Traveler
     - Dedicated PC running Ubuntu 12.04 (i5 2.3 GHz processor and 4GB system memory)
     - PostgreSQL 9.3
     - Table with 36 columns, 6,000,000 rows of data
     - SELECT * FROM table_1 WHERE column_1 > val_1 AND column_1 < val_2
     - 1.7 GB of data (with indexes)
  9. Methodology (Continued)
     - Numeric field "idx_column" indexed using a btree index
     - Correlation between primary index and secondary index = 0.000000…
     - Cardinality of the "idx_column" field is 933900
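  10. Running times by scan type (BIS = bitmap index scan, BHS = bitmap heap scan)

      Selectivity (log) | seq scan | BHS + BIS | index scan
      ------------------+----------+-----------+-----------
                     -4 |    10594 |         0 |          0
                     -3 |    10269 |         1 |          0
                     -2 |    10255 |         9 |          4
                     -1 |    10260 |        94 |         44
                      0 |    10278 |       644 |        457
                      1 |    10407 |      8794 |       4915
                      2 |    11600 |     16528 |      49395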
  11. In PostgreSQL
      - random_block_access_time = 4 * seq_block_access_time
      - This assumes spinning-disk HDDs.
      - What is the relation in SSDs? random_block_access_time = seq_block_access_time ??
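  12. Running times before and after optimization

      Selectivity (log) | Before optimization (ms) | Optimum (ms) | After optimization (ms) | Cost reduction (ms) | Cost reduction (%)
      ------------------+--------------------------+--------------+-------------------------+---------------------+-------------------
                     -4 |                        0 |            0 |                       0 |                   0 |                  -
                     -3 |                        1 |            0 |                       0 |                   1 |                100
                     -2 |                        9 |            4 |                       4 |                   5 |                 56
                     -1 |                       94 |           44 |                      44 |                  50 |                 53
                      0 |                      644 |          457 |                     457 |                 187 |                 29
                      1 |                     8794 |         4915 |                    4915 |                3879 |                 44
                      2 |                    11600 |        11600 |                   11600 |                   0 |                  0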
  13. Are we done?
      We haven't considered an important factor: the size of the table relative to the system memory.
  14. Observations
      - Sequential scan remains consistent for all system memory values. Why?
      - Both BIS + BHS and index scan drastically underperform when system memory is reduced.
      - BIS + BHS performs slightly better than index scan.
  15. So the optimization will work only in special conditions, where at least the majority of the table content can reside in main memory. Does this mean the optimization is of no use?
  16. Potential of this optimization
      - Databases with small tables
      - Embedded devices
      - Mobile phones, etc.
  17. Questions?
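
The 4:1 ratio on slide 11 corresponds to PostgreSQL's default planner settings `random_page_cost = 4.0` and `seq_page_cost = 1.0`. The sketch below is a deliberately simplified toy model, not PostgreSQL's actual cost formulas, showing how lowering that ratio to 1:1 moves the point at which an index scan stops looking cheaper than a sequential scan. The function names and the page count `PAGES` are hypothetical; only the row count comes from the slides.

```python
"""Toy scan-cost comparison. NOT PostgreSQL's real planner arithmetic."""

ROWS = 6_000_000   # table size from the methodology slide
PAGES = 200_000    # hypothetical heap page count (assumption, not from the slides)


def seq_scan_cost(pages: int, seq_page_cost: float = 1.0) -> float:
    """A sequential scan reads every page once, in order."""
    return pages * seq_page_cost


def index_scan_cost(matching_rows: int, random_page_cost: float) -> float:
    """Worst case for zero index correlation and no caching:
    every matching row costs one random page read."""
    return matching_rows * random_page_cost


def break_even_fraction(rows: int, pages: int,
                        random_page_cost: float,
                        seq_page_cost: float = 1.0) -> float:
    """Fraction of rows at which both toy costs are equal."""
    return (pages * seq_page_cost) / (rows * random_page_cost)


hdd = break_even_fraction(ROWS, PAGES, random_page_cost=4.0)
ssd = break_even_fraction(ROWS, PAGES, random_page_cost=1.0)
print(f"break-even fraction, ratio 4:1 (HDD default): {hdd:.4%}")
print(f"break-even fraction, ratio 1:1 (SSD guess):   {ssd:.4%}")
```

In this toy model the break-even fraction scales inversely with the random-to-sequential cost ratio, so a 1:1 ratio lets the index scan win over a selectivity range four times wider. The model ignores caching entirely, which is the factor slides 13 to 15 go on to examine.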

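
The figures on slides 10 and 12 can be cross-checked with a few lines of arithmetic. The sketch below takes the per-scan running times from slide 10, picks the fastest one for each selectivity, and recomputes the cost reduction against the "before optimization" column of slide 12. Treating slide 10's values as milliseconds is an assumption (only slide 12 states the unit), and the variable and function names are illustrative.

```python
# Running times copied from slide 10: (seq scan, BHS + BIS, index scan).
measured = {
    -4: (10594, 0, 0),
    -3: (10269, 1, 0),
    -2: (10255, 9, 4),
    -1: (10260, 94, 44),
    0: (10278, 644, 457),
    1: (10407, 8794, 4915),
    2: (11600, 16528, 49395),
}

# "Running times before optimization (ms)" column from slide 12.
before = {-4: 0, -3: 1, -2: 9, -1: 94, 0: 644, 1: 8794, 2: 11600}


def cost_reduction(before_ms: int, after_ms: int):
    """Return (saved ms, saved percent); percent is None when before_ms is 0."""
    saved = before_ms - after_ms
    percent = None if before_ms == 0 else round(100 * saved / before_ms)
    return saved, percent


rows = []
for selectivity, times in measured.items():
    optimum = min(times)  # fastest of the three scan types
    saved, percent = cost_reduction(before[selectivity], optimum)
    rows.append((selectivity, before[selectivity], optimum, saved, percent))
    print(selectivity, before[selectivity], optimum, saved, percent)
```

The recomputed optimum, saved milliseconds, and percentages agree with the values on slide 12, with the undefined 0/0 case at selectivity -4 shown there as "-".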