Column-Oriented
Query Processing for Row Stores
INTRODUCTION OBJECTIVE AND SCOPE
Column-oriented DBMSs have gained increasing interest due to their superior performance
for analytical workloads. Prior efforts tried to determine the possibility of simulating the
query processing techniques of column-oriented systems in row-oriented databases, in the hope
of improving their performance, especially for OLAP and data warehousing applications.
We show that column-oriented query processing can significantly improve the performance of
row-oriented DBMSs. We introduce new operators that take into account the unique
characteristics of data obtained from indexes, and exploit new technologies such as flash
SSDs and multi-core processors to boost performance. We demonstrate our approach with
an experimental study using a prototype built on a commercial row-oriented DBMS.
Recently, column-oriented database systems (also known as column stores) have been
receiving a lot of attention. The main difference between column stores and
traditional row-oriented database systems (also known as row stores) is in the way the data is
physically stored. As the names imply, the two kinds of system lay data out differently.
In row stores, the DBMS stores all the attribute values of a single row in contiguous space on
disk. Column-oriented DBMSs, on the other hand, store the data in a column-wise manner (the
values of a column are stored contiguously). This difference in data storage has an implication
for data access. In row stores, a whole row of data has to be read from disk, even if only a few
columns of that row are needed to answer a query. This means that the system may have to
read much more data than it actually needs. In column stores, only the required columns are
read from disk, although some extra processing is needed to construct the output tuples from
these individual columns. However, when it comes to data updates, column stores tend to be
at a disadvantage. Since data updates (or inserts) usually happen at row granularity, multiple
accesses are needed to insert or update the relevant columns of each updated row. In row
stores, the whole row is written in a single operation. This difference in access patterns and
performance has made column stores more suitable for workloads that are read-intensive with
few or no updates, e.g., analytical processing workloads such as those found in data
warehouses or decision support systems.
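As a rough illustration of the access-pattern difference described above, the following Python sketch lays out a tiny, made-up table both ways and counts how many field values each layout touches to sum a single column. The table contents, the function names, and the "fields read" counter are assumptions chosen for illustration; they are not measurements from any system.

```python
from typing import Dict, List, Tuple

# Hypothetical three-column table (id, name, price), for illustration only.
ROWS: List[Tuple[int, str, float]] = [
    (1, "widget", 9.50),
    (2, "gadget", 12.00),
    (3, "gizmo", 7.25),
]


def to_columns(rows: List[Tuple[int, str, float]]) -> Dict[str, list]:
    """Column layout: the values of each column are kept contiguously."""
    ids, names, prices = zip(*rows)
    return {"id": list(ids), "name": list(names), "price": list(prices)}


def total_price_row_layout(rows: List[Tuple[int, str, float]]) -> Tuple[float, int]:
    """Row layout: every field of every row is read, although only price is needed."""
    total, fields_read = 0.0, 0
    for row in rows:
        fields_read += len(row)  # the whole row is fetched
        total += row[2]
    return total, fields_read


def total_price_column_layout(columns: Dict[str, list]) -> Tuple[float, int]:
    """Column layout: only the price column is read."""
    prices = columns["price"]
    return sum(prices), len(prices)


row_total, row_fields = total_price_row_layout(ROWS)
col_total, col_fields = total_price_column_layout(to_columns(ROWS))
```

Both layouts produce the same total, but in this toy case the row layout touches three times as many field values as the column layout, because the table has three columns and the query needs one.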
Related Area

We describe a column-oriented query processing mechanism for row stores based on index-only plans, operating on single-column indexes. In our approach, only the indexes relevant to
the query are read, without reading anything from the underlying base tables. This avoids
the cost of reading entire table rows when only a few columns are referenced in
the query.
This seminar relates to DBMS and data warehousing.
Process Description

Column-oriented physical designs that can be simulated in a row store include:
- Vertically partitioning the tables in the database into a set of two-column tables, each
consisting of (key, attribute) pairs, so that only the necessary columns need to be read to
answer a query. The key is used to join these tables to reconstruct the output tuples.
- Using index-only plans: by creating a collection of indexes that cover all of the columns used
in a query, it is possible for the DBMS to answer the query without ever going to the
underlying (row-oriented) tables.
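The vertical partitioning design above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DBMS implements it: the sample rows, the column names, and the choice of the row position as the key are all assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def vertical_partition(rows: Sequence[tuple], column_names: Sequence[str]) -> Dict[str, Dict[int, object]]:
    """Split a table into one (key, attribute) table per column.

    The key is the row position here, which is an assumption of this sketch.
    """
    return {
        name: {key: row[i] for key, row in enumerate(rows)}
        for i, name in enumerate(column_names)
    }


def reconstruct(partitions: Dict[str, Dict[int, object]], needed: Sequence[str]) -> List[Tuple]:
    """Join only the needed two-column tables on the key to rebuild output tuples."""
    keys = sorted(partitions[needed[0]])
    return [tuple(partitions[name][key] for name in needed) for key in keys]


# Hypothetical sample data.
rows = [(10, "AIR", 5), (11, "RAIL", 7), (12, "AIR", 2)]
parts = vertical_partition(rows, ["orderkey", "shipmode", "quantity"])
projected = reconstruct(parts, ["orderkey", "quantity"])
```

A query that references only `orderkey` and `quantity` never touches the `shipmode` partition, which is the saving the design aims for; the price is the key join needed to stitch the columns back together.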

Prior work [5] reported that such column-oriented physical designs do not improve the performance of row stores. In fact, in their
results, these techniques performed worse than the baseline row-oriented processing.
The figure above depicts the results of their experiments (normalized relative to the row store time). This
behaviour arose mainly because they tried to employ these techniques without any changes to the
underlying systems, which caused the optimizer to make some bad decisions and use very
expensive operations. However, operators that are specially designed for column-oriented
processing can lead to a significant performance improvement.
We demonstrate the potential of column-oriented query processing in row stores, focusing on
using index-only plans, but with query operators that take advantage of the characteristics of the data
obtained from the indexes. Our prototype is based on a commercial row store (which we shall refer to
as System A). We compare the performance of our technique to the baseline performance of System
A, as well as to that of a leading open-source column store (which we shall refer to as Col-DB). Our
results show that column-oriented processing can significantly improve the performance of row
stores for analytical workloads, and even approach the performance of a column store. In summary,
we make the following contributions:
1. We show that column-oriented query processing techniques, when employed using proper
operators and mechanisms, can significantly improve the performance of a row store, and even
approach the performance of a column store. Our main focus is index-only plans.
2. We introduce new ways of performing index intersection that leverage parallel processing, to be
used in query execution using index-only plans; namely the index-merge, index-merge-join, and
index-hash-join operators. We describe the algorithms, and present a cost analysis for each operator.
3. We highlight the advantage of exploiting new hardware technologies – namely flash SSDs and
multi-core processors – to reduce query processing time.
4. We demonstrate the effectiveness of our approach with an extensive experimental study using a
prototype based on System A, and we compare the performance of our approach to those of a
baseline row store (System A) and a column store (Col-DB).
We introduce new specialized ways of combining the data obtained from the indexes that take
advantage of the unique characteristics of database indexes, instead of relying completely on the
existing database operators. Our algorithms are designed specifically to take advantage of
parallel processing whenever possible, which leads to better performance.
The new operators are:
Index Merge: Merges the sorted RID-lists associated with individual values from the index, producing
a (RID, value) list for the whole index, in RID order.
Index Merge Join: Performs an N-way merge join operation between the (RID, value) lists coming
from several indexes. The input lists have to be in RID order (e.g., produced by the Index Merge
operator). The output is a (RID, data) list, also in RID order, where the data is a collection of values.
Index Hash Join: Performs an N-way hash join operation between N (RID, data) lists. The input lists
can be the output of any of the previous two operators, another Index Hash Join operator, or the
traditional Index Scan operator that exists in most row stores. The input lists do not need to be in any
order. One of the input lists is used to build the hash table while the others are used for probing. The
output is a (RID, data) list.
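The three operators described above can be sketched in Python as follows. This is a single-threaded illustration of the input and output contracts only, not the prototype's implementation: the in-memory index layout (a dict from value to sorted RID-list), the function names, and the sample indexes are assumptions, each RID is assumed to appear at most once per input list, and the parallel execution the text mentions is left out.

```python
import heapq
from typing import Dict, Hashable, List, Sequence, Tuple

Index = Dict[Hashable, List[int]]      # value -> sorted RID-list (assumed layout)
RidValues = List[Tuple[int, object]]   # (RID, value) pairs


def index_merge(index: Index) -> RidValues:
    """Merge the per-value sorted RID-lists into one (RID, value) list in RID order."""
    streams = [[(rid, value) for rid in rids] for value, rids in index.items()]
    return list(heapq.merge(*streams, key=lambda pair: pair[0]))


def index_merge_join(lists: Sequence[RidValues]) -> List[Tuple[int, tuple]]:
    """N-way merge join of RID-ordered lists; emits RIDs present in every list."""
    if not lists:
        return []
    iters = [iter(lst) for lst in lists]
    heads = [next(it, None) for it in iters]
    out = []
    while all(h is not None for h in heads):
        target = max(h[0] for h in heads)
        if all(h[0] == target for h in heads):
            out.append((target, tuple(h[1] for h in heads)))
            heads = [next(it, None) for it in iters]
        else:
            # Advance only the lists that are behind the largest current RID.
            heads = [next(it, None) if h[0] < target else h
                     for h, it in zip(heads, iters)]
    return out


def index_hash_join(build: RidValues, probes: Sequence[RidValues]) -> List[Tuple[int, tuple]]:
    """N-way hash join: one hash table built on `build`, probed by every other list."""
    table = {rid: [value] for rid, value in build}
    for depth, probe in enumerate(probes, start=1):
        for rid, value in probe:
            slot = table.get(rid)
            if slot is not None and len(slot) == depth:
                slot.append(value)  # extend the matching entry in place
    width = len(probes) + 1
    return [(rid, tuple(vals)) for rid, vals in table.items() if len(vals) == width]


# Hypothetical single-column indexes over five rows (RIDs 0..4).
shipmode_ix: Index = {"AIR": [0, 3], "RAIL": [1, 2, 4]}
quantity_ix: Index = {5: [1, 3], 7: [0, 2, 4]}

shipmode = index_merge(shipmode_ix)
quantity = index_merge(quantity_ix)
merged = index_merge_join([shipmode, quantity])
hashed = index_hash_join(shipmode, [quantity])
```

The merge join needs its inputs in RID order and produces RID-ordered output, whereas the hash join accepts its inputs in any order; in this sketch its output simply follows the insertion order of the build list.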
Testing Technologies

- INDEX MERGE (IXMG)
- INDEX MERGE JOIN (IXMGJN)
- Parallel Processing
- INDEX HASH JOIN (IXHSJN)
- In-place update (IPU)
- Linked List (LL)
- Linked List with Partitioning (LLP)
Resources and Limitations
All of our experiments were run on an IBM System x3400 (Model 7974) server with a single
1.6 GHz dual-core Intel Xeon CPU and 4 GB of RAM, running Fedora Linux. The
server has a 1 TB Seagate SATA disk (7.2K RPM) with a 105 MB/sec sustained data rate. The
server also has an 80 GB Fusion-IO IO-Disk SSD with a read bandwidth of 700 MB/sec and a
write bandwidth of 550 MB/sec.
The numbers we report are the average of several runs, and are based on a cold buffer pool.
We built a prototype using System A. We used two TPC-H [2] data sets for our experiments,
with scale factors 10 and 30. The lineitem table (the fact table, with 16 columns) contains 60
million rows in the 10 GB database, and 180 million rows in the 30 GB database. A typical data
warehousing query involves the fact table (lineitem) and one or more dimension tables,
with predicates on the fact and/or dimension tables.
Future scope and further enhancement
Several recent papers [3, 5, 8] have tackled the issue of comparing the performance of row
stores and column stores, and proposing optimizations for row stores in order for them to
compete with column stores for analytical workloads.
The work in [5] compares row stores and column stores. In particular, the authors try to
determine whether column store performance can be achieved in row stores using
column-oriented query processing. They simulate column-oriented processing by using both vertical
partitioning and index-only plans in a commercial database system (referred to in the paper as
"System X"), and compare the performance of this system to the performance of C-Store [22].
In the figure above, System X performed very poorly with vertical partitioning, and even worse
with index-only plans. The reason for this poor performance is that System X had to join
the (RID, value) lists coming from each index using a series of 2-way hash joins, each of
which builds a new hash table, which can be very expensive.
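To make the cost of such a cascade concrete, the following Python sketch joins several (RID, value) lists with a series of 2-way hash joins and counts the hash tables built and the entries inserted along the way. It is an illustration under stated assumptions, not a model of System X: the function name, the counters, and the sample lists are hypothetical, and each RID is assumed to appear at most once per list.

```python
from typing import List, Sequence, Tuple

RidValues = List[Tuple[int, object]]  # (RID, value) pairs


def cascaded_two_way_hash_joins(lists: Sequence[RidValues]):
    """Join the lists pairwise; each step builds a fresh hash table on the running result.

    Returns (joined rows, hash tables built, hash entries inserted).
    """
    result = [(rid, (value,)) for rid, value in lists[0]]
    tables_built = 0
    entries_inserted = 0
    for probe in lists[1:]:
        table = dict(result)  # a new hash table for this 2-way join
        tables_built += 1
        entries_inserted += len(table)
        result = [(rid, table[rid] + (value,)) for rid, value in probe if rid in table]
    return result, tables_built, entries_inserted


# Hypothetical lists from three single-column indexes over RIDs 0..3.
a = [(0, "AIR"), (1, "RAIL"), (2, "AIR"), (3, "RAIL")]
b = [(0, 5), (1, 7), (2, 5), (3, 9)]
c = [(0, 0.1), (1, 0.0), (2, 0.2), (3, 0.1)]

rows, tables_built, entries_inserted = cascaded_two_way_hash_joins([a, b, c])
```

With three inputs of four RIDs each, the cascade builds two hash tables and inserts eight entries in total. An operator that builds a single table on one input and probes it with all the others would insert only four, and the gap grows with the number of indexes and the size of the intermediate results.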
Bruno [8] argues that it is possible to achieve the performance of column stores (or very close
to it) within row stores without any changes to the DBMS. The author proposes a method to
store data in a compressed form, using a separate table for each column in the original schema
(similar to the vertical partitioning idea), but with these tables (called “c-tables”) storing
data in a format similar to run-length encoding.
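Run-length encoding itself can be sketched as follows. This is generic run-length encoding of one column, offered only to illustrate the idea the text refers to; the (first position, value, run length) triple layout and the function names are assumptions of this sketch, not a description of the exact c-table schema in [8].

```python
from typing import List, Tuple

Run = Tuple[int, object, int]  # (first position, value, run length); assumed layout


def rle_encode(column: List[object]) -> List[Run]:
    """Collapse consecutive equal values into (first position, value, run length) runs."""
    runs: List[Run] = []
    for pos, value in enumerate(column):
        if runs and runs[-1][1] == value:
            first, _, count = runs[-1]
            runs[-1] = (first, value, count + 1)  # extend the current run
        else:
            runs.append((pos, value, 1))          # start a new run
    return runs


def rle_decode(runs: List[Run]) -> List[object]:
    """Expand the runs back into the original column."""
    out: List[object] = []
    for _, value, count in runs:
        out.extend([value] * count)
    return out


column = ["A", "A", "A", "B", "B", "A"]
runs = rle_encode(column)
```

In this layout, inserting a value in the middle of the column shifts the first position of every later run, which gives some intuition for why a dependency between stored tuples can make inserts costly.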
Bruno's results show a great performance improvement. However, this scheme requires creating a
c-table for every column, and creating multiple indexes on every c-table, so the storage
requirements can grow rapidly. Additionally, the dependency between the tuples in the c-table
makes inserting new rows very expensive, even for workloads with infrequent
updates.
Several optimizations have been implemented or proposed for row stores to approach
column store performance, such as super-tuples, using a column-based layout within each data
page, operating on compressed data, block processing, mirroring, invisible joins, late
materialization, and column store indexes [4, 12, 13, 15, 19, 20, 23, 24].
The idea of using multiple indexes to access data has been studied for some time. Mohan et al.
[16] introduced methods for using multiple indexes together to access base tables.
Raman et al. [21] presented two algorithms to perform the RID-list intersection operation,
which are comparable to the algorithms we use for the low-cardinality indexes.
Flash memory has received increasing interest as a stable storage medium that can overcome the
access bottlenecks of magnetic disks. Researchers have considered modifying
existing algorithms and data structures to make use of flash memory. FlashDB [17] is a
self-tuning flash-based index that dynamically adapts to the mix of reads and writes in
the workload. FlashLogging [10] exploits flash in append mode for synchronous
logging.
Conclusion

Column-oriented processing can indeed improve query performance in row stores
significantly. Such performance improvements can only be seen when changes are made to
the database system. These changes should also make use of newly emerging technologies. For
example, multi-core processors provide a level of parallelism that can achieve
better performance, and new media such as flash memory are particularly useful as
temporary storage, as they provide fast random access. Our work exploits these technologies
to bring the performance of a row store close to that of a column store while retaining
row-oriented capabilities (such as fast single-row lookups). This extends the scope for which
a single database system can be used, so that one does not need separate software for the
warehouse and the OLTP system.
References

[1] MySQL, http://www.mysql.com/.
[2] TPC-H Benchmark, http://www.tpc.org/tpch.
[3] D. J. Abadi, P. A. Boncz, and S. Harizopoulos. Column-oriented Database Systems.
PVLDB, 2(2):1664–1665, 2009.
[4] D. J. Abadi, S. Madden, and M. Ferreira. Integrating Compression and Execution in
Column-oriented Database Systems. In SIGMOD, 2006.
[5] D. J. Abadi, S. R. Madden, and N. Hachem. Column-Stores vs. Row-Stores: How
Different Are They Really? In SIGMOD, 2008.
[6] P. A. Boncz and M. L. Kersten. MIL Primitives for Querying a Fragmented World. VLDB
Journal, 8(2):101–119, 1999.
[7] P. A. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-Pipelining Query
Execution. In CIDR, 2005.
[8] N. Bruno. Teaching an Old Elephant New Tricks. In CIDR, 2009.
[9] M. Canim, G. A. Mihaila, B. Bhattacharjee, C. A. Lang, and K. A. Ross. Buffered Bloom
Filters on Solid State Storage. In ADMS, 2010.
[10] S. Chen. FlashLogging: Exploiting Flash Devices for Synchronous Logging
Performance. In SIGMOD, 2009.
[11] B. Debnath, S. Sengupta, and J. Li. FlashStore: High Throughput Persistent Key-Value
Store. In VLDB, 2010.
[12] G. Graefe. Efficient Columnar Storage in B-trees. SIGMOD Record, 36(1):3–6, 2007.
[13] A. Halverson, J. L. Beckmann, J. F. Naughton, and D. J. DeWitt. A Comparison of
C-Store and Row-Store in a Common Framework. Technical Report TR1570, University of
Wisconsin-Madison, 2006.
[14] Y.-R. Kim, K.-Y. Whang, and I.-Y. Song. Page-Differential Logging: An Efficient and
DBMS-Independent Approach for Storing Data into Flash Memory. In SIGMOD, 2010.
[15] P.-Å. Larson, C. Clinciu, E. N. Hanson, A. Oks, S. L. Price, S. Rangarajan, A. Surna,
and Q. Zhou. SQL Server Column Store Indexes. In SIGMOD, 2011.
[16] C. Mohan, D. J. Haderle, Y. Wang, and J. M. Cheng. Single Table Access Using
Multiple Indexes: Optimization, Execution, and Concurrency Control Techniques. In EDBT,
1990.
[17] S. Nath and A. Kansal. FlashDB: Dynamic Self-Tuning Database for NAND Flash. In
IPSN, 2007.
[18] S. Padmanabhan, B. Bhattacharjee, T. Malkemus, L. Cranston, and M. Huras.
Multi-Dimensional Clustering: A New Data Layout Scheme in DB2. In SIGMOD, 2003.
[19] S. Padmanabhan, T. Malkemus, R. C. Agarwal, and A. Jhingran. Block Oriented
Processing of Relational Database Operations in Modern Computer Architectures. In ICDE,
2001.
[20] R. Ramamurthy, D. J. DeWitt, and Q. Su. A Case for Fractured Mirrors. In VLDB, 2002.
[21] V. Raman, L. Qiao, W. Han, I. Narang, Y.-L. Chen, K.-H. Yang, and F.-L. Ling. Lazy,
Adaptive RID-List Intersection, and Its Application to Index Anding. In SIGMOD, 2007.
[22] M. Stonebraker, D. J. Abadi, A. Batkin, X. Chen, M. Cherniack, M. Ferreira, E. Lau, A.
Lin, S. Madden, E. O’Neil, P. O’Neil, A. Rasin, N. Tran, and S. Zdonik. C-Store: A
Column-Oriented DBMS. In VLDB, 2005.
[23] D. Tsirogiannis, S. Harizopoulos, M. A. Shah, J. L. Wiener, and G. Graefe. Query
Processing Techniques for Solid State Drives. In SIGMOD, 2009.

IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 

Column oriented Transactions

  • 2. INTRODUCTION OBJECTIVE AND SCOPE Column-oriented DBMSs have gained increasing interest due to their superior performance for analytical workloads. Prior efforts tried to determine the possibility of simulating the query processing techniques of column-oriented systems in row-oriented databases, in the hope of improving their performance, especially for OLAP and data warehousing applications. We show that column-oriented query processing can significantly improve the performance of row-oriented DBMSs. We introduce new operators that take into account the unique characteristics of data obtained from indexes, and exploit new technologies such as flash SSDs and multi-core processors to boost performance. We demonstrate our approach with an experimental study using a prototype built on a commercial row-oriented DBMS.
Recently, column-oriented database systems (also known as column stores) have been receiving a lot of attention. The main difference between column stores and the traditional row-oriented database systems (also known as row stores) is in the way the data is physically stored. In row stores, the DBMS stores all the attribute values of a single row in contiguous space on disk. Column-oriented DBMSs, on the other hand, store the data in a column-wise manner (the values of a column are stored contiguously). This difference in data storage has an implication for data access. In row stores, a whole row of data has to be read from disk, even if only a few columns of that row are needed to answer a query. This means that the system may have to read much more data than it actually needs. In column stores, only the required columns are read from disk, although some extra processing is needed to construct the output tuples from these individual columns.
However, when it comes to data updates, column stores tend to be at a disadvantage. Since data updates (or inserts) usually happen at row granularity, multiple accesses are needed to insert or update the relevant columns of each updated row. In row stores, the whole row is written in a single operation. This difference in access patterns and performance has made column stores more suitable for workloads that are read-intensive with few or no updates, e.g., analytical processing workloads such as those found in data warehouses or decision support systems.
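The read/write trade-off described above can be made concrete with a small back-of-the-envelope sketch. It is only an illustration under simplifying assumptions that are not in the slides: every value has the same fixed width, there is no compression, and page and index overheads are ignored. The function names are hypothetical.

```python
def bytes_read_row_store(num_rows, num_columns, bytes_per_value):
    # A row store reads whole rows, so every column is fetched
    # regardless of how many columns the query references.
    return num_rows * num_columns * bytes_per_value


def bytes_read_column_store(num_rows, columns_needed, bytes_per_value):
    # A column store reads only the columns the query references.
    return num_rows * columns_needed * bytes_per_value


def writes_per_inserted_row(layout, num_columns):
    # Inserting one row touches one location in a row store,
    # but one location per column in a column store.
    return 1 if layout == "row" else num_columns


if __name__ == "__main__":
    rows, cols, width = 1_000, 16, 8  # hypothetical table shape
    print(bytes_read_row_store(rows, cols, width))
    print(bytes_read_column_store(rows, 2, width))
    print(writes_per_inserted_row("row", cols),
          writes_per_inserted_row("column", cols))
```

Under these assumptions a query touching 2 of 16 columns reads one eighth of the data in the column layout, while a single-row insert costs 16 separate writes instead of one.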
  • 3. Related Area We describe a column-oriented query processing mechanism for row stores based on index-only plans, operating on single-column indexes. In our approach, only the indexes relevant to the query are read, without reading anything from the underlying base tables. This avoids the cost of reading entire table rows when only a few columns are referenced in the query. This seminar relates to DBMS and data warehousing.
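As a rough illustration of an index-only plan over single-column indexes, the sketch below answers a query of the form "SELECT output_column WHERE filter_column = value" by consulting only the indexes. It is a toy in-memory model, not the prototype's implementation: the index is assumed to be a mapping from each value to a sorted list of row identifiers (RIDs), and `build_index` and `index_only_select` are hypothetical helper names.

```python
from collections import defaultdict


def build_index(table, column):
    """Build a single-column index: value -> sorted list of RIDs."""
    index = defaultdict(list)
    for rid, row in enumerate(table):
        index[row[column]].append(rid)  # RIDs arrive in increasing order
    return dict(index)


def index_only_select(indexes, filter_column, filter_value, output_column):
    """Answer the query from the indexes alone; the base table is never read."""
    matching = set(indexes[filter_column].get(filter_value, []))
    result = []
    for value, rids in indexes[output_column].items():
        for rid in rids:
            if rid in matching:
                result.append((rid, value))
    result.sort()  # present the (RID, value) pairs in RID order
    return result


if __name__ == "__main__":
    table = [
        {"qty": 5, "flag": "A", "comment": "unused wide column"},
        {"qty": 7, "flag": "B", "comment": "unused wide column"},
        {"qty": 5, "flag": "B", "comment": "unused wide column"},
        {"qty": 9, "flag": "A", "comment": "unused wide column"},
    ]
    indexes = {c: build_index(table, c) for c in ("qty", "flag")}
    print(index_only_select(indexes, "flag", "A", "qty"))
```

Once the two indexes exist, the `comment` column is never touched, which is the saving the approach relies on when only a few columns are referenced.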
  • 4. Process Description Prior efforts simulated column-oriented designs in row stores. These column-oriented designs include:
      • Vertically partitioning the tables in the database into a set of two-column tables, each consisting of (key, attribute) pairs, so that only the necessary columns need to be read to answer a query. The key is used to join these tables to reconstruct the output tuples.
      • Using index-only plans: by creating a collection of indexes that cover all of the columns used in a query, it is possible for the DBMS to answer a query without ever going to the underlying (row-oriented) tables.
Prior work [5] found that such column-oriented physical designs do not improve the performance of row stores. In fact, in its results, these techniques perform worse than the baseline row-oriented processing. The figure above depicts the results of those experiments (normalized relative to the row store time). This behaviour arose mainly because the techniques were employed without any changes to the underlying system, which caused the optimizer to make some bad decisions and use very expensive operations. However, operators that are specially designed for column-oriented processing can lead to a significant performance improvement.
We demonstrate the potential of column-oriented query processing in row stores, focusing on index-only plans, but with query operators that take advantage of the characteristics of the data obtained from the indexes. Our prototype is based on a commercial row store (which we shall refer to as System A). We compare the performance of our technique to the baseline performance of System A, as well as to that of a leading open-source column store (which we shall refer to as Col-DB). Our results show that column-oriented processing can significantly improve the performance of row stores for analytical workloads, and even approach the performance of a column store. In summary, we make the following contributions:
  • 5. 1. We show that column-oriented query processing techniques, when employed using proper operators and mechanisms, can significantly improve the performance of a row store, and even approach the performance of a column store. Our main focus is index-only plans.
2. We introduce new ways of performing index intersection that leverage parallel processing, to be used in query execution using index-only plans; namely the index-merge, index-merge-join, and index-hash-join operators. We describe the algorithms, and present a cost analysis for each operator.
3. We highlight the advantage of exploiting new hardware technologies, namely flash SSDs and multi-core processors, to reduce query processing time.
4. We demonstrate the effectiveness of our approach with an extensive experimental study using a prototype based on System A, and we compare the performance of our approach to those of a baseline row store (System A) and a column store (Col-DB).
We describe a column-oriented query processing mechanism for row stores based on index-only plans, operating on single-column indexes. In our approach, only the indexes relevant to the query are read, without reading anything from the underlying base tables. This avoids the cost of reading entire table rows when only a few columns are referenced in the query. We introduce new specialized ways of combining the data obtained from the indexes that take advantage of the unique characteristics of database indexes, instead of relying completely on the existing database operators. Our algorithms are designed specifically to take advantage of parallel processing whenever possible, which leads to better performance. The new operators are:
      • Index Merge: Merges the sorted RID-lists associated with individual values from the index, producing a (RID, value) list for the whole index, in RID order.
      • Index Merge Join: Performs an N-way merge join operation between the (RID, value) lists coming from several indexes. The input lists have to be in RID order (e.g., produced by the Index Merge operator). The output is a (RID, data) list, also in RID order, where the data is a collection of values.
      • Index Hash Join: Performs an N-way hash join operation between N (RID, data) lists. The input lists can be the output of any of the previous two operators, another Index Hash Join operator, or the traditional Index Scan operator that exists in most row stores. The input lists do not need to be in any order. One of the input lists is used to build the hash table while the others are used for probing. The output is a (RID, data) list.
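The three operators described above can be sketched in a few lines of Python. This is a sequential, in-memory illustration written from the descriptions on this slide only; it is not the prototype's code and it leaves out the parallel processing the operators are designed for. It assumes an index is a mapping from each value to a sorted RID list and that a RID appears at most once per input list; the function names are hypothetical.

```python
import heapq


def index_merge(index):
    """Merge per-value sorted RID lists into one (RID, value) list in RID order."""
    streams = [[(rid, value) for rid in rids] for value, rids in index.items()]
    return list(heapq.merge(*streams, key=lambda pair: pair[0]))


def index_merge_join(*lists):
    """N-way merge join of RID-ordered (RID, value) lists.

    Emits (RID, tuple_of_values) for every RID present in all inputs.
    """
    if not lists:
        return []
    positions = [0] * len(lists)
    output = []
    while all(pos < len(lst) for pos, lst in zip(positions, lists)):
        current = [lst[pos][0] for pos, lst in zip(positions, lists)]
        target = max(current)
        if all(rid == target for rid in current):
            values = tuple(lst[pos][1] for pos, lst in zip(positions, lists))
            output.append((target, values))
            positions = [pos + 1 for pos in positions]
        else:
            # Advance only the cursors that lag behind the largest current RID.
            positions = [pos + 1 if rid < target else pos
                         for pos, rid in zip(positions, current)]
    return output


def index_hash_join(build_list, *probe_lists):
    """N-way hash join: one hash table is built once, the other inputs probe it."""
    table = {rid: [data] for rid, data in build_list}
    for slot, probe in enumerate(probe_lists, start=1):
        for rid, data in probe:
            entry = table.get(rid)
            # Extend only entries that matched every earlier probe input.
            if entry is not None and len(entry) == slot:
                entry.append(data)
    width = 1 + len(probe_lists)
    return [(rid, tuple(vals)) for rid, vals in table.items() if len(vals) == width]


if __name__ == "__main__":
    qty = index_merge({5: [0, 2], 7: [1], 9: [3]})
    flag = index_merge({"A": [0, 3], "B": [1, 2]})
    print(index_merge_join(qty, flag))
    print(index_hash_join([(2, 5), (0, 5)], flag))
```

Note that the hash join builds a single hash table for all N inputs, whereas a chain of 2-way hash joins would build a new table at each step.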
  • 6. Testing Technologies
      • Index Merge (IXMG)
      • Index Merge Join (IXMGJN)
      • Parallel Processing
      • Index Hash Join (IXHSJN)
      • In-place update (IPU)
      • Linked List (LL)
      • Linked List with Partitioning (LLP)
  • 7. Resources and Limitations All of our experiments were run on an IBM System x3400 (Model 7974) server with a single 1.6 GHz dual-core Intel Xeon CPU and 4 GB of RAM, running Fedora Linux. The server has a 1 TB Seagate SATA disk (7.2K RPM) with a sustained data rate of 105 MB/sec. The server also has an 80 GB Fusion-IO IO-Disk SSD with a read bandwidth of 700 MB/sec and a write bandwidth of 550 MB/sec. The numbers we report are the average of several runs, and are based on a cold buffer pool. We built a prototype using System A. We used two TPC-H [2] data sets for our experiments, with scale factors 10 and 30. The lineitem table (the fact table, with 16 columns) contains 60 million rows in the 10 GB database and 180 million rows in the 30 GB database. A typical data warehousing query involves the fact table (lineitem) and one or more dimension tables, with predicates on the fact and/or dimension tables.
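To put the two bandwidth figures above in perspective, the following sketch estimates how long a purely sequential read of a given amount of data would take on each device. It is a rough lower bound under assumptions that are not in the slides: 1 GB is taken as 1000 MB, the device sustains its rated bandwidth for the whole read, and seek time, CPU cost, and buffering are ignored. It does not reproduce any measured result.

```python
DISK_MB_PER_SEC = 105       # sustained rate of the SATA disk, from the slide
SSD_READ_MB_PER_SEC = 700   # read bandwidth of the flash SSD, from the slide


def sequential_read_seconds(data_gb, bandwidth_mb_per_sec):
    """Idealized time to stream data_gb gigabytes (1 GB assumed = 1000 MB)."""
    return data_gb * 1000 / bandwidth_mb_per_sec


if __name__ == "__main__":
    for size_gb in (10, 30):  # the two database sizes used in the experiments
        disk = sequential_read_seconds(size_gb, DISK_MB_PER_SEC)
        ssd = sequential_read_seconds(size_gb, SSD_READ_MB_PER_SEC)
        print(f"{size_gb} GB: disk {disk:.1f} s, SSD {ssd:.1f} s")
```

The ratio between the two estimates is simply 700 / 105, so the SSD streams the same data roughly 6.7 times faster in this idealized model.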
  • 8. Future scope and further enhancement Several recent papers [3, 5, 8] have tackled the issue of comparing the performance of row stores and column stores, and of proposing optimizations that let row stores compete with column stores for analytical workloads.
The work in [5] compares row stores and column stores. In particular, the authors try to determine whether column store performance can be achieved in row stores using column-oriented query processing. They simulate column-oriented processing by using both vertical partitioning and index-only plans in a commercial database system (referred to in the paper as "System X"), and compare the performance of this system to the performance of C-Store [22]. In the figure above, System X performed very poorly with vertical partitioning, and even worse with index-only plans. The reason for this poor performance is that System X had to join the (RID, value) lists coming from each index using a series of 2-way hash joins, each of which builds a new hash table, which can be very expensive.
Bruno [8] argues that it is possible to achieve the performance of column stores (or very close to it) within row stores without any changes to the DBMS. The author proposes a method to store data in a compressed form, using a separate table for each column in the original schema (similar to the vertical partitioning idea), but with these tables (called "c-tables") storing data in a format similar to run-length encoding. The results show great performance improvement. However, this scheme requires creating a c-table for every column, and creating multiple indexes on every c-table, so the storage requirements can grow rapidly. Additionally, the dependency between the tuples in a c-table makes inserting new rows very expensive, even for workloads with infrequent updates.
Several optimizations have been implemented or proposed for row stores to approach column-store performance, such as super-tuples, using a column-based layout within each data page, operating on compressed data, block processing, mirroring, invisible joins, late materialization, and column-store indexes [4, 12, 13, 15, 19, 20, 23, 24].
The idea of using multiple indexes to access data has been studied for some time. Mohan et al. [16] introduced methods for using multiple indexes together to access base tables. Raman et al. [21] presented two algorithms to perform the RID-list intersection operation, which are comparable to the algorithms we use for the low-cardinality indexes.
Flash memory has received increasing interest as a stable storage medium that can overcome the access bottlenecks of magnetic disks. Researchers have considered modifying existing algorithms and data structures to make use of flash memory. FlashDB [17] is a self-tuning flash-based index that dynamically adapts to the mix of reads and writes in the workload. FlashLogging [10] exploits flash in append mode for synchronous logging.
  • 9. Conclusion Column-oriented processing can indeed improve query performance in row stores significantly. Such performance improvements can only be seen when changes are made to the database system. These changes should also make use of newly emerging technologies. For example, multi-core processors provide a level of parallelism that can achieve better performance, and new media such as flash memory are particularly useful as temporary storage, since they provide fast random access. Our work exploits these technologies to bring the performance of a row store close to that of a column store while retaining row-oriented capabilities (such as fast single-row lookups). This extends the scope for which a single database system can be used, so that one does not need separate software for the warehouse and the OLTP system.
  • 10. References
[1] MySQL, http://www.mysql.com/.
[2] TPC-H Benchmark, http://www.tpc.org/tpch.
[3] D. J. Abadi, P. A. Boncz, and S. Harizopoulos. Column-oriented Database Systems. PVLDB, 2(2):1664–1665, 2009.
[4] D. J. Abadi, S. Madden, and M. Ferreira. Integrating Compression and Execution in Column-oriented Database Systems. In SIGMOD, 2006.
[5] D. J. Abadi, S. R. Madden, and N. Hachem. Column-Stores vs. Row-Stores: How Different Are They Really? In SIGMOD, 2008.
[6] P. A. Boncz and M. L. Kersten. MIL Primitives for Querying a Fragmented World. VLDB Journal, 8(2):101–119, 1999.
[7] P. A. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-Pipelining Query Execution. In CIDR, 2005.
[8] N. Bruno. Teaching an Old Elephant New Tricks. In CIDR, 2009.
[9] M. Canim, G. A. Mihaila, B. Bhattacharjee, C. A. Lang, and K. A. Ross. Buffered Bloom Filters on Solid State Storage. In ADMS, 2010.
[10] S. Chen. FlashLogging: Exploiting Flash Devices for Synchronous Logging Performance. In SIGMOD, 2009.
[11] B. Debnath, S. Sengupta, and J. Li. FlashStore: High Throughput Persistent Key-Value Store. In VLDB, 2010.
[12] G. Graefe. Efficient Columnar Storage in B-trees. SIGMOD Record, 36(1):3–6, 2007.
[13] A. Halverson, J. L. Beckmann, J. F. Naughton, and D. J. DeWitt. A Comparison of C-Store and Row-Store in a Common Framework. Technical Report TR1570, University of Wisconsin-Madison, 2006.
[14] Y.-R. Kim, K.-Y. Whang, and I.-Y. Song. Page-Differential Logging: An Efficient and DBMS-Independent Approach for Storing Data into Flash Memory. In SIGMOD, 2010.
[15] P.-Å. Larson, C. Clinciu, E. N. Hanson, A. Oks, S. L. Price, S. Rangarajan, A. Surna, and Q. Zhou. SQL Server Column Store Indexes. In SIGMOD, 2011.
[16] C. Mohan, D. J. Haderle, Y. Wang, and J. M. Cheng. Single Table Access Using Multiple Indexes: Optimization, Execution, and Concurrency Control Techniques. In EDBT, 1990.
[17] S. Nath and A. Kansal. FlashDB: Dynamic Self-Tuning Database for NAND Flash. In IPSN, 2007.
[18] S. Padmanabhan, B. Bhattacharjee, T. Malkemus, L. Cranston, and M. Huras. Multi-Dimensional Clustering: A New Data Layout Scheme in DB2. In SIGMOD, 2003.
[19] S. Padmanabhan, T. Malkemus, R. C. Agarwal, and A. Jhingran. Block Oriented Processing of Relational Database Operations in Modern Computer Architectures. In ICDE, 2001.
[20] R. Ramamurthy, D. J. DeWitt, and Q. Su. A Case for Fractured Mirrors. In VLDB, 2002.
[21] V. Raman, L. Qiao, W. Han, I. Narang, Y.-L. Chen, K.-H. Yang, and F.-L. Ling. Lazy, Adaptive RID-List Intersection, and Its Application to Index Anding. In SIGMOD, 2007.
[22] M. Stonebraker, D. J. Abadi, A. Batkin, X. Chen, M. Cherniack, M. Ferreira, E. Lau, A. Lin, S. Madden, E. O’Neil, P. O’Neil, A. Rasin, N. Tran, and S. Zdonik. C-Store: A Column-Oriented DBMS. In VLDB, 2005.
[23] D. Tsirogiannis, S. Harizopoulos, M. A. Shah, J. L. Wiener, and G. Graefe. Query Processing Techniques for Solid State Drives. In SIGMOD, 2009.