5. Reduced IO
• Fetches only needed columns from disk
– Example: SELECT C2, SUM (C3) … reads only C2 and C3; C1, C4, C5 and C6 are not fetched
• Columns are compressed
• Less IO
• Better buffer hit rates
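
The IO saving can be illustrated with a toy model in Python. This is only a sketch: the table contents are invented, the helper names are hypothetical, and the query is assumed to group by C2 (the slide elides the rest of the statement). It counts how many values each layout has to read to answer a query that touches only C2 and C3.

```python
# Toy table with six columns C1..C6 (values are made up for illustration).
COLUMNS = ["C1", "C2", "C3", "C4", "C5", "C6"]
rows = [
    (1, "a", 10, 0.5, "x", True),
    (2, "b", 20, 1.5, "y", False),
    (3, "a", 30, 2.5, "z", True),
]

# Column-wise layout: one list of values per column.
column_store = {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def values_read_row_store(table_rows):
    """A row store reads every value of every row it scans."""
    return sum(len(row) for row in table_rows)


def values_read_column_store(store, needed):
    """A column store reads only the columns the query references."""
    return sum(len(store[name]) for name in needed)


def sum_c3_by_c2(store):
    """Aggregate C3 per C2 value using only those two columns."""
    totals = {}
    for key, value in zip(store["C2"], store["C3"]):
        totals[key] = totals.get(key, 0) + value
    return totals
```

In this model the row layout touches all 18 values, while the column layout touches only the 6 values in C2 and C3.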
6. New query execution technology
• Batch mode execution of some operations
– processes rows in batches
– groups of batch operations in query plan
• Better parallelism, better algorithms
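
The difference between row-at-a-time and batch processing can be sketched in Python as follows. This is an illustration of the general idea only, not the engine's actual implementation: `BATCH_SIZE` and all function names are hypothetical, and the batch size is kept tiny so the effect is visible.

```python
from itertools import islice

BATCH_SIZE = 4  # hypothetical value, chosen small for illustration


def batches(values, size=BATCH_SIZE):
    """Yield consecutive lists of up to `size` values."""
    it = iter(values)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def sum_row_at_a_time(values, threshold):
    """Filter and sum with one operator invocation per row."""
    total = 0
    calls = 0
    for v in values:
        calls += 1
        if v > threshold:
            total += v
    return total, calls


def sum_batch_mode(values, threshold, size=BATCH_SIZE):
    """Filter and sum with one operator invocation per batch of rows."""
    total = 0
    calls = 0
    for batch in batches(values, size):
        calls += 1
        total += sum(v for v in batch if v > threshold)
    return total, calls
```

Both functions return the same total; the batch version simply pays the per-invocation overhead once per batch instead of once per row.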
7. Dictionary-based compression
• Dictionary is built on the fly with all distinct values.
• Non-selective values are substituted with their ID (code).
• Index in our example: 6 bits per row.

Internal dictionary:
Year of Birth    Code
1996             1
1975             15
1948             50
1932             58
…                60

Fact column before and after compression:
Year of Birth    Code (compressed fact)
1996             1
1975             15
1975             15
1948             50
1932             58
…                60
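
A minimal Python sketch of dictionary encoding, under stated assumptions: `build_dictionary` assigns sequential codes in first-seen order, which is a simplification and not necessarily how the engine assigns IDs, and all function names are hypothetical. The `slide_dictionary` entries are the ones shown on the slide, with the elided entries left out.

```python
def build_dictionary(values):
    """Assign an integer code to each distinct value, in first-seen order."""
    dictionary = {}
    for v in values:
        if v not in dictionary:
            dictionary[v] = len(dictionary)
    return dictionary


def encode(values, dictionary):
    """Replace each value with its dictionary code."""
    return [dictionary[v] for v in values]


def decode(codes, dictionary):
    """Map codes back to the original values."""
    reverse = {code: value for value, code in dictionary.items()}
    return [reverse[c] for c in codes]


def bits_per_code(dictionary):
    """Bits needed to store the largest code (at least 1)."""
    return max(1, max(dictionary.values()).bit_length())


# Dictionary entries taken from the slide; elided entries are omitted.
slide_dictionary = {1996: 1, 1975: 15, 1948: 50, 1932: 58}
years = [1996, 1975, 1975, 1948, 1932]
codes = encode(years, slide_dictionary)
```

With the slide's codes, the largest code fits in 6 bits, which matches the "6 bits per row" figure, compared with the wider integer needed to store the year itself.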
8. Segments
• Column segment contains values from one column for a set of about 1M rows
• Column segments are compressed
• Each column segment is stored in a separate LOB
• Column segment is the unit of transfer from disk
[Figure: columns C1 to C6 divided into sets of about 1M rows; one column's values within one set form a column segment]
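
The segment layout can be sketched in Python as below. This is a toy model under several assumptions: `SEGMENT_ROWS` is tiny instead of about 1M, `zlib` stands in for the engine's own compression, a `bytes` object stands in for a LOB, and all names are hypothetical.

```python
import zlib
from array import array

SEGMENT_ROWS = 4  # stand-in for "about 1M rows", kept tiny for illustration


def build_segments(column_values, segment_rows=SEGMENT_ROWS):
    """Split one integer column into row ranges and compress each range separately."""
    segments = []
    for start in range(0, len(column_values), segment_rows):
        chunk = array("q", column_values[start:start + segment_rows])
        segments.append(zlib.compress(chunk.tobytes()))
    return segments


def read_segment(blob):
    """Decompress a single segment without touching any other segment."""
    chunk = array("q")
    chunk.frombytes(zlib.decompress(blob))
    return list(chunk)


table = {"C1": list(range(10)), "C2": [7] * 10}
# One independent list of blobs per column; each blob is a separate unit to fetch.
stored = {name: build_segments(values) for name, values in table.items()}
```

Because each segment is compressed and stored on its own, a reader can fetch, for example, only the second segment of C1 with `read_segment(stored["C1"][1])` and leave everything else untouched.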
11. Best practices / worst practices
• Best practices:
– Put columnstore indexes on large tables only.
– Include every column of the table in the columnstore
index.
– Structure your queries as star joins with grouping and
aggregation as much as possible.
• Worst practices (avoid these):
– JOIN and/or filter on string columns in the table
with columnstore index.
– OUTER JOIN, UNION ALL, IN/NOT IN.
– JOIN between two fact tables.