When query execution is slow, a few questions arise. Where should you look for resource utilization? What tools can you use to analyze CPU, disk, and RAM bottlenecks? Can you do anything to reduce query execution time? MariaDB's Patrick LeBlanc and Roman Nozdrin cover both ColumnStore's query-execution introspection tools and the operating-system capabilities everyone should know about. They also discuss a number of real-life use cases: some called for configuration changes, while others forced serious changes in the code.
2. What will we cover today?
● Relevant but rarely used Columnstore.xml settings
● Computer resources overview
● CS monitoring and insights tools
● Query performance tips
...No magic wand certificates though
3. Query Performance Overview
● Does a query run fast enough?
● If not, why?
● Can we speed the query processing up?
4. Row-oriented vs. Column-oriented format
Row-oriented storage (rows stored sequentially):

ID | Fname    | Lname | State | Zip   | Phone          | Age | Sex
 1 | Bugs     | Bunny | NY    | 11217 | (718) 938-3235 |  34 | M
 2 | Yosemite | Sam   | CA    | 95389 | (209) 375-6572 |  52 | M
 3 | Daffy    | Duck  | NY    | 10013 | (212) 227-1810 |  35 | M
 4 | Elmer    | Fudd  | ME    | 04578 | (207) 882-7323 |  43 | M
 5 | Witch    | Hazel | MA    | 01970 | (978) 744-0991 |  57 | F

Column-oriented storage (each column in its own file):

ID:    1, 2, 3, 4, 5
Fname: Bugs, Yosemite, Daffy, Elmer, Witch
Lname: Bunny, Sam, Duck, Fudd, Hazel
State: NY, CA, NY, ME, MA
Zip:   11217, 95389, 10013, 04578, 01970
Phone: (718) 938-3235, (209) 375-6572, (212) 227-1810, (207) 882-7323, (978) 744-0991
Age:   34, 52, 35, 43, 57
Sex:   M, M, M, M, F
SELECT Fname FROM Table1 WHERE State = 'NY'
● Row oriented
○ Rows are stored sequentially in a file
○ Scans through every record, row by row
● Column oriented
○ Each column is stored in a separate file
○ Scans only the relevant column
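The difference can be sketched in a few lines of plain Python (a toy model, not ColumnStore code): count how many values each layout must touch to answer the query above.

```python
# Toy model of row- vs. column-oriented scans for:
#   SELECT Fname FROM Table1 WHERE State = 'NY'
rows = [
    (1, "Bugs", "Bunny", "NY"), (2, "Yosemite", "Sam", "CA"),
    (3, "Daffy", "Duck", "NY"), (4, "Elmer", "Fudd", "ME"),
    (5, "Witch", "Hazel", "MA"),
]

def row_scan(rows):
    """Row-oriented: every field of every record is touched."""
    touched, result = 0, []
    for r in rows:
        touched += len(r)              # the whole record is read
        if r[3] == "NY":
            result.append(r[1])
    return result, touched

# Column-oriented: only the State and Fname "files" are read.
state_col = [r[3] for r in rows]
fname_col = [r[1] for r in rows]

def column_scan(state_col, fname_col):
    touched = len(state_col)           # scan the filter column
    hits = [i for i, s in enumerate(state_col) if s == "NY"]
    touched += len(hits)               # fetch only matching Fname values
    return [fname_col[i] for i in hits], touched

print(row_scan(rows))                     # (['Bugs', 'Daffy'], 20)
print(column_scan(state_col, fname_col))  # (['Bugs', 'Daffy'], 7)
```

Same answer, but the column layout touches 7 values instead of 20; the gap widens with row count and column width.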
5. Data Loading and Extents

First data load: CSV file, data range 1 ~ 200, 16 million rows
  → Extent 1: Min 1,   Max 100 (8 million rows)
  → Extent 2: Min 105, Max 200 (8 million rows)

Second data load: new CSV file, data range 150 ~ 210, 16 million rows
  → Extent 3: Min 150, Max 165 (8 million rows)
  → Extent 4: Min 162, Max 192 (8 million rows)
7. Data Ingestion
● Load data ordered by the columns you filter most often for maximum IO elimination
● If you want to drop partitions based on a particular column, order by that column first
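The per-extent Min/Max ranges from the previous slide are what extent elimination works with: an extent whose range cannot contain the filter value is skipped without any IO. A rough Python sketch (extent figures taken from the slide):

```python
# Sketch of extent elimination: skip extents whose (min, max) range
# cannot satisfy an equality filter.
extents = [(1, 100), (105, 200), (150, 165), (162, 192)]

def extents_to_scan(extents, value):
    """Return the indexes of extents that might contain `value`."""
    return [i for i, (lo, hi) in enumerate(extents) if lo <= value <= hi]

# WHERE col = 50 only needs Extent 1; the other three are eliminated.
print(extents_to_scan(extents, 50))    # [0]
# WHERE col = 163 must scan three extents, because the second load was
# not sorted and its ranges overlap -- hence the advice to load data
# ordered by the columns you filter on most often.
print(extents_to_scan(extents, 163))   # [1, 2, 3]
```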
8. Data Modeling
● Conservative data typing reduces IO, compute, and memory requirements
❖ Short strings (up to char(8) and varchar(7)) are handled internally as integers
● Star-schema optimizations are generally a good idea
● Break down compound fields into individual fields
❖ Trivializes searching for sub-fields
❖ Can allow greater usage of short strings
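The short-string point can be illustrated by packing up to 8 bytes into a single 64-bit integer, so comparisons become plain integer operations (a simplified sketch of the idea, not ColumnStore's actual on-disk encoding):

```python
def pack_short_string(s: str) -> int:
    """Pack a string of up to 8 ASCII bytes into one 64-bit integer."""
    b = s.encode("ascii")
    assert len(b) <= 8, "only short strings fit in one machine word"
    return int.from_bytes(b.ljust(8, b"\x00"), "little")

# Equality on short strings becomes a single integer comparison.
print(pack_short_string("NY") == pack_short_string("NY"))   # True
print(pack_short_string("NY") == pack_short_string("CA"))   # False
print(hex(pack_short_string("NY")))                         # 0x594e
```

This is why splitting a compound field into short sub-fields can pay twice: the sub-field is directly searchable, and it may now fit the integer-handled string width.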
9. Take Advantage of Push-Down Operations
● Filters
● Aggregates
● Functions & expressions
● Joins
10. What is not Pushed Down
● HAVING
● Window Functions
● ORDER BY
● LIMIT
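One way to see the split: pushed-down operations can run independently on each chunk of data and be combined afterwards, while the operations above need a view of the merged result. A sketch with a hypothetical two-node data split:

```python
# Filters and aggregates combine from per-node partials;
# ORDER BY / LIMIT need the globally merged stream.
node1 = [5, 1, 9]
node2 = [7, 3, 2]

# Pushed down: each node filters and partially sums locally.
partial_sums = [sum(x for x in chunk if x > 2) for chunk in (node1, node2)]
total = sum(partial_sums)                 # (5 + 9) + (7 + 3) = 24

# Not pushed down: "ORDER BY x LIMIT 2" on locally sorted chunks
# gives the wrong rows -- the top-2 must come from the global sort.
local_limit = sorted(node1)[:2] + sorted(node2)[:2]   # [1, 5, 2, 3]
global_limit = sorted(node1 + node2)[:2]              # [1, 2]
print(total, global_limit)
```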
11. Common Pitfalls
● It is OLAP, not OLTP
❖ single-row inserts
● Updating columns that upset the import sort order
● Top-level ORDER BY clause
12. Troubleshooting Queries that are Still Too Slow
● Given what you know about ColumnStore operation, can the query be improved?
● What does your resource usage look like? Are there bottlenecks?
13. Computer resources & bottlenecks
● CPU
● Storage: SSD, HDD
● Memory
● Network
...and there could be algorithmic bottlenecks
14. Computer resources utilization
● Utilization is a broad metric and gives no details
● < 100% utilization doesn’t mean you can improve the situation
● 100% utilization doesn’t mean you can’t improve the situation
15. Computer resources: CPU
● Use top, htop and friends to see CPU utilization
❖ Instructions Per Clock (IPC) varies across workloads; Hyper-Threading can help (perf stat)
● A CPU core can be 100% utilized while:
❖ the CPU is busy waiting for data from cache or RAM (perf record)
❖ the CPU frequency has been scaled down by the OS (turbostat, dmesg)
16. Computer resources: CPU
● If the CPU is only 50% utilized:
❖ the code may already be well optimized, so Hyper-Threading won’t give a gain
❖ there may be algorithmic limitations, or waits on storage or the network
18. Computer resources: Memory
● The default Linux memory allocator doesn’t reuse mmap segments
❖ jemalloc does release memory back, using madvise
● Tooling: free, vmstat, top
❖ top shows both Virtual and Resident memory
❖ free doesn’t show how much memory is actually available
● Most important: never use swap on production DBMS systems
21. Computer resources: Storage
● Tooling: iostat, iotop, dstat, sar
❖ there can be very short 100% utilization spikes these tools don’t detect
● If the application fully utilizes the CPU while storage is underutilized:
❖ the application’s read buffer isn’t big enough when O_DIRECT is used, or readahead isn’t set
● Data-at-rest compression is important
24. Queries and where to find them
● mcsadmin getActiveSQLStatements
mcsadmin> getActiveSQLStatements
getactivesqlstatements Wed Oct 7 08:38:32 2015
Get List of Active SQL Statements
=================================
Start Time       Time (hh:mm:ss)  Session ID  SQL Statement
---------------  ---------------  ----------  ------------------------------------------------------------
Oct 7 08:38:30   00:00:03         73          select c_name,sum(lo_revenue) from customer, lineorder where
                                              lo_custkey = c_custkey and c_custkey = 6 group by c_name
https://mariadb.com/kb/en/library/analyzing-queries-in-columnstore/#getactivesqlstatements
25. Queries and where to find them
● Query log structure
● debug.log produced by syslog
Feb 5 08:36:02 0bc58638bf11 ExeMgr[26783]: 02.772767 |10|0|0| D 16 CAL0041: Start SQL statement: select * from cs1; |test|
Feb 5 08:36:02                            log timestamp
0bc58638bf11                              hostname
ExeMgr                                    process name
[26783]                                   PID
02.772767                                 log timestamp in microseconds
10                                        session ID
0                                         id1
0                                         id2
D                                         syslog facility
16                                        CS facility ID
CAL0041                                   log message type
Start SQL statement: select * from cs1;   message body
|test|                                    database name
❖ The MariaDB slow query log can also be used
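The field layout above can be parsed mechanically. A small sketch in plain Python (the regex is derived from the single example line, so treat it as a starting point, not a complete grammar for debug.log):

```python
import re

# Field layout per the breakdown above:
# timestamp host process[pid]: usec |session|id1|id2| fac cs_fac type: body |db|
LINE = (r"(?P<ts>\w+ +\d+ [\d:]+) (?P<host>\S+) (?P<proc>\w+)\[(?P<pid>\d+)\]: "
        r"(?P<usec>[\d.]+) \|(?P<session>\d+)\|(?P<id1>\d+)\|(?P<id2>\d+)\| "
        r"(?P<fac>\w) (?P<cs_fac>\d+) (?P<msg_type>\w+): (?P<body>.*?) \|(?P<db>\w*)\|")

sample = ("Feb 5 08:36:02 0bc58638bf11 ExeMgr[26783]: 02.772767 |10|0|0| "
          "D 16 CAL0041: Start SQL statement: select * from cs1; |test|")

m = re.match(LINE, sample)
print(m.group("session"), m.group("msg_type"), m.group("db"))
# 10 CAL0041 test
```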
26. What does the query do?
● Use calsettrace()/calgettrace() to see the actual execution plan
● CS has its own internal query representation
29. IO optimization: read
● An extent’s partitioning (min/max) data can be marked valid or invalid
● CS doesn’t consider invalid extents for extent elimination
❖ an invalid extent becomes valid again the next time it is scanned
● Use information_schema (columnstore_extents) or editem to inspect extents
31. Data insertion: who is the fastest?
● cpimport (fast, native)
● mcsimport (works from Windows, uses the bulk write API)
● INSERT..SELECT (uses disabled vtable mode for the SELECT)
● INSERT (don’t use single-row INSERT; it is slow)
● Avoid DELETE and UPDATE for the same reason
...but we are going to make them blazingly fast
32. IO optimization: cpimport writes
● Set RowsPerBatch to reduce per-record cost
● Use a ramdisk for TmpDir, because cpimport saves extra data there for rollback
❖ a disk path must be used for TempFilePath
33. HASH GROUP BY operation

Input (name | money):
Vic    |   1.0
Robert |  25.2
Vic    | 999.9
Maria  |  41.1
Kevin  |  90.25
Robert |   2.01

Hash table, rows partitioned by hash(name):
Bucket 1:  1 |  41.1  | Maria
           1 |  25.2  | Robert
           1 |   2.01 | Robert
Bucket 2:  2 | 999.9  | Vic
           2 |   1.0  | Vic
Bucket 3:  3 |  90.25 | Kevin

Result (name | sum(money)):
Kevin  |   90.25
Maria  |   41.1
Robert |   27.21
Vic    | 1000.9
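The bucketed aggregation can be sketched in Python (a toy model; the bucket count and per-bucket parallelism are what the RowAggrBuckets and RowAggrThreads settings on the next slide control):

```python
from collections import defaultdict

# Toy model of a hash-based GROUP BY SUM, as in the slide.
rows = [("Vic", 1.0), ("Robert", 25.2), ("Vic", 999.9),
        ("Maria", 41.1), ("Kevin", 90.25), ("Robert", 2.01)]

NUM_BUCKETS = 3  # stands in for RowAggrBuckets

# Rows are partitioned by hash(name); each bucket is an independent
# partial hash table, so buckets can be aggregated by separate
# threads (cf. RowAggrThreads).
buckets = [defaultdict(float) for _ in range(NUM_BUCKETS)]
for name, money in rows:
    buckets[hash(name) % NUM_BUCKETS][name] += money

# A group key lands in exactly one bucket, so merging is a plain union.
result = {k: round(v, 2) for b in buckets for k, v in b.items()}
print(sorted(result.items()))
# [('Kevin', 90.25), ('Maria', 41.1), ('Robert', 27.21), ('Vic', 1000.9)]
```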
34. GROUP BY optimization
● XML settings:
❖ RowAggrThreads
❖ RowAggrBuckets
● Per-session variable: infinidb_um_mem_limit
36. QoS: long queries vs. short queries
● XML settings:
❖ MaxOutstandingRequests (the MOR value is 20 by default)

MariaDB AX (ColumnStore):
User Module
  mysqld
  ExeMgr
Performance Module 1 (MOR)
  WriteEngine / PrimProc
  Columnstore Storage
Performance Module 2 (MOR)
  WriteEngine / PrimProc
  Columnstore Storage