Weitere ähnliche Inhalte Ähnlich wie PostgreSQL and JDBC: striving for high performance (20) Kürzlich hochgeladen (20) PostgreSQL and JDBC: striving for high performance1. © 2016 NetCracker Technology Corporation Confidential
PostgreSQL and JDBC: striving for top performance
Vladimir Sitnikov
PgConf 2016
2. 2© 2016 NetCracker Technology Corporation Confidential
About me
• Vladimir Sitnikov, @VladimirSitnikv
• Performance architect at NetCracker
• 10 years of experience with Java/SQL
• PgJDBC committer
3. 3© 2016 NetCracker Technology Corporation Confidential
Explain (analyze, buffers) PostgreSQL and JDBC
•Data fetch
•Data upload
•Performance
•Pitfalls
4. 4© 2016 NetCracker Technology Corporation Confidential
Intro
Fetch of a single row via primary key lookup takes
20ms. Localhost. Database is fully cached
A. Just fine C. Kidding? Aim is 1ms!
B. It should be 1sec D. 100us
5. 5© 2016 NetCracker Technology Corporation Confidential
Lots of small queries is a problem
Suppose a single query takes 10ms, then 100 of
them would take a whole second *
* Your Captain
6. 6© 2016 NetCracker Technology Corporation Confidential
PostgreSQL frontend-backend protocol
•Simple query
• 'Q' + length + query_text
•Extended query
•Parse, Bind, Execute commands
7. 7© 2016 NetCracker Technology Corporation Confidential
PostgreSQL frontend-backend protocol
Super extended query
https://github.com/pgjdbc/pgjdbc/pull/478
backend protocol wanted features
8. 8© 2016 NetCracker Technology Corporation Confidential
PostgreSQL frontend-backend protocol
Simple query
•Works well for one-time queries
•Does not support binary transfer
9. 9© 2016 NetCracker Technology Corporation Confidential
PostgreSQL frontend-backend protocol
Extended query
•Eliminates planning time
•Supports binary transfer
10. 10© 2016 NetCracker Technology Corporation Confidential
PreparedStatement
Connection con = ...;
PreparedStatement ps =
con.prepareStatement("SELECT...");
...
ps.close();
11. 11© 2016 NetCracker Technology Corporation Confidential
PreparedStatement
Connection con = ...;
PreparedStatement ps =
con.prepareStatement("SELECT...");
...
ps.close();
12. 12© 2016 NetCracker Technology Corporation Confidential
Smoker’s approach to PostgreSQL
PARSE S_1 as ...; // con.prepareStmt
BIND/EXEC
DEALLOCATE // ps.close()
PARSE S_2 as ...;
BIND/EXEC
DEALLOCATE // ps.close()
13. 13© 2016 NetCracker Technology Corporation Confidential
Healthy approach to PostgreSQL
PARSE S_1 as ...;
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
...
DEALLOCATE
14. 14© 2016 NetCracker Technology Corporation Confidential
Healthy approach to PostgreSQL
PARSE S_1 as ...; 1 once in a life
BIND/EXEC REST call
BIND/EXEC
BIND/EXEC one more REST call
BIND/EXEC
BIND/EXEC
...
DEALLOCATE “never” is the best
15. 15© 2016 NetCracker Technology Corporation Confidential
Happiness closes no statements
Conclusion №1: in order to get top performance,
you should not close statements
ps = con.prepareStatement(...)
ps.execueQuery();
ps = con.prepareStatement(...)
ps.execueQuery();
...
16. 16© 2016 NetCracker Technology Corporation Confidential
Happiness closes no statements
Conclusion №1: in order to get top performance,
you should not close statements
ps = con.prepare...
ps.execueQuery();
ps = con.prepare...
ps.execueQuery();
...
17. 17© 2016 NetCracker Technology Corporation Confidential
Unclosed statements in practice
@Benchmark
public Statement leakStatement() {
return con.createStatement();
}
pgjdbc < 9.4.1202, -Xmx128m, OracleJDK 1.8u40
# Warmup Iteration 1: 1147,070 ns/op
# Warmup Iteration 2: 12101,537 ns/op
# Warmup Iteration 3: 90825,971 ns/op
# Warmup Iteration 4: <failure>
java.lang.OutOfMemoryError: GC overhead limit
exceeded
18. 18© 2016 NetCracker Technology Corporation Confidential
Unclosed statements in practice
@Benchmark
public Statement leakStatement() {
return con.createStatement();
}
pgjdbc >= 9.4.1202, -Xmx128m, OracleJDK 1.8u40
# Warmup Iteration 1: 30 ns/op
# Warmup Iteration 2: 27 ns/op
# Warmup Iteration 3: 30 ns/op
...
19. 19© 2016 NetCracker Technology Corporation Confidential
Statements in practice
• In practice, application is always closing the
statements
• PostgreSQL has no shared query cache
• Nobody wants spending excessive time on
planning
20. 20© 2016 NetCracker Technology Corporation Confidential
Server-prepared statements
What can we do about it?
• Wrap all the queries in PL/PgSQL
• It helps, however we had 100500 SQL of them
• Teach JDBC to cache queries
21. 21© 2016 NetCracker Technology Corporation Confidential
Query cache in PgJDBC
• Query cache was implemented in 9.4.1202 (2015-
08-27)
see https://github.com/pgjdbc/pgjdbc/pull/319
• Is transparent to the application
• We did not bother considering PL/PgSQL again
• Server-prepare is activated after 5 executions
(prepareThreshold)
22. 22© 2016 NetCracker Technology Corporation Confidential
Where are the numbers?
• Of course, planning time depends on the query
complexity
• We observed 20мс+ planning time for OLTP
queries: 10KiB query, 170 lines explain
• Result is ~0ms
24. 24© 2016 NetCracker Technology Corporation Confidential
Generated queries are bad
• If a query is generated
• It results in a brand new java.lang.String object
• Thus you have to recompute its hashCode
25. 25© 2016 NetCracker Technology Corporation Confidential
Parameter types
If the type of bind value changes, you have to
recreate server-prepared statement
ps.setInt(1, 42);
...
ps.setNull(1, Types.VARCHAR);
26. 26© 2016 NetCracker Technology Corporation Confidential
Parameter types
If the type of bind value changes, you have to
recreate server-prepared statement
ps.setInt(1, 42);
...
ps.setNull(1, Types.VARCHAR);
It leads to DEALLOCATE PREPARE
27. 27© 2016 NetCracker Technology Corporation Confidential
Keep data type the same
Conclusion №1
• Even NULL values should be properly typed
28. 28© 2016 NetCracker Technology Corporation Confidential
Unexpected degradation
If using prepared statements, the response time
gets 5'000 times slower. How’s that possible?
A. Bug C. Feature
B. Feature D. Bug
29. 29© 2016 NetCracker Technology Corporation Confidential
Unexpected degradation
https://gist.github.com/vlsi -> 01_plan_flipper.sql
select *
from plan_flipper -- <- table
where skewed = 0 -- 1M rows
and non_skewed = 42 -- 20 rows
30. 30© 2016 NetCracker Technology Corporation Confidential
Unexpected degradation
https://gist.github.com/vlsi -> 01_plan_flipper.sql
0.1ms 1st execution
0.05ms 2nd execution
0.05ms 3rd execution
0.05ms 4th execution
0.05ms 5th execution
250 ms 6th execution
31. 31© 2016 NetCracker Technology Corporation Confidential
Unexpected degradation
https://gist.github.com/vlsi -> 01_plan_flipper.sql
0.1ms 1st execution
0.05ms 2nd execution
0.05ms 3rd execution
0.05ms 4th execution
0.05ms 5th execution
250 ms 6th execution
32. 32© 2016 NetCracker Technology Corporation Confidential
Unexpected degradation
• Who is to blame?
• PostgreSQL switches to generic plan after 5
executions of a server-prepared statement
• What can we do about it?
• Add +0, OFFSET 0, and so on
• Pay attention on plan validation
• Discuss the phenomenon pgsql-hackers
33. 33© 2016 NetCracker Technology Corporation Confidential
Unexpected degradation
https://gist.github.com/vlsi -> 01_plan_flipper.sql
We just use +0 to forbid index on a bad column
select *
from plan_flipper
where skewed+0 = 0 ~ /*+no_index*/
and non_skewed = 42
34. 34© 2016 NetCracker Technology Corporation Confidential
Explain explain explain explain
The rule of 6 explains:
prepare x(number) as select ...;
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 10 sec
36. 36© 2016 NetCracker Technology Corporation Confidential
Decision problem
There’s a schema A with table X, and a schema B
with table X. What is the result of select * from
X?
A.X C. Error
B.X D. All of the above
37. 37© 2016 NetCracker Technology Corporation Confidential
Search_path
There’s a schema A with table X, and a schema B
with table X. What is the result of select * from
X?
• search_path determines the schema used
• server-prepared statements are not prepared for
search_path changes crazy things might
happen
38. 38© 2016 NetCracker Technology Corporation Confidential
Search_path can go wrong
• 9.1 will just use old OIDs and execute the “previous”
query
• 9.2-9.5 might fail with "cached plan must not change
result type” error
39. 39© 2016 NetCracker Technology Corporation Confidential
Search_path
What can we do about it?
• Keep search_path constant
• Discuss it in pgsql-hackers
• Set search_path + server-prepared statements =
cached plan must not change result type
• PL/pgSQL has exactly the same issue
40. 40© 2016 NetCracker Technology Corporation Confidential
To fetch or not to fetch
You are to fetch 1M rows 1KiB each, -Xmx128m
while (resultSet.next())
resultSet.getString(1);
A. No problem C. Must use
LIMIT/OFFSET
B. OutOfMemory D. autoCommit(false)
41. 41© 2016 NetCracker Technology Corporation Confidential
To fetch or not to fetch
• PgJDBC fetches all rows by default
• To fetch in batches, you need
Statement.setFetchSize and
connection.setAutoCommit(false)
• Default value is configurable via
defaultRowFetchSize (9.4.1202+)
42. 42© 2016 NetCracker Technology Corporation Confidential
fetchSize vs fetch time
6.48
2.28
1.76
1.04 0.97
0
2
4
6
8
10 50 100 1000 2000
Faster,ms
fetchSize
2000 rows
2000 rows
select int4, int4, int4, int4
43. 43© 2016 NetCracker Technology Corporation Confidential
FetchSize is good for stability
Conclusion №2:
• For stability & performance reasons set
defaultRowFetchSize >= 100
44. 44© 2016 NetCracker Technology Corporation Confidential
Data upload
For data uploads, use
• INSERT() VALUES()
• INSERT() SELECT ?, ?, ?
• INSERT() VALUES() executeBatch
• INSERT() VALUES(), (), () executeBatch
• COPY
45. 45© 2016 NetCracker Technology Corporation Confidential
Healty batch INSERT
PARSE S_1 as ...;
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
...
DEALLOCATE
46. 46© 2016 NetCracker Technology Corporation Confidential
TCP strikes back
JDBC is busy with sending
queries, thus it has not started
fetching responses yet
DB cannot fetch more queries since it
is busy with sending responses
47. 47© 2016 NetCracker Technology Corporation Confidential
Batch INSERT in real life
PARSE S_1 as ...;
BIND/EXEC
BIND/EXEC
SYNC flush & wait for the response
BIND/EXEC
BIND/EXEC
SYNC flush & wait for the response
...
48. 48© 2016 NetCracker Technology Corporation Confidential
TCP deadlock avoidance
• PgJDBC adds SYNC to your nice batch operations
• The more the SYNCs the slower it performs
49. 49© 2016 NetCracker Technology Corporation Confidential
Horror stories
A single line patch makes insert batch 10 times faster:
https://github.com/pgjdbc/pgjdbc/pull/380
- static int QUERY_FORCE_DESCRIBE_PORTAL = 128;
+ static int QUERY_FORCE_DESCRIBE_PORTAL = 512;
...
// 128 has already been used
static int QUERY_DISALLOW_BATCHING = 128;
50. 50© 2016 NetCracker Technology Corporation Confidential
Trust but always measure
• Java 1.8u40+
• Core i7 2.6Ghz
• Java microbenchmark harness
• PostgreSQL 9.5
51. 51© 2016 NetCracker Technology Corporation Confidential
Queries under test: INSERT
pgjdbc/ubenchmark/InsertBatch.java
insert into batch_perf_test(a, b, c)
values(?, ?, ?)
52. 52© 2016 NetCracker Technology Corporation Confidential
Queries under test: INSERT
pgjdbc/ubenchmark/InsertBatch.java
insert into batch_perf_test(a, b, c)
values(?, ?, ?)
53. 53© 2016 NetCracker Technology Corporation Confidential
Queries under test: INSERT
pgjdbc/ubenchmark/InsertBatch.java
insert into batch_perf_test(a, b, c)
values (?, ?, ?), (?, ?, ?), (?, ?,
?), (?, ?, ?), (?, ?, ?), (?, ?, ?),
(?, ?, ?), (?, ?, ?), (?, ?, ?), ...;
54. 54© 2016 NetCracker Technology Corporation Confidential
Тестируемые запросы: COPY
pgjdbc/ubenchmark/InsertBatch.java
COPY batch_perf_test FROM STDIN
1 s1 1
2 s2 2
3 s3 3
...
55. 55© 2016 NetCracker Technology Corporation Confidential
Queries under test: hand-made structs
pgjdbc/ubenchmark/InsertBatch.java
insert into batch_perf_test
select * from
unnest('{"(1,s1,1)","(2,s2,2)",
"(3,s3,3)"}'::batch_perf_test[])
56. 56© 2016 NetCracker Technology Corporation Confidential
You’d better use batch, your C.O.
2
16
128
0
50
100
150
16 128 1024
faster,ms
The number of inserted rows
Insert
Batch
Struct
Copy
int4, varchar, int4
57. 57© 2016 NetCracker Technology Corporation Confidential
COPY is good
0
0.5
1
1.5
2
2.5
16 128 1024
Faster,ms
The number of inserted rows
Batch
Struct
Copy
int4, varchar, int4
58. 58© 2016 NetCracker Technology Corporation Confidential
COPY is bad for small batches
0
5
10
15
20
25
1 4 8 16 128
Faster,ms
Batch size in rows
Batch
Struct
Copy
Insert of 1024 rows
59. 59© 2016 NetCracker Technology Corporation Confidential
Final thoughts
• PreparedStatement is our hero
• Remember to EXPLAIN ANALYZE at least six
times, a blue moon is a plus
• Don’t forget +0 and OFFSET 0
60. 60© 2016 NetCracker Technology Corporation Confidential
About me
• Vladimir Sitnikov, @VladimirSitnikv
• Performance architect in NetCracker
• 10 years of experience with Java/SQL
• PgJDBC committer
61. © 2016 NetCracker Technology Corporation Confidential
Questions?
Vladimir Sitnikov,
PgConf 2016