Unleashing PostgreSQL Performance
Andres Freund
PostgreSQL Developer & Committer
Email: andres@anarazel.de
Email: andres.freund@enterprisedb.com
Twitter: @AndresFreundTec
anarazel.de/talks/2018-06-06-pgvision-performance/performance.pdf
PostgreSQL [Adoption] Bottlenecks
andres@alap4:~$ psql tpch_1
psql (11beta1)
Type "help" for help.
tpch_1[10841][1]=# quit
andres@alap4:~$
Auto-Failover in PostgreSQL
-
How?
Bloat: zheap
Parallelism
Partitioning
Efficiency
Analytics / Transactional Workloads
SELECT
l_returnflag,
l_linestatus,
sum(l_quantity) AS sum_qty,
sum(l_extendedprice) AS sum_base_price,
sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
avg(l_quantity) AS avg_qty,
avg(l_extendedprice) AS avg_price,
avg(l_discount) AS avg_disc,
count(*) AS count_order
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - interval '74 days'
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
New Hash-Table & Expression Evaluation
Engine Rewrite
Sort (cost=4313533.34..4313533.36 rows=6 width=68)
  Sort Key: l_returnflag, l_linestatus
  -> HashAggregate (cost=4313533.16..4313533.26 rows=6 width=68)
       Group Key: l_returnflag, l_linestatus
       Output: …, sum(l_quantity), sum(l_extendedprice), sum(…), ...
       -> Seq Scan on lineitem (cost=0.00..1936427.80 rows=59427634 width=36)
            Filter: (l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone)
ExecEvalAnd
  ExecMakeFunctionResultNoSets
    ExecEvalFuncArgs
      ExecEvalConst
      ExecEvalScalarVarFast
  ExecMakeFunctionResultNoSets
    ExecEvalFuncArgs
      ExecEvalConst
      ExecEvalScalarVarFast
WHERE a.col < 10 AND a.another = 3
< v10 Expression Evaluation Engine
v10+ Expression Evaluation Engine
● WHERE a.col < 10 AND a.another = 3
  – EEOP_SCAN_FETCHSOME (deform necessary cols)
  – EEOP_SCAN_VAR (a.col)
  – EEOP_CONST (10)
  – EEOP_FUNCEXPR_STRICT (int4lt)
  – EEOP_BOOL_AND_STEP_FIRST
  – EEOP_SCAN_VAR (a.another)
  – EEOP_CONST (3)
  – EEOP_FUNCEXPR_STRICT (int4eq)
  – EEOP_BOOL_AND_STEP_LAST (AND)
● direct threaded
● lots of indirect jumps
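The flat step list above can be modeled in a few lines. This is a minimal sketch, not PostgreSQL's implementation: the opcode names follow the slide, but the tuple encoding, the value stack, the jump target, and the treatment of NULL as "row does not qualify" are all assumptions made for illustration. A Python if/elif chain stands in for the direct-threaded dispatch.

```python
import operator

# Hypothetical encoding: (opcode, argument). Opcode names mirror the EEOP_*
# steps on the slide; everything else here is an illustrative assumption.
STEPS = [
    ("SCAN_FETCHSOME", 2),              # deform the needed columns (no-op in this model)
    ("SCAN_VAR", "col"),                # push a.col
    ("CONST", 10),                      # push 10
    ("FUNCEXPR_STRICT", operator.lt),   # stands in for int4lt
    ("BOOL_AND_STEP_FIRST", 9),         # jump to step 9 unless the operand is true
    ("SCAN_VAR", "another"),            # push a.another
    ("CONST", 3),                       # push 3
    ("FUNCEXPR_STRICT", operator.eq),   # stands in for int4eq
    ("BOOL_AND_STEP_LAST", None),       # last operand is the AND result
    ("DONE", None),
]


def eval_steps(steps, row):
    """Run the step list once for one row and return True if the row qualifies."""
    stack = []
    pc = 0
    while True:
        op, arg = steps[pc]
        if op == "SCAN_FETCHSOME":
            pass
        elif op == "SCAN_VAR":
            stack.append(row[arg])
        elif op == "CONST":
            stack.append(arg)
        elif op == "FUNCEXPR_STRICT":
            right = stack.pop()
            left = stack.pop()
            # Strict function: any NULL (None) argument yields NULL.
            stack.append(None if left is None or right is None else arg(left, right))
        elif op == "BOOL_AND_STEP_FIRST":
            if stack[-1] is not True:
                pc = arg                # short-circuit: skip the remaining operands
                continue
            stack.pop()
        elif op == "BOOL_AND_STEP_LAST":
            pass
        elif op == "DONE":
            return stack.pop() is True  # NULL counts as "not qualified" in a WHERE clause
        pc += 1
```

The point of the flat layout is that evaluation is one loop over an array rather than a recursive walk over an expression tree.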
[Figure: chart, TPCH Q-01, scale 5, fully cached. X axis: version (PG 9.5, PG 9.6, PG 10); Y axis: time in seconds (0 to 45); series: #Processes=1, #Processes=4.]
Just-in-Time Compilation
● Interpreted execution → compiled execution
  – removes interpretation overhead
● Make variables constant
  – removes branches, indirect jumps, reduces size
● Inline called functions
● PG 11: Optionally uses LLVM
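As a rough analogy for the first two bullets, the sketch below contrasts an interpreter that re-inspects a predicate description for every row with a function generated once, with column names and constants baked into its source. It uses Python's built-in `compile` and `eval`, not LLVM, and the predicate format and function names are hypothetical.

```python
def interpret(pred, row):
    """Interpreted execution: dispatch on the operator for every term of every row."""
    for column, op, constant in pred:
        value = row[column]
        if op == "<":
            ok = value < constant
        elif op == "=":
            ok = value == constant
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True


def compile_predicate(pred):
    """Compiled execution: emit source with constants inlined, compile it once."""
    py_ops = {"<": "<", "=": "=="}
    terms = [f"(row[{column!r}] {py_ops[op]} {constant!r})" for column, op, constant in pred]
    body = " and ".join(terms) or "True"
    source = "lambda row: " + body
    return eval(compile(source, "<generated>", "eval")), source


# WHERE a.col < 10 AND a.another = 3, in the hypothetical predicate format.
PRED = [("col", "<", 10), ("another", "=", 3)]
compiled_fn, generated_source = compile_predicate(PRED)
```

The generated function no longer branches on the operator or looks up the constant at run time; that work happened once, at generation time.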
Just-in-Time Compilation in v11
[Figure: chart, TPCH Q01 timing, scale 100, fully cached. X axis: parallelism (0, 1, 2, 4, 8, 16); Y axis: time in ms (0 to 1200000); series: PG 9.6, PG 10, master jit=0, master jit=1.]
JIT - Good Cases / Bad Cases
● OLTP → bad, short query → bad
● OLAP → good, long query → good
● IO bound → not necessarily good
● CPU bound → likely good
● lots of aggregates → good
● wide tables (i.e. lots of columns) → good
JIT – Planning
● Naive!
● Enabled / Disabled: jit = on / off (currently enabled by default)
● SELECT pg_jit_available();
● Perform JIT if query_cost > jit_above_cost
  – default: 100000
● Optimize if query_cost > jit_optimize_above_cost
  – default: 500000
● Inline if query_cost > jit_inline_above_cost
  – default: 500000
● -1 disables corresponding feature
● Whole query decision
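The threshold rules above can be written out as a small function. This is a sketch of the decision as the slide describes it, using the slide's defaults; the function name and the returned dictionary are hypothetical, and the real planner code may differ in detail.

```python
def jit_decision(query_cost, jit=True, jit_above_cost=100000,
                 jit_optimize_above_cost=500000, jit_inline_above_cost=500000):
    """Decide once per query which JIT stages run, based on the total plan cost."""

    def exceeds(threshold):
        # A threshold of -1 disables the corresponding feature.
        return threshold >= 0 and query_cost > threshold

    perform = jit and exceeds(jit_above_cost)
    return {
        "jit": perform,
        "optimize": perform and exceeds(jit_optimize_above_cost),
        "inline": perform and exceeds(jit_inline_above_cost),
    }
```

For example, a plan costing 4311583.18 enables all three stages, while a plan costing 200000 is JIT-compiled without optimization or inlining.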
JIT – Planning
Sort (cost=4311583.16..4311583.18 rows=6 width=68) (actual time=[]18270.844 rows=4 loops=1)
  Sort Key: l_returnflag, l_linestatus
  Sort Method: quicksort Memory: 25kB
  -> HashAggregate (...)
       Group Key: l_returnflag, l_linestatus
       -> Seq Scan on lineitem (...)
            Filter: (l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone)
            Rows Removed by Filter: 571965
Planning Time: 1.956 ms
JIT:
  Functions: 9
  Generation Time: 4.087 ms
  Inlining: true
  Inlining Time: 62.990 ms
  Optimization: true
  Optimization Time: 121.307 ms
  Emission Time: 102.416 ms
Execution Time: 18291.397 ms
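To put the JIT timings in the plan above into proportion, the arithmetic below adds up the four phases and compares them with the total. It assumes the JIT phases are counted inside the reported Execution Time; the variable names are illustrative.

```python
# Timings copied from the EXPLAIN output above, in milliseconds.
jit_phase_ms = {
    "generation": 4.087,
    "inlining": 62.990,
    "optimization": 121.307,
    "emission": 102.416,
}
execution_ms = 18291.397

total_jit_ms = sum(jit_phase_ms.values())
jit_share = total_jit_ms / execution_ms

print(f"JIT overhead: {total_jit_ms:.3f} ms ({jit_share:.1%} of execution time)")
```

Under that assumption the overhead is about 291 ms, roughly 1.6% of this 18-second query, which is why long analytical queries are the favorable case.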
JIT – how to test
● Debian:
  – apt.postgresql.org has v11 beta 1 packages
  – https://wiki.postgresql.org/wiki/Apt/FAQ#I_want_to_try_the_beta_version_of_the_next_PostgreSQL_release
  – install postgresql-11
● RHEL:
  – yum.postgresql.org has v11 beta 1 packages
  – https://yum.postgresql.org/repopackages.php#pg11
  – install postgresql11-server postgresql11-llvmjit
    ● depends on EPEL
JIT Improvements: Code Generation
● Expressions refer to per-query allocated memory
  – generated code references memory locations
  – lots of superfluous memory reads/writes for arguments, optimizer can’t eliminate in most cases
    ● massively reduces benefits of inlining
  – optimizer can’t optimize away lots of memory references
  – FIX: separate permanent and per-eval memory
● Expression step results refer to persistent memory
  – move to temporary memory
● Function Call Interface references persistent memory
  – FIX: pass FunctionCallInfoData and FmgrInfo separately to functions
    ● remove FunctionCallInfoData->flinfo
    ● move context, resultinfo, fncollation to FmgrInfo
    ● move isnull field to separate argument? Return struct?
  – Non-JITed expression evaluation will benefit too
JIT Improvements: Caching
● Optimizer overhead significant
  – TPCH Q01: unopt, noinline: time to optimize: 0.002s, emit: 0.036s
  – TPCH Q01: time to inline: 0.080s, optimize: 0.163s, emit: 0.082s
● references to memory locations prevent caching (i.e. expression codegen has to be optimized first)
● Introduce per-backend LRU cache of functions keyed by hash of emitted IR (plus comparator)
  – What to use as cache key?
    ● IR? - requires generating it
    ● Expression Trees?
    ● Prepared Statement?
● Shared / Non-Shared / Persistent?
● relatively easy task, once pointers removed
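The proposed cache could look roughly like the sketch below: an LRU map keyed by a hash of the generated code, with a full comparison of the stored text as the "comparator" that guards against hash collisions. The class name, the use of SHA-256, and the `compile_fn` callback are assumptions for illustration; the slide leaves the actual cache key open.

```python
from collections import OrderedDict
import hashlib


class CompiledFunctionCache:
    """Per-process LRU cache of compiled functions, keyed by a hash of their IR text."""

    def __init__(self, capacity, compile_fn):
        self.capacity = capacity
        self.compile_fn = compile_fn      # hypothetical: turns IR text into a callable
        self.entries = OrderedDict()      # digest -> (ir_text, compiled function)
        self.hits = 0
        self.misses = 0

    def get(self, ir_text):
        key = hashlib.sha256(ir_text.encode("utf-8")).hexdigest()
        entry = self.entries.get(key)
        # Comparator: the digest alone is not trusted, the stored text must match too.
        if entry is not None and entry[0] == ir_text:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return entry[1]
        self.misses += 1
        compiled = self.compile_fn(ir_text)
        self.entries[key] = (ir_text, compiled)
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return compiled
```

A cache like this only works if identical expressions produce identical code text, which is why the slide notes that embedded memory addresses have to go first.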
JIT Improvements: Incremental JITing
[Figure: timeline diagram with boxes labeled JIT, Query Execution, and JITed Query Execution]
JIT Improvements: Planning
● Whole Query decision too coarse
  – use estimates about total number of each function evaluation?
● JIT more aggressively when using prepared statements?
  – after repeated executions?
Future things to JIT
● Executor control flow
  – hard, but lots of other benefits (asynchronous execution, non-JITed will be faster, less memory)
● COPY parsing, input / output function invocation
  – easy to medium
● Aggregate & Hashjoin hash computation
  – easy
● in-memory tuplesort
  – including tuple deforming (from MinimalTuple)
  – easy
OLTP Workloads
[Figure: chart, pgbench readonly, scale 100 (on laptop). X axis: version (PG 9.6, PG 10, PG 11); Y axis: tps (0 to 250000); series: 1 client, 16 client.]
OLTP Workloads: Bottlenecks in v11
- 97.68% PostgresMain
   - 32.09% exec_execute_message (inlined)
      - 95.13% PortalRun
         - 92.08% PortalRunSelect
            - 98.38% standard_ExecutorRun
               + 98.23% ExecutePlan (inlined)
            + 0.86% printtup_startup
   - 31.90% exec_bind_message (inlined)
      - 53.38% PortalStart
         - 86.96% standard_ExecutorStart
            + 96.43% InitPlan (inlined)
            + 3.29% CreateExecutorState
      + 10.56% GetCachedPlan
      + 8.14% start_xact_command (inlined)
   + 13.06% finish_xact_command
   + 10.31% ReadCommand (inlined)
OLTP Workloads: Bottlenecks in v11
● Too much work done every query execution
  – move to planner, helping prepared statements
  – reduce duplicated work
● Execution uses too much memory / too many small allocations
  – rework executor data structures
  – batch all allocations into one single memory allocation
Current Executor
[Figure: plan tree; node labels: Limit, Sort, Join, SeqScan, Agg, Join, IdxScan, IdxScan]
Async Execution
[Figure: plan tree; node labels: Union All, FDW (six times), Join, FDW, FDW]
Async IO
JIT
[Figure: plan tree; node labels: Join, HashJoin, ..., Hash, SeqScan, HashJoin, ..., Hash, SeqScan]
static pg_attribute_always_inline TupleTableSlot *
ExecHashJoinImpl(PlanState *pstate, bool parallel)
{
    ...
    for (;;)
    {
        /* lotsa code */
        …
        slot = ExecProcNode(outerNode);
        …
Linearized Programs
SELECT
l_returnflag,
l_linestatus,
sum(l_quantity) AS sum_qty,
sum(l_extendedprice) AS sum_base_price,
sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
avg(l_quantity) AS avg_qty,
avg(l_extendedprice) AS avg_price,
avg(l_discount) AS avg_disc,
count(*) AS count_order
FROM
lineitem
WHERE
l_shipdate <= date '1998-12-01' - interval '74 days'
GROUP BY
l_returnflag,
l_linestatus
ORDER BY
l_returnflag,
l_linestatus;
0: init_sort
1: seqscan_first
2: seqscan [j empty 5] > s0
3: qual [j fail 2] < scan s0
4: hashagg_tuple [j 2] < s0
5: drain_hashagg [j empty 7] > s1
6: sort_tuple [j 5] < s1
7: sort
8: drain_sort [j empty 10] > s2
9: return < s2 [next 8]
10: done
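One way to read the step list above is as a small program with a program counter, where `j` marks a jump target, `> sN` stores a tuple into slot N, and `< sN` reads it. That reading is an assumption, as is everything in the sketch below: it models the eleven steps over simplified `(l_returnflag, l_linestatus, l_quantity, l_shipdate)` tuples and computes only `count(*)` and `sum(l_quantity)` per group.

```python
from datetime import date, timedelta


def run_linear_program(rows, cutoff):
    """Run the 11-step program; yields (flag, status, count, sum_qty) in sorted order."""
    pc = 0
    s0 = s1 = s2 = None                      # tuple slots s0..s2
    while True:
        if pc == 0:                          # 0: init_sort
            to_sort = []
            pc = 1
        elif pc == 1:                        # 1: seqscan_first
            scan = iter(rows)
            groups = {}
            pc = 2
        elif pc == 2:                        # 2: seqscan [j empty 5] > s0
            s0 = next(scan, None)
            pc = 5 if s0 is None else 3
        elif pc == 3:                        # 3: qual [j fail 2] < s0
            pc = 4 if s0[3] <= cutoff else 2
        elif pc == 4:                        # 4: hashagg_tuple [j 2] < s0
            key = (s0[0], s0[1])
            count, total = groups.get(key, (0, 0))
            groups[key] = (count + 1, total + s0[2])
            pc = 2
        elif pc == 5:                        # 5: drain_hashagg [j empty 7] > s1
            if groups:
                key = next(iter(groups))
                s1 = key + groups.pop(key)
                pc = 6
            else:
                pc = 7
        elif pc == 6:                        # 6: sort_tuple [j 5] < s1
            to_sort.append(s1)
            pc = 5
        elif pc == 7:                        # 7: sort
            to_sort.sort(key=lambda t: t[:2])
            out = iter(to_sort)
            pc = 8
        elif pc == 8:                        # 8: drain_sort [j empty 10] > s2
            s2 = next(out, None)
            pc = 10 if s2 is None else 9
        elif pc == 9:                        # 9: return < s2 [next 8]
            yield s2
            pc = 8
        else:                                # 10: done
            return


# Same cutoff as the query: date '1998-12-01' - interval '74 days'.
CUTOFF = date(1998, 12, 1) - timedelta(days=74)
SAMPLE_ROWS = [
    ("N", "O", 10, date(1998, 1, 1)),
    ("A", "F", 5, date(1997, 1, 1)),
    ("N", "O", 7, date(1998, 11, 30)),       # filtered out by the qual
    ("A", "F", 1, date(1996, 5, 5)),
]
RESULT = list(run_linear_program(SAMPLE_ROWS, CUTOFF))
```

There is no recursion between plan nodes here: scan, filter, aggregate, and sort are all positions in one loop, which is the shape a JIT compiler can turn into straight-line code.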

Postgres Vision 2018: Making Postgres Even Faster

  • 1. Unleashing PostgreSQL Performance Andres Freund PostgreSQL Developer & Committer Email: andres@anarazel.de Email: andres.freund@enterprisedb.com Twitter: @AndresFreundTec anarazel.de/talks/2018-06-06-pgvision-performance/performance.pdf
  • 3. andres@alap4:~$ psql tpch_1 psql (11beta1) Type "help" for help. tpch_1[10841][1]=# quit andres@alap4:~$
  • 7.
  • 10. SELECT l_returnflag, l_linestatus, sum(l_quantity) AS sum_qty, sum(l_extendedprice) AS sum_base_price, sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge, avg(l_quantity) AS avg_qty, avg(l_extendedprice) AS avg_price, avg(l_discount) AS avg_disc, count(*) AS count_order FROM lineitem WHERE l_shipdate <= date '1998-12-01' - interval '74 days' GROUP BY l_returnflag, l_linestatus ORDER BY l_returnflag, l_linestatus; New Hash-Table & Expression Evaluation Engine Rewrite
  • 11. Sort (cost=4313533.34..4313533.36 rows=6 width=68) Sort Key: l_returnflag, l_linestatus -> HashAggregate (cost=4313533.16..4313533.26 rows=6 width=68) Group Key: l_returnflag, l_linestatus Output: …, sum(l_quantity), sum(l_extendedprice), sum(…), ... -> Seq Scan on lineitem (cost=0.00..1936427.80 rows=59427634 width=36) Filter: (l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone) New Hash-Table & Expression Evaluation Engine Rewrite
  • 13. v10+ Expression Evaluation Engine ● WHERE a.col < 10 AND a.another = 3 – EEOP_SCAN_FETCHSOME (deform necessary cols) – EEOP_SCAN_VAR (a.col) – EEOP_CONST (10) – EEOP_FUNCEXPR_STRICT (int4lt) – EEOP_BOOL_AND_STEP_FIRST – EEOP_SCAN_VAR (a.another) – EEOP_CONST (3) – EEOP_FUNCEXPR_STRICT (int4eq) – EEOP_BOOL_AND_STEP_LAST (AND) ● direct threaded ● lots of indirect jumps
  • 14. PG 9.5 PG 9.6 PG 10 0 5 10 15 20 25 30 35 40 45 TPCH Q-01 scale 5, fully cached #Processes=1 #Processes=4 Version TimeinSeconds v10+ Expression Evaluation Engine
  • 15. Just-in-Time Compilation ● Interpreted execution → compiled execution – removes interpretation overhead ● Make variables constant – removes branches, indirect jumps, reduces size ● Inline called functions ● PG 11: Optionally uses LLVM
  • 16. SELECT l_returnflag, l_linestatus, sum(l_quantity) AS sum_qty, sum(l_extendedprice) AS sum_base_price, sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge, avg(l_quantity) AS avg_qty, avg(l_extendedprice) AS avg_price, avg(l_discount) AS avg_disc, count(*) AS count_order FROM lineitem WHERE l_shipdate <= date '1998-12-01' - interval '74 days' GROUP BY l_returnflag, l_linestatus ORDER BY l_returnflag, l_linestatus; Just-in-Time Compilation in v11
  • 17. Just-in-Time Compilation in v11 0 1 2 4 8 16 0 200000 400000 600000 800000 1000000 1200000 TPCH Q01 timing scale 100, fully cached PG 9.6 PG 10 master jit=0 master jit=1 parallelism timeinms
  • 18. JIT - Good Cases / Bad Cases ● OLTP → bad, short query → bad ● OLAP → good, long query → good ● IO bound → not necessarily good ● CPU bound → likely good ● lots of aggregates → good ● wide tables (i.e. lots of columns) → good
  • 19. JIT – Planning ● Naive! ● Disabled / Enabled: jit = on / off (currently enabled by default) ● SELECT pg_jit_available(); ● Perform JIT if query_cost > jit_above_cost – default: 100000 ● Optimize if query_cost > jit_optimize_above_cost – default: 500000 ● Inline if query_cost > jit_inline_above_cost – default: 500000 ● -1 disables corresponding feature ● Whole query decision
  • 20. JIT – Planning
      Sort (cost=4311583.16..4311583.18 rows=6 width=68) (actual time=[]18270.844 rows=4 loops=1)
        Sort Key: l_returnflag, l_linestatus
        Sort Method: quicksort Memory: 25kB
        -> HashAggregate (...)
             Group Key: l_returnflag, l_linestatus
             -> Seq Scan on lineitem (...)
                  Filter: (l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone)
                  Rows Removed by Filter: 571965
      Planning Time: 1.956 ms
      JIT:
        Functions: 9
        Generation Time: 4.087 ms
        Inlining: true
        Inlining Time: 62.990 ms
        Optimization: true
        Optimization Time: 121.307 ms
        Emission Time: 102.416 ms
      Execution Time: 18291.397 ms
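The JIT timings in this plan can be summed to see how much of the run they account for. The Python sketch below does that arithmetic with the numbers from the slide. The helper name `jit_overhead` is hypothetical, and treating the four JIT phases as part of the reported Execution Time is an assumption of this sketch.

```python
# Timings copied from the EXPLAIN output on the slide, in milliseconds.
JIT_TIMINGS_MS = {
    "generation": 4.087,
    "inlining": 62.990,
    "optimization": 121.307,
    "emission": 102.416,
}
EXECUTION_TIME_MS = 18291.397


def jit_overhead(timings_ms, execution_ms):
    """Return (total JIT time in ms, JIT time as a fraction of execution time)."""
    total = sum(timings_ms.values())
    return total, total / execution_ms


total_ms, share = jit_overhead(JIT_TIMINGS_MS, EXECUTION_TIME_MS)
print(f"JIT total: {total_ms:.3f} ms, share of execution: {share:.2%}")
```

For this query the JIT phases add up to about 290.8 ms, roughly 1.6% of the execution time, which illustrates why long-running analytic queries are the favourable case.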
  • 21. JIT – how to test
      ● Debian:
          – apt.postgresql.org has v11 beta 1 packages
          – https://wiki.postgresql.org/wiki/Apt/FAQ#I_want_to_try_the_beta_version_of_the_next_PostgreSQL_release
          – install postgresql-11
      ● RHEL:
          – yum.postgresql.org has v11 beta 1 packages
          – https://yum.postgresql.org/repopackages.php#pg11
          – install postgresql11-server postgresql11-llvmjit
              ● depends on EPEL
  • 22. JIT Improvements: Code Generation
      ● Expressions refer to per-query allocated memory
          – generated code references memory locations
          – lots of superfluous memory reads/writes for arguments, which the optimizer can't eliminate in most cases
              ● massively reduces benefits of inlining
          – optimizer can't optimize away lots of memory references
          – FIX: separate permanent and per-evaluation memory
      ● Expression step results refer to persistent memory
          – move to temporary memory
      ● Function Call Interface references persistent memory
          – FIX: pass FunctionCallInfoData and FmgrInfo separately to functions
              ● remove FunctionCallInfoData->flinfo
              ● move context, resultinfo, fncollation to FmgrInfo
              ● move isnull field to separate argument? Return struct?
          – Non-JITed expression evaluation will benefit too
  • 23. JIT Improvements: Caching
      ● Optimizer overhead significant
          – TPCH Q01: unopt, noinline: time to optimize: 0.002s, emit: 0.036s
          – TPCH Q01: time to inline: 0.080s, optimize: 0.163s, emit: 0.082s
      ● references to memory locations prevent caching (i.e. expression codegen has to be optimized first)
      ● Introduce per-backend LRU cache of functions keyed by hash of emitted IR (plus comparator)
          – What to use as cache key?
              ● IR? - requires generating it
              ● Expression Trees?
              ● Prepared Statement?
      ● Shared / Non-Shared / Persistent?
      ● relatively easy task, once pointers removed
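The caching proposal on this slide can be illustrated with a small least-recently-used cache. The Python sketch below reads the slide's cache key as a hash of the emitted IR, which is an interpretation, and uses a full comparison of the IR text as the "comparator" that guards against hash collisions. The class `CompiledFunctionCache`, its methods and the `compile_fn` callback are hypothetical and only show the shape of the idea; this is not a proposed PostgreSQL design.

```python
import hashlib
from collections import OrderedDict


class CompiledFunctionCache:
    """Per-process LRU cache of compiled functions, keyed by a hash of the IR."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._entries = OrderedDict()  # hash -> (ir_text, compiled function)
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(ir_text):
        return hashlib.sha256(ir_text.encode("utf-8")).hexdigest()

    def get_or_compile(self, ir_text, compile_fn):
        """Return the cached result for ir_text, compiling it on a miss."""
        key = self._key(ir_text)
        entry = self._entries.get(key)
        # Comparator: only accept the entry if the IR itself matches.
        if entry is not None and entry[0] == ir_text:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return entry[1]
        self.misses += 1
        compiled = compile_fn(ir_text)
        self._entries[key] = (ir_text, compiled)
        self._entries.move_to_end(key)
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return compiled
```

Keying on the IR means the IR must be generated before every lookup, which is the cost the slide notes for that choice. The sketch also only works if the IR text is stable across executions, which is why embedded pointers have to go first.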
  • 24. JIT Improvements: Incremental JITing [Diagram: timelines built from the phases "JIT", "Query Execution" and "JITed Query Execution"]
  • 25. JIT Improvements: Planning
      ● Whole-query decision too coarse
          – use estimates of how often each function will be evaluated?
      ● JIT more aggressively when using prepared statements?
          – after repeated executions?
  • 26. Future things to JIT
      ● Executor control flow
          – hard, but lots of other benefits (asynchronous execution, non-JITed will be faster, less memory)
      ● COPY parsing, input / output function invocation
          – easy to medium
      ● Aggregate & Hashjoin hash computation
          – easy
      ● in-memory tuplesort
          – including tuple deforming (from MinimalTuple)
          – easy
  • 27. OLTP Workloads [Chart: pgbench readonly, scale 100 (on laptop); tps by version (PG 9.6, PG 10, PG 11) for 1 client and 16 clients]
  • 28. OLTP Workloads: Bottlenecks in v11
      - 97.68% PostgresMain
         - 32.09% exec_execute_message (inlined)
            - 95.13% PortalRun
               - 92.08% PortalRunSelect
                  - 98.38% standard_ExecutorRun
                     + 98.23% ExecutePlan (inlined)
                     + 0.86% printtup_startup
         - 31.90% exec_bind_message (inlined)
            - 53.38% PortalStart
               - 86.96% standard_ExecutorStart
                  + 96.43% InitPlan (inlined)
                  + 3.29% CreateExecutorState
            + 10.56% GetCachedPlan
            + 8.14% start_xact_command (inlined)
         + 13.06% finish_xact_command
         + 10.31% ReadCommand (inlined)
  • 29. OLTP Workloads: Bottlenecks in v11
      ● Too much work done on every query execution
          – move to planner, helping prepared statements
          – reduce duplicated work
      ● Execution uses too much memory / too many small allocations
          – rework executor data structures
          – batch all allocations into a single memory allocation
  • 33. JIT
      static pg_attribute_always_inline TupleTableSlot *
      ExecHashJoinImpl(PlanState *pstate, bool parallel)
      {
          ...
          for (;;)
          {
              /* lotsa code */
              …
              slot = ExecProcNode(outerNode);
              …
  • 34. Linearized Programs
      SELECT
          l_returnflag,
          l_linestatus,
          sum(l_quantity) AS sum_qty,
          sum(l_extendedprice) AS sum_base_price,
          sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
          sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
          avg(l_quantity) AS avg_qty,
          avg(l_extendedprice) AS avg_price,
          avg(l_discount) AS avg_disc,
          count(*) AS count_order
      FROM lineitem
      WHERE l_shipdate <= date '1998-12-01' - interval '74 days'
      GROUP BY l_returnflag, l_linestatus
      ORDER BY l_returnflag, l_linestatus;

      0: init_sort
      1: seqscan_first
      2: seqscan [j empty 5] > s0
      3: qual [j fail 2] < scan s0
      4: hashagg_tuple [j 2] < s0
      5: drain_hashagg [j empty 7] > s1
      6: sort_tuple [j 5] < s1
      7: sort
      8: drain_sort [j empty 10] > s2
      9: return < s2 [next 8]
      10: done
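The numbered step list on this slide can be executed by one loop with a program counter instead of nested executor-node calls. The Python sketch below is one possible reading of the notation, not the author's design: `[j empty N]` is taken as "jump to N when the input is exhausted", `[j fail N]` as "jump to N when the qual fails", `[j N]` as an unconditional jump, `> sX` and `< sX` as writing and reading a slot, and `[next 8]` as "resume at step 8 after returning a row". The function `run_linearized` and its callback parameters are hypothetical, and the aggregate is reduced to a single sum per group.

```python
def run_linearized(rows, qual, key, value):
    """Run the slide's step list as a flat loop; yields (group key, sum) pairs."""
    pc = 0
    scan = None                  # iterator over the input rows
    groups = {}                  # hash aggregate state: group key -> running sum
    sort_buf = []                # rows collected for the sort
    drain_agg = drain_sort = None
    s0 = s1 = s2 = None          # the slots named on the slide
    while True:
        if pc == 0:              # init_sort
            sort_buf = []
            pc = 1
        elif pc == 1:            # seqscan_first
            scan = iter(rows)
            pc = 2
        elif pc == 2:            # seqscan [j empty 5] > s0
            s0 = next(scan, None)
            pc = 5 if s0 is None else 3
        elif pc == 3:            # qual [j fail 2] < s0
            pc = 4 if qual(s0) else 2
        elif pc == 4:            # hashagg_tuple [j 2] < s0
            groups[key(s0)] = groups.get(key(s0), 0) + value(s0)
            pc = 2
        elif pc == 5:            # drain_hashagg [j empty 7] > s1
            if drain_agg is None:
                drain_agg = iter(groups.items())
            s1 = next(drain_agg, None)
            pc = 7 if s1 is None else 6
        elif pc == 6:            # sort_tuple [j 5] < s1
            sort_buf.append(s1)
            pc = 5
        elif pc == 7:            # sort
            sort_buf.sort(key=lambda pair: pair[0])
            drain_sort = iter(sort_buf)
            pc = 8
        elif pc == 8:            # drain_sort [j empty 10] > s2
            s2 = next(drain_sort, None)
            pc = 10 if s2 is None else 9
        elif pc == 9:            # return < s2 [next 8]
            yield s2
            pc = 8
        else:                    # 10: done
            return
```

The generator's `yield` plays the role of step 9: the caller receives one row, and the next request continues at step 8 with all state kept in local variables rather than on a call stack.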