SlideShare ist ein Scribd-Unternehmen logo
1 von 48
Downloaden Sie, um offline zu lesen
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
February 14, 2019
PostgreSQL:
Decoding Partition
Beena Emerson
1
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
INTRODUCTION
● What is Partitioning?
● Why partition?
● When to Partition?
2
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
What is Partitioning?
Subdividing a database table into smaller parts.
PARENT TABLE
APPLICATION
INTERNAL
PARTITIONS
3
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Why Partition?
10 mn
rows
2
mn
A = 100?
Faster Data Access
Scanning smaller tables takes less time.
4
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Why Partition?
TABLE SIZE
PERFORMANCEScalability
With partition, we would be dealing with multiple tables of smaller sizes.
5
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Why Partition?
Easy Maintenance
5000
ms
200
ms
180
ms
Commands are faster and can be run in parallel
6
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
When to Partition?
Table Size
● Rule of thumb: size of the table exceeds physical
memory of the database server.
● Extensive DML operations
7
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
When to Partition?
Targeted Columns
Col 1 Col 2 Col 3 Col4
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
75
%
Good partition key candidate
WHERE col3 = xx
8
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
When to Partition?
Hot and Cold Data
FEB
2019
JAN
2019
DEC
2018
NOV
2018
FASTER DISK
9
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
CAUTION!
Bad partitioning is worse than no partitioning!
Choose partitioning keys, strategy and no of partitions
after thorough analysis and testing.
10
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
PARTITIONING IN POSTGRESQL
● Inheritance (Trigger- based) Partition
● Declarative Partitioning
11
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
● Since PostgreSQL 8.1
● Table inheritance simulating partitioning
A = 3
PARENT TABLE TRIGGER
CHECK
CHILD TABLES
Inheritance (Trigger- based) Partition
12
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Inheritance (Trigger- based) Partition
Problems
● Partitions tracking is manual and error-prone.
● Slower Performance
● Constraints may not be mutually exclusive
13
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Declarative Partitioning
● Introduced in PostgreSQL 10
● Somewhat automates creation of partitions
● Easy partition maintenance
○ No overlapping boundaries
○ Add/Delete partition easily
○ Simpler syntax
● Scope for better performance
14
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Declarative Partitioning
Terms
● Partitioned table - parent table
● Partition key - column on which table is
partitioned
● Partitions - the smaller child tables
● Partition bounds - constraints of each partition
15
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Declarative Partitioning
● Create Partitioned Table
CREATE TABLE parent ( <col list > )
PARTITION BY <partition strategy>
( <partition key> );
● Create Partitions
CREATE TABLE child
PARTITION OF parent <partition bounds>;
… repeat for required partitions
16
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Declarative Partitioning
● Existing table can be added as a partition to a
partitioned table
ALTER TABLE parent ATTACH PARTITION p1 <bounds>;
All entries in the table will be checked.
● A partition can be removed and made standalone
ALTER TABLE parent DETACH PARTITION child;
17
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
PARTITION STRATEGY AND TYPE
● Strategy
○ List
○ Range
○ Hash
● Types
○ Multi-column partitioning
○ Multi-level partitioning
18
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
List Strategy
MUMBAI,
PUNE
GOA
CHENNAI,
OOTY
BANGLORE
● PostgreSQL 10
● Explicitly specify key values
○ Single or multiple value per
partition
19
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
List Strategy
● Create Partitioned Table
CREATE TABLE parent ( <col list > )
PARTITION BY LIST ( <partition key> );
● Create Partitions
CREATE TABLE child PARTITION OF parent
FOR VALUES
IN ( <val1> [, <val2>] ) ;
20
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Range Strategy
● PostgreSQL 10
● Range boundaries
○ Lower inclusive (>=)
○ Upper exclusive (<)
● Unbounded values
○ MINVALUE
○ MAXVALUE
1 - 100
100 - 200
200 - 300
300 - 400
21
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Range Strategy
● Create Partitioned Table
CREATE TABLE parent ( <col list > )
PARTITION BY RANGE ( <partition key> );
● Create Partitions
CREATE TABLE child PARTITION OF parent
FOR VALUES
FROM ( <lower> ) TO ( <upper> ) ;
22
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Hash Strategy
● PostgreSQL 11
● Data does not have natural boundaries
● Specify modulus and remainder
○ Modulus - non-zero positive integer
○ Remainder - non-negative integer
○ remainder < modulus
● Rows spread on hash value of the partition key
23
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Hash Strategy
Div
m
ParentA = 3
Hash
fn
24
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Hash Strategy
● Create Partitioned Table
CREATE TABLE parent ( <col list > )
PARTITION BY HASH ( <partition key> );
● Create Partitions
CREATE TABLE child PARTITION OF parent
FOR VALUES
WITH ( MODULUS <m>, REMAINDER <r> );
25
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Multi-Column Partitioning
● Multiple columns as partition key
● Supported for range and hash
● Column limit : 32
26
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Multi-Column Partitioning
Range Partition
CREATE TABLE rparent (col1 int, col2 int)
PARTITION BY RANGE (col1, col2);
CREATE TABLE rpart_1 PARTITION OF rparent
FOR VALUES FROM (0, 0) TO (100, 50);
CREATE TABLE rpart_2 PARTITION OF rparent
FOR VALUES FROM (100, 50)
TO (MAXVALUE, MAXVALUE);
27
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Multi-Column Partitioning
Range Partition
● Every column following MAXVALUE / MINVALUE must
also be the same.
● The row comparison operator is used for insert
○ Elements are compared left-to-right, stopping at
first unequal pair of elements.
Consider partition (0, 0) TO (100, 50)
(0, 199), (100, 49) fits while
(100, 50) does not.
28
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Multi-Column Partitioning
Hash Partition
CREATE TABLE hparent (col1 int, col2 int)
PARTITION BY HASH (col1, col2);
CREATE TABLE hpart_2 PARTITION OF hparent FOR
VALUES WITH (MODULUS 3, REMAINDER 2);
CREATE TABLE hpart_1 PARTITION OF hparent FOR
VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE hpart_0 PARTITION OF hparent FOR
VALUES WITH (MODULUS 3, REMAINDER 0);
29
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Multi-Column Partitioning
Hash Partition
● Only a single modulus, remainder pair is used.
● The hash of each of partition key is calculated and
then combined to get a single hash value which is
divided by the modulus specified.
30
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Multi-level Partitioning
31
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Multi-level Partitioning
● Create a partitioned partition
CREATE TABLE child1 PARTITION OF parent
FOR VALUES FROM (0) TO (100)
PARTITION BY LIST (col3);
● Attach a partitioned table as a partition.
● There is no check to ensure that the sub partition
bounds are a subset of the partition’s own bound.
CREATE TABLE child1_1 PARTITION OF child1
FOR VALUES IN (500, 2000);
32
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
OTHER FEATURES
● Default Partition
● Handling NULL values
● Inherit Constraints
● Update tuple routing
● Foreign Table Partitions
● Partition Pruning
33
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
● PostgreSQL 11
● Catch tuples that do not match
partition bounds of others.
● Support for: list, range
● Syntax:
CREATE TABLE child
PARTITION OF parent
DEFAULT
A
B
C
DEFAULT
Default Partition
Z
(NOT (col1 IS NOT NULL) AND
(col1 = ANY (ARRAY[ 'A', 'B', 'C' ])))
34
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Default Partition
Add new partition
Default partition should not
have rows that satisfy the
new partition.
ERROR: updated partition constraint for default
partition "part_def" would be violated by some row
1 - 100
100 - 200
DEF: 205, 410
200 - 300
35
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Handling NULL values
● Hash routes based on the hash.
● Range: can be inserted in default partition.
● List: A partition can be made to accept NULL else
it is routed to default partition.
CREATE TABLE child PARTITION OF lparent
FOR VALUES
IN ( NULL ) ;
36
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Inherit Constraints
Constraint Types
● Check
● Not-Null
● Unique
● Primary Keys
● Foreign Keys
● Exclude
37
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Inherit Constraints
PostgreSQL 10
● Supported: Check, Not Null
● Other constraints can be explicitly added to
individual partition.
● No cascading inheritance
38
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Inherit Constraints
PostgreSQL 11
● Added Support: unique, primary and foreign keys
● Not supported: exclude constraints
● Cascading: New constraints added to parent
propagates to partitions.
● Other constraints can be explicitly added to
individual partition.
39
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Update Tuple Routing
PostgreSQL 11
● When an updated tuple no longer fits the partition,
route it to the correct one.
1 - 100
100 - 200
200 - 300
300 - 400
150
380
40
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Foreign Table Partitions
● Can add foreign table as a partition.
CREATE FOREIGN TABLE lpart_f1 PARTITION OF
lparent FOR VALUES IN ('C' , 'E') SERVER s0
● Tuples can be inserted from v11.
● Cannot be partitioned.
● Updated tuples can me moved to foreign
partitions but not from them.
41
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Partition Pruning
0 - 100
100 - 200
200 - 300
300 - 400
180 - 220
Planner Level Pruning
Depending on the query conditions
on partition key, scan only the
relevant partitions.
col1
42
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Partition Pruning
Planner Level Pruning
=# EXPLAIN (COSTS OFF, TIMING OFF, ANALYZE) SELECT *
FROM rparent WHERE col1 BETWEEN 180 AND 220;
Append (actual rows=41 loops=1)
-> Seq Scan on rpart_2 (actual rows=20 loops=1)
Filter: ((col1 >= 180) AND (col1 <= 220))
Rows Removed by Filter: 80
-> Seq Scan on rpart_3 (actual rows=21 loops=1)
Filter: ((col1 >= 180) AND (col1 <= 220))
Rows Removed by Filter: 79
43
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Partition Pruning
Runtime Partition Pruning
● PostgreSQL 11
● Performed at two levels
○ Executor Initialization - prepared query
○ Actual Execution - nested loop join
44
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Partition Pruning
Runtime Pruning - Executor Initialization
=# PREPARE r_q1 (int) as SELECT * from rparent where
col1 < $1;
=# EXPLAIN execute r_q1(150);
Append (cost=0.00..168.06 rows=3012 width=8)
Subplans Removed: 2
-> Seq Scan on rpart_1
Filter: (col1 < $1)
-> Seq Scan on rpart_2
Filter: (col1 < $1)
(6 rows)
45
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Partition Pruning
Runtime Pruning - Actual execution
Nested Loop Join
100 - 200
0 - 100
200 - 300
300 - 400
10000 rows
1-50
Rest > 500
(50 times)
46
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
Partition Pruning
Runtime Pruning - Actual execution
=#EXPLAIN SELECT * FROM tbl JOIN rparent ON id= col1;
Nested Loop (actual rows=50 loops=1)
-> Seq Scan on tbl (actual rows=10000 loops=1)
-> Append (actual rows=0 loops=10000)
-> Index Scan using rpart_1_pkey on rpart_1 ( actual rows=1
loops=50)
Index Cond: (col1 = tbl.id)
-> Index Scan using rpart_2_pkey on rpart_2 (never executed)
Index Cond: (col1 = tbl.id)
-> Index Scan using rpart_3_pkey on rpart_3 (never executed)
Index Cond: (col1 = tbl.id)
-> Index Scan using rpart_4_pkey on rpart_4 (never executed)
Index Cond: (col1 = tbl.id)
47
© Copyright EnterpriseDB Corporation, 2019. All rights reserved.© Copyright EnterpriseDB Corporation, 2019. All rights reserved.
THANK YOU!
48

Weitere ähnliche Inhalte

Was ist angesagt?

Hive sql的编译过程
Hive sql的编译过程Hive sql的编译过程
Hive sql的编译过程
Chen Chun
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
Cloudera, Inc.
 

Was ist angesagt? (20)

Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas Patil
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Hive sql的编译过程
Hive sql的编译过程Hive sql的编译过程
Hive sql的编译过程
 
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardDelta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
 
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
 
Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived
 
Common Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta LakehouseCommon Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta Lakehouse
 
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...
 
Parquet and AVRO
Parquet and AVROParquet and AVRO
Parquet and AVRO
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloads
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational CacheUsing Apache Arrow, Calcite, and Parquet to Build a Relational Cache
Using Apache Arrow, Calcite, and Parquet to Build a Relational Cache
 
Scaling HBase for Big Data
Scaling HBase for Big DataScaling HBase for Big Data
Scaling HBase for Big Data
 
What's New in Apache Hive
What's New in Apache HiveWhat's New in Apache Hive
What's New in Apache Hive
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Introducing Delta Live Tables: Make Reliable ETL Easy on Delta Lake
Introducing Delta Live Tables: Make Reliable ETL Easy on Delta LakeIntroducing Delta Live Tables: Make Reliable ETL Easy on Delta Lake
Introducing Delta Live Tables: Make Reliable ETL Easy on Delta Lake
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
 
Full Page Writes in PostgreSQL PGCONFEU 2022
Full Page Writes in PostgreSQL PGCONFEU 2022Full Page Writes in PostgreSQL PGCONFEU 2022
Full Page Writes in PostgreSQL PGCONFEU 2022
 
Optimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL JoinsOptimizing Apache Spark SQL Joins
Optimizing Apache Spark SQL Joins
 

Ähnlich wie PostgreSQL - Decoding Partitions

Ähnlich wie PostgreSQL - Decoding Partitions (20)

The Truth About Partitioning
The Truth About PartitioningThe Truth About Partitioning
The Truth About Partitioning
 
New and Improved Features in PostgreSQL 13
New and Improved Features in PostgreSQL 13New and Improved Features in PostgreSQL 13
New and Improved Features in PostgreSQL 13
 
Practical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresPractical Partitioning in Production with Postgres
Practical Partitioning in Production with Postgres
 
Practical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresPractical Partitioning in Production with Postgres
Practical Partitioning in Production with Postgres
 
PostgreSQL 13 is Coming - Find Out What's New!
PostgreSQL 13 is Coming - Find Out What's New!PostgreSQL 13 is Coming - Find Out What's New!
PostgreSQL 13 is Coming - Find Out What's New!
 
Divide and Rule partitioning in pg11
Divide and Rule partitioning in pg11Divide and Rule partitioning in pg11
Divide and Rule partitioning in pg11
 
Large Table Partitioning with PostgreSQL and Django
 Large Table Partitioning with PostgreSQL and Django Large Table Partitioning with PostgreSQL and Django
Large Table Partitioning with PostgreSQL and Django
 
Db2 performance tuning for dummies
Db2 performance tuning for dummiesDb2 performance tuning for dummies
Db2 performance tuning for dummies
 
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
 
Partition and conquer large data in PostgreSQL 10
Partition and conquer large data in PostgreSQL 10Partition and conquer large data in PostgreSQL 10
Partition and conquer large data in PostgreSQL 10
 
Les 18 space
Les 18 spaceLes 18 space
Les 18 space
 
Don't Do This [FOSDEM 2023]
Don't Do This [FOSDEM 2023]Don't Do This [FOSDEM 2023]
Don't Do This [FOSDEM 2023]
 
Inno db 5_7_features
Inno db 5_7_featuresInno db 5_7_features
Inno db 5_7_features
 
PGConf.ASIA 2019 Bali - How did PostgreSQL Write Load Balancing of Queries Us...
PGConf.ASIA 2019 Bali - How did PostgreSQL Write Load Balancing of Queries Us...PGConf.ASIA 2019 Bali - How did PostgreSQL Write Load Balancing of Queries Us...
PGConf.ASIA 2019 Bali - How did PostgreSQL Write Load Balancing of Queries Us...
 
MySQL Indexing
MySQL IndexingMySQL Indexing
MySQL Indexing
 
A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house query
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
 
初探AWS 平台上的 NoSQL 雲端資料庫服務
初探AWS 平台上的 NoSQL 雲端資料庫服務初探AWS 平台上的 NoSQL 雲端資料庫服務
初探AWS 平台上的 NoSQL 雲端資料庫服務
 
MySQL Optimizer: What’s New in 8.0
MySQL Optimizer: What’s New in 8.0MySQL Optimizer: What’s New in 8.0
MySQL Optimizer: What’s New in 8.0
 
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
DB2 LUW V11.1 CERTIFICATION TRAINING PART #1
 

Kürzlich hochgeladen

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Kürzlich hochgeladen (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

PostgreSQL - Decoding Partitions

  • 1. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. February 14, 2019 PostgreSQL: Decoding Partition Beena Emerson 1
  • 2. © Copyright EnterpriseDB Corporation, 2019. All rights reserved.© Copyright EnterpriseDB Corporation, 2019. All rights reserved. INTRODUCTION ● What is Partitioning? ● Why partition? ● When to Partition? 2
  • 3. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. What is Partitioning? Subdividing a database table into smaller parts. PARENT TABLE APPLICATION INTERNAL PARTITIONS 3
  • 4. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Why Partition? 10 mn rows 2 mn A = 100? Faster Data Access Scanning smaller tables takes less time. 4
  • 5. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Why Partition? TABLE SIZE PERFORMANCEScalability With partition, we would be dealing with multiple tables of smaller sizes. 5
  • 6. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Why Partition? Easy Maintenance 5000 ms 200 ms 180 ms Commands are faster and can be run in parallel 6
  • 7. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. When to Partition? Table Size ● Rule of thumb: size of the table exceeds physical memory of the database server. ● Extensive DML operations 7
  • 8. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. When to Partition? Targeted Columns Col 1 Col 2 Col 3 Col4 . . . . . . . . . . . . . . . . . . . . 75 % Good partition key candidate WHERE col3 = xx 8
  • 9. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. When to Partition? Hot and Cold Data FEB 2019 JAN 2019 DEC 2018 NOV 2018 FASTER DISK 9
  • 10. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. CAUTION! Bad partitioning is worse than no partitioning! Choose partitioning keys, strategy and no of partitions after thorough analysis and testing. 10
  • 11. © Copyright EnterpriseDB Corporation, 2019. All rights reserved.© Copyright EnterpriseDB Corporation, 2019. All rights reserved. PARTITIONING IN POSTGRESQL ● Inheritance (Trigger- based) Partition ● Declarative Partitioning 11
  • 12. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. ● Since PostgreSQL 8.1 ● Table inheritance simulating partitioning A = 3 PARENT TABLE TRIGGER CHECK CHILD TABLES Inheritance (Trigger- based) Partition 12
  • 13. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Inheritance (Trigger- based) Partition Problems ● Partitions tracking is manual and error-prone. ● Slower Performance ● Constraints may not be mutually exclusive 13
  • 14. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Declarative Partitioning ● Introduced in PostgreSQL 10 ● Somewhat automates creation of partitions ● Easy partition maintenance ○ No overlapping boundaries ○ Add/Delete partition easily ○ Simpler syntax ● Scope for better performance 14
  • 15. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Declarative Partitioning Terms ● Partitioned table - parent table ● Partition key - column on which table is partitioned ● Partitions - the smaller child tables ● Partition bounds - constraints of each partition 15
  • 16. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Declarative Partitioning ● Create Partitioned Table CREATE TABLE parent ( <col list > ) PARTITION BY <partition strategy> ( <partition key> ); ● Create Partitions CREATE TABLE child PARTITION OF parent <partition bounds>; … repeat for required partitions 16
  • 17. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Declarative Partitioning ● Existing table can be added as a partition to a partitioned table ALTER TABLE parent ATTACH PARTITION p1 <bounds>; All entries in the table will be checked. ● A partition can be removed and made standalone ALTER TABLE parent DETACH PARTITION child; 17
  • 18. © Copyright EnterpriseDB Corporation, 2019. All rights reserved.© Copyright EnterpriseDB Corporation, 2019. All rights reserved. PARTITION STRATEGY AND TYPE ● Strategy ○ List ○ Range ○ Hash ● Types ○ Multi-column partitioning ○ Multi-level partitioning 18
  • 19. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. List Strategy MUMBAI, PUNE GOA CHENNAI, OOTY BANGLORE ● PostgreSQL 10 ● Explicitly specify key values ○ Single or multiple value per partition 19
  • 20. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. List Strategy ● Create Partitioned Table CREATE TABLE parent ( <col list > ) PARTITION BY LIST ( <partition key> ); ● Create Partitions CREATE TABLE child PARTITION OF parent FOR VALUES IN ( <val1> [, <val2>] ) ; 20
  • 21. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Range Strategy ● PostgreSQL 10 ● Range boundaries ○ Lower inclusive (>=) ○ Upper exclusive (<) ● Unbounded values ○ MINVALUE ○ MAXVALUE 1 - 100 100 - 200 200 - 300 300 - 400 21
  • 22. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Range Strategy ● Create Partitioned Table CREATE TABLE parent ( <col list > ) PARTITION BY RANGE ( <partition key> ); ● Create Partitions CREATE TABLE child PARTITION OF parent FOR VALUES FROM ( <lower> ) TO ( <upper> ) ; 22
  • 23. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Hash Strategy ● PostgreSQL 11 ● Data does not have natural boundaries ● Specify modulus and remainder ○ Modulus - non-zero positive integer ○ Remainder - non-negative integer ○ remainder < modulus ● Rows spread on hash value of the partition key 23
  • 24. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Hash Strategy Div m ParentA = 3 Hash fn 24
  • 25. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Hash Strategy ● Create Partitioned Table CREATE TABLE parent ( <col list > ) PARTITION BY HASH ( <partition key> ); ● Create Partitions CREATE TABLE child PARTITION OF parent FOR VALUES WITH ( MODULUS <m>, REMAINDER <r> ); 25
  • 26. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Multi-Column Partitioning ● Multiple columns as partition key ● Supported for range and hash ● Column limit : 32 26
  • 27. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Multi-Column Partitioning Range Partition CREATE TABLE rparent (col1 int, col2 int) PARTITION BY RANGE (col1, col2); CREATE TABLE rpart_1 PARTITION OF rparent FOR VALUES FROM (0, 0) TO (100, 50); CREATE TABLE rpart_2 PARTITION OF rparent FOR VALUES FROM (100, 50) TO (MAXVALUE, MAXVALUE); 27
  • 28. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Multi-Column Partitioning Range Partition ● Every column following MAXVALUE / MINVALUE must also be the same. ● The row comparison operator is used for insert ○ Elements are compared left-to-right, stopping at first unequal pair of elements. Consider partition (0, 0) TO (100, 50) (0, 199), (100, 49) fits while (100, 50) does not. 28
  • 29. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Multi-Column Partitioning Hash Partition CREATE TABLE hparent (col1 int, col2 int) PARTITION BY HASH (col1, col2); CREATE TABLE hpart_2 PARTITION OF hparent FOR VALUES WITH (MODULUS 3, REMAINDER 2); CREATE TABLE hpart_1 PARTITION OF hparent FOR VALUES WITH (MODULUS 3, REMAINDER 1); CREATE TABLE hpart_0 PARTITION OF hparent FOR VALUES WITH (MODULUS 3, REMAINDER 0); 29
  • 30. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Multi-Column Partitioning Hash Partition ● Only a single modulus, remainder pair is used. ● The hash of each of partition key is calculated and then combined to get a single hash value which is divided by the modulus specified. 30
  • 31. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Multi-level Partitioning 31
  • 32. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Multi-level Partitioning ● Create a partitioned partition CREATE TABLE child1 PARTITION OF parent FOR VALUES FROM (0) TO (100) PARTITION BY LIST (col3); ● Attach a partitioned table as a partition. ● There is no check to ensure that the sub partition bounds are a subset of the partition’s own bound. CREATE TABLE child1_1 PARTITION OF child1 FOR VALUES IN (500, 2000); 32
  • 33. © Copyright EnterpriseDB Corporation, 2019. All rights reserved.© Copyright EnterpriseDB Corporation, 2019. All rights reserved. OTHER FEATURES ● Default Partition ● Handling NULL values ● Inherit Constraints ● Update tuple routing ● Foreign Table Partitions ● Partition Pruning 33
  • 34. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. ● PostgreSQL 11 ● Catch tuples that do not match partition bounds of others. ● Support for: list, range ● Syntax: CREATE TABLE child PARTITION OF parent DEFAULT A B C DEFAULT Default Partition Z (NOT (col1 IS NOT NULL) AND (col1 = ANY (ARRAY[ 'A', 'B', 'C' ]))) 34
  • 35. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Default Partition Add new partition Default partition should not have rows that satisfy the new partition. ERROR: updated partition constraint for default partition "part_def" would be violated by some row 1 - 100 100 - 200 DEF: 205, 410 200 - 300 35
  • 36. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Handling NULL values ● Hash routes based on the hash. ● Range: can be inserted in default partition. ● List: A partition can be made to accept NULL else it is routed to default partition. CREATE TABLE child PARTITION OF lparent FOR VALUES IN ( NULL ) ; 36
  • 37. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Inherit Constraints Constraint Types ● Check ● Not-Null ● Unique ● Primary Keys ● Foreign Keys ● Exclude 37
  • 38. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Inherit Constraints PostgreSQL 10 ● Supported: Check, Not Null ● Other constraints can be explicitly added to individual partition. ● No cascading inheritance 38
  • 39. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Inherit Constraints PostgreSQL 11 ● Added Support: unique, primary and foreign keys ● Not supported: exclude constraints ● Cascading: New constraints added to parent propagates to partitions. ● Other constraints can be explicitly added to individual partition. 39
  • 40. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Update Tuple Routing PostgreSQL 11 ● When an updated tuple no longer fits the partition, route it to the correct one. 1 - 100 100 - 200 200 - 300 300 - 400 150 380 40
  • 41. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Foreign Table Partitions ● Can add foreign table as a partition. CREATE FOREIGN TABLE lpart_f1 PARTITION OF lparent FOR VALUES IN ('C' , 'E') SERVER s0 ● Tuples can be inserted from v11. ● Cannot be partitioned. ● Updated tuples can me moved to foreign partitions but not from them. 41
  • 42. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Partition Pruning 0 - 100 100 - 200 200 - 300 300 - 400 180 - 220 Planner Level Pruning Depending on the query conditions on partition key, scan only the relevant partitions. col1 42
  • 43. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Partition Pruning Planner Level Pruning =# EXPLAIN (COSTS OFF, TIMING OFF, ANALYZE) SELECT * FROM rparent WHERE col1 BETWEEN 180 AND 220; Append (actual rows=41 loops=1) -> Seq Scan on rpart_2 (actual rows=20 loops=1) Filter: ((col1 >= 180) AND (col1 <= 220)) Rows Removed by Filter: 80 -> Seq Scan on rpart_3 (actual rows=21 loops=1) Filter: ((col1 >= 180) AND (col1 <= 220)) Rows Removed by Filter: 79 43
  • 44. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Partition Pruning Runtime Partition Pruning ● PostgreSQL 11 ● Performed at two levels ○ Executor Initialization - prepared query ○ Actual Execution - nested loop join 44
  • 45. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Partition Pruning Runtime Pruning - Executor Initialization =# PREPARE r_q1 (int) as SELECT * from rparent where col1 < $1; =# EXPLAIN execute r_q1(150); Append (cost=0.00..168.06 rows=3012 width=8) Subplans Removed: 2 -> Seq Scan on rpart_1 Filter: (col1 < $1) -> Seq Scan on rpart_2 Filter: (col1 < $1) (6 rows) 45
  • 46. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Partition Pruning Runtime Pruning - Actual execution Nested Loop Join 100 - 200 0 - 100 200 - 300 300 - 400 10000 rows 1-50 Rest > 500 (50 times) 46
  • 47. © Copyright EnterpriseDB Corporation, 2019. All rights reserved. Partition Pruning Runtime Pruning - Actual execution =#EXPLAIN SELECT * FROM tbl JOIN rparent ON id= col1; Nested Loop (actual rows=50 loops=1) -> Seq Scan on tbl (actual rows=10000 loops=1) -> Append (actual rows=0 loops=10000) -> Index Scan using rpart_1_pkey on rpart_1 ( actual rows=1 loops=50) Index Cond: (col1 = tbl.id) -> Index Scan using rpart_2_pkey on rpart_2 (never executed) Index Cond: (col1 = tbl.id) -> Index Scan using rpart_3_pkey on rpart_3 (never executed) Index Cond: (col1 = tbl.id) -> Index Scan using rpart_4_pkey on rpart_4 (never executed) Index Cond: (col1 = tbl.id) 47
  • 48. © Copyright EnterpriseDB Corporation, 2019. All rights reserved.© Copyright EnterpriseDB Corporation, 2019. All rights reserved. THANK YOU! 48