2. PostgreSQL
Architecture,
Installation &
Configuration
Part-1
• Postgres Architecture
• Process and Memory Architecture
• Postgres Server Process
• Backend Processes &Background Processes
• Buffer Manager Structure
• Write Ahead Logging
• PostgreSQL Installation
• Setting Environment Variables
3. Architecture of PostgreSQL
• The physical structure of
PostgreSQL is very simple, it
consists of the following
components:
Shared Memory
Background processes
Data directory structure /
Data files
4. Data Files / Data Directory Structure
• PostgreSQL consist of multiple
databases this is called a database
cluster. When we initialize
PostgreSQL database template0,
template1 and Postgres databases
are created.
• Template0 and template1 are
template databases for new
database creation of user it contains
the system catalog tables.
• The user database will be created
by cloning the template1 database.
5. Process Architecture
• PostgreSQL is a client/server type relational database
management system with the multi-process architecture and
runs on a single host.
• A collection of multiple processes cooperatively managing
one database cluster is usually referred to as a 'PostgreSQL
server', and it contains the following types of processes:
• A postgres server process is a parent of all processes
related to a database cluster management.
• Each backend process handles all queries and statements
issued by a connected client.
• Various background processes perform processes of each
feature (e.g., VACUUM and CHECKPOINT processes) for
database management.
• In the replication associated processes, they perform the
streaming replication.
• In the background worker process supported from
version 9.3
6. Postgres Server Process
• a postgres server process is a parent of all in a PostgreSQL server. In the earlier versions, it
was called ‘postmaster’.
• By executing the pg_ctl utility with start option, a postgres server process starts up. Then,
it allocates a shared memory area in memory, starts various background processes, starts
replication associated processes and background worker processes if necessary, and waits
for connection requests from clients. Whenever receiving a connection request from a
client, it starts a backend process. (And then, the started backend process handles all
queries issued by the connected client.)
• A postgres server process listens to one network port, the default port is 5432. Although
more than one PostgreSQL server can be run on the same host, each server should be set
to listen to different port number in each other, e.g., 5432, 5433, etc.
7. Backend Processes
• A backend process, which is also called postgres, is started by the postgres server process and handles all queries issued
by one connected client. It communicates with the client by a single TCP connection, and terminates when the client
gets disconnected.
• As it is allowed to operate only one database, you have to specify a database you want to use explicitly when connecting
to a PostgreSQL server.
• PostgreSQL allows multiple clients to connect simultaneously; the configuration parameter max_connections controls
the maximum number of the clients (default is 100).
• The maximum number of backend processes is set by the max_connections parameter, and the default value is 100.
• The backend process performs the query request of the user process and then transmits the result. Some memory
structures are required for query execution, which is called local memory. The main parameters associated with local
memory are:
• work_mem Space used for sorting, bitmap operations, hash joins, and merge joins. The default setting is 4 MB.
• Maintenance_work_mem Space used for Vacuum and CREATE INDEX . The default setting is 64 MB.
• Temp_buffers Space used for temporary tables. The default setting is 8 MB.
8. Background Processes
process description
background writer • In this process, dirty pages on the shared buffer pool are written to a persistent storage (e.g., HDD, SSD) on a regular basis gradually.
• In other word this process writes and flushes periodically the WAL data on the WAL buffer to persistent storage.
checkpointer • The actual work of this process is when a checkpoint occurs it will write dirty buffer into a file.
• Checkpointer will write all dirty pages from memory to disk and clean shared buffers area. If PostgreSQL database is crashed, we can
measure data loss between last checkpoint time and PostgreSQL stopped time. The checkpoint command forces an immediate checkpoint
when the command is executed manually. Only database superuser can call checkpoint.
The checkpoint will occur in the following scenarios:
• The pages are dirty.
• Starting and restarting the DB server (pg_ctl STOP | RESTART).
• Issue of the commit.
• Starting the database backup (pg_start_backup).
• Stopping the database backup (pg_stop_backup).
• Creation of the database.
autovacuum launcher • The autovacuum-worker processes are invoked for vacuum process periodically. (More precisely, it requests to create the autovacuum
workers to the postgres server.)
• When autovacuum is enabled, this process has the responsibility of the autovacuum daemon to carry vacuum operations on bloated tables.
This process relies on the stats collector process for perfect table analysis.
WAL writer This process writes and flushes periodically the WAL data on the WAL buffer to persistent storage.
statistics collector In this process, statistics information such as for pg_stat_activity and for pg_stat_database, etc. is collected.
Logging Collector(logger) • This process also called a logger. It will write a WAL buffer to WAL file.
9. Background Processes
process description
Memory for Locks /
Lock Space
• In this process, dirty pages on the shared buffer pool are written to a persistent storage (e.g., HDD, SSD) on a regular basis gradually.
• In other word this process writes and flushes periodically the WAL data on the WAL buffer to persistent storage.
checkpointer • The actual work of this process is when a checkpoint occurs it will write dirty buffer into a file.
• Checkpointer will write all dirty pages from memory to disk and clean shared buffers area. If PostgreSQL database is crashed, we can
measure data loss between last checkpoint time and PostgreSQL stopped time. The checkpoint command forces an immediate checkpoint
when the command is executed manually. Only database superuser can call checkpoint.
The checkpoint will occur in the following scenarios:
• The pages are dirty.
• Starting and restarting the DB server (pg_ctl STOP | RESTART).
• Issue of the commit.
• Starting the database backup (pg_start_backup).
• Stopping the database backup (pg_stop_backup).
• Creation of the database.
autovacuum launcher • The autovacuum-worker processes are invoked for vacuum process periodically. (More precisely, it requests to create the autovacuum
workers to the postgres server.)
• When autovacuum is enabled, this process has the responsibility of the autovacuum daemon to carry vacuum operations on bloated tables.
This process relies on the stats collector process for perfect table analysis.
WAL writer This process writes and flushes periodically the WAL data on the WAL buffer to persistent storage.
statistics collector In this process, statistics information such as for pg_stat_activity and for pg_stat_database, etc. is collected.
10. Memory Architecture
• Memory architecture in
PostgreSQL can be classified into
two broad categories:
1. Local memory area – allocated
by each backend process for
its own use.
2. Shared memory area – used
by all processes of a
PostgreSQL server.
11. Local memory area
work_mem • Executor uses this area for sorting tuples by
ORDER BY and DISTINCT operations, and for
joining tables by merge-join and hash-join
operations.
• The default value of work memory in 9.3 and
the older version is 1 megabyte (1 MB) from
9.4 and later default value of work memory
is 4 megabytes ( default size 4 MB).
maintenance_
work_mem
• We need to specify the maximum amount of
memory for database maintenance
operations such as VACUUM, ANALYZE,
ALTER TABLE, CREATE INDEX, and ADD
FOREIGN KEY, etc.
• The default value of maintenance work
memory in 9.3 and the older version is 16
megabytes (16 MB) from 9.4 and later
default value of maintenance work memory
is 64 megabytes ( Default Size 64 MB).
• It is safe to set maintenance work memory is
large as compared to work memory. Larger
settings will improve the performance of
maintenance (VACUUM, ANALYZE, ALTER
TABLE, CREATE INDEX, and ADD FOREIGN
KEY, etc.) operations.
temp_buffers Executor uses this area for storing temporary
Shared memory area
shared
buffer pool
• PostgreSQL loads pages within tables and indexes from a
persistent storage to here and operates them directly.
• We need to set some amount of memory to a database
server for uses of shared buffers. The default value of
shared buffers in 9.2 and the older version is 32 megabytes
(32 MB) from 9.3 and the later default value of shared
buffers is 128 megabytes ( Default size 128 MB).
• If we have a dedicated server for PostgreSQL, reasonable
starting to set shared buffers value is 25% of total memory.
The purpose of shared buffers is to minimize server DISK
IO.
WAL buffer To ensure that no data has been lost by server failures,
PostgreSQL supports the WAL mechanism. WAL data (also
referred to as XLOG records) are transaction log in PostgreSQL;
and WAL buffer is a buffering area of the WAL data before
writing to a persistent storage.
WAL buffers temporarily store changes in the database, which
changes in the WAL buffers are written to the WAL file at a
predetermined time. At the time of backup and recovery, WAL
buffers and WAL files are very important to recover the data at
some peak of time.
The minimum value of shared buffers is 32 KB. If we set this
parameter as wal_buffers = -1 it will set based on
shared_buffers.
commit log • Commit Log(CLOG) keeps the states of all transactions
MEMORY ARCHITECTURE
12. Installing PostgreSQL on Linux/Unix
Follow the given steps to install PostgreSQL on your Linux machine.
Make sure you are logged in as root before you proceed for the
installation.
Pick the version number of PostgreSQL you want and, as exactly as
possible, the platform you want from EnterpriseDB
I downloaded postgresql-9.2.4-1-linux-x64.run for my 64 bit CentOS-6
machine. Now, let us execute it as follows −
[root@host]# chmod +x postgresql-9.2.4-1-linux-x64.run
[root@host]# ./postgresql-9.2.4-1-linux-x64.run
------------------------------------------------------------------------
Welcome to the PostgreSQL Setup Wizard.
------------------------------------------------------------------------
Please specify the directory where PostgreSQL will be installed.
Installation Directory [/opt/PostgreSQL/9.2]:
Once you launch the installer, it asks you a few basic questions like
location of the installation, password of the user who will use database,
port number, etc. So keep all of them at their default values except
password, which you can provide password as per your choice. It will
install PostgreSQL at your Linux machine and will display the following
message −
Please wait while Setup installs PostgreSQL on your computer.
Installing
0% ______________ 50% ______________ 100%
#########################################
-----------------------------------------------------------------------
Setup has finished installing PostgreSQL on your computer.
Follow the following post-installation steps to create your
database −
[root@host]# su - postgres
Password:
bash-4.1$ createdb testdb
bash-4.1$ psql testdb
psql (8.4.13, server 9.2.4)
test=#
You can start/restart postgres server in case it is not running using
the following command −
[root@host]# service postgresql restart
Stopping postgresql service: [ OK ]
Starting postgresql service: [ OK ]
If your installation was correct, you will have PotsgreSQL prompt
test=# as shown above.
13. Installing PostgreSQL on Windows
• Follow the given steps to install
PostgreSQL on your Windows machine.
Make sure you have turned Third Party
Antivirus off while installing.
• Pick the version number of PostgreSQL
you want and, as exactly as possible, the
platform you want from EnterpriseDB
• I downloaded postgresql-9.2.4-1-
windows.exe for my Windows PC
running in 32bit mode, so let us
run postgresql-9.2.4-1-windows.exe as
administrator to install PostgreSQL.
Select the location where you want to
install it. By default, it is installed within
Program Files folder.
14. PostgreSQL mostly tuned parameters
Listen_address
No doubt, you need to change it to let PostgreSQL know what IP address(es) to
listen on. If your postgres is not just used for localhost, add or change it
accordingly. Also, you need to setup access rules in pg_hba.conf.
listen_addresses = 'localhost,<dbserver>'
default value: "localhost"
Max_connections
max_connections = 2000
default: max_connections = 100 This parameter really depends on your application, I set it
to 2000 for most of connections are short lifetime SQL, and connection are reused
Buffer size
shared_buffers = 3GB
effective_cache_size = 16GB
Default:
shared_buffers = 32MB
effective_cache_size = 128MB
Work memory
work_mem = 32MB
maintenance_work_mem = 256MB
Default:
work_mem = 1MB and maintenance_work_mem = 16MB work_mem is for each
connection, while maintenance_work_mem is for maintenance tasks for example, vacuum,
create, index etc.. Set work_mem big is good for sorting types of query, but not for small
query. This has to be considered with max_connections
Check_segments
checkpoint_segments = 32
default:
checkpoint_segments=3 Maximum number of log file segments between
automatic WAL checkpoints (each segment is normally 16 megabytes)
Wal_level
wal_level = archiv
default:
wal_level = minimal wal_level determines how much information is written to the WAL. The
default value is minimal, which writes only the information needed to recover from a crash
or immediate shutdown. archive adds logging required for WAL archiving
ARCHIVE
Not mandatery for all cases. Here is my setting
archive_mode = on
archive_command = '/bin/cp -p %p /home/backups/archivelogs/%f
</dev/null'
AUTOVACUUM
autovacuum is a quite hot topic in Postgres, for most of time, global autovacuum doesn't
work well,
track_counts = on
autovacuum = on
autovacuum_max_workers
autovacuum_vacuum_threshold = 500
15. PostgreSQL table access statistics, index io statistics
#1 table access statistics
#select schemaname,relname,seq_scan,idx_scan,cast(idx_scan as numeric) /
(idx_scan + seq_scan)
as idx_scan_pct
from pg_stat_user_tables where (idx_scan +seq_scan) >0 order by idx_scan_pct;
Higher pct means more likely your postgreSQL is using index scan, which is good.
#2 table io statistics
#select relname,cast(heap_blks_hit as numeric) /(heap_blks_hit +heap_blks_read)
as hit_pct,heap_blks_hit,heap_blks_read from pg_statio_user_tables
where (heap_blks_hit + heap_blks_read) >0 order by hit_pct; Higher hit_pct means
more likely the data required is cached.
#3 index access statistics
this shows all of the disk i/o for every index on each table
#select relname,cast(idx_blks_hit as numeric) /(idx_blks_hit + idx_blks_read )
as hit_pct,idx_blks_hit,idx_blks_read from pg_statio_user_tables
where (idx_blks_hit +idx_blks_read) >0 order by hit_pct;
#4 index io statistics
#select indexrelname,cast(idx_blks_hit as numeric) /( idx_blks_hit + idx_blks_read)
as hit_pct,idx_blks_hit,idx_blks_read from pg_statio_user_indexes
where (idx_blks_hit +idx_blks_read)>0 order by hit_pct ;
#5 Less used indexes(from top to bottom)
#select
schemaname,relname,indexrelname,idx_scan,pg_size_pretty(pg_relation_size(i.in
dexrelid))
as index_size from pg_stat_user_indexes i join pg_index using (indexrelid)
where indisunique is false order by idx_scan,relname;
Note: The main thing that the counts in pg_stat_user_indexes are useful for is to
determining which indexes are actually being used by your application. Since
indexes add overhead to the system, but drop them with care.
To show current server configuration setting.
SHOW ALL;
SELECT name, setting, unit, context FROM pg_settings;
16. Write-Ahead Logging (WAL)- Parameter
Common Settings Checkpoints Archiving
#wal_level = minimal # minimal, archive, or hot_standby #
(change requires restart)
#fsync = on # turns forced synchronization on or off
#synchronous_commit = on # synchronization level; on, off, or local
#wal_sync_method = fsync # the default is the first option
#full_page_writes = on # recover from partial page writes
#wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers
# (change requires restart)
#wal_writer_delay = 200ms # 1-10000 milliseconds
#commit_delay = 0 # range 0-100000, in microseconds
#commit_siblings = 5 # range 1-1000
#checkpoint_segments = 3 # in logfile segments, min 1, 16MB
each#archive_mode = off # allows archiving to be done #
(change requires restart)
#archive_command = '' # command to use to archive a logfile
segment
#archive_timeout = 0 # force a logfile segment switch after this #
number of seconds; 0 disables
#checkpoint_timeout = 5min # range 30s-1h
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0
- 1.0
#checkpoint_warning = 30s # 0 disables
#archive_mode = off # allows archiving to be done # (change
requires restart)
#archive_command = '' # command to use to archive a logfile
segment
#archive_timeout = 0 # force a logfile segment switch after this #
number of seconds; 0 disables
Configuring the PostgreSQL Archive Log Directory
• Archive log files are stored in the Archive Log directory. Ensure to follow the below checkpoints before running the PostgreSQL File System backup.
• Specify the Archive log directory path in the postgresql.conf file prior to performing the PostgreSQL FS backup. Make sure that this path does not point to pg_log or log directories and pg_xlog or pg_wal directories.
archive_command = 'cp %p /opt/wal/%f' #UNIX
archive_command = 'copy "%p" "D:PostgreSQLwal%f"' #Windows
• The following configuration to turn on the archive_mode. This feature is not supported for PostgreSQL 8.2 and earlier versions.
archive_mode = on
• For PostgreSQL 9.x.x version, use the following configuration.
Set wal_level = archive instead of default wal_level = minimal
• From PostgreSQL 10.x.x version onwards, use the following configuration.
Set wal_level = replica
• Verify that the archive command provided in the postgresql.conf file is correct. You can test this by running the following commands and verifying that they successfully complete.
Select pg_start_backup(‘Testing’);
Select pg_stop_backup();
17. Write-Ahead Logging (WAL)- Parameter
•WAL Archive log In PostgreSQL database system, the actual database 'writes' to an addition file called write-ahead log (WAL) to disk.
•It contains a record of writes that made in the database system. In the case of Crash, database can be repaired/recovered from these
records.
•Normally, the write-ahead log logs at regular intervals (called Checkpoints) matched against the database and then deleted because it no
longer is required. You can also use the WAL as a backup because, there is a record of all writes made to the database.
WAL Archiving Concept :
In pg_xlog write ahead logs are stored. It is the log file, where all the logs are stored of committed and uncommitted transaction. It
contains max 6 logs, and last one overwrites. If archiver is on, it moves there.
•The write-ahead log is composed of each 16 MB large, which are called segments.
•The WALs reside under pg_xlog directory and it is the subdirectory of 'data directory'. The filenames will have numerical(0-9) and
character(a-z) named in ascending order by PostgreSQL Instance. To perform a backup on the basis of WAL, one needs a basic backup that
is, a complete backup of the data directory, and the WAL Segments between the base backup and the current date.
•PostgreSQL managing WAL files by removing or adding according to setting with wal_keep_segments, max_wal_size and min_wal_size.
20. Benefits of WAL
The first obvious benefit of using WAL is a significantly reduced number of disk writes, since only the log file needs to be
flushed to disk at the time of transaction commit; in multiuser environments, commits of many transactions may be
accomplished with a single fsync() of the log file. Furthermore, the log file is written sequentially, and so the cost of
syncing the log is much less than the cost of flushing the data pages.
The next benefit is consistency of the data pages. The truth is that, before WAL, PostgreSQL was never able to guarantee
consistency in the case of a crash. Before WAL, any crash during writing could result in:
1. index rows pointing to nonexistent table rows
2. index rows lost in split operations
3. totally corrupted table or index page content, because of partially written data pages
Problems with indexes (problems 1 and 2) could possibly have been fixed by additional fsync() calls, but it is not obvious
how to handle the last case without WAL; WAL saves the entire data page content in the log if that is required to ensure
page consistency for after-crash recovery.
WAL is significantly faster in most scenarios.
WAL provides more concurrency as readers do not block writers and a writer does not block readers. Reading and
writing can proceed concurrently.
Disk I/O operations tends to be more sequential using WAL.
WAL uses many fewer fsync() operations and is thus less vulnerable to problems on systems where the fsync() system
call is broken.
21. Postgres WAL Config
The postgres.conf file is set as follows for WAL archiving.
#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------
# - Settings -
#wal_level = minimal # minimal, archive, or hot_standby
wal_level = archive
# - Archiving -
archive_mode = on
#archive_mode = off # allows archiving to be done
# (change requires restart)
archive_command = 'cp %p /pgsql-backup/archive/postgres1/%f'
# command to use to archive a logfile segment
# archive_command = ''
# command to use to archive a logfile segment
# placeholders: %p = path of file to archive
# %f = file name only
# e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
• #archive_timeout = 0 # force a logfile segment switch after this
• # number of seconds; 0 disables
22. How do I read a Wal file in PostgreSQL?
• First get the source for the version of Postgres that you wish to view
WAL data for. ./configure and make this, but no need to install.
• Then copy the xlogdump folder to the contrib folder (a git clone in
that folder works fine)
• Run make for xlogdump - it should find the parent postgres structure
and build the binary.