Weitere ähnliche Inhalte
Ähnlich wie Introduction to PostgreSQL for System Administrators (20)
Kürzlich hochgeladen (20)
Introduction to PostgreSQL for System Administrators
- 2. About MySelf
Joined VMware in 2010
Database performance on vSphere
Previously at Sun Microsystems from 2000-2010
Database Performance on Solaris/Sun Systems
Work with PostgreSQL Performance Community
Scaling, Bottlenecks using various workloads
My Blog: http://jkshah.blogspot.com
© 2011 VMware Inc 2
- 3. Content
Why OpenSource Databases?
Why PostgreSQL?
Quick Start to PostgreSQL
PostgreSQL Internals
PostgreSQL Monitoring
© 2011 VMware Inc 3
- 4. Why Open Source Databases?
License costs
Need for “Just Enough” databases
Source code access if vendor goes under or above
Many available but currently two popular ones
MySQL – Great for simple, web-based workloads
PostgreSQL – Great for complex, enterprise
transactional oriented workloads
© 2011 VMware Inc 4
- 5. Why PostgreSQL?
PostgreSQL is community driven
Consists of Core team, Committers, Contributors,
Developers
No one company owns it
Great Mailing List Support
Yearly releases
Great strides in overall performance, replication in last
few years
License is more flexible than GPL (OEM, Embedded
customers more welcome)
© 2011 VMware Inc 5
- 6. Quick Start Guide for PostgreSQL
Install PostgreSQL binaries or self build/install them
Set $PGDATA variable to a valid directory path
$ initdb
$ pg_ctl start -l logfile
$ createdb dbname
$ psql dbname
© 2011 VMware Inc 6
- 7. Quick Start Guide for Remote Access
If you want to make the database access from remote
servers
Add network/remote host entry in $PGDATA/pg_hba.conf
Add listen='*' or specific IP address of NIC in
$PGDATA/postgresql.conf
$ pg_ctl reload
$ psql -h hostname -u username -d dbname
© 2011 VMware Inc 7
- 8. Quick Start Guide for Performance Tuning
Modify following parameters in $PGDATA/postgresql.conf
checkpoint_segments=16
shared_buffers=512MB
wal_buffers=1MB
If disks are slow to write and willing to take few
transaction loss without corrupting databases
synchronous_commit=off
Separate pg_xlog on separate filesystem
pg_ctl stop
mv $PGDATA/pg_xlog /path/to/pg_xlog
ln -s /path/to/pg_xlog $PGDATA/pg_xlog
pg_ctl start -l logfile
© 2011 VMware Inc 8
- 9. Quick Start Guide for Client Connections
psql - SQL Interpreter client
libpq – API access to write C applications to access the
database (used by PHP, Python, etc)
JDBC – Standard JDBC for Java Programs to connect
with PostgreSQL databases
Npgsql – Database access for .NET/Mono applications
UnixODBC – Standard ODBC for applications to connect
with PostgreSQL
© 2011 VMware Inc 9
- 10. Quick Start Guide - PostgreSQL Administration
SQL Support: SELECT, INSERT, UPDATE, INSERT
Utilities:
DDL Support: CREATE, ALTER, etc
Backup: pg_dump, pg_restore
Statistics: analyze
Cleanup: vacuumdb (or Auto-vacuum)
Misc: oid2name
© 2011 VMware Inc 10
- 11. PostgreSQL Internals – File System Layout
Files in $PGDATA
postgresql.conf – PostgreSQL Configuration File
pg_hba.conf - PostgreSQL Host Based Access Configuration
pg_ident.conf - Mapping System names to PostgreSQL user
names
PG_VERSION – Contains Version Information
postmaster.pid – Contains PID of PostMaster, $PGDATA and
shared memory segment ids
postmaster.opts – Contains the record for command line options
for the last stared database instance
© 2011 VMware Inc 11
- 12. PostgreSQL Internals – File System Layout
Directories in $PGDATA
base – Contains per-database subdirectories
pg_stat_tmp - Temporary directory for Statistics subsystem
global - Database Cluster-wide tables (pg_control)
pg_log – Server System logs (text or csv readable)
pg_subtrans – Contains sub-transaction status data
pg_xlog – WAL or Write Ahead Log directory
pg_clog - Commit Status data
pg_multixact - Contains multi-transaction status data
pg_tblspc – Contains links to tablespace locations
pg_notify - LISTEN/NOTIFY status data
pg_twophase – Contains state files for prepared transactions
© 2011 VMware Inc 12
- 13. PostgreSQL Internals – Processes Layout
> pgrep -lf postgres
6688 /usr/local/postgres/bin/postgres
(POSTMASTER)
6689 postgres: logger process
6691 postgres: writer process
6692 postgres: wal writer process
6693 postgres: autovacuum launcher process
6694 postgres: stats collector process
13583 postgres: username dbname [local] idle in
transaction
© 2011 VMware Inc 13
- 14. PostgreSQL Internals – Memory Layout
PostgreSQL Shared Memory
PostgreSQL Backend Process
PostgreSQL Backend Process
PostgreSQL Backend Process
© 2011 VMware Inc 14
- 15. PostgreSQL Internals – Shared Memory Layout
PROC
ProcArray XLOG Buffers
CLOG Buffers
Auto Vacuum Subtrans Buffers
Btree Vacuum Two-phase Structs
Free Space Map Multi-xact buffers
Background Writer Shared Invalidation
Buffer Descriptors Shared Buffers
LWLocks
Lock Hashes
PROCLOCK
Statistics
Synchronized Scans
© 2011 VMware Inc 15
- 16. PostgreSQL Internals – WAL Layout
WAL – Write Ahead Log
pg_xlog contains 16MB segments
All transactions are first logged here before they are
committed
checkpoint_segments and checkpoint_timeout drives the
number of WAL/pg_xlog segments
Can grow pretty big based on transaction load
© 2011 VMware Inc 16
- 17. PostgreSQL Internals – Checkpoint
Checkpoint does the following:
Pushes dirty bufferpool pages to storage
Syncs all filesystem cache
Recycle WAL files
Check for server messages indicating too-frequent
checkpoints
Checkpoint causes performance spikes (drops) while
checkpoint is in progress
© 2011 VMware Inc 17
- 18. PostgreSQL Internals – MVCC
Default isolation level in PostgreSQL is READ-
COMMITTED
Each Query only sees transactions completed before it
started
On Query Start, PostgreSQL records:
The transaction counter
All transactions that are in progress
In a multi-statement transaction, a transaction's own
previous queries are also visible
© 2011 VMware Inc 18
- 19. PostgreSQL Internals – MVCC
Visible Tuples must have xmin or creation xact id that:
Is a committed transaction
Is less than the transaction counter stored at query
start
Was not in-process at query start
Visible tuples must also have an expiration transaction id
(xmax) that:
Is blank or aborted or
Is greater than the transaction counter at query start
or
Was in-process at query start
© 2011 VMware Inc 19
- 20. PostgreSQL Internals – VACUUM
VACUUM is basically MVCC Cleanup
Get rid of deleted rows (or earlier versions of rows)
Records free space in .fsm files
Often used with Analyze to collect optimizer statistics
Auto vacuum helps to do the vacuum task as and when
needed
© 2011 VMware Inc 20
- 21. PostgreSQL Internals – Lock Types
Access Share Lock – SELECT
Row Share Lock - SELECT FOR UPDATE
Row Exclusive Lock – INSERT, UPDATE, DELETE
Share Lock – CREATE INDEX
Share Row Exclusive Lock – EXCLUSIVE MODE but
allows ROW SHARE LOCK
Exclusive Lock – Blocks even Row Share Lock
Access Exclusive Lock – ALTER TABLE, DROP TABLE,
VACUUM and LOCK TABLE
© 2011 VMware Inc 21
- 22. PostgreSQL Internals – Query Execution
Parse Statement
Separate Utilities from actual queries
Rewrite Queries in a standard format
Generate Paths
Select Optimal Path
Generate Plan
Execute Plan
© 2011 VMware Inc 22
- 23. Monitoring PostgreSQL – Monitoring IO
Iostat -xcmt 5 (Helps if pg_xlog is on separate filesystem)
12/02/2010 01:09:56 AM
avg-cpu: %user %nice %system %iowait %steal %idle
8.21 0.00 2.40 0.77 0.00 88.62
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdd ($PGDATA) 0.00 46.67 0.00 3.67 0.00 0.20 110.55 0.01 1.45 0.73 0.27
sde (pg_xlog) 0.00 509.00 0.00 719.33 0.00 3.39 9.66 0.11 0.15 0.15 10.53
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
© 2011 VMware Inc 23
- 24. Monitoring PostgreSQL – Monitoring CPU/Mem
top -c
top - 01:06:27 up 20 days, 1:48, 1 user, load average: 1.41, 5.30, 2.95
Tasks: 157 total, 1 running, 156 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.0%us, 0.5%sy, 0.0%ni, 93.2%id, 0.6%wa, 0.0%hi, 0.7%si, 0.0%st
Mem: 16083M total, 5312M used, 10770M free, 213M buffers
Swap: 8189M total, 0M used, 8189M free, 4721M cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29725 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.66 postgres: pguser
sbtest 192.168.0.65(20121) idle in transaction
29726 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.56 postgres: pguser
sbtest 192.168.0.65(20122) idle in transaction
29727 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.58 postgres: pguser
sbtest 192.168.0.65(20123) idle in transaction
29728 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.62 postgres: pguser
sbtest 192.168.0.65(20124) idle in transaction
© 2011 VMware Inc 24
- 25. Monitoring PostgreSQL – Monitoring SQL
select * from pg_stat_activity;
datid | datname | procpid | usesysid | usename | application_name | client_addr | client_port |
backend_start | xact_start | query_start | waiting | current_query
-------+---------+---------+----------+---------+------------------+--------------+-------------+-------------------------------
16384 | sbtest | 29601 | 10 | pguser | psql | | -1 | 2010-12-02 00:59:26.883118+00
| 2010-12-02 01:00:43.088853+00 | 2010-12-02 01:00:43.088853+00 | f | select * from pg_stat_activity;
16384 | sbtest | 29624 | 10 | pguser | | 192.168.0.65 | 50659 | 2010-12-02
01:00:27.406842+00 | 2010-12-02 01:00:43.087634+00 | 2010-12-02 01:00:43.089749+00 | f | SELECT c
from sbtest where id=$1
16384 | sbtest | 29626 | 10 | pguser | | 192.168.0.65 | 50661 | 2010-12-02
01:00:27.439644+00 | 2010-12-02 01:00:43.060565+00 | 2010-12-02 01:00:43.087203+00 | f | <IDLE> in
transaction
16384 | sbtest | 29627 | 10 | pguser | | 192.168.0.65 | 50662 | 2010-12-02
01:00:27.458667+00 | 2010-12-02 01:00:43.082811+00 | 2010-12-02 01:00:43.089694+00 | f | SELECT
SUM(K) from sbtest where id between $1 and $2
© 2011 VMware Inc 25
- 26. Monitoring PostgreSQL – EXPLAIN ANALYZE
Explain Analyze select count(*) from sbtest where k > 1;
QUERY PLAN
----------------------------------------------------------
Aggregate (cost=380.48..380.49 rows=1 width=0) (actual
time=6.929..6.929 rows=1 loops=1)
-> Index Scan using k on sbtest (cost=0.00..374.58
rows=2356 width=0) (actual time=0.023..6.364 rows=3768
loops=1)
Index Cond: (k > 1)
Total runtime: 6.977 ms
(4 rows)
© 2011 VMware Inc 26
- 27. Monitoring PostgreSQL – Monitoring DB Statistics
select * from pg_stat_database;
[ RECORD 4 ]-+-----------
datid | 16384
datname | sbtest
numbackends | 1
xact_commit | 136978505
xact_rollback | 95
blks_read | 99161
blks_hit | 1481038913
tup_returned | 1138850495
tup_fetched | 986059963
tup_inserted | 3150677
tup_updated | 6145651
tup_deleted | 2050612
© 2011 VMware Inc 27
- 28. Monitoring PostgreSQL – Monitoring Table Statistics
select * from pg_stat_user_tables;
-[ RECORD 1 ]----+------------------------------
relid | 16405
schemaname | public
relname | sbtest
seq_scan | 2
seq_tup_read | 0
idx_scan | 2001731
idx_tup_fetch | 46158444
n_tup_ins | 211207
n_tup_upd | 333527
n_tup_del | 111207
n_tup_hot_upd | 108633
n_live_tup | 100000
n_dead_tup | 0
last_vacuum |
last_autovacuum | 2010-12-02 01:38:44.821662+00
last_analyze |
last_autoanalyze | 2010-12-02 01:38:47.598785+00
© 2011 VMware Inc 28
- 29. Monitoring PostgreSQL – Monitoring IO Statistics
select * from pg_statio_user_tables;
-[ RECORD 1 ]---+---------
relid | 16405
schemaname | public
relname | sbtest
heap_blks_read | 4588
heap_blks_hit | 29358125
idx_blks_read | 1919
idx_blks_hit | 6551658
toast_blks_read |
toast_blks_hit |
tidx_blks_read |
tidx_blks_hit |
© 2011 VMware Inc 29
- 30. Monitoring PostgreSQL – Monitoring IO Statistics
select * from pg_statio_user_indexes;
-[ RECORD 1 ]-+------------
relid | 16405
indexrelid | 16412
schemaname | public
relname | sbtest
indexrelname | sbtest_pkey
idx_blks_read | 391
idx_blks_hit | 5210450
-[ RECORD 2 ]-+------------
relid | 16405
indexrelid | 16414
schemaname | public
relname | sbtest
indexrelname | k
idx_blks_read | 1528
idx_blks_hit | 1341208
© 2011 VMware Inc 30
- 31. Questions / More Information
Email: jshah@vmware.com
Learn more about PostgreSQL
http://www.postgresql.org
Blog: http://jkshah.blogspot.com
© 2011 VMware Inc 31