Scaling PostgreSQL with Skytools
1. Scaling with SkyTools & More
Scaling-Out Postgres with Skype’s Open-Source Toolset
Gavin M. Roy
September 14th, 2011
2. About Me
• Using PostgreSQL since ~6.5
• CTO @myYearbook.com
• Scaled initial infrastructure
• No longer as involved in day-to-day database
operations and development
• Twitter: @Crad
19. Lock Contention?
Each backend for a connected client has to check for locks.
[Diagram: Master Process with Stats Collector, Autovacuum, and WAL Writer processes, plus one Connection Backend per Client Connection]
30. Connection Pooling
Clients Clients Clients
Hundreds Hundreds Hundreds
pgBouncer
Tens Tens Tens
Postgres Postgres Postgres
Server #1 Server #2 Server #3
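The fan-in pictured above (hundreds of client connections sharing tens of server connections) can be modeled with a toy pool. This is only a sketch of the idea, not how pgBouncer is implemented; the class name `TinyPool`, the function `simulate`, and the numbers 300 and 20 are all hypothetical.

```python
import queue
import threading


class TinyPool:
    """Toy model of a pooler: many clients share a few server connections."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for i in range(size):
            self._idle.put(f"server-conn-{i}")
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak_in_use = 0

    def run_transaction(self, work):
        conn = self._idle.get()  # blocks until a server connection is free
        with self._lock:
            self.in_use += 1
            self.peak_in_use = max(self.peak_in_use, self.in_use)
        try:
            return work(conn)
        finally:
            with self._lock:
                self.in_use -= 1
            self._idle.put(conn)  # hand the connection to the next client


def simulate(clients=300, pool_size=20):
    """Run one transaction per client; return (peak server conns, completed)."""
    pool = TinyPool(pool_size)
    results = []
    results_lock = threading.Lock()

    def client(n):
        outcome = pool.run_transaction(lambda conn: (n, conn))
        with results_lock:
            results.append(outcome)

    threads = [threading.Thread(target=client, args=(n,)) for n in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.peak_in_use, len(results)
```

However many clients are started, the peak number of server connections in use never exceeds the pool size, which is the property the diagram relies on.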
31. Add Local Pooling
Clients Clients Clients
Hundreds Hundreds Hundreds
Local pgBouncer Local pgBouncer Local pgBouncer
Tens Tens Tens
pgBouncer
Tens Tens Tens
Postgres Postgres Postgres
Server #1 Server #2 Server #3
32. Easy to run
Usage: pgbouncer [OPTION]... config.ini
-d, --daemon Run in background (as a daemon)
-R, --restart Do a online restart
-q, --quiet Run quietly
-v, --verbose Increase verbosity
-u, --user=<username> Assume identity of <username>
-V, --version Show version
-h, --help Show this help screen and exit
35. Specifying Connections
[databases]
; foodb over unix socket
foodb =
; redirect bardb to bazdb on localhost
bardb = host=localhost dbname=bazdb
; access to dest database will go with single user
forcedb = host=127.0.0.1 port=300 user=baz password=foo client_encoding=UNICODE datestyle=ISO connect_query='SELECT 1'
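To make the shape of these entries concrete, here is a rough sketch that reads a `[databases]` section with Python's standard `configparser` and splits each connection string into key/value pairs. It is not pgBouncer's own parser and may treat edge cases differently; the function name `parse_databases` is made up for illustration.

```python
import configparser
import shlex

SAMPLE = """\
[databases]
; foodb over unix socket
foodb =
; redirect bardb to bazdb on localhost
bardb = host=localhost dbname=bazdb
; access to dest database will go with single user
forcedb = host=127.0.0.1 port=300 user=baz password=foo client_encoding=UNICODE datestyle=ISO connect_query='SELECT 1'
"""


def parse_databases(ini_text):
    """Return {alias: {key: value}} for every entry under [databases]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    databases = {}
    for alias, connstr in parser["databases"].items():
        # shlex keeps the quoted connect_query value together as one token
        pairs = (token.split("=", 1) for token in shlex.split(connstr))
        databases[alias] = dict(pairs)
    return databases
```

An empty entry such as `foodb =` comes back as an empty dictionary, meaning no overrides are given for that alias.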
36. Base Daemon Config
[pgbouncer]
logfile = pgbouncer.log
pidfile = pgbouncer.pid
; ip address or * which means all ip-s
listen_addr = 127.0.0.1
listen_port = 6432
; unix socket is also used for -R.
;unix_socket_dir = /tmp
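With this config, clients point at pgBouncer's `listen_port` (6432 here) instead of the Postgres default of 5432. The sketch below derives a libpq-style connection string from the `[pgbouncer]` section; the helper name `client_dsn` and the `dbname`/`user` values in the test are hypothetical, and treating `*` as the local machine is an assumption.

```python
import configparser

SAMPLE = """\
[pgbouncer]
logfile = pgbouncer.log
pidfile = pgbouncer.pid
; ip address or * which means all ip-s
listen_addr = 127.0.0.1
listen_port = 6432
"""


def client_dsn(ini_text, dbname, user):
    """Build a connection string that targets pgBouncer rather than Postgres."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser["pgbouncer"]
    host = section.get("listen_addr", fallback="127.0.0.1")
    if host == "*":
        host = "127.0.0.1"  # assumption: the client runs on the same machine
    port = section.getint("listen_port", fallback=6432)
    return f"host={host} port={port} dbname={dbname} user={user}"
```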
45. ticker.ini
[pgqadm]
job_name = pgopen_ticker
db = dbname=pgopen
# how often to run maintenance [seconds]
maint_delay = 600
# how often to check for activity [seconds]
loop_delay = 0.1
logfile = ~/Source/pgopen_skytools/%(job_name)s.log
pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
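The `%(job_name)s` placeholders follow Python's config interpolation syntax. As an illustration, assuming the stock `configparser` module behaves close enough to the loader Skytools uses, the sketch below shows how the log and pid paths expand. The function name `load_ticker_settings` is hypothetical.

```python
import configparser

TICKER_INI = """\
[pgqadm]
job_name = pgopen_ticker
db = dbname=pgopen
# how often to run maintenance [seconds]
maint_delay = 600
# how often to check for activity [seconds]
loop_delay = 0.1
logfile = ~/Source/pgopen_skytools/%(job_name)s.log
pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
"""


def load_ticker_settings(ini_text):
    """Read the [pgqadm] section, expanding %(job_name)s in the paths."""
    parser = configparser.ConfigParser()  # basic interpolation is the default
    parser.read_string(ini_text)
    section = parser["pgqadm"]
    return {
        "job_name": section["job_name"],
        "maint_delay": section.getfloat("maint_delay"),
        "loop_delay": section.getfloat("loop_delay"),
        "logfile": section["logfile"],
        "pidfile": section["pidfile"],
    }
```

Because the file names are derived from `job_name`, giving each job its own name is enough to keep logs and pid files from colliding.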
46. Getting PGQ Running
Setup our ticker:
pgqadm.py ticker.ini install
Run the ticker daemon:
pgqadm.py ticker.ini ticker -d
48. replication.ini
[londiste]
job_name = pgopen_to_destination
provider_db = dbname=pgopen
subscriber_db = dbname=destination
# it will be used as sql ident so no dots/spaces
pgq_queue_name = pgopen
logfile = ~/Source/pgopen_skytools/%(job_name)s.log
pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
58. Simple Remote Connection
CREATE FUNCTION get_user_email(username text)
RETURNS SETOF text AS $$
CONNECT 'dbname=remotedb';
SELECT email FROM users WHERE username = $1;
$$ LANGUAGE plproxy;
59. Sharded Request
CREATE FUNCTION get_user_email(username text)
RETURNS SETOF text AS $$
CLUSTER 'usercluster';
RUN ON hashtext(username);
$$ LANGUAGE plproxy;
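As a rough model of what `RUN ON hashtext(username)` does: to my understanding, PL/Proxy expects a power-of-two number of partitions and picks one from the low bits of the hash value. The sketch below mimics that selection. `stand_in_hash` uses CRC32 purely as a placeholder and does not reproduce Postgres's `hashtext()`; the function names are hypothetical, and the partition strings are the ones used in the cluster setup slides.

```python
import zlib

PARTITIONS = [
    "dbname=part00 host=127.0.0.1",
    "dbname=part01 host=127.0.0.1",
]


def stand_in_hash(key):
    """Placeholder hash; NOT the same values as Postgres hashtext()."""
    return zlib.crc32(key.encode("utf-8"))


def pick_partition(hash_value, partitions):
    """Select a partition from the low bits of the hash value."""
    count = len(partitions)
    if count == 0 or count & (count - 1):
        raise ValueError("partition count must be a power of two")
    # masking also maps negative hash values to a valid index
    return partitions[hash_value & (count - 1)]
```

For example, `pick_partition(stand_in_hash("alice"), PARTITIONS)` always returns the same partition for the same username, so every lookup for a given user lands on the same shard.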
60. Sharding Setup
• Need 3 Functions:
• plproxy.get_cluster_partitions(cluster_name text)
• plproxy.get_cluster_version(cluster_name text)
• plproxy.get_cluster_config(in cluster_name text, out key text, out val text)
61. get_cluster_partitions
CREATE OR REPLACE FUNCTION
plproxy.get_cluster_partitions(cluster_name text)
RETURNS SETOF text AS $$
BEGIN
IF cluster_name = 'usercluster' THEN
RETURN NEXT 'dbname=part00 host=127.0.0.1';
RETURN NEXT 'dbname=part01 host=127.0.0.1';
RETURN;
END IF;
RAISE EXCEPTION 'Unknown cluster';
END;
$$ LANGUAGE plpgsql;
62. get_cluster_version
CREATE OR REPLACE FUNCTION
plproxy.get_cluster_version(cluster_name text)
RETURNS int4 AS $$
BEGIN
IF cluster_name = 'usercluster' THEN
RETURN 1;
END IF;
RAISE EXCEPTION 'Unknown cluster';
END;
$$ LANGUAGE plpgsql;
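The version number exists so the partition list can be cached: as I understand it, PL/Proxy checks the version on each request and re-reads the partitions only when it has changed. A minimal sketch of that caching pattern follows; the class `ClusterCache` and its callbacks are hypothetical stand-ins for the two SQL functions, not PL/Proxy internals.

```python
class ClusterCache:
    """Cache partition lists, refreshing only when the version changes."""

    def __init__(self, get_version, get_partitions):
        self._get_version = get_version        # stands in for get_cluster_version
        self._get_partitions = get_partitions  # stands in for get_cluster_partitions
        self._cache = {}                       # cluster_name -> (version, partitions)

    def partitions(self, cluster_name):
        version = self._get_version(cluster_name)
        cached = self._cache.get(cluster_name)
        if cached is None or cached[0] != version:
            # first use, or the version was bumped: reload the partition list
            cached = (version, list(self._get_partitions(cluster_name)))
            self._cache[cluster_name] = cached
        return cached[1]
```

In practice this means that after changing what `get_cluster_partitions` returns, the value returned by `get_cluster_version` has to be bumped for the change to take effect.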
63. get_cluster_config
CREATE OR REPLACE FUNCTION plproxy.get_cluster_config(
in cluster_name text,
out key text,
out val text)
RETURNS SETOF record AS $$
BEGIN
-- let's use the same config for all clusters
key := 'connection_lifetime';
val := 30*60; -- 30m
RETURN NEXT;
RETURN;
END;
$$ LANGUAGE plpgsql;
67. PL/Proxy + SQL/MED Behavior
• PL/Proxy will prefer SQL/MED cluster
definitions over the plproxy.get_* functions
• PL/Proxy will fall back to the plproxy.get_*
functions if there are no SQL/MED clusters
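The lookup order described above can be sketched as a simple preference rule. This is one reading of the slide (a per-cluster lookup), not PL/Proxy's actual code; `resolve_cluster`, `sqlmed_clusters`, and `legacy_get_partitions` are hypothetical names.

```python
def resolve_cluster(cluster_name, sqlmed_clusters, legacy_get_partitions):
    """Prefer a SQL/MED definition; otherwise use the plproxy.get_* path.

    sqlmed_clusters: dict of cluster name -> partition list (SQL/MED servers)
    legacy_get_partitions: callable standing in for plproxy.get_cluster_partitions
    """
    if cluster_name in sqlmed_clusters:
        return "sql/med", sqlmed_clusters[cluster_name]
    return "plproxy.get_*", legacy_get_partitions(cluster_name)
```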
68. SQL/MED User Mapping
CREATE USER MAPPING FOR bob
SERVER a_cluster
OPTIONS (user 'bob', password 'secret');
CREATE USER MAPPING FOR public
SERVER a_cluster
OPTIONS (user 'plproxy', password 'foo');
69. plproxyrc
• PL/pgSQL-based API for table-based
management of PL/Proxy
• Used to manage complicated PL/Proxy
infrastructure @myYearbook
• BSD Licensed
https://github.com/myYearbook/plproxyrc
72. Complex PL/Proxy and pgBouncer Environment
[Diagram: Clients with Local pgBouncers, a Load Balancer, PL/Proxy Servers each with pgBouncer, a second Load Balancer, and Postgres Servers each with pgBouncer]