9. SQL Features
● Windowing Functions
● Common Table Expressions
● array_agg
● Per-database Collations
● New data types
– Unsigned Integers
– CIText
● Improved \d commands
● Add columns to existing VIEWs
10. Windowing Functions
● Aggregate over part of the data
– SQL 2008 standard
– Great for BI, OLAP
● Functions:
– row_number()
– rank()
– lead()
– lag()
● More from David Fetter later!
11. Windowing Functions
SELECT
y,
m,
SUM(SUM(people)) OVER (PARTITION BY y ORDER BY m),
AVG(people)
FROM(
SELECT
EXTRACT(YEAR FROM accident_date) AS y,
EXTRACT(MONTH FROM accident_date) AS m,
*
FROM
accident
)s
GROUP BY y, m;

SELECT
depname,
empno,
salary,
rank() OVER
(PARTITION BY depname
ORDER BY salary)
FROM
empsalary;
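The lead() and lag() functions from the previous slide follow the same OVER pattern. A minimal sketch, reusing the empsalary table from the slide above (its columns are assumed from that query; the prev_salary alias is made up for illustration):

```sql
-- For each employee, show the salary from the previous row in salary order
-- within the same department; NULL for the first row of each partition.
SELECT
    depname,
    empno,
    salary,
    lag(salary) OVER (PARTITION BY depname ORDER BY salary) AS prev_salary
FROM
    empsalary;
```

Swapping lag() for lead() would look at the following row instead.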
12. Common Table Expressions
● Ability to create "named subqueries" for your
query.
● Best use: WITH RECURSIVE
– real recursive queries
– "walk" trees with one query
● more from David Fetter later
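A non-recursive CTE is just a named subquery that the main query can reference more than once. A minimal sketch, assuming a hypothetical orders table with region and amount columns (not from the talk):

```sql
-- Name the per-region totals once, then reference them twice.
WITH regional_sales AS (
    SELECT region, SUM(amount) AS total_sales
    FROM orders
    GROUP BY region
)
SELECT region, total_sales
FROM regional_sales
WHERE total_sales > (SELECT AVG(total_sales) FROM regional_sales);
```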
13. Common Table Expressions
WITH RECURSIVE subdepartment AS
(
-- non-recursive term
SELECT * FROM department WHERE id = 'A'
UNION ALL
-- recursive term referring to "subdepartment"
SELECT d.* FROM department AS d, subdepartment AS sd
WHERE d.id = sd.parent_department
)
SELECT * FROM subdepartment;
14. array_agg
● History:
– added Arrays in 7.4
● array_accum() aggregate example code
– intarray contrib module in 8.0
● only ints, but very fast
● array_agg() in 8.4: all arrays, fast C code
from Robert Haas, new contributor!
SELECT status, array_agg(username) FROM
logins GROUP BY status;
15. Per-Database Collations
● Collations (ordering character sets) used to be
per installation
● Now they are per database
● Someday they will be per column
● Google Summer of Code Project!
CREATE DATABASE mydb
LC_COLLATE 'sv_SE.UTF-8'
LC_CTYPE 'sv_SE.UTF-8'
TEMPLATE template0;
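To check which collation each database ended up with, the per-database settings can be read back from the system catalog. A small sketch, assuming an 8.4 server, where pg_database carries the datcollate and datctype columns:

```sql
-- List every database with its collation and character classification.
SELECT datname, datcollate, datctype
FROM pg_database
ORDER BY datname;
```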
16. New Data Types
● Make migrating from other DBMSes easier
● CIText (in /contrib)
– Case Insensitive Text
– Full CI indexing, comparisons
● Unsigned Integers (in pgFoundry)
– migrate from MySQL, others
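A minimal sketch of CIText in use. It assumes the citext contrib module has already been installed into the database; the users table is hypothetical:

```sql
-- Assumes the citext contrib module is already installed in this database.
CREATE TABLE users (
    nick citext PRIMARY KEY,
    pass text NOT NULL
);

INSERT INTO users VALUES ('Larry', 'secret');

-- Matches despite the different case. The primary key index is
-- case-insensitive too, so inserting 'LARRY' would be rejected as a duplicate.
SELECT * FROM users WHERE nick = 'larry';
```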
17. Better \d in psql
● \d is now multi-version compatible
– \dt etc. won't error if you connect an 8.4 client to an
8.2 database
● \df for user functions only
– \dfS for system functions
● \ef to edit a function
18. Add columns to VIEWs
● In the bad old days:
– need to add another column to your VIEW?
– have to drop it & recreate it
– have to drop & recreate all dependencies
– enter the World Of Pain
● In 8.4:
– CREATE OR REPLACE VIEW lets you add columns (at the end of the list)
– Can't rename or modify though
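In the released 8.4, a column is added by re-issuing the view definition with CREATE OR REPLACE VIEW. A minimal sketch; the employees table and employee_names view are hypothetical:

```sql
-- Hypothetical table and view, for illustration only.
CREATE TABLE employees (id int, name text, salary numeric);

CREATE VIEW employee_names AS
    SELECT id, name FROM employees;

-- 8.4: new columns can be appended to the end of the column list.
-- Existing columns must keep their names, types, and order.
CREATE OR REPLACE VIEW employee_names AS
    SELECT id, name, salary FROM employees;
```

Objects that depend on the view stay in place, which is what avoids the drop-and-recreate cascade.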
22. Improved Hash Indexes
● Our old hash indexes were slow and useless
● Improved hash indexes are fast!
– use them for ID columns
● or other unique keys
– not completely recovery-safe yet though
● don't switch over production DBs until 8.5
● Google Summer of Code project!
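A hash index is requested with USING hash. A minimal sketch on a hypothetical sessions table; note that hash indexes serve equality lookups only and cannot enforce uniqueness themselves:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE sessions (session_id text, payload text);

-- Build a hash index on the lookup key.
CREATE INDEX sessions_session_id_hash ON sessions USING hash (session_id);

-- Equality comparisons are the only operator a hash index can serve.
SELECT payload FROM sessions WHERE session_id = 'abc123';
```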
23. pg_stat_user_functions
● For each of your functions, see
– # of times called
– amount of time spent
– amount of time spent excluding other functions
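The three figures above correspond to the calls, total_time, and self_time columns of the view. A minimal sketch, assuming a superuser session, since function tracking is off by default:

```sql
-- 'pl' tracks procedural-language functions only;
-- 'all' also tracks SQL and C functions. Changing this needs superuser.
SET track_functions = 'pl';

-- After running some workload, list the most expensive functions first.
SELECT funcname, calls, total_time, self_time
FROM pg_stat_user_functions
ORDER BY total_time DESC;
```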
26. pg_stat_statements
postgres=# SELECT * FROM pg_stat_statements ORDER BY total_time DESC
LIMIT 3;
-[ RECORD 1 ]------------------------------------------------------------
userid | 10
dbid | 63781
query | UPDATE branches SET bbalance = bbalance + $1 WHERE bid = $2;
calls | 3000
total_time | 20.716706
rows | 3000
-[ RECORD 2 ]------------------------------------------------------------
userid | 10
dbid | 63781
query | UPDATE tellers SET tbalance = tbalance + $1 WHERE tid = $2;
calls | 3000
total_time | 17.1107649999999
rows | 3000
-[ RECORD 3 ]------------------------------------------------------------
userid | 10
dbid | 63781
query | UPDATE accounts SET abalance = abalance + $1 WHERE aid = $2;
calls | 3000
total_time | 0.645601
rows | 3000
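Because the view exposes both calls and total_time, the average cost per execution can be derived directly. A small sketch, assuming the pg_stat_statements contrib module is loaded; the avg_time alias is made up for illustration:

```sql
-- Assumes pg_stat_statements is listed in shared_preload_libraries.
-- Rank statements by average time per call rather than total time.
SELECT query,
       calls,
       total_time / calls AS avg_time
FROM pg_stat_statements
ORDER BY avg_time DESC
LIMIT 3;
```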
27. More DTrace Probes
* Probes to measure query time
query-parse-start (int, char *)
query-parse-done (int, char *)
query-plan-start ()
query-plan-done ()
query-execute-start ()
query-execute-done ()
query-statement-start (int, char *)
query-statement-done (int, char *)
* Probes to measure dirty buffer writes by the backend because
bgwriter is not effective
dirty-buffer-write-start (int, int, int, int)
dirty-buffer-write-done (int, int, int, int)
* Probes to measure physical writes from the shared buffer
buffer-write-start (int, int, int, int)
buffer-write-done (int, int, int, int, int)
* Probes to measure reads of a relation from a particular buffer
block
buffer-read-start (int, int, int, int, int)
buffer-read-done (int, int, int, int, int, int)
* Probes to measure the effectiveness of buffer caching
buffer-hit ()
buffer-miss ()
* Probes to measure I/O time because wal_buffers is too small
wal-buffer-write-start ()
wal-buffer-write-done ()
* Probes to measure checkpoint stats such as running time,
buffers written, xlog files added, removed, recycled, etc
checkpoint-start (int)
checkpoint-done (int, int, int, int, int)
* Probes to measure Idle in Transaction and client/network
time
idle-transaction-start (int, int)
idle-transaction-done ()
* Probes to measure sort time
sort-start (int, int, int, int, int)
sort-done (int, long)
* Probes to determine whether or not the deadlock detector
has found a deadlock
deadlock-found ()
deadlock-notfound (int)
* Probes to measure reads/writes by block numbers and
relations
smgr-read-start (int, int, int, int)
smgr-read-end (int, int, int, int, int, int)
smgr-write-start (int, int, int, int)
smgr-write-end (int, int, int, int, int, int)
28. auto_explain
● misnamed; actually allows you to manually set specific
queries/sessions/functions to output explain plans to the log
postgres=# LOAD 'auto_explain';
postgres=# SET auto_explain.log_min_duration = 0;
postgres=# SELECT count(*)
FROM pg_class, pg_index
WHERE oid = indrelid AND indisunique;
This might produce log output such as:
LOG: duration: 0.986 ms plan:
Aggregate (cost=14.90..14.91 rows=1 width=0)
-> Hash Join (cost=3.91..14.70 rows=81 width=0)
Hash Cond: (pg_class.oid = pg_index.indrelid)
-> Seq Scan on pg_class (cost=0.00..8.27 rows=227 width
-> Hash (cost=2.90..2.90 rows=81 width=4)
-> Seq Scan on pg_index (cost=0.00..2.90 rows=81
29. More Performance Improvements
● Free Space Map is dynamically sized (no more
max_fsm_pages!)
● Visibility Map
– VACUUM only changed pages
– Index-only Scans in 8.5
● Less writing to pgstat file
– plus you can move it
30. Stored Procedures
● Default Parameters
● Variadic Parameters
● New PL/pgSQL Statements
● PL/pythonU OUT Parameters
31. DEFAULT parameters
CREATE OR REPLACE FUNCTION
adder(a int default 40,
b int default 2)
RETURNS int LANGUAGE 'sql'
AS 'select $1 + $2';
SELECT adder();
SELECT adder(1);
SELECT adder(1,2);
32. VARIADIC parameters
CREATE OR REPLACE FUNCTION
adder(VARIADIC v int[])
RETURNS int AS $$
DECLARE s int; i int;
BEGIN
s:=0;
FOR i IN SELECT generate_subscripts(v,1) LOOP
s := s + v[i];
END LOOP;
RETURN s;
END;
$$ LANGUAGE 'plpgsql';
SELECT adder(1);
SELECT adder(1,2,3);
SELECT adder(40,2);
33. New PL/PgSQL Statements
● RETURNS TABLE
– SQL-compliant alias for "SETOF"
● CASE statement
– real switching logic
CASE
WHEN x BETWEEN 0 AND 10 THEN
msg := 'value is between zero and ten';
WHEN x BETWEEN 11 AND 20 THEN
msg := 'value is between eleven and twenty';
END CASE;
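RETURNS TABLE declares the output columns inline instead of pointing at a separate row type. A minimal sketch; the squares function and its column names are made up for illustration:

```sql
-- Hypothetical function: the result columns are declared in RETURNS TABLE.
CREATE OR REPLACE FUNCTION squares(n int)
RETURNS TABLE (i int, sq int) AS $$
    SELECT g, g * g FROM generate_series(1, $1) AS g;
$$ LANGUAGE sql;

-- Used like any set-returning function; yields rows (1,1), (2,4), (3,9).
SELECT * FROM squares(3);
```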
34. PL/pythonU OUT Parameters
● You now can use IN, OUT and INOUT
parameters with PL/pythonU functions.
● That's it!
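A minimal sketch of what this looks like, assuming the plpythonu language is installed in the database. The div_mod function and its parameter names are hypothetical:

```sql
-- Assumes the plpythonu language is installed in this database.
CREATE OR REPLACE FUNCTION div_mod(IN a int, IN b int,
                                   OUT quotient int, OUT remainder int)
AS $$
# Return one value per OUT parameter, in declaration order.
return (a // b, a % b)
$$ LANGUAGE plpythonu;

-- The OUT parameters come back as columns of a single row.
SELECT * FROM div_mod(17, 5);
```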
36. SQL/MED
● Foundation for connecting to external servers
– Future of PL/proxy and DBconnect
– Future of DBI-Connect
CREATE FOREIGN DATA WRAPPER pgsql
VALIDATOR postgresql_fdw_validator;
CREATE SERVER foo FOREIGN DATA WRAPPER pgsql
OPTIONS (host 'remotehost', dbname 'remotedb');
CREATE USER MAPPING FOR PUBLIC SERVER foo OPTIONS
(user 'bob', password 'secret');
37. Multi-Column GIN Indexes
● Bad Old Days: to do a single Full Text Search
index over several columns, you had to
concatenate them.
● New Goodness: you can now do a proper
multicolumn index
– and it's faster!
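A minimal sketch of a multicolumn GIN index for full text search. The articles table, its columns, and the index name are hypothetical:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE articles (id int, title text, body text);

-- One GIN index covering both text columns, with no concatenation needed.
CREATE INDEX articles_fts_idx ON articles
    USING gin (to_tsvector('english', title), to_tsvector('english', body));

-- Each condition matches one of the indexed expressions.
SELECT id
FROM articles
WHERE to_tsvector('english', title) @@ to_tsquery('english', 'postgres')
  AND to_tsvector('english', body)  @@ to_tsquery('english', 'index');
```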
41. Refactored SSL by Magnus
● Proper certificate verification
– Choose level, full verification is default
● Control over all key and certificate files
● SSL certificate authentication
– Trusted root certificate
– Map «cn» value of certificate
42. pg_hba Improvements
● "crypt" is gone (insecure)
● «ident sameuser» => «ident»
● New format for options
– name=value for all options
● usermaps for all external methods
– with regexp support
● Parsed on reload
43. Column Permissions
REVOKE SELECT (col1, col2), INSERT (col1, col2)
ON tab1 FROM role2;
● Restrict access to sensitive columns from
unprivileged ROLEs
– more fine-grained security
– no longer need to use VIEWs to do this
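The same column-list syntax works with GRANT, which is often the simpler way to expose only the non-sensitive columns. A minimal sketch; the staff table and clerk role are hypothetical:

```sql
-- Hypothetical table and role, for illustration only.
CREATE TABLE staff (name text, office text, salary numeric);
CREATE ROLE clerk;

-- Grant read access to the non-sensitive columns only.
GRANT SELECT (name, office) ON staff TO clerk;

-- As clerk, "SELECT name, office FROM staff" is allowed, while
-- "SELECT salary FROM staff" and "SELECT * FROM staff" fail with
-- a permission denied error.
```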
45. Many Patches == Lots of Testing
● Bug Testing
– can you make 8.4 crash?
● Specification Testing
– do the features do what the docs say they do?
● Performance Testing
– is 8.4 really faster? How much?
● Combinational Testing
– what happens when you put several new features
together?
46. Many Patches == Lots of Testing
1. Take a copy of your production applications
2. Port them to 8.4
3. Report breakage and issues
4. Play with implementing new features
Do It Now!
We're counting on you!
47. Contact Information
● Josh Berkus
– josh@postgresql.org
– http://it.toolbox.com/blogs/database-soup
● Upcoming events
– SCALE 7, Los Angeles, Feb. 20
– pgCon 2009, Ottawa, May 20
This talk is copyright 2009 Josh Berkus, and is licensed under the Creative Commons Attribution License