Rob Sullivan took the stage at Waza 2013 to present "Your Database: A Story of Indifference." For more from Rob, ping him at @datachomp.
For Waza videos, stay tuned at http://blog.heroku.com or visit http://vimeo.com/herokuwaza.
2. Vanity Plate
I’m Rob
Enterprise Data Magistrate (Ego Ridden Stable)
SET twitter = @datachomp
SET blog = datachomp.com
SET app.net = @datachomp, balance = balance-$50
9. ORM As A Service
:mycolumn, :integer
:mydate, :timestamp
ints are 4 bytes for every row - 2B
smallints are 2 bytes for every row - 32K
date = 4 bytes
timestamp = 8 bytes
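To make the byte counts above concrete, here is a small Python sketch (not from the talk) that uses the standard `struct` module to show the width and maximum value of 2-, 4-, and 8-byte signed integers, plus the raw column space a narrower type saves. The Postgres type names are used only as labels, the 10 million row count is an assumed figure, and the arithmetic ignores row headers and alignment padding, so treat it as a rough estimate rather than what Postgres would report.

```python
import struct

# Fixed-width signed integers, using standard (unpadded) sizes.
# The Postgres type names are labels only; nothing here talks to a database.
WIDTHS = {
    "smallint": struct.calcsize("<h"),  # 2 bytes
    "integer": struct.calcsize("<i"),   # 4 bytes
    "bigint": struct.calcsize("<q"),    # 8 bytes
}


def max_signed(num_bytes: int) -> int:
    """Largest value a signed two's-complement integer of this width can hold."""
    return 2 ** (num_bytes * 8 - 1) - 1


def bytes_saved(rows: int, wide: str, narrow: str) -> int:
    """Raw column bytes saved by storing `narrow` instead of `wide` in every row."""
    return rows * (WIDTHS[wide] - WIDTHS[narrow])


if __name__ == "__main__":
    for name, width in WIDTHS.items():
        print(f"{name}: {width} bytes, max {max_signed(width):,}")
    # Hypothetical table of 10 million rows (an assumed figure).
    print(bytes_saved(10_000_000, "integer", "smallint"), "bytes saved")
```

The "2B" and "32K" on the slide are these maximum values rounded: a 4-byte int tops out just above 2 billion, a 2-byte smallint just below 32.8 thousand.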
10. The Setting
CREATE TABLE thinner
(id smallserial,
things smallint,
CONSTRAINT thinner_pkey PRIMARY KEY (id));
CREATE TABLE bloaty
(id bigserial,
things bigint,
CONSTRAINT bloaty_pkey PRIMARY KEY (id));
Insert 10K records
Table Size Index
thinner 384kb 240kb
bloaty 464kb 240kb
11. Healthy Diet of Suffering
When the key is added to a larger table
CREATE TABLE smallertime (id serial, fk_numbers smallint, stringers varchar(80), recents timestamp default now());
And when indexes are added, I need sweat pants
Table Size Index
smaller 97MB 113MB
larger 111MB 121MB
12. Death by 1000 Bytes
Disks are cheap, but good disks are not
Bigger DB = Longer Backups/Recovery
Don’t waste your buffer cache
Partitions are free in Postgres
14. Sipping on Gin and GiST
B-Tree, Gin, GiST, Partial, Expression, Unique
Full Text Index / Searching
https://devcenter.heroku.com/articles/postgresql-indexes
15. Sargability
Indexes are only awesome when you can use them
Btree
<
<=
=
>=
>
WHERE sucks LIKE '%tablescan'
WHERE suckyfunc(column) = 'pain'
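The sargability point can be tried out without a Postgres server. The sketch below is an illustration only, not the speaker's demo: it uses Python's built-in `sqlite3` as a stand-in engine, with a hypothetical `pains` table and `pains_sucks_idx` index, and asks SQLite for its query plan. SQLite's planner differs from Postgres in its details, but the same principle shows up: a plain comparison on the indexed column can search the index, while a leading-wildcard LIKE or a function wrapped around the column falls back to scanning.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with one indexed text column (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pains (id INTEGER PRIMARY KEY, sucks TEXT, note TEXT)")
    conn.execute("CREATE INDEX pains_sucks_idx ON pains (sucks)")
    return conn


def plan(conn: sqlite3.Connection, where_clause: str) -> str:
    """Return SQLite's plan text for SELECT * FROM pains WHERE <where_clause>."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT * FROM pains WHERE {where_clause}"
    ).fetchall()
    # The last column of a plan row is the human-readable detail string.
    return rows[0][-1]


if __name__ == "__main__":
    conn = build_db()
    clauses = (
        "sucks = 'pain'",           # sargable: the index can be searched
        "sucks LIKE '%tablescan'",  # leading wildcard: no usable index range
        "upper(sucks) = 'PAIN'",    # function on the column hides it from the index
    )
    for clause in clauses:
        print(f"{clause:26} -> {plan(conn, clause)}")
```

In Postgres the equivalent check is to run EXPLAIN on each query and compare an index scan against a sequential scan.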
16. So Worthless
Indexes are not a get out of perf jail free card
They have to be maintained
They have to have writes
SELECT * FROM pg_stat_all_indexes WHERE schemaname<>'pg_catalog'
17. So Worthless
There Can Be Dupes!
SELECT pg_size_pretty(sum(pg_relation_size(idx))::bigint) AS size,
(array_agg(idx))[1] AS idx1, (array_agg(idx))[2] AS idx2,
(array_agg(idx))[3] AS idx3, (array_agg(idx))[4] AS idx4
FROM (
SELECT indexrelid::regclass AS idx, (indrelid::text ||E'\n'|| indclass::text ||E'\n'||
indkey::text ||E'\n'||
coalesce(indexprs::text,'')||E'\n' ||
coalesce(indpred::text,'')) AS KEY
FROM pg_index) sub
GROUP BY KEY HAVING count(*)>1
ORDER BY sum(pg_relation_size(idx)) DESC;
18. ~Story 3~
No One Is Listening
My tuples are doing the Harlem Shake
19. Do you even add bro?
Relational Algebra
Cartesian Products
R × S = {(r1, r2, ..., rn, s1, s2, ..., sm) | (r1, r2, ..., rn) ∈ R, (s1, s2, ..., sm) ∈ S}
U := T − R
Outer Join
R ⟗ S = (R ⟕ S) ∪ (R ⟖ S)
Equijoins
R ⋈φ S = σφ(R × S)
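As a worked example of the Cartesian product and the join-as-selection identity, the sketch below models relations as Python sets of tuples. The relations `R` and `S` and the predicate `id_matches` are made up for illustration; the point is only that a join is the product filtered by a predicate φ.

```python
from itertools import product

# Two tiny made-up relations, each a set of tuples.
R = {(1, "ann"), (2, "bob")}                  # (id, name)
S = {(1, "red"), (1, "blue"), (3, "green")}   # (owner_id, color)


def cartesian(r, s):
    """R × S: every tuple of r concatenated with every tuple of s."""
    return {a + b for a, b in product(r, s)}


def theta_join(r, s, phi):
    """σφ(R × S): keep only the product tuples for which phi holds."""
    return {t for t in cartesian(r, s) if phi(t)}


def id_matches(t):
    """Equijoin predicate: R.id (position 0) equals S.owner_id (position 2)."""
    return t[0] == t[2]


if __name__ == "__main__":
    print(len(cartesian(R, S)), "tuples in R × S")
    for row in sorted(theta_join(R, S, id_matches)):
        print(row)
```

A real engine avoids materializing the full product, which is why join conditions and the indexes behind them matter so much for the plan it picks.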
20. Explain Yourself
The engine can talk:
EXPLAIN SELECT * FROM sites;
Seq Scan on sites (cost=0.00..1.09 rows=9 width=46)
EXPLAIN (BUFFERS true, ANALYZE true, FORMAT YAML) SELECT * FROM sites;
- Plan:
    Node Type: "Seq Scan"
    Relation Name: "sites"
    Alias: "sites"
    Startup Cost: 0.00
    Total Cost: 1.09
    Plan Rows: 9
    Plan Width: 46
    Actual Total Time: 0.011
    Actual Rows: 9
21. Kind of a big deal
Stats - they are a pretty big deal
SELECT * FROM pg_stat_all_tables
VACUUM ANALYZE <tablename>
22. Real big deal
There are lots of helpful tables
pg_stat_activity pg_stat_all_indexes
pg_stat_database pg_statio_user_tables
pg_stat_user_tables pg_stat_user_functions
Lots of other tools to show you stat info
Lots of good existing queries
Don’t have to reinvent the wheel
23. Set Based Operations
Marry me
Active Record
N+1 is a performance death sentence
RBAR - Row By Agonizing Row
Always profile your App/ORM
The App has context
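The N+1 pattern is easy to see by counting statements. The sketch below is a hypothetical illustration rather than anything shown in the talk: it uses Python's built-in `sqlite3` in place of Postgres, and the `authors`/`posts` schema and the `CountingConnection` wrapper are invented for the example. It fetches the same data row by row and then with one set-based JOIN.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper (hypothetical helper) that counts the SQL statements it sends."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def build_db(n_authors=5):
    """In-memory database where every author has exactly one post."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)")
    for i in range(1, n_authors + 1):
        conn.execute("INSERT INTO authors VALUES (?, ?)", (i, f"author{i}"))
        conn.execute("INSERT INTO posts (author_id, title) VALUES (?, ?)", (i, f"post by {i}"))
    return CountingConnection(conn)


def titles_row_by_row(db):
    """N+1: one query for the authors, then one more query per author."""
    result = {}
    for author_id, name in db.query("SELECT id, name FROM authors ORDER BY id"):
        posts = db.query("SELECT title FROM posts WHERE author_id = ?", (author_id,))
        result[name] = [title for (title,) in posts]
    return result


def titles_set_based(db):
    """Set based: a single JOIN returns every (author, title) pair."""
    result = {}
    rows = db.query(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


if __name__ == "__main__":
    slow, fast = build_db(), build_db()
    assert titles_row_by_row(slow) == titles_set_based(fast)
    print("row by row:", slow.queries, "queries; set based:", fast.queries, "query")
```

With N authors the row-by-row version issues N+1 statements and the JOIN issues one, which is why profiling what the ORM actually sends is worth the effort.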
24. Choose Your Own
Adventure
Care about your data
So we can all live App’ily ever after
Artsy thank you - @tenderlove and @derekgrape
Thank you - Heroku and other sponsors
Thank you - the person making a difference
Happy Hour?!?!