9. Tables
CREATE TABLE employee (
id SERIAL PRIMARY KEY,
first_name TEXT,
last_name TEXT
);
CREATE TABLE sales (
employee_id INTEGER NOT NULL,
sale_closed TIMESTAMPTZ NOT NULL DEFAULT NOW(),
sale_amount MONEY, /* We need to fix this */
FOREIGN KEY (employee_id) REFERENCES employee(id)
);
Tuesday, November 18, 14
10. Data
INSERT INTO employee (first_name, last_name)
VALUES ('Larry', 'Ellison'),
('Bill', 'Gates'),
('Vladimir', 'Yulianov');
11. Moar Data
INSERT INTO sales
SELECT
floor(random()*3)+1, /* Who */
'2014-01-01 00:00:00+00'::timestamptz +
random() * interval '1 year', /* When */
(random() * 1000)::numeric(8,2)::MONEY /* How much */
FROM generate_series(1,1000);
12. How much did each employee sell each quarter?
14.
SELECT
employee_id,
date_trunc('Quarter', sale_closed) AS "Quarter",
SUM(sale_amount)
FROM sales
GROUP BY
employee_id,
date_trunc('Quarter', sale_closed)
ORDER BY
employee_id,
date_trunc('Quarter', sale_closed);
* I left out some formatting.
19. Still Doable...Kinda
(
SELECT employee_id, to_char(date_trunc('Quarter', sale_closed),
'YYYY-"Q"Q') AS "Quarter", sum(sale_amount)
FROM sales
GROUP BY employee_id, date_trunc('Quarter', sale_closed)
ORDER BY employee_id, date_trunc('Quarter', sale_closed)
)
UNION ALL
(
SELECT employee_id, to_char(date_trunc('Year', sale_closed),
'YYYY') AS "Year", sum(sale_amount)
FROM sales
GROUP BY employee_id, date_trunc('Year', sale_closed)
ORDER BY employee_id, date_trunc('Year', sale_closed)
);
29. Quick stare
SELECT
employee_id,
to_char(
date_trunc('Quarter', sale_closed),
'YYYY-"Q"Q'
) AS "Quarter",
sum(sale_amount)
FROM sales
GROUP BY CUBE (
employee_id,
date_trunc('Quarter', sale_closed)
)
ORDER BY employee_id, date_trunc('Quarter', sale_closed);
36.
SELECT
employee_id,
to_char(
date_trunc('Quarter', sale_closed),
'YYYY-"Q"Q'
) AS "Quarter",
sum(sale_amount)
FROM sales
GROUP BY ROLLUP(
employee_id,
date_trunc('Quarter', sale_closed)
)
ORDER BY
employee_id,
date_trunc('Quarter', sale_closed);
44.
SELECT
employee_id,
to_char(
date_trunc('Quarter', sale_closed),
'YYYY-"Q"Q'
) AS "Quarter",
sum(sale_amount)
FROM sales
GROUP BY GROUPING SETS(
(employee_id, date_trunc('Quarter', sale_closed)),
(employee_id)
)
ORDER BY employee_id, date_trunc('Quarter', sale_closed);
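The CUBE and ROLLUP queries above are shorthand for explicit GROUPING SETS lists. A minimal sketch of that expansion, in Python rather than SQL; the function names rollup_sets and cube_sets are made up for illustration and are not PostgreSQL APIs:

```python
from itertools import combinations

def rollup_sets(cols):
    """ROLLUP(c1, ..., cn): every prefix of the column list, longest first."""
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]

def cube_sets(cols):
    """CUBE(c1, ..., cn): every subset of the column list, largest first."""
    return [combo
            for n in range(len(cols), -1, -1)
            for combo in combinations(cols, n)]

cols = ["employee_id", "quarter"]
print(rollup_sets(cols))  # 3 grouping sets, the last being the grand total ()
print(cube_sets(cols))    # 4 grouping sets: 2 ** len(cols)
```

The empty tuple stands for the grand-total row. The GROUPING SETS query on slide 44 lists only two of these sets, so it produces per-quarter rows and per-employee subtotals but no grand total.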
52. HashAgg
• One pass:
• Update the hash table entry for each row's group
• Output the final values at the end of the scan
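A rough sketch of the one-pass idea, assuming a plain Python dict stands in for the executor's hash table and SUM is the only aggregate. The name hash_agg and the sample rows are illustrative, not PostgreSQL code:

```python
from collections import defaultdict

def hash_agg(rows, key, value):
    """One pass over unsorted rows, keeping one running sum per group key."""
    state = defaultdict(float)
    for row in rows:
        state[key(row)] += value(row)   # update this row's group in place
    # Nothing can be emitted before the scan is finished.
    return dict(state)

# (employee_id, quarter, sale_amount), in no particular order
sales = [(1, "2014-Q1", 10.0), (2, "2014-Q1", 5.0),
         (1, "2014-Q1", 2.5), (1, "2014-Q2", 4.0)]
totals = hash_agg(sales, key=lambda r: (r[0], r[1]), value=lambda r: r[2])
print(totals)
```

The input never has to be sorted, which is where the O(n) behaviour comes from, but every group's state lives in memory until the end.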
53. HashAgg
• Not yet in GROUPING SETS
• Algorithmic speedup opportunity:
• O(n) hashing vs. O(n log n) sorting
54. HashAgg-- :-(
• Non-hashable data types
• Aggregate functions with LOTS of state
• Ordered aggs
• Distinct aggs
• No spill-to-disk
55. GroupAgg
• Sorts all input to the agg node to
• Detect group boundary
• Output that group
• Results before end-of-scan
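A sketch of sorted-input aggregation, assuming the rows arrive already sorted by the grouping key and SUM is the only aggregate. The generator group_agg is a hypothetical stand-in, not the executor node itself:

```python
def group_agg(sorted_rows, key, value):
    """Yield (group_key, sum) for input that is already sorted by key."""
    current, total, started = None, 0, False
    for row in sorted_rows:
        k = key(row)
        if started and k != current:
            yield current, total        # group boundary: emit the finished group
            total = 0
        current, started = k, True
        total += value(row)
    if started:
        yield current, total            # flush the final group

sales = [(1, "2014-Q1", 10.0), (2, "2014-Q1", 5.0),
         (1, "2014-Q1", 2.5), (1, "2014-Q2", 4.0)]
rows = sorted(sales, key=lambda r: (r[0], r[1]))
result = list(group_agg(rows, key=lambda r: (r[0], r[1]), value=lambda r: r[2]))
print(result)
```

Only one group's state is held at a time, and the first result is available as soon as the second group begins, well before the scan ends.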
66. GroupAgg !ROLLUP
• Re-plan input to sort with >1 order
• Plan keeps tons of global state
• Does NOT like to be called >1x/plan
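Why more than one sort order: ROLLUP's grouping sets are nested prefixes, so a single sort serves all of them, while arbitrary grouping sets are not. The sketch below greedily packs grouping sets into nested chains and derives one sort order per chain. It is an illustration only, not PostgreSQL's planner logic, and sort_orders is a made-up name:

```python
def sort_orders(grouping_sets):
    """Pack grouping sets into chains of nested sets; return one sort order per chain."""
    chains = []  # each chain holds frozensets, largest first
    for gs in sorted(map(frozenset, grouping_sets), key=len, reverse=True):
        for chain in chains:
            if gs <= chain[-1]:         # nests inside the chain's smallest set
                chain.append(gs)
                break
        else:
            chains.append([gs])         # no chain fits: a new sort order is needed
    orders = []
    for chain in chains:
        order = []
        for gs in reversed(chain):      # smallest set first, so it becomes the prefix
            order.extend(sorted(gs - set(order)))
        orders.append(order)
    return orders

rollup = [("employee_id", "quarter"), ("employee_id",), ()]
cube = [("employee_id", "quarter"), ("employee_id",), ("quarter",), ()]
print(sort_orders(rollup))  # one order covers every ROLLUP set
print(sort_orders(cube))    # the ("quarter",) set forces a second order
```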
79. ChainAgg Nodes
• Pass input rows through unchanged
• Update aggregate state
• Put result rows into a chain-wide shared tuplestore when they hit a group boundary
80. The Last GroupAgg
• Produces its normal output until end-of-data
• Outputs the shared tuplestore
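A toy model of the chain described on the last two slides, with Python generators as nodes and a list as the shared tuplestore. The re-sort between the two nodes is an assumption about how each node gets input in its own order; chain_agg and last_group_agg are hypothetical names, and SUM is the only aggregate:

```python
def chain_agg(sorted_rows, key, value, store):
    """Pass rows through unchanged; append (key, sum) to store at each group boundary."""
    current, total, started = None, 0, False
    for row in sorted_rows:
        k = key(row)
        if started and k != current:
            store.append((current, total))
            total = 0
        current, started = k, True
        total += value(row)
        yield row                        # input goes downstream untouched
    if started:
        store.append((current, total))   # final group, once input is exhausted

def last_group_agg(sorted_rows, key, value, store):
    """Emit this node's own groups, then drain the shared store."""
    current, total, started = None, 0, False
    for row in sorted_rows:
        k = key(row)
        if started and k != current:
            yield current, total
            total = 0
        current, started = k, True
        total += value(row)
    if started:
        yield current, total
    yield from store

sales = [(1, "Q1", 10), (2, "Q1", 5), (1, "Q2", 4), (2, "Q2", 1)]
store = []  # stand-in for the chain-wide shared tuplestore
by_quarter = chain_agg(sorted(sales, key=lambda r: r[1]),
                       key=lambda r: ("quarter", r[1]),
                       value=lambda r: r[2], store=store)
# Assumed step: re-sort the passed-through rows for the next grouping set.
by_employee = sorted(by_quarter, key=lambda r: r[0])
result = list(last_group_agg(by_employee, key=lambda r: ("employee", r[0]),
                             value=lambda r: r[2], store=store))
print(result)
```

The per-employee totals come out first as normal output, followed by the per-quarter totals that the chain node parked in the store.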