SQL Analytic Queries ...
Tips & Tricks
Mostly in PostgreSQL
What are we going to talk about?
- Some lesser (or better) known facts about SQL
- Revision history (just the most important parts)
- A quick pass through SQL basics, since we all know those, right?
- A range of advanced SQL topics, with comparisons and parallels to real-world situations and applications
- Conclusion, discussion and Q&A
Some lesser (or better) known facts about SQL ...
- SQL (Structured Query Language) is STANDARDIZED internationally!
- By an ISO (International Organization for Standardization) committee.
- All major implementations follow the same standard, each to a varying degree and with its own extensions:
Oracle, MSSQL, MySQL, IBM DB2, PostgreSQL, etc.
- Revisions of the standard so far (over the last 30 years):
SQL-86, SQL-89, SQL-92, SQL:1999 (SQL3), SQL:2003, SQL:2008, SQL:2011, SQL:2016
Some lesser (or better) known facts about SQL ...
Today, after many revisions, SQL is:
- Turing complete
- Computationally universal
- A calculation engine
* Turing complete means that it can be used to express any algorithm, or “any software”.
* In other words, it can compute “anything” that is computable.
Some lesser (or better) known facts about SQL ...
Today, SQL is also:
- The only successful 4th generation general-purpose programming language ever (known to mankind)
- Python, Java, C# and all the others are still 3rd generation languages ...
- A 4th generation language abstracts (or hides) unimportant details from the user: hardware, algorithms, processes, threads, etc.
* Take a deep breath and let that sit for a while ...
Some lesser (or better) known facts about SQL ...
SQL is also:
- Declarative
- You just tell (declare to) the machine what you want.
- Let the machine figure out how.
* That’s how Oracle got its name
- Lets you focus on your business logic, your problem, and what is really, really important to you ...
Revision history - SQL-92
SQL-92 - most important parts
- DATE, TIME, TIMESTAMP, INTERVAL, BIT string, VARCHAR strings
- UNION JOIN, NATURAL JOIN
- Conditional expressions with CASE (upgraded in SQL:2008)
- ALTER and DROP, CHECK constraint
- INFORMATION_SCHEMA tables
- Temporary tables; CREATE TEMP TABLE
- CAST (expr AS type), Scroll Cursors…
- Two extensions, published after the standard:
- SQL/CLI (Call Level Interface) - 1995
- SQL/PSM (stored procedures) - 1996
* PostgreSQL 11 (released 2018-10-18) finally implements stored procedures, standardized in 1996
Revision history - SQL:1999 (SQL3)
SQL:1999 (SQL3) - most important parts
- Boolean type, user-defined types
- Common Table Expressions (CTE), WITH clause, RECURSIVE queries
- Grouping sets, GROUP BY ROLLUP, GROUP BY CUBE
- Role-based access control - CREATE ROLE
- UNNEST keyword
Revision history - SQL:2003
SQL:2003 - most important parts
- XML features and functions
- Window functions (ROW_NUMBER OVER, RANK OVER…)
- Auto-generated values (default values)
- Sequence generators, IDENTITY columns
Revision history - SQL:2008 (ISO/IEC 9075:2008)
SQL:2008 (ISO/IEC 9075:2008) - most important parts
- TRUNCATE TABLE
- CASE WHEN ELSE
- TRIGGERS (INSTEAD OF)
- Partitioned JOINS
- XQuery, pattern matching ...
Revision history - SQL:2011 (ISO/IEC 9075:2011)
SQL:2011 (ISO/IEC 9075:2011) - most important parts
- Support for TEMPORAL databases:
- Time period tables (PERIOD FOR)
- Temporal primary keys and temporal referential integrity
- System-versioned tables (queried with FOR SYSTEM_TIME AS OF ... and FOR SYSTEM_TIME BETWEEN ... AND ...)
- Allows working with “historic” data
* MSSQL 2016, Oracle 12c and MariaDB 10.3 implement temporal tables; IBM DB2 v10 uses alternative syntax.
* PostgreSQL requires installation of the temporal_tables extension
Revision history - SQL:2016 (ISO/IEC 9075:2016)
SQL:2016 (ISO/IEC 9075:2016) - most important parts
- JSON functions and JSON support
- Row pattern recognition: matching a sequence of rows against a regular expression pattern
- Date and time formatting and parsing functions
- LISTAGG - function that concatenates the values of a group of rows into a single delimited string
- Polymorphic table functions (table functions without a predefined return type)
1. Basics - EVERYTHING is a set (or table)
-- this is a table (in PostgreSQL, "table my_table" is shorthand for "select * from my_table"):
table my_table;
-- this is another table:
select * from my_table;
-- this is again a table (with hardcoded values):
values ('first'), ('second'), ('third');
-- yep, you've guessed it, another table (or set, if you like):
select * from (
values ('first'), ('second'), ('third')
) t;
-- we can name our table and its columns as we like:
select * from (
values (1, 'first'), (2, 'second'), (3, 'third')
) as t (id, description);
-- we can use predefined functions as tables; this one returns a series:
select i from generate_series(1,10) as t (i);
1. Basics - execution order
/***
Logically, a query is evaluated in the following
order (the planner may pick a different physical
order as long as the result is the same):
1. CTE - Common table expressions
2. FROM and JOINS
3. WHERE
4. GROUP BY
5. HAVING
6. [Window functions]
7. SELECT
8. ORDER BY
9. LIMIT
***/
[Diagram: CTE -> FROM, JOIN -> WHERE -> GROUP BY -> HAVING -> [Window func.] -> SELECT -> ORDER BY -> LIMIT]
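A small sketch of what this logical order means in practice, assuming PostgreSQL: an alias defined in SELECT is not yet visible in WHERE, but it is visible in ORDER BY. The alias name doubled is made up for illustration.

```sql
-- WHERE is evaluated before SELECT, so the alias cannot be used there.
-- This would fail with: column "doubled" does not exist
-- select i * 2 as doubled from generate_series(1, 5) as t (i) where doubled > 4;

-- Repeat the expression in WHERE instead; ORDER BY is evaluated after SELECT
-- and may reference the alias:
select i * 2 as doubled
from generate_series(1, 5) as t (i)
where i * 2 > 4
order by doubled desc;
-- returns 10, 8, 6
```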
2. TEMP TABLES
-- a temp table lives for the duration of the session and is visible only to the connection that created it:
create temp table temp_test1 (id int, t text);
-- only I can see you, no other connection knows that you exist
select * from temp_test1;
-- they can be created on the fly (and usually are) from another table or query using "into":
select *
into temp temp_test2 from (
values (1, 'first'), (2, 'second'), (3, 'third')
) as t (id, description);
-- let's see:
select * from temp_test2;
2. TEMP TABLES
[Diagram, all within a single connection: expensive query (joins, filters) INTO TEMP table -> counts and statistics from TEMP -> sort and page from TEMP -> return multiple result sets]
- Used a lot for optimization (temp tables act as a cache, so expensive operations are not repeated)
- Note that the hardware is abstracted: we don't know whether the table is on disk or in memory, and that's not the point
- Typical usage: paging and sorting over large tables with expensive joins, together with calculation of counts and statistics.
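The paging pattern described above could look roughly like this. It is only a sketch: the table name page_cache is hypothetical, and generate_series stands in for a genuinely expensive query with joins and filters.

```sql
-- Stand-in for an expensive query: store its result once.
create temp table page_cache as
select i as id, 'item ' || i as description
from generate_series(1, 1000) as t (i)
where i % 3 = 0;

-- Total count for the pager, computed from the cached rows:
select count(*) as total_rows from page_cache;

-- Page 2 with 10 rows per page, sorted by id:
select id, description
from page_cache
order by id
limit 10 offset 10;

-- Clean up (the table would also disappear when the session ends):
drop table page_cache;
```

Both the count and the page are read from the cached rows, so the expensive part runs only once per request.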
3. CTE - Common Table Expressions (WITH queries)
-- we can use common table expressions for the same purpose as temp tables:
with my_cte as (
select i from generate_series(1,10) as t (i)
)
select * from my_cte;
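-- we can combine multiple CTEs; before PostgreSQL 12 every CTE was planned on its own (an optimization fence), since version 12 a non-recursive CTE without side effects that is referenced once can be inlined into the outer query: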
with my_cte1 as (
select i from generate_series(1,3) as t (i)
),
my_cte2 as (
select i from generate_series(4,6) as t (i)
),
my_cte3 as (
select i from generate_series(7,9) as t (i)
)
select * from my_cte1
union --intersect
select * from my_cte2
union
select * from my_cte3;
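In PostgreSQL 12 and later, the way a CTE is planned can be stated explicitly. A minimal sketch, reusing the my_cte name from the slide; whether either form is faster depends on the actual query, so treat it as something to measure rather than a rule.

```sql
-- Force the CTE to be computed once as a separate step (the pre-12 behaviour):
with my_cte as materialized (
    select i from generate_series(1, 10) as t (i)
)
select * from my_cte where i > 8;

-- Allow the planner to merge the CTE into the outer query:
with my_cte as not materialized (
    select i from generate_series(1, 10) as t (i)
)
select * from my_cte where i > 8;
```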
3. CTE - Common Table Expressions (WITH queries) - RECURSION
-- CTEs can be used for recursive queries:
with recursive t(i) as (
values (1) -- recursion seed
union all
select i + 1 from t where i < 10 -- recursive step, repeated until it produces no new rows
)
select i from t;
-- Typically used for efficient processing of tree structures, for example:
create temp table employees (id serial, name varchar, manager_id int);
insert into employees (name, manager_id)
values ('Michael North', NULL), ('Megan Berry', 1), ('Sarah Berry', 2),
('Zoe Black', 1), ('Tim James', 2), ('Bella Tucker', 2), ('Ryan Metcalfe', 2),
('Max Mills', 2), ('Benjamin Glover', 3), ('Carolyn Henderson', 4);
select * from employees;
-- Returns the manager with id 2 together with ALL of their direct and indirect subordinates:
with recursive subordinates AS (
select id, manager_id, name from employees where id = 2
union
select e.id, e.manager_id, e.name
from employees e
inner join subordinates s on e.manager_id = s.id
)
select * from subordinates;
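A common extension of the tree query is to carry the depth and the path along with each row. The sketch below is self-contained: it defines a small inline employees set (a subset of the rows above, with explicit ids), and the depth and path columns are additions made for illustration.

```sql
with recursive employees (id, name, manager_id) as (
    values (1, 'Michael North', null), (2, 'Megan Berry', 1),
           (3, 'Sarah Berry', 2), (9, 'Benjamin Glover', 3)
),
subordinates as (
    -- seed: the manager we start from
    select id, name, 0 as depth, name as path
    from employees
    where id = 2
    union all
    -- recursive step: everyone who reports to a row found so far
    select e.id, e.name, s.depth + 1, s.path || ' > ' || e.name
    from employees e
    inner join subordinates s on e.manager_id = s.id
)
select id, name, depth, path
from subordinates
order by path;
```

When running this against the employees temp table instead, the name column is varchar, so the seed would need name::text as path to match the type of the concatenated path.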
4. UNNEST and AGGREGATE
-- any array can be unnested into rows:
select unnest(array[1, 2, 3]);
-- any set of row values can be aggregated back into an array:
select array_agg(i)
from (
values (1), (2), (3)
) t(i);
-- any set of row values can be aggregated into a JSON array:
select json_agg(i)
from (
values (1), (2), (3)
) t(i);
-- from row values to an array and back to row values:
select unnest(array_agg(i))
from (
values (1), (2), (3)
) t(i);
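Two related PostgreSQL features are worth knowing when moving between arrays and rows: keeping the element position while unnesting, and fixing the element order while aggregating. A brief sketch; the column names elem and pos are chosen for illustration.

```sql
-- WITH ORDINALITY adds a 1-based position column next to each element:
select elem, pos
from unnest(array['c', 'a', 'b']) with ordinality as t (elem, pos);

-- ORDER BY inside the aggregate fixes the order of the resulting array:
select array_agg(elem order by elem) as sorted
from unnest(array['c', 'a', 'b']) as t (elem);
-- returns {a,b,c}
```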
5. Subqueries
-- First ten dates in January with extracted day numbers
select cast(d as date), extract(day from d) as i
from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 days') as d(d); -- standard SQL cast
-- First ten dates in February with extracted day numbers
select d::date, extract(day from d) as i
from generate_series('2018-02-01'::date, '2018-02-10'::date, '1 days') as d(d); -- PostgreSQL cast (using ::)
-- Any table expression can be replaced by another query, which is also a table expression.
-- So we can join the previous queries as SUBQUERIES:
select first_month.i, first_month.d as first_month, second_month.d as second_month
from (
select cast(d as date), extract(day from d) as i
from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 days') as d(d)
) first_month inner join (
select cast(d as date), extract(day from d) as i
from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as d(d)
) second_month on first_month.i = second_month.i;
5. Subqueries
-- a subquery can be used literally anywhere, but in some places it must return a single value:
select cast(d as date),
(
select cast(sub.d as date)
from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as sub(d)
where extract(day from sub.d) = extract(day from d.d) -- d.d is the outer row; an unqualified d would resolve to sub.d
limit 1
) as february
from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as d(d);
-- or it can return multiple rows of a single column, used as a filter in the where clause:
select cast(d as date)
from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as d(d)
where extract(day from d) in (
select extract(day from sub.d)
from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as sub(d)
);
-- How efficient are these queries? What do we actually want the machine to do?
-- Let's see what the execution plan has to say ...
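To look at the execution plan in PostgreSQL, prefix the query with EXPLAIN. A minimal sketch using the IN subquery from this slide; the exact plan and cost numbers depend on the server version and settings, so no output is shown.

```sql
-- EXPLAIN prints the plan the planner chose, without running the query:
explain
select cast(d as date)
from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as d(d)
where extract(day from d) in (
    select extract(day from sub.d)
    from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as sub(d)
);

-- EXPLAIN ANALYZE also executes the query and reports actual row counts and timings:
explain analyze
select count(*) from generate_series(1, 100000) as t (i) where i % 2 = 0;
```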
6. LATERAL joins
-- What if we want to reference one subquery from another?
-- This doesn't work: a joined subquery cannot reference columns of the other FROM items:
select by_day.d as date, counts_day.count
from (
select cast(d as date), extract(day from d) as i
from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 days') as d(d)
) by_day inner join (
select count(*) as count, extract(day from d) as i
from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 hours') as d(d)
where extract(day from d) = by_day.i
group by extract(day from d)
) counts_day on by_day.i = counts_day.i;
6. LATERAL joins
-- To achieve this, we must use a LATERAL join:
select by_day.d as date, counts_day.count
from (
select cast(d as date), extract(day from d) as i
from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 days') as d(d)
) by_day inner join lateral (
select count(*) as count, extract(day from d) as i
from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 hours') as d(d)
where extract(day from d) = by_day.i
group by extract(day from d)
) counts_day on by_day.i = counts_day.i;
6. LATERAL joins
-- Now we can simplify this query even further:
select by_day.d as date, counts_day.count
from (
select cast(d as date), extract(day from d) as i
from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 days') as d(d)
) by_day inner join lateral (
select count(*) as count
from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 hours') as d(d)
where extract(day from d) = by_day.i
) counts_day on true;
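Another frequent use of LATERAL is "top N rows per group", because the inner subquery can apply its own ORDER BY and LIMIT for every outer row. A sketch with made-up inline data; the names b and best are chosen for illustration.

```sql
with sales (brand, segment, quantity) as (
    values ('ABC', 'Premium', 100), ('ABC', 'Basic', 200), ('ABC', 'Budget', 50),
           ('XYZ', 'Premium', 100), ('XYZ', 'Basic', 300)
)
select b.brand, best.segment, best.quantity
from (select distinct brand from sales) as b
cross join lateral (
    -- evaluated once per brand: the two rows with the highest quantity
    select s.segment, s.quantity
    from sales s
    where s.brand = b.brand
    order by s.quantity desc
    limit 2
) as best
order by b.brand, best.quantity desc;
```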
7. DISTINCT ON
create temp table sales (brand varchar, segment varchar, quantity int);
insert into sales values ('ABC', 'Premium', 100), ('ABC', 'Basic', 200), ('XYZ', 'Premium', 100), ('XYZ', 'Basic', 300);
select * from sales;
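-- highest quantity per brand:
select brand, max(quantity)
from sales
group by brand;
-- which segment has the highest quantity for each brand? This is NOT allowed, because segment is neither grouped nor aggregated:
select brand, max(quantity), segment
from sales
group by brand;
-- in PostgreSQL we can use select distinct on, which keeps the first row of each brand in the order given by order by: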
select distinct on (brand) brand, quantity, segment
from sales
order by brand, quantity desc;
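DISTINCT ON is a PostgreSQL extension. Where portability matters, the same result can be sketched with the window function row_number(), which is standard SQL. The example repeats the four sales rows inline so it runs on its own; the names ranked and rn are chosen for illustration.

```sql
with sales (brand, segment, quantity) as (
    values ('ABC', 'Premium', 100), ('ABC', 'Basic', 200),
           ('XYZ', 'Premium', 100), ('XYZ', 'Basic', 300)
),
ranked as (
    -- number the rows of each brand, highest quantity first
    select brand, segment, quantity,
           row_number() over (partition by brand order by quantity desc) as rn
    from sales
)
select brand, quantity, segment
from ranked
where rn = 1
order by brand;
-- returns (ABC, 200, Basic) and (XYZ, 300, Basic)
```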
8. OLAP: GROUPING, GROUPING SETS, CUBE, ROLLUP
-- same table as in section 7; skip the next two statements if it already exists in this session:
create temp table sales (brand varchar, segment varchar, quantity int);
insert into sales values ('ABC', 'Premium', 100), ('ABC', 'Basic', 200), ('XYZ', 'Premium', 100), ('XYZ', 'Basic', 300);
-- sum quantities by brand and segment:
select brand, segment, sum(quantity) from sales group by brand, segment;
-- sum quantities by brand only:
select brand, sum(quantity) from sales group by brand;
-- sum quantities by segment only:
select segment, sum(quantity) from sales group by segment;
-- sum all quantities:
select sum(quantity) from sales;
-- we could union all of these queries, but that is long and extremely inefficient:
select brand, segment, sum(quantity) from sales group by brand, segment
union all
select brand, null as segment, sum(quantity) from sales group by brand
union all
select null as brand, segment, sum(quantity) from sales group by segment
union all
select null as brand, null as segment, sum(quantity) from sales;
8. OLAP: GROUPING, GROUPING SETS, CUBE, ROLLUP
-- unless we use grouping sets to get all sums by all categories in a single query;
-- this is usually more efficient than separate queries combined with union, since the table typically has to be scanned only once,
-- and it is a lot shorter and easier to read:
select
brand, segment, sum(quantity)
from
sales
group by grouping sets (
(brand, segment),
(brand),
(segment),
()
)
order by
brand nulls last, segment nulls last;
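In the result, a NULL brand or segment can mean either "subtotal row" or "the value really is NULL". The GROUPING function tells the two apart. A sketch with the same four sales rows inlined; the output column names are chosen for illustration.

```sql
with sales (brand, segment, quantity) as (
    values ('ABC', 'Premium', 100), ('ABC', 'Basic', 200),
           ('XYZ', 'Premium', 100), ('XYZ', 'Basic', 300)
)
select
    brand, segment, sum(quantity) as total,
    grouping(brand)   as brand_is_subtotal,    -- 1 when brand is not part of the grouping set
    grouping(segment) as segment_is_subtotal   -- 1 when segment is not part of the grouping set
from sales
group by grouping sets ((brand, segment), (brand), (segment), ())
order by brand nulls last, segment nulls last;
```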
8. OLAP: GROUPING, GROUPING SETS, CUBE, ROLLUP
-- generate ALL possible grouping combinations:
CUBE(c1,c2,c3)
-- results in:
GROUPING SETS (
(c1,c2,c3),
(c1,c2),
(c1,c3),
(c2,c3),
(c1),
(c2),
(c3),
()
)
-- previous example:
select brand, segment, sum(quantity)
from sales
group by cube (brand, segment);
8. OLAP: GROUPING, GROUPING SETS, CUBE, ROLLUP
-- generate grouping combinations assuming the hierarchy c1 > c2 > c3:
ROLLUP(c1,c2,c3)
-- results in:
GROUPING SETS (
(c1, c2, c3),
(c1, c2),
(c1),
()
)
-- previous example:
select brand, segment, sum(quantity)
from sales
group by rollup (brand, segment);
-- results in:
select brand, segment, sum(quantity)
from sales
group by grouping sets (
(brand, segment),
(brand),
()
);
9. OLAP: WINDOW FUNCTIONS
create temp table employee (id serial, department varchar, salary int);
insert into employee (department, salary)
values
('develop', 5200), ('develop', 4200), ('develop', 4500), ('develop', 6000), ('develop', 5200),
('personnel', 3500), ('personnel', 3900),
('sales', 4800), ('sales', 5000), ('sales', 4800);
-- average salaries by department return fewer rows, because the rows are grouped:
select department, avg(salary)
from employee
group by department;
-- but not if we use the aggregate function over a partition (window); this returns ALL rows:
select department, salary, avg(salary) over (partition by department)
from employee;
9. OLAP: WINDOW FUNCTIONS
-- syntax:
window_function(arg1, arg2,..) OVER (PARTITION BY expression ORDER BY expression)
-- return all employees, no grouping
select
department, salary,
-- average salary:
avg(salary) over (partition by department),
-- employee order number within department (window):
row_number() over (partition by department order by id),
-- rank of employee salary within department (window):
rank() over (partition by department order by salary)
from employee;
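Beyond PARTITION BY and ORDER BY, a window can have a frame clause and can be named once and reused. The sketch below shows a running total and the previous row's value; it uses a few inline rows shaped like the employee table, and the window name w and the output column names are chosen for illustration.

```sql
with employee (id, department, salary) as (
    values (1, 'develop', 5200), (2, 'develop', 4200), (3, 'develop', 4500),
           (4, 'personnel', 3500), (5, 'personnel', 3900)
)
select
    department, id, salary,
    -- running total within the department, in id order:
    sum(salary) over (w rows between unbounded preceding and current row) as running_total,
    -- salary of the previous employee in the same window (null for the first row):
    lag(salary) over w as previous_salary
from employee
window w as (partition by department order by id)
order by department, id;
```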
BONUS: Mandelbrot set fractal
WITH RECURSIVE
x(i)
AS (
VALUES(0)
UNION ALL
SELECT i + 1 FROM x WHERE i < 101
),
Z(Ix, Iy, Cx, Cy, X, Y, I)
AS (
SELECT Ix, Iy, X::FLOAT, Y::FLOAT, X::FLOAT, Y::FLOAT, 0
FROM
(SELECT -2.2 + 0.031 * i, i FROM x) AS xgen(x,ix)
CROSS JOIN
(SELECT -1.5 + 0.031 * i, i FROM x) AS ygen(y,iy)
UNION ALL
SELECT Ix, Iy, Cx, Cy, X * X - Y * Y + Cx AS X, Y * X * 2 + Cy, I + 1
FROM Z
WHERE X * X + Y * Y < 16.0
AND I < 27
),
Zt (Ix, Iy, I) AS (
SELECT Ix, Iy, MAX(I) AS I
FROM Z
GROUP BY Iy, Ix
ORDER BY Iy, Ix
)
SELECT array_to_string(
array_agg(
SUBSTRING(
' .,,,-----++++%%%%@@@@#### ',
GREATEST(I,1),
1
)
),''
)
FROM Zt GROUP BY Iy ORDER BY Iy;
Conclusion and final words
- SQL is a “mysterious machine”. Even after 15 years it can still pull some new surprises.
- Practice is the key. You need to practice, practice and then practice some more.
- The payoffs are huge: application performance can improve dramatically, with significantly less code.
- Less code also makes the system much easier to maintain.
- It can be intimidating to some. The share of keywords in the code is much higher than in most languages, at the level of assembler or COBOL code.
- Don't be intimidated, it will pay off in the end. Any day gone without learning anything new is a wasted day.

Weitere ähnliche Inhalte

Was ist angesagt? (19)

Mysql quick guide
Mysql quick guideMysql quick guide
Mysql quick guide
 
Mysql Ppt
Mysql PptMysql Ppt
Mysql Ppt
 
Oracle Sql & PLSQL Complete guide
Oracle Sql & PLSQL Complete guideOracle Sql & PLSQL Complete guide
Oracle Sql & PLSQL Complete guide
 
Best sql plsql material
Best sql plsql materialBest sql plsql material
Best sql plsql material
 
Les11 Including Constraints
Les11 Including ConstraintsLes11 Including Constraints
Les11 Including Constraints
 
Adbms 21 sql 99 schema definition constraints and queries
Adbms 21 sql 99 schema definition constraints and queriesAdbms 21 sql 99 schema definition constraints and queries
Adbms 21 sql 99 schema definition constraints and queries
 
Oracle sql material
Oracle sql materialOracle sql material
Oracle sql material
 
SQL
SQLSQL
SQL
 
Oracle ORA Errors
Oracle ORA ErrorsOracle ORA Errors
Oracle ORA Errors
 
Les10 Creating And Managing Tables
Les10 Creating And Managing TablesLes10 Creating And Managing Tables
Les10 Creating And Managing Tables
 
Database administration commands
Database administration commands Database administration commands
Database administration commands
 
DML using oracle
 DML using oracle DML using oracle
DML using oracle
 
mySQL and Relational Databases
mySQL and Relational DatabasesmySQL and Relational Databases
mySQL and Relational Databases
 
MYSQL
MYSQLMYSQL
MYSQL
 
SQL
SQLSQL
SQL
 
Lab
LabLab
Lab
 
Dbms lab Manual
Dbms lab ManualDbms lab Manual
Dbms lab Manual
 
BITS: Introduction to relational databases and MySQL - SQL
BITS: Introduction to relational databases and MySQL - SQLBITS: Introduction to relational databases and MySQL - SQL
BITS: Introduction to relational databases and MySQL - SQL
 
MySQL lecture
MySQL lectureMySQL lecture
MySQL lecture
 

Ähnlich wie Sql analytic queries tips

SQL Server Select Topics
SQL Server Select TopicsSQL Server Select Topics
SQL Server Select TopicsJay Coskey
 
MySQL Database System Hiep Dinh
MySQL Database System Hiep DinhMySQL Database System Hiep Dinh
MySQL Database System Hiep Dinhwebhostingguy
 
PostgreSQL Database Slides
PostgreSQL Database SlidesPostgreSQL Database Slides
PostgreSQL Database Slidesmetsarin
 
dbs class 7.ppt
dbs class 7.pptdbs class 7.ppt
dbs class 7.pptMARasheed3
 
Chapter 3.pptx Oracle SQL or local Android database setup SQL, SQL-Lite, codi...
Chapter 3.pptx Oracle SQL or local Android database setup SQL, SQL-Lite, codi...Chapter 3.pptx Oracle SQL or local Android database setup SQL, SQL-Lite, codi...
Chapter 3.pptx Oracle SQL or local Android database setup SQL, SQL-Lite, codi...TAISEEREISA
 
SQL Macros - Game Changing Feature for SQL Developers?
SQL Macros - Game Changing Feature for SQL Developers?SQL Macros - Game Changing Feature for SQL Developers?
SQL Macros - Game Changing Feature for SQL Developers?Andrej Pashchenko
 
Database Oracle Basic
Database Oracle BasicDatabase Oracle Basic
Database Oracle BasicKamlesh Singh
 
DDL(Data defination Language ) Using Oracle
DDL(Data defination Language ) Using OracleDDL(Data defination Language ) Using Oracle
DDL(Data defination Language ) Using OracleFarhan Aslam
 
Tony jambu (obscure) tools of the trade for tuning oracle sq ls
Tony jambu   (obscure) tools of the trade for tuning oracle sq lsTony jambu   (obscure) tools of the trade for tuning oracle sq ls
Tony jambu (obscure) tools of the trade for tuning oracle sq lsInSync Conference
 
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1Die Neuheiten in MariaDB 10.2 und MaxScale 2.1
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1MariaDB plc
 
Mysqlppt
MysqlpptMysqlppt
MysqlpptReka
 
My sql with querys
My sql with querysMy sql with querys
My sql with querysNIRMAL FELIX
 
lec02-data-models-sql-basics.pptx
lec02-data-models-sql-basics.pptxlec02-data-models-sql-basics.pptx
lec02-data-models-sql-basics.pptxcAnhTrn53
 

Ähnlich wie Sql analytic queries tips (20)

SQL Server Select Topics
SQL Server Select TopicsSQL Server Select Topics
SQL Server Select Topics
 
MySQL Database System Hiep Dinh
MySQL Database System Hiep DinhMySQL Database System Hiep Dinh
MySQL Database System Hiep Dinh
 
PostgreSQL Database Slides
PostgreSQL Database SlidesPostgreSQL Database Slides
PostgreSQL Database Slides
 
MySql slides (ppt)
MySql slides (ppt)MySql slides (ppt)
MySql slides (ppt)
 
Overview of Oracle database12c for developers
Overview of Oracle database12c for developersOverview of Oracle database12c for developers
Overview of Oracle database12c for developers
 
dbs class 7.ppt
dbs class 7.pptdbs class 7.ppt
dbs class 7.ppt
 
Oracle Material.pdf
Oracle Material.pdfOracle Material.pdf
Oracle Material.pdf
 
Sql 3
Sql 3Sql 3
Sql 3
 
Chapter 3.pptx Oracle SQL or local Android database setup SQL, SQL-Lite, codi...
Chapter 3.pptx Oracle SQL or local Android database setup SQL, SQL-Lite, codi...Chapter 3.pptx Oracle SQL or local Android database setup SQL, SQL-Lite, codi...
Chapter 3.pptx Oracle SQL or local Android database setup SQL, SQL-Lite, codi...
 
Sql lite android
Sql lite androidSql lite android
Sql lite android
 
Sql
SqlSql
Sql
 
SQL Macros - Game Changing Feature for SQL Developers?
SQL Macros - Game Changing Feature for SQL Developers?SQL Macros - Game Changing Feature for SQL Developers?
SQL Macros - Game Changing Feature for SQL Developers?
 
My Sql concepts
My Sql conceptsMy Sql concepts
My Sql concepts
 
Database Oracle Basic
Database Oracle BasicDatabase Oracle Basic
Database Oracle Basic
 
DDL(Data defination Language ) Using Oracle
DDL(Data defination Language ) Using OracleDDL(Data defination Language ) Using Oracle
DDL(Data defination Language ) Using Oracle
 
Tony jambu (obscure) tools of the trade for tuning oracle sq ls
Tony jambu   (obscure) tools of the trade for tuning oracle sq lsTony jambu   (obscure) tools of the trade for tuning oracle sq ls
Tony jambu (obscure) tools of the trade for tuning oracle sq ls
 
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1Die Neuheiten in MariaDB 10.2 und MaxScale 2.1
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1
 
Mysqlppt
MysqlpptMysqlppt
Mysqlppt
 
My sql with querys
My sql with querysMy sql with querys
My sql with querys
 
lec02-data-models-sql-basics.pptx
lec02-data-models-sql-basics.pptxlec02-data-models-sql-basics.pptx
lec02-data-models-sql-basics.pptx
 

Kürzlich hochgeladen

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 

Kürzlich hochgeladen (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
SQL analytic queries tips

  • 1. SQL Analytic Queries ... Tips & Tricks Mostly in PostgreSQL
  • 2. What are we going to talk about?
      - Some less (or more) known facts about SQL
      - Revision history (just the most important parts)
      - A quick pass through SQL basics, since we all know those, right?
      - A range of advanced SQL topics, with comparisons and parallels to real-world situations and applications
      - Conclusion, discussion and Q&A
  • 3. Some less (or more) known facts about SQL ...
      - SQL (Structured Query Language) is STANDARDIZED internationally!
      - By the ISO (International Organization for Standardization) committee.
      - All major implementations follow the same standard, each to a varying degree: Oracle, MSSQL, MySQL, IBM DB2, PostgreSQL, etc.
      - Revisions of the standard so far (last 30 years): SQL-86, SQL-89, SQL-92, SQL:1999 (SQL3), SQL:2003, SQL:2008, SQL:2011, SQL:2016
  • 4. Some less (or more) known facts about SQL ...
      Today, after many revisions, SQL is:
      - Turing complete
      - Computationally universal
      - A calculation engine
      * Turing complete means that it can be used to write any algorithm or “any software”.
      * In other words, it can do “anything”.
  • 5. Some less (or more) known facts about SQL ...
      Today, SQL is also:
      - The only successful 4th generation general-purpose programming language ever (known to mankind)
      - Python, Java, C# and all the others are still 3rd generation languages ...
      - A 4th generation language abstracts (or hides) unimportant details from the user: hardware, algorithms, processes, threads, etc.
      * Take a deep breath and let that sit for a while ...
  • 6. Some less (or more) known facts about SQL ...
      SQL is also:
      - Declarative
      - You just tell (or declare to) the machine what you want.
      - You let the machine figure out how.
      * That’s how Oracle got its name
      - It lets you focus on your business logic, your problem, and what is really, really important to you …
  • 7. Revision history - SQL-92
      SQL-92 - most important parts
      - DATE, TIME, TIMESTAMP, INTERVAL, BIT string, VARCHAR strings
      - UNION JOIN, NATURAL JOIN
      - Conditional expressions with CASE (upgraded in SQL:2008)
      - ALTER and DROP, CHECK constraint
      - INFORMATION_SCHEMA tables
      - Temporary tables; CREATE TEMP TABLE
      - CAST (expr AS type), scroll cursors …
      - Two extensions, published after the standard:
          - SQL/CLI (Call Level Interface) - 1995
          - SQL/PSM (stored procedures) - 1996
      * PostgreSQL 11 (released 2018-10-18) finally implements stored procedures, standardized in 1996
  • 8. Revision history - SQL:1999 (SQL3)
      SQL:1999 (SQL3) - most important parts
      - Boolean type, user defined types
      - Common Table Expressions (CTE), WITH clause, RECURSIVE queries
      - Grouping sets, GROUP BY ROLLUP, GROUP BY CUBE
      - Role-based access control - CREATE ROLE
      - UNNEST keyword
  • 9. Revision history - SQL:2003
      SQL:2003 - most important parts
      - XML features and functions
      - Window functions (ROW_NUMBER OVER, RANK OVER …)
      - Auto-generated values (default values)
      - Sequence generators, IDENTITY columns
  • 10. Revision history - SQL:2008 (ISO/IEC 9075:2008)
      SQL:2008 (ISO/IEC 9075:2008) - most important parts
      - TRUNCATE TABLE
      - CASE WHEN ELSE
      - TRIGGERS (INSTEAD OF)
      - Partitioned JOINS
      - XQuery, pattern matching ...
  • 11. Revision history - SQL:2011 (ISO/IEC 9075:2011)
      SQL:2011 (ISO/IEC 9075:2011) - most important parts
      - Support for TEMPORAL databases:
          - Time period tables: PERIOD FOR
          - Temporal primary keys and temporal referential integrity
          - System-versioned tables (AS OF SYSTEM_TIME, and VERSIONS BETWEEN SYSTEM_TIME)
          - Allows working with “historic” data
      * MSSQL 2016, Oracle 12c and MariaDB v10.3 implement it; IBM DB2 v10 uses an alternative syntax.
      * PostgreSQL requires installation of the temporal_tables extension
  • 12. Revision history - SQL:2016 (ISO/IEC 9075:2016)
      SQL:2016 (ISO/IEC 9075:2016) - most important parts
      - JSON functions and full JSON support
      - Row pattern recognition: matching a sequence of rows against a regular expression pattern
      - Date and time formatting and parsing functions
      - LISTAGG - function to aggregate values from multiple rows into a single string
      - Functions without a predefined return type (polymorphic table functions)
  • 13. 1. Basics - EVERYTHING is a set (or table)
      -- this is a table (just its name, not a statement on its own):
      --   my_table
      -- this is another table:
      select * from my_table;
      -- this is again a table (with hardcoded values):
      values ('first'), ('second'), ('third');
      -- yep, you've guessed it, another table (or set if you like):
      select * from (
          values ('first'), ('second'), ('third')
      ) t;
      -- we can name our table and its columns as we like:
      select * from (
          values (1, 'first'), (2, 'second'), (3, 'third')
      ) as t (id, description);
      -- we can use pre-defined functions as tables, this one returns a series:
      select i from generate_series(1,10) as t (i);
  • 14. 1. Basics - execution order
      /***
      Queries are logically evaluated in the following order
      (the planner may reorder the physical work, but the result behaves as if this order was followed):
      1. CTE - Common table expressions
      2. FROM and JOINS
      3. WHERE
      4. GROUP BY
      5. HAVING
      6. [Window functions]
      7. SELECT
      8. ORDER BY
      9. LIMIT
      ***/
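A practical consequence of this order, as a minimal sketch on inline data (the `orders` name and its columns are assumptions made up for illustration, not taken from the slides): an alias defined in SELECT does not exist yet when WHERE is evaluated, but it does exist by the time ORDER BY runs.

```sql
-- hypothetical inline data, not from the slides
with orders (id, amount) as (
    values (1, 50), (2, 150), (3, 250)
)
select id, amount * 2 as doubled
from orders
-- where doubled > 200     -- would fail: WHERE is evaluated before SELECT, so the alias is unknown
where amount * 2 > 200     -- works: repeat the expression (or wrap the query in a subquery)
order by doubled desc;     -- works: ORDER BY is evaluated after SELECT
```

The same reasoning explains why a window function cannot be used directly in WHERE: it is computed after WHERE, GROUP BY and HAVING have already run.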
  • 15. 2. TEMP TABLES
      -- a temp table lives for the duration of the connection (session) and is visible only to it:
      create temp table temp_test1 (id int, t text);
      -- only I can see you, no other connection knows that you exist
      select * from temp_test1;
      -- they can be created on the fly (and usually are) from another table or query using "into":
      select * into temp temp_test2 from (
          values (1, 'first'), (2, 'second'), (3, 'third')
      ) as t (id, description);
      -- let's see:
      select * from temp_test2;
  • 16. 2. TEMP TABLES
      Typical flow on a single connection:
      1. Expensive query (joins, filters) INTO TEMP table
      2. Counts and statistics data from TEMP
      3. Sort and page from TEMP
      4. Return multiple result sets
      - Used a lot for optimization: temp tables act as a cache, so expensive operations are not repeated
      - Note that the hardware is abstracted: we don’t know whether the table is on disk or in memory, and that’s not the point
      - Typical, common usage: paging and sorting over large tables with expensive joins, together with calculation of counts and statistics
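The flow above can be sketched as follows. This is only an illustration under stated assumptions: the `page_cache` table, its columns, and the `generate_series` source standing in for an expensive query are all hypothetical.

```sql
-- 1. run the "expensive" query once and cache the result for this connection
create temp table page_cache as
select i as id, 'item ' || i as description
from generate_series(1, 1000) as t (i)   -- stands in for expensive joins and filters
where i % 2 = 0;

-- 2. first result set: counts and statistics, computed from the cached rows
select count(*) as total_rows, min(id) as min_id, max(id) as max_id
from page_cache;

-- 3. second result set: sort and page (page 3 at 20 rows per page)
select id, description
from page_cache
order by id desc
limit 20 offset 40;

-- optional cleanup; the table would disappear with the connection anyway
drop table page_cache;
```

Both result sets are served from the cached rows, so the expensive part runs only once per request.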
  • 17. 3. CTE - Common Table Expressions (WITH queries)
      -- we can use common table expressions for the same purpose as temp tables:
      with my_cte as (
          select i from generate_series(1,10) as t (i)
      )
      select * from my_cte;
      -- we can combine multiple CTEs; before PostgreSQL 12 every CTE was always planned individually,
      -- newer versions may inline a simple CTE into the outer query:
      with my_cte1 as (
          select i from generate_series(1,3) as t (i)
      ), my_cte2 as (
          select i from generate_series(4,6) as t (i)
      ), my_cte3 as (
          select i from generate_series(7,9) as t (i)
      )
      select * from my_cte1
      union -- or: intersect
      select * from my_cte2
      union
      select * from my_cte3;
  • 18. 3. CTE - Common Table Expressions (WITH queries) - RECURSION
      -- a CTE can be used for recursive queries:
      with recursive t(i) as (
          values (1)                          -- recursion seed
          union all
          select i + 1 from t where i < 10    -- recursive call
      )
      select i from t;
      -- Typically used for efficient processing of tree structures, example:
      create temp table employees (id serial, name varchar, manager_id int);
      insert into employees (name, manager_id)
      values ('Michael North', NULL), ('Megan Berry', 1), ('Sarah Berry', 2), ('Zoe Black', 1),
             ('Tim James', 2), ('Bella Tucker', 2), ('Ryan Metcalfe', 2), ('Max Mills', 2),
             ('Benjamin Glover', 3), ('Carolyn Henderson', 4);
      select * from employees;
      -- Returns the manager with id 2 and ALL of their subordinates:
      with recursive subordinates as (
          select id, manager_id, name from employees where id = 2
          union
          select e.id, e.manager_id, e.name
          from employees e
          inner join subordinates s on e.manager_id = s.id
      )
      select * from subordinates;
  • 19. 4. UNNEST and AGGREGATE
      -- any array can be unnested to row values:
      select unnest(array[1, 2, 3]);
      -- any row values can be aggregated back to an array:
      select array_agg(i) from (
          values (1), (2), (3)
      ) t(i);
      -- any row values can be aggregated to a json array:
      select json_agg(i) from (
          values (1), (2), (3)
      ) t(i);
      -- from row values to an array and back to row values:
      select unnest(array_agg(i)) from (
          values (1), (2), (3)
      ) t(i);
  • 20. 5. Subqueries
      -- First ten dates in January with extracted day numbers (ISO type cast):
      select cast(d as date), extract(day from d) as i
      from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 days') as d(d);
      -- First ten dates in February with extracted day numbers (Postgres cast, using ::):
      select d::date, extract(day from d) as i
      from generate_series('2018-02-01'::date, '2018-02-10'::date, '1 days') as d(d);
      -- Any table expression anywhere can be replaced by another query, which is also a table expression.
      -- So we can join the previous queries as SUBQUERIES:
      select first_month.i, first_month.d as first_month, second_month.d as second_month
      from (
          select cast(d as date), extract(day from d) as i
          from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 days') as d(d)
      ) first_month
      inner join (
          select cast(d as date), extract(day from d) as i
          from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as d(d)
      ) second_month on first_month.i = second_month.i;
  • 21. 5. Subqueries
      -- a subquery can be used literally anywhere, but sometimes it must be limited to a single value:
      select cast(d as date), (
          select cast(sub.d as date)
          from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as sub(d)
          -- both sides are qualified: a bare d in here would resolve to sub.d, not to the outer d.d
          where extract(day from sub.d) = extract(day from d.d)
          limit 1
      ) as february
      from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as d(d);
      -- or it can return multiple rows of a single column, to be used as a filter in the where clause:
      select cast(d as date)
      from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as d(d)
      where extract(day from d) in (
          select extract(day from sub)
          from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as sub(d)
      );
      -- How efficient are these queries? What do we actually want our machine to do?
      -- Let's see what the execution plan has to say ...
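To look at the execution plan, the query can be prefixed with EXPLAIN. The sketch below reuses the IN-subquery from the slide; the chosen options are just one reasonable combination, and the plan printed will depend on the PostgreSQL version and settings, so no output is shown here.

```sql
-- ANALYZE actually executes the query and reports real row counts and timings;
-- COSTS OFF hides the planner's cost estimates to keep the output short
explain (analyze, costs off)
select cast(d as date)
from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as d(d)
where extract(day from d) in (
    select extract(day from sub)
    from generate_series(cast('2018-02-01' as date), cast('2018-02-10' as date), '1 days') as sub(d)
);
```

Because ANALYZE runs the statement for real, plain `explain` (without ANALYZE) is the safer choice for statements that modify data.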
  • 22. 6. LATERAL joins
      -- What if we want to reference one subquery from another?
      -- This doesn't work: a joined subquery cannot reference another FROM item (here by_day):
      select by_day.d as date, counts_day.count
      from (
          select cast(d as date), extract(day from d) as i
          from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 days') as d(d)
      ) by_day
      inner join (
          select count(*) as count, extract(day from d) as i
          from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 hours') as d(d)
          where extract(day from d) = by_day.i
          group by extract(day from d)
      ) counts_day on by_day.i = counts_day.i;
  • 23. 6. LATERAL joins
      -- To achieve this, we must use a LATERAL join:
      select by_day.d as date, counts_day.count
      from (
          select cast(d as date), extract(day from d) as i
          from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 days') as d(d)
      ) by_day
      inner join lateral (
          select count(*) as count, extract(day from d) as i
          from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 hours') as d(d)
          where extract(day from d) = by_day.i
          group by extract(day from d)
      ) counts_day on by_day.i = counts_day.i;
  • 24. 6. LATERAL joins
      -- Now we can simplify this query even further:
      select by_day.d as date, counts_day.count
      from (
          select cast(d as date), extract(day from d) as i
          from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 days') as d(d)
      ) by_day
      inner join lateral (
          select count(*) as count
          from generate_series(cast('2018-01-01' as date), cast('2018-01-10' as date), '1 hours') as d(d)
          where extract(day from d) = by_day.i
      ) counts_day on true;
  • 25. 7. DISTINCT ON
      create temp table sales (brand varchar, segment varchar, quantity int);
      insert into sales values
          ('ABC', 'Premium', 100), ('ABC', 'Basic', 200),
          ('XYZ', 'Premium', 100), ('XYZ', 'Basic', 300);
      select * from sales;
      -- highest quantity per brand:
      select brand, max(quantity) from sales group by brand;
      -- which segment has the highest quantity for each brand? This is NOT allowed
      -- (segment is neither grouped nor aggregated):
      select brand, max(quantity), segment from sales group by brand;
      -- we must use select distinct on, which keeps the first row per brand in the given order:
      select distinct on (brand) brand, quantity, segment
      from sales
      order by brand, quantity desc;
  • 26. 8. OLAP: GROUPING, GROUPING SETS, CUBE, ROLLUP
      -- same table as in the DISTINCT ON example; drop it first if it still exists in this session:
      drop table if exists sales;
      create temp table sales (brand varchar, segment varchar, quantity int);
      insert into sales values
          ('ABC', 'Premium', 100), ('ABC', 'Basic', 200),
          ('XYZ', 'Premium', 100), ('XYZ', 'Basic', 300);
      -- sum quantities by brand and segment:
      select brand, segment, sum(quantity) from sales group by brand, segment;
      -- sum quantities by brand only:
      select brand, sum(quantity) from sales group by brand;
      -- sum quantities by segment only:
      select segment, sum(quantity) from sales group by segment;
      -- sum all quantities:
      select sum(quantity) from sales;
      -- we can union all of these queries, but this is long and extremely inefficient:
      select brand, segment, sum(quantity) from sales group by brand, segment
      union all
      select brand, null as segment, sum(quantity) from sales group by brand
      union all
      select null as brand, segment, sum(quantity) from sales group by segment
      union all
      select null as brand, null as segment, sum(quantity) from sales;
  • 27. 8. OLAP: GROUPING, GROUPING SETS, CUBE, ROLLUP
      -- unless we use grouping sets to get all sums by all categories in one query;
      -- this is typically much more efficient than separate queries combined with union
      -- (the table is read once instead of four times), and a lot shorter and easier to read:
      select brand, segment, sum(quantity)
      from sales
      group by grouping sets (
          (brand, segment),
          (brand),
          (segment),
          ()
      )
      order by brand nulls last, segment nulls last;
  • 28. 8. OLAP: GROUPING, GROUPING SETS, CUBE, ROLLUP
      -- generate ALL possible grouping combinations:
      --   CUBE(c1,c2,c3)
      -- results in:
      --   GROUPING SETS (
      --       (c1,c2,c3),
      --       (c1,c2), (c1,c3), (c2,c3),
      --       (c1), (c2), (c3),
      --       ()
      --   )
      -- previous example:
      select brand, segment, sum(quantity)
      from sales
      group by cube (brand, segment);
  • 29. 8. OLAP: GROUPING, GROUPING SETS, CUBE, ROLLUP
      -- generate grouping combinations by assuming the hierarchy c1 > c2 > c3:
      --   ROLLUP(c1,c2,c3)
      -- results in:
      --   GROUPING SETS (
      --       (c1, c2, c3),
      --       (c1, c2),
      --       (c1),
      --       ()
      --   )
      -- previous example:
      select brand, segment, sum(quantity)
      from sales
      group by rollup (brand, segment);
      -- is equivalent to:
      select brand, segment, sum(quantity)
      from sales
      group by grouping sets (
          (brand, segment),
          (brand),
          ()
      );
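The slide titles mention GROUPING, but the examples do not show it. As a hedged sketch: the `grouping()` function (available in PostgreSQL 9.5 and later) returns 1 when a column was rolled up in the current result row, which makes it possible to tell subtotal rows apart from real NULL values. The inline `sales` data below mirrors the slides; the output labels are made up for illustration.

```sql
with sales (brand, segment, quantity) as (
    values ('ABC', 'Premium', 100), ('ABC', 'Basic', 200),
           ('XYZ', 'Premium', 100), ('XYZ', 'Basic', 300)
)
select
    -- grouping(col) = 1 means col was aggregated away in this row
    case when grouping(brand) = 1 then '(all brands)' else brand end as brand,
    case when grouping(segment) = 1 then '(all segments)' else segment end as segment,
    sum(quantity) as quantity
from sales
group by rollup (brand, segment)
-- detail rows first, subtotals after them, grand total last
order by grouping(brand), brand, grouping(segment), segment;
```

Labelling subtotal rows this way avoids confusing them with rows where the source data itself contains NULL.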
  • 30. 9. OLAP: WINDOW FUNCTIONS
      create temp table employee (id serial, department varchar, salary int);
      insert into employee (department, salary)
      values ('develop', 5200), ('develop', 4200), ('develop', 4500), ('develop', 6000), ('develop', 5200),
             ('personnel', 3500), ('personnel', 3900),
             ('sales', 4800), ('sales', 5000), ('sales', 4800);
      -- average salary by department returns fewer rows, because the rows are grouped:
      select department, avg(salary)
      from employee
      group by department;
      -- but not if we use an aggregate function over a partition (window); this returns ALL records:
      select department, salary, avg(salary) over (partition by department)
      from employee;
  • 31. 9. OLAP: WINDOW FUNCTIONS
      -- syntax:
      --   window_function(arg1, arg2, ...) OVER (PARTITION BY expression ORDER BY expression)
      -- return all employees, no grouping:
      select
          department,
          salary,
          -- average salary within the department:
          avg(salary) over (partition by department),
          -- employee order number within the department (window):
          row_number() over (partition by department order by id),
          -- rank of the employee's salary within the department (window):
          rank() over (partition by department order by salary)
      from employee;
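When several window functions share the same partitioning and ordering, the window can be defined once in a WINDOW clause and referenced by name. The sketch below is an illustration on inline data shaped like the `employee` table from the slides; the `running_total` and `diff_to_previous` columns are examples chosen here, not part of the original deck.

```sql
with employee (id, department, salary) as (
    values (1, 'develop', 5200), (2, 'develop', 4200), (3, 'develop', 4500),
           (4, 'personnel', 3500), (5, 'personnel', 3900),
           (6, 'sales', 4800), (7, 'sales', 5000)
)
select
    department,
    salary,
    -- sum of salaries from the start of the partition up to the current row
    sum(salary) over w as running_total,
    -- difference to the previous salary in the same department (NULL for the first row)
    salary - lag(salary) over w as diff_to_previous
from employee
window w as (partition by department order by salary, id)
order by department, salary, id;
```

Adding `id` to the window ordering makes the order unique, so rows with equal salaries are not treated as peers and the running total grows row by row.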
  • 32. BONUS: Mandelbrot set fractal
      WITH RECURSIVE x(i) AS (
          VALUES(0)
          UNION ALL
          SELECT i + 1 FROM x WHERE i < 101
      ),
      Z(Ix, Iy, Cx, Cy, X, Y, I) AS (
          SELECT Ix, Iy, X::FLOAT, Y::FLOAT, X::FLOAT, Y::FLOAT, 0
          FROM (SELECT -2.2 + 0.031 * i, i FROM x) AS xgen(x,ix)
          CROSS JOIN (SELECT -1.5 + 0.031 * i, i FROM x) AS ygen(y,iy)
          UNION ALL
          SELECT Ix, Iy, Cx, Cy, X * X - Y * Y + Cx AS X, Y * X * 2 + Cy, I + 1
          FROM Z
          WHERE X * X + Y * Y < 16.0 AND I < 27
      ),
      Zt (Ix, Iy, I) AS (
          SELECT Ix, Iy, MAX(I) AS I
          FROM Z
          GROUP BY Iy, Ix
          ORDER BY Iy, Ix
      )
      SELECT array_to_string(
          array_agg(
              SUBSTRING(' .,,,-----++++%%%%@@@@#### ', GREATEST(I,1), 1)
          ), ''
      )
      FROM Zt
      GROUP BY Iy
      ORDER BY Iy;
  • 33. Conclusion and final words
      - SQL is a “mysterious machine”. Even after 15 years it can pull off some new surprises.
      - Practice is the key. You need to practice, practice, and then practice some more.
      - The payoffs are huge: application performance can improve dramatically with significantly less code.
      - It can reduce the amount of code and improve system maintainability many times over.
      - It can be intimidating to some. The percentage of keywords in the code is much higher, at the level of assembler or COBOL code.
      - Don't be intimidated, it will pay off in the end. Any day that goes by without learning something new is a wasted day.