Behind JPA and SQL Query Optimizations. A talk about PostgreSQL indexes, the query planner, and Java Persistence API performance tips. Hibernate. Java. PostgreSQL. Spring Boot. Spring JPA.
1. No more than a few milliseconds
Behind JPA and SQL Query Optimizations
Gleydson Lima
gleydson@esig.com.br
https://www.linkedin.com/in/gleydsonlima
2. Agenda
• What is Fast?
• Common problems with JPA/Hibernate
• JPA/Hibernate Cache Levels
• Indexes
• Query Explain
• App Example
• Real App Examples
3. What is fast for the user?
Anything the user does not notice loading
and that feels instantaneous, or close to it.
4. N + 1 SELECT Problem
• Main cases:
  - ManyToOne associations
  - OneToOne associations
  - OneToMany associations
9. Answer
• findAll: an N + 1 select happens in this case.
The reason is that a JPQL query does not, by default,
apply the mapped fetch strategy as a join, so each
association is loaded with its own select.
• findById: in this case, fetch EAGER is
honored, and the entity and its
associations are loaded with just ONE select.
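The difference between the two answers can be made concrete with a small simulation. The sketch below is not JPA code: it is a plain-Java stand-in for a database that only counts how many SELECT statements each strategy would issue. The class name, the sample data, and the JPQL strings in the comments (with hypothetical Aluno and Curso entities) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a database: it only counts the SELECTs issued.
class NPlusOneSketch {
    static int selects = 0;

    // aluno id -> curso id (the ManyToOne foreign key). Every aluno has a
    // distinct curso here; a real persistence context would skip ids that
    // were already loaded.
    static final Map<Long, Long> ALUNO_CURSO = new LinkedHashMap<>();
    // curso id -> curso name
    static final Map<Long, String> CURSO = new LinkedHashMap<>();
    static {
        ALUNO_CURSO.put(1L, 10L);
        ALUNO_CURSO.put(2L, 20L);
        ALUNO_CURSO.put(3L, 30L);
        CURSO.put(10L, "Computer Science");
        CURSO.put(20L, "Mathematics");
        CURSO.put(30L, "Physics");
    }

    // Like "select a from Aluno a": one SELECT for the list,
    // then one more SELECT per association.
    static List<String> findAllWithoutJoin() {
        selects++; // SELECT ... FROM aluno
        List<String> cursoNames = new ArrayList<>();
        for (Long cursoId : ALUNO_CURSO.values()) {
            selects++; // SELECT ... FROM curso WHERE id = ?
            cursoNames.add(CURSO.get(cursoId));
        }
        return cursoNames;
    }

    // Like "select a from Aluno a join fetch a.curso": a single joined SELECT.
    static List<String> findAllWithJoinFetch() {
        selects++; // SELECT ... FROM aluno JOIN curso ON ...
        List<String> cursoNames = new ArrayList<>();
        for (Long cursoId : ALUNO_CURSO.values()) {
            cursoNames.add(CURSO.get(cursoId)); // already part of the joined result
        }
        return cursoNames;
    }

    public static void main(String[] args) {
        selects = 0;
        findAllWithoutJoin();
        System.out.println("without join fetch: " + selects + " selects");
        selects = 0;
        findAllWithJoinFetch();
        System.out.println("with join fetch: " + selects + " selects");
    }
}
```

With the three sample rows, the first strategy issues 4 selects (1 + N with N = 3) and the second issues 1.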
12. N + 1 SELECT for OneToMany
Use JOIN FETCH.
Join-fetch only one collection per query.
13. Performance Tips
- Use EAGER for the main association relationships;
- Use LAZY for ManyToOne associations that are rarely displayed;
- Watch your console with show_sql =
true. Be careful with too many selects.
- A JPQL query does not use the FetchType definition!
17. JPA Projection
Hibernate: select aluno0_.nome as col_0_0_, curso1_.id as
col_1_0_, curso1_.id as id1_1_, curso1_.nome as nome2_1_ from
opt.aluno aluno0_ left outer join opt.curso curso1_ on
aluno0_.curso_id=curso1_.id where aluno0_.nome like ? limit ?
offset ?
20. The Explain Command
PostgreSQL devises a query plan for each query it receives.
Choosing the right plan to match the query structure and the properties of the data is absolutely
critical for good performance, so the system includes a complex planner that tries to choose good
plans.
You can use the EXPLAIN command to see what query plan the planner creates for any query.
21. Explain - The simplest example
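Fields in the plan output, as annotated on the slide:
- startup cost: the estimated cost spent before the first row can be returned (in general, for sorting).
- total cost: the estimated cost of the complete phase.
- rows: the estimated number of rows fetched.
- width: the bytes recovered, that is, the estimated size of the information in each row.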
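To show where each of these fields sits in a plan line, the sketch below pulls the four estimates out of one line of EXPLAIN output with a regular expression. The sample line, its numbers, and the table name are made up for illustration; only the `(cost=startup..total rows=N width=N)` layout is the format PostgreSQL prints. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExplainLineSketch {
    // Holder for the four estimates printed on each plan node.
    static final class Estimates {
        final double startupCost;
        final double totalCost;
        final long rows;
        final int width;

        Estimates(double startupCost, double totalCost, long rows, int width) {
            this.startupCost = startupCost;
            this.totalCost = totalCost;
            this.rows = rows;
            this.width = width;
        }
    }

    // Matches "(cost=<startup>..<total> rows=<n> width=<bytes>)".
    static final Pattern ESTIMATES = Pattern.compile(
        "\\(cost=(\\d+\\.\\d+)\\.\\.(\\d+\\.\\d+) rows=(\\d+) width=(\\d+)\\)");

    static Estimates parse(String planLine) {
        Matcher m = ESTIMATES.matcher(planLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no estimates in: " + planLine);
        }
        return new Estimates(
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2)),
            Long.parseLong(m.group(3)),
            Integer.parseInt(m.group(4)));
    }

    public static void main(String[] args) {
        // Illustrative plan line, not taken from a real database.
        String line = "Seq Scan on aluno  (cost=0.00..458.00 rows=10000 width=244)";
        Estimates e = parse(line);
        System.out.println("startup cost: " + e.startupCost);
        System.out.println("total cost:   " + e.totalCost);
        System.out.println("rows:         " + e.rows);
        System.out.println("width:        " + e.width);
    }
}
```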
33. Case 02
There is no filter.
The counting function, in
this case, has to go
through the whole table.
The planner chooses SEQ
SCAN even if you have
indexes.
38. Tips
• Always index relevant foreign
keys.
• Use native queries for complex queries.
• Explain your queries.
• Execute your queries in a real environment.
The average time must be less than
500 ms!
• Create multi-column indexes for relevant
filters.
• Be careful: indexes use disk space!
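One way to check the 500 ms guideline is to time a query over several runs and average the result. The helper below is a minimal sketch: the class name, the `averageMillis` method, and the placeholder workload are hypothetical, and in a real test the `Runnable` would execute the actual query against the real environment.

```java
class QueryTimingSketch {
    // Budget taken from the tip above: average time under 500 ms.
    static final double BUDGET_MILLIS = 500.0;

    // Runs the given action `runs` times and returns the average elapsed milliseconds.
    static double averageMillis(Runnable query, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    static boolean withinBudget(double averageMillis) {
        return averageMillis < BUDGET_MILLIS;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the real query execution.
        Runnable placeholderQuery = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        double avg = averageMillis(placeholderQuery, 10);
        System.out.println("average ms: " + avg + ", within budget: " + withinBudget(avg));
    }
}
```

Timing on a developer machine with a small dataset says little about production, which is why the tip asks for a real environment.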
39. Future talks (Suggestion)
Join Operations
Nested Loops
Joins two tables by fetching the result from one table and querying the other table for each row from the first.
Hash Join / Hash
The hash join loads the candidate records from one side of the join into a hash table (marked with Hash in the plan) which
is then probed for each record from the other side of the join.
Merge Join
The (sort) merge join combines two sorted lists like a zipper. Both sides of the join must be presorted.
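The three join strategies can be sketched in plain Java over small in-memory arrays. This is only an illustration of the algorithms, not how PostgreSQL implements them; the class name, the row layout ({key, value}), and the sample data are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JoinSketch {
    // Each row is {key, value}; rows join when the keys are equal.
    // Both samples are sorted by key so the merge join can use them as-is.
    static final int[][] LEFT = {{1, 101}, {1, 102}, {2, 103}, {4, 104}};
    static final int[][] RIGHT = {{1, 10}, {2, 20}, {3, 30}};

    // Nested loop: for each left row, scan the right side for matches.
    static List<String> nestedLoop(int[][] left, int[][] right) {
        List<String> out = new ArrayList<>();
        for (int[] l : left) {
            for (int[] r : right) {
                if (l[0] == r[0]) out.add(l[1] + "-" + r[1]);
            }
        }
        return out;
    }

    // Hash join: load one side into a hash table, then probe it with the other side.
    static List<String> hashJoin(int[][] left, int[][] right) {
        Map<Integer, List<int[]>> table = new HashMap<>();
        for (int[] r : right) {
            table.computeIfAbsent(r[0], k -> new ArrayList<>()).add(r);
        }
        List<String> out = new ArrayList<>();
        for (int[] l : left) {
            for (int[] r : table.getOrDefault(l[0], List.of())) {
                out.add(l[1] + "-" + r[1]);
            }
        }
        return out;
    }

    // Merge join: walk two inputs sorted by key in step, like a zipper.
    static List<String> mergeJoin(int[][] left, int[][] right) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.length && j < right.length) {
            if (left[i][0] < right[j][0]) {
                i++;
            } else if (left[i][0] > right[j][0]) {
                j++;
            } else {
                int key = left[i][0];
                int groupStart = j; // rewind point for duplicate left keys
                while (i < left.length && left[i][0] == key) {
                    for (j = groupStart; j < right.length && right[j][0] == key; j++) {
                        out.add(left[i][1] + "-" + right[j][1]);
                    }
                    i++;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(nestedLoop(LEFT, RIGHT));
        System.out.println(hashJoin(LEFT, RIGHT));
        System.out.println(mergeJoin(LEFT, RIGHT));
    }
}
```

On the sample data all three methods return the same pairs, `[101-10, 102-10, 103-20]`; they differ in how much work they do and in whether they need a hash table or sorted input.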
40. Sorting and Grouping
Sort / Sort Key
Sorts the set on the columns mentioned in Sort Key. The Sort operation needs large amounts of memory to materialize
the intermediate result (not pipelined).
GroupAggregate
Aggregates a presorted set according to the group by clause. This operation does not buffer large amounts of data
(pipelined).
HashAggregate
Uses a temporary hash table to group records. The HashAggregate operation does not require a presorted data set; instead,
it uses large amounts of memory to materialize the intermediate result (not pipelined). The output is not ordered in any
meaningful way.
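The contrast between the pipelined and the materializing operations can be sketched in plain Java with a per-key count. This is an illustration of the idea only, not PostgreSQL's implementation; the class name, method names, and sample keys are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AggregateSketch {
    static final int[] KEYS = {3, 1, 3, 2, 1, 3};

    // Sort: the whole input must be read before the first row can be emitted.
    static int[] sort(int[] keys) {
        int[] copy = keys.clone();
        Arrays.sort(copy);
        return copy;
    }

    // GroupAggregate: input is already sorted, so a group can be emitted as
    // soon as the key changes; only one running count is kept in memory.
    static List<String> groupAggregate(int[] sortedKeys) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < sortedKeys.length) {
            int key = sortedKeys[i];
            int count = 0;
            while (i < sortedKeys.length && sortedKeys[i] == key) {
                count++;
                i++;
            }
            out.add(key + "=" + count);
        }
        return out;
    }

    // HashAggregate: input may arrive in any order, but every group stays in
    // the hash table until the input is exhausted.
    static Map<Integer, Integer> hashAggregate(int[] keys) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(groupAggregate(sort(KEYS)));
        System.out.println(hashAggregate(KEYS));
    }
}
```

Both paths compute the same counts; the grouped path depends on a prior sort, while the hash path trades that sort for memory and gives no ordering guarantee.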