2. Why do we need to tune?
‣ No query planner is ever perfect
‣ You know your domain better than the
database
3. The Cost planner
‣ Introduced in 2.2.0
‣ It uses the statistics service in Neo4j to
assign costs to various query execution
plans, picking the cheapest one
‣ All queries use this by default
5. How do I view a query plan?
‣ EXPLAIN
• shows the execution plan without actually
executing it or returning any results.
‣ PROFILE
• executes the statement and returns the results
along with profiling information.
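Both keywords are simply prefixed to the statement. A minimal sketch, assuming the movie dataset used in the later slides (a `Movie` label with a `title` property); run the two statements separately:

```cypher
// Plan only: the statement is not executed and no rows are returned.
EXPLAIN
MATCH (movie:Movie {title: "The Matrix"})
RETURN movie

// Executes the statement and reports rows and db hits per operator.
PROFILE
MATCH (movie:Movie {title: "The Matrix"})
RETURN movie
```

Because PROFILE really runs the statement, EXPLAIN is the safer first look at a query that might be slow or that writes data.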
9. What is our goal?
At a high level, the goal is
simple: get the number of
db hits down.
10. What is a database hit?
“an abstract unit of storage engine work.”
11. Execution plan operators
‣ Operators to look out for
• All nodes scan: expensive
• Label scan: cheaper
• Node index seek: cheapest
• Node index scan: used for range queries
‣ http://neo4j.com/docs/3.0.0-RC1/execution-plans.html
17. Finding The Matrix
(no label)
MATCH (movie
{title: "The Matrix"})
RETURN movie
(label)
MATCH (movie:Movie
{title: "The Matrix"})
RETURN movie
18. Tip: Use indexes and constraints
‣ Indexes for non-unique values
‣ Constraints for unique values
CREATE INDEX ON :Movie(title)
CREATE INDEX ON :Person(name)
CREATE CONSTRAINT ON (g:Genre)
ASSERT g.name IS UNIQUE
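The statements above use the syntax that was current for Neo4j 3.0, the version this deck targets. Later releases changed the schema commands; the sketch below shows the same three definitions in the newer form, assuming Neo4j 5.x. Treat it as a pointer and check the manual for the version you run.

```cypher
// Assumes Neo4j 5.x syntax; run each statement separately.
CREATE INDEX FOR (m:Movie) ON (m.title)

CREATE INDEX FOR (p:Person) ON (p.name)

CREATE CONSTRAINT FOR (g:Genre) REQUIRE g.name IS UNIQUE
```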
19. How does Neo4j use indexes?
‣ Indexes are only used to find the starting
point for queries.
‣ Relational: use index scans to look up
rows in tables and join them
with rows from other tables
‣ Graph: use indexes to find the starting
points for a query.
20. Tip: Use indexes and constraints
MATCH (movie:Movie
{title: "The Matrix"})
RETURN movie
21. Finding The Matrix
(no index)
MATCH (movie:Movie
{title: "The Matrix"})
RETURN movie
(index)
MATCH (movie:Movie
{title: "The Matrix"})
RETURN movie
22. Actors who appeared together
MATCH (a:Person {name:"Tom Hanks"})
-[:ACTS_IN]->()<-[:ACTS_IN]-
(b:Person {name:"Meg Ryan"})
RETURN COUNT(*)
24. Tip: Enforce index usage
MATCH (a:Person {name:"Tom Hanks"})
-[:ACTS_IN]->()<-[:ACTS_IN]-
(b:Person {name:"Meg Ryan"})
USING INDEX a:Person(name)
USING INDEX b:Person(name)
RETURN COUNT(*)
26. Actors who appeared together
MATCH (a:Person {name:"Tom Hanks"})
-[:ACTS_IN]->()<-[:ACTS_IN]-
(b:Person {name:"Meg Ryan"})
RETURN COUNT(*)
MATCH (a:Person {name:"Tom Hanks"})
-[:ACTS_IN]->()<-[:ACTS_IN]-
(b:Person {name:"Meg Ryan"})
USING INDEX a:Person(name)
USING INDEX b:Person(name)
RETURN COUNT(*)
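Whether the hints actually help depends on the data, so it is worth profiling both versions and comparing total db hits. A sketch of the hinted version, assuming the index on `:Person(name)` from the earlier slide exists:

```cypher
// With both hints, the plan would be expected to seek the index for
// a and for b, rather than starting from one end and expanding.
PROFILE
MATCH (a:Person {name:"Tom Hanks"})
      -[:ACTS_IN]->()<-[:ACTS_IN]-
      (b:Person {name:"Meg Ryan"})
USING INDEX a:Person(name)
USING INDEX b:Person(name)
RETURN COUNT(*)
```

Running the same statement without the two USING INDEX lines gives the baseline to compare against.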
27. Tom Hanks’ colleagues’ movies
MATCH (p:Person {name:"Tom Hanks"})
-[:ACTS_IN]->(m1)<-[:ACTS_IN]-
(coActor)-[:ACTS_IN]->(m2)
RETURN distinct m2.title
29. Tip: Reduce cardinality of work in progress (WIP)
MATCH (p:Person {name:"Tom Hanks"})
-[:ACTS_IN]->(m1)<-[:ACTS_IN]-
(coActor)
WITH DISTINCT coActor
MATCH (coActor)-[:ACTS_IN]->(m2)
RETURN distinct m2.title
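WITH DISTINCT pays off when the same co-actor shows up in many rows, because every duplicate row would otherwise be expanded again in the second MATCH. One way to gauge this, sketched here with the hypothetical aliases `totalRows` and `distinctCoActors`, is to count both:

```cypher
// totalRows counts one row per (m1, coActor) pair;
// distinctCoActors counts each co-actor once.
MATCH (p:Person {name:"Tom Hanks"})
      -[:ACTS_IN]->(m1)<-[:ACTS_IN]-(coActor)
RETURN count(coActor) AS totalRows,
       count(DISTINCT coActor) AS distinctCoActors
```

The larger the gap between the two numbers, the more repeated work the WITH DISTINCT step removes.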
31. Tom Hanks’ colleagues’ movies
MATCH (p:Person {name:"Tom Hanks"})
-[:ACTS_IN]->(m1)<-[:ACTS_IN]-
(coActor)-[:ACTS_IN]->(m2)
RETURN distinct m2.title
MATCH (p:Person {name:"Tom Hanks"})
-[:ACTS_IN]->(m1)<-[:ACTS_IN]-(coActor)
WITH DISTINCT coActor
MATCH (coActor)-[:ACTS_IN]->(m2)
RETURN distinct m2.title
32. Hints
USING INDEX
Force the use of a specific index
MATCH (a:Person {name:"Tom Hanks"})-[:ACTS_IN]->()
USING INDEX a:Person(name)
RETURN count(*)
33. Hints
USING SCAN
Force a label scan on a lower-cardinality label
MATCH (a:Actor)-->(m:Movie:Comedy)
USING SCAN m:Comedy
RETURN count(distinct a)
35. Use parameters
MATCH (p:Person {name: {name}})
-[:ACTS_IN]->(m)
RETURN m.title
MATCH (p:Person {name:"Tom Hanks"})
-[:ACTS_IN]->(m)
RETURN m.title
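With a parameter the query text stays the same for every actor, so the planner can reuse a cached plan instead of planning each literal value from scratch. The `{name}` form above is the syntax of the Neo4j version this deck targets; the sketch below uses the `$name` form, which assumes a newer release.

```cypher
// The value of name is supplied by the client or driver at run time.
MATCH (p:Person {name: $name})
      -[:ACTS_IN]->(m)
RETURN m.title
```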
36. Avoid Cartesian products
‣ Easy to do this inadvertently:
MATCH (a:Actor), (m:Movie)
RETURN count(a), count(m)
‣ This is correct, and performs better:
MATCH (a:Actor)
WITH count(a) as a_count
MATCH (m:Movie)
RETURN a_count, count(m)
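The first query pairs every actor with every movie before counting, so both counts report the size of the product. With a hypothetical 3 actors and 4 movies, each count would be 12 rather than 3 and 4. A quick way to spot this before running anything is to look at the plan:

```cypher
// Look for a CartesianProduct operator in the resulting plan.
EXPLAIN
MATCH (a:Actor), (m:Movie)
RETURN count(a), count(m)
```

Adding DISTINCT inside the counts would fix the numbers but not the cost, since the product is still built; the WITH version above avoids building it at all.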
39. Only RETURN what you need
‣ This is not recommended:
MATCH (a:Actor)
RETURN a
‣ Use this instead:
MATCH (a:Actor)
RETURN a.name, a.birthdate, a.height
40. tl;dr
‣ View query plans with EXPLAIN and PROFILE
‣ Use labels
‣ Index your starting points
‣ Reduce work in progress
‣ Remember the hints
41. Thanks for coming
‣ And don’t forget, if the tips aren’t working,
ask us for help on Stack Overflow!
Mark Needham @markhneedham
Petra Selmer @Aethelraed