The SPARQL Query Graph Model for Query Optimization
1. The SPARQL Query Graph Model
for Query Optimization
Olaf Hartig and Ralf Heese
2. Postings on the Jena Mailing List
Question:
http://groups.yahoo.com/group/jena-dev/message/21436
Date: Mar 8, 2006
A series of SPARQL queries of the form:
... WHERE {
{ ?family ex:dad ?d .
?d ex:name "Peter" . }
{ ?family ex:mom ?m .
?m ex:name "Robin" . } ...
My queries run very slowly
Simple queries on a database of 10 000 trees describing families
Answer:
... Put the more specific part of the query first; it makes a significant difference. ...
Reply:
... My time went from 33 000 ms to 150 ms ...
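The advice from the mailing list can be illustrated with a minimal sketch. It assumes that "more specific" simply means "more bound (non-variable) terms"; the helper names `bound_terms` and `most_specific_first` are hypothetical and not part of Jena. A real optimizer would use statistics about the data rather than this syntactic count.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


def is_variable(term: str) -> bool:
    """SPARQL variables start with '?'."""
    return term.startswith("?")


def bound_terms(pattern: Triple) -> int:
    """Count the terms of a triple pattern that are not variables."""
    return sum(1 for term in pattern if not is_variable(term))


def most_specific_first(patterns: List[Triple]) -> List[Triple]:
    """Order patterns by descending number of bound terms.

    sorted() is stable, so patterns with the same count keep their order.
    """
    return sorted(patterns, key=bound_terms, reverse=True)


patterns = [
    ("?family", "ex:dad", "?d"),   # one bound term
    ("?d", "ex:name", '"Peter"'),  # two bound terms
]
ordered = most_specific_first(patterns)
```

Here the pattern with the literal "Peter" moves to the front, which is the kind of manual reordering the answer recommends.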
Olaf Hartig and Ralf Heese - The SPARQL Query Graph Model for Query Optimization
5. Outline
Query processing in databases
SPARQL query graph model (SQGM)
Rewriting SQGMs
Evaluation
Conclusion
7. Tasks of the Query Engine
Query Parsing
Query Rewriting
QEP Generation
QEP Execution
[Figure: the four tasks arranged around the internal representation of the query]
QEP: Query Execution Plan
9. Advantages
SPARQL Query Graph Model
Supports all phases of query processing
Extensible to new concepts of the query language
Adaptable to changes of the query language
Stores additional information needed for query processing
10. Basic Structures
Operators
Process data (sets of variable bindings, RDF graphs)
Head: provided variables
Body: operator details
Dataflows
Connect the output of one operator to the input of another
Directed acyclic graph
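One way to picture these structures is the following minimal sketch. The class and function names (`Operator`, `Dataflow`, `connect`, `is_acyclic`) are assumptions made for illustration; the slides do not show the authors' actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(eq=False)  # identity-based equality, so operators can live in sets
class Operator:
    kind: str                 # e.g. "graph pattern", "join"
    head: List[str]           # provided variables
    body: str                 # operator details
    inputs: List["Dataflow"] = field(default_factory=list)


@dataclass(eq=False)
class Dataflow:
    source: Operator          # operator whose output is consumed
    target: Operator          # operator that consumes it
    variables: List[str]      # variables passed along this dataflow


def connect(source: Operator, target: Operator, variables: List[str]) -> Dataflow:
    """Create a dataflow from source's output to target's input."""
    flow = Dataflow(source, target, variables)
    target.inputs.append(flow)
    return flow


def is_acyclic(root: Operator) -> bool:
    """Depth-first check that no operator consumes its own output."""
    visiting: Set[Operator] = set()
    done: Set[Operator] = set()

    def visit(op: Operator) -> bool:
        if op in done:
            return True
        if op in visiting:
            return False      # reached an operator that is still on the stack
        visiting.add(op)
        ok = all(visit(flow.source) for flow in op.inputs)
        visiting.discard(op)
        done.add(op)
        return ok

    return visit(root)


graph = Operator("graph access", [], "http://example.org/university.rdf")
names = Operator("graph pattern", ["?s", "?n"], "?s ub:name ?n")
connect(graph, names, [])
```

Storing the dataflows on the consuming operator keeps traversal from the result operator down to the graph access operators simple; other layouts are equally possible.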
11. Constructing an SQGM
SELECT ?n ?c
FROM <http://example.org/university.rdf>
WHERE {
?s rdf:type ub:GraduateStudent .
OPTIONAL { ?s ub:takesCourse ?c }
?s ub:name ?n .
}
Graph access operators
Graph pattern operators
Join operators
Select result operators
[Figure: SQGM of the query. A select result operator (?n, ?c) consumes a JOIN providing ?s, ?c, ?n. The left input of that JOIN is an optional JOIN (providing ?s, ?c) of the graph pattern operators for ?s rdf:type ub:GraduateStudent and ?s ub:takesCourse ?c; its right input is the graph pattern operator for ?s ub:name ?n. All graph pattern operators read from the graph access operator for http://example.org/university.rdf.]
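The SQGM of this query can be written down as a small operator tree. This is a sketch under assumptions: the `Op` class and the `render` helper are hypothetical, and dataflows are reduced to a plain list of input operators.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    kind: str                         # operator type
    head: List[str]                   # provided variables
    body: str = ""                    # operator details
    inputs: List["Op"] = field(default_factory=list)


# Graph access operator for the FROM clause.
graph = Op("graph access", [], "http://example.org/university.rdf")

# One graph pattern operator per triple pattern of the WHERE clause.
students = Op("graph pattern", ["?s"], "?s rdf:type ub:GraduateStudent", [graph])
courses = Op("graph pattern", ["?s", "?c"], "?s ub:takesCourse ?c", [graph])
names = Op("graph pattern", ["?s", "?n"], "?s ub:name ?n", [graph])

# OPTIONAL becomes an optional join; the remaining pattern is joined on top.
optional_join = Op("join", ["?s", "?c"], "optional", [students, courses])
join = Op("join", ["?s", "?c", "?n"], "", [optional_join, names])

# The SELECT clause becomes the select result operator.
select = Op("select result", ["?n", "?c"], "", [join])


def render(op: Op, depth: int = 0) -> List[str]:
    """Return an indented, line-per-operator view of the operator tree."""
    label = f"{'  ' * depth}{op.kind} [{', '.join(op.head)}] {op.body}".rstrip()
    lines = [label]
    for child in op.inputs:
        lines += render(child, depth + 1)
    return lines


print("\n".join(render(select)))
```

The shared graph access operator is printed once per consumer, because the structure is a directed acyclic graph rather than a strict tree.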
12. Operator Types
Graph selection operators
Graph merge operators
Union operators
Solution modifier operators
Construct result operators
...
[Figure: SQGM of the example query]
14. Query Rewriting
Goals:
Faster evaluation of a query
Provide more options for the generation of query plans, e.g.:
Data access strategy
Join order
Selection of indexes
15. Transformation Rules
Currently 26 transformation rules, e.g.:
MergeJoinedGPOs
[Figure: a JOIN (providing ?s, ?n, ?c) of two graph pattern operators, one with ?s ub:takesCourse ?c and one with ?s ub:name ?n and regex(?n, "^S"), is replaced by a single graph pattern operator that contains ?s ub:name ?n, ?s ub:takesCourse ?c and regex(?n, "^S")]
SwitchJoinedJoinRightInputs
[Figure: two nested JOINs before and after their right inputs are switched]
FindContradiction
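The MergeJoinedGPOs rule can be sketched as follows. The classes `GPO` and `Join` and the function `merge_joined_gpos` are hypothetical names, and the precondition used here (a non-optional join whose two inputs are both graph pattern operators) is an assumption; the slides do not state the exact conditions of the rule.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class GPO:
    """Graph pattern operator: triple patterns plus filter expressions."""
    patterns: List[str]
    filters: List[str] = field(default_factory=list)

    @property
    def head(self) -> List[str]:
        """Provided variables, in order of first appearance in the patterns."""
        seen: List[str] = []
        for pattern in self.patterns:
            for term in pattern.split():
                if term.startswith("?") and term not in seen:
                    seen.append(term)
        return seen


@dataclass
class Join:
    left: Union["Join", GPO]
    right: Union["Join", GPO]
    optional: bool = False


def merge_joined_gpos(op: Union[Join, GPO]) -> Union[Join, GPO]:
    """Replace a plain join of two GPOs by one GPO; otherwise return op unchanged."""
    if (
        isinstance(op, Join)
        and not op.optional              # assumed: optional joins are not merged
        and isinstance(op.left, GPO)
        and isinstance(op.right, GPO)
    ):
        return GPO(
            op.left.patterns + op.right.patterns,
            op.left.filters + op.right.filters,
        )
    return op


joined = Join(
    GPO(["?s ub:takesCourse ?c"]),
    GPO(["?s ub:name ?n"], ['regex(?n, "^S")']),
)
merged = merge_joined_gpos(joined)
```

After the merge, a single operator carries both triple patterns and the filter, so a later phase can match them together.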
16. Heuristic: Merge Graph Pattern Operators
Heuristic: rewrite strategy based on a set of rules
The graph pattern operators for ?s rdf:type ub:GraduateStudent and ?s ub:name ?n cannot be merged, because they are inputs of different join operators
[Figure: SQGM of the example query with these two graph pattern operators marked]
17. Heuristic: Merge Graph Pattern Operators
But these could be merged if they were operands of the same join operation.
➔ Apply transformation rules to restructure the SQGM (SwitchJoinedJoinRightInputs)
[Figure: SQGM before restructuring. The optional JOIN of the graph pattern operators for ?s rdf:type ub:GraduateStudent and ?s ub:takesCourse ?c is the left input of the upper JOIN; the graph pattern operator for ?s ub:name ?n is its right input.]
18. Heuristic: Merge Graph Pattern Operators
But these could be merged if they were operands of the same join operation.
➔ Apply transformation rules to restructure the SQGM (SwitchJoinedJoinRightInputs)
[Figure: SQGM after restructuring. The upper JOIN is now the optional one, with the graph pattern operator for ?s ub:takesCourse ?c as its right input; its left input is a JOIN (providing ?s, ?n) of the graph pattern operators for ?s rdf:type ub:GraduateStudent and ?s ub:name ?n.]
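The restructuring step can be sketched as a function on a nested join. The names `GPO`, `Join`, and `switch_joined_join_right_inputs` are hypothetical. The sketch assumes that the optional flag travels with the right input it belongs to, and it deliberately omits any check that the switch preserves the query result (for example, which variables the inputs share); a real rule would need such a check.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class GPO:
    pattern: str


@dataclass
class Join:
    left: Union["Join", GPO]
    right: Union["Join", GPO]
    optional: bool = False   # True: the right input is optional


def switch_joined_join_right_inputs(outer: Join) -> Join:
    """Swap the right input of a join with the right input of its left child join."""
    inner = outer.left
    if not isinstance(inner, Join):
        return outer         # rule does not apply
    # Each right input keeps the join type (plain or optional) it had before.
    new_inner = Join(inner.left, outer.right, outer.optional)
    return Join(new_inner, inner.right, inner.optional)


students = GPO("?s rdf:type ub:GraduateStudent")
courses = GPO("?s ub:takesCourse ?c")
names = GPO("?s ub:name ?n")

before = Join(Join(students, courses, optional=True), names)
after = switch_joined_join_right_inputs(before)
```

In `after`, the two graph pattern operators for rdf:type and ub:name are inputs of the same plain join, which is the situation the merge rule needs.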
19. Heuristic: Merge Graph Pattern Operators
Apply transformation rule to merge (MergeJoinedGPOs)
[Figure: restructured SQGM before merging. The lower JOIN has the graph pattern operators for ?s rdf:type ub:GraduateStudent and ?s ub:name ?n as inputs; the upper, optional JOIN has the graph pattern operator for ?s ub:takesCourse ?c as its right input.]
20. Heuristic: Merge Graph Pattern Operators
Apply transformation rule to merge (MergeJoinedGPOs)
[Figure: SQGM after merging. A single graph pattern operator with ?s rdf:type ub:GraduateStudent and ?s ub:name ?n (providing ?s, ?n) is the left input of the optional JOIN; the graph pattern operator for ?s ub:takesCourse ?c is its right input.]
21. Evaluation Results
Measured query execution time of a selected query (seconds):
Dataset             Original query   Rewritten query
UnivBench (1.0)     5.8              2.5
UnivBench (5.0)     39.4             16.4
UnivBench (10.0)    77.9             32.3
Factor ≈ 2.4
SELECT ?n ?c
FROM <http://example.org/university.rdf>
WHERE {
?s rdf:type ub:GraduateStudent .
OPTIONAL { ?s ub:takesCourse ?c }
?s ub:name ?n .
}
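The factor quoted on the slide can be checked from the measured times. The snippet below only redoes that arithmetic; the dictionary layout and variable names are chosen for illustration.

```python
# Measured execution times in seconds: (original query, rewritten query).
timings = {
    "UnivBench (1.0)": (5.8, 2.5),
    "UnivBench (5.0)": (39.4, 16.4),
    "UnivBench (10.0)": (77.9, 32.3),
}

# Speedup factor: how many times faster the rewritten query runs.
factors = {name: original / rewritten for name, (original, rewritten) in timings.items()}

# Relative time saving: share of the original time that is no longer needed.
savings = {name: 1 - rewritten / original for name, (original, rewritten) in timings.items()}

for name in timings:
    print(f"{name}: factor {factors[name]:.2f}, saving {savings[name]:.0%}")
```

The factors come out at roughly 2.3 for the smallest dataset and 2.4 for the two larger ones, consistent with the stated factor of about 2.4.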
22. Evaluation Results ctd.
Time for transformation between models: < 1 ms
Query with contradiction: nearly 100% savings
Approx. time savings:
Average: 45%
Best cases: 95%
23. Explanation of the Results
Reason: fast path algorithm of Jena
Perform pattern matching within the underlying relational database
Combined match of multiple basic graph patterns possible
Original query: multiple SQL queries, one for every basic graph pattern
... WHERE {
?s rdf:type ub:GraduateStudent .
OPTIONAL { ?s ub:takesCourse ?c }
?s ub:name ?n .
}
Rewritten query: one SQL query combining the marked basic graph patterns (?s rdf:type ub:GraduateStudent and ?s ub:name ?n)
... WHERE {
?s rdf:type ub:GraduateStudent .
?s ub:name ?n .
OPTIONAL { ?s ub:takesCourse ?c }
}
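The effect of the reordering can be imitated with a deliberately simplified model. It assumes that every run of adjacent triple patterns is matched by one SQL query and that every OPTIONAL block needs its own; this is an illustration of the explanation above, not a description of Jena's actual fast path implementation, and `count_sql_queries` is a hypothetical helper.

```python
from itertools import groupby
from typing import List, Tuple

Element = Tuple[str, str]  # (kind, text) with kind "triple" or "optional"


def count_sql_queries(elements: List[Element]) -> int:
    """Count queries under the simplified model described above."""
    count = 0
    for kind, run in groupby(elements, key=lambda element: element[0]):
        run_length = len(list(run))
        # Adjacent triple patterns share one query; each OPTIONAL gets its own.
        count += 1 if kind == "triple" else run_length
    return count


original = [
    ("triple", "?s rdf:type ub:GraduateStudent"),
    ("optional", "?s ub:takesCourse ?c"),
    ("triple", "?s ub:name ?n"),
]
rewritten = [
    ("triple", "?s rdf:type ub:GraduateStudent"),
    ("triple", "?s ub:name ?n"),
    ("optional", "?s ub:takesCourse ?c"),
]

print(count_sql_queries(original), count_sql_queries(rewritten))
```

Under this model the original order yields three queries and the rewritten order two, because the two triple patterns become adjacent and can be matched together.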
25. Conclusion and Future Work
SQGM: a query model for SPARQL
Supports all phases of query processing
Easy to extend
Transformation rules and heuristics for SQGMs
Implementation illustrated the potential of SQGMs
Outlook
Develop further heuristics to rewrite SQGMs
Consider other phases of query processing
Integrate index selection into the query optimization
26. The SPARQL Query Graph Model
for Query Optimization
Olaf Hartig and Ralf Heese
Thank you!