Abstract: We present an enhanced hybrid approach to OWL query answering that combines an RDF triple store with an OWL reasoner in order to provide scalable pay-as-you-go performance. The enhancements presented here include an extension to deal with arbitrary OWL ontologies and optimisations that significantly improve scalability. We have implemented these techniques in a prototype system, a preliminary evaluation of which has produced very encouraging results.
PAGOdA poster
Pay-as-you-go OWL Query Answering Using a Triple Store
Yujiao Zhou, Yavor Nenov, Bernardo Cuenca Grau and Ian Horrocks
Problem Setting
‣ Ontology Σ: a set of rules of the form φ(x) → ⋁ᵢ ∃yᵢ ψᵢ(x, yᵢ)
‣ Data D: a set of ground atoms of the form P(a)
‣ Conjunctive queries: FO formulas of the form q(x) ← ∃y ψ(x, y)
where φ, ψ and each ψᵢ are conjunctions of atoms.
Pay-as-you-go Approach
Intuition
‣ to delegate the bulk of the computational workload to a highly scalable datalog reasoner
‣ to minimise the use of a fully-fledged reasoner
Over-approximation to datalog
‣ Disjunctive knowledge
‣ Existential knowledge
Evaluation
‣ Evaluated on LUBM(100, 1000), UOBM(1, 60, 500), FLY, DBPedia+travel and NPD FactPages.
[Diagram: system architecture. Components: Ontology, Data, Query, Datalog Engine (triple store), ELHO, Full Reasoner (OWL 2 reasoner), Output. Stages and annotations: lower bound L = LRL ∪ LEL ∪ … and upper bound U; L = U; Fragment F (tracking by datalog encoding); Summarisation, using σ(cert(q, F)) ⊆ cert(q, σ(F)) to rule out non-answers; Dependency Analysis (incomplete endomorphisms), arranging calls to the reasoner heuristically according to the dependencies.]
Acknowledgements
This work was supported by the Royal Society, the EPSRC projects
Score!, ExODA, and MaSI3, and the FP7 project OPTIQUE.
Upper bound
‣ upper bound U
answer of q w.r.t. D and the set of datalog rules U(Σ) obtained by over-approximating Σ.
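The poster does not spell out how disjunctive and existential knowledge are over-approximated. The following is a minimal sketch under one assumed strategy: every disjunct is treated as if it held, and each existential variable is replaced by a fresh constant. Both steps strengthen Σ, so the answers over the result can only grow. The function name `over_approximate`, the `c0, c1, …` constant names and the tuple encoding of atoms are illustrative and not taken from the system.

```python
from itertools import count

def over_approximate(body, disjuncts, fresh=None):
    """Turn  body -> OR_i EXISTS y_i . psi_i  into plain datalog rules.

    body      : tuple of atoms, each atom a (predicate, args) pair
    disjuncts : list of (existential_vars, atoms) pairs, one per disjunct
    Returns a list of (body, head_atom) datalog rules.
    """
    if fresh is None:
        fresh = count()
    rules = []
    for exist_vars, atoms in disjuncts:
        # Assumption: each existential variable becomes a fresh constant.
        subst = {v: f"c{next(fresh)}" for v in exist_vars}
        for pred, args in atoms:
            head = (pred, tuple(subst.get(a, a) for a in args))
            # Assumption: disjunction is read as conjunction, so every
            # atom of every disjunct gets its own datalog rule.
            rules.append((tuple(body), head))
    return rules

# A(x) -> (EXISTS y . R(x, y) AND B(y))  OR  C(x)
body = (("A", ("x",)),)
disjuncts = [
    (["y"], [("R", ("x", "y")), ("B", ("y",))]),
    ([], [("C", ("x",))]),
]
datalog_rules = over_approximate(body, disjuncts)
```

Here the single input rule yields three datalog rules with the same body, one per head atom.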
Lower bounds
‣ basic lower bound LRL
answer of q w.r.t. the datalog fragment of Σ and D;
‣ EL lower bound LEL
answer of q w.r.t. the ELHO fragment of Σ and D.
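How the bounds combine can be sketched as follows. This is an illustrative outline only: `answer_query` and `is_certain_answer` are hypothetical names, and the callback stands in for a call to the fully-fledged reasoner. Tuples in the lower bound are accepted, tuples outside the upper bound are rejected, and only the gap between the two is passed to the expensive check.

```python
def answer_query(lower, upper, is_certain_answer):
    """Combine a sound lower bound and a complete upper bound.

    lower, upper      : iterables of candidate answers, lower within upper
    is_certain_answer : expensive check, called only for gap answers
    """
    lower, upper = set(lower), set(upper)
    if not lower <= upper:
        raise ValueError("lower bound must be contained in upper bound")
    gap = upper - lower  # empty when the bounds coincide
    confirmed = {t for t in gap if is_certain_answer(t)}
    return lower | confirmed

calls = []

def fake_reasoner(t):
    # Stand-in for the full reasoner; records how often it is used.
    calls.append(t)
    return t == "b"

tight = answer_query({"a"}, {"a"}, fake_reasoner)
calls_when_tight = len(calls)
loose = answer_query({"a"}, {"a", "b", "c"}, fake_reasoner)
```

When the bounds coincide the stand-in reasoner is never called, which is the pay-as-you-go behaviour described in the intuition.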
Tracking encoding in datalog
Intuition: to compute all the rules and facts that participate in a proof
of q(a) in Σ∪D.
This goal can be achieved using a datalog encoding.
‣ Example:
‣ If r = B1(x1) ∧ … ∧ Bm(xm) → H(x) is a rule in U(Σ), then
H^t(x) ∧ B1(x1) ∧ … ∧ Bm(xm) → S(c_r) ∧ B1^t(x1) ∧ … ∧ Bm^t(xm) is added to
the tracking rules.
‣ Involved rules: {r | S(c_r) is derived}
‣ Involved facts: {P(a) ∈ D | P^t(a) is derived}
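The rule transformation above can be sketched in a few lines. The sketch only builds the tracking rule for a given rule and does not evaluate it. The `Atom` and `Rule` types, the `^t` suffix for tracking predicates and the `c_` prefix for rule constants are illustrative encodings, not the system's own.

```python
from typing import NamedTuple, Tuple

class Atom(NamedTuple):
    pred: str
    args: Tuple[str, ...]

class Rule(NamedTuple):
    name: str
    body: Tuple[Atom, ...]
    head: Atom

def tracked(atom):
    """Tracking copy of an atom: same arguments, predicate marked with ^t."""
    return Atom(atom.pred + "^t", atom.args)

def tracking_rule(rule):
    """Build the tracking rule for `rule` as a (body, head) pair.

    Body: the tracked head plus the original body atoms.
    Head: S(c_r) marking the rule as involved, plus the tracked body atoms.
    """
    body = (tracked(rule.head),) + tuple(rule.body)
    head = (Atom("S", ("c_" + rule.name,)),) + tuple(tracked(b) for b in rule.body)
    return body, head

def show(atoms):
    return ", ".join(f"{a.pred}({', '.join(a.args)})" for a in atoms)

# r1: A(x), R(x, y) -> B(y)
r1 = Rule("r1", (Atom("A", ("x",)), Atom("R", ("x", "y"))), Atom("B", ("y",)))
t_body, t_head = tracking_rule(r1)
rendered = f"{show(t_body)} -> {show(t_head)}"
```

Reading the result backwards: once B^t(y) is needed for a proof and the body of r1 matches, r1 is marked as involved and its body atoms become tracked in turn.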
Summarisation & dependency between answers
‣ Let σ be the summary function, σ(cert(q, F)) ⊆ cert(q, σ(F))
‣ If there is an endomorphism from a to b in F, then
a ∈ cert(q, F) implies b ∈ cert(q, F)
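The summarisation property gives a cheap filter: by contraposition, if σ(a) is not an answer over the summary σ(F), then a is not an answer over F. The sketch below assumes a particular summary function, merging individuals that have exactly the same set of classes; the poster does not state which σ is used, and the names `summarise` and `rule_out_non_answers` are hypothetical. It is restricted to queries with a single answer variable.

```python
def summarise(unary, binary):
    """Merge individuals with identical class sets (assumed summary function).

    unary  : set of (Class, individual) facts
    binary : set of (Role, subject, object) facts
    Returns (sigma, summarised unary facts, summarised binary facts).
    """
    types = {}
    for cls, ind in unary:
        types.setdefault(ind, set()).add(cls)
    for _, s, o in binary:
        types.setdefault(s, set())
        types.setdefault(o, set())
    # One representative per distinct class set.
    sigma = {ind: "u_" + "_".join(sorted(t)) for ind, t in types.items()}
    s_unary = {(cls, sigma[ind]) for cls, ind in unary}
    s_binary = {(role, sigma[s], sigma[o]) for role, s, o in binary}
    return sigma, s_unary, s_binary

def rule_out_non_answers(candidates, sigma, summary_answers):
    """Keep only candidates whose image under sigma answers q over the summary."""
    return {a for a in candidates if sigma[a] in summary_answers}

unary = {("A", "x1"), ("A", "x2"), ("C", "c1"), ("C", "c2")}
binary = {("R", "x1", "c1"), ("R", "x2", "c2")}
sigma, s_unary, s_binary = summarise(unary, binary)

# Suppose the reasoner found that only u_A answers q over the summary.
remaining = rule_out_non_answers({"x1", "c1"}, sigma, {"u_A"})
```

In this toy input, four individuals collapse to two representatives, so the reasoner works on a smaller dataset before any remaining candidate is checked against F itself.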
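[Figure: example data graph and its summary. Nodes x1 and x2 are labelled {A, …} and have R-edges to nodes labelled {C}; in the second graph a single node c is labelled {C}. Further labels: {…, A ⊓ B, …} and {…, A ⊔ B, …}.]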
Benchmark  DL     Ontology  Dataset     Queries
LUBM(n)    SHI    93        ~100,000n   14 (std) + 10
UOBM(n)    SHIN   314       ~200,000n   15
FLY        SRI    144,407   6,308       5
DBPedia    SHOIN  1,757     12,119,662  441 (atomic)
NPD        SHIF   819       3,817,079   329 (atomic)
Average time without OWL 2 reasoning
          LUBM(1000)  UOBM(100)  FLY  DBPedia  NPD
Queries   22/24       12/15      5/5  439/441  294/329
Time (s)  18.4        0.7        0.2  0.3      0.1
Average time
          LUBM(100)   UOBM(1)    FLY  DBPedia  NPD
Time (s)  29.6        1.8        0.2  3        3