
# Horn Concerto – AKSW Colloquium

Preliminary results of an ongoing work titled "Efficient Rule Mining on RDF Data". University of Leipzig, AKSW Colloquium, April 3rd, 2017.



1. HORN CONCERTO – Efficient Rule Mining on RDF Data. Tommaso Soru, AKSW Colloquium, 03.04.2017
2. RDF Rule Mining
   - RDFS/OWL rules are given as schema.
   - Schema-free datasets might have an implicit schema.
   - Why "mining"? Because rules are not visible in the data.
3. Motivation/1 – Link Prediction Problem
   Given a union of graphs G = G1 ∪ … ∪ GN, find new edges between vertices s and t in G.
4. Motivation/2 – Markov Logic Networks
   Given a collection of first-order statements (evidence) and a set of weighted first-order rules, build an undirected weighted graph whose nodes are statements and whose edges indicate dependency.
5. Motivation/3 – MANDOLIN's pipeline (based on Markov Logic Networks)
   Input Dataset(s) → RDFS/OWL Interpretation → Rule Mining → Weight Learning → Grounding → Inference → Predicted Triples
6. Rule Mining & Weight Learning
   Given a directed labelled graph, find Horn-clause rules and the weights associated with them:

   | w  | rule                        |
   |----|-----------------------------|
   | w1 | p(x, y) ← q(x, y)           |
   | w2 | p(x, y) ← q(y, x)           |
   | w3 | p(x, y) ← q(x, z) ∧ r(y, z) |
   | w4 | p(x, y) ← q(x, z) ∧ r(z, y) |
   | w5 | p(x, y) ← q(z, x) ∧ r(y, z) |
   | w6 | p(x, y) ← q(z, x) ∧ r(z, y) |
7. Horn Clauses
   A Horn clause is a clause (a disjunction of literals) with at most one positive, i.e. unnegated, literal:

   p ∨ ¬q1 ∨ ¬q2 ∨ ... ∨ ¬qn  ≡  p ← q1 ∧ q2 ∧ ... ∧ qn

   where p (e.g. p(x, y)) is the head and q1 ∧ q2 ∧ ... ∧ qn is the body.
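The clausal and implication forms of a Horn clause are logically equivalent. A quick brute-force check over all truth assignments (plain Python, not part of the original slides) confirms this:

```python
from itertools import product

def equivalent(n=2):
    """Check that p ∨ ¬q1 ∨ ... ∨ ¬qn agrees with (q1 ∧ ... ∧ qn) → p
    on every truth assignment of p, q1, ..., qn."""
    for values in product([False, True], repeat=n + 1):
        p, *qs = values
        clause = p or any(not q for q in qs)   # p ∨ ¬q1 ∨ ... ∨ ¬qn
        implication = (not all(qs)) or p       # q1 ∧ ... ∧ qn → p
        if clause != implication:
            return False
    return True

print(equivalent(2))  # True: the two forms agree on all assignments
```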
8. Confidence score (weight)
   The confidence score of a rule p ← q1 ∧ q2 ∧ ... ∧ qn is defined as the ratio of the occurrences of head and body together over the occurrences of the body.
9. The HORN CONCERTO approach
   Bayes' theorem: P(A | B) = P(A ∩ B) / P(B)

   For a Horn clause p ← q1 ∧ q2 ∧ ... ∧ qn, writing q⃗ for the body, the event-based confidence score is

   P(p | q⃗) = P(p ∩ q⃗) / P(q⃗) ≈ |{p ∧ q⃗}| / |{q⃗}|
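As a concrete illustration (toy data, not from the slides), the event-based score for a length-one body, i.e. the rule p(x, y) ← q(x, y), can be computed by counting over a triple set:

```python
# Toy triple set: (subject, predicate, object). The predicates
# "worksAt" and "employedBy" are made up for this example.
triples = {
    ("alice", "worksAt", "acme"),    ("alice", "employedBy", "acme"),
    ("bob",   "worksAt", "initech"), ("bob",   "employedBy", "initech"),
    ("carol", "employedBy", "globex"),
}

def confidence(p, q, triples):
    """P(p | q) ≈ |{p ∧ q}| / |{q}| for the rule p(x, y) <- q(x, y)."""
    q_pairs = {(s, o) for s, pred, o in triples if pred == q}
    p_pairs = {(s, o) for s, pred, o in triples if pred == p}
    return len(p_pairs & q_pairs) / len(q_pairs)

# Two of the three employedBy pairs also hold for worksAt.
print(confidence("worksAt", "employedBy", triples))  # 2/3 ≈ 0.667
```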
10. Rules with |q⃗| = 1, e.g. p(x, y) ← q(x, y)

    Count head-and-body co-occurrences |{p ∧ q⃗}|:

    ```sparql
    SELECT ?p (COUNT(*) AS ?c)
    WHERE {
      ?x ?p ?y .
      ?x <target_q> ?y .
      FILTER(?p != <target_q>)
    }
    GROUP BY ?p
    ```

    Count body occurrences |{q⃗}|:

    ```sparql
    SELECT ?q (COUNT(*) AS ?c)
    WHERE { [] ?q [] }
    GROUP BY ?q
    ```
11. Rules with |q⃗| = 2, e.g. p(x, y) ← q(x, z) ∧ r(z, y)

    Count head-and-body co-occurrences |{p ∧ q⃗}|:

    ```sparql
    SELECT ?q ?r (COUNT(*) AS ?c)
    WHERE {
      ?x ?q ?z .
      ?z ?r ?y .
      ?x <target_p> ?y
    }
    GROUP BY ?q ?r
    ```

    Count body occurrences |{q⃗}|:

    ```sparql
    SELECT (COUNT(*) AS ?c)
    WHERE {
      ?x <target_q> ?z .
      ?z <target_r> ?y
    }
    ```
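The length-two counting query above can be mimicked without a SPARQL endpoint. The sketch below is an illustration, not the authors' implementation: it counts, over an in-memory toy triple list with a hypothetical target predicate, how often each body q(x, z) ∧ r(z, y) co-occurs with the head p(x, y):

```python
from collections import Counter, defaultdict

# Toy triples; predicates are invented for illustration.
triples = [
    ("a", "bornIn", "leipzig"), ("leipzig", "locatedIn", "germany"),
    ("a", "nationality", "germany"),
    ("b", "bornIn", "paris"),   ("paris", "locatedIn", "france"),
    ("b", "nationality", "france"),
]

def length2_body_counts(triples, target_p):
    """Emulates: SELECT ?q ?r (COUNT(*) AS ?c)
    WHERE { ?x ?q ?z . ?z ?r ?y . ?x <target_p> ?y } GROUP BY ?q ?r"""
    outgoing = defaultdict(list)   # subject -> [(predicate, object), ...]
    p_pairs = set()                # (x, y) pairs connected by target_p
    for s, pred, o in triples:
        outgoing[s].append((pred, o))
        if pred == target_p:
            p_pairs.add((s, o))
    counts = Counter()
    for x, y in p_pairs:           # join ?x ?q ?z with ?z ?r ?y
        for q, z in outgoing[x]:
            for r, y2 in outgoing.get(z, []):
                if y2 == y:
                    counts[(q, r)] += 1
    return counts

# Both persons support the body bornIn(x, z) ∧ locatedIn(z, y).
print(length2_body_counts(triples, "nationality"))
# Counter({('bornIn', 'locatedIn'): 2})
```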
12. Optimizations
    - Select only the top N properties.
    - Order by descending score and prune when the score falls below a threshold T.
    - Cache scores in memory, as there might exist p1, p2 such that: pi(x, y) ← q(?, ?), r(?, ?).
    - Parallelize the algorithm by rule type.
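A minimal sketch of the top-N selection and threshold pruning steps (the parameter values and rule scores are made up for illustration; this is not the authors' code):

```python
def prune(scored_rules, top_n=3, threshold=0.5):
    """Keep at most top_n rules, ordered by descending score,
    stopping as soon as a score drops below the threshold."""
    kept = []
    for rule, score in sorted(scored_rules, key=lambda rs: rs[1], reverse=True):
        if score < threshold or len(kept) >= top_n:
            break
        kept.append((rule, score))
    return kept

rules = [("r1", 0.9), ("r2", 0.4), ("r3", 0.7), ("r4", 0.6)]
print(prune(rules))  # [('r1', 0.9), ('r3', 0.7), ('r4', 0.6)]
```

Because the candidates are sorted by descending score first, the loop can stop at the first score below T rather than scanning the whole list.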
13. Evaluation Setup
    - Hardware: 8 CPUs, 32 GB RAM, Ubuntu 16.04
    - Scalability study:
      - DBpedia Person (7 million triples)
      - DBpedia 2016-04 (397 million triples)
    - Future work:
      - Rule effectiveness for link prediction: FB15k (592 thousand triples), WN18 (151 thousand triples)
      - Rule quality (human judgment?)
14. Preliminary results – DBpedia Person

    | Approach                      | Runtime   | # Rules | Used RAM (GB)            |
    |-------------------------------|-----------|---------|--------------------------|
    | AMIE+                         | > 10 days | 6,337   | 4                        |
    | Ontological PF                | > 3 hours | > 1,000 | 4                        |
    | HORN CONCERTO (single-thread) | 1.7 hours | 3,125   | client: 0.2, server: 1.0 |
15. Preliminary results – DBpedia 2016-04

    | Approach                      | Runtime  | # Rules | Used RAM (GB)            |
    |-------------------------------|----------|---------|--------------------------|
    | HORN CONCERTO (single-thread) | 11 hours | 887     | client: 0.2, server: N/A |
16. Discussion
    - AMIE+ – Cons: indexes the graph in memory.
    - Ontological PF – Pros: very fast. Cons: relies on schema data (type domains and ranges).
    - Horn Concerto – Pros: works with a SPARQL endpoint; fast even single-threaded; may be able to outperform Ontological PF on RDF datasets with an available schema.
17. Thank you.