This document discusses coupling Semantic MediaWiki (SMW) with MASTRO to add more expressivity to SMW while maintaining polynomial reasoning times. SMW is augmented with DL-Lite constructs like disjoint classes and properties. Queries in SMW are translated to union of conjunctive queries (UCQs) for evaluation in MASTRO. A control panel manages the semantic data exchange between SMW and MASTRO's knowledge base. Future work includes using a more scalable coupling and SparSQL to handle additional SMW query features.
Coupling SMW with MASTRO Thesis
1. Coupling Semantic MediaWiki with MASTRO
Student: Albin Ahmeti
Advisor: Prof. Maurizio Lenzerini
Master thesis 26/01/2011
2. Outline
• MediaWiki
• Semantic MediaWiki (SMW)
• Coupling SMW with MASTRO
• SMWQuonto Control Panel
• Conclusion
• Future work
Coupling SemanticMediaWiki with MASTRO 2
3. MediaWiki
• MediaWiki (abbr. MW) is free server-based software, licensed under the
GNU General Public License, which runs Wikipedia
• It is widely used in companies as a content-management
system; it provides fast page processing and short request times
• PHP and MySQL in the backend
• It manages content mirroring and concurrent, conflicting page edits
between users
• Wikipedia articles consist of wiki text: plain text combined with a
lightweight markup language
4. MediaWiki (cont.)
• Creating a page takes minimal effort: type a link “[[Title]]”, follow it,
and enter the content, e.g. to create a wiki page titled “Rome”:
[[Rome]]
5. MediaWiki (cont.)
• It distinguishes between pages using namespaces
• Main:
• User:
• Help:
• Talk:
• Usage of templates:
• Transclusion -> {{Template name}}
• Substitution -> {{subst:Template name}}
6. Semantic MediaWiki
• Semantic MediaWiki (abbr. SMW) is the most popular platform to date
for adding semantic data to wiki articles
• It introduces some basic syntax: machine-processable metadata
for each of the page constructs (link types)
• SMW is built on top of MediaWiki (MW) and has been developed using
the same technology as MW, i.e. tight coupling
7. Semantic MediaWiki
• Annotation of pages
Is page-centric:
Categories
Properties
Attributes
8. Semantic MediaWiki
• Categories
Are used to classify pages for better retrieval and organization
Correspond to Classes in OWL DL
[[Category:City]] [[Category:Holy cities]]
• Sub-categories
Same notation, but defined in the namespace Category:
Class inclusion in OWL DL -> Intensional knowledge
Holy cities ⊑ City
9. Semantic MediaWiki
• Properties
Link types (relations) between wiki pages, i.e. hyperlinks
Correspond to OWL Object Property
Rome is capital of [[Italy]]
Rome is capital of [[capital of::Italy]]
• Subproperties
are defined in the namespace Property:
[[subproperty of::Property:Located In]]
capital of ⊑ Located In
10. Semantic MediaWiki
• Attributes
Relations between a wiki page and a data value
Rome has population 2,700,000.
Rome has population [[population::2,700,000]]
A property becomes an attribute when it is given a datatype
on its page in the Property namespace:
[[Has type::number]]
Attributes in wiki pages correspond to OWL Datatype Property, Annotation
Property or Object Property
11. Semantic MediaWiki
• Querying in SMW (inline queries)
{{#ask: [[Category:City]] [[Located in::Italy]]
|?Population
|?Area
|sort=Population, Area
|order=descending, ascending
}}
12. Semantic MediaWiki
{{#ask: [[Category:Student]][[degree::!Sapienza]] [[age::>24]][[age::<30]]
|?name
|?surname
|?age
}}
CWA (closed-world assumption) -> easy to evaluate
OWA (open-world assumption) -> not easy to evaluate, since the ontology does not have complete information
13. Semantic MediaWiki
• Architecture of SMW
14. Semantic MediaWiki
SMW has
• rather limited expressivity
• no disjoint classes
• no disjoint properties
• no functionality assertions
• no inverses
• a query language that does not allow joins and explicit variables
What if we add more expressivity and keep the reasoning tasks (query answering) polynomial?
15. Our approach
• We use QuOnto as a reasoner
• Use the expressivity of DL-Lite; reasoning tasks are polynomial in the size
of the ontology:
query answering
subsumption
ontology satisfiability
instance checking
• Use unions of conjunctive queries (UCQs) for posing queries
can express joins
allow variables
correspond to unions of SELECT-PROJECT-JOIN SQL queries
LOGSPACE in data complexity
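To make the correspondence between UCQs and SQL concrete, the following is a minimal sketch that evaluates a two-disjunct UCQ as a UNION of select-project-join queries using Python's built-in sqlite3 module. The table layout, column names and sample rows are hypothetical and chosen only for illustration; they are not the schema that QuOnto or MASTRO actually uses.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical ABox tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Student(x TEXT);
        CREATE TABLE Sapienza(x TEXT);
        CREATE TABLE enrolledIn(x TEXT, y TEXT);
        INSERT INTO Student VALUES ('anna'), ('bruno'), ('carla');
        INSERT INTO Sapienza VALUES ('Engineering');
        INSERT INTO enrolledIn VALUES
            ('anna', 'Engineering'), ('bruno', 'DIS'), ('carla', 'Bocconi');
        """
    )
    return conn


def answer_ucq(conn):
    """Evaluate the UCQ
         Q(x) :- Student(x), enrolledIn(x, y), Sapienza(y)
         Q(x) :- Student(x), enrolledIn(x, 'DIS')
    as a UNION of two select-project-join queries."""
    sql = """
        SELECT s.x FROM Student s
          JOIN enrolledIn e ON e.x = s.x
          JOIN Sapienza u ON u.x = e.y
        UNION
        SELECT s.x FROM Student s
          JOIN enrolledIn e ON e.x = s.x
         WHERE e.y = 'DIS'
    """
    return sorted(row[0] for row in conn.execute(sql))


if __name__ == "__main__":
    print(answer_ucq(build_demo_db()))
```

Each conjunctive query becomes one SELECT with joins on shared variables and a WHERE clause for constants; the union of the disjuncts becomes SQL UNION.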
16. Our approach
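• DL-Lite_core
Concepts: B → A | ∃R      C → B | ¬B
Roles: R → P | P⁻      E → R | ¬R
A denotes an atomic concept, P an atomic role and P⁻ its inverse.
B denotes a basic concept, R a basic role.
Inclusion assertions: B ⊑ C
Membership assertions: A(a), P(a, b)
• DL-Lite_F adds functionality assertions:
(funct R)
• DL-Lite_R adds role inclusions:
R ⊑ E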
21. Our approach
DL-Lite vs OWL DL interpretation of SMW annotations
22. Our approach
• Query translation
• Query translation from #ask to function-free positive LP (logic
programming) rules, with minimal Herbrand model
semantics, based on Jie Bao et al. (captures
SMW 1.4.2)
n-ary predicates removed
Adjustments (tunings) to deal with the latest SMW 1.5.x
(mainly code-based)
23. Our approach
• Query translation
Translation from SMW-QL to a logic program, as defined in the paper
“Knowledge Representation and Query in Semantic MediaWiki: A Formal Study” by Jie Bao et al.
24. Our approach
• Query translation
1) Map query annotations to logic counterparts (based on schema)
2) Replace body atoms
3) Perform depth-first search on rules
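Step 1 can be illustrated with a small sketch that maps the two simplest #ask conditions to logic atoms: [[Category:C]] becomes a unary atom C(x) and [[p::v]] becomes a binary atom p(x, 'v'). The function names and the atom representation are hypothetical, and the sketch deliberately ignores subqueries (<q>…</q>), disjunction (||) and comparators (!, <, >, ~); it is not the translation code of the thesis.

```python
import re


def predicate_name(label):
    """Turn a wiki label such as 'Located in' into an identifier 'Located_in'."""
    return "_".join(label.strip().split())


def ask_to_atoms(conditions, var="x"):
    """Map simple #ask conditions to (predicate, arguments) atoms.

    Only [[Category:C]] and [[property::value]] are handled (an assumption
    made to keep the sketch short); anything else raises ValueError.
    """
    atoms = []
    for inner in re.findall(r"\[\[(.*?)\]\]", conditions):
        if inner.startswith("Category:"):
            atoms.append((predicate_name(inner[len("Category:"):]), (var,)))
        elif "::" in inner:
            prop, value = inner.split("::", 1)
            atoms.append((predicate_name(prop), (var, f"'{value.strip()}'")))
        else:
            raise ValueError(f"unsupported condition: {inner}")
    return atoms


if __name__ == "__main__":
    print(ask_to_atoms("[[Category:City]] [[Located in::Italy]]"))
```

For the conditions [[Category:City]] [[Located in::Italy]] this yields the body of the conjunctive query Q(x) :- City(x), Located_in(x, 'Italy').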
26. Our approach
• Query translation
Replace atoms in the query bodies with their head definitions (occurring once)
Q(x):- Student(x), enrolledIn(x, y), N(y)
N(x):- Sapienza(x)
N(x):- x:=DIS … (2)
27. Our approach
• Query translation
• Apply depth-first search (DFS) on rules (2):
Q(x):- Student(x), enrolledIn(x, y), N(y)
N(x):- Sapienza(x)
N(x):- x:=DIS
28. Our approach
• Query translation
After applying DFS, we obtain a union of conjunctive queries (UCQ):
Q(x):-Student(x), enrolledIn(x, y), Sapienza(y)
Q(x):-Student(x), enrolledIn(x, ‘DIS’)
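The unfolding step can be sketched as a depth-first expansion of the auxiliary predicate N. The representation below (atoms as (predicate, arguments) tuples, "=" for the assignment x:=DIS, constants written with quotes) is an assumption made for illustration, and the sketch assumes that auxiliary rules mention only their head variables, so no fresh-variable renaming is needed.

```python
# Rules (2) in a hypothetical tuple encoding; "=" stands for the assignment x:=DIS.
RULES = {
    "Q": [[("Student", ("x",)), ("enrolledIn", ("x", "y")), ("N", ("y",))]],
    "N": [[("Sapienza", ("x",))], [("=", ("x", "'DIS'"))]],
}
HEAD_VARS = {"Q": ("x",), "N": ("x",)}


def substitute(atom, mapping):
    """Apply a variable mapping to the arguments of one atom."""
    pred, args = atom
    return (pred, tuple(mapping.get(a, a) for a in args))


def apply_equalities(body):
    """Remove atoms of the form var = constant by substituting the constant."""
    mapping = {args[0]: args[1] for pred, args in body if pred == "="}
    return [substitute(atom, mapping) for atom in body if atom[0] != "="]


def unfold(body, rules, head_vars):
    """Depth-first expansion: yield one conjunctive-query body per leaf."""
    for i, (pred, args) in enumerate(body):
        if pred in rules:
            for rule_body in rules[pred]:
                # Bind the head variables of the rule to the atom's arguments.
                mapping = dict(zip(head_vars[pred], args))
                expanded = [substitute(atom, mapping) for atom in rule_body]
                yield from unfold(body[:i] + expanded + body[i + 1:], rules, head_vars)
            return
    yield apply_equalities(body)


def show(head, body):
    """Render a body as a rule string such as 'Q(x) :- A(x), B(x, y)'."""
    atoms = ", ".join(f"{pred}({', '.join(args)})" for pred, args in body)
    return f"{head} :- {atoms}"


if __name__ == "__main__":
    for cq in unfold(RULES["Q"][0], RULES, HEAD_VARS):
        print(show("Q(x)", cq))
```

Each root-to-leaf path of the search replaces N(y) by one of its definitions, so the two rules for N produce the two conjunctive queries of the union.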
29. Our approach
• Query translation
{{#ask: [[Category:Student]]
[[enrolled in:: <q>[[Category:Sapienza]]</q> || DIS]] }}
• It returns all students enrolled in a member of the class Sapienza or of any of its
subclasses (research centers, affiliations, etc.), or enrolled in DIS
• Subclasses are handled by PerfectRef, the query-rewriting algorithm implemented
in QuOnto, so no special algorithm is needed!
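The effect of rewriting over the class hierarchy can be illustrated with a heavily simplified sketch. It covers only one kind of rewriting step, replacing a concept atom Sup(t) by Sub(t) whenever Sub ⊑ Sup is in the TBox, and is not the full PerfectRef algorithm (which also handles roles, existential restrictions and atom unification). The inclusion axioms and class names below are hypothetical.

```python
# Hypothetical TBox: pairs (Sub, Sup) meaning Sub ⊑ Sup.
INCLUSIONS = [
    ("ResearchCenter", "Sapienza"),
    ("Affiliation", "Sapienza"),
    ("Lab", "ResearchCenter"),
]

# Q(x) :- Student(x), enrolledIn(x, y), Sapienza(y)
QUERY = (
    ("Student", ("x",)),
    ("enrolledIn", ("x", "y")),
    ("Sapienza", ("y",)),
)


def rewrite(query, inclusions):
    """Return every query reachable by replacing Sup(t) with Sub(t), Sub ⊑ Sup."""
    seen = {query}
    frontier = [query]
    while frontier:
        current = frontier.pop()
        for i, (pred, args) in enumerate(current):
            for sub, sup in inclusions:
                if pred == sup and len(args) == 1:
                    candidate = current[:i] + ((sub, args),) + current[i + 1:]
                    if candidate not in seen:
                        seen.add(candidate)
                        frontier.append(candidate)
    return seen


if __name__ == "__main__":
    for cq in sorted(rewrite(QUERY, INCLUSIONS)):
        print(cq)
```

The rewritten queries together form a UCQ that can be evaluated directly over the stored instances, which is why subclass members are returned without any reasoning at query-evaluation time.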
30. SMWQuonto Control Panel
• DBMS: manages the ABox
• XML file: manages the TBox
31. SMWQuonto Control Panel
Semantic data pushed to QuOnto
32. SMWQuonto Control Panel
AskQL query posed to QuOnto
UCQs obtained
Results
33. Conclusion
• We created a system that offers a higher level of expressivity
than SMW while keeping the complexity polynomial
• It can deal with hundreds of thousands of pages (instances), e.g. Wikipedia
• We have a more expressive query language: UCQs
– The constraining constructs !, <, >, ~ can be handled as well by imposing EQL constraints on UCQs (SparSQL)
– Any extension of the expressivity of the askQL query language makes the complexity NP-hard
• It is meant as a “proposal” for an alternative to the triple store used in SMW
• It can be considered an SMW extension
34. Future work
• Use the SparSQL query language to deal with special constructs (!, <, >,
∼), thereby fully capturing askQL queries
• Provide a more scalable coupling between the two architectures, e.g.
SOAP and Web Services
• Retrieve semantic data from more than one page, using RDF/XML output
• Retrieve the category hierarchy at once, thereby obtaining the taxonomy of the ontology