In this paper, we propose a domain ontology construction tool based on OWL. The advantage of our tool is its focus on the quality refinement phase of ontology construction. Through interactive support for refining the initial ontology, an OWL-Lite level ontology, which consists of taxonomic relationships (defined as classes) and non-taxonomic relationships (defined as properties), can be constructed effectively. The tool also provides semi-automatic generation of the initial ontology from domain-specific documents and general ontologies.
DODDLE-OWL: A Domain Ontology Construction Tool with OWL
1. DODDLE-OWL: A Domain Ontology Construction Tool with OWL
Takeshi Morita 1), Naoki Fukuta 2), Noriaki Izumi 3), and Takahira Yamaguchi 1)
1) Keio University, Japan
2) Shizuoka University, Japan
3) National Institute of Advanced Industrial Science and Technology (AIST), Japan
2. Contents
• Motivation
• Related Works
• DODDLE-OWL Overview
• Implementation Architecture
• Case Studies
• Demonstration
• Conclusions
4. Motivation
• Background
– Domain ontologies play an important role in the Semantic Web
– They allow people and software agents to share a common understanding
– They help in finding appropriate information on the Web
• Issue: building domain ontologies is costly
– A domain contains many concepts
– Each concept has a highly specific meaning
– The knowledge of domain experts is needed
– The cost-benefit performance of domain ontologies is lower than that of general ontologies (e.g. WordNet, EDR)
5. Our Goal
• DODDLE-OWL: a Domain Ontology rapiD DeveLopment Environment – OWL extension
• Focusing on the quality refinement phase of ontology construction
[Figure: workflow. Inputs: domain-specific documents (English or Japanese) and general ontologies (WordNet, EDR (general), EDR (technical)). Semi-automatic construction produces an initial concept hierarchy (taxonomic relationships) and a set of concept pairs (non-taxonomic relationships). Quality refinement by the user (domain expert), followed by translation, yields a domain ontology (OWL format).]
7. Related Works
Source: Mehrnoush Shamsfard, Ahmad Abdollahzadeh Barforoush, The State of the Art in Ontology Learning: A Framework for Comparison
• DODDLE-OWL (Keio University)
– Element(s) learned: taxonomic and non-taxonomic conceptual relations
– Prior knowledge: WordNet, EDR
– Input: unstructured domain-specific texts (English and Japanese)
• ASIUM (Paris-Sud University)
– Element(s) learned: verb subcategorization frames + hierarchies
– Prior knowledge: linguistic knowledge
– Input: unstructured corpora (French)
• HASTI (Amir Kabir University of Technology)
– Element(s) learned: words, concepts, taxonomic and non-taxonomic conceptual relations, axioms
– Prior knowledge: almost empty (small kernel)
– Input: unstructured NL texts (Persian)
• SVETLAN’ (CNRS laboratory)
– Element(s) learned: noun classes
– Input: structured + unstructured, input to SEGAPSITH (French)
• SYNDIKATE (Albert-Ludwigs University)
– Element(s) learned: words, concepts, taxonomic and non-taxonomic conceptual relations
– Prior knowledge: generic and domain lexicons and ontologies
– Input: unstructured NL texts (German)
• TEXT-TO-ONTO (University of Karlsruhe)
– Element(s) learned: concepts, taxonomic and non-taxonomic conceptual relations
– Prior knowledge: lexical DB + domain lexicon
– Input: NL texts, Web docs, semi-structured (XML, DTD) and structured (German, HTML, XML, DTD)
• WEB→KB
– Element(s) learned: instances of classes and ...
– Prior knowledge: the ontology for ...
– Input: an ontology + training ...
• DODDLE-OWL supports the construction of both taxonomic and non-taxonomic relationships.
8. Related Works (cont.)
Source: Mehrnoush Shamsfard, Ahmad Abdollahzadeh Barforoush, The State of the Art in Ontology Learning: A Framework for Comparison
• Degree of automation of each learning system
– DODDLE-OWL (Keio University): user interaction, hand-made modification
– ASIUM (Paris-Sud University): cooperative
– HASTI (Amir Kabir University of Technology): both automatic and cooperative modes
– SVETLAN’ (CNRS laboratory): automatic
– SYNDIKATE (Albert-Ludwigs University): automatic
– TEXT-TO-ONTO (University of Karlsruhe): semi-automatic, interactive, balanced cooperative
– WEB→KB (Distributed Systems Technology Centre): automatic
• Many ontology learning systems focus on automatic ontology construction.
• However, it is difficult for the user to refine automatically generated ontologies.
• Therefore, our system focuses on high-level support for user interaction.
• The user can easily refine semi-automatically generated ontologies and construct high-quality domain ontologies.
10. System Overview
• DODDLE-OWL: a Domain Ontology rapiD DeveLopment Environment – OWL extension
• Focusing on the quality refinement phase of ontology construction
• Modules
– Input Module: takes domain-specific documents (English or Japanese) and general ontologies (WordNet, EDR (general), EDR (technical))
– Construction Module: semi-automatic construction of the initial concept hierarchy (taxonomic relationships) and the set of concept pairs (non-taxonomic relationships)
– Refinement Module: quality refinement with the user (domain expert)
– Visualization Module
– Translation Module: outputs a domain ontology (OWL format)
12. Procedure of Input Module
• Ontology Selection
– WordNet, EDR (general), EDR (technical)
• Document Selection
– Domain-specific documents (English or Japanese)
• Input Word Selection
– Morphological analysis and complex word extraction
– Each word is listed with its POS, TF, IDF, and TF-IDF (e.g. W1: Noun; W2: Complex Word)
– Select the words that are significant for the domain (input words) to form the input word set
• Disambiguation
– Identify the sense of each input word in order to map it to a concept in the general ontologies
– e.g. W1, W2, W3 are mapped to concepts such as EDR (general): Ci, WordNet: Cj, EDR (technical): Ck
– Input word set → input concept set
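The TF, IDF, and TF-IDF columns used for input word selection can be illustrated with a minimal sketch. The function name `tf_idf_table`, the toy documents, and the `log(N / df) + 1` form of IDF are assumptions made for illustration; the slides do not state which TF-IDF variant the tool uses.

```python
import math
from collections import Counter


def tf_idf_table(documents):
    """Return {word: (tf, idf, tf_idf)} for a list of tokenized documents.

    tf  : total frequency of the word over all documents
    idf : log(N / df) + 1, an assumed variant (not taken from the slides)
    """
    n_docs = len(documents)
    tf = Counter(word for doc in documents for word in doc)
    df = Counter(word for doc in documents for word in set(doc))
    table = {}
    for word, freq in tf.items():
        idf = math.log(n_docs / df[word]) + 1.0
        table[word] = (freq, idf, freq * idf)
    return table


# Hypothetical tokenized documents from a business domain.
docs = [
    ["order", "invoice", "order"],
    ["invoice", "payment"],
    ["order", "shipment"],
]
table = tf_idf_table(docs)
# Candidate input words, most significant first.
ranked = sorted(table, key=lambda w: table[w][2], reverse=True)
```

A user would then pick the input words from the top of `ranked`, which corresponds to the manual selection step on the slide.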
13. Construction Module
• Inputs: input concepts (from the Input Module), general ontologies (WordNet, EDR (general), EDR (technical)), and documents
• Input Concept Selection
• Hierarchy Construction (matching & trimming) → initial concept hierarchy
• Relationship Construction (statistical methods: WordSpace, Association Rule) → set of concept pairs
• The results are passed to the Refinement Module; the Visualization Module and Translation Module lead to a domain ontology (OWL format) with taxonomic and non-taxonomic relationships
15. Hierarchy Construction Module
• Get the paths related to the input concepts
– The paths are taken from the taxonomic relationships in the general ontologies
• Generate an initial model
– Merge the paths under a common root
– Node types: best matched node, salient internal node, unnecessary internal node
• Trimming
– Remove the unnecessary internal nodes from the initial model to obtain the initial concept hierarchy
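The merge-and-trim procedure can be sketched as follows. This is a simplified illustration, not the tool's implementation: it assumes a tree (each node has one parent), and it treats an internal node as "unnecessary" when it is not an input concept and has exactly one child. The function names and the example paths are hypothetical.

```python
def merge_paths(paths):
    """Merge root-to-concept paths into a child -> parent map (initial model)."""
    parent = {}
    for path in paths:
        for upper, lower in zip(path, path[1:]):
            parent[lower] = upper
    return parent


def trim(parent, input_concepts):
    """Remove internal nodes that are not input concepts and have one child."""
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)

    def unnecessary(node):
        # The root is never a key of `parent`, so it is always kept.
        return (node in parent
                and node not in input_concepts
                and len(children.get(node, [])) == 1)

    trimmed = {}
    for node in parent:
        if unnecessary(node):
            continue
        ancestor = parent[node]
        while unnecessary(ancestor):  # climb to the nearest surviving ancestor
            ancestor = parent[ancestor]
        trimmed[node] = ancestor
    return trimmed


# Hypothetical paths from a general ontology to three input concepts.
paths = [
    ["root", "entity", "object", "artifact", "document", "invoice"],
    ["root", "entity", "object", "artifact", "document", "receipt"],
    ["root", "entity", "act", "payment"],
]
initial_model = merge_paths(paths)
hierarchy = trim(initial_model, {"invoice", "receipt", "payment"})
```

In this example "document" and "entity" survive as salient internal nodes because they keep the branching structure among the input concepts, while "object", "artifact", and "act" are trimmed.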
16. Relationship Construction Module
• Extract concept pairs by different methods
– Inputs: input concepts (matching) and documents
– WordSpace: a method based on context similarity
– Association Rule: a popular method in the field of data mining
• Output: set of concept pairs
17. WordSpace Method
• WordSpace (Marti A. Hearst, Hinrich Schütze)
• Words and phrases in documents can be expressed by vector representations containing co-occurrence statistics
– … wi … wj … C1 … wk …
– … wi … wj … C2 … wk …
• Inner products among the vectors work as the similarity between the words and phrases (context similarity between concepts C1 and C2)
• High similarity → a significantly related concept pair for the domain
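The idea of context similarity can be sketched in a few lines. This is a simplification under stated assumptions: context vectors are plain word counts in a window around each occurrence, and the inner product is normalized (cosine) so that it can be compared with a threshold. The actual WordSpace method works on more elaborate representations; `context_vector`, `similarity`, and the sample sentence are hypothetical.

```python
import math
from collections import Counter


def context_vector(tokens, target, before=10, after=10):
    """Count the words that co-occur with `target` inside a context window."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            window = tokens[max(0, i - before):i] + tokens[i + 1:i + 1 + after]
            vec.update(window)
    return vec


def similarity(u, v):
    """Normalized inner product (cosine) of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


tokens = "the buyer sends the order the seller sends the invoice".split()
buyer = context_vector(tokens, "buyer", before=2, after=2)
seller = context_vector(tokens, "seller", before=2, after=2)
sim = similarity(buyer, seller)
# A pair is kept as a candidate when sim reaches a user-set threshold.
is_candidate = sim >= 0.6
```

Concept pairs whose similarity reaches the threshold would be proposed to the user as candidate non-taxonomic relationships.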
18. Association Rule
• Find associations between items in a set of transactions
• In our research
– Each item is an input concept appearing in the document
– One transaction is one sentence in the document
• Parameters (X and Y: input concepts)
– Support = (number of transactions containing both X and Y) / (number of all transactions)
– Confidence = (number of transactions containing both X and Y) / (number of transactions containing X)
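The two parameters can be computed directly from their definitions. The sketch below is a brute-force illustration over all ordered concept pairs, assuming each transaction is the set of input concepts found in one sentence; the function name and the sample transactions are hypothetical, and a real implementation would likely use a more efficient mining algorithm.

```python
from itertools import permutations


def association_rules(transactions, min_support, min_confidence):
    """Return (x, y, support, confidence) for every rule x -> y that passes
    both thresholds. Each transaction is a set of input concepts."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for x, y in permutations(items, 2):
        both = sum(1 for t in transactions if x in t and y in t)
        has_x = sum(1 for t in transactions if x in t)
        support = both / n
        confidence = both / has_x
        if support >= min_support and confidence >= min_confidence:
            rules.append((x, y, support, confidence))
    return rules


# Hypothetical sentences, already reduced to the input concepts they contain.
transactions = [
    {"order", "invoice"},
    {"order", "payment"},
    {"order", "invoice", "payment"},
    {"shipment"},
]
rules = association_rules(transactions, min_support=0.5, min_confidence=0.55)
```

Note that confidence is directional: the rule invoice → order has confidence 1.0 here, while order → invoice has confidence 2/3.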
19. [Figure: Construction Module and Refinement Module]
• Inputs: input concepts (from the Input Module), general ontologies (WordNet, EDR (general), EDR (technical)), and documents
• Construction Module
– Input Concept Selection
– Hierarchy Construction (matching & trimming) → initial concept hierarchy
– Relationship Construction (statistical methods: WordSpace, Association Rule) → set of concept pairs
• Refinement Module
– Hierarchy Refinement: Matched Result Analysis, Trimmed Result Analysis
– Relationship Refinement: the value of co-occurrence, concept specification template
• Output: a domain ontology (OWL format), via the Visualization Module and the Translation Module
20. Refinement Module
• Hierarchy Refinement Module
– Input: initial concept hierarchy
– Concept drift management
– Supports refining the initial concept hierarchy graphically
– Output: concept hierarchy
• Relationship Refinement Module
– Input: set of concept pairs
– Better performance by changing parameters through interaction with the user
– Output: concept specification
• Visualization Module: MR3 (RDF & RDFS visual editing)
• Translation Module
21. Hierarchy Refinement Module - Concept Drift
• Concept drift: the position of particular concepts changes depending on the domain
• An initial concept hierarchy constructed from general ontologies contains
– a reusable part
– a part that is not reusable because of concept drift
• The initial concept hierarchy must be adjusted to the specific domain to obtain a domain ontology
22. Hierarchy Refinement Module
• Strategy 1: Matched Result Analysis
– Divide the initial concept hierarchy into a reusable area and a non-reusable area according to the position of the best matched nodes
– Suggest that the user move the non-reusable area
– [Figure: subtrees of best matched nodes and internal nodes, labeled MOVE or STAY, then reconstructed by the user]
• Strategy 2: Trimmed Result Analysis
– Point out differences of abstraction level among sibling nodes according to the trimmed result
– [Figure: initial model with a trimming area (nodes A, B, C, D) → trimmed model in which A has children B, C, D with trimmed-node counts 0, 0, 3 → reconstructed by the user]
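One way to picture Trimmed Result Analysis is to record, for each parent-child edge in the trimmed model, how many internal nodes were removed between the two nodes, and to flag sibling groups whose counts differ. The sketch below is an assumption about how such a check could look, not the tool's algorithm; the function name and the data layout are hypothetical.

```python
def flag_uneven_siblings(trimmed_counts):
    """Return the sibling groups whose trimmed-node counts are not all equal.

    trimmed_counts maps each parent to {child: number of nodes trimmed
    between the parent and that child}.
    """
    flagged = {}
    for parent, kids in trimmed_counts.items():
        if len(set(kids.values())) > 1:
            flagged[parent] = kids
    return flagged


# Hypothetical counts: D was three levels deeper than B and C before trimming.
counts = {
    "A": {"B": 0, "C": 0, "D": 3},
    "E": {"F": 1, "G": 1},
}
suggestions = flag_uneven_siblings(counts)
```

Here only the children of "A" are flagged, which is where the user would be asked to check whether "D" really belongs at the same abstraction level as "B" and "C".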
23. Concept Drift Management
• The results of Matched Result Analysis and Trimmed Result Analysis are shown in the Visualization Module
• The parts to be modified are highlighted based on Matched Result Analysis and Trimmed Result Analysis
24. Relationship Refinement Module
• Non-taxonomic relationship learning
– Set the parameters for WordSpace and Association Rule
– Identify the correct pairs from the generated candidates
• Non-taxonomic relationships
– Construct non-taxonomic relationships by considering the relation within each concept pair
28. Implementation Architecture
• Platform: Java Virtual Machine
• Modules: Input Module, Construction and Refinement Module, Visualization Module, Translation Module
• Libraries and tools
– Java WordNet Library (JWNL): http://jwordnet.sourceforge.net/
– Gensen: a complex word extraction tool
– Sen: a Japanese morpheme analyzer, http://ultimania.org/sen/
– SS-Tagger: an English tagger
– MR3: an RDF & RDFS graphical editor, http://mmm.semanticweb.org/mr3/
– Jena2 (Jena Semantic Web Tool Framework): HP Labs, http://jena.sourceforge.net/
30. Case Studies
• Purpose
– To check whether DODDLE-OWL can support the user in
constructing taxonomic and non-taxonomic relationships
• Target Field
– Particular field of business
– xCBL (XML Common Business Library)
• http://www.xcbl.org/
• Domain Specific Document
– xCBL Document Description
– about 150 sentences and 2500 words
• Input Concepts
– 57 business concepts from the document
• User
– Not an expert but has business knowledge
31. Results and Evaluation for Taxonomic Relationships Construction
• The number of concepts in each model
– Input concepts: 57 concepts
– Initial model (paths related to the input concepts from WordNet): 152 concepts
– Initial concept hierarchy (after trimming; constructed with the Hierarchy Construction Module): 83 concepts
– Business ontology (after modification; constructed with the Hierarchy Refinement Module): 82 concepts
• Evaluation of two strategies by the user
– ST1: Matched Result Analysis: precision 5/25 (= 0.2), recall 5/7 (= 0.71)
32. Results and Evaluation for Non-Taxonomic Relationships Construction
• Results (WS: WordSpace, AR: Association Rule)
– WS: 40 extracted, 30 accepted, 10 rejected concept pairs; precision 0.75 (30/40)
– AR: 39 extracted, 20 accepted, 19 rejected concept pairs; precision 0.51 (20/39)
– The union of WS & AR: 66 extracted, 39 accepted, 27 rejected concept pairs; precision 0.59 (39/66)
• Venn diagram of the extracted concept pairs
– Accepted: 19 by WS only, 11 by both, 9 by AR only
– Rejected: 8 by WS only, 2 by both, 17 by AR only
• WordSpace parameters
– Frequency of extracted 4-gram: 2
– Context scope (before : after): 10:10
– Threshold of similarity: 0.6
• Association Rule parameters
– Minimum support: 0.7 %
– Minimum confidence: 55 %
33. Results and Evaluation for Non-Taxonomic Relationships Acquisition
• Precision: WS 0.75 (30/40), AR 0.51 (20/39), the union of WS & AR 0.59 (39/66)
• The precision of the WS method is good, but the method has its own bias, so certain types of concept pairs cannot be obtained from it.
• We therefore combine the two different methods to obtain a wider range of concept pairs.
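The precision figures and the overlap between the two methods follow from the reported counts by simple arithmetic. The sketch below only restates the numbers from the results table; the variable names are chosen for illustration.

```python
# Counts reported for WordSpace (WS), Association Rule (AR), and their union.
extracted = {"WS": 40, "AR": 39, "union": 66}
accepted = {"WS": 30, "AR": 20, "union": 39}

# Precision = accepted concept pairs / extracted concept pairs.
precision = {k: accepted[k] / extracted[k] for k in extracted}

# Inclusion-exclusion gives the pairs found by both methods.
both_extracted = extracted["WS"] + extracted["AR"] - extracted["union"]
both_accepted = accepted["WS"] + accepted["AR"] - accepted["union"]

# Accepted pairs that only one of the two methods found.
ws_only_accepted = accepted["WS"] - both_accepted
ar_only_accepted = accepted["AR"] - both_accepted
```

The 9 accepted pairs found only by Association Rule are the ones that would be lost if WordSpace were used alone, which is the motivation for combining the two methods.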
37. Conclusions
• Summary
– DODDLE-OWL: a Domain Ontology rapiD DeveLopment
Environment – OWL extension
• Focusing the quality refinement phase of ontology construction
– Case studies
• construct a domain ontology for xCBL
• Support the user in constructing and refining the domain ontology
• Future Work
– Reuse existing (domain) ontologies in any form
– Apply DODDLE-OWL to large-scale domain ontology
construction
• Rocket operation ontology
• About 40,000 concepts
38. Thank you for your attention.
DODDLE-OWL has been released.
Please visit the web site if you are interested.
About 100 users now
http://mmm.semanticweb.org/doddle/