Most of the current search engines are based on the keyword search and it returns a large amount of irrelevant information in response to the user query. This is because of the keyword matching does not consider the semantics of the keyword term. Ontology has been increasingly used to efficiently capture the semantics of the information which overcomes the problems associated with traditional keyword search. The importance of ontology is identified in the area of building semantic information retrieval systems for the specific domain and gains greater significance over the tradition keyword search.Knowledge representation formalism such as description logic is used to obtain the knowledge of the domain of interest and is used for the development of intelligent information retrieval system.Prot�g� is the widely used tool for developing and editing ontology with GUI which enables the ontology developers to describe the conceptual terms and relationship between them by creating taxonomies. This paper describes the steps for creating operating system ontology and the various aspects like subclassof, superclassof relationships and an overview of search algorithm have been described
Introduction to IEEE STANDARDS and its different types.pptx
Semantic Information Retrieval Based on Domain Ontology
1. Integrated Intelligent Research (IIR) International Journal of Web Technology
Volume: 04 Issue: 02 December 2015 Page No.74-76
ISSN: 2278-2389
74
Semantic Information Retrieval Based on Domain
Ontology
A. Sudha Ramkumar1
, B. Poorna2
1
Research Scholar, Bharathiar University , Coimbatore, India,
2
Principal, SSS Jain College,Chennai, India.
Email: sudharam99@gmail.compoornasundar@yahoo.com
Abstract - Most of the current search engines are based on the
keyword search and it returns a large amount of irrelevant
information in response to the user query. This is because of
the keyword matching does not consider the semantics of the
keyword term. Ontology has been increasingly used to
efficiently capture the semantics of the information which
overcomes the problems associated with traditional keyword
search. The importance of ontology is identified in the area of
building semantic information retrieval systems for the specific
domain and gains greater significance over the tradition
keyword search.Knowledge representation formalism such as
description logic is used to obtain the knowledge of the domain
of interest and is used for the development of intelligent
information retrieval system.Protégé is the widely used tool for
developing and editing ontology with GUI which enables the
ontology developers to describe the conceptual terms and
relationship between them by creating taxonomies. This paper
describes the steps for creating operating system ontology and
the various aspects like subclassof, superclassof relationships
and an overview of search algorithm have been described.
Keywords - Semantic Information retrieval; Search algorithm;
Ontology.
I. INTRODUCTION
Traditional keyword search relies on the occurrence of the
keywords in the document and this search returns a large
amount of irrelevant information.Knowledge management is
the process of capturing, structuring and reusing the
information to develop an understanding of how a particular
system works, and subsequently to share this information
meaningfully to other information system. The adequate
management of knowledge is becoming more important for
academicians in educational institutions. Ontology is used to
represent the knowledge of the domain of interest which
significantly improves the search results. According to Gruber
ontology is an explicit specification of conceptualizations.
Ontologies are developed with the help of ontology languages
such as web ontology language OWL, Resource Description
Framework RDF, and RDF-schema. Ontologies serve to
provide conceptualization in the domain of interest and thus
guarantee to provide semantics of the information. Thus
ontologies are an integral part in every knowledge management
initiative. The protégé tool developed by the Stanford
university is widely used tool for constructing domain ontology
and to develop semantic information retrieval system.This
paper provides an overview of development of ontology and
the use of DL query. Finally we discuss how a DL queries are
used for the semantic search. This paper is organized as
follows. Section 2 provides the overview of semantic
information retrieval systems, ontology languages and DL
queries. Section 3 describes our framework of semantic
information retrieval system based on ontology. Sections 4
provides an overview of search algorithm and section 5
describes the future work and conclusion.
II.BACKGROUND
The semantic web aims to extend the current web standards
and technology so that the semantics of the web is machine
processible[12] . The use of domain ontology in the semantic
information retrieval system enablesusers torepresent a
meaningful knowledge of a particular domain of interest which
makes the machine more intelligent to understand the
semantics of the information.Domain Ontology describes a list
of terms to represent important concepts, such as classes of
objects and the relationship between them in order to represent
the knowledge of domain[13]. Gruber defines ontology as an
explicit specification of conceptualization. Ontologies play an
important role in representing the terms unambiguously which
serve as a background for machine processible.Ontology
languages are formal languages and are used to construct
ontologies. They allow the encoding of knowledge about the
specific domain and often include reasoning rules that support
the processing of the available knowledge. Ontology languages
are commonly based on either first order logic or on
description logic. Among the several ontology languages such
as Resource Description Framework(RDF), RDF-Schema,Web
ontology language OWL, OWL is the most acceptable and
recommended ontology language by W3C to construct a
domain ontology[5].Sir Jorge Cardoso[2] surveyed on most
widely used ontology editors and found that protégé tool had a
market share of 68.2% followed by swoop, ontoedit,
texteditor, altova semantic works and hence ontologies were
mostly developed in the educational field 31%.Semantic
information retrieval using ontology in university domain[10]
developed and implemented a search engine confined to the
university domain where the query is analysed semantically to
obtain the accurate result. Semantic search using ontology and
RDBMS for cricket[9] uses reference to ontology data directly
from SQL using semantic match operators and focuses on
semantic search based on ontology. Developing an university
ontology in education domain using protégé for Semantic
web[6]by Sanjay Kumar Maliket al., demonstrated the
indraprastha university ontology with aspects like super class,
subclass, query retrieval process and TGVIZ graph
visualization.Developing university ontology using protégé owl
tool: Process and Reasoning[7] by Naveen Malviya et al., is
similar to Sanjay Kumar malik et al., .The Jena based ontology
model inference and retrieval application[14] by Luo Zhong et
al., proposed computer graphics ontology model and stored the
2. Integrated Intelligent Research (IIR) International Journal of Web Technology
Volume: 04 Issue: 02 December 2015 Page No.74-76
ISSN: 2278-2389
75
owl documents in MySQL, sets the inference rule and achieve
search queries on the ontology database. Semantic Search :
Document ranking and clustering using computer science
ontology and N-grams[1] by Thanyaporn Boonyoung
presented a new document ranking process that proposes a new
weight of query term in the document based on the computer
science ontology.
III. FRAMEWORK OF PROPOSED SYSTEM
The operating system ontology has been developed using
protégé 4.2 with the related classes as conc epts. All the
concepts related to the operating system ontology has been
defined in the hierarchy with data properties, object properties
and annotation properties. The object property defines the
relationship between the concepts of operating system
ontology and the data properties used to define the relationship
between the concept individual and the data literal. The
annotation properties such as comment, seealso, isdefinedby
etc are used to annotate the concepts of the domain ontology of
interest. Definition of each class, subclass concept and list of
references to each class and subclass are attached to the node
of class hierarchy. The ontoGraf tab is used to visualize the
ontology concepts like graph. The DL query tab of protégé is a
powerful and easy to use feature for searching a classified
ontology. This query language is based on the OWL syntax
that is fundamentally based on collecting all the information
about a class, property or instances into a single frame.
Fig 1: Framework of Proposed system
Fig 2: Class Hierarchy of OS ontology
Our proposed system first identifies the concepts and their
relationship for the operating system ontology. The identified
concepts and their relationship are built into domain ontology
and stored in the knowledge repository with sufficient
annotation properties and instances. Finally, this ontology is
called through the user interface by matching the keyword term
with the concepts present in the class hierarchy. The search
algorithm is to be used to search the concepts of this domain
ontology and to obtain tha knowledge of that concept.
Fig 3: Visualization of ontology
Fig 4: DL query
IV. OVERVIEW OF SEARCH ALGORITHM
Hildebrand[4] examined 35 search systems and according to
him, the search system consists of 3 main steps. They areQuery
construction, Core search process and presentation and
organization of results.The core search process contains the
search algorithm. All keyword based search involves some form
of syntactic matching of the query against textual content or
metadata. The semantic matching extends the syntactic
matching by exploiting the linked data in the semantic graph.
To index the textual content of the ontology, a search engine
such as Lucene can be used along with a triple store. When
writing search algorithm, the easiest way to optimize code is to
create inverted index which is a structure which maps each
searchable keyword to a list of document that occurs in. Along
with each document, store a corresponding score for that
keyword in that document. To calculate score, there are many
search algorithms[4] available like weighted graph search
algorithm may constrain the possible path structures and path
length, e-culture algorithm assigns weights manually to RDF
relations, SemRank automatically computes weights based on
statistics derived from graph structures, and spreading
activation algorithm[11] incorporates weights as well as the
number of incoming links. The vector space model uses the
spreading activation algorithm by exploiting the relationship
3. Integrated Intelligent Research (IIR) International Journal of Web Technology
Volume: 04 Issue: 02 December 2015 Page No.74-76
ISSN: 2278-2389
76
between the concepts will give the required appropriate
information to the user in response to their query.The spreading
activation model[3] is widely used in the information retrieval
which consists of a network structure of concepts of a domain
and processing techniques applied on this structure to find the
concepts related to the query term. The relations between
concepts of a domain are usually labelled and weighted. The
spreading activation algorithm will create a set of concepts
after receiving the contents of the query. With this set of
concept, it will find the nearest concepts in the network
structure. This process is an iterative one. This spreading
activationalgorithm is like waves activates from one concept to
all other conceptsconncected to the network. When this
spreading activation[8] is used with the query expansion
technique will yield very good search results.
V.CONCLUSION AND FUTURE WORK
The ontology becomes an integral part of every semantic
applications. The use of ontology in the search process will
make the machine understandable about the information and
helps to produce intelligent search results. The use of ontology
in the spreading activation algorithm increases the coverage of
relationship between the concepts. This paper developed a
operating system ontology in protégé along with relationships,
check for consistency of classes, visualizates the ontology and
finally describes the usage of DL query on ontology concepts.
The scope of this developed operating system is very small. Our
future work is to call the ontology concepts as network structure
consists of the nodes and arcs, here the concepts are represented
as nodes and properties are represented as arcs. This network
structure is to be applied to the spreading activation algortithm
to find relevant information based on the result of traditional
keyword search based on the document index.
REFERENCES
[1] Boonyoung, Thanyaporn, and Anirach Mingkhwan. "Semantic Search
Using Computer Science Ontology Based on Edge Counting and N-
Grams." Recent Advances in Information and Communication
Technology. Springer International Publishing, 2014. 283-291.
[2] Cardoso, Jorge. "The semantic web vision: Where are we?." Intelligent
Systems, IEEE 22.5 (2007): 84-88.
[3] Crestani, Fabio. "Application of spreading activation techniques in
information retrieval." Artificial Intelligence Review 11.6 (1997): 453-
482.
[4] Hildebrand, Michiel, Jacco Van Ossenbruggen, and Lynda Hardman. "An
analysis of search-based user interaction on the semantic web." (2007).
[5] Horridge, Matthew, et al. "A Practical Guide To Building OWL
Ontologies Using The Protégé-OWL Plugin and CO-ODE Tools Edition
1.0." University of Manchester (2004).
[6] Malik, Sanjay Kumar, Nupur Prakash, and S. A. M. Rizvi. "Developing an
university ontology in education domain using protégé for semantic
web."International Journal of Science and Technology 2.9 (2010): 4673-
4681.
[7] Malviya, Naveen, Nishchol Mishra, and Santosh Sahu. "Developing
university ontology using protégé owl tool: Process and
reasoning." International Journal of Scientific & Engineering Research 2.9
(2011): 1-8.
[8] Ngo, Vuong M. "Discovering Latent Informaion By Spreading Activation
Algorithm For Document Retrieval." International Journal of Artificial
Intelligence & Applications 5.1 (2014).
[9] Patil, S. M., and D. M. Jadhav. "Semantic Search using Ontology and
RDBMS for Cricket." International Journal of Computer Applications 46
(2012).
[10] Rajasurya, Swathi, et al. "Semantic information retrieval using ontology in
university domain." arXiv preprint arXiv:1207.5745 (2012).
[11] Schumacher, Kinga, Michael Sintek, and Leo Sauermann. Combining fact
and document retrieval with spreading activation for semantic desktop
search. Springer Berlin Heidelberg, 2008.
[12] Shadbolt, Nigel, Wendy Hall, and Tim Berners-Lee. "The semantic web
revisited." Intelligent Systems, IEEE 21.3 (2006): 96-101.
[13] Swartout, Bill, et al. "Toward distributed use of large-scale
ontologies." Proc. of the Tenth Workshop on Knowledge Acquisition for
Knowledge-Based Systems. 1996.
[14]Zhong, Luo, et al. "The Jena-Based Ontology Model Inference
andRetrieval Application." Intelligent Information Management 4.04
(2012):157