This document discusses using natural language processing (NLP) for searching intranets. It begins with an abstract that introduces NLP and the purpose of exploring its use for intranet searching. The introduction provides an overview of NLP, including that it uses tools from artificial intelligence to process natural languages in two ways: parsing and transition networks. The document then discusses the goals, levels, and applications of NLP, and how NLP is implemented through setting up dictionaries and relationships. It concludes that while still a developing area, NLP has shown promise for information access and will continue to be researched and developed for applications like intranet searching.
1. A Natural Language Processing
Approach for Searching on
Intranets
Ansuman Acharya
Subrat Kumar Chand
2. Abstract
• Natural Language Processing (NLP) is the
computerized approach to analyzing text that is
based on both a set of theories and a set of
technologies. Because it is a very active area of
research and development, there is no single
agreed-upon definition that would satisfy everyone.
The purpose of this paper is to explore whether and
how Natural Language Processing (NLP) should be used
for searching on intranets, as opposed to using NLP
for searching on Internet sites. It also describes the
limitations and implementation techniques of
natural language processing.
3. Introduction To Natural Language
Processing
• NLP is the means for accomplishing a particular
task.
• It is a combination of computational linguistics
and artificial intelligence.
• Natural language processing uses the tools of
AI, such as algorithms, data structures, formal
models for representing knowledge, and models of
reasoning processes.
• There are two ways in which natural
languages are processed: the first is the parsing
technique and the second is the transition network.
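As an illustration of the transition network idea mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hypothetical lexicon (`LEXICON`) and a hand-written finite-state network (`NETWORK`) that accepts only sentences of the form determiner, optional adjectives, noun, verb, determiner, optional adjectives, noun. All names, states, and words are invented for the example.

```python
# Hypothetical lexicon: maps each known word to a part-of-speech category.
LEXICON = {
    "the": "DET", "a": "DET",
    "new": "ADJ", "annual": "ADJ",
    "report": "NOUN", "policy": "NOUN", "manager": "NOUN",
    "approves": "VERB", "cites": "VERB",
}

# Hypothetical transition network: state -> {category: next state}.
# It accepts: DET ADJ* NOUN VERB DET ADJ* NOUN
NETWORK = {
    "S0": {"DET": "S1"},
    "S1": {"ADJ": "S1", "NOUN": "S2"},
    "S2": {"VERB": "S3"},
    "S3": {"DET": "S4"},
    "S4": {"ADJ": "S4", "NOUN": "S5"},
}
FINAL_STATES = {"S5"}


def accepts(sentence):
    """Return True if the sentence can be traversed from S0 to a final state."""
    state = "S0"
    for word in sentence.lower().split():
        category = LEXICON.get(word)  # None for words outside the lexicon
        next_state = NETWORK.get(state, {}).get(category)
        if next_state is None:
            # No arc leaves the current state for this word's category.
            return False
        state = next_state
    return state in FINAL_STATES


if __name__ == "__main__":
    print(accepts("The manager approves the new annual policy"))
    print(accepts("manager the approves"))
```

Each word moves the network along one arc; the sentence is accepted only if the last word leaves the network in a final state. Real systems use much richer networks and lexicons than this sketch.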
5. Goals of NLP
• To specify a language comprehension and
production theory to such a level of detail that a
person is able to write a computer program which
can understand and produce natural language.
• To accomplish human-like language processing.
The choice of word “processing” is very deliberate
and should not be replaced with “understanding”.
• Paraphrase an input text.
• Translate the text into another language.
• Answer questions about the contents of the text.
• Draw inferences from the text.
6. Levels of NLP
• Phonetic and Phonological Knowledge
• Morphological Knowledge
• Lexical Knowledge
• Syntactic Knowledge
• Semantic Knowledge
• Discourse Knowledge
• Pragmatic Knowledge
• World Knowledge
7. How NLP is implemented
Implementation of NLP takes place in two steps:
• Technical set-up of the search engine (software)
• Set-up of the dictionaries and relationships that
enable the search engine to use natural language
searches.
• NL search engines rely heavily on the initial set-up work of
establishing a list of everyday words (dictionaries)
that might be used to search the database and
establishing how these words relate to each other
(relationships).
• NLP is implemented either by purchasing an NL
search engine and carrying out the implementation
of the dictionaries and relationships internally, or by
contracting an NLP search engine developer to help
in this process.
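The role of dictionaries and relationships can be sketched as a simple query-expansion step. This is a minimal illustration under stated assumptions, not a description of any particular NL search engine: the `DICTIONARY`, `RELATIONSHIPS`, and `STOP_WORDS` tables and the `expand_query` function are hypothetical, and the vocabulary is invented for an imagined HR intranet.

```python
# Hypothetical dictionary: everyday words mapped to the intranet's index terms.
DICTIONARY = {
    "holiday": "leave",
    "vacation": "leave",
    "pay": "salary",
    "wages": "salary",
    "boss": "manager",
}

# Hypothetical relationships: index term -> related index terms.
RELATIONSHIPS = {
    "leave": ["absence", "time-off"],
    "salary": ["payroll"],
}

# Hypothetical list of words that carry no search value.
STOP_WORDS = {"how", "do", "i", "a", "the", "my", "for", "to", "is", "what"}


def expand_query(question):
    """Turn a natural-language question into an ordered list of search terms."""
    terms = []
    for raw in question.lower().split():
        word = raw.strip("?.,!")
        if not word or word in STOP_WORDS:
            continue
        # Map the everyday word to its index term, if the dictionary knows it.
        term = DICTIONARY.get(word, word)
        # Add the term followed by its related terms, skipping duplicates.
        for candidate in [term] + RELATIONSHIPS.get(term, []):
            if candidate not in terms:
                terms.append(candidate)
    return terms


if __name__ == "__main__":
    print(expand_query("How do I request a holiday?"))
```

The quality of the results depends almost entirely on how well the two tables reflect the vocabulary of the intranet's users, which is why this set-up work is usually the larger part of an implementation.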
8. Natural Language Processing
Applications
• Information Retrieval
• Information Extraction (IE)
• Question-Answering
• Summarization
• Machine Translation
• Dialogue Systems
9. Should We Implement NLP?
• The effectiveness of NL searching increases with
domain specialization.
• Search engines are an extremely important facility for
any electronic body of information.
• On the Internet, users often go straight to the search
facility, rather than bother with understanding the
navigation principles and categories of a site.
• The question here is whether NLP is appropriate for
searching within an intranet. It is fair to assume that
users of a search engine for an intranet are
professionals who have specific questions in a domain
with which they are quite familiar already.
10. Conclusion
• While NLP is a relatively recent area of research and
application compared with other information
technology approaches, there have been sufficient
successes to date to suggest that NLP-based
information access technologies will continue to be a
major area of research and development in
information systems now and far into the future.
• Such a suite of technologies can produce collective
intelligence and a record of how it grew in a group
setting.
• Furthermore, collaboration among scientists from the
various domains tackling this challenging problem
may yield unforeseeable technology spin-offs as they
work closely together, as opposed to the
tradition of the fields operating independently.