3. Introduction
The Association for Computational Linguistics defines computational linguistics (CL) as the
scientific study of language from a computational perspective.
Computational linguists are interested in providing computational
models of various kinds of linguistic phenomena.
Work in computational linguistics is in some cases motivated from a
scientific perspective in that one is trying to provide a computational
explanation for a particular linguistic or psycholinguistic
phenomenon.
4. Definition
Computational linguistics is the application of linguistic theories and
computational techniques to problems of natural language
processing.
Grishman (1986) defines computational linguistics as the study of
computer systems for understanding and generating natural
language.
6. Purpose
The purpose of CL is to develop applications that handle
computer tasks related to human language, such as
software for grammar correction, word sense disambiguation,
compilation of dictionaries and corpora, and automatic translation from
one language to another.
7. Origin
Computational linguistics is often grouped within the field of artificial intelligence but was present before the
development of artificial intelligence. Computational linguistics originated with efforts in the United States in
the 1950s to use computers to automatically translate texts from foreign languages, particularly Russian
scientific journals, into English. Since computers can make arithmetic (systematic) calculations much faster
and more accurately than humans, it was thought to be only a short matter of time before they could also
begin to process language. Computational and quantitative methods have also been used historically in the attempted
reconstruction of earlier forms of modern languages and in sub-grouping modern languages into language
families. Earlier methods, such as lexicostatistics and glottochronology, have proven to be premature
and inaccurate. However, recent interdisciplinary studies that borrow concepts from biological studies,
especially gene mapping, have produced more sophisticated analytical tools and more reliable
results.
When machine translation (also known as mechanical translation) failed to yield accurate translations right
away, automated processing of human languages was recognized as far more complex than had originally
been assumed. Computational linguistics was born as the name of the new field of study devoted to
developing algorithms and software for intelligently processing language data. The term "computational
linguistics" itself was first coined by David Hays, a founding member of both the Association for
Computational Linguistics (ACL) and the International Committee on Computational Linguistics (ICCL).
8. Origin
To translate one language into another, it was observed that one had to understand the
grammar of both languages, including both morphology (the grammar of word forms) and
syntax (the grammar of sentence structure). To understand syntax, one had to also
understand the semantics and the lexicon (or 'vocabulary'), and even something of the
pragmatics of language use. Thus, what started as an effort to translate between languages
evolved into an entire discipline devoted to understanding how to represent and process
natural languages using computers.
Nowadays research within the scope of computational linguistics is done at computational
linguistics departments, computational linguistics laboratories, computer science
departments, and linguistics departments. Some research in the field of computational
linguistics aims to create working speech or text processing systems, while other research aims to
create systems that allow human-machine interaction. Programs meant for human-machine
communication are called conversational agents.
9. Main Application Areas
Machine Translation
Natural Language Interface
Speech Recognition
Intelligent Word Processing: grammar checking, spelling correction
10. Machine Translation:
Machine translation (MT) is a sub-field of computational linguistics that investigates
the use of software to translate text or speech from one language to another.
Translation involves moving a text from one human language (the source language) to another (the target language) in a way that
preserves meaning.
Machine translation (MT) automates the process, or part of the process:
o Fully automatic translation
o Computer-aided (human) translation
Is MT good or not?
11. Natural Language Interface:
A natural language interface (NLI) is an interface that allows users to interact with a computer using a
human language.
An NLI essentially provides an abstraction layer between users and computers by enabling computers to understand
human language instead of the other way around. It allows the user to enter natural language search
queries as written text or speech.
Proven Capabilities:
Natural-language (NL) interfaces built so far have primarily addressed the problem of accessing
information stored in conventional database systems. Among the proven capabilities exhibited by these systems are
those that:
Provide reasonably good access to specific databases.
Answer direct questions (“What is Komal’s salary?”).
Coordinate multiple files (“What is Komal’s location?” translates into “What is the location of Komal’s department?”).
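The two capabilities above (answering a direct question and coordinating multiple files) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the in-memory SQLite tables `employees` and `departments`, the sample row for Komal, the value 'Building A', and the single question pattern "What is X's Y?". Real natural-language interfaces handle far more varied input.

```python
import re
import sqlite3

def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE departments (name TEXT PRIMARY KEY, location TEXT);
        CREATE TABLE employees (name TEXT PRIMARY KEY, salary INTEGER, department TEXT);
        INSERT INTO departments VALUES ('Research', 'Building A');
        INSERT INTO employees VALUES ('Komal', 50000, 'Research');
    """)
    return conn

# Attributes stored directly on the employee record.
DIRECT = {"salary": "SELECT salary FROM employees WHERE name = ?"}

# Attributes that live in another table: the question about the person
# is rewritten as a question about the person's department.
JOINED = {
    "location": (
        "SELECT d.location FROM employees e "
        "JOIN departments d ON e.department = d.name WHERE e.name = ?"
    )
}

# Matches questions of the form "What is <person>'s <attribute>?"
QUESTION = re.compile(r"what is (\w+)['’]s (\w+)\??$", re.IGNORECASE)

def answer(conn, question):
    """Translate a simple question into SQL and return the value, or None."""
    match = QUESTION.match(question.strip())
    if match is None:
        return None
    person, attribute = match.group(1), match.group(2).lower()
    sql = DIRECT.get(attribute) or JOINED.get(attribute)
    if sql is None:
        return None
    row = conn.execute(sql, (person,)).fetchone()
    return row[0] if row else None
```

Calling `answer(build_demo_db(), "What is Komal's location?")` runs the joined query, which mirrors the rewriting of the question into one about Komal's department.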
12. Speech Recognition:
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops
methodologies and technologies that enable the recognition and translation of spoken language into text by computers.
How does it work?
Speech recognition software works by breaking down the audio of a speech
recording into individual sounds, analyzing each sound, using algorithms to find
the most probable word fit in that language, and transcribing those sounds into
text.
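The "most probable word fit" step can be sketched as follows. This is a toy illustration only: it assumes the audio has already been turned into a sequence of sound labels, and the lexicon, the sound labels, and the prior probabilities are hypothetical values invented for the example. Real systems use statistical or neural acoustic and language models instead of the simple similarity score used here.

```python
from difflib import SequenceMatcher

# Hypothetical pronunciation lexicon: word -> (sound sequence, prior probability).
LEXICON = {
    "cat": (("K", "AE", "T"), 0.4),
    "cap": (("K", "AE", "P"), 0.3),
    "bat": (("B", "AE", "T"), 0.3),
}

def score(sounds, word):
    """Score a word as (similarity of sounds to its pronunciation) * (prior)."""
    pronunciation, prior = LEXICON[word]
    similarity = SequenceMatcher(None, list(sounds), list(pronunciation)).ratio()
    return similarity * prior

def best_word(sounds):
    """Return the lexicon word with the highest score for the heard sounds."""
    return max(LEXICON, key=lambda word: score(sounds, word))
```

For instance, `best_word(["K", "AE", "D"])` still selects "cat": the last sound is misheard, but "cat" shares the most sounds with the input and has the highest prior in this made-up lexicon.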
Speech Recognition Digital Assistants:
Digital assistants are designed to help people perform or complete basic tasks and
respond to queries with the ability to access information from vast databases and
various digital sources.
For example: Amazon’s Alexa, Apple’s Siri, and Google Assistant.
13. Grammar Checking:
Grammar checking is the task of detecting and correcting grammatical errors in text. English is the dominant language
in the field of science and technology. Therefore, non-native English speakers must be able to use correct English grammar
while reading, writing, or speaking. This creates a need for automatic grammar checking tools.
Grammar Checking Tools:
Such tools have been under development since the 1980s. The earliest grammar checking tools, e.g., Writer’s
Workbench, were aimed at detecting punctuation errors and style errors.
In the 1990s, many tools were made available, e.g., Right Writer.
In recent decades, rapid development has been seen in this field. For example, Park et al. developed a grammar
checker as a web application for university ESL students.
Types of Errors:
Sentence Structure Error, Spelling Error, Punctuation Error, Syntax Error, Preposition Error, Semantic Error, etc.
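Of the error types listed, spelling errors are the simplest to illustrate. The sketch below is a minimal example, not how any of the tools named above work: it assumes a small hypothetical vocabulary and uses Python's standard `difflib.get_close_matches` to suggest the closest known word for each unknown one.

```python
import re
from difflib import get_close_matches

# Hypothetical vocabulary, invented for illustration; a real checker
# would load a full dictionary.
VOCABULARY = {
    "grammar", "checking", "is", "the", "task", "of",
    "detecting", "errors", "in", "text",
}

def check_spelling(sentence, vocabulary=VOCABULARY):
    """Map each unknown word to a list holding its closest known word (if any)."""
    suggestions = {}
    for word in re.findall(r"[a-z]+", sentence.lower()):
        if word not in vocabulary:
            # n=1 keeps only the best match; cutoff filters out weak matches.
            suggestions[word] = get_close_matches(
                word, sorted(vocabulary), n=1, cutoff=0.7
            )
    return suggestions
```

With this vocabulary, `check_spelling("Grammer checking is the task")` flags "grammer" and suggests "grammar". Detecting the other error types (syntax, preposition, semantic) requires analysis of sentence structure and meaning, which a word list lookup cannot provide.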
14. Approaches:
Just as computational linguistics can be performed by experts in a variety of fields
and through a wide assortment of departments, so too can its research fields cover a
diverse range of topics. The following sections discuss some of the literature available
across the entire field, broken into four main areas of discourse: developmental
linguistics, structural linguistics, linguistic production, and linguistic comprehension.
Developmental Approach
Language is a cognitive skill that develops throughout the life of an individual. This
developmental process has been examined using several techniques, and a computational
approach is one of them. Human language development does provide some constraints which
make it harder to apply a computational method to understanding it.
15. Approaches:
Structural Approach
To create better computational models of language, an understanding of language's structure is crucial. To
this end, the English language has been meticulously studied using computational approaches to better
understand how the language works on a structural level. One of the most important prerequisites for
studying linguistic structure is the availability of large linguistic corpora or samples.
Production Approach
The production of language is equally complex, both in the information it provides and in the skills
that a fluent producer must have. That is to say, comprehension is only half the problem of communication.
The other half is how a system produces language, and computational linguistics has made interesting
discoveries in this area.
Text-based interactive Approach
By this method, words typed by a user trigger the computer to recognize specific patterns and reply
accordingly, through a process known as keyword spotting.
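Keyword spotting can be sketched in a few lines. The rules and replies below are hypothetical, chosen only to show the mechanism: the program looks for known keywords among the words the user typed and returns the reply attached to the first rule that matches.

```python
import re

# Hypothetical keyword -> reply rules, checked in order.
RULES = [
    ({"hello", "hi"}, "Hello! How can I help you?"),
    ({"translate", "translation"}, "Which language would you like to translate into?"),
    ({"bye", "goodbye"}, "Goodbye!"),
]
DEFAULT_REPLY = "Sorry, I did not understand that."

def reply(user_input):
    """Return the reply of the first rule whose keywords appear in the input."""
    # Split into whole words so that "this" does not trigger the keyword "hi".
    words = set(re.findall(r"[a-z']+", user_input.lower()))
    for keywords, response in RULES:
        if words & keywords:
            return response
    return DEFAULT_REPLY
```

For example, `reply("Can you translate this?")` matches the second rule. The program recognizes patterns without any understanding of the sentence, which is why such systems fail on input that contains none of their keywords.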
16. Approaches:
Speech-based interactive Approach
Recent technologies have placed more emphasis on speech-based interactive systems. These systems,
such as Siri on the iOS operating system, use a pattern-recognition technique similar to that of
text-based systems, but the user input is obtained through speech recognition.
Comprehension Approach
Much of the focus of modern computational linguistics is on comprehension. With the proliferation of the
internet and the abundance of easily accessible written human language, the ability to create a program
capable of understanding human language would have many broad and exciting possibilities, including
improved search engines, automated customer service, and online education.
17. Conclusion
Nowadays, research within the scope of CL is done at computational
linguistics departments, CL laboratories, computer science
departments, and linguistics departments.