Arabic question answering
1. Imam University
College of Computer and Information Systems
Computer Sciences Department
Arabic Question Answering
by Asma Ahmad Asma alharbi
nadia AL-Mutiri
Supervised by: Dr. Amal Al seef
Second semester: 1434-1435
2013
2. Arabic Question Answering
Overview:
O The implementation of Arabic Question-Answering system components.
O QASAL & QARAB system components.
O Yes/No Arabic Question Answering.
4. Named Entity Recognizer
O A NER system identifies proper names, temporal expressions, and numeric expressions.
O This Arabic NER system is based on the ME (maximum entropy) approach.
O For proper name recognition:
O For temporal and numeric expressions: recognition is based entirely on patterns and a small dictionary containing the names of days and months in Arabic, and numbers written out in words.
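The pattern-plus-dictionary idea for temporal and numeric expressions can be sketched as follows. This is a minimal illustration only: the word lists, the tag names, and the function `tag_expressions` are assumptions made for the example, not the lists or rules used by the system described in these slides.

```python
import re

# Small illustrative dictionaries (assumed; the real system's lists are not given).
DAYS = ["الأحد", "الاثنين", "الثلاثاء", "الأربعاء", "الخميس", "الجمعة", "السبت"]
MONTHS = ["يناير", "فبراير", "مارس", "أبريل", "مايو", "يونيو",
          "يوليو", "أغسطس", "سبتمبر", "أكتوبر", "نوفمبر", "ديسمبر"]
NUMBER_WORDS = ["واحد", "اثنان", "ثلاثة", "أربعة", "خمسة",
                "ستة", "سبعة", "ثمانية", "تسعة", "عشرة"]

PUNCTUATION = "،.؟!"
# Western digits or Arabic-Indic digits.
DIGITS = re.compile(r"[0-9\u0660-\u0669]+")

def tag_expressions(text):
    """Return (word, tag) pairs for tokens found by dictionary or pattern."""
    tags = []
    for token in text.split():
        word = token.strip(PUNCTUATION)
        if word in DAYS or word in MONTHS:
            tags.append((word, "TEMPORAL"))
        elif word in NUMBER_WORDS or DIGITS.fullmatch(word):
            tags.append((word, "NUMERIC"))
    return tags
```

A real recognizer would also need to handle attached prefixes (such as prepositions written together with the word) and multi-token dates, which this sketch ignores.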
5. The implementation of an Arabic Question-Answering system
O NooJ is a linguistic environment that includes large-coverage dictionaries and grammars.
O A spell-checker that corrects the most frequent errors.
O A named entity recognition tool, which is a set of rules described in local grammars.
7. Question analysis: this step applies the set of linguistic resources to the input question.
The example shows NooJ’s text annotation structure, which gives the linguistic analysis of each word form in our sample question.
8. Passage retrieval: The first task of this step
could be the selection of one or more
automatically extract the answer of the
input question.
9. Answer Extraction: this last step uses the displayed concordance table to automatically extract the answer to the input question.
Example 1: Answer extraction for the factoid question:
12. Information Retrieval system
O Searches the document collection to select documents containing information relevant to the user’s query.
O Lundquist et al. [1999] describe an IR system that can be constructed using a relational database management system (RDBMS).
O In this paper, the IR system contains the following database relations:
1. ROOT_TABLE.
2. STEM_TABLE.
3. POSTING_TABLE.
4. DOCUMENT_TABLE.
5. PARAGRAPH_TABLE.
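To make the relational design concrete, here is a minimal sketch of how such relations might look in SQLite. Only the five table names come from the slide; every column name, the join structure, and the frequency-sum ranking are assumptions made for illustration.

```python
import sqlite3

def build_index(conn):
    """Create the five relations. All column names are assumed."""
    conn.executescript("""
        CREATE TABLE ROOT_TABLE (root_id INTEGER PRIMARY KEY, root TEXT UNIQUE);
        CREATE TABLE STEM_TABLE (stem_id INTEGER PRIMARY KEY, stem TEXT UNIQUE,
                                 root_id INTEGER REFERENCES ROOT_TABLE(root_id));
        CREATE TABLE DOCUMENT_TABLE (doc_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE PARAGRAPH_TABLE (par_id INTEGER PRIMARY KEY,
                                      doc_id INTEGER REFERENCES DOCUMENT_TABLE(doc_id),
                                      body TEXT);
        CREATE TABLE POSTING_TABLE (root_id INTEGER REFERENCES ROOT_TABLE(root_id),
                                    par_id INTEGER REFERENCES PARAGRAPH_TABLE(par_id),
                                    freq INTEGER);
    """)

def search(conn, roots):
    """Rank paragraphs by the summed frequency of the query roots they contain."""
    if not roots:
        return []
    marks = ",".join("?" * len(roots))
    query = f"""
        SELECT po.par_id, SUM(po.freq) AS score
        FROM POSTING_TABLE po
        JOIN ROOT_TABLE r ON r.root_id = po.root_id
        WHERE r.root IN ({marks})
        GROUP BY po.par_id
        ORDER BY score DESC, po.par_id
    """
    return conn.execute(query, list(roots)).fetchall()
```

Summing raw frequencies is only a stand-in for whatever weighting the actual system applies; the point is that retrieval reduces to a join over the posting relation.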
13. The NLP system
The NLP model consists of:
1. Tokenizer.
2. Type finder.
3. Feature finder.
4. Proper noun phrase parser.
14. How to extract the answer
Assume the user posed the following question to QARAB:
The IR system returns this passage. How?
20. Question Analysis
O Removing the question mark.
O Removing the interrogative particle.
O Tokenizing: the tokenizer divides the user question into its separate words and normalizes the Alef letter.
O Removing the stop words.
O Removing the negation particles (if they exist) and setting the negation property of the question representation.
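The preprocessing steps listed above can be sketched in a few lines. The particle and stop-word lists below are small assumed samples, and `analyze_question` and `normalize_alef` are hypothetical names; the sketch also tokenizes before dropping the interrogative particle, which is a simplification of the order given on the slide.

```python
import re

# Illustrative word lists; assumptions, not the system's own resources.
INTERROGATIVES = {"هل"}
NEGATIONS = {"لا", "لم", "لن", "ليس"}
RAW_STOP_WORDS = ["في", "من", "على", "إلى", "عن"]

def normalize_alef(word):
    """Map Alef with madda, hamza above, or hamza below to bare Alef."""
    return re.sub("[\u0622\u0623\u0625]", "\u0627", word)

# Stop words are normalized too, so they match normalized tokens.
STOP_WORDS = {normalize_alef(w) for w in RAW_STOP_WORDS}

def analyze_question(question):
    # Remove the question mark (Arabic or Latin form).
    text = question.replace("؟", " ").replace("?", " ")
    # Tokenize on whitespace and normalize the Alef letter.
    tokens = [normalize_alef(t) for t in text.split()]
    # Remove a leading interrogative particle.
    if tokens and tokens[0] in INTERROGATIVES:
        tokens = tokens[1:]
    # Remove stop words.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Remove negation particles and record the negation property.
    negated = any(t in NEGATIONS for t in tokens)
    terms = [t for t in tokens if t not in NEGATIONS]
    return {"negated": negated, "terms": terms}
```

The returned dictionary keeps the negation flag separate from the content terms, which is what the later logical representation needs.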
21. Question Analysis
O Tagging: determines the type of each word (verb or noun) and obtains its root.
O Parsing: recall that the Arabic sentence after the interrogative particle is either nominal or verbal.
22. Question Analysis
In a nominal sentence, we are interested in the beginning noun, the “topic” ( ), which is the first noun after the interrogative particle ( ), and in the comment noun ( ), which we can mark as the last noun without the article ( ).
In a verbal sentence, we are interested in the verb, which occurs immediately after the interrogative particle ( ), and in the subject that follows the verb.
23. Question Analysis
Logical Representation (With Nominal Sentences)
Affirmative questions:
O N(Topic, root(Comment), root({remaining words}))
O N(Topic, root(Comment Synonyms), root({remaining words}))
O ~N(Topic, root(Comment Antonyms), root({remaining words}))
24. Question Analysis
Logical Representation (With Nominal Sentences)
Negated questions:
O ~N(Topic, root(Comment), root({remaining words}))
O ~N(Topic, root(Comment Synonyms), root({remaining words}))
O N(Topic, root(Comment Antonyms), root({remaining words}))
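The affirmative and negated patterns for nominal sentences follow one rule: the comment and its synonyms keep the polarity of the question, while antonyms flip it. A minimal sketch of that rule is shown here; the `Predicate` class, the `logical_forms` function, and the way `root`, `synonyms`, and `antonyms` are passed in are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Predicate:
    negated: bool            # True stands for ~N, False for N
    topic: str               # kept in its original form
    comment_roots: tuple     # root(Comment), or roots of its synonyms/antonyms
    remaining_roots: tuple   # root({remaining words})

def logical_forms(topic, comment, remaining, question_negated,
                  root, synonyms, antonyms):
    """Build the representations for a nominal question.

    root is a callable returning a word's root; synonyms and antonyms
    are dictionaries from a word to a list of words (assumed lookups).
    """
    rest = tuple(root(w) for w in remaining)
    forms = [Predicate(question_negated, topic, (root(comment),), rest)]
    syns = synonyms.get(comment, [])
    if syns:
        forms.append(Predicate(question_negated, topic,
                               tuple(root(s) for s in syns), rest))
    ants = antonyms.get(comment, [])
    if ants:
        # Antonyms carry the opposite polarity of the question.
        forms.append(Predicate(not question_negated, topic,
                               tuple(root(a) for a in ants), rest))
    return forms
```

For example, with an identity `root` function and English placeholder words, an affirmative question about whether the weather is hot yields N(hot), N(warm), and ~N(cold).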
29. Text Processing & Retrieval
There are 20 documents in the corpus. This module uses two techniques to retrieve the top 5 candidate paragraphs (of variable length) that are most relevant to the user question:
O Paragraphs technique: split the documents into their built-in paragraphs and retrieve the top 5 paragraphs, regardless of which document they come from, according to some indexing scheme.
O Documents technique: retrieve the top 5 documents after they are ranked, then use the first indexing scheme to retrieve the top 5 paragraphs.
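The difference between the two techniques can be sketched as follows. The slides do not specify the indexing scheme, so a plain count of matching query terms stands in for it here; that scoring function and all names in the block are assumptions.

```python
def score(text, terms):
    """Stand-in indexing scheme: count occurrences of the query terms."""
    words = text.split()
    return sum(words.count(t) for t in terms)

def paragraphs_technique(documents, terms, k=5):
    """Rank all paragraphs together, ignoring which document they come from."""
    paragraphs = [p for doc in documents for p in doc]
    return sorted(paragraphs, key=lambda p: score(p, terms), reverse=True)[:k]

def documents_technique(documents, terms, k=5):
    """Rank documents first, then rank paragraphs within the top k documents."""
    top_docs = sorted(documents,
                      key=lambda d: score(" ".join(d), terms),
                      reverse=True)[:k]
    return paragraphs_technique(top_docs, terms, k)
```

Here each document is a list of paragraph strings. With a 20-document corpus and k=5, the documents technique discards 15 documents before any paragraph is scored, so a strong paragraph inside a weakly ranked document can be missed, while the paragraphs technique would still find it.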
30. Answer Selection & Generation
After the 5 paragraphs are selected using the documents technique or the paragraphs technique, we need to select the best sentence to represent the answer and generate yes or no accordingly.
31. Answer Selection & Generation
O Split the paragraphs into their sentences.
O In nominal sentences we are interested in the exact topic ( ), not its root, so we omit each sentence that does not contain it in its original form.
O In verbal sentences we are interested in the exact subject ( ), not its root, so we omit each sentence that does not contain it in its original form.
32. Answer Selection & Generation
O In each resulting sentence, we look for the remaining terms (in root form) derived from the question's logical representation (everything except the subject or the topic). If they exist, we assign them indexes according to their position in the sentence, so each sentence has its own rank, as follows:
Rank = last occurrence - first occurrence
O Look for negation particles ( ) in the selected answer (if any exist).
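The rank formula can be sketched as below, assuming each sentence has already been reduced to a list of roots. The function name and the choice to return `None` when no question term occurs are assumptions; the slides also do not say whether a smaller or larger rank is preferred, so the sketch only computes the value.

```python
def sentence_rank(sentence_roots, question_roots):
    """Rank = index of last matching term - index of first matching term."""
    positions = [i for i, r in enumerate(sentence_roots) if r in question_roots]
    if not positions:
        return None  # no remaining question term occurs in this sentence
    return positions[-1] - positions[0]
```

A sentence in which the question terms appear at positions 0 and 4 gets rank 4, and a sentence containing only one matching term gets rank 0.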
33. Answer Selection & Generation
O Use the selected answer and the logical representation of the question to generate yes or no, as follows:
1. Yes, if the question and the answer are both affirmative, or both negated.
2. No, if the question is affirmative and the answer is negated, or the question is negated and the answer is affirmative.
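These two rules amount to comparing the polarity of the question with the polarity of the selected answer. A minimal sketch, with a hypothetical function name and boolean flags standing in for the negation properties:

```python
def generate_answer(question_negated: bool, answer_negated: bool) -> str:
    """Yes when question and answer share the same polarity, otherwise No."""
    return "Yes" if question_negated == answer_negated else "No"
```

For instance, an affirmative question matched by a negated answer sentence gives `generate_answer(False, True)`, which returns "No".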
35. Conclusion
O We have described the generic architecture of Arabic question answering (AQA) systems.
O We compared different systems.
O We showed how each system processes the question and gives the answers.