A Framework For Automatic Question Answering in Indian Languages

Ritwik Mishra (PhD19013)
27th July 2022

Comprehensive Panel Members
● Dr Rajiv Ratn Shah, PhD advisor, IIIT-Delhi
● Prof Ponnurangam Kumaraguru, PhD advisor, IIIT-Hyderabad
● Prof Radhika Mamidi, External expert, IIIT-Hyderabad
● Dr Arun Balaji Buduru, Internal expert, IIIT-Delhi
● Dr Raghava Mutharaju, Internal expert, IIIT-Delhi

What is Automatic Question Answering (QnA)?
Human/user asks a question, and a computer produces an answer.

Outline
● Motivation
● Thesis Statement
● Contributions
○ FAQ retrieval
○ Open Information Extraction
○ Better Contextual Embeddings
● Future Plans
○ Pronoun Resolution
○ Long-context Question Answering

Thesis Overview
[Diagram: Our Work covers Question Answering, Open Information Extraction (OIE), Better Contextual Embeddings, and Pronoun Resolution]

Why languages?
● Human intelligence is largely attributed to our ability to express complex ideas through language.
● With the advent of the text modality (i.e. writing), our ability to store and spread information increased dramatically.

Why Indian Languages?
● Despite being spoken by more than a billion people, Indic languages are considered low-resource due to the lack of annotated resources.
“Hindi is... an underdog. It has enough resources and everything looks promising for Hindi.”
- Dr. Monojit Choudhury, MSR India, ML India Podcast, Aug 2020

Cases where Indic-QnA works well

Thesis Statement
● Explore the possibility of developing a framework for Automated Question-Answering (QnA) in Indian languages with the help of supporting tasks such as Open Information Extraction (OIE) and Pronoun Resolution, together with improved contextual embeddings.

Thesis Statement (visualized)
[Diagram components: Question-Answering, Retrieval-based, Open Information Extraction, Pronoun resolution, Chunking, Improve Contextual Embeddings]

Question-Answering

Types of QnA
● Domain: Open-Domain, Closed-Domain
● Context: Short, Long, No
● Answer Type: Span-based (MRC), Sentence-based (AS2)
● Discourse: Conversational, Memory-less

Methodologies in QnA
● Rule based approach
● Generative approach
● Retrieval based approach

Retrieval-based

FAQ retrieval based QnA
User query (q) → Top-k Question-Answer pairs (Q_i, A_i), where Q_i is most similar to the user query (q)

FAQ retrieval based QnA (internal view)
User query (q) + FAQ database (#, Q, A) → Sentence Similarity Calculator (SSC) → FAQ database with each row having its similarity score with q (#, Q, A, score)
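The internal view above can be sketched in a few lines of Python. This is only an illustrative sketch: the bag-of-words cosine similarity stands in for the SSC, and the function names (`cosine_similarity`, `top_k`) and the toy FAQ rows are assumptions made for this example, not the system described in the slides.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two whitespace-tokenized strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, faq, k=3, ssc=cosine_similarity):
    """Score every FAQ row against the query and return the k best (score, Q, A) rows."""
    scored = [(ssc(query, question), question, answer) for question, answer in faq]
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:k]


# Hypothetical toy FAQ database of (Q, A) rows; answers are placeholders.
faq = [
    ("what to do if the baby has a fever", "A1"),
    ("what to do if the baby navel is swollen", "A2"),
    ("how often should a newborn be fed", "A3"),
]
results = top_k("baby navel is swollen what to do", faq, k=2)
for score, question, answer in results:
    print(f"{score:.2f}  {question}  ->  {answer}")
```

Passing the similarity function as the `ssc` parameter keeps the ranking step independent of how similarity is computed, so a different scorer could be swapped in without touching `top_k`.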

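Example
User query (q) : बच्चे की नाभि फूली हो है ठीक करने के लिए क्या क्या करे?
(The baby's navel is swollen, what should be done to fix it?)
Chatbot suggested:
Q1) बच्चे की अचानक नाभि फूल जाए तो क्या करें?
(What should be done if the baby's navel suddenly swells?)
A1) बच्चे की नाभि अचानक नहीं फूलती आमतौर पर यह देखा गया है कि जब बच्चे का पेट थोड़ा फूलता है …
(A baby's navel does not swell suddenly; it is usually seen that when the baby's stomach swells a little …)
Q2) अगर 5 महिने क बच्चे की नाभि फूली हुई हो तो क्या करे?
(What should be done if a 5-month-old baby's navel is swollen?)
A2) 5 महीने के बच्चे की नाभि आम तोर पर मसल्स की कमजोरी के कारण होती है इस में कोई चिंता ….
(In a 5-month-old baby this is usually due to muscle weakness; there is no worry in this ….)
Q3) नवजात बच्चे की फूली हुए नाभि होती है वह जब ठीक हो जाए तो क्या उसके बढ़ उसे गैस की प्रॉब्लम हो जाती है?
(A newborn has a swollen navel; once it heals, does the baby get a gas problem afterwards?)
A3) यह एक बह्रम है बच्चे की नाभि जब फुल्ली होती है तो वह खतरे के लक्षण नहीं है फुल्ली हुई नाभि …
(This is a misconception; when a baby's navel is swollen it is not a sign of danger; a swollen navel …)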

How to evaluate?
[Figure: a user deciding the relevancy of the top-k Question-Answer pairs (QA) suggested by the chatbot for 4 different queries. Panels: Top-k suggestions for q1, Top-k suggestions for q2, Top-k suggestions for q3, Top-k suggestions for q4]

Evaluation Metrics
1. Success Rate (SR)
2. Precision at k (P@k)
3. Mean Average Precision (mAP)
4. Mean Reciprocal Rank (MRR)
5. normalized Discounted Cumulative Gain (nDCG)
Success Rate is the percentage of user queries for which the chatbot suggested at least one relevant QA pair among its top-k suggestions.

Evaluation techniques
Precision at k is the proportion of recommended items in the top-k set that are relevant.

Mean Average Precision (mAP):
For each user query (q), compute the precision up to each relevant item and average these values; then average the results over all queries.
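The ranking metrics listed above can be sketched as follows, given binary relevance judgements for the top-k suggestions of each query. This is a minimal sketch under stated assumptions: average precision and nDCG are normalized using only the items that appear in the top-k list, which may differ from the exact definitions used in the evaluation, and all function names and the toy judgements are made up for the example.

```python
import math


def success(rels):
    """1 if at least one of the top-k suggestions is relevant, else 0."""
    return int(any(rels))


def precision_at_k(rels, k):
    """Fraction of the first k suggestions that are relevant."""
    return sum(rels[:k]) / k


def average_precision(rels):
    """Mean of the precision values measured at each relevant position."""
    hits, precisions = 0, []
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def reciprocal_rank(rels):
    """1 / rank of the first relevant suggestion, or 0 if there is none."""
    for rank, rel in enumerate(rels, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def ndcg(rels):
    """DCG of the list divided by the DCG of its ideal reordering."""
    def dcg(values):
        return sum(v / math.log2(rank + 1) for rank, v in enumerate(values, start=1))
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal else 0.0


def mean(values):
    return sum(values) / len(values)


# Hypothetical relevance judgements (1 = relevant) for the top-3 suggestions of 4 queries.
judgements = [[0, 1, 1], [1, 0, 0], [0, 0, 0], [0, 0, 1]]
sr = mean([success(r) for r in judgements])
map_ = mean([average_precision(r) for r in judgements])
mrr = mean([reciprocal_rank(r) for r in judgements])
print(sr, map_, mrr)
```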

5. normalized Discounted Cumulative Gain (nDCG)

Selected Literature
● Daniel, Jeanne E., et al. "Towards automating healthcare question answering in a noisy multilingual low-resource setting."
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
○ Non-public healthcare dataset in African languages
○ 230K QA pairs → 150K Qs with 126 As
○ Hence 126 clusters of Qs
○ Predict a cluster for q
● Bhagat, Pranav, Sachin Kumar Prajapati, and Aaditeshwar Seth. "Initial Lessons from Building an IVR-based Automated
Question-Answering System." Proceedings of the 2020 International Conference on Information and Communication
Technologies and Development. 2020.
○ Transcription based
○ 516 Qs with 90 As
○ User had to specify broad category of q
● Sakata, Wataru, et al. "FAQ retrieval using query-question similarity and BERT-based query-answer relevance." Proceedings of
the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019.
○ Used the StackExchange dataset (719 QA and 1K q) and a scraped localGovFAQ dataset (1.7K QA and 900 q).
○ Q-q similarity works better than QA-q or A-q similarity.

Research Gap: There is a need to propose automated solutions that provide healthcare-related information to end-users in remote places of India.

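The proposed FAQ-chatbot
[Flowchart, reconstructed from the slide labels]
1. The user sends a query (q) to the chatbot.
2. The chatbot finds the top-k most similar QA pairs for the given q in the FAQ database.
3. The chatbot asks the user whether a suggested Q is similar to q.
○ Yes: the chatbot responds with the corresponding A.
○ No: if count < k, the chatbot asks about the next suggestion; otherwise it responds that q cannot be answered.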
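One reading of the chatbot flow is a loop over the ranked suggestions that stops at the first one the user accepts. The sketch below assumes that reading; `run_chatbot` and the `user_confirms` callback are hypothetical names introduced for illustration, and the actual dialogue logic of the deployed chatbot may differ.

```python
def run_chatbot(candidates, user_confirms, k=3):
    """Walk through the top-k (Q, A) candidates until the user accepts one.

    candidates:    ranked list of (Q, A) pairs retrieved for the query q
    user_confirms: callable that takes Q and returns True if the user
                   says that Q is similar to q
    """
    count = 0
    while count < k and count < len(candidates):
        question, answer = candidates[count]
        if user_confirms(question):
            return answer            # "Yes" branch: respond with A
        count += 1                   # "No" branch: move to the next suggestion
    return "q cannot be answered"    # every suggestion was rejected


# Usage with hypothetical candidates: the user rejects Q1 and accepts Q2.
reply = run_chatbot([("Q1", "A1"), ("Q2", "A2")], lambda q: q == "Q2")
print(reply)
```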

Different SSC techniques
Sentence Similarity Calculator (SSC)
● Training NOT needed
○ Dependency Tree Pruning (DTP) method
○ Cosine Similarity (COS) method
● Training needed
○ Sentence-pair Classifier (SPC) method

Evaluation data
● On-field testing was not feasible due to COVID.
● We collected 336 queries from healthcare workers.
● We manually annotated each of them and found completely/partially matching Qs.
[Figure: 336 healthcare-worker queries (q) → for each query (q) we manually found similar questions (Qs) from the FAQ database → hold-out test-set (270)]

Results 1/2
Metric  DTP   DTPq-e  SPC   SPC+A  SPCq-e  COS   COSq-e  ℇ
mAP     30.5  35.1    39.4  21.1   39.1    26.5  27.9    45.3
MRR     42.6  48.5    54.6  42.2   54.2    38.7  41.0    61.6
SR      27.1  59.6    66.2  49.6   64.4    47.7  51.1    70.3
nDCG    55.5  51.2    57.1  43.9   56.5    40.8  43.3    62.5
P@3     27.1  30.0    34.6  34.6   34.6    22.7  23.9    34.6
Table 1. Comparison of all three primary approaches (DTP, SPC, COS) on the hold-out test set for top-3 suggestions. Ensemble (ℇ) is obtained by taking the best performing models, highlighted with underlined text, from each of the primary approaches.

Results 2/2
Metric  ℇ-COS  ℇ-DTP  ℇ-SPC  ℇ
mAP     40.9   40.8   30.3   45.3
MRR     56.2   56.5   43.8   61.6
SR      66.2   66.2   51.1   70.3
nDCG    58.4   58.2   45.5   62.5
P@3     34.6   34.6   23.9   34.6
Table 2. Results of ablation study on the Ensemble method (ℇ). We observed that each approach plays an important role in the performance of the developed chatbot.

Ensemble technique ablation
Metric  ℇsus  ℇns   ℇqfm  ℇ
mAP     45.2  45.7  44.8  45.3
MRR     61.4  62.0  60.6  61.6
SR      71.8  70.0  67.7  70.3
nDCG    62.4  62.7  60.9  62.5
P@3     34.6  34.6  34.6  34.6

Performance depends on diversity of database as well
[Figure: Performance of ensemble models on test-sets with different levels of pruning.] The label ℇp2 #152 represents the ensemble results on a hold-out test-set that is pruned by removing all user queries that have 2 or fewer relevant questions in the FAQ database. The resulting pruned test-set (p2 set) has 152 user queries.

Limitations
● The retrieval-based approaches are limited to extracting information from an indexed database, which has to be curated manually.
● In order to retrieve information from a text dump of raw sentences, a knowledge graph has to be constructed from the unstructured text.
● OIE tools could be used to extract facts from raw sentences in different domains.

Open Information Extraction

Open Information Extraction (OIE)
● Extract facts from raw sentences.
○ Use triples to represent the facts.
○ Format of triples is <head, relation, tail>.
● Example
○ PM Modi to visit UAE in Jan marking 50 yrs of diplomatic ties.
■ One possible meaningful triple would be:
<PM Modi, to visit, UAE>

Why is OIE important?
● It can extract triples from large amounts of text in an unsupervised manner.
● It serves as an initial step in building or augmenting a knowledge graph out of unstructured text.
● OIE tools have been used in downstream applications like Question-Answering, Text Summarization, and Entity Linking.
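To make the link between triples and a knowledge graph concrete, the sketch below stores <head, relation, tail> triples and indexes them by head entity. It is an illustrative sketch only: the `Triple` type and `build_graph` helper are names assumed for this example, and the second triple is a hypothetical addition that does not come from the slides.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


def build_graph(triples):
    """Index triples by head: each head maps to a list of (relation, tail) edges."""
    graph = defaultdict(list)
    for triple in triples:
        graph[triple.head].append((triple.relation, triple.tail))
    return dict(graph)


triples = [
    Triple("PM Modi", "to visit", "UAE"),       # triple from the example sentence
    Triple("PM Modi", "to visit in", "Jan"),    # hypothetical second triple
]
graph = build_graph(triples)
print(graph["PM Modi"])
```

A question such as "Where will PM Modi visit?" could then be approached by looking up the edges of the head entity and matching the relation, which is one way extracted triples can support question answering.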

OIE in Indian languages needs special attention
● Take an English sentence
○ Ram ate an apple
○ A sensible triple would be
■ <Ram, ate, an apple>
○ A nonsensical triple would be
■ <an apple, ate, Ram>
● Now look at a Hindi sentence (same meaning)
○ राम ने सेब खाया
○ Both these triples would be sensible, because the case marker ने identifies राम as the subject regardless of word order
■ <राम ने, खाया, सेब>
■ <सेब, खाया, राम ने>

How are generated triples evaluated?
● Manual Annotations
○ Criteria: Valid, Well-formed, Reasonable, Concrete, True
○ Tabular representation
○ Image representation

● Automatic Evaluations

sent_id:1 वह ऑस्ट्रेलिया के पहले प्रधान मंत्री के रूप में कार्यरत थे और ऑस्ट्रेलिया के उच्च न्यायालय के संस्थापक न्यायाधीश बने ।
(He served as the first Prime Minister of Australia and became a founding justice of the High Court of Australia .)
------ Cluster 1 ------
वह --> कार्यरत थे --> [ऑस्ट्रेलिया के]{a} [पहले] प्रधान मंत्री के रूप में
(He --> served --> as [the] [first] Prime Minister [of Australia]{a})
वह --> बने --> [ऑस्ट्रेलिया के उच्च न्यायालय के]{b} [संस्थापक] न्यायाधीश
(He --> became --> [founding] justice [of the High Court of Australia]{b})
{a} ऑस्ट्रेलिया के --> property --> [पहले] प्रधान मंत्री के रूप में |OR| वह --> [पहले] प्रधान मंत्री के रूप में कार्यरत थे --> ऑस्ट्रेलिया के
({a} of Australia --> property --> as [first] Prime Minister |OR| He --> served as [the] [first] Prime Minister --> of Australia)
{b} [ऑस्ट्रेलिया के]{c} उच्च न्यायालय के --> property --> [संस्थापक] न्यायाधीश |OR| वह --> [संस्थापक] न्यायाधीश बने --> [ऑस्ट्रेलिया के]{c} उच्च न्यायालय के
({b} of High Court [of Australia]{c} --> property --> [founding] justice |OR| He --> became [founding] justice --> of High Court [of Australia]{c})
{c} ऑस्ट्रेलिया के --> property --> उच्च न्यायालय के
({c} of Australia --> property --> of High Court)
------ Cluster 2 ------
वह --> [पहले] प्रधान मंत्री के रूप में कार्यरत थे --> ऑस्ट्रेलिया के
(He --> served as [the] [first] Prime Minister --> of Australia)
वह --> [संस्थापक] न्यायाधीश बने --> [ऑस्ट्रेलिया के]{a} उच्च न्यायालय के
(He --> became [founding] justice --> of High Court [of Australia]{a})
{a} ऑस्ट्रेलिया के --> property --> उच्च न्यायालय के
({a} of Australia --> property --> of High Court)
Hindi-BenchIE

Selected Literature
● Faruqui, Manaal, and Shankar Kumar. "Multilingual Open Relation Extraction Using Cross-lingual Projection." Proceedings of
the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language
Technologies. 2015.
○ Supported Hindi but it was translation based
○ Didn’t release code but released triples
● Rao, Pattabhi RK, and Sobha Lalitha Devi. "EventXtract-IL: Event Extraction from Newswires and Social Media Text in Indian
Languages@ FIRE 2018-An Overview." FIRE (Working Notes) (2018): 282-290.
○ Event specific and domain specific IE
● Ro, Youngbin, Yukyung Lee, and Pilsung Kang. "Multi^2OIE: Multilingual Open Information Extraction Based on Multi-Head
Attention with BERT." Findings of the Association for Computational Linguistics: EMNLP 2020. 2020.
○ Modelled triple extraction as sequence labelling problem through BERT embeddings (end-to-end)
○ Identify relations, and then their head-tail

Research Gap:
(1) The field of Open Information Extraction (OIE) from unstructured text in Indic languages has not been explored much. Moreover, the effectiveness of existing multilingual OIE techniques has to be evaluated on Indic languages.
(2) There is a scarcity of annotated resources for automatic evaluation of automatically generated triples for Indic languages.
(3) Construction of a knowledge graph using triples extracted from unstructured text in Indian languages needs to be explored as well.

IndIE: our Indic OIE tool
Input: Raw Text
Components:
● Off-the-shelf sentence segmentation and dependency parser by Stanford
● (a) XLM-RoBERTa fine-tuned on chunk-annotated data
● (b) Creating the Merged-phrases Dependency Tree (MDT)
● (c) Triple extraction: greedy algorithm based on hand-crafted rules
Numbered flow in the figure: 1: Shallow parsing, 2: Sentences, 3: Chunk tags and Dependency tree, 4: Passing MDT, 5: Result accumulation
Output: Sentence segmented text, Triples for each sentence, Execution Time

Chunker

Chunker Implementation 1/2
Tokens → Transformer Tokenizer → Token IDs → Pretrained Transformer → Embeddings

Chunker Implementation 2/2
Tokens → Transformer Tokenizer → Token IDs → Pretrained Transformer → (a) Initial Embeddings → (b) Taking average → (c) Final Embeddings → Feed-forward Layer → Token-level predictions
Loss is calculated between the token-level predictions and the Ground-Truth.
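Step (b) collapses the embeddings of the sub-word pieces of a word into a single vector. The sketch below illustrates averaging, along with the first and last sub-word alternatives, on plain Python lists. It is an illustration only: the `pool_subwords` helper, its `word_ids` input (the word index of each sub-word token) and the toy vectors are assumptions for this example and not the chunker's actual code.

```python
def pool_subwords(embeddings, word_ids, strategy="average"):
    """Collapse sub-word embeddings into one vector per word.

    embeddings: list of vectors, one per sub-word token
    word_ids:   word index of each sub-word token (same length as embeddings)
    strategy:   "first", "last" or "average"
    """
    groups = {}
    for vector, word_id in zip(embeddings, word_ids):
        groups.setdefault(word_id, []).append(vector)

    pooled = []
    for word_id in sorted(groups):
        vectors = groups[word_id]
        if strategy == "first":
            pooled.append(list(vectors[0]))
        elif strategy == "last":
            pooled.append(list(vectors[-1]))
        elif strategy == "average":
            pooled.append([sum(col) / len(vectors) for col in zip(*vectors)])
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return pooled


# Hypothetical 2-dimensional embeddings: word 0 has two sub-words, word 1 has one.
embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
word_ids = [0, 0, 1]
averaged = pool_subwords(embeddings, word_ids, "average")
print(averaged)
```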

Results: Chunker
Classification Layers   First sub-word token embedding   Last sub-word token embedding   Average embedding of all sub-word tokens
1                       82±10 (50±20)                    89±0.5 (62±1.0)                 91±0.0 (65±0.5)
2                       86±1.8 (51±6.2)                  89±0.5 (54±7.4)                 90±0.5 (54±4.5)
3                       79±14 (43±13)                    82±11 (41±12)                   90±0.5 (48±2.2)
Table 3. A comparison of three approaches for handling sub-word token embeddings in the chunking task. Four different random seeds were used to calculate the mean and standard deviation for the given samples. All the experiments were run on the combined data of TDIL and UD. The numbers written outside round brackets represent the accuracy, whereas numbers inside round brackets represent the macro average.
Model   Hindi   English   Urdu   Nepali   Gujarati   Bengali
XLM     78%     60%       84%    65%      56%        66%
CRF     67%     56%       71%    58%      53%        53%
Table 4. A comparison of the (fine-tuned) XLM chunker and the CRF chunker on the languages which are removed from the training-set. The numbers represent the accuracy obtained by each model when sentences from the given language are used only in the test-set.

Results: Triple Evaluation (manual)
Image options ArgOE M&K Multi2OIE PredPatt IndIE
No information 17% 28% 71% 5% 4%
Most Information 22% 44% 24% 29% 17%
All information 61% 28% 5% 66% 79%
Table 5. Percentage of sentences having no/most/all information in the image representation of their generated triples. The method which
generates maximum triples with ‘All information’ is considered the best method.
#Triple ArgOE M&K Multi2OIE PredPatt IndIE
Total 45 180 50 40 272
Valid 38 142 10 39 252
Well-formed 32 75 9 36 240
Reasonable 26 69 6 32 227
Concrete 21 53 4 25 158
True 19 51 4 25 152
Table 6. Number of triples extracted by each OIE method for 106 Hindi sentences.

Results: Triple Evaluation (automatic)
ArgOE M&K Multi2OIE PredPatt IndIE
Precision 0.26 0.14 0.12 0.37 0.62
Recall 0.06 0.16 0.03 0.08 0.62
F1-score 0.10 0.15 0.05 0.14 0.62
Table 7. Performance of different OIE methods on the Hindi-BenchIE golden set consisting of 75 Hindi sentences. It is observed that IndIE outperforms other methods on all three metrics.
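The F1-score row is the harmonic mean of the precision and recall rows. The short sketch below recomputes it from the rounded table values; small differences from the reported F1 are possible because the inputs are already rounded. How precision and recall themselves are computed against the golden set is not covered here, and `f1_score` is a name chosen for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from Table 7.
table7 = {
    "ArgOE": (0.26, 0.06),
    "M&K": (0.14, 0.16),
    "Multi2OIE": (0.12, 0.03),
    "PredPatt": (0.37, 0.08),
    "IndIE": (0.62, 0.62),
}
f1 = {method: f1_score(p, r) for method, (p, r) in table7.items()}
for method, value in f1.items():
    print(f"{method}: {value:.3f}")
```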

Limitations
● The evaluation datasets are small.
● The contextual embeddings (from transformers) are known not to take the syntactic structure of a sentence into account.
● As a result, it has been observed that syntactic information is either absent in the transformer embeddings or is not utilized while making the predictions.
● Therefore, there is a need to explore the possibility of incorporating syntactic (dependency) information in transformer embeddings.


Dependency-aware Transformer (DaT) Embedding
Tokens → Transformer Tokenizer → Token IDs → Pretrained Transformer → (a) Initial Embeddings → (b) Taking average → (c) Final Embeddings

DaT Embedding: Graph
[Graph: node 1 is connected to nodes 2 and 3; node 2 is connected to nodes 4 and 5]
Adjacency Matrix A
    1 2 3 4 5
1   0 1 1 0 0
2   1 0 0 1 1
3   1 0 0 0 0
4   0 1 0 0 0
5   0 1 0 0 0
Degree Matrix D
    1 2 3 4 5
1   2 0 0 0 0
2   0 3 0 0 0
3   0 0 1 0 0
4   0 0 0 1 0
5   0 0 0 0 1
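The two matrices can be derived from the edges of the example graph. The sketch below assumes the undirected tree suggested by the figure (node 1 linked to 2 and 3, node 2 linked to 4 and 5); the row sums of the symmetric adjacency matrix then give the diagonal of the degree matrix, diag(2, 3, 1, 1, 1).

```python
# Undirected edges of the 5-node example graph (1-based node labels, as in the slide).
edges = [(1, 2), (1, 3), (2, 4), (2, 5)]
n = 5

# Symmetric adjacency matrix: an edge (u, v) sets both A[u][v] and A[v][u].
A = [[0] * n for _ in range(n)]
for u, v in edges:
    A[u - 1][v - 1] = 1
    A[v - 1][u - 1] = 1

# Degree matrix: diagonal matrix holding the row sums of A.
D = [[sum(A[i]) if i == j else 0 for j in range(n)] for i in range(n)]

for row in A:
    print(row)
print([D[i][i] for i in range(n)])
```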

DaT Embedding: Graph Convolution Network (GCN)

DaT Embedding: GCN message passing
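As a rough illustration of message passing, the sketch below applies one widely used GCN propagation step, ReLU(D̂^(-1/2) (A + I) D̂^(-1/2) H W), to the 5-node example graph. This is an assumption made for illustration: the slides do not state which formulation the DaT variants use, and the node features `H`, the identity weight matrix `W` and all function names are hypothetical.

```python
import math


def normalize_adjacency(A):
    """Return D_hat^(-1/2) (A + I) D_hat^(-1/2), where D_hat holds the degrees of A + I."""
    n = len(A)
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_hat]
    return [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def gcn_layer(A, H, W):
    """One message-passing step: each node mixes its own and its neighbours' features."""
    Z = matmul(matmul(normalize_adjacency(A), H), W)
    return [[max(0.0, z) for z in row] for row in Z]  # ReLU


A = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [2.0, 0.0]]  # hypothetical node features
W = [[1.0, 0.0], [0.0, 1.0]]                                      # identity weights for clarity
H_next = gcn_layer(A, H, W)
for row in H_next:
    print([round(value, 3) for value in row])
```

With identity weights, the new feature of node 3 is a weighted mix of its own features and those of node 1, its only neighbour, which shows how information flows along dependency edges.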

Selected Literature
● Jie, Zhanming, Aldrian Obaja Muis, and Wei Lu. "Efficient dependency-guided named entity recognition." Thirty-First AAAI
Conference on Artificial Intelligence. 2017.
● Marcheggiani, Diego, and Ivan Titov. "Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling."
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017.
● Zhang, Yuhao, Peng Qi, and Christopher D. Manning. "Graph Convolution over Pruned Dependency Trees Improves Relation
Extraction." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.

Research Gap: Incorporating the dependency structure of a sentence in generating dependency-aware contextual embeddings is an under-explored field in Indic-NLP.

GCN variants
0) No GCN
1)
2.1)
2.2)
3.1)

DaT Embedding Results
Table 8. Validation Accuracy on the Small dataset
Variant  Epoch 1                   Epoch 2                   Epoch 3                   Epoch 4
v0       71.2 +/- 2.9 (67.7-74.6)  75.9 +/- 1.2 (74.1-77.5)  77.4 +/- 0.6 (76.8-78.1)  79.0 +/- 0.7 (78.2-79.8)
v1       72.4 +/- 1.9 (69.4-74.0)  76.6 +/- 1.2 (74.9-78.0)  78.8 +/- 1.0 (77.1-79.6)  80.3 +/- 1.1 (79.3-81.8)
v2.2     74.2 +/- 1.1 (72.5-75.3)  77.2 +/- 0.9 (75.9-78.3)  79.3 +/- 0.8 (78.4-80.2)  80.5 +/- 0.7 (79.5-81.2)
Table 9. Testing Accuracy on the Small dataset
Variant  Epoch 1                   Epoch 2                   Epoch 3                   Epoch 4
v0       71.2 +/- 2.5 (67.3-73.9)  76.2 +/- 1.2 (74.6-77.4)  77.9 +/- 0.6 (77.2-78.8)  79.8 +/- 0.6 (78.8-80.3)
v1       72.6 +/- 1.6 (70.2-74.0)  76.5 +/- 1.0 (75.5-77.8)  78.5 +/- 1.1 (76.9-79.7)  79.9 +/- 1.1 (79.0-81.8)
v2.2     74.2 +/- 1.3 (72.1-75.2)  76.6 +/- 1.0 (75.5-77.8)  79.0 +/- 1.0 (77.9-80.3)  80.0 +/- 1.2 (78.9-82.1)

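Limitation
[Figure: dependency trees of two consecutive sentences, "राम अयोध्या लौटे ।" and "वह राजा बने ।" (Ram returned to Ayodhya. He became king.), with relation labels nsubj, obl, punct, root and nsubj, xcomp, punct, root, alongside the token-level adjacency matrix over "राम अयोध्या लौटे । वह राजा बने । [PAD] [PAD]". Number of 1s per row: राम 2, अयोध्या 2, लौटे 4, । 2, वह 2, राजा 2, बने 4, । 2, [PAD] 1, [PAD] 1.]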
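The row counts in the matrix (2, 2, 4, 2, 2, 2, 4, 2, 1, 1) are consistent with an adjacency matrix that has a self-loop for every token and a symmetric edge between each token and its dependency head. The sketch below builds such a matrix under that assumption; the head indices are inferred from the two trees, and `batch_adjacency` is a hypothetical helper. Because each dependency tree covers only one sentence, no edge links वह (He) to राम (Ram).

```python
def batch_adjacency(sentence_heads, max_len):
    """Token-level adjacency with self-loops for consecutive sentences plus padding.

    sentence_heads: per sentence, a list of 0-based head indices (-1 marks the root)
    max_len:        padded sequence length
    """
    A = [[0] * max_len for _ in range(max_len)]
    offset = 0
    for heads in sentence_heads:
        for i, head in enumerate(heads):
            A[offset + i][offset + i] = 1           # self-loop
            if head >= 0:
                A[offset + i][offset + head] = 1    # token -> head
                A[offset + head][offset + i] = 1    # head -> token
        offset += len(heads)
    for i in range(offset, max_len):
        A[i][i] = 1                                 # [PAD] tokens only see themselves
    return A


# In both sentences the verb (index 2) is the root and heads the other three tokens.
heads = [[2, 2, -1, 2], [2, 2, -1, 2]]
A = batch_adjacency(heads, max_len=10)
row_sums = [sum(row) for row in A]
print(row_sums)
```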

Future Plans
● End-to-End Pronoun Resolution in Hindi
● Indic-SpanBERT
● Long-context Question Answering

Pronoun Resolution

Pronoun Resolution
● Every speaker had to present his paper.
● Barack Obama visited India. Modiji came to receive Obama.
● He was brave. He was Ashoka the great.
● If you want to hire them, then all candidates must be treated nicely.
● Ram went to forest with his wife
● Krishna was an avatar of Vishnu.
● CEO of Reliance, Mukesh Ambani, inaugurated the SOM building.

Dataset
Mujadia, Vandan, Palash Gupta, and Dipti Misra Sharma. "Coreference Annotation Scheme and Relation Types for Hindi."
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). 2016.
Table 10. Corpus Details
Table 11. Distribution of co-referential entities

Coref chain
[Figure: annotation format of a coreference chain, with the labels: Unique mention ID; 0: Intermediate token, 1: End token; Unique chain ID; They modify the mention; Source]

Selected Literature
● Lee, Kenton, et al. "End-to-end Neural Coreference Resolution." Proceedings of the 2017 Conference on Empirical Methods in
Natural Language Processing. 2017.
● Joshi, Mandar, et al. "Spanbert: Improving pre-training by representing and predicting spans." Transactions of the Association
for Computational Linguistics 8 (2020): 64-77.
● Dakwale, Praveen, Vandan Mujadia, and Dipti Misra Sharma. "A hybrid approach for anaphora resolution in hindi." Proceedings
of the Sixth International Joint Conference on Natural Language Processing. 2013.
● Devi, Sobha Lalitha, Vijay Sundar Ram, and Pattabhi RK Rao. "A generic anaphora resolution engine for Indian languages."
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014.
● Sikdar, Utpal Kumar, Asif Ekbal, and Sriparna Saha. "A generalized framework for anaphora resolution in Indian languages."
Knowledge-Based Systems 109 (2016): 147-159.

Research Gap:
(1) A publicly available tool for coreference resolution in Indic languages needs to be built using state-of-the-art coreference resolution techniques.
(2) Pretraining of transformer models using different subword tokenization methods and pretraining tasks needs to be investigated extensively.

Proposed Architecture: TRAINING PHASE
[Figure: PLAIN TEXT → BERT → embeddings → Mention detector, Pronoun detector, Pronoun resolver, with three losses L1, L2 and L3]
$\text{Loss} = -\log \dfrac{\exp(\cdot)}{\exp(\cdot) + \exp(\cdot) + \exp(\cdot)}$
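The loss on the training slide has the shape of a softmax cross-entropy: the negative log of one exponentiated score divided by the sum over all candidates. The sketch below computes that quantity for three hypothetical score lists and adds the three losses. The unweighted sum, the score values and the function names are assumptions for illustration; the slides do not specify how L1, L2 and L3 are combined.

```python
import math


def cross_entropy(logits, gold):
    """-log( exp(logits[gold]) / sum_j exp(logits[j]) ), computed in a numerically stable way."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[gold]


def total_loss(l1, l2, l3):
    """Assumption: the three task losses are simply added with equal weight."""
    return l1 + l2 + l3


# Hypothetical scores over three candidates for each of the three components.
l1 = cross_entropy([2.0, 0.5, -1.0], gold=0)
l2 = cross_entropy([0.1, 1.5, 0.3], gold=1)
l3 = cross_entropy([0.0, 0.0, 0.0], gold=2)
loss = total_loss(l1, l2, l3)
print(round(loss, 4))
```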

Proposed Architecture: INFERENCE PHASE
[Figure: PLAIN TEXT → BERT → embeddings → Softmax]

Future Plans
● Indic-SpanBERT
Ritwik: In terms of architectural choices, what are some of the promising directions for low-resource languages?
Sebastian Ruder (summarizing): There are three:
1. Language-specific tokenization
2. Model efficiency
3. New pretraining objectives

Future Plans
● Indic-SpanBERT
○ One multilingual dataset called XQA by Liu et al. 2019.
○ We need to explore the applicability of our aforementioned methodologies.

Timeline
[Timeline bar: Today to Dec 2023 (Defence), with phases 1 to 4]
1. Aug 2022 - Oct 2022
▣ Incorporate the internal reviews for the maternal chatbot, and submit the manuscript to a journal like ACM Health (average citation per article is 2) or IEEE Transactions on CSS (Impact score is 5).
▣ Finish the experimentation pipeline for DaT embeddings. Prepare the first draft.
▣ Data preprocessing for pronoun resolution.

Timeline
2. Nov 2022 - Jan 2023
▣ Incorporate internal reviews for the DaT embeddings paper, and submit the draft to ACL Rolling Review (ARR). Target AACL.
▣ Build the multi-task learning pipeline for pronoun resolution.
▣ Prepare data and dataloader for Indic-SpanBERT pretraining. Explore the usability of BERT and DistilBERT.

Timeline
3. Feb 2023 - Mar 2023
▣ Implement the baselines for pronoun resolution and design evaluation metrics.
▣ Explore the usability of OIE in long-context question answering.
▣ Use a combination of Answer Sentence Selection (AS2) and Machine Comprehension (MC) to solve long-context question answering.
▣ Evaluate Indic-SpanBERT on various downstream tasks in Indic languages. Document the work as a short paper.

Timeline
4. May 2023 - Aug 2023
▣ Prepare the draft for pronoun resolution and long-context question answering. Collect internal reviews. Modify and submit to EMNLP.
▣ Begin a survey on work done in Indic NLP.

Publications
1. Mishra, Ritwik, Simranjeet Singh, Rajiv Ratn Shah, Ponnurangam Kumaraguru, and Pushpak Bhattacharyya. "IndIE: A Multilingual Open Information Extraction Tool For Indic Languages". 2022. [Submitted to a special issue of TALLIP]
2. Mishra, Ritwik, Simranjeet Singh, Jasmeet Kaur, Rajiv Ratn Shah, and Pushpendra Singh. "Exploring the Use of Chatbots for Supporting Maternal and Child Health in Resource-constrained Environments". 2022. [Draft ready. Under internal review]
Miscellaneous
● Mishra, Ritwik, Ponnurangam Kumaraguru, Rajiv Ratn Shah, Aanshul Sadaria, Shashank Srikanth, Kanay Gupta, Himanshu Bhatia, and Pratik Jain. "Analyzing traffic violations through e-challan system in metropolitan cities (workshop paper)." In 2020 IEEE Sixth International Conference on Multimedia Big Data (BigMM), pp. 485-493. IEEE, 2020.

Acknowledgements
● I would like to express my gratitude to pillars group members for their valuable guidance.
● I would like to thank Simranjeet, Samarth, Ajeet, and Jasmeet for being diligent co-authors.
● I would also like to thank Prof Pushpak Bhattacharyya and the CFILT lab members for providing me great insights and hosting me at IIT Bombay under the Anveshan Setu program.
● I would also like to thank the University Grants Commission (UGC) for the Junior Research Fellowship (JRF) / Senior Research Fellowship (SRF) funding my PhD program.
