A Compare-Aggregate Model with Latent Clustering
for Answer Selection
Seunghyun Yoon1†, Franck Dernoncourt2, Doo Soon Kim2, Trung Bui2 and Kyomin Jung1
1Seoul National University, Seoul, Korea 2Adobe Research, San Jose, CA, USA
mysmilesh@snu.ac.kr
Research Problem
• We propose a novel method for a sentence-level answer-selection task.
• We explore the effect of additional information by adopting a pretrained language model to compute the vector representation of the input text and by applying transfer learning from a large-scale corpus.
• We enhance the compare-aggregate model by proposing a novel latent clustering method and by changing the objective function from listwise to pointwise.
Methodology
[Figure 1: architecture diagram. For both the question and answer sides, the figure shows a Language Model layer, Context Representation, Attention, element-wise Comparison, and CNN-based Aggregation, together with the Latent Clustering block (latent memory M_1 ... M_n, similarity computation, k-max-pool, softmax, and the resulting latent-cluster information), followed by a final softmax.]
Figure 1: The architecture of the model. The dotted box on the right shows the process through which the latent-cluster information is computed and added to the answer. The same process is applied to the question but is omitted from the figure. The latent memory is shared by both processes.
• Pretrained Language Model (LM): We select the ELMo language model and replace the previous word embedding layer with ELMo.
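As a rough illustration (not the exact configuration behind the poster), the sketch below shows how a pretrained ELMo encoder can stand in for a trainable word-embedding layer using the allennlp package; the option/weight file paths are placeholder assumptions.

```python
# Illustrative sketch, assuming the allennlp package and locally downloaded ELMo files;
# this is not the poster's exact setup.
from allennlp.modules.elmo import Elmo, batch_to_ids

OPTIONS_FILE = "elmo_options.json"   # assumption: path to an ELMo options file
WEIGHTS_FILE = "elmo_weights.hdf5"   # assumption: path to the matching weight file

# One mixed representation of the ELMo layers replaces the old word-embedding lookup.
elmo = Elmo(OPTIONS_FILE, WEIGHTS_FILE, num_output_representations=1, dropout=0.0)

sentences = [["who", "wrote", "hamlet", "?"],
             ["hamlet", "was", "written", "by", "shakespeare", "."]]
character_ids = batch_to_ids(sentences)                        # (batch, max_len, 50)
embeddings = elmo(character_ids)["elmo_representations"][0]    # (batch, max_len, dim)
```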
• Latent Clustering (LC): We assume that extracting the latent-cluster information of the text and using it as auxiliary information helps the neural network model analyze the corpus (a sketch of this module follows Eq. (2)).
p_{1:n} = \mathbf{s}^{\top} \mathbf{W} \mathbf{M}_{1:n},
\bar{p}_{1:k} = \text{k-max-pool}(p_{1:n}),
\alpha_{1:k} = \text{softmax}(\bar{p}_{1:k}),
\mathbf{M}_{\text{LC}} = \sum_{k} \alpha_{k} \cdot \mathbf{M}_{k}.    (1)
\mathbf{M}^{Q}_{\text{LC}} = f\big(\tfrac{1}{n}\sum_{i} \mathbf{q}_{i}\big), \quad \mathbf{q}_{i} \subset Q_{1:n},
\mathbf{M}^{A}_{\text{LC}} = f\big(\tfrac{1}{m}\sum_{i} \mathbf{a}_{i}\big), \quad \mathbf{a}_{i} \subset A_{1:m},
\mathbf{C}^{Q}_{\text{new}} = [\mathbf{C}^{Q};\, \mathbf{M}^{Q}_{\text{LC}}], \quad \mathbf{C}^{A}_{\text{new}} = [\mathbf{C}^{A};\, \mathbf{M}^{A}_{\text{LC}}].    (2)
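As a concrete reading of Eqs. (1)-(2), here is a minimal PyTorch sketch of the latent-clustering module: a mean-pooled sentence vector is compared against the shared latent memories, the k largest similarity scores are softmax-normalized, and the resulting convex combination of memories is concatenated to the comparison features. The dimensions, k, and the number of memories are illustrative assumptions, not the poster's exact hyperparameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentClustering(nn.Module):
    """Sketch of Eqs. (1)-(2): compare a sentence vector with n latent memories,
    k-max-pool the similarities, and return a softmax-weighted sum of the memories."""

    def __init__(self, dim: int, n_memories: int = 8, k: int = 4):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(n_memories, dim))   # M_{1:n}
        self.W = nn.Linear(dim, dim, bias=False)                   # W in p = s^T W M
        self.k = k

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, dim); s = mean over tokens, as in Eq. (2).
        s = token_states.mean(dim=1)                               # (batch, dim)
        p = self.W(s) @ self.memory.t()                            # (batch, n_memories)
        top_p, top_idx = p.topk(self.k, dim=-1)                    # k-max-pool
        alpha = F.softmax(top_p, dim=-1)                           # (batch, k)
        top_mem = self.memory[top_idx]                             # (batch, k, dim)
        return (alpha.unsqueeze(-1) * top_mem).sum(dim=1)          # M_LC: (batch, dim)

# Usage: append the cluster vector to every position of the comparison features,
# i.e. C_new = [C ; M_LC] from Eq. (2).
lc = LatentClustering(dim=100)
comparison = torch.randn(2, 20, 100)                               # (batch, seq_len, dim)
m_lc = lc(comparison).unsqueeze(1).expand(-1, comparison.size(1), -1)
c_new = torch.cat([comparison, m_lc], dim=-1)                      # (batch, seq_len, 2*dim)
```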
• Transfer Learning (TL): To observe its efficacy on a large dataset, we apply transfer learning using the question-answering NLI (QNLI) corpus.
• Point-wise Learning: Previous research adopts a listwise learning approach with a KL-divergence loss. In contrast, we pair each answer candidate with the question and train the model with a cross-entropy loss (a sketch contrasting the two objectives follows Eq. (4)).
\text{score}_{i} = \text{model}(Q, A_{i}),
S = \text{softmax}([\text{score}_{1}, \ldots, \text{score}_{i}]),
\text{loss} = \sum_{n=1}^{N} \mathrm{KL}(S_{n}\,\|\,y_{n}).    (3)
\text{loss} = -\sum_{n=1}^{N} y_{n} \log(\text{score}_{n}).    (4)
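To make the contrast in Eqs. (3)-(4) concrete, the following PyTorch sketch (with made-up shapes and labels) computes a listwise loss over all candidates of one question and a pointwise loss over independent question-answer pairs.

```python
import torch
import torch.nn.functional as F

# Listwise, Eq. (3): all candidates of one question are scored and normalized jointly.
scores = torch.randn(1, 5)                          # model(Q, A_i) for 5 candidates
labels = torch.tensor([[0., 1., 0., 0., 1.]])       # ground-truth relevance
log_S = F.log_softmax(scores, dim=-1)
y = labels / labels.sum(dim=-1, keepdim=True)       # target distribution over candidates
listwise_loss = F.kl_div(log_S, y, reduction="batchmean")

# Pointwise, Eq. (4): each (question, answer) pair is an independent binary decision.
pair_logits = torch.randn(5)                        # one score per (Q, A_i) pair
pair_labels = torch.tensor([0., 1., 0., 0., 1.])
pointwise_loss = F.binary_cross_entropy_with_logits(pair_logits, pair_labels)
```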
Datasets
• WikiQA (Yang et al. 2015) is an answer-selection QA dataset constructed from real Bing queries and Wikipedia.
• TREC-QA (Wang et al. 2007) is an answer-selection QA dataset created from the TREC Question-Answering tracks.
• QNLI (Wang et al. 2018) is a modified version of the SQuAD dataset that allows for sentence-selection QA.
Dataset     Listwise pairs             Pointwise pairs            Answer candidates / question
            train    dev     test      train    dev     test
WikiQA      873      126     143       8.6k     1.1k    2.3k      9.9
TREC-QA     1.2k     65      68        53k      1.1k    1.4k      43.5
QNLI        86k      10k     -         428k     169k    -         5.0
Table 1: Properties of the datasets.
Experimental Results
We regard all tasks as selecting the answer sentences relevant to a given question. Following previous studies, we report model performance as mean average precision (MAP) and mean reciprocal rank (MRR).
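For reference, MAP and MRR can be computed from per-question relevance labels sorted by the model's score; the short sketch below follows the standard definitions and is not tied to any particular implementation from the poster.

```python
from typing import List

def average_precision(rels: List[int]) -> float:
    """AP for one question: 0/1 relevance labels ordered by descending model score."""
    hits, precision_sum = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def reciprocal_rank(rels: List[int]) -> float:
    """RR for one question: inverse rank of the first relevant candidate."""
    for rank, rel in enumerate(rels, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

# MAP / MRR are means over questions (toy example with three questions).
questions = [[0, 1, 0, 1], [1, 0, 0], [0, 0, 1]]
MAP = sum(average_precision(q) for q in questions) / len(questions)
MRR = sum(reciprocal_rank(q) for q in questions) / len(questions)
print(f"MAP={MAP:.3f}  MRR={MRR:.3f}")
```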
# Clusters    Listwise Learning         Pointwise Learning
              MAP       MRR             MAP       MRR
1             0.822     0.819           0.842     0.841
4             0.839     0.840           0.846     0.845
8             0.841     0.842           0.846     0.846
16            0.840     0.842           0.847     0.846
Table 2: Model (Comp-Clip + LM + LC) performance on the QNLI corpus with varying numbers of clusters.
Conclusions
• Achieve state-of-the-art performance on the WikiQA and TREC-QA datasets.
• Show that leveraging a large amount of data is crucial for capturing the contextual representation of the input text.
• Show that the proposed latent clustering method with a pointwise objective function significantly improves model performance.
Model                                                  WikiQA                                TREC-QA
                                                       MAP              MRR                  MAP              MRR
                                                       dev     test     dev     test         dev     test     dev     test
Compare-Aggregate (Wang et al. 2017) [1]               0.743*  0.699*   0.754*  0.708*       -       -        -       -
Comp-Clip (Bian et al. 2017) [2]                       0.732*  0.718*   0.738*  0.732*       -       0.821    -       0.899
IWAN (Shen et al. 2017) [3]                            0.738*  0.692*   0.749*  0.705*       -       0.822    -       0.899
IWAN + sCARNN (Tran et al. 2018) [4]                   0.719*  0.716*   0.729*  0.722*       -       0.829    -       0.875
MCAN (Tay et al. 2018) [5]                             -       -        -       -            -       0.838    -       0.904
Question Classification (Madabushi et al. 2018) [6]    -       -        -       -            -       0.865    -       0.904
Listwise Learning to Rank
Comp-Clip (our implementation)                         0.756   0.708    0.766   0.725        0.750   0.744    0.805   0.791
Comp-Clip (our implementation) + LM                    0.783   0.748    0.791   0.768        0.825   0.823    0.870   0.868
Comp-Clip (our implementation) + LM + LC               0.787   0.759    0.793   0.772        0.841   0.832    0.877   0.880
Comp-Clip (our implementation) + LM + LC + TL          0.822   0.830    0.836   0.841        0.866   0.848    0.911   0.902
Pointwise Learning to Rank
Comp-Clip (our implementation)                         0.776   0.714    0.784   0.732        0.866   0.835    0.933   0.877
Comp-Clip (our implementation) + LM                    0.785   0.746    0.789   0.762        0.872   0.850    0.930   0.898
Comp-Clip (our implementation) + LM + LC               0.782   0.764    0.785   0.784        0.879   0.868    0.942   0.928
Comp-Clip (our implementation) + LM + LC + TL          0.842   0.834    0.845   0.848        0.913   0.875    0.977   0.940
Table 3: Model performance (the top 3 scores are marked in bold for each task). We evaluate models [1-4] on the WikiQA corpus using the authors' implementations (marked by *). For TREC-QA, we report the results from the original papers.
† Work conducted while the author was an intern at Adobe Research.
