Reasoning about Quantities
in Natural Language
Subhro Roy, Tim Vieira, Dan Roth.
TACL 2015
  About six and a half hours later,
  Mr. Armstrong opened the landing craft’s hatch.
  [About six and a half hours later],
  Mr. Armstrong opened the landing craft’s hatch.
  About six and a half hours later
Lower bound / Upper bound / None
  The number of member nations was 80 in 2000,
and then it increased to 95.
  The number of adults and children with
HIV/AIDS reached 39.4 million in 2004.
  CERN has now grown to include 20 member
states and enjoys the active participation of many
other countries world-wide.
  CERN has 20 member states.
down” we would like to segment together “nearly
two years after”. We consider a quantity to be
correctly detected only when we have the exact
phrase that we want; otherwise we consider the
segment to be undetected.
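The exact-match criterion above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation script: the function name `exact_match_prf` and the (start, end) token-offset spans are assumptions made for the example.

```python
def exact_match_prf(gold, predicted):
    """Score predicted quantity spans against gold spans.

    A predicted span counts as correct only if it is identical to a gold
    span; a partial overlap is treated as undetected.
    """
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical spans as (start, end) token offsets.
gold = [(7, 11), (20, 22)]
predicted = [(8, 10), (20, 22)]  # (8, 10) only overlaps (7, 11), so it is wrong

print(exact_match_prf(gold, predicted))
```

Under this scoring, a segmenter that finds "two years" inside "nearly two years after" gets no credit for that quantity.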
Model           P%     R%     F%     Train Time   Test Time
Semi-CRF (SC)   75.6   77.7   76.6   15.8         1.5
C+I (PR)        80.3   79.3   79.8   1.0          1.0
Table 2: 10-fold cross-validation results for segmentation
accuracy and the time required for segmentation. The runtime
columns have been normalized and are expressed as ratios.
Table 2 describes the segmentation accuracy, as
well as the ratio between the time taken by both
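approaches. The bank of classifiers approach gives
slightly better accuracy than the semi-CRF model
(F of 79.8 vs. 76.6), and is also significantly faster:
the semi-CRF takes 15.8 times as long to train and
1.5 times as long at test time.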
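As a quick consistency check on Table 2, the F% column can be recomputed from P% and R%. The sketch below assumes F% is the balanced F1 score (the harmonic mean of precision and recall); the helper name `f_score` is made up for this example.

```python
def f_score(precision, recall):
    """Balanced F1: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (P%, R%, reported F%) taken from Table 2.
table2 = {
    "Semi-CRF (SC)": (75.6, 77.7, 76.6),
    "C+I (PR)": (80.3, 79.3, 79.8),
}

for model, (p, r, reported_f) in table2.items():
    print(model, round(f_score(p, r), 1), reported_f)
```

Both recomputed values agree with the reported F% to one decimal place.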
[…]e increased 10%”, we would like […]her “increased 10%”, since this
quantity denotes a rise in value. […]nce “Apple restores push email in
[…] two years after Motorola shut it […]
exact match only supports 43.3% of the entailment
decisions. It is also evident that the deeper semantic
analysis using SRL and Coreference improves the
quantitative inference.
Task            System      P%      R%     F%
Entailment      Baseline    100.0   43.3   60.5
                GOLDSEG      98.5   88.0   92.9
                  +SEM       97.8   88.6   93.0
                PREDSEG      94.9   76.2   84.5
                  +SEM       95.4   78.3   86.0
Contradiction   Baseline     16.6   48.5   24.8
                GOLDSEG      61.6   92.9   74.2
                  +SEM       64.3   91.5   75.5
                PREDSEG      51.9   79.7   62.8
                  +SEM       52.8   81.1   64.0
No Relation     Baseline     41.8   71.9   52.9
                GOLDSEG      81.1   76.7   78.8
                  +SEM       80.0   78.5   79.3
                PREDSEG      54.0   75.4   62.9
                  +SEM       56.3   72.7   63.5
Table 3: Results of QE. Adding semantics (+SEM)
consistently improves performance; only 43.3% of entailing
quantities can be recovered by simple string matching.
 
5.2 Scope of QE Inference
Our current QE procedure is limited in
several ways. In all cases, we attribute these
limitations to subtle and deeper language
understanding, which we delegate to the application
module that will use our QE procedure as a
subroutine. Consider the following examples:
T : Adam has exactly 100 dollars in the bank.
H1 : Adam has 50 dollars in the bank.
H2 : Adam’s bank balance is 50 dollars.
Here, T implies H1 but not H2. However, for both
H1 and H2, QE will infer that “50 dollars” is a
contradiction to sentence T, since it cannot make
the subtle distinction required here.
T : Ten students passed the exam, but six students
failed it.
H : At least eight students failed the exam.
Here again, QE will only output that T implies
“At least eight students”, despite the second part of
T. QE reasons about the quantities, and there needs
to be an application-specific module that understands
which quantity is related to the predicate “failed”.
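The behaviour described in these two examples can be reproduced with a simplified interval view of quantities. The sketch below is an illustration under stated assumptions, not the QE algorithm from the paper: the names `Quantity`, `exactly`, `at_least` and `compare` are hypothetical, and every quantity is reduced to a numeric range plus a unit, with no link to the predicate it modifies.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    low: float   # lower bound of the value range
    high: float  # upper bound of the value range
    unit: str


def exactly(value, unit):
    return Quantity(value, value, unit)


def at_least(value, unit):
    return Quantity(value, math.inf, unit)


def compare(t, h):
    """Relate a text quantity t to a hypothesis quantity h."""
    if t.unit != h.unit:
        return "no relation"
    if h.low <= t.low and t.high <= h.high:
        return "entails"       # every value allowed by t is allowed by h
    if t.high < h.low or h.high < t.low:
        return "contradicts"   # the two ranges cannot both hold
    return "no relation"


# "exactly 100 dollars" vs. "50 dollars" (read as exactly 50)
print(compare(exactly(100, "dollar"), exactly(50, "dollar")))
# "Ten students" vs. "At least eight students"
print(compare(exactly(10, "student"), at_least(8, "student")))
```

Because the comparison sees only values and units, it reports a contradiction for both H1 and H2, and it reports entailment for "At least eight students" without noticing that "ten" belongs to "passed" rather than "failed". That missing link is what the application-specific module would have to supply.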
There also exist limitations regarding inferences
with respect to events that could occur over a period
of time. In “It was raining from 5 pm to 7 pm” one
 
 
 