In Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)
While there is wide consensus in the NLP community on the modeling of temporal relations between events, mainly based on Allen’s temporal logic, the question of how to annotate other types of event relations, in particular causal ones, is still open. In this work, we present annotation guidelines to capture causality between event pairs, partly inspired by TimeML. We then implement a rule-based algorithm to automatically identify explicit causal relations in the TempEval-3 corpus. Based on this annotation, we report statistics on the behavior of causal cues in text and perform a preliminary investigation of the interaction between causal and temporal relations.
Annotating Causality in the TempEval-3 Corpus
1. Annotating Causality in the TempEval-3 Corpus
Paramita Mirza Rachele Sprugnoli Sara Tonelli Manuela Speranza
paramita@fbk.eu sprugnoli@fbk.eu satonelli@fbk.eu manspera@fbk.eu
CAtoCL Workshop, EACL
April, 2014
2. • TimeML annotation → a markup language for events and temporal expressions
• Include causal information in the TempEval-3 corpus
[Figure, labeled "TempEval-3 Corpus": the example sentence annotated with EVENT, TIMEX and SIGNAL tags, two TLINKs (BEFORE, IS_INCLUDED) and a CAUSE relation]
Hewlett-Packard acquired 730,070 common shares from Octel as a result of a stock purchase agreement signed on Aug. 10, 1988.
3. What will be covered…
• Annotation guidelines for causality
• Automatic annotation of explicit causality
between events
• Qualitative and quantitative evaluation
5. C-SIGNAL and CLINK
TimeML annotation
- EVENT
- TIMEX3
- SIGNAL
- TLINK
+ Causality
- C-SIGNAL
- CLINK
• C-SIGNAL → textual elements indicating the presence of causal relations
– Prepositions: because of, as a result of, due to, …
– Conjunctions: because, since, so that, …
– Adverbial connectors: as a result, so, therefore, …
– Clause-integrated expressions: the result is, that’s why, …
• CLINK → a directional one-to-one relation where
– source = causing event
– target = caused event
– c-signalID (optional) = ID of the related C-SIGNAL
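The two tags can be pictured as small records. The following is a minimal sketch only: the class names, field names and the event/signal IDs are illustrative assumptions, not the attribute names of the actual annotation scheme.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CSignal:
    """A textual cue for a causal relation (field names are illustrative)."""
    signal_id: str
    text: str


@dataclass(frozen=True)
class CLink:
    """A directional one-to-one causal link between two events."""
    source: str                        # ID of the causing event
    target: str                        # ID of the caused event
    c_signal_id: Optional[str] = None  # optional ID of the related C-SIGNAL


# "Iraq said it invaded Kuwait because of disputes over oil and money"
# (the IDs below are made up for the example)
signal = CSignal(signal_id="cs1", text="because of")
link = CLink(source="e_disputes", target="e_invaded", c_signal_id=signal.signal_id)
```

Keeping the signal reference optional mirrors the slide: a CLINK expressed through a causal verb has no C-SIGNAL to point to.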
6. Causal Concepts
Dynamics Model based on Force Dynamics Theory (Talmy, 1988)
• Captures the concept of causality, along with its related concepts, in terms of three dimensions:
– the patient’s tendency toward the result
– the presence of concordance between the affector and the patient
– the occurrence of the result
• Able to distinguish the concept of CAUSE from ENABLE, a distinction that is not available in the counterfactual model
• Was tested by linking it with natural language
• The causality concepts can be lexicalized as verbs (Wolff and Song, 2003):
– CAUSE-type: cause, influence, persuade, prompt, …
– ENABLE-type: aid, allow, enable, let, …
– PREVENT-type: block, constrain, prevent, restrain, …
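A verb lexicon of this kind can be held as a simple lookup table. This is a sketch under assumptions: it contains only the verbs shown on the slide (the full lists are longer), and the names `CAUSAL_VERBS` and `causal_type` are hypothetical.

```python
from typing import Optional

# Verb lemmas exactly as listed on the slide; the elided "…" entries are not filled in.
CAUSAL_VERBS = {
    "CAUSE": {"cause", "influence", "persuade", "prompt"},
    "ENABLE": {"aid", "allow", "enable", "let"},
    "PREVENT": {"block", "constrain", "prevent", "restrain"},
}


def causal_type(lemma: str) -> Optional[str]:
    """Return CAUSE, ENABLE or PREVENT for a known verb lemma, else None."""
    lemma = lemma.lower()
    for ctype, verbs in CAUSAL_VERBS.items():
        if lemma in verbs:
            return ctype
    return None
```

The lookup expects lemmas, so an inflected form such as "prompts" would need to be lemmatized first.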
7. CLINK: explicit causal constructions linking two events (source to target)
(In the examples, [S] marks the source event and [T] the target event.)
• Basic construction
– The purchase[S] caused the creation[T] of the current building
– The purchase[S] enabled the diversification[T] of their business
– The purchase[S] prevented a future transfer[T]
• Expressions with affect verbs: affect, influence, determine, change
– Ogun CAN crisis[S] affects the launch[T] of the All Progressives Congress
• Expressions with link verbs: link, lead, depend (on)
– An earthquake[T] in North America was linked to a tsunami[S] in Japan
• Periphrastic causatives
– The blast[S] prompts the boat to heel[T] violently
– The oxygen[S] lets the fire get[T] bigger
– The pole[S] restrains the tent from collapsing[T]
• Expressions with C-SIGNALs
– Iraq said it invaded[T] Kuwait because of disputes[S] over oil and money
8. Polarity of CLINK
• Polarity of events can help determine the polarity of CLINKs
– Serotonin deficiency[S] does not cause depression[T]
10. Rule-based Annotation
• Dataset:
– TBAQ-cleaned corpus from TempEval-3, with gold annotated events
• Algorithm:
– The dataset is PoS-tagged and parsed with the Stanford dependency parser
– The dataset is further analyzed with the addDiscourse tool
– Look for specific dependency constructions where a causal verb/signal is connected to two events
– If such a dependency is found:
• establish a CLINK
• identify the source and target events
– If a causal connector is an event, use the polarity of the event to assign the polarity of the CLINK
• Limitations:
– Only looks for CLINKs between events within the same sentence
– Only considers a finite set of causal verbs/signals
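The matching step can be sketched for one construction, the basic one, where a causal verb has one event as subject and another as direct object. This is a sketch under stated assumptions, not the authors' rule set, which the slide does not list. The dependency edges are written by hand rather than produced by a parser, the `nsubj`/`dobj` pattern is just one plausible rule, and all names (`Dep`, `CLink`, `find_clinks`, `CAUSAL_VERBS`) are hypothetical.

```python
from typing import Dict, List, NamedTuple, Set


class Dep(NamedTuple):
    head: str  # governor token
    rel: str   # dependency label
    dep: str   # dependent token


class CLink(NamedTuple):
    source: str
    target: str
    polarity: str


CAUSAL_VERBS = {"cause", "enable", "prevent"}  # small illustrative set


def find_clinks(deps: List[Dep], events: Set[str],
                lemmas: Dict[str, str], polarity: Dict[str, str]) -> List[CLink]:
    """Link the subject event (source) and object event (target) of a causal verb."""
    links = []
    for verb in sorted({d.head for d in deps}):
        if lemmas.get(verb, verb) not in CAUSAL_VERBS:
            continue
        subjects = [d.dep for d in deps
                    if d.head == verb and d.rel == "nsubj" and d.dep in events]
        objects = [d.dep for d in deps
                   if d.head == verb and d.rel == "dobj" and d.dep in events]
        for s in subjects:
            for t in objects:
                # The CLINK inherits the polarity of the causal verb event.
                links.append(CLink(s, t, polarity.get(verb, "POS")))
    return links


# "The purchase caused the creation of the current building"
deps = [Dep("caused", "nsubj", "purchase"), Dep("caused", "dobj", "creation")]
links = find_clinks(deps, {"purchase", "creation", "caused"}, {"caused": "cause"}, {})

# "Serotonin deficiency does not cause depression" (the verb event is negated)
neg_deps = [Dep("cause", "nsubj", "deficiency"), Dep("cause", "dobj", "depression")]
neg_links = find_clinks(neg_deps, {"deficiency", "depression", "cause"}, {}, {"cause": "NEG"})
```

Tokens are identified by their surface string here, which only works when each word occurs once per sentence; a real implementation would use token or event IDs.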
11. Statistics of Automatic Annotation
• Remarks
– ENABLE-type verbs never appear in the basic construction
– 36 affect verb occurrences
– 50 link verb occurrences
– Of around 1K causative verb occurrences, only 14% are in periphrastic constructions
– Of around 1.2K potential causal connectors, only 194 are recognized as causal signals (after disambiguation)
– Only 2 CLINKs found with negative polarity
Explicit causality         CLINKs
Basic construction             17
Affect verbs                    0
Link verbs                      4
Periphrastic causatives        41
Causal signals                111
Total                         173
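As a quick check on the table, the counts can be summed and turned into shares. The snippet uses only the numbers reported above; the variable names are illustrative.

```python
# CLINK counts per construction, copied from the table.
clinks_by_construction = {
    "Basic construction": 17,
    "Affect verbs": 0,
    "Link verbs": 4,
    "Periphrastic causatives": 41,
    "Causal signals": 111,
}

total = sum(clinks_by_construction.values())

# Share of each construction in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1)
          for name, n in clinks_by_construction.items()}
```

The counts add up to the reported total of 173, and causal signals account for roughly 64% of all automatically annotated CLINKs.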
12. Statistics of Automatic Annotation (3)
• CLINKs vs TLINKs
– 173 CLINKs vs 5.2K TLINKs
– 33% of CLINKs have underlying TLINKs, most are signaled by C-SIGNALs
• Iraq said it invaded[T] Kuwait because of disputes[S] over oil and money → BEFORE
– For CLINKs with causative verbs, BEFORE is the only type (with one exception of SIMULTANEOUS)
– For CLINKs with causal signals, BEFORE is also the majority type, with some exceptions:
• But some analysts questioned[T] how much of an impact the retirement will have, because few jobs will end[S] up being eliminated → AFTER
• The 486 is the descendant of a long series of Intel chips that began[T] dominating the market ever since IBM picked[S] the 16-bit 8088 chip for its first personal computer → BEGINS
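Finding the TLINK underlying a CLINK amounts to looking up the same event pair among the temporal links, in either direction. The sketch below assumes TLINKs are stored as a dictionary keyed by (event, event) pairs and reads the relation from source to target, as in the examples above. The function name, the storage format and the event IDs are assumptions; the relation labels are standard TimeML types.

```python
from typing import Dict, Optional, Tuple

# Inverse relation types, used when the TLINK is stored in the opposite direction.
INVERSE = {
    "BEFORE": "AFTER",
    "AFTER": "BEFORE",
    "SIMULTANEOUS": "SIMULTANEOUS",
    "BEGINS": "BEGUN_BY",
    "BEGUN_BY": "BEGINS",
}


def underlying_tlink(source: str, target: str,
                     tlinks: Dict[Tuple[str, str], str]) -> Optional[str]:
    """Return the temporal relation from source to target, or None.

    None is also returned for an inverted TLINK whose type is not in INVERSE.
    """
    if (source, target) in tlinks:
        return tlinks[(source, target)]
    if (target, source) in tlinks:
        return INVERSE.get(tlinks[(target, source)])
    return None


# Hypothetical IDs for "invaded" (target) and "disputes" (source);
# the TLINK is stored from the target to the source here.
tlinks = {("e_invaded", "e_disputes"): "AFTER"}
relation = underlying_tlink("e_disputes", "e_invaded", tlinks)
```

Normalizing the direction first makes it possible to count relation types consistently, whichever way the TLINK was annotated.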
14. Qualitative Evaluation
• Two main types of errors:
– Wrong identification of the events involved, due to dependency parser mistakes
• StatesWest Airlines said it withdrew its offer to acquire[S] Mesa Airlines because the Farmington carrier did not respond[T] to its offer.
– Annotation of sentences not containing causal relations, due to the ambiguous nature of verbs, prepositions and conjunctions
• Since then, 427 fugitives have been taken into custody or located.
17. Conclusions
• Annotation guidelines for causality between events… presented
• Rule-based algorithm for automatic annotation:
– Manual evaluation: 0.61 precision
– Compared with manual annotation: 0.55 F1-score for C-SIGNAL and 0.3 F1-score for CLINK
– Mistakes are introduced by the tools used for parsing and disambiguating causal signals
– Not all events involved in causal relations are annotated
• Recognizing CLINKs based on causal signals is more straightforward
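For reference, the scores above follow the usual definitions: precision is the fraction of predicted links that are correct, recall is the fraction of gold links that are found, and F1 is their harmonic mean. The sketch below assumes exact matching of (source, target) pairs, which the slides do not specify, and runs on toy data that is not taken from the corpus.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def prf1(predicted: Set[Pair], gold: Set[Pair]) -> Tuple[float, float, float]:
    """Precision, recall and F1 over exactly matching (source, target) pairs."""
    tp = len(predicted & gold)  # true positives: links present in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy data for illustration only.
predicted = {("e1", "e2"), ("e3", "e4")}
gold = {("e1", "e2"), ("e5", "e6"), ("e7", "e8")}
p, r, f = prf1(predicted, gold)
```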
18. Conclusions (2)
• The polarity of a CLINK can be easily identified, though negative polarity is infrequent
• There are only a few overlaps between CLINKs and TLINKs, with BEFORE as the majority underlying temporal relation type
• Future work…
– Factuality and certainty annotation of events
– Complete the manual annotation of the TempEval-3 corpus and make it available
– Another approach for automatic causal relation extraction
– Integration of the proposed guidelines [1] with GAF (Fokkens et al., 2013)
[1] available at http://www.newsreader-project.eu/publications/technical-reports/ (NWR-2014-2)