2. Anaphoric Pronoun Resolution
Finding links
• Pronoun to antecedent
Enriching text
• Input: preprocessed document
• Output: all detected anaphoric pronoun references to words/phrases
3. Areas of use
Document summarization
• Improving sentence comparisons
• Enriching results
Ontology enrichment
• Populating with more data
Entity-level sentiment analysis
• Adding more information to the input data
Question answering
• Extracting more RDF triples
4. Preprocessing
Required
• Sentence splitting
• Tokenization
• Part-of-speech tagging
• Named Entity Recognition
• Gender detection
Additional
• Dependency parsing
5. Model representation
Anaphora pairs
• Pronoun
• Antecedent
- Entities
- Nouns, cardinals, foreign words
Candidate selection/ranking
• Find pronoun
• Pair with antecedent candidates
• Filter out improbable pairs (rules)
• Rank candidate pairs
• Select the most probable candidate (if any)
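The five candidate selection/ranking steps can be sketched roughly as below. This is a minimal illustration, not the implementation behind these slides: the `Mention` and `Pair` classes, the gender-clash rule, the Penn Treebank pronoun tags and the distance-based toy score are all assumptions made for the example. In the real system the score would come from a trained model.

```python
from dataclasses import dataclass
from typing import Callable, List

PRONOUN_TAGS = {"PRP", "PRP$"}  # assumed tag set (Penn Treebank pronoun tags)


@dataclass
class Mention:
    text: str
    sentence: int            # index of the sentence containing the mention
    pos: str                 # PoS tag
    gender: str = "unknown"  # "male", "female", "neutral" or "unknown"


@dataclass
class Pair:
    pronoun: Mention
    antecedent: Mention


def gender_compatible(pair: Pair) -> bool:
    """Hypothetical filter rule: drop pairs whose known genders clash."""
    genders = (pair.pronoun.gender, pair.antecedent.gender)
    return "unknown" in genders or genders[0] == genders[1]


def distance_score(pair: Pair) -> float:
    """Toy ranking score (stand-in for a model): closer antecedents score higher."""
    return 1.0 / (1 + pair.pronoun.sentence - pair.antecedent.sentence)


def resolve(mentions: List[Mention],
            score: Callable[[Pair], float] = distance_score,
            threshold: float = 0.5) -> List[Pair]:
    """Resolve each pronoun against the non-pronoun mentions that precede it."""
    resolved = []
    for i, mention in enumerate(mentions):
        if mention.pos not in PRONOUN_TAGS:                  # 1. find pronoun
            continue
        pairs = [Pair(mention, c) for c in mentions[:i]
                 if c.pos not in PRONOUN_TAGS]               # 2. pair with candidates
        pairs = [p for p in pairs if gender_compatible(p)]   # 3. filter (rules)
        pairs.sort(key=score, reverse=True)                  # 4. rank
        if pairs and score(pairs[0]) >= threshold:           # 5. select, if any
            resolved.append(pairs[0])
    return resolved


# Mentions are listed in document order.
mentions = [
    Mention("Mary", 0, "NNP", "female"),
    Mention("John", 0, "NNP", "male"),
    Mention("He", 1, "PRP", "male"),
]
for pair in resolve(mentions):
    print(pair.pronoun.text, "->", pair.antecedent.text)  # He -> John
```

Raising `threshold` makes step 5 reject weak candidates, so a pronoun can be left unresolved instead of being forced onto a poor antecedent.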
6. Feature representation
Distance Features
• Sentence distance
• Hobbs distance
Antecedent Features
• PoS-tag
• Gender
• Animacy
• Number
• Entity tag
• ...
Overlap Features/Filters
• Gender
• Animacy
• Number
• Entity
Pronoun Features
• Word string
• Gender
• Animacy
• ...
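As an illustration, a subset of these features could be collected into one feature dictionary per pronoun/antecedent pair. The dictionary keys, the attribute names and the three-valued overlap encoding are assumptions made for this sketch. Hobbs distance is left out because it requires a parse tree.

```python
from typing import Dict, Optional, Union

Feature = Union[str, int]


def overlap(a: Optional[str], b: Optional[str]) -> str:
    """Compare one attribute of pronoun and antecedent; missing values count as unknown."""
    if a is None or b is None:
        return "unknown"
    return "match" if a == b else "mismatch"


def pair_features(pronoun: Dict, antecedent: Dict) -> Dict[str, Feature]:
    """Build a feature dictionary for a single pronoun/antecedent pair."""
    return {
        # Distance features
        "sentence_distance": pronoun["sentence"] - antecedent["sentence"],
        # Antecedent features
        "antecedent_pos": antecedent.get("pos", "unknown"),
        "antecedent_gender": antecedent.get("gender", "unknown"),
        "antecedent_animacy": antecedent.get("animacy", "unknown"),
        "antecedent_number": antecedent.get("number", "unknown"),
        "antecedent_entity": antecedent.get("entity", "none"),
        # Pronoun features
        "pronoun_word": pronoun["word"].lower(),
        "pronoun_gender": pronoun.get("gender", "unknown"),
        "pronoun_animacy": pronoun.get("animacy", "unknown"),
        # Overlap features
        "gender_overlap": overlap(pronoun.get("gender"), antecedent.get("gender")),
        "animacy_overlap": overlap(pronoun.get("animacy"), antecedent.get("animacy")),
        "number_overlap": overlap(pronoun.get("number"), antecedent.get("number")),
    }


# Hand-made example values, not taken from any corpus.
pronoun = {"word": "She", "sentence": 2, "gender": "female",
           "animacy": "animate", "number": "singular"}
antecedent = {"word": "Mary", "sentence": 1, "pos": "NNP", "gender": "female",
              "animacy": "animate", "number": "singular", "entity": "PERSON"}

features = pair_features(pronoun, antecedent)
print(features["sentence_distance"], features["gender_overlap"])  # 1 match
```

The same overlap values can also serve as hard filters, for example by discarding every pair whose `gender_overlap` is `"mismatch"` before ranking.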
7. Machine learning models
Models
• Conditional Random Fields (CRF)
- Mallet
• Logistic Regression
- Liblinear
Training the models
• OntoNotes CoNLL 2012
• English
• 1667 documents
• Various domains
Running the models
• Control confidence threshold
- Precision/recall trade-off
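The effect of the confidence threshold on precision and recall can be illustrated with a small sketch. The prediction list is made-up toy data, not a result from these models, and the function name and the convention of reporting precision 1.0 when nothing is predicted are assumptions made for the example.

```python
from typing import List, Tuple


def precision_recall(predictions: List[Tuple[float, bool]],
                     n_gold: int,
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when only predictions with confidence >= threshold are kept.

    predictions: (confidence, is_correct) for each pronoun the model resolved.
    n_gold: number of anaphoric pronouns in the gold annotation.
    """
    kept = [correct for confidence, correct in predictions if confidence >= threshold]
    true_positives = sum(kept)
    # Assumed convention: with no predictions kept, precision is reported as 1.0.
    precision = true_positives / len(kept) if kept else 1.0
    recall = true_positives / n_gold if n_gold else 0.0
    return precision, recall


# Toy data: four model decisions, four gold pronouns.
toy_predictions = [(0.9, True), (0.8, True), (0.6, False), (0.4, True)]

for threshold in (0.3, 0.7):
    p, r = precision_recall(toy_predictions, n_gold=4, threshold=threshold)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f}")
# threshold=0.3: precision=0.75 recall=0.75
# threshold=0.7: precision=1.00 recall=0.50
```

On this toy data, raising the threshold from 0.3 to 0.7 removes the one wrong link, but it also drops a correct low-confidence link.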
8. Further Work/Ideas for Improvement
Full coreference/anaphora resolution
• Change model representations
- Clusters
- Chains
• Generalize comparisons (not only pronoun - antecedent)
Non-referential/cataphora detection
• Training separate models
• Rule based
Improved Features
• Improved gender detection
• Improved animacy detection
• Additional overlap features
Multi-pass approach
• First pass(es) rule based
• Harder classifications with machine learning models
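One possible shape for the multi-pass idea is sketched below: rule-based passes run first, and only the cases they leave open go to a scoring model. The example rule, the dictionary-based mentions and the `model_score` callback are hypothetical and stand in for whatever rules and trained model would actually be used.

```python
from typing import Callable, Dict, List, Optional, Tuple

Mention = Dict[str, object]
Rule = Callable[[Mention, List[Mention]], Optional[Mention]]


def unique_gender_match(pronoun: Mention, candidates: List[Mention]) -> Optional[Mention]:
    """Hypothetical rule: decide only if exactly one candidate shares the pronoun's gender."""
    gender = pronoun.get("gender")
    if gender is None:
        return None
    matches = [c for c in candidates if c.get("gender") == gender]
    return matches[0] if len(matches) == 1 else None


def multi_pass(pronoun: Mention,
               candidates: List[Mention],
               rules: List[Rule],
               model_score: Callable[[Mention, Mention], float],
               threshold: float = 0.5) -> Tuple[Optional[Mention], str]:
    """Return (antecedent, pass name); the antecedent is None if nothing is confident enough."""
    for rule in rules:                       # first pass(es): rule based
        choice = rule(pronoun, candidates)
        if choice is not None:
            return choice, "rule"
    scored = [(model_score(pronoun, c), c) for c in candidates]
    if scored:                               # harder cases: machine learning model
        best_score, best = max(scored, key=lambda item: item[0])
        if best_score >= threshold:
            return best, "model"
    return None, "unresolved"


def stand_in_model(pronoun: Mention, candidate: Mention) -> float:
    """Stand-in for a trained model: reads a precomputed score from the candidate."""
    return float(candidate.get("score", 0.0))


easy = multi_pass({"word": "he", "gender": "male"},
                  [{"word": "Mary", "gender": "female"},
                   {"word": "John", "gender": "male"}],
                  [unique_gender_match], stand_in_model)
hard = multi_pass({"word": "it", "gender": "neutral"},
                  [{"word": "car", "gender": "neutral", "score": 0.8},
                   {"word": "house", "gender": "neutral", "score": 0.3}],
                  [unique_gender_match], stand_in_model)
print(easy[0]["word"], easy[1])  # John rule
print(hard[0]["word"], hard[1])  # car model
```

Putting the rules first keeps the unambiguous cases away from the model, which would then only need to be trained and evaluated on the harder decisions.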