SlideShare ist ein Scribd-Unternehmen logo
1 von 9
Downloaden Sie, um offline zu lesen
ANAPHORA RESOLUTION
       FINDWISE
Anaphoric Pronoun Resolution




Finding links
 • Pronoun to antecedent
Enriching text
 • Input: preprocessed document
 • Output: All found anaphoric pronoun
    references to words/phrases
Areas of use




Document summarization
 • Improving sentence comparisons       Ontology enrichment
 • Enriching results                     • Populating with more
                                           data.
Entity level sentiment analysis         Question answering
 • Adding more information to indata.    • Extracting more RDF-
                                           tripples
Preprocessing




 Required                     Additional
  • Sentence splitting         • Dependency parsing
  • Tokenization
  • Part of Speech-tagging
  • Named Entity Reconition
  • Gender Detection
Model representation




Anaphora pairs                         Candidate selection/ranking
 • Pronoun                              • Find pronoun
 • Antecedent                           • Pair with antecedent candidates
   - Entities                           • Filter out improbable pairs (rules)
   - Nouns, cardinals, foreign words    • Rank candidate pairs
                                        • Select the most probable
                                          candidate (if any)
Feature representation




Distance Features      Overlap Features/Filters
 • Sentence distance    • Gender
 • Hobbs distance       • Animacity
Antecedent Features     • Number
 • PoS-tag              • Entity
 • Gender              Pronoun Features
 • Animacity            • Word string
 • Number               • Gender
 • Entity tag           • Animacity
 • ...                  • ...
Machine learning models


                                      Running the models
Models
 • Condidtional Random Fields (CRF)    • Control confiedence
                                          threshold
    - Mallet
                                               - Precision/Recall trade
 • Logistic Regression                off
        - Liblinear
Training the models
 • OntoNotes Conll 2012
 • English
 • 1667 documents
 • Various domains
Further Work/Ideas for Improvement




Full coreference/anaphora resolution
                                       Improved Features
 • Change model representations         • Improved gender detection
     - Clusters
     - Chains                           • Improved animacity detection
 • Generalize comparisons (not only     • Additional overlap features
                                       Multi pass approach
     pronoun - antecedent)
Non referential/cataphora detection     • First pass(es) rule based
 • Training separate models             • Harder classifications with
                                           machine learning models
 • Rule based
Demonstration

Weitere ähnliche Inhalte

Ähnlich wie Anaphora Resolution

Online handwritten script recognition
Online handwritten script recognitionOnline handwritten script recognition
Online handwritten script recognition
Dhiraj Singh
 
Introduction to ruby
Introduction to rubyIntroduction to ruby
Introduction to ruby
Andrew Liu
 
Levels and lexiles
Levels and lexilesLevels and lexiles
Levels and lexiles
Teri Lesesne
 

Ähnlich wie Anaphora Resolution (20)

Lec02 primitive types
Lec02   primitive typesLec02   primitive types
Lec02 primitive types
 
98 poe 2
98 poe 298 poe 2
98 poe 2
 
Shell Scripting Classroom Training
Shell Scripting Classroom TrainingShell Scripting Classroom Training
Shell Scripting Classroom Training
 
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
 
Netflix Global Search - Lucene Revolution
Netflix Global Search - Lucene RevolutionNetflix Global Search - Lucene Revolution
Netflix Global Search - Lucene Revolution
 
OWL briefing
OWL briefingOWL briefing
OWL briefing
 
PHP Classroom Training
PHP Classroom TrainingPHP Classroom Training
PHP Classroom Training
 
Apache Solr for TYPO3 at TYPO3 Usergroup Day Netherlands
Apache Solr for TYPO3 at TYPO3 Usergroup Day NetherlandsApache Solr for TYPO3 at TYPO3 Usergroup Day Netherlands
Apache Solr for TYPO3 at TYPO3 Usergroup Day Netherlands
 
Webinar: OpenNLP and Solr for Superior Relevance
Webinar: OpenNLP and Solr for Superior RelevanceWebinar: OpenNLP and Solr for Superior Relevance
Webinar: OpenNLP and Solr for Superior Relevance
 
Speech recognition (dr. m. sabarimalai manikandan)
Speech recognition (dr. m. sabarimalai manikandan)Speech recognition (dr. m. sabarimalai manikandan)
Speech recognition (dr. m. sabarimalai manikandan)
 
Operationalizing security data science for the cloud: Challenges, solutions, ...
Operationalizing security data science for the cloud: Challenges, solutions, ...Operationalizing security data science for the cloud: Challenges, solutions, ...
Operationalizing security data science for the cloud: Challenges, solutions, ...
 
Supporting the authoring process with linguistic software
Supporting the authoring process with linguistic softwareSupporting the authoring process with linguistic software
Supporting the authoring process with linguistic software
 
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech RecognitionTeaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
 
Online handwritten script recognition
Online handwritten script recognitionOnline handwritten script recognition
Online handwritten script recognition
 
Introduction to ruby
Introduction to rubyIntroduction to ruby
Introduction to ruby
 
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
 
Lazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text SummarizerLazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text Summarizer
 
Scala Bay Meetup - The state of Scala code style and quality
Scala Bay Meetup - The state of Scala code style and qualityScala Bay Meetup - The state of Scala code style and quality
Scala Bay Meetup - The state of Scala code style and quality
 
Levels and lexiles
Levels and lexilesLevels and lexiles
Levels and lexiles
 
Story generation-Sarah Saneei
Story generation-Sarah SaneeiStory generation-Sarah Saneei
Story generation-Sarah Saneei
 

Mehr von Findwise

Findability Day 2015 Liam Holley - Dassault systems - Insight and discovery...
Findability Day 2015   Liam Holley - Dassault systems - Insight and discovery...Findability Day 2015   Liam Holley - Dassault systems - Insight and discovery...
Findability Day 2015 Liam Holley - Dassault systems - Insight and discovery...
Findwise
 

Mehr von Findwise (20)

White Arkitekter - Findability Day Roadshow 2017
White Arkitekter - Findability Day Roadshow 2017White Arkitekter - Findability Day Roadshow 2017
White Arkitekter - Findability Day Roadshow 2017
 
AI och maskininlärning - Findability Day Roadshow 2017
AI och maskininlärning - Findability Day Roadshow 2017AI och maskininlärning - Findability Day Roadshow 2017
AI och maskininlärning - Findability Day Roadshow 2017
 
De kognitiva eran med IBM Watson - Findability Day Roadshow 2017
De kognitiva eran med IBM Watson - Findability Day Roadshow 2017De kognitiva eran med IBM Watson - Findability Day Roadshow 2017
De kognitiva eran med IBM Watson - Findability Day Roadshow 2017
 
Findwise and IBM Watson
Findwise and IBM WatsonFindwise and IBM Watson
Findwise and IBM Watson
 
Findability Day 2016 - Enterprise Search and Findability Survey 2016
Findability Day 2016 - Enterprise Search and Findability Survey 2016Findability Day 2016 - Enterprise Search and Findability Survey 2016
Findability Day 2016 - Enterprise Search and Findability Survey 2016
 
Findability Day 2016 - Enterprise Search and Findability Survey 2016
Findability Day 2016 - Enterprise Search and Findability Survey 2016Findability Day 2016 - Enterprise Search and Findability Survey 2016
Findability Day 2016 - Enterprise Search and Findability Survey 2016
 
Findability Day 2016 - Big data analytics and machine learning
Findability Day 2016 - Big data analytics and machine learningFindability Day 2016 - Big data analytics and machine learning
Findability Day 2016 - Big data analytics and machine learning
 
Findability Day 2016 - Enterprise social collaboration
Findability Day 2016 - Enterprise social collaborationFindability Day 2016 - Enterprise social collaboration
Findability Day 2016 - Enterprise social collaboration
 
Findability Day 2016 - SKF case study
Findability Day 2016 - SKF case studyFindability Day 2016 - SKF case study
Findability Day 2016 - SKF case study
 
Findability Day 2016 - Structuring content for user experience
Findability Day 2016 - Structuring content for user experienceFindability Day 2016 - Structuring content for user experience
Findability Day 2016 - Structuring content for user experience
 
Findability Day 2016 - Augmented intelligence
Findability Day 2016 - Augmented intelligenceFindability Day 2016 - Augmented intelligence
Findability Day 2016 - Augmented intelligence
 
Findability Day 2016 - What is GDPR?
Findability Day 2016 - What is GDPR?Findability Day 2016 - What is GDPR?
Findability Day 2016 - What is GDPR?
 
Findability Day 2016 - Get started with GDPR
Findability Day 2016 - Get started with GDPRFindability Day 2016 - Get started with GDPR
Findability Day 2016 - Get started with GDPR
 
Digital workplace och informationshantering i office 365
Digital workplace och informationshantering i office 365Digital workplace och informationshantering i office 365
Digital workplace och informationshantering i office 365
 
Findability Day 2015 - Mickel Grönroos - Findwise - How to increase safety on...
Findability Day 2015 - Mickel Grönroos - Findwise - How to increase safety on...Findability Day 2015 - Mickel Grönroos - Findwise - How to increase safety on...
Findability Day 2015 - Mickel Grönroos - Findwise - How to increase safety on...
 
Findability Day 2015 - Abby Covert - Keynote - How to make sense of any mess
Findability Day 2015 - Abby Covert - Keynote - How to make sense of any messFindability Day 2015 - Abby Covert - Keynote - How to make sense of any mess
Findability Day 2015 - Abby Covert - Keynote - How to make sense of any mess
 
Findability Day 2015 - Noel Garry - IBM - Information governance and a 360 de...
Findability Day 2015 - Noel Garry - IBM - Information governance and a 360 de...Findability Day 2015 - Noel Garry - IBM - Information governance and a 360 de...
Findability Day 2015 - Noel Garry - IBM - Information governance and a 360 de...
 
Findability Day 2015 Mattias Ellison - Findwise - Enterprise Search and fin...
Findability Day 2015   Mattias Ellison - Findwise - Enterprise Search and fin...Findability Day 2015   Mattias Ellison - Findwise - Enterprise Search and fin...
Findability Day 2015 Mattias Ellison - Findwise - Enterprise Search and fin...
 
Findability Day 2015 - Martin White - The future is search!
Findability Day 2015 - Martin White - The future is search!Findability Day 2015 - Martin White - The future is search!
Findability Day 2015 - Martin White - The future is search!
 
Findability Day 2015 Liam Holley - Dassault systems - Insight and discovery...
Findability Day 2015   Liam Holley - Dassault systems - Insight and discovery...Findability Day 2015   Liam Holley - Dassault systems - Insight and discovery...
Findability Day 2015 Liam Holley - Dassault systems - Insight and discovery...
 

Kürzlich hochgeladen

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Kürzlich hochgeladen (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

Anaphora Resolution

  • 2. Anaphoric Pronoun Resolution Finding links • Pronoun to antecedent Enriching text • Input: preprocessed document • Output: All found anaphoric pronoun references to words/phrases
  • 3. Areas of use Document summarization • Improving sentence comparisons Ontology enrichment • Enriching results • Populating with more data. Entity level sentiment analysis Question answering • Adding more information to indata. • Extracting more RDF- tripples
  • 4. Preprocessing Required Additional • Sentence splitting • Dependency parsing • Tokenization • Part of Speech-tagging • Named Entity Reconition • Gender Detection
  • 5. Model representation Anaphora pairs Candidate selection/ranking • Pronoun • Find pronoun • Antecedent • Pair with antecedent candidates - Entities • Filter out improbable pairs (rules) - Nouns, cardinals, foreign words • Rank candidate pairs • Select the most probable candidate (if any)
  • 6. Feature representation Distance Features Overlap Features/Filters • Sentence distance • Gender • Hobbs distance • Animacity Antecedent Features • Number • PoS-tag • Entity • Gender Pronoun Features • Animacity • Word string • Number • Gender • Entity tag • Animacity • ... • ...
  • 7. Machine learning models Running the models Models • Condidtional Random Fields (CRF) • Control confiedence threshold - Mallet - Precision/Recall trade • Logistic Regression off - Liblinear Training the models • OntoNotes Conll 2012 • English • 1667 documents • Various domains
  • 8. Further Work/Ideas for Improvement Full coreference/anaphora resolution Improved Features • Change model representations • Improved gender detection - Clusters - Chains • Improved animacity detection • Generalize comparisons (not only • Additional overlap features Multi pass approach pronoun - antecedent) Non referential/cataphora detection • First pass(es) rule based • Training separate models • Harder classifications with machine learning models • Rule based