SlideShare ist ein Scribd-Unternehmen logo
1 von 9
Downloaden Sie, um offline zu lesen
ANAPHORA RESOLUTION
       FINDWISE
Anaphoric Pronoun Resolution




Finding links
 • Pronoun to antecedent
Enriching text
 • Input: preprocessed document
 • Output: All found anaphoric pronoun
    references to words/phrases
Areas of use




Document summarization
 • Improving sentence comparisons       Ontology enrichment
 • Enriching results                     • Populating with more
                                           data.
Entity level sentiment analysis         Question answering
 • Adding more information to indata.    • Extracting more RDF-
                                           tripples
Preprocessing




 Required                     Additional
  • Sentence splitting         • Dependency parsing
  • Tokenization
  • Part of Speech-tagging
  • Named Entity Reconition
  • Gender Detection
Model representation




Anaphora pairs                         Candidate selection/ranking
 • Pronoun                              • Find pronoun
 • Antecedent                           • Pair with antecedent candidates
   - Entities                           • Filter out improbable pairs (rules)
   - Nouns, cardinals, foreign words    • Rank candidate pairs
                                        • Select the most probable
                                          candidate (if any)
Feature representation




Distance Features      Overlap Features/Filters
 • Sentence distance    • Gender
 • Hobbs distance       • Animacity
Antecedent Features     • Number
 • PoS-tag              • Entity
 • Gender              Pronoun Features
 • Animacity            • Word string
 • Number               • Gender
 • Entity tag           • Animacity
 • ...                  • ...
Machine learning models


                                      Running the models
Models
 • Condidtional Random Fields (CRF)    • Control confiedence
                                          threshold
    - Mallet
                                               - Precision/Recall trade
 • Logistic Regression                off
        - Liblinear
Training the models
 • OntoNotes Conll 2012
 • English
 • 1667 documents
 • Various domains
Further Work/Ideas for Improvement




Full coreference/anaphora resolution
                                       Improved Features
 • Change model representations         • Improved gender detection
     - Clusters
     - Chains                           • Improved animacity detection
 • Generalize comparisons (not only     • Additional overlap features
                                       Multi pass approach
     pronoun - antecedent)
Non referential/cataphora detection     • First pass(es) rule based
 • Training separate models             • Harder classifications with
                                           machine learning models
 • Rule based
Demonstration

Weitere ähnliche Inhalte

Ähnlich wie Anaphora Resolution

Online handwritten script recognition
Online handwritten script recognitionOnline handwritten script recognition
Online handwritten script recognition
Dhiraj Singh
 
Introduction to ruby
Introduction to rubyIntroduction to ruby
Introduction to ruby
Andrew Liu
 
Levels and lexiles
Levels and lexilesLevels and lexiles
Levels and lexiles
Teri Lesesne
 

Ähnlich wie Anaphora Resolution (20)

Lec02 primitive types
Lec02   primitive typesLec02   primitive types
Lec02 primitive types
 
98 poe 2
98 poe 298 poe 2
98 poe 2
 
Shell Scripting Classroom Training
Shell Scripting Classroom TrainingShell Scripting Classroom Training
Shell Scripting Classroom Training
 
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
 
Netflix Global Search - Lucene Revolution
Netflix Global Search - Lucene RevolutionNetflix Global Search - Lucene Revolution
Netflix Global Search - Lucene Revolution
 
OWL briefing
OWL briefingOWL briefing
OWL briefing
 
PHP Classroom Training
PHP Classroom TrainingPHP Classroom Training
PHP Classroom Training
 
Apache Solr for TYPO3 at TYPO3 Usergroup Day Netherlands
Apache Solr for TYPO3 at TYPO3 Usergroup Day NetherlandsApache Solr for TYPO3 at TYPO3 Usergroup Day Netherlands
Apache Solr for TYPO3 at TYPO3 Usergroup Day Netherlands
 
Webinar: OpenNLP and Solr for Superior Relevance
Webinar: OpenNLP and Solr for Superior RelevanceWebinar: OpenNLP and Solr for Superior Relevance
Webinar: OpenNLP and Solr for Superior Relevance
 
Speech recognition (dr. m. sabarimalai manikandan)
Speech recognition (dr. m. sabarimalai manikandan)Speech recognition (dr. m. sabarimalai manikandan)
Speech recognition (dr. m. sabarimalai manikandan)
 
Operationalizing security data science for the cloud: Challenges, solutions, ...
Operationalizing security data science for the cloud: Challenges, solutions, ...Operationalizing security data science for the cloud: Challenges, solutions, ...
Operationalizing security data science for the cloud: Challenges, solutions, ...
 
Supporting the authoring process with linguistic software
Supporting the authoring process with linguistic softwareSupporting the authoring process with linguistic software
Supporting the authoring process with linguistic software
 
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech RecognitionTeaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
 
Online handwritten script recognition
Online handwritten script recognitionOnline handwritten script recognition
Online handwritten script recognition
 
Introduction to ruby
Introduction to rubyIntroduction to ruby
Introduction to ruby
 
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
 
Lazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text SummarizerLazy man's learning: How To Build Your Own Text Summarizer
Lazy man's learning: How To Build Your Own Text Summarizer
 
Scala Bay Meetup - The state of Scala code style and quality
Scala Bay Meetup - The state of Scala code style and qualityScala Bay Meetup - The state of Scala code style and quality
Scala Bay Meetup - The state of Scala code style and quality
 
Levels and lexiles
Levels and lexilesLevels and lexiles
Levels and lexiles
 
Story generation-Sarah Saneei
Story generation-Sarah SaneeiStory generation-Sarah Saneei
Story generation-Sarah Saneei
 

Mehr von Findwise

Findability Day 2015 Liam Holley - Dassault systems - Insight and discovery...
Findability Day 2015   Liam Holley - Dassault systems - Insight and discovery...Findability Day 2015   Liam Holley - Dassault systems - Insight and discovery...
Findability Day 2015 Liam Holley - Dassault systems - Insight and discovery...
Findwise
 

Mehr von Findwise (20)

White Arkitekter - Findability Day Roadshow 2017
White Arkitekter - Findability Day Roadshow 2017White Arkitekter - Findability Day Roadshow 2017
White Arkitekter - Findability Day Roadshow 2017
 
AI och maskininlärning - Findability Day Roadshow 2017
AI och maskininlärning - Findability Day Roadshow 2017AI och maskininlärning - Findability Day Roadshow 2017
AI och maskininlärning - Findability Day Roadshow 2017
 
De kognitiva eran med IBM Watson - Findability Day Roadshow 2017
De kognitiva eran med IBM Watson - Findability Day Roadshow 2017De kognitiva eran med IBM Watson - Findability Day Roadshow 2017
De kognitiva eran med IBM Watson - Findability Day Roadshow 2017
 
Findwise and IBM Watson
Findwise and IBM WatsonFindwise and IBM Watson
Findwise and IBM Watson
 
Findability Day 2016 - Enterprise Search and Findability Survey 2016
Findability Day 2016 - Enterprise Search and Findability Survey 2016Findability Day 2016 - Enterprise Search and Findability Survey 2016
Findability Day 2016 - Enterprise Search and Findability Survey 2016
 
Findability Day 2016 - Enterprise Search and Findability Survey 2016
Findability Day 2016 - Enterprise Search and Findability Survey 2016Findability Day 2016 - Enterprise Search and Findability Survey 2016
Findability Day 2016 - Enterprise Search and Findability Survey 2016
 
Findability Day 2016 - Big data analytics and machine learning
Findability Day 2016 - Big data analytics and machine learningFindability Day 2016 - Big data analytics and machine learning
Findability Day 2016 - Big data analytics and machine learning
 
Findability Day 2016 - Enterprise social collaboration
Findability Day 2016 - Enterprise social collaborationFindability Day 2016 - Enterprise social collaboration
Findability Day 2016 - Enterprise social collaboration
 
Findability Day 2016 - SKF case study
Findability Day 2016 - SKF case studyFindability Day 2016 - SKF case study
Findability Day 2016 - SKF case study
 
Findability Day 2016 - Structuring content for user experience
Findability Day 2016 - Structuring content for user experienceFindability Day 2016 - Structuring content for user experience
Findability Day 2016 - Structuring content for user experience
 
Findability Day 2016 - Augmented intelligence
Findability Day 2016 - Augmented intelligenceFindability Day 2016 - Augmented intelligence
Findability Day 2016 - Augmented intelligence
 
Findability Day 2016 - What is GDPR?
Findability Day 2016 - What is GDPR?Findability Day 2016 - What is GDPR?
Findability Day 2016 - What is GDPR?
 
Findability Day 2016 - Get started with GDPR
Findability Day 2016 - Get started with GDPRFindability Day 2016 - Get started with GDPR
Findability Day 2016 - Get started with GDPR
 
Digital workplace och informationshantering i office 365
Digital workplace och informationshantering i office 365Digital workplace och informationshantering i office 365
Digital workplace och informationshantering i office 365
 
Findability Day 2015 - Mickel Grönroos - Findwise - How to increase safety on...
Findability Day 2015 - Mickel Grönroos - Findwise - How to increase safety on...Findability Day 2015 - Mickel Grönroos - Findwise - How to increase safety on...
Findability Day 2015 - Mickel Grönroos - Findwise - How to increase safety on...
 
Findability Day 2015 - Abby Covert - Keynote - How to make sense of any mess
Findability Day 2015 - Abby Covert - Keynote - How to make sense of any messFindability Day 2015 - Abby Covert - Keynote - How to make sense of any mess
Findability Day 2015 - Abby Covert - Keynote - How to make sense of any mess
 
Findability Day 2015 - Noel Garry - IBM - Information governance and a 360 de...
Findability Day 2015 - Noel Garry - IBM - Information governance and a 360 de...Findability Day 2015 - Noel Garry - IBM - Information governance and a 360 de...
Findability Day 2015 - Noel Garry - IBM - Information governance and a 360 de...
 
Findability Day 2015 Mattias Ellison - Findwise - Enterprise Search and fin...
Findability Day 2015   Mattias Ellison - Findwise - Enterprise Search and fin...Findability Day 2015   Mattias Ellison - Findwise - Enterprise Search and fin...
Findability Day 2015 Mattias Ellison - Findwise - Enterprise Search and fin...
 
Findability Day 2015 - Martin White - The future is search!
Findability Day 2015 - Martin White - The future is search!Findability Day 2015 - Martin White - The future is search!
Findability Day 2015 - Martin White - The future is search!
 
Findability Day 2015 Liam Holley - Dassault systems - Insight and discovery...
Findability Day 2015   Liam Holley - Dassault systems - Insight and discovery...Findability Day 2015   Liam Holley - Dassault systems - Insight and discovery...
Findability Day 2015 Liam Holley - Dassault systems - Insight and discovery...
 

Kürzlich hochgeladen

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Kürzlich hochgeladen (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Anaphora Resolution

  • 2. Anaphoric Pronoun Resolution Finding links • Pronoun to antecedent Enriching text • Input: preprocessed document • Output: All found anaphoric pronoun references to words/phrases
  • 3. Areas of use Document summarization • Improving sentence comparisons Ontology enrichment • Enriching results • Populating with more data. Entity level sentiment analysis Question answering • Adding more information to indata. • Extracting more RDF- tripples
  • 4. Preprocessing Required Additional • Sentence splitting • Dependency parsing • Tokenization • Part of Speech-tagging • Named Entity Reconition • Gender Detection
  • 5. Model representation Anaphora pairs Candidate selection/ranking • Pronoun • Find pronoun • Antecedent • Pair with antecedent candidates - Entities • Filter out improbable pairs (rules) - Nouns, cardinals, foreign words • Rank candidate pairs • Select the most probable candidate (if any)
  • 6. Feature representation Distance Features Overlap Features/Filters • Sentence distance • Gender • Hobbs distance • Animacity Antecedent Features • Number • PoS-tag • Entity • Gender Pronoun Features • Animacity • Word string • Number • Gender • Entity tag • Animacity • ... • ...
  • 7. Machine learning models Running the models Models • Condidtional Random Fields (CRF) • Control confiedence threshold - Mallet - Precision/Recall trade • Logistic Regression off - Liblinear Training the models • OntoNotes Conll 2012 • English • 1667 documents • Various domains
  • 8. Further Work/Ideas for Improvement Full coreference/anaphora resolution Improved Features • Change model representations • Improved gender detection - Clusters - Chains • Improved animacity detection • Generalize comparisons (not only • Additional overlap features Multi pass approach pronoun - antecedent) Non referential/cataphora detection • First pass(es) rule based • Training separate models • Harder classifications with machine learning models • Rule based