SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Downloaden Sie, um offline zu lesen
Web & Media Group
http://CrowdTruth.org
Harnessing Diversity in Crowds and
Machines for Better NER Performance
Oana Inel and Lora Aroyo Extended Semantic Web
Conference
May 30th 2017
Web & Media Group
Ground Truth Datasets:
OKE2015: Open Knowledge Extraction
challenge (ESWC) 2015
• 101 sentences
• 664 entities
OKE2016: Open Knowledge Extraction
challenge (ESWC) 2016
• 55 sentences
• 340 entities
Named Entities of Type:
Person, Place, Organization & Role
https://github.com/anuzzolese/oke-challenge
https://github.com/anuzzolese/oke-challenge-2016
What’s wrong with the
ground truth for NER?
Web & Media Group
http://CrowdTruth.org
What’s wrong with the ground truth for NER?
• Inconsistencies in annotating entities of type ROLE and PERSON
… and similar for
• But, “Bishop Petronius” is a PERSON
... controversial [bishop] [Richard Williamson].
● [bishop] → role
● [Richard Williamson] → person
… appointed Knight Bachelor by the [Queen] [Elizabeth II] in 1988.
Bologna was reborn in the 5th century under [Bishop Petronius].
Web & Media Group
http://CrowdTruth.org
What’s wrong with the ground truth for NER?
• Inconsistencies across datasets
• 4 cases in OKE2015 where concatenation of entities of type
PLACE were treated as a single entity
… but, the offsets, do not match with the actual string
• In OKE2016, such entities were classified as two entities of type
PLACE
Such a man did post-doctoral work at the Salk Institute in San Diego in
the laboratory of Renato Dulbecco, then worked at the Basel Institute for
Immunology in [Basel, Switzerland].
Web & Media Group
http://CrowdTruth.org
What’s wrong with the ground truth for NER?
Giulio Natta was born in Imperia, Italy. [He] earned [his] degree in
chemical engineering from the Politecnico di Milano university in Milan
in 1924.
[One of the them] was an eminent scholar at Berkeley.
• There are errors in the ground truth
• 1 case in OKE2015
• Personal pronouns (co-references) and possessive pronouns
are considered named entities of type PERSON
• 85/304 cases (OKE2015)
• 27/105 cases (OKE2016)
What’s wrong with the ground truth for NER?
Web & Media Group
http://CrowdTruth.org
How reliable are the NER tools
trained using this ground truth?
… but, then ...
Web & Media Group
How do state-of-the-art
NER tools perform on this
ground truth?
Performance of NER Tools:
• NERD-ML
• TextRazor
• THD
• DBpediaSpotlight
• SemiTags
http://nerd.eurecom.fr 

https://www.textrazor.com 

http://ner.vse.cz/thd/ 

http://dbpedia-spotlight.github.io/demo/ 

http://nlp.vse.cz/SemiTags/ 

Web & Media Group
http://CrowdTruth.org
• do not extract personal pronouns (co-references) and possessive
pronouns as type PERSON entities
• 83/85 missed cases in OKE2015
• 26/27 missed cases in OKE2016
• extract multiple span variations for the same entity type PERSON
• 73/92 cases in OKE2015
• 9/13 cases in OKE2016
… let’s see how NER tools perform on this GT
Giulio Natta was born in Imperia, Italy. [He] earned [his] degree in chemical
engineering from the Politecnico di Milano university in Milan in 1924.
The woman was awarded the Nobel Prize in Physics in 1963, which she
shared with [[J.] [[Hans] D.] [Jensen]] and Eugene Wigner.
Web & Media Group
http://CrowdTruth.org
● Many entities of type ORGANIZATION are a combination of
ORGANIZATION + PLACE
• NER tools tend to extract each entity granularity
• GT does not allow for overlapping entities or multiple perspectives
• 105/213 cases in OKE2015
• 62/157 cases in OKE2016
… let’s see how NER tools perform on this GT
Such a man did post-doctoral work at the Salk Institute in San Diego in the
laboratory of Renato Dulbecco, then worked at the [[[Basel] [Institute]] for
Immunology] in Basel, Switzerland.
Web & Media Group
http://CrowdTruth.org
… and now let’s see their overall performance
● High disagreement between the NER tools
○ Similar performance in F1, but different #FP, #TP, #FN
Web & Media Group
http://CrowdTruth.org
… and now let’s see their overall performance
● High disagreement between the NER tools
○ Similar performance in F1, but different #FP, #TP, #FN
● Low recall, many entities missed
Web & Media Group
http://CrowdTruth.org
difficult to understand the reliability of different
NER tools
it’s difficult to choose the “best one” for your case
Web & Media Group
Hypothesis
Aggregating the output of NER tools by harnessing the inter-tool
disagreement (Multi-NER) performs better than the individual
NERs (Single-NER)
Web & Media Group
http://CrowdTruth.org
Multi-NER vs. SOTA Single-NER
Web & Media Group
http://CrowdTruth.org
• Multi-NER significantly higher #TP & lower #FN
Multi-NER vs. SOTA Single-NER
Web & Media Group
http://CrowdTruth.org
• Multi-NER significantly higher #TP & lower #FN
• Multi-NER significantly higher #FP
Multi-NER vs. SOTA Single-NER
Web & Media Group
http://CrowdTruth.org17
The more the merrier?
Is performance correlated with the number of NER tools that
extracted a given named entity?
● Performance comparison:
• Likelihood of an entity to be contained in the gold standard
based on how many NER tools extracted it
Sentence-entity score = ratio of NER that extracted the entity
Multi-NER vs. SOTA Single-NER
Web & Media Group
http://CrowdTruth.org18
Disagreement Aware Multi-NER vs. Single-NER
Multi-NER outperforms SOTA Single-NER tools
at a sentence-entity score threshold >= 0.4
Web & Media Group
http://CrowdTruth.org19
Disagreement Aware Multi-NER vs. Single-NER
19
Multi-NER outperforms the majority vote approach
(sentence-entity score threshold = 0.6)
at a sentence-entity score threshold >= 0.4
Web & Media Group
Hypothesis
Crowdsourced ground truth by harnessing inter-annotator
disagreement (Multi-NER+CrowdGT) produces diversity in
annotations and thus, improves the aggregated output of NER
tools
Web & Media Group
http://CrowdTruth.org21
• Identify all the valid expressions and their types
• Reduce the number of FP and FN cases
Crowdsourcing for Better Ground Truth
Web & Media Group
http://CrowdTruth.org22
Crowdsourcing Task
Web & Media Group
http://CrowdTruth.org
Crowdsourcing for Better NER
For each crowd sentence-entity score the crowd enhanced Multi-NER
(Multi-NER+Crowd) performs much better than the Multi-NER
Web & Media Group
http://CrowdTruth.org
Crowdsourcing for Better Ground Truth
Multi-NER+CrowdGT, the enhanced Multi-NER through crowd-driven
ground truth gathering performs better than Multi-NER+Crowd.
Web & Media Group
http://CrowdTruth.org
Conclusions & Future Work
● a hybrid Multi-NER crowd-driven approach for improved NER
performance
● the crowd is able to correct the mistakes of the NER tools by reducing the
total number of false positive cases
● the crowd-driven ground truth, that harnesses diversity, perspectives and
granularities, proves to be more reliable
● improving the event detection ground truth (FactBank) through
disagreement-aware crowdsourcing
● train an event classifier with crowd-driven events
Data
http://data.crowdtruth.org
http://data.crowdtruth.org/crowdsourcing-ne-goldstandards/
Tool & Code
http://dev.CrowdTruth.org
http://github.com/CrowdTruth

Weitere ähnliche Inhalte

Was ist angesagt?

(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 Lora Aroyo
 
Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)Isabelle Augenstein
 
Towards Explainable Fact Checking
Towards Explainable Fact CheckingTowards Explainable Fact Checking
Towards Explainable Fact CheckingIsabelle Augenstein
 
CCCT University of Amsterdam Seminars 2013: Crowdsourcing Session
CCCT University of Amsterdam Seminars 2013: Crowdsourcing SessionCCCT University of Amsterdam Seminars 2013: Crowdsourcing Session
CCCT University of Amsterdam Seminars 2013: Crowdsourcing SessionLora Aroyo
 
Groundhog Day: Near-Duplicate Detection on Twitter
Groundhog Day: Near-Duplicate Detection on Twitter Groundhog Day: Near-Duplicate Detection on Twitter
Groundhog Day: Near-Duplicate Detection on Twitter Ke Tao
 
Practical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaPractical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaDiana Maynard
 
Frontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter DesignFrontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter DesignJonathan Stray
 
Introduction to Bayesian Truth Serum
Introduction to Bayesian Truth SerumIntroduction to Bayesian Truth Serum
Introduction to Bayesian Truth SerumFuming Shih
 
Frontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text AnalysisFrontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text AnalysisJonathan Stray
 
Challenges of social media analysis in the real world
Challenges of social media analysis in the real worldChallenges of social media analysis in the real world
Challenges of social media analysis in the real worldDiana Maynard
 
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...Jonathan Stray
 
Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...
Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...
Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...Leon Derczynski
 
Social Web 2014: Final Presentations (Part I)
Social Web 2014: Final Presentations (Part I)Social Web 2014: Final Presentations (Part I)
Social Web 2014: Final Presentations (Part I)Lora Aroyo
 
Fake News Detector
Fake News DetectorFake News Detector
Fake News DetectorIrisYoon5
 
Data visualisationsummit 2013
Data visualisationsummit 2013Data visualisationsummit 2013
Data visualisationsummit 2013The Pathway Group
 

Was ist angesagt? (20)

(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
 
Explainability for NLP
Explainability for NLPExplainability for NLP
Explainability for NLP
 
Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)
 
Towards Explainable Fact Checking
Towards Explainable Fact CheckingTowards Explainable Fact Checking
Towards Explainable Fact Checking
 
CCCT University of Amsterdam Seminars 2013: Crowdsourcing Session
CCCT University of Amsterdam Seminars 2013: Crowdsourcing SessionCCCT University of Amsterdam Seminars 2013: Crowdsourcing Session
CCCT University of Amsterdam Seminars 2013: Crowdsourcing Session
 
Pydata Taipei 2020
Pydata Taipei 2020Pydata Taipei 2020
Pydata Taipei 2020
 
Groundhog Day: Near-Duplicate Detection on Twitter
Groundhog Day: Near-Duplicate Detection on Twitter Groundhog Day: Near-Duplicate Detection on Twitter
Groundhog Day: Near-Duplicate Detection on Twitter
 
Practical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaPractical Opinion Mining for Social Media
Practical Opinion Mining for Social Media
 
Frontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter DesignFrontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter Design
 
Introduction to Bayesian Truth Serum
Introduction to Bayesian Truth SerumIntroduction to Bayesian Truth Serum
Introduction to Bayesian Truth Serum
 
Frontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text AnalysisFrontiers of Computational Journalism week 2 - Text Analysis
Frontiers of Computational Journalism week 2 - Text Analysis
 
Challenges of social media analysis in the real world
Challenges of social media analysis in the real worldChallenges of social media analysis in the real world
Challenges of social media analysis in the real world
 
Observational studies in social media
Observational studies in social mediaObservational studies in social media
Observational studies in social media
 
Content-based link prediction
Content-based link predictionContent-based link prediction
Content-based link prediction
 
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
Frontiers of Computational Journalism week 1 - Introduction and High Dimensio...
 
Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...
Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...
Mining Social Media with Linked Open Data, Entity Recognition, and Event Extr...
 
ase-social-informatics (6)
ase-social-informatics (6)ase-social-informatics (6)
ase-social-informatics (6)
 
Social Web 2014: Final Presentations (Part I)
Social Web 2014: Final Presentations (Part I)Social Web 2014: Final Presentations (Part I)
Social Web 2014: Final Presentations (Part I)
 
Fake News Detector
Fake News DetectorFake News Detector
Fake News Detector
 
Data visualisationsummit 2013
Data visualisationsummit 2013Data visualisationsummit 2013
Data visualisationsummit 2013
 

Ähnlich wie Harnessing diversity in crowds and machines for better ner performance

Secure360 May 2018 Lessons Learned from OWASP T10 Datacall
Secure360 May 2018 Lessons Learned from OWASP T10 DatacallSecure360 May 2018 Lessons Learned from OWASP T10 Datacall
Secure360 May 2018 Lessons Learned from OWASP T10 DatacallBrian Glas
 
Insights From Social Media
Insights From Social MediaInsights From Social Media
Insights From Social MediaDr Wasim Ahmed
 
Boosting Named Entity Extraction through Crowdsourcing
Boosting Named Entity Extraction through CrowdsourcingBoosting Named Entity Extraction through Crowdsourcing
Boosting Named Entity Extraction through Crowdsourcingoanainel
 
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Marieke van Erp
 
Spark Social Media
Spark Social Media Spark Social Media
Spark Social Media suresh sood
 
Collecting and Coding Twitter Data in DiscoverText
Collecting and Coding Twitter Data in DiscoverTextCollecting and Coding Twitter Data in DiscoverText
Collecting and Coding Twitter Data in DiscoverTextJill Hopke
 
Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Bianca Pereira
 
Tragedy of the Data Commons (ODSC-East, 2021)
Tragedy of the Data Commons (ODSC-East, 2021)Tragedy of the Data Commons (ODSC-East, 2021)
Tragedy of the Data Commons (ODSC-East, 2021)James Hendler
 
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...TAUS - The Language Data Network
 
Information Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ DeloitteInformation Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ DeloitteDeep Kayal
 
Driving Employee Engagement Through A Social Intranet - Federal Communicators...
Driving Employee Engagement Through A Social Intranet - Federal Communicators...Driving Employee Engagement Through A Social Intranet - Federal Communicators...
Driving Employee Engagement Through A Social Intranet - Federal Communicators...Federal Communicators Network
 
Digital Tools, Trends and Methodologies in the Humanities and Social Sciences
Digital Tools, Trends and Methodologies in the Humanities and Social SciencesDigital Tools, Trends and Methodologies in the Humanities and Social Sciences
Digital Tools, Trends and Methodologies in the Humanities and Social SciencesShawn Day
 
Broad Data (India 2015)
Broad Data (India 2015)Broad Data (India 2015)
Broad Data (India 2015)James Hendler
 
An Ensemble Model for Cross-Domain Polarity Classification on Twitter
An Ensemble Model for Cross-Domain Polarity Classification on TwitterAn Ensemble Model for Cross-Domain Polarity Classification on Twitter
An Ensemble Model for Cross-Domain Polarity Classification on TwitterSymeon Papadopoulos
 
Redesign Must Die (updated Feb 2014)
Redesign Must Die (updated Feb 2014)Redesign Must Die (updated Feb 2014)
Redesign Must Die (updated Feb 2014)Louis Rosenfeld
 
What your employees need to learn to work with data in the 21 st century
What your employees need to learn to work with data in the 21 st century What your employees need to learn to work with data in the 21 st century
What your employees need to learn to work with data in the 21 st century Human Capital Media
 
Linked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need ReconciliationLinked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need ReconciliationRobert Sanderson
 
Terp breuer misinfosecframeworks_cansecwest2019
Terp breuer misinfosecframeworks_cansecwest2019Terp breuer misinfosecframeworks_cansecwest2019
Terp breuer misinfosecframeworks_cansecwest2019bodaceacat
 
Misinfosec frameworks Cansecwest 2019
Misinfosec frameworks Cansecwest 2019Misinfosec frameworks Cansecwest 2019
Misinfosec frameworks Cansecwest 2019bodaceacat
 

Ähnlich wie Harnessing diversity in crowds and machines for better ner performance (20)

Secure360 May 2018 Lessons Learned from OWASP T10 Datacall
Secure360 May 2018 Lessons Learned from OWASP T10 DatacallSecure360 May 2018 Lessons Learned from OWASP T10 Datacall
Secure360 May 2018 Lessons Learned from OWASP T10 Datacall
 
Insights From Social Media
Insights From Social MediaInsights From Social Media
Insights From Social Media
 
Boosting Named Entity Extraction through Crowdsourcing
Boosting Named Entity Extraction through CrowdsourcingBoosting Named Entity Extraction through Crowdsourcing
Boosting Named Entity Extraction through Crowdsourcing
 
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
 
Spark Social Media
Spark Social Media Spark Social Media
Spark Social Media
 
Collecting and Coding Twitter Data in DiscoverText
Collecting and Coding Twitter Data in DiscoverTextCollecting and Coding Twitter Data in DiscoverText
Collecting and Coding Twitter Data in DiscoverText
 
Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)
 
Tragedy of the Data Commons (ODSC-East, 2021)
Tragedy of the Data Commons (ODSC-East, 2021)Tragedy of the Data Commons (ODSC-East, 2021)
Tragedy of the Data Commons (ODSC-East, 2021)
 
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...
TaaS Workshop 2014, Terminology Trends- First-hand Experience as a Blogger, M...
 
Information Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ DeloitteInformation Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ Deloitte
 
Driving Employee Engagement Through A Social Intranet - Federal Communicators...
Driving Employee Engagement Through A Social Intranet - Federal Communicators...Driving Employee Engagement Through A Social Intranet - Federal Communicators...
Driving Employee Engagement Through A Social Intranet - Federal Communicators...
 
Digital Tools, Trends and Methodologies in the Humanities and Social Sciences
Digital Tools, Trends and Methodologies in the Humanities and Social SciencesDigital Tools, Trends and Methodologies in the Humanities and Social Sciences
Digital Tools, Trends and Methodologies in the Humanities and Social Sciences
 
Broad Data (India 2015)
Broad Data (India 2015)Broad Data (India 2015)
Broad Data (India 2015)
 
An Ensemble Model for Cross-Domain Polarity Classification on Twitter
An Ensemble Model for Cross-Domain Polarity Classification on TwitterAn Ensemble Model for Cross-Domain Polarity Classification on Twitter
An Ensemble Model for Cross-Domain Polarity Classification on Twitter
 
Redesign Must Die (updated Feb 2014)
Redesign Must Die (updated Feb 2014)Redesign Must Die (updated Feb 2014)
Redesign Must Die (updated Feb 2014)
 
What your employees need to learn to work with data in the 21 st century
What your employees need to learn to work with data in the 21 st century What your employees need to learn to work with data in the 21 st century
What your employees need to learn to work with data in the 21 st century
 
A42020106
A42020106A42020106
A42020106
 
Linked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need ReconciliationLinked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need Reconciliation
 
Terp breuer misinfosecframeworks_cansecwest2019
Terp breuer misinfosecframeworks_cansecwest2019Terp breuer misinfosecframeworks_cansecwest2019
Terp breuer misinfosecframeworks_cansecwest2019
 
Misinfosec frameworks Cansecwest 2019
Misinfosec frameworks Cansecwest 2019Misinfosec frameworks Cansecwest 2019
Misinfosec frameworks Cansecwest 2019
 

Kürzlich hochgeladen

Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 

Kürzlich hochgeladen (20)

Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 

Harnessing diversity in crowds and machines for better ner performance

  • 1. Web & Media Group http://CrowdTruth.org Harnessing Diversity in Crowds and Machines for Better NER Performance Oana Inel and Lora Aroyo Extended Semantic Web Conference May 30th 2017
  • 2. Web & Media Group Ground Truth Datasets: OKE2015: Open Knowledge Extraction challenge (ESWC) 2015 • 101 sentences • 664 entities OKE2016: Open Knowledge Extraction challenge (ESWC) 2016 • 55 sentences • 340 entities Named Entities of Type: Person, Place, Organization & Role https://github.com/anuzzolese/oke-challenge https://github.com/anuzzolese/oke-challenge-2016 What’s wrong with the ground truth for NER?
  • 3. Web & Media Group http://CrowdTruth.org What’s wrong with the ground truth for NER? • Inconsistencies in annotating entities of type ROLE and PERSON … and similar for • But, “Bishop Petronius” is a PERSON ... controversial [bishop] [Richard Williamson]. ● [bishop] → role ● [Richard Williamson] → person … appointed Knight Bachelor by the [Queen] [Elizabeth II] in 1988. Bologna was reborn in the 5th century under [Bishop Petronius].
  • 4. Web & Media Group http://CrowdTruth.org What’s wrong with the ground truth for NER? • Inconsistencies across datasets • 4 cases in OKE2015 where concatenation of entities of type PLACE were treated as a single entity … but, the offsets, do not match with the actual string • In OKE2016, such entities were classified as two entities of type PLACE Such a man did post-doctoral work at the Salk Institute in San Diego in the laboratory of Renato Dulbecco, then worked at the Basel Institute for Immunology in [Basel, Switzerland].
  • 5. Web & Media Group http://CrowdTruth.org What’s wrong with the ground truth for NER? Giulio Natta was born in Imperia, Italy. [He] earned [his] degree in chemical engineering from the Politecnico di Milano university in Milan in 1924. [One of the them] was an eminent scholar at Berkeley. • There are errors in the ground truth • 1 case in OKE2015 • Personal pronouns (co-references) and possessive pronouns are considered named entities of type PERSON • 85/304 cases (OKE2015) • 27/105 cases (OKE2016) What’s wrong with the ground truth for NER?
  • 6. Web & Media Group http://CrowdTruth.org How reliable are the NER tools trained using this ground truth? … but, then ...
  • 7. Web & Media Group How do state-of-the-art NER tools perform on this ground truth? Performance of NER Tools: • NERD-ML • TextRazor • THD • DBpediaSpotlight • SemiTags http://nerd.eurecom.fr 
 https://www.textrazor.com 
 http://ner.vse.cz/thd/ 
 http://dbpedia-spotlight.github.io/demo/ 
 http://nlp.vse.cz/SemiTags/ 

  • 8. Web & Media Group http://CrowdTruth.org • do not extract personal pronouns (co-references) and possessive pronouns as type PERSON entities • 83/85 missed cases in OKE2015 • 26/27 missed cases in OKE2016 • extract multiple span variations for the same entity type PERSON • 73/92 cases in OKE2015 • 9/13 cases in OKE2016 … let’s see how NER tools perform on this GT Giulio Natta was born in Imperia, Italy. [He] earned [his] degree in chemical engineering from the Politecnico di Milano university in Milan in 1924. The woman was awarded the Nobel Prize in Physics in 1963, which she shared with [[J.] [[Hans] D.] [Jensen]] and Eugene Wigner.
  • 9. Web & Media Group http://CrowdTruth.org ● Many entities of type ORGANIZATION are a combination of ORGANIZATION + PLACE • NER tools tend to extract each entity granularity • GT does not allow for overlapping entities or multiple perspectives • 105/213 cases in OKE2015 • 62/157 cases in OKE2016 … let’s see how NER tools perform on this GT Such a man did post-doctoral work at the Salk Institute in San Diego in the laboratory of Renato Dulbecco, then worked at the [[[Basel] [Institute]] for Immunology] in Basel, Switzerland.
  • 10. Web & Media Group http://CrowdTruth.org … and now let’s see their overall performance ● High disagreement between the NER tools ○ Similar performance in F1, but different #FP, #TP, #FN
  • 11. Web & Media Group http://CrowdTruth.org … and now let’s see their overall performance ● High disagreement between the NER tools ○ Similar performance in F1, but different #FP, #TP, #FN ● Low recall, many entities missed
  • 12. Web & Media Group http://CrowdTruth.org difficult to understand the reliability of different NER tools it’s difficult to choose the “best one” for your case
  • 13. Web & Media Group Hypothesis Aggregating the output of NER tools by harnessing the inter-tool disagreement (Multi-NER) performs better than the individual NERs (Single-NER)
  • 14. Web & Media Group http://CrowdTruth.org Multi-NER vs. SOTA Single-NER
  • 15. Web & Media Group http://CrowdTruth.org • Multi-NER significantly higher #TP & lower #FN Multi-NER vs. SOTA Single-NER
  • 16. Web & Media Group http://CrowdTruth.org • Multi-NER significantly higher #TP & lower #FN • Multi-NER significantly higher #FP Multi-NER vs. SOTA Single-NER
  • 17. Web & Media Group http://CrowdTruth.org17 The more the merrier? Is performance correlated with the number of NER tools that extracted a given named entity? ● Performance comparison: • Likelihood of an entity to be contained in the gold standard based on how many NER tools extracted it Sentence-entity score = ratio of NER that extracted the entity Multi-NER vs. SOTA Single-NER
  • 18. Web & Media Group http://CrowdTruth.org18 Disagreement Aware Multi-NER vs. Single-NER Multi-NER outperforms SOTA Single-NER tools at a sentence-entity score threshold >= 0.4
  • 19. Web & Media Group http://CrowdTruth.org19 Disagreement Aware Multi-NER vs. Single-NER 19 Multi-NER outperforms the majority vote approach (sentence-entity score threshold = 0.6) at a sentence-entity score threshold >= 0.4
  • 20. Web & Media Group Hypothesis Crowdsourced ground truth by harnessing inter-annotator disagreement (Multi-NER+CrowdGT) produces diversity in annotations and thus, improves the aggregated output of NER tools
  • 21. Web & Media Group http://CrowdTruth.org21 • Identify all the valid expressions and their types • Reduce the number of FP and FN cases Crowdsourcing for Better Ground Truth
  • 22. Web & Media Group http://CrowdTruth.org22 Crowdsourcing Task
  • 23. Web & Media Group http://CrowdTruth.org Crowdsourcing for Better NER For each crowd sentence-entity score the crowd enhanced Multi-NER (Multi-NER+Crowd) performs much better than the Multi-NER
  • 24. Web & Media Group http://CrowdTruth.org Crowdsourcing for Better Ground Truth Multi-NER+CrowdGT, the enhanced Multi-NER through crowd-driven ground truth gathering performs better than Multi-NER+Crowd.
  • 25. Web & Media Group http://CrowdTruth.org Conclusions & Future Work ● a hybrid Multi-NER crowd-driven approach for improved NER performance ● the crowd is able to correct the mistakes of the NER tools by reducing the total number of false positive cases ● the crowd-driven ground truth, that harnesses diversity, perspectives and granularities, proves to be more reliable ● improving the event detection ground truth (FactBank) through disagreement-aware crowdsourcing ● train an event classifier with crowd-driven events Data http://data.crowdtruth.org http://data.crowdtruth.org/crowdsourcing-ne-goldstandards/ Tool & Code http://dev.CrowdTruth.org http://github.com/CrowdTruth