Vrije Universiteit Amsterdam
Boosting Named Entity Extraction
through Crowdsourcing
what goes wrong with IE tools?
what can we learn from the crowd?
Oana Inel
5th December 2016
Named Entity Recognition: Observations
NER tools:
● work best on a limited set of predefined entity types (e.g., people, places, organizations, and to some extent time)
● are all trained on different data
○ perform well only on particular types of data/entities
● have performance that is highly dependent on
○ the type of input text
○ the choice of gold standard
■ gold standards are not perfect
■ a large amount of training & evaluation data is needed
● show similar performance, but different entity coverage
○ different confidence scores
○ different (non-transparent) ways of computing them
IE Tools Issues
Problem:
- difficult to understand the reliability of the different NER tools
- difficult to choose “the best one” for your case
Solution:
- Combined use, e.g. NERD
- However, it also has problems
  - On-the-spot reliance on other NER tools
  - Limited number of types identified
- An alternative to NERD
Combining Machines & Crowd for NER
1. Choose multiple SOTA NER tools
2. Combine (aggregate) their output
3. Identify cases where the NER tools underperform
4. Correct and improve the NER tools' output through crowdsourcing an improved ground truth
Use case
5 NER tools
● NERD-ML
● TextRazor
● SemiTags
● THD
● DBpediaSpotlight
Comparative analysis on:
- their individual performance (output)
- their combined performance (output)
Using two existing gold standard datasets:
- Open Knowledge Extraction (OKE) Challenge 2015 & 2016
OKE 2015 & 2016 Datasets
● OKE challenge (ESWC) 2015
○ 101 sentences
○ 664 entities
■ Person: 304
■ Place: 120
■ Organization: 139
■ Role: 103
○ https://github.com/anuzzolese/oke-challenge
● OKE challenge (ESWC) 2016
○ 55 sentences
○ 340 entities
■ Person: 105
■ Place: 44
■ Organization: 105
■ Role: 86
○ https://github.com/anuzzolese/oke-challenge-2016
NER Performance: entity surface
● High disagreement between the NER tools
○ Similar performance in F1, but different #FP, #TP, #FN
● Low recall, many entities missed
● NERD seems to perform the best on F1
● CombinedNER significantly higher #TP & lower #FN
● CombinedNER significantly higher #FP
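How two tools can reach the same F1 with different #TP, #FP and #FN can be sketched as follows. This is a minimal illustration that assumes exact matching on (start, end) character offsets, which may differ from the matching criterion used in the evaluation. The spans and the names `tool_a` and `tool_b` are hypothetical.

```python
def evaluate(predicted, gold):
    """Return TP, FP, FN, precision, recall and F1 for two sets of spans."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)   # extracted and in the gold standard
    fp = len(predicted - gold)   # extracted but not in the gold standard
    fn = len(gold - predicted)   # in the gold standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}

# Hypothetical (start, end) spans, invented for illustration only.
gold = {(0, 12), (25, 32), (34, 39), (60, 81)}
tool_a = {(0, 12), (25, 32), (40, 45), (50, 55)}
tool_b = {(0, 12), (25, 32), (34, 39), (1, 5), (13, 16),
          (17, 21), (40, 45), (50, 55)}

res_a = evaluate(tool_a, gold)
res_b = evaluate(tool_b, gold)
print(res_a)
print(res_b)
```

In this toy case both tools score the same F1, yet `tool_b` finds more gold entities at the cost of more false positives, so F1 alone hides the disagreement.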
CombinedNER vs. SOTA NER
The more the merrier?
Is performance correlated to the number of NER tools that
extracted a given named entity?
● Performance comparison:
○ Applied CrowdTruth metrics on CombinedNER
○ Likelihood of an entity to be contained in the gold
standard based on how many NER tools extracted it
Sentence-entity score = ratio of NER tools that extracted the entity
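The sentence-entity score can be sketched as a simple vote count. This is a minimal sketch, assuming each tool returns a set of (start, end, surface) spans for a sentence and that two extractions are the same entity only when their spans match exactly. The tool outputs below are hypothetical and are not taken from the experiments.

```python
from collections import defaultdict

def sentence_entity_scores(tool_outputs):
    """Map each extracted span to the fraction of tools that extracted it."""
    votes = defaultdict(set)
    for tool, spans in tool_outputs.items():
        for span in spans:
            votes[span].add(tool)
    n_tools = len(tool_outputs)  # tools that extracted nothing still count
    return {span: len(tools) / n_tools for span, tools in votes.items()}

# Hypothetical outputs for "Giulio Natta was born in Imperia, Italy."
outputs = {
    "NERD-ML": {(0, 12, "Giulio Natta"), (25, 32, "Imperia")},
    "TextRazor": {(0, 12, "Giulio Natta"), (34, 39, "Italy")},
    "SemiTags": {(0, 12, "Giulio Natta")},
    "THD": {(7, 12, "Natta"), (25, 32, "Imperia")},
    "DBpediaSpotlight": {(25, 32, "Imperia"), (34, 39, "Italy")},
}

scores = sentence_entity_scores(outputs)
for span, score in sorted(scores.items()):
    print(span, score)
```

Because matching is exact, "Natta" and "Giulio Natta" receive separate scores here; a different span-matching policy would change the numbers.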
CombinedNER vs. SOTA NER
CombinedNER outperforms the state-of-the-art NER tools at a sentence-entity score >= 0.4, which is also better than the majority vote approach
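With five tools, a sentence-entity score >= 0.4 means that at least two tools extracted the entity, whereas a majority vote requires at least three. A minimal sketch of the difference, using hypothetical scores and treating majority vote as "more than half of the tools" (an assumption about how the baseline is defined):

```python
N_TOOLS = 5

def keep_at_threshold(scores, threshold):
    """Entities whose sentence-entity score reaches the threshold."""
    return {entity for entity, score in scores.items() if score >= threshold}

def keep_by_majority(scores):
    """Entities extracted by more than half of the tools."""
    return {entity for entity, score in scores.items() if score > 0.5}

# Hypothetical scores: (number of tools that extracted the entity) / 5.
scores = {
    "Giulio Natta": 3 / N_TOOLS,
    "Imperia": 3 / N_TOOLS,
    "Italy": 2 / N_TOOLS,
    "Natta": 1 / N_TOOLS,
}

at_threshold = keep_at_threshold(scores, 0.4)
majority = keep_by_majority(scores)
print(sorted(at_threshold))
print(sorted(majority))
```

The lower threshold keeps entities that only two tools agree on, which is where the gain in recall over majority voting would come from.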
Where do NER tools fail and why?
NER Performance: entity surface & type
● many instances of “people” were missed
Deeper look in Ground Truth: People
● Personal pronouns (co-references) and possessive pronouns are
considered named entities of type “person”
○ 83/85 cases (in OKE2015)
○ 26/27 cases (in OKE2016)
Giulio Natta was born in Imperia, Italy. [He] earned [his] degree in chemical
engineering from the Politecnico di Milano university in Milan in 1924.
● There are also errors in the ground truth
○ 1 case in OKE2015
[One of the them] was an eminent scholar at Berkeley.
NER Performance: entity surface & type
● only a few “places” were missed in 2015, and none in 2016
Deeper look in Ground Truth: Places
● Some entities concatenate multiple entities of type place, e.g.: City, Country
● 4/4 cases in OKE2015
Such a man did post-doctoral work at the Salk Institute in San Diego in the
laboratory of Renato Dulbecco, then worked at the Basel Institute for
Immunology in [Basel, Switzerland].
but the offsets given in the GT do not match the actual string
● Inconsistencies across datasets
○ In 2016, such entities were actually classified as two entities of
type place
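A mismatch between ground-truth offsets and the annotated string can be detected mechanically. The sketch below assumes annotations are (start, end, surface) triples with an exclusive end offset. The shortened sentence and the off-by-one annotation are hypothetical, since the slides do not say how the offsets were wrong.

```python
def check_offsets(sentence, annotations):
    """Return annotations whose offsets do not reproduce their surface string.

    Each result is (start, end, expected_surface, actual_text).
    """
    mismatches = []
    for start, end, surface in annotations:
        actual = sentence[start:end]
        if actual != surface:
            mismatches.append((start, end, surface, actual))
    return mismatches

sentence = ("He then worked at the Basel Institute for Immunology "
            "in Basel, Switzerland.")
surface = "Basel, Switzerland"
start = sentence.rfind(surface)

good = (start, start + len(surface), surface)
# Hypothetical faulty annotation: offsets shifted by one character.
shifted = (start + 1, start + 1 + len(surface), surface)

mismatches = check_offsets(sentence, [good, shifted])
print(mismatches)
```

Running such a check over a whole gold standard before evaluation would separate annotation errors from genuine tool errors.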
NER Performance: entity surface & type
● many FP for entities of type “organization”
Deeper look in Ground Truth: Organization
● Many entities of type “organization” are a combination of
“organization” + “place”
○ NER tools tend to extract the entity at each granularity
○ GT does not allow for overlapping entities or multiple
perspectives
■ 105/213 cases in OKE2015
■ 62/157 cases in OKE2016
Such a man did post-doctoral work at the Salk Institute in San Diego in
the laboratory of Renato Dulbecco, then worked at the [[[Basel]
[Institute]] for Immunology] in Basel, Switzerland.
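The effect of a ground truth without overlapping entities can be sketched by listing the extractions that overlap a gold span without matching it exactly; under exact matching each of them counts as a false positive. This sketch assumes the gold standard keeps only the longest span, which the slide does not state, and it uses a shortened version of the example sentence.

```python
def overlaps(a, b):
    """True if two (start, end) spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]

def partial_matches(extracted, gold):
    """Spans that overlap a gold span without matching it exactly."""
    return sorted(
        span for span in extracted
        if span not in gold and any(overlaps(span, g) for g in gold)
    )

sentence = ("then worked at the Basel Institute for Immunology "
            "in Basel, Switzerland.")

def span_of(surface):
    """(start, end) offsets of the first occurrence of surface."""
    start = sentence.index(surface)
    return (start, start + len(surface))

# Assumption: the gold standard contains only the longest span.
gold = {span_of("Basel Institute for Immunology")}
extracted = {span_of(s) for s in [
    "Basel", "Institute", "Basel Institute",
    "Basel Institute for Immunology",
]}

near_misses = partial_matches(extracted, gold)
surfaces = [sentence[start:end] for start, end in near_misses]
print(surfaces)
```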
NER Performance: entity surface & type
● several FP for entities of type “people” and “role”
Deeper look in Ground Truth: People & Role
● Multiple span variations for the same entity of type “person”
○ 73/92 cases in OKE2015
○ 9/13 cases in OKE2016
The woman was awarded the Nobel Prize in Physics in 1963, which she
shared with [[J.] [[Hans] D.] [Jensen]] and Eugene Wigner.
● Inconsistencies & ambiguous combinations of type “role” and
“person”
○ Bishop Petronius → person
○ But, Queen Elizabeth II was not typed “person”
■ Queen → role
■ Elizabeth II → person
● Many combinations of “person” and “role”, especially when the
“person” is an ethnic group (e.g., French author, Canadian citizen)
○ 9/92 cases in OKE2015
○ 2/13 cases in OKE2016
Crowdsourcing for better Ground Truth
● Crowd-driven Ground Truth
Case 1: Crowd reduces the number of #FP
● For each entity that has multiple variations (span alternatives) we create an entity cluster
Case 2: Crowd reduces the number of #FN
● For each entity that was not extracted, we create a cluster
with partial overlaps but also every other combination of
words contained in the overlap
● Goal:
○ identify all the valid expressions and their types
○ decrease the number of FP and the number of FN
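Both cases can be sketched in a few lines. For Case 1, overlapping span alternatives are grouped into one cluster; for Case 2, candidate expressions are generated from the words of a phrase. The sketch assumes (start, end) character spans and reads "every other combination of words" as every contiguous word sequence, which is one possible interpretation. The function names and spans are hypothetical.

```python
def cluster_spans(spans):
    """Group (start, end) spans into clusters of connected overlaps."""
    clusters = []
    for span in sorted(spans):
        # A span joins the current cluster if it starts before that
        # cluster's furthest end offset.
        if clusters and span[0] < clusters[-1]["end"]:
            clusters[-1]["spans"].append(span)
            clusters[-1]["end"] = max(clusters[-1]["end"], span[1])
        else:
            clusters.append({"end": span[1], "spans": [span]})
    return [cluster["spans"] for cluster in clusters]

def word_combinations(phrase):
    """All contiguous word sequences of a phrase, longest first."""
    words = phrase.split()
    n = len(words)
    return [
        " ".join(words[i:i + size])
        for size in range(n, 0, -1)
        for i in range(n - size + 1)
    ]

# Case 1: hypothetical span alternatives plus one unrelated span.
spans = {(19, 24), (25, 34), (19, 34), (19, 49), (53, 58)}
clusters = cluster_spans(spans)
print(clusters)

# Case 2: candidate expressions to show to the crowd.
candidates = word_combinations("Basel Institute for Immunology")
print(candidates)
```

Each cluster would then be shown to the crowd as one unit, so that workers choose among alternatives of the same entity.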
Crowdsourcing task - template
Results: CombinedNER vs. CombinedNER+Crowd
CombinedNER+Crowd outperforms CombinedNER for each crowd-entity score
Crowd-entity score = likelihood of an entity to be a valid entity in the dataset (based on CrowdTruth)
Conclusions
● difficult to find one NER tool that performs well
● combining the output of several NER tools results in disagreement
But,
● using crowdsourcing to correct and improve their output results in a better outcome
● furthermore, the crowd can help us in identifying problems of the GT
25

Weitere ähnliche Inhalte

Andere mochten auch

Gamification of crowdsourcing tasks: What motivates a medical expert?
Gamification of crowdsourcing tasks: What motivates a medical expert?Gamification of crowdsourcing tasks: What motivates a medical expert?
Gamification of crowdsourcing tasks: What motivates a medical expert?CrowdTruth
 
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 Lora Aroyo
 
Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data
Visualization of Disagreement-based Quality Metrics of Crowdsourcing DataVisualization of Disagreement-based Quality Metrics of Crowdsourcing Data
Visualization of Disagreement-based Quality Metrics of Crowdsourcing DataCrowdTruth
 
Crowdsourcing Disagreement on Open-Domain Questions
Crowdsourcing Disagreement on Open-Domain QuestionsCrowdsourcing Disagreement on Open-Domain Questions
Crowdsourcing Disagreement on Open-Domain QuestionsBenjamin Timmermans
 
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Lora Aroyo
 
CrowdTruth Games @NLeSc eHumanities day 2015
CrowdTruth Games @NLeSc eHumanities day 2015CrowdTruth Games @NLeSc eHumanities day 2015
CrowdTruth Games @NLeSc eHumanities day 2015Lora Aroyo
 
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...CrowdTruth
 
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...Lora Aroyo
 
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014Lora Aroyo
 
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchSXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchLora Aroyo
 
DIVE Semantic Web Challenge Presentation
DIVE Semantic Web Challenge Presentation DIVE Semantic Web Challenge Presentation
DIVE Semantic Web Challenge Presentation Victor de Boer
 
Harnessing the Power of Machines & Crowds for Event Extraction
Harnessing the Power of Machines & Crowds for Event ExtractionHarnessing the Power of Machines & Crowds for Event Extraction
Harnessing the Power of Machines & Crowds for Event Extractionoanainel
 
"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to Snapchat"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to SnapchatLora Aroyo
 
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...Lora Aroyo
 
UMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyUMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyLora Aroyo
 
Keynote @Final NWO CATCH Program Event
Keynote @Final NWO CATCH Program EventKeynote @Final NWO CATCH Program Event
Keynote @Final NWO CATCH Program EventLora Aroyo
 
NoTube: integrating TV and Web with the help of semantics
NoTube: integrating TV and Web with the help of semanticsNoTube: integrating TV and Web with the help of semantics
NoTube: integrating TV and Web with the help of semanticsGuus Schreiber
 
Web Science: the digital heritage case
Web Science: the digital heritage caseWeb Science: the digital heritage case
Web Science: the digital heritage caseGuus Schreiber
 
Ontologies for multimedia: the Semantic Culture Web
Ontologies for multimedia: the Semantic Culture WebOntologies for multimedia: the Semantic Culture Web
Ontologies for multimedia: the Semantic Culture WebGuus Schreiber
 

Andere mochten auch (20)

Gamification of crowdsourcing tasks: What motivates a medical expert?
Gamification of crowdsourcing tasks: What motivates a medical expert?Gamification of crowdsourcing tasks: What motivates a medical expert?
Gamification of crowdsourcing tasks: What motivates a medical expert?
 
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
(Presentation Chris) Crowdsourcing & Semantic Web: Dagstuhl 2014
 
Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data
Visualization of Disagreement-based Quality Metrics of Crowdsourcing DataVisualization of Disagreement-based Quality Metrics of Crowdsourcing Data
Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data
 
Crowdsourcing Disagreement on Open-Domain Questions
Crowdsourcing Disagreement on Open-Domain QuestionsCrowdsourcing Disagreement on Open-Domain Questions
Crowdsourcing Disagreement on Open-Domain Questions
 
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
 
CrowdTruth Games @NLeSc eHumanities day 2015
CrowdTruth Games @NLeSc eHumanities day 2015CrowdTruth Games @NLeSc eHumanities day 2015
CrowdTruth Games @NLeSc eHumanities day 2015
 
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
 
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities pr...
 
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014
Truth is a Lie: 7 Myths about Human Annotation @CogComputing Forum 2014
 
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchSXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
 
Kick-off meeting Linkflows project
Kick-off meeting Linkflows projectKick-off meeting Linkflows project
Kick-off meeting Linkflows project
 
DIVE Semantic Web Challenge Presentation
DIVE Semantic Web Challenge Presentation DIVE Semantic Web Challenge Presentation
DIVE Semantic Web Challenge Presentation
 
Harnessing the Power of Machines & Crowds for Event Extraction
Harnessing the Power of Machines & Crowds for Event ExtractionHarnessing the Power of Machines & Crowds for Event Extraction
Harnessing the Power of Machines & Crowds for Event Extraction
 
"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to Snapchat"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to Snapchat
 
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...
 
UMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyUMAP 2016 Opening Ceremony
UMAP 2016 Opening Ceremony
 
Keynote @Final NWO CATCH Program Event
Keynote @Final NWO CATCH Program EventKeynote @Final NWO CATCH Program Event
Keynote @Final NWO CATCH Program Event
 
NoTube: integrating TV and Web with the help of semantics
NoTube: integrating TV and Web with the help of semanticsNoTube: integrating TV and Web with the help of semantics
NoTube: integrating TV and Web with the help of semantics
 
Web Science: the digital heritage case
Web Science: the digital heritage caseWeb Science: the digital heritage case
Web Science: the digital heritage case
 
Ontologies for multimedia: the Semantic Culture Web
Ontologies for multimedia: the Semantic Culture WebOntologies for multimedia: the Semantic Culture Web
Ontologies for multimedia: the Semantic Culture Web
 

Ähnlich wie Boosting Named Entity Extraction through Crowdsourcing

Harnessing diversity in crowds and machines for better ner performance
Harnessing diversity in crowds and machines for better ner performanceHarnessing diversity in crowds and machines for better ner performance
Harnessing diversity in crowds and machines for better ner performanceoanainel
 
Babak Rasolzadeh: The importance of entities
Babak Rasolzadeh: The importance of entitiesBabak Rasolzadeh: The importance of entities
Babak Rasolzadeh: The importance of entitiesZoltan Varju
 
How Virtual is Virtual: Designing for Distributed Work in Research and Develo...
How Virtual is Virtual: Designing for Distributed Work in Research and Develo...How Virtual is Virtual: Designing for Distributed Work in Research and Develo...
How Virtual is Virtual: Designing for Distributed Work in Research and Develo...Sociotechnical Roundtable
 
Dannys Slides
Dannys SlidesDannys Slides
Dannys SlidesMary Rose
 
[13 - A] Experiment validity
[13 - A] Experiment validity[13 - A] Experiment validity
[13 - A] Experiment validityIvano Malavolta
 
databases2
databases2databases2
databases2c.west
 
Leveraging Networks May2013 for Skanska
Leveraging Networks May2013 for SkanskaLeveraging Networks May2013 for Skanska
Leveraging Networks May2013 for SkanskaRobin Teigland
 
databases3b
databases3bdatabases3b
databases3bc.west
 
The blind spots of evaluations in academic work
The blind spots of evaluations in academic workThe blind spots of evaluations in academic work
The blind spots of evaluations in academic workFrank van der Most
 
Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Bianca Pereira
 
Odsc machine-learning-guide-v1
Odsc machine-learning-guide-v1Odsc machine-learning-guide-v1
Odsc machine-learning-guide-v1Harsh Khatke
 
Data Sets as Facilitator for new Products and Services for Universities
Data Sets as Facilitator for new Products and Services for UniversitiesData Sets as Facilitator for new Products and Services for Universities
Data Sets as Facilitator for new Products and Services for UniversitiesHendrik Drachsler
 
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...Julien PLU
 
MetropolitanU.pptx
MetropolitanU.pptxMetropolitanU.pptx
MetropolitanU.pptxImre Hild
 

Ähnlich wie Boosting Named Entity Extraction through Crowdsourcing (20)

Harnessing diversity in crowds and machines for better ner performance
Harnessing diversity in crowds and machines for better ner performanceHarnessing diversity in crowds and machines for better ner performance
Harnessing diversity in crowds and machines for better ner performance
 
Babak Rasolzadeh: The importance of entities
Babak Rasolzadeh: The importance of entitiesBabak Rasolzadeh: The importance of entities
Babak Rasolzadeh: The importance of entities
 
Tf wiads
Tf wiadsTf wiads
Tf wiads
 
Why am I doing this???
Why am I doing this???Why am I doing this???
Why am I doing this???
 
How Virtual is Virtual: Designing for Distributed Work in Research and Develo...
How Virtual is Virtual: Designing for Distributed Work in Research and Develo...How Virtual is Virtual: Designing for Distributed Work in Research and Develo...
How Virtual is Virtual: Designing for Distributed Work in Research and Develo...
 
Dannys Slides
Dannys SlidesDannys Slides
Dannys Slides
 
[13 - A] Experiment validity
[13 - A] Experiment validity[13 - A] Experiment validity
[13 - A] Experiment validity
 
databases2
databases2databases2
databases2
 
Deliveroo_Edinburgh
Deliveroo_EdinburghDeliveroo_Edinburgh
Deliveroo_Edinburgh
 
Leveraging Networks May2013 for Skanska
Leveraging Networks May2013 for SkanskaLeveraging Networks May2013 for Skanska
Leveraging Networks May2013 for Skanska
 
databases3b
databases3bdatabases3b
databases3b
 
The blind spots of evaluations in academic work
The blind spots of evaluations in academic workThe blind spots of evaluations in academic work
The blind spots of evaluations in academic work
 
DMDW Unit 1.pdf
DMDW Unit 1.pdfDMDW Unit 1.pdf
DMDW Unit 1.pdf
 
CORFU-MTSR 2013
CORFU-MTSR 2013CORFU-MTSR 2013
CORFU-MTSR 2013
 
Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)
 
Odsc machine-learning-guide-v1
Odsc machine-learning-guide-v1Odsc machine-learning-guide-v1
Odsc machine-learning-guide-v1
 
Data Sets as Facilitator for new Products and Services for Universities
Data Sets as Facilitator for new Products and Services for UniversitiesData Sets as Facilitator for new Products and Services for Universities
Data Sets as Facilitator for new Products and Services for Universities
 
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
 
Scaling Crisismapping
Scaling CrisismappingScaling Crisismapping
Scaling Crisismapping
 
MetropolitanU.pptx
MetropolitanU.pptxMetropolitanU.pptx
MetropolitanU.pptx
 

Kürzlich hochgeladen

Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlkumarajju5765
 

Kürzlich hochgeladen (20)

Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
 

Boosting Named Entity Extraction through Crowdsourcing

  • 1. Vrije Universiteit Amsterdam Boosting Named Entity Extraction through Crowdsourcing what goes wrong with IE tools? what can we learn from the crowd? Oana Inel 5th December 2016 1
  • 2. Vrije Universiteit Amsterdam ● work best on limited (predefined) entity types (e.g., people, places, organizations, and to some extend time) ● are all trained on different data ○ perform well only on particular type of data/entities ● their performance is highly dependent on ○ the type of input text ○ the choice of gold standards ■ gold standards are not perfect ■ large amount of training & evaluation data is needed ● similar performance, but different entities coverage ○ different confidence scores ○ different way (non-transparent) of computing it Named Entity Recognition: Observations 2
  • 3. IE Tools Issues
      Problem:
      - difficult to understand the reliability of the different NER tools
      - difficult to choose “the best one” for your case
      Solution:
      - Combined use, e.g. NERD
      - However, it also has problems
        - On-the-spot reliance on other NER tools
        - Limited number of types identified
      - An alternative to NERD
  • 4. Combining Machines & Crowd for NER
      1. Choose multiple SOTA NER tools
      2. Combine (aggregate) their output
      3. Identify cases where the NER tools underperform
      4. Correct and improve the NER tools' output by crowdsourcing an improved ground truth
  • 5. Use case
      5 NER tools:
      ● NERD-ML
      ● TextRazor
      ● SemiTags
      ● THD
      ● DBpediaSpotlight
      Comparative analysis on:
      - their individual performance (output)
      - their combined performance (output)
      Using two existing gold standard datasets:
      - Open Knowledge Extraction (OKE) Challenge 2015 & 2016
  • 6. OKE 2015 & 2016 Datasets
      ● OKE challenge (ESWC) 2015
        ○ 101 sentences
        ○ 664 entities
          ■ Person: 304
          ■ Place: 120
          ■ Organization: 139
          ■ Role: 103
        ○ https://github.com/anuzzolese/oke-challenge
      ● OKE challenge (ESWC) 2016
        ○ 55 sentences
        ○ 340 entities
          ■ Person: 105
          ■ Place: 44
          ■ Organization: 105
          ■ Role: 86
        ○ https://github.com/anuzzolese/oke-challenge-2016
  • 7. NER Performance: entity surface
      ● High disagreement between the NER tools
        ○ Similar performance in F1, but different #FP, #TP, #FN
      ● Low recall, many entities missed
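A minimal sketch of how two tools can reach the same F1 with very different error profiles. The counts are hypothetical and are not taken from the OKE evaluation; `prf` is an illustrative helper name.

```python
def prf(tp, fp, fn):
    """Return (precision, recall, F1) computed from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for two tools (assumed, not from the slides).
tool_a = prf(tp=60, fp=20, fn=40)  # more hits, but also more false positives
tool_b = prf(tp=50, fp=0, fn=50)   # fewer hits, no false positives

print("tool A (P, R, F1):", tool_a)
print("tool B (P, R, F1):", tool_b)
```

Both hypothetical tools end up with the same F1 even though their #TP, #FP and #FN differ, which is why F1 alone says little about which entities each tool covers.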
  • 8. NER Performance: entity surface
      ● High disagreement between the NER tools
        ○ Similar performance in F1, but different #FP, #TP, #FN
      ● NERD seems to perform the best on F1
  • 9. NER Performance: entity surface
      ● High disagreement between the NER tools
        ○ Similar performance in F1, but different #FP, #TP, #FN
      ● NERD seems to perform the best on F1
      ● CombinedNER has significantly higher #TP & lower #FN
  • 10. NER Performance: entity surface
      ● High disagreement between the NER tools
        ○ Similar performance in F1, but different #FP, #TP, #FN
      ● NERD seems to perform the best on F1
      ● CombinedNER has significantly higher #TP & lower #FN
      ● CombinedNER has significantly higher #FP
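The effect of combining tools can be sketched with plain set operations. This is an illustration under stated assumptions, not the evaluation used in the slides: entities are matched by exact surface string, CombinedNER is taken to be the union of all tool outputs, and the tool names and outputs are made up.

```python
def confusion(predicted, gold):
    """Return (TP, FP, FN) for exact surface-string matching."""
    tp = len(predicted & gold)
    fp = len(predicted - gold)
    fn = len(gold - predicted)
    return tp, fp, fn


# Hypothetical gold standard and tool outputs for one sentence.
gold = {"Giulio Natta", "Imperia", "Italy"}
tool_outputs = {
    "tool_a": {"Giulio Natta", "Italy", "Natta"},
    "tool_b": {"Imperia", "born"},
}

# Assumption: the combined output is the union of every tool's output.
combined = set().union(*tool_outputs.values())

for name, entities in tool_outputs.items():
    print(name, confusion(entities, gold))
print("combined", confusion(combined, gold))
```

In this toy case the union recovers every gold entity (more TP, fewer FN) but also inherits every tool's false positives, which mirrors the trade-off listed above.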
  • 11. CombinedNER vs. SOTA NER
      The more the merrier? Is performance correlated with the number of NER tools that extracted a given named entity?
      ● Performance comparison:
        ○ Applied CrowdTruth metrics on CombinedNER
        ○ Likelihood of an entity being contained in the gold standard, based on how many NER tools extracted it
      Sentence-entity score = ratio of NER tools that extracted the entity
  • 12. CombinedNER vs. SOTA NER
      CombinedNER outperforms the state-of-the-art NER tools at a sentence-entity score >= 0.4, which is also better than the majority-vote approach
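A minimal sketch of the sentence-entity score and the two selection rules mentioned above. The tool names come from the use case slide, but the extracted entities are invented for illustration, and treating "majority vote" as a score strictly above 0.5 is an assumption.

```python
from collections import Counter


def sentence_entity_scores(outputs):
    """Map each entity to the ratio of tools that extracted it.

    outputs: dict of tool name -> set of entity surface strings
    extracted from a single sentence.
    """
    n_tools = len(outputs)
    counts = Counter(e for entities in outputs.values() for e in entities)
    return {entity: count / n_tools for entity, count in counts.items()}


# Hypothetical outputs for "Giulio Natta was born in Imperia, Italy."
outputs = {
    "NERD-ML": {"Giulio Natta", "Imperia", "Italy"},
    "TextRazor": {"Giulio Natta", "Italy"},
    "SemiTags": {"Natta", "Imperia"},
    "THD": {"Giulio Natta", "Italy"},
    "DBpediaSpotlight": {"Italy"},
}

scores = sentence_entity_scores(outputs)

# Rule from the slide: keep entities with a score >= 0.4.
kept_at_threshold = {e for e, s in scores.items() if s >= 0.4}
# Assumed majority vote: more than half of the tools agree.
kept_by_majority = {e for e, s in scores.items() if s > 0.5}

print(sorted(kept_at_threshold))
print(sorted(kept_by_majority))
```

With five tools, a threshold of 0.4 keeps any entity found by at least two tools, whereas a majority vote needs three, so the threshold admits entities that only a minority of tools detect.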
  • 13. Where do NER tools fail and why?
  • 14. NER Performance: entity surface & type
      ● many instances of “people” were missed
  • 15. Deeper look in Ground Truth: People
      ● Personal pronouns (co-references) and possessive pronouns are considered named entities of type “person”
        ○ 83/85 cases (in OKE2015)
        ○ 26/27 cases (in OKE2016)
        Example: Giulio Natta was born in Imperia, Italy. [He] earned [his] degree in chemical engineering from the Politecnico di Milano university in Milan in 1924.
      ● There are also errors in the ground truth
        ○ 1 case in OKE2015
        Example: [One of the them] was an eminent scholar at Berkeley.
  • 16. NER Performance: entity surface & type
      ● many instances of “people” were missed
      ● only a few “places” were missed in 2015, and none in 2016
  • 17. Deeper look in Ground Truth: Places
      ● Some entities are concatenations of multiple entities of type place, e.g., City, Country
      ● 4/4 cases in OKE2015
        Example: Such a man did post-doctoral work at the Salk Institute in San Diego in the laboratory of Renato Dulbecco, then worked at the Basel Institute for Immunology in [Basel, Switzerland].
        However, the offsets given in the GT do not match the actual string
      ● Inconsistencies across datasets
        ○ In 2016, such entities were actually classified as two entities of type place
  • 18. NER Performance: entity surface & type
      ● many instances of “people” were missed
      ● only a few “places” were missed in 2015, and none in 2016
      ● many FP for entities of type “organization”
  • 19. Deeper look in Ground Truth: Organization
      ● Many entities of type “organization” are a combination of “organization” + “place”
        ○ NER tools tend to extract the entity at each granularity
        ○ GT does not allow for overlapping entities or multiple perspectives
          ■ 105/213 cases in OKE2015
          ■ 62/157 cases in OKE2016
        Example: Such a man did post-doctoral work at the Salk Institute in San Diego in the laboratory of Renato Dulbecco, then worked at the [[[Basel] [Institute]] for Immunology] in Basel, Switzerland.
  • 20. NER Performance: entity surface & type
      ● many instances of “people” were missed
      ● only a few “places” were missed in 2015, and none in 2016
      ● many FP for entities of type “organization”
      ● several FP for entities of type “people” and “role”
  • 21. Deeper look in Ground Truth: People & Role
      ● Multiple span variations for the same entity type “person”
        ○ 73/92 cases in OKE2015
        ○ 9/13 cases in OKE2016
        Example: The woman was awarded the Nobel Prize in Physics in 1963, which she shared with [[J.] [[Hans] D.] [Jensen]] and Eugene Wigner.
      ● Inconsistencies & ambiguous combinations of type “role” and “person”
        ○ Bishop Petronius → person
        ○ But, Queen Elizabeth II was not typed “person”
          ■ Queen → role
          ■ Elizabeth II → person
      ● Many combinations of “person” and “role”, especially when the “person” is an ethnic group (e.g., French author, Canadian citizen)
        ○ 9/92 cases in OKE2015
        ○ 2/13 cases in OKE2016
  • 22. Crowdsourcing for better Ground Truth
      ● Crowd-driven Ground Truth
      Case 1: Crowd reduces the number of FP
      ● For each entity that has multiple variations (span alternatives), we create an entity cluster
      Case 2: Crowd reduces the number of FN
      ● For each entity that was not extracted, we create a cluster with the partial overlaps but also every other combination of words contained in the overlap
      ● Goal:
        ○ identify all the valid expressions and their types
        ○ decrease the number of FP and the number of FN
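One way to picture the Case 2 clusters is to enumerate candidate spans for a missed entity. The sketch below is only an illustration: it assumes that "every other combination of words" means contiguous word sequences, which the slides do not specify, and `contiguous_subspans` is a hypothetical helper name.

```python
def contiguous_subspans(span):
    """List every contiguous word sequence inside a span.

    Assumption: candidates are contiguous; the slides do not say
    whether non-contiguous word combinations are also included.
    """
    words = span.split()
    return [
        " ".join(words[start:end])
        for start in range(len(words))
        for end in range(start + 1, len(words) + 1)
    ]


# Span taken from the organization example in the slides.
cluster = contiguous_subspans("Basel Institute for Immunology")
for candidate in cluster:
    print(candidate)
```

Crowd workers could then be asked which candidates in such a cluster are valid expressions and what type each one has, matching the goal stated above.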
  • 24. Results: CombinedNER vs. CombinedNER+Crowd
      CombinedNER+Crowd outperforms CombinedNER for each crowd-entity score
      crowd-entity score: likelihood of an entity to be a valid entity in the dataset (based on CrowdTruth)
  • 25. Conclusions
      ● difficult to find one NER tool that performs well
      ● combining the output of several NER tools results in disagreement
      But,
      ● using crowdsourcing to correct and improve their output results in a better outcome
      ● furthermore, the crowd can help us identify problems in the GT