Error analysis of Word Sense Disambiguation
Ruben Izquierdo
Marten Postma
Piek Vossen
Izquierdo, Postma and Vossen
VU Amsterdam
Motivation
• Word Sense Disambiguation is still an unsolved problem
Error Analysis
• Perform error analysis on previous WSD evaluations to prove our hypothesis
  • Senseval-2: all-words task
  • Senseval-3: all-words task
  • Semeval2007: all-words task (#17)
  • Semeval2010: all-words on specific domain (#17)
  • Semeval2013: multilingual all-words WSD and entity linking (#12)
Motivation
• Some “propagated” errors
  • Errors on monosemous words
  • Errors caused by PoS tags
  • Multiwords and phrasal verbs
• Little attention has been paid to the real problem
  • WSD is not 1 problem but N problems
• Our hypothesis
  • Context is not modeled properly in general
  • Systems rely too much on the most frequent sense
Monosemous errors
Competition           Monosemous     Wrong   Examples
Senseval2             499 (20.9%)    37.5%   gene.n (suppressor_gene.n), chance.a (chance.n), next.r (next.a)
Senseval3             334 (16.6%)    44.1%   Datum.n (data.n), making.n (make.v), out_of_sight (sight)
Semeval2007           25 (5.5%)      11.1%   get_stuck.v, lack.v, write_about.v
Semeval2010           31 (2.2%)      97.9%   Tidal_zone.n, pine_marten.n, roe_deer.n, cordgrass.n
Semeval2013 (lemmas)  348 (21.1%)    1.9%    Private_enterprise, developing_country, narrow_margin
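One way to obtain a figure like the "Wrong" column is to count, over all participating systems, how often an answer on a monosemous instance differs from the gold sense. The slides do not define the column precisely, so this is only a minimal sketch under that reading. The function name, the dictionary layouts, and the sense labels are hypothetical and do not come from the released corpora.

```python
from typing import Dict, List, Tuple


def monosemous_error_rate(
    gold: Dict[str, Tuple[str, str]],
    n_senses: Dict[str, int],
    system_answers: List[Dict[str, str]],
) -> float:
    """Share of system answers on monosemous instances that are wrong.

    gold           -- instance id -> (lemma.pos, gold sense)   (assumed layout)
    n_senses       -- lemma.pos -> number of senses in the inventory
    system_answers -- one dict per system: instance id -> answered sense
    """
    # Instances whose lemma has exactly one sense in the inventory.
    monosemous = {iid for iid, (lemma, _) in gold.items()
                  if n_senses.get(lemma) == 1}
    answered = wrong = 0
    for answers in system_answers:
        for iid in monosemous:
            if iid in answers:  # skip instances the system left unanswered
                answered += 1
                if answers[iid] != gold[iid][1]:
                    wrong += 1
    return wrong / answered if answered else 0.0
```

Unanswered instances are skipped here rather than counted as errors; counting them as wrong would be an equally defensible choice and would raise the rate.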
Most Frequent Sense
• When the correct sense is NOT the most frequent sense (MFS)
  • Systems still mostly assign the MFS
  • Senseval2
    • 799 tokens are not MFS
    • 84% systems still assign the MFS
• Most “failed” words are due to MFS bias
  • Senseval2, Senseval3
    • Say.v, find.v, take.v, have.v, cell.n, church.n
  • Semeval2010
    • Area.n, nature.n, connection.n, water.n, population.n
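The MFS bias described above can be measured by restricting the evaluation to tokens whose gold sense is not the MFS and counting how often systems answer with the MFS anyway. The sketch below assumes that reading. The function name, the input layouts, and the sense labels are hypothetical, and the MFS table is assumed to be given (for example, derived from a sense-tagged training corpus).

```python
from typing import Dict, List, Tuple


def mfs_bias(
    gold: Dict[str, Tuple[str, str]],
    mfs: Dict[str, str],
    system_answers: List[Dict[str, str]],
) -> float:
    """Share of answers equal to the MFS on tokens whose gold sense is not the MFS.

    gold           -- instance id -> (lemma.pos, gold sense)   (assumed layout)
    mfs            -- lemma.pos -> most frequent sense
    system_answers -- one dict per system: instance id -> answered sense
    """
    # Keep only tokens with a known MFS that differs from the gold sense.
    non_mfs = [iid for iid, (lemma, sense) in gold.items()
               if lemma in mfs and mfs[lemma] != sense]
    total = assigned_mfs = 0
    for answers in system_answers:
        for iid in non_mfs:
            if iid in answers:
                total += 1
                if answers[iid] == mfs[gold[iid][0]]:
                    assigned_mfs += 1
    return assigned_mfs / total if total else 0.0
```

A value close to 1.0 would indicate that systems fall back on the MFS even when the context calls for another sense.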
Analysis per PoS-tag
Analysis per polysemy class
[Figure: results per polysemy class; labels: 2 senses, 6, 15; Low, Medium, High]
Analysis per frequency class
Most difficult words
Expected vs. Observed
difficulties
• Calculate per sentence
  • The “expected” difficulty
    • Average polysemy, sentence length, average word length
  • The “observed” difficulty
    • From the real participant outputs, average error rate
• We should expect:
  • harder sentences → higher error rate
  • easier sentences → lower error rate
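The comparison above can be sketched as follows. The slides name the three features but not how they are combined, so averaging their z-scores is an assumption made here for illustration. The function names and input layouts are hypothetical as well, and unknown lemmas are treated as having one sense.

```python
from statistics import mean, pstdev
from typing import Dict, List


def zscores(values: List[float]) -> List[float]:
    """Standardize values; a constant column maps to zeros."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


def expected_difficulty(sentences: List[List[str]],
                        n_senses: Dict[str, int]) -> List[float]:
    """Per-sentence score from average polysemy, length, and average word length.

    Assumption: the three features are z-scored and averaged with equal weight.
    """
    avg_polysemy = [mean(n_senses.get(tok, 1) for tok in s) for s in sentences]
    length = [len(s) for s in sentences]
    avg_word_len = [mean(len(tok) for tok in s) for s in sentences]
    columns = [zscores(avg_polysemy), zscores(length), zscores(avg_word_len)]
    return [mean(col[i] for col in columns) for i in range(len(sentences))]


def observed_difficulty(error_rates: List[List[float]]) -> List[float]:
    """Average per-sentence error rate over systems (one inner list per system)."""
    return [mean(per_sentence) for per_sentence in zip(*error_rates)]


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation between two equally long lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

If context were exploited well, one would expect a clearly positive correlation between the two lists; a weak correlation is in line with the hypothesis that difficulty is driven by the word rather than by its context.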
Expected vs. Observed
difficulties
• The context is probably not exploited properly
• Expected “easy” sentences SHOULD show low error rates
• Occurrences of the same word in different contexts have similar error rates
• The difficulty of a word depends more on its polysemy than on the context where it appears
WSD Corpora
http://github.com/rubenIzquierdo/wsd_corpora
System Outputs
https://github.com/rubenIzquierdo/sval_systems
Error analysis of Word Sense Disambiguation
Ruben Izquierdo
Marten Postma
Piek Vossen
ruben.izquierdobevia@vu.nl
http://github.com/rubenIzquierdo/wsd_corpora
http://github.com/rubenIzquierdo/sval_systems
Editor's Notes

  1. Relative frequency (Norvig method): < 0.01 low; 0.01 to 0.05 medium; > 0.05 high
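The thresholds in this note map directly onto a small classifier. This is only a sketch: the relative frequency is assumed to be computed beforehand, the function name is hypothetical, and assigning both boundary values (0.01 and 0.05) to the medium class is an assumption, since the note does not say which class they belong to.

```python
def frequency_class(relative_frequency: float) -> str:
    """Map a relative frequency to the low / medium / high class from the note."""
    if relative_frequency < 0.01:
        return "low"
    if relative_frequency <= 0.05:  # boundaries assumed to fall in "medium"
        return "medium"
    return "high"
```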