2. Bibliography
● Ravi Sinha and Rada Mihalcea. Unsupervised Graph-based
Word Sense Disambiguation Using Measures of Word Semantic
Similarity. In Proceedings of the IEEE International Conference on
Semantic Computing (ICSC 2007), Irvine, CA, September 2007.
● Rada Mihalcea. Unsupervised Large-Vocabulary Word Sense
Disambiguation with Graph-based Algorithms for Sequence Data
Labeling. In Proceedings of the Joint Conference on Human
Language Technology / Empirical Methods in Natural Language
Processing (HLT/EMNLP), Vancouver, October 2005.
Tăbăranu Elena-Oana 2
3. Plan
1. Introduction
2. Graph-based Centrality for WSD
3. Measures of Semantic Similarity
4. Graph-based Centrality Algorithms
5. Demo
4. 1. Introduction
● WSD (word sense disambiguation) = automatically assign the
most appropriate meaning to a polysemous word within
a given context
● Example:
1. The plant is producing far too little to sustain its operation for more
than a year. (plant = factory)
2. An overabundance of oxygen was produced by the plant in the third
week of the study. (plant = living organism)
5. 2. Graph-based Centrality for WSD (I)
● GWSD = graph representation used to model word sense
dependencies in text (disambiguation over a graph of senses,
not just over a local word window)
● Goal: identify the most probable sense (label) for each word
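A minimal sketch of this goal, assuming the edge weights between candidate senses have already been computed by some similarity measure, and using weighted degree as a stand-in centrality score. The function name `pick_senses`, the `(word, sense)` vertex encoding, and the toy weights are illustrative assumptions, not taken from GWSD itself.

```python
from collections import defaultdict

def pick_senses(edges):
    """Pick, for each word, the sense with the highest weighted degree.

    edges maps a pair of vertices ((word, sense), (word, sense)) to a
    similarity weight. Each vertex is one candidate sense (label).
    """
    score = defaultdict(float)
    for (u, v), w in edges.items():
        # Undirected graph: the weight counts toward both endpoints.
        score[u] += w
        score[v] += w

    best = {}
    for (word, sense), s in score.items():
        if word not in best or s > best[word][1]:
            best[word] = (sense, s)
    return {word: sense for word, (sense, _) in best.items()}

# Toy weights, invented for illustration only.
toy_edges = {
    (("church", 2), ("bell", 1)): 0.8,
    (("church", 1), ("bell", 1)): 0.2,
    (("church", 2), ("bell", 3)): 0.3,
    (("bell", 1), ("ring", 3)): 0.9,
    (("bell", 3), ("ring", 1)): 0.4,
}
chosen = pick_senses(toy_edges)
print(chosen)
```

Any other centrality score could replace the weighted degree here; only the final "highest score per word" step would stay the same.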
7. Example
The church bells no longer rung on Sundays.
● church
1: one of the groups of Christians who have their own beliefs and forms of
worship
2: a place for public (especially Christian) worship
3: a service conducted in a church
● bell
1: a hollow device made of metal that makes a ringing sound when struck
2: a push button at an outer door that gives a ringing or buzzing signal when
pushed
3: the sound of a bell
● ring
1: make a ringing sound
2: ring or echo with sound
3: make (bells) ring, often for the purposes of musical edification
● Sunday
1: first day of the week; observed as a day of rest and worship
by most Christians
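The sense inventory above can be turned into the vertex and candidate-edge sets of a label graph. This is only a sketch: it assumes that every pair of words in the sentence is close enough to be linked (no maximum-distance window is applied), and the variable names are hypothetical.

```python
from itertools import combinations

# Number of candidate senses per word, as listed in the example above.
sense_counts = {"church": 3, "bell": 3, "ring": 3, "Sunday": 1}

# One vertex per (word, sense number) pair.
vertices = [
    (word, k)
    for word, n in sense_counts.items()
    for k in range(1, n + 1)
]

# Candidate edges connect senses of different words only; senses of the
# same word compete with each other and are never linked.
edges = [(u, v) for u, v in combinations(vertices, 2) if u[0] != v[0]]

print(len(vertices))  # 10
print(len(edges))     # 36
```

Each candidate edge would then receive a weight from a similarity measure, and edges with zero similarity could be dropped.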
8. 3. Measures of Semantic Similarity
● Quantify the degree to which two words are semantically related
using information drawn from semantic networks
● Word similarity measures
1. Leacock & Chodorow
2. Lesk
3. Wu & Palmer
4. Resnik
5. Lin
6. Jiang & Conrath
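As an illustration of one path-based measure from this list, here is a sketch of the Wu & Palmer score, 2 * depth(LCS) / (depth(a) + depth(b)), where LCS is the least common subsumer of the two concepts. The tiny is-a hierarchy is invented for the example and is not WordNet; depth is counted in nodes, so the root has depth 1 (an assumed convention).

```python
# Toy is-a hierarchy (child -> parent), invented for illustration.
PARENT = {
    "entity": None,
    "artifact": "entity",
    "building": "artifact",
    "church": "building",
    "factory": "building",
    "device": "artifact",
    "bell": "device",
}

def path_to_root(concept):
    """Return [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path

def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))

def lcs(a, b):
    """Least common subsumer: the deepest ancestor shared by a and b."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward, so the first hit is deepest
        if node in ancestors_of_a:
            return node
    raise ValueError("concepts share no ancestor")

def wup(a, b):
    """Wu & Palmer similarity on the toy hierarchy."""
    return 2 * depth(lcs(a, b)) / (depth(a) + depth(b))

print(wup("church", "factory"))  # 0.75
print(wup("church", "bell"))     # 0.5
```

Concepts that share a deep ancestor (church, factory) score higher than concepts that only meet near the root (church, bell).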
10. 5. Demo - GWSD
Dependencies:
● WordNet (semantic hierarchy)
● WordNet::QueryData
● WordNet::Similarity (implementation of similarity measures)
Input:
● Senseval-2 and Senseval-3 datasets in SemCor format
GWSD improvements:
● Combine similarity measures (jcn for nouns, lch for verbs,
lesk for other parts of speech)
● Voting system between 4 centrality algorithms
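The two improvements can be sketched as follows. This is a hedged illustration, not the demo's actual code: the part-of-speech tags `"n"` and `"v"`, the algorithm names in the usage example, and the policy of returning `None` on a tie are all assumptions made for the sketch.

```python
from collections import Counter

def measure_for(pos):
    """Pick a similarity measure name by part of speech.

    jcn for nouns, lch for verbs, lesk for everything else.
    The tags "n" and "v" are an assumed encoding.
    """
    return {"n": "jcn", "v": "lch"}.get(pos, "lesk")

def vote(predictions):
    """Majority vote over the senses proposed by several algorithms.

    predictions maps an algorithm name to the sense it chose for one word.
    Returns the winning sense, or None when there is no strict winner
    (tie handling is an assumption of this sketch).
    """
    counts = Counter(predictions.values()).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

# Placeholder algorithm names and outputs, for illustration only.
proposed = {"indegree": 2, "closeness": 2, "betweenness": 1, "pagerank": 2}
print(measure_for("n"), measure_for("v"), measure_for("a"))
print(vote(proposed))
```

A real system would need an explicit tie-breaking rule, for example falling back to the sense chosen by one preferred algorithm.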