This document provides an overview of Martin Thorsen Ranang's trial lecture on statistics-based approaches to lexical semantics. The lecture covers the definition of lexical semantics, applications to natural language processing, and Ranang's PhD research mapping words to WordNet concepts. It then discusses key statistics-based approaches like word sense disambiguation, vector space models, dimensionality reduction, and ontology merging, providing examples of each.
1. Statistics-based Approaches to Lexical
Semantics
Martin Thorsen Ranang
Department of Computer and Information Science (IDI)
Trial Lecture, February 5th 2010
www.ntnu.no Martin Thorsen Ranang, Statistics-based Approaches to Lexical Semantics
2. 2
Outline
Introduction
What is Lexical Semantics?
Natural Language Processing (NLP) Applications
My PhD Research
Statistics-based Approaches to Lexical Semantics
Word Sense Disambiguation (WSD)
Vector Space Model (VSM)
Dimensionality Reduction
Ontology Merging and Alignment
Summary
4. 4
Lexical Semantics
— “The study of how and what the words of a language
denote.” (Pustejovsky, 1998)
— lexical semantic relations like: synonymy, antonymy (“close vs.
distant”), hypo-/hypernymy (“car vs. vehicle”)
— polysemy (lexical ambiguity)
— selectional restrictions: “Joe ate <. . . > in a hurry.”
— Typical resources:
• Dictionaries, Machine Readable Dictionaries (MRDs) (Wilks
et al., 1996)
• Ontologies and Semantic Networks
5. 5
The Distributional Hypothesis
— “You shall know a word by the company it keeps.” (Firth, 1957)
— “There is a positive relationship between the degree of
synonymy (semantic similarity) existing between a pair of
words and the degree to which their contexts are
similar.” (Rubenstein and Goodenough, 1965)
— “The meaning of entities, and the meaning of grammatical
relations among them, is related to the restriction of
combinations of these entities relative to other
entities.” (Harris, 1968)
7. 7
Example Areas
— Word Sense Disambiguation (WSD)
— Natural Language Understanding (NLU) and Text
Interpretation (TI)
— Machine Translation (MT)
— Information Retrieval (IR)
What parts of Natural Language Processing (NLP) are not
affected by Lexical Semantics?
9. 9
My PhD Research
— Developed a method for automatically mapping words from
languages other than English to concepts in the Princeton
WordNet (Miller et al., 1990; Fellbaum, 1998)
10. 10
WordNet Example
11. 11
Why Statistics-based?
— Frequencies of actual language usage
— Adapts to changes of the above
— Well suited to provide generalizations and to summarize
features of huge text corpora.
(Manning and Schütze, 1999)
13. 13
Word Sense Disambiguation (WSD)
[Figure: the word “Bass” linked to three senses: Morone saxatilis, tones of low frequency, and a Marchione bass guitar]
14. 14
Usage Context
— “He fished for bass using scented attractants.”
— “Joe played the bass fluently, while George played the piano.”
— “When the neighbors play their music I can’t hear the tune but
can hear the bass tones.”
15. 15
Word Sense Disambiguation (WSD)
— Two main approaches:
• Integrated approach: postponed until semantic analysis;
elimination of ill-formed semantic representations
• Stand-alone approach: independent of, and prior to,
compositional semantic analysis; more often
statistics-based
16. 16
Statistics-based Stand-alone
Approaches I
Supervised learning
Training: sense-tagged corpus; naïve Bayesian
classifiers; feature vectors; “sliding
window”
Feature vectors represent local context,
and may include words and POS.
Application: Use the trained classifier on unseen
ambiguous words, given a local-context
feature vector
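The training and application steps above can be sketched in a few lines. This is a minimal illustration only, assuming a tiny invented sense-tagged corpus, bag-of-words features from a sliding window of two words on each side, and add-one smoothing; the function names and the data are hypothetical and not taken from the lecture.

```python
import math
from collections import Counter, defaultdict

def window_features(tokens, index, size=2):
    """Words within `size` positions of tokens[index] (the sliding window)."""
    return tokens[max(0, index - size):index] + tokens[index + 1:index + 1 + size]

def train(tagged_examples, size=2):
    """Count senses and context words from (tokens, target_index, sense) triples."""
    sense_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, index, sense in tagged_examples:
        sense_counts[sense] += 1
        for word in window_features(tokens, index, size):
            feature_counts[sense][word] += 1
            vocabulary.add(word)
    return sense_counts, feature_counts, vocabulary

def classify(model, tokens, index, size=2):
    """Pick the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, feature_counts, vocabulary = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        # Add-one (Laplace) smoothing so unseen context words do not zero out a sense.
        denominator = sum(feature_counts[sense].values()) + len(vocabulary)
        for word in window_features(tokens, index, size):
            score += math.log((feature_counts[sense][word] + 1) / denominator)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Invented toy corpus; the ambiguous word "bass" is at index 3 in every sentence.
TRAINING = [
    ("he fished for bass in the lake".split(), 3, "fish"),
    ("they caught a bass in the river".split(), 3, "fish"),
    ("joe played the bass in a band".split(), 3, "music"),
    ("she tuned her bass guitar before the show".split(), 3, "music"),
]

model = train(TRAINING)
predicted = classify(model, "george played the bass loudly".split(), 3)
```

A real system would add POS tags and other features to the vector and train on a far larger corpus; the sketch only shows how local context drives the decision.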
17. 17
Statistics-based Stand-alone
Approaches II
Bootstrapping
small number of training instances used as seeds;
classifier trained through supervised learning
Unsupervised disambiguation
sense-discrimination, not sense tagging; groups of
similar words, based on their local-context
Dictionary-based approach
Count overlap between sliding window and dictionary
definition of candidate senses.
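The dictionary-based approach can be illustrated with a small overlap counter in the spirit of the simplified Lesk algorithm. The glosses below are invented for illustration and do not come from any real dictionary; the stopword list, window size, and function names are likewise assumptions.

```python
STOPWORDS = {"a", "an", "the", "of", "for", "in", "on", "or", "and", "to", "with", "is", "by"}

def content_words(text):
    """Lowercased words with surrounding punctuation stripped, minus stopwords."""
    words = {w.strip(".,;:!?\"'").lower() for w in text.split()}
    return words - STOPWORDS - {""}

def overlap_disambiguate(tokens, index, glosses, size=4):
    """Score each sense by the word overlap between its gloss and the context window."""
    window = tokens[max(0, index - size):index] + tokens[index + 1:index + 1 + size]
    context = content_words(" ".join(window))
    scores = {sense: len(context & content_words(gloss))
              for sense, gloss in glosses.items()}
    # Ties are broken by the order in which senses are listed.
    return max(scores, key=scores.get), scores

# Hypothetical glosses for "bass", written for this example only.
BASS_GLOSSES = {
    "fish": "an edible fish caught in lakes, rivers or the sea",
    "instrument": "a guitar or other instrument played to produce low notes",
    "sound": "the low tones or frequency range of music",
}

tokens = "Joe played the bass fluently, while George played the piano.".split()
sense, scores = overlap_disambiguate(tokens, 3, BASS_GLOSSES)
```

Without stemming, "fished" would not match "fish", which hints at why plain overlap counts tend to be sparse in practice.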
19. 19
Vector Space Model (Salton, 1971)
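Term Frequency (importance of term i to doc j):
\[ \mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}} \]
Inverse Document Frequency (common words are less descriptive):
\[ \mathrm{idf}_i = \log \frac{|D|}{|\{d : t_i \in d\}|} \]
Vector elements:
\[ w_{i,j} = \mathrm{tf}_{i,j} \cdot \mathrm{idf}_i \]
Weight vector for doc d:
\[ v_d = [w_{1,d}, w_{2,d}, \ldots, w_{N,d}]^T \]
The vectors v_1, v_2, ..., v_d are the columns of the weight matrix:
\[
\begin{bmatrix} v_1 & v_2 & \cdots & v_d \end{bmatrix}
=
\begin{bmatrix}
w_{1,1} & w_{1,2} & \cdots & w_{1,d} \\
w_{2,1} & w_{2,2} & \cdots & w_{2,d} \\
\vdots  & \vdots  & \ddots & \vdots  \\
w_{N,1} & w_{N,2} & \cdots & w_{N,d}
\end{bmatrix}
\]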
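As a rough sketch of the weighting scheme above, the following computes tf, idf, and the weights w_{i,j} for a toy collection. The three documents and the function name are invented for illustration, and the natural logarithm is assumed for idf.

```python
import math
from collections import Counter

def tf_idf_matrix(documents):
    """Return (terms, weights) with weights[i][j] = tf_{i,j} * idf_i.

    documents: list of non-empty token lists. Rows are terms, columns are documents.
    """
    counts = [Counter(doc) for doc in documents]
    terms = sorted(set().union(*documents))
    n_docs = len(documents)
    weights = []
    for term in terms:
        doc_freq = sum(1 for c in counts if term in c)   # |{d : t_i in d}|
        idf = math.log(n_docs / doc_freq)                # log(|D| / doc_freq)
        # tf_{i,j} = n_{i,j} / sum_k n_{k,j}
        weights.append([c[term] / sum(c.values()) * idf for c in counts])
    return terms, weights

# Invented toy collection.
DOCS = [
    "the astronaut boarded the rocket".split(),
    "the cosmonaut boarded the rocket".split(),
    "the band played".split(),
]

terms, weights = tf_idf_matrix(DOCS)
```

Here "the" occurs in every document, so its idf is log(1) = 0 and it contributes nothing to any document vector, which is the sense in which common words are less descriptive.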
20. 20
Vector Space Model
[Figure: vector space with axes labeled Astronaut, Rocket, and Cosmonaut]
— Enables comparison with other documents, based on content.
— Does it really describe a document’s meaning?
— Restrictions?
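Document comparison in the vector space model is commonly done with cosine similarity. The sketch below uses hand-made three-dimensional vectors over the axes Astronaut, Cosmonaut, and Rocket (the values are invented) and hints at one frequently cited restriction: documents that use different synonyms share no dimension.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Axes: [astronaut, cosmonaut, rocket]; weights are illustrative only.
doc_a = [1.0, 0.0, 1.0]   # mentions "astronaut" and "rocket"
doc_b = [0.0, 1.0, 1.0]   # mentions "cosmonaut" and "rocket"
doc_c = [1.0, 0.0, 0.0]   # mentions only "astronaut"
doc_d = [0.0, 1.0, 0.0]   # mentions only "cosmonaut"

shared_term_similarity = cosine(doc_a, doc_b)   # similar only through "rocket"
synonym_similarity = cosine(doc_c, doc_d)       # no shared term at all
```

The last pair comes out as completely dissimilar even though the two terms are near-synonyms, since each term is its own orthogonal axis.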
21. 21
Semantic Augmentation of the Vector
Space Model
Several attempts to improve document retrieval efficiency by
incorporating lexical semantic information:
— Voorhees (1994, 1998)
— Moldovan and Mihalcea (2000)
— Buscaldi et al. (2005)
No, or small, improvements to IR; some improvement for document
classification.
23. 23
Latent Semantic Analysis (LSA) /
Indexing (LSI)
— Discrete entities are mapped onto a continuous vector space;
— the mapping is determined by global correlation patterns; and
— dimensionality reduction is an integral part of the process.
(Landauer and Dumais, 1997; Ando, 2000; Bellegarda, 2007)
24. 24
Dimensionality Reduction
— Singular Value Decomposition
[Figure: after reduction, a combined axis {0.65 Cosmonaut, 0.35 Astronaut} plotted against the Rocket axis]
Quantitative evaluation of different semantic word space models:
Van de Cruys (2010)
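The effect of dimensionality reduction on synonyms can be sketched on a toy term-document matrix. To stay dependency-free, this example finds the leading left singular vectors by power iteration on A·Aᵀ, as a stand-in for a library SVD routine; the matrix, the choice k = 2, and all function names are assumptions made for illustration.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def mat_vec(matrix, vector):
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def top_eigenvectors(symmetric, k, iterations=300):
    """Leading k eigenvectors of a symmetric PSD matrix: power iteration plus deflation."""
    size = len(symmetric)
    work = [row[:] for row in symmetric]
    basis = []
    for _component in range(k):
        vec = normalize([1.0] * size)
        for _step in range(iterations):
            vec = normalize(mat_vec(work, vec))
        eigenvalue = sum(a * b for a, b in zip(vec, mat_vec(work, vec)))
        basis.append(vec)
        # Deflate: remove the component just found so the next one can emerge.
        for i in range(size):
            for j in range(size):
                work[i][j] -= eigenvalue * vec[i] * vec[j]
    return basis

def reduce_documents(term_doc, k):
    """Project each document (column) onto the top-k left singular vectors."""
    n_terms, n_docs = len(term_doc), len(term_doc[0])
    gram = [[sum(term_doc[i][d] * term_doc[j][d] for d in range(n_docs))
             for j in range(n_terms)] for i in range(n_terms)]   # A * A^T
    basis = top_eigenvectors(gram, k)
    return [[sum(axis[t] * term_doc[t][d] for t in range(n_terms)) for axis in basis]
            for d in range(n_docs)]

# Rows: astronaut, cosmonaut, rocket, guitar, bass. Columns: six invented documents.
TERM_DOC = [
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 1],
]

raw_docs = [[row[d] for row in TERM_DOC] for d in range(6)]
reduced_docs = reduce_documents(TERM_DOC, k=2)

raw_similarity = cosine(raw_docs[2], raw_docs[3])              # "astronaut" vs "cosmonaut"
reduced_similarity = cosine(reduced_docs[2], reduced_docs[3])  # same pair after reduction
```

In the raw space the astronaut-only and cosmonaut-only documents have similarity 0; in the two-dimensional space they end up pointing in nearly the same direction, because both terms co-occur with "rocket", while the bass-only document stays apart.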
26. 26
Ontology Matching
— Lacher and Groh (2001) used signature tf-idf vectors for
computing similarity between two ontology nodes.
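The general idea of comparing ontology nodes through tf-idf signature vectors can be sketched as follows. This is not Lacher and Groh's actual procedure: the two tiny ontologies, the text attached to each node, the pooling of all nodes for the idf computation, and the best-match alignment rule are all assumptions made for illustration.

```python
import math
from collections import Counter

def signatures(node_texts):
    """Map each node to a sparse tf-idf vector, treating every node as one document."""
    n_nodes = len(node_texts)
    doc_freq = Counter(term for tokens in node_texts.values() for term in set(tokens))
    result = {}
    for node, tokens in node_texts.items():
        counts = Counter(tokens)
        total = sum(counts.values())
        result[node] = {term: (count / total) * math.log(n_nodes / doc_freq[term])
                        for term, count in counts.items()}
    return result

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def align(onto_a, onto_b):
    """For each node in onto_a, pick the node in onto_b with the most similar signature."""
    sig = signatures({**onto_a, **onto_b})
    return {a: max(onto_b, key=lambda b: cosine(sig[a], sig[b])) for a in onto_a}

# Hypothetical ontologies: node name -> tokens of the text associated with the node.
ONTO_A = {
    "A:Spaceflight": "rocket launch orbit astronaut rocket".split(),
    "A:Music": "guitar bass concert band".split(),
}
ONTO_B = {
    "B:Astronautics": "rocket orbit cosmonaut launch".split(),
    "B:Concerts": "band concert stage guitar".split(),
}

mapping = align(ONTO_A, ONTO_B)
```

Node names are prefixed so that the two ontologies can be pooled into one dictionary without key collisions.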
27. 27
Summary
— Lexical semantics
— How this relates to my PhD research
— Examples of statistics-based approaches to Lexical
Semantics, including:
• different Word Sense Disambiguation techniques
• semantic augmentation of the vector space model
• how LSA/dimensionality reduction of vector spaces handles
synonymy
• how statistics-based similarity measures are used to align and
merge ontologies
28. 28
References I
Ando, Rie Kubota. 2000. Latent semantic space: Iterative scaling
improves precision of inter-document similarity measurement. In
SIGIR’00.
Bellegarda, Jerome R. 2007. Latent Semantic Mapping: Principles
& Applications, vol. 3 of Synthesis Lectures on Speech and
Audio Processing. Morgan & Claypool Publishers.
Buscaldi, D., P. Rosso, and E.S. Arnal. 2005. A WordNet-based
query expansion method for geographical information retrieval.
In Working Notes for the CLEF Workshop.
29. 29
References II
Van de Cruys, Tim. 2010. A quantitative evaluation of semantic
word space models. In Computational Linguistics In The
Netherlands (CLIN) 20. Utrecht, Netherlands.
Fellbaum, Christiane, ed. 1998. WordNet: An electronic lexical
database. Language, Speech, and Communication, Cambridge,
Massachusetts, USA: The MIT Press.
Firth, John Rupert. 1957. Papers in linguistics 1934–1951. Oxford,
UK: Oxford University Press.
Harris, Zellig Sabbettai. 1968. Mathematical structures of
language. Krieger Publishing Company.
30. 30
References III
Lacher, Martin S., and Georg Groh. 2001. Facilitating the
exchange of explicit knowledge through ontology mappings. In
Proceedings of the Fourteenth International Florida Artificial
Intelligence Research Society Conference, 305–309. AAAI Press.
Landauer, Thomas K., and Susan T. Dumais. 1997. A solution to
Plato’s problem: The latent semantic analysis theory of
acquisition, induction and representation of knowledge.
Psychological Review 104(2):211–240.
Manning, Christopher D., and Hinrich Schütze. 1999. Foundations
of statistical natural language processing. Cambridge,
Massachusetts, USA: The MIT Press.
31. 31
References IV
Miller, George A., Richard Beckwith, Christiane Fellbaum, Derek
Gross, and Katherine J. Miller. 1990. Introduction to WordNet:
an on-line lexical database. International Journal of
Lexicography 3(4):235–244. (Revised August 1993).
Moldovan, Dan I., and Rada Mihalcea. 2000. Using WordNet and
lexical operators to improve Internet searches. Internet
Computing, IEEE 4:34–43.
Pustejovsky, James. 1998. The generative lexicon. Cambridge,
Massachusetts, USA: The MIT Press.
Rubenstein, Herbert, and John B. Goodenough. 1965. Contextual
correlates of synonymy. Commun. ACM 8(10):627–633.
32. 32
References V
Salton, Gerard, ed. 1971. The SMART retrieval system: Experiments
in automatic document processing. Englewood Cliffs, NJ:
Prentice-Hall.
Voorhees, Ellen M. 1994. Query expansion using lexical-semantic
relations. In SIGIR’94: Proceedings of the 17th Annual
International ACM SIGIR Conference on Research and
Development in Information Retrieval, 61–69.
———. 1998. Using WordNet for text retrieval. In Fellbaum (1998),
chap. 12, 285–304.
Wilks, Yorick, Louise Guthrie, and Brian M. Slator. 1996. Electric
words: Dictionaries, computers, and meanings. Cambridge,
Massachusetts, USA: The MIT Press.