The document discusses methods for measuring similarity between concepts and contexts. It describes approaches that measure conceptual similarity using structured knowledge bases like WordNet and contextual similarity using co-occurrence information from large corpora. Word sense disambiguation can be performed by finding the sense of a word most related to its neighbors based on these similarity measures. The document also discusses limitations and opportunities for improving current approaches.
Measuring Similarity Between Contexts and Concepts
1. Measuring Similarity Between Concepts and Contexts. Ted Pedersen, Department of Computer Science, University of Minnesota, Duluth. http://www.d.umn.edu/~tpederse
9. [Figure: fragment of a WordNet concept hierarchy, from Jiang and Conrath [1997]. Node labels: object, artifact, instrumentality, conveyance, vehicle, motor-vehicle, car, watercraft, boat, ark, article, ware, table-ware, cutlery, fork.]
13. Observed “car”: the count goes up by one for car (73 + 1), motor vehicle (327 + 1), and *root* (32783 + 1). Unchanged: minicab (6), cab (23), bus (17), stock car (12).
14. Observed “stock car”: the count goes up by one for stock car (12 + 1), car (74 + 1), motor vehicle (328 + 1), and *root* (32784 + 1). Unchanged: minicab (6), cab (23), bus (17).
15. After counting concepts: *root* (32785); motor vehicle (329), IC = 1.998; car (75); cab (23); bus (17); stock car (13), IC = 3.402; minicab (6). Both IC values follow from IC(c) = -log10(count(c) / count(*root*)).
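The counting on slides 13 to 15 can be replayed in a few lines. This is a minimal sketch, not the author's implementation: the `PARENT` map, the function names, and the direct link from motor vehicle to *root* are assumptions made for illustration (the real hierarchy has intermediate concepts). It assumes that each observation adds one to the concept and to every ancestor, and that IC is the negative base-10 log of a concept's count divided by the root count, which reproduces the 1.998 shown for motor vehicle.

```python
import math

# Hypothetical child -> parent links. Only the car / stock car paths are
# modelled, and "motor vehicle" is attached directly to the root for brevity.
PARENT = {
    "stock car": "car",
    "car": "motor vehicle",
    "motor vehicle": "*root*",
}

def observe(counts, concept):
    """Add one occurrence to the concept and to each ancestor up to the root."""
    node = concept
    while node is not None:
        counts[node] += 1
        node = PARENT.get(node)  # None once the root has been counted

def information_content(counts, concept, root="*root*"):
    """IC = -log10(count(concept) / count(root)); base 10 is an assumption."""
    return -math.log10(counts[concept] / counts[root])

# Starting counts as shown before "car" is observed.
counts = {"*root*": 32783, "motor vehicle": 327, "car": 73, "stock car": 12}

observe(counts, "car")
observe(counts, "stock car")

ic_motor_vehicle = information_content(counts, "motor vehicle")
ic_stock_car = information_content(counts, "stock car")
print(counts)
print(round(ic_motor_vehicle, 3), round(ic_stock_car, 3))
```

With these counts the formula gives about 1.998 for motor vehicle (329 of 32785) and about 3.402 for stock car (13 of 32785): the rarer, more specific concept carries more information.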
43. Name Conflated Data

Name           Count     Name                 Count     New       Total     Maj.
Ronaldo        1,652     David Beckham        740       RoBeck    2,452     69.3%
Tajik          3,002     Rolf Ekeus           1,071     JikRol    4,073     73.7%
Microsoft      3,401     IBM                  2,406     MSIIBM    5,807     58.6%
Shimon Peres   7,846     Slobodan Milosovic   6,176     MonSlo    13,734    56.0%
Jordan         25,539    Egyptian             21,762    JorGypt   46,431    53.9%
Japan          118,712   France               112,357   JapAnce   231,069   51.4%