Measuring Similarity Between  Concepts and Contexts Ted Pedersen  Department of Computer Science University of Minnesota, Duluth http://www.d.umn.edu/~tpederse
The problems…
Similarity and Relatedness
The approaches…
Why measure conceptual similarity?
Word Sense Disambiguation
SenseRelate
WordNet::Similarity
[Figure: fragment of the WordNet is-a hierarchy, from Jiang and Conrath [1997]. Nodes shown: object, artifact, instrumentality, conveyance, vehicle, motor-vehicle, car, watercraft, boat, ark, article, ware, table-ware, cutlery, fork]
Path Finding
[Figure: the same WordNet fragment, repeated. Nodes shown: object, artifact, instrumentality, conveyance, vehicle, motor-vehicle, car, watercraft, boat, ark, article, ware, table-ware, cutlery, fork]
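A minimal sketch of path finding over this fragment. The is-a edges below are an assumption: they follow the usual reading of the Jiang and Conrath figure, since the flattened slide text lists only the node names. The score used here, the reciprocal of the number of nodes on the shortest path, is one common path-based measure and may not be the exact variant discussed on the slide.

```python
# Assumed is-a edges (child -> parent); the slide text only lists node names.
IS_A = {
    "artifact": "object",
    "instrumentality": "artifact",
    "conveyance": "instrumentality",
    "vehicle": "conveyance",
    "motor-vehicle": "vehicle",
    "car": "motor-vehicle",
    "watercraft": "vehicle",
    "boat": "watercraft",
    "ark": "boat",
    "article": "artifact",
    "ware": "article",
    "table-ware": "ware",
    "cutlery": "table-ware",
    "fork": "cutlery",
}


def ancestors(concept):
    """Return the concept followed by its chain of ancestors up to the root."""
    chain = [concept]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def path_nodes(a, b):
    """Count the nodes on the shortest is-a path between a and b (inclusive)."""
    up_a, up_b = ancestors(a), ancestors(b)
    depth_in_b = {node: i for i, node in enumerate(up_b)}
    for i, node in enumerate(up_a):
        if node in depth_in_b:  # first shared ancestor is the lowest one
            return i + depth_in_b[node] + 1
    raise ValueError("no common ancestor")


def path_similarity(a, b):
    """Reciprocal of the path length in nodes; 1.0 for identical concepts."""
    return 1.0 / path_nodes(a, b)


print(path_similarity("car", "boat"))  # path runs through "vehicle"
print(path_similarity("car", "fork"))  # path runs through "artifact"
```

Under these assumed edges, car and boat meet at vehicle while car and fork meet only at artifact, so the first pair scores higher.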
Information Content
Observed “car”...: *root* (32783 + 1), motor vehicle (327 + 1), car (73 + 1), cab (23), minicab (6), stock car (12), bus (17)
Observed “stock car”...: *root* (32784 + 1), motor vehicle (328 + 1), car (74 + 1), cab (23), minicab (6), stock car (12 + 1), bus (17)
After Counting Concepts...: *root* (32785), motor vehicle (329, IC = 1.998), car (75), cab (23), minicab (6), stock car (13, IC = 3.042), bus (17)
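The counting shown on these three slides can be sketched as follows. The parent chain (stock car, car, motor vehicle, *root*) is inferred from which counts are incremented on the slides; the other concepts are left out because their position in the hierarchy is not recoverable from the text. The base-10 logarithm is an assumption, chosen because it reproduces the motor vehicle value of 1.998. For stock car the same formula gives about 3.402 rather than the 3.042 shown, so the slide value may contain a digit transposition.

```python
import math

# Assumed parent links, inferred from which counts the slides increment.
PARENT = {"stock car": "car", "car": "motor vehicle", "motor vehicle": "*root*"}

# Counts before the two observations, as given on the first slide.
counts = {"*root*": 32783, "motor vehicle": 327, "car": 73, "stock car": 12}


def observe(concept):
    """Add one to the concept and to every ancestor up to the root."""
    node = concept
    while node is not None:
        counts[node] += 1
        node = PARENT.get(node)


def information_content(concept):
    """IC = -log10 of the concept's relative frequency (log base is assumed)."""
    return -math.log10(counts[concept] / counts["*root*"])


observe("car")
observe("stock car")

print(counts)
print(round(information_content("motor vehicle"), 3))
print(round(information_content("stock car"), 3))
```

The more specific concept has the smaller count and therefore the higher information content.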
Similarity and Information Content
Why doesn’t this solve the problem?
Using Dictionary Glosses to Measure Relatedness
Context/Gloss Vectors
Gloss/Context Vectors
Experiment
Results
Why this doesn’t solve the problem…
Knowledge Lean Methods
Word Sense Discrimination
Name Discrimination
Objective
Similarity of Context?
Feature Selection
Second Order Context Representation
2nd Order Context Vectors
             baseball   football   actor    movie      war      family     needle
won           18.5533   3324.98    30.520    51.7812   8.7399       0         0
Oscar          0           0       29.576   136.0441   0            0         0
guy          134.5102    205.5469   0         0        0        18818.55      0
O2 context    51.021    1176.84    20.032    62.6084   2.9133    6272.85      0
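The numbers on this slide are consistent with the second-order (O2) context vector being the element-wise average of the vectors for the words in the context ("won", "Oscar", "guy"). A minimal sketch of that averaging follows; the pairing of columns with feature names is an assumed reading of the flattened table, and the function name is hypothetical.

```python
# Column order is an assumed reading of the flattened slide table.
FEATURES = ["baseball", "football", "actor", "movie", "war", "family", "needle"]

# First-order vectors for the words that occur in the context.
WORD_VECTORS = {
    "won":   [18.5533, 3324.98, 30.520, 51.7812, 8.7399, 0.0, 0.0],
    "Oscar": [0.0, 0.0, 29.576, 136.0441, 0.0, 0.0, 0.0],
    "guy":   [134.5102, 205.5469, 0.0, 0.0, 0.0, 18818.55, 0.0],
}


def second_order_vector(vectors):
    """Average a collection of equal-length word vectors column by column."""
    rows = list(vectors)
    return [sum(column) / len(rows) for column in zip(*rows)]


o2_context = second_order_vector(WORD_VECTORS.values())
for name, value in zip(FEATURES, o2_context):
    print(f"{name:>8}: {value:.4f}")
```

Each printed value agrees with the O2 context row on the slide to within rounding.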
Limitations of 2nd order
0 52.27 0 0.92 0 4.21 0 28.72 0 3.24 0 1.28 0 2.53
Weapon Missile Shoot Fire Destroy Murder Kill
17.77 0 14.6 46.2 22.1 0 34.2 19.23 2.36 0 72.7 0 1.28 2.56
Execute Command Bomb Pipe Fire CD Burn
Singular Value Decomposition
After context representation…
Evaluation (before mapping)
       c3    c4    c2    c1
C1     10     0     3     2
C2      1     1     7     1
C3      2     1     1     6
C4      2    15     1     2
Evaluation (after mapping)
Agreement = 38/55 = 0.69
                                Total
C1     10     3     2     0      15
C2      1     7     1     1      10
C3      2     1     6     1      10
C4      2     1     2    15      20
Total  15    12    11    17      55
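The mapping step can be sketched as a search for the assignment of clusters to senses that maximizes the diagonal of the confusion matrix. The exhaustive search over permutations below is an illustrative choice, not necessarily the procedure used in the talk, and the column order (c3, c4, c2, c1) is an assumed reading of the flattened "before mapping" table.

```python
from itertools import permutations

# Rows are senses C1..C4; columns are clusters in the assumed order c3, c4, c2, c1.
CONFUSION = [
    [10, 0, 3, 2],
    [1, 1, 7, 1],
    [2, 1, 1, 6],
    [2, 15, 1, 2],
]


def best_mapping(matrix):
    """Try every one-to-one assignment of columns to rows; keep the best."""
    size = len(matrix)
    best_perm, best_hits = None, -1
    for perm in permutations(range(size)):
        hits = sum(matrix[row][perm[row]] for row in range(size))
        if hits > best_hits:
            best_perm, best_hits = perm, hits
    return best_perm, best_hits


mapping, agreed = best_mapping(CONFUSION)
total = sum(sum(row) for row in CONFUSION)
agreement = agreed / total

print(mapping)                      # column index chosen for each row
print(agreed, total, round(agreement, 2))
```

Exhaustive search is fine for four clusters; for larger matrices an assignment algorithm would be the more practical choice.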
Majority Sense Classifier
Maj. = 17/55 = 0.31
Experimental Data
Name Conflated Data
Name           Count     Name                 Count     New       Total     Maj.
Ronaldo        1,652     David Beckham        740       RoBeck    2,452     69.3%
Tajik          3,002     Rolf Ekeus           1,071     JikRol    4,073     73.7%
Microsoft      3,401     IBM                  2,406     MSIIBM    5,807     58.6%
Shimon Peres   7,846     Slobodan Milosovic   6,176     MonSlo    13,734    56.0%
Jordan         25,539    Egyptian             21,762    JorGypt   46,431    53.9%
Japan          118,712   France               112,357   JapAnce   231,069   51.4%
                              Cxt 5            Cxt 20
          #         Maj.   Ft 5    Ft 20    Ft 5    Ft 20
Robeck    2,452     69.3   57.3    72.7     85.9    54.7
JikRol    4,073     73.7   94.7    96.2     91.0    90.4
MSIIBM    5,807     58.6   47.7    51.3     68.0    60.0
MonSLo    13,734    56.0   62.8    96.6     54.6    91.4
JorGypt   46,431    53.9   56.6    59.1     57.0    53.0
JapAnce   231,069   51.4   51.1    51.1     50.3    50.3
Conclusions
Ongoing work
Thanks to…
