Measuring Similarity Between Concepts and Contexts
Ted Pedersen
Department of Computer Science
University of Minnesota, Duluth
http://www.d.umn.edu/~tpederse
The problems…
Similarity and Relatedness
The approaches…
Why measure conceptual similarity?
Word Sense Disambiguation
SenseRelate
WordNet::Similarity
[Figure: concept hierarchy with the nodes object, artifact, instrumentality, conveyance, vehicle, motor-vehicle, car, watercraft, boat, ark, article, ware, table-ware, cutlery, fork; from Jiang and Conrath [1997]]
Path Finding
[Figure: the same concept hierarchy, with the nodes object, artifact, instrumentality, conveyance, vehicle, motor-vehicle, car, watercraft, boat, ark, article, ware, table-ware, cutlery, fork]
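A path-based measure over this hierarchy can be sketched as follows. Only the node names come from the figure; the child-to-parent edges in `PARENT` are an assumption based on the usual is-a chains for these concepts, and scoring similarity as the reciprocal of the number of nodes on the shortest path is one common choice rather than necessarily the one used in the talk.

```python
# Assumed is-a edges (child -> parent). The node names come from the figure;
# the edges themselves are an assumption made for illustration.
PARENT = {
    "artifact": "object",
    "instrumentality": "artifact",
    "conveyance": "instrumentality",
    "vehicle": "conveyance",
    "motor-vehicle": "vehicle",
    "car": "motor-vehicle",
    "watercraft": "vehicle",
    "boat": "watercraft",
    "ark": "boat",
    "article": "artifact",
    "ware": "article",
    "table-ware": "ware",
    "cutlery": "table-ware",
    "fork": "cutlery",
}


def ancestors(concept):
    """Return the chain from the concept up to the root, inclusive."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def path_nodes(a, b):
    """Count the nodes on the shortest is-a path between a and b."""
    up_a = ancestors(a)
    steps_from_b = {node: i for i, node in enumerate(ancestors(b))}
    for steps_from_a, node in enumerate(up_a):
        if node in steps_from_b:
            # Edges climbed from each side, plus one to count nodes, not edges.
            return steps_from_a + steps_from_b[node] + 1
    raise ValueError("concepts share no ancestor")


def path_similarity(a, b):
    """Reciprocal of the path length in nodes; 1.0 for identical concepts."""
    return 1.0 / path_nodes(a, b)


print(path_nodes("car", "boat"), path_similarity("car", "boat"))
print(path_nodes("car", "fork"), path_similarity("car", "fork"))
```

Under these assumed edges, car and boat meet at vehicle, while car and fork only meet at artifact, so the first pair scores higher.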
Information Content
Observed “car”...
*root* (32783 + 1)
motor vehicle (327 + 1)
car (73 + 1)
cab (23)
minicab (6)
bus (17)
stock car (12)
Observed “stock car”...
*root* (32784 + 1)
motor vehicle (328 + 1)
car (74 + 1)
cab (23)
minicab (6)
bus (17)
stock car (12 + 1)
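The counting shown on these two slides can be sketched as below: each observation adds one to the observed concept and to every concept above it. The `PARENT` chain is a simplification assumed for illustration (any concepts between motor vehicle and the root are omitted), and `observe` is a hypothetical helper name.

```python
# Assumed chain for illustration; concepts between "motor vehicle" and the
# root are left out, so only the nodes shown on the slides are tracked.
PARENT = {"stock car": "car", "car": "motor vehicle", "motor vehicle": "*root*"}


def observe(counts, concept):
    """Add one to the concept and to each of its ancestors up to the root."""
    node = concept
    while node is not None:
        counts[node] = counts.get(node, 0) + 1
        node = PARENT.get(node)


# Starting counts as shown before the first observation.
counts = {"*root*": 32783, "motor vehicle": 327, "car": 73, "stock car": 12}

observe(counts, "car")        # increments car, motor vehicle, *root*
observe(counts, "stock car")  # increments stock car, car, motor vehicle, *root*

print(counts)
```

Counts for cab, minicab and bus stay unchanged because neither observation lies on their path to the root.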
After Counting Concepts...
*root* (32785)
motor vehicle (329), IC = 1.998
car (75)
cab (23)
minicab (6)
bus (17)
stock car (13), IC = 3.042
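A minimal sketch of how the IC values could be derived from these counts, assuming information content is the negative base-10 logarithm of a concept's count divided by the root count. The base is an assumption: it is chosen because it reproduces the 1.998 shown for motor vehicle. With the same formula, stock car comes out at about 3.402 rather than the 3.042 shown, which may reflect transposed digits on the slide.

```python
import math


def information_content(count, root_count):
    """IC = -log10(count / root_count); base 10 is an assumption."""
    return -math.log10(count / root_count)


ic_motor_vehicle = information_content(329, 32785)
ic_stock_car = information_content(13, 32785)

print(round(ic_motor_vehicle, 3))
print(round(ic_stock_car, 3))
```

Rarer concepts get a higher IC, so the more specific stock car scores above motor vehicle.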
Similarity and Information Content
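As an illustration, three measures commonly built on information content (Resnik, Lin, and the Jiang and Conrath distance) can be sketched from the counts on the preceding slides. The base-10 logarithm and the `PARENT` links (cab and stock car both directly under car) are assumptions, and the function names are hypothetical.

```python
import math

ROOT_COUNT = 32785
COUNTS = {"motor vehicle": 329, "car": 75, "cab": 23, "stock car": 13}
# Assumed is-a links among the concepts shown on the counting slides.
PARENT = {"stock car": "car", "cab": "car", "car": "motor vehicle"}


def ic(concept):
    """Information content as -log10 of relative frequency (assumed base)."""
    return -math.log10(COUNTS[concept] / ROOT_COUNT)


def lcs(a, b):
    """Lowest common subsumer: the first ancestor of b that also subsumes a."""
    ancestors_a = set()
    node = a
    while node is not None:
        ancestors_a.add(node)
        node = PARENT.get(node)
    node = b
    while node is not None and node not in ancestors_a:
        node = PARENT.get(node)
    if node is None:
        raise ValueError("no common subsumer in this fragment")
    return node


def resnik(a, b):
    return ic(lcs(a, b))


def lin(a, b):
    return 2 * ic(lcs(a, b)) / (ic(a) + ic(b))


def jiang_conrath_distance(a, b):
    return ic(a) + ic(b) - 2 * ic(lcs(a, b))


print(lcs("stock car", "cab"))
print(round(resnik("stock car", "cab"), 3))
print(round(lin("stock car", "cab"), 3))
print(round(jiang_conrath_distance("stock car", "cab"), 3))
```

Resnik uses only the IC of the shared subsumer, while Lin and Jiang and Conrath also account for how specific each concept is.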
Why doesn’t this solve the problem?
Using Dictionary Glosses to Measure Relatedness
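One simple way to use glosses is to count the content words two definitions share, in the spirit of Lesk-style overlap. The sketch below assumes that reading; the glosses in `GLOSSES` are made up for illustration and are not WordNet's, and the stopword list is likewise hypothetical.

```python
# Hypothetical stopword list and glosses, written only for this example.
STOPWORDS = frozenset({"a", "an", "the", "of", "or", "and", "for", "to", "in", "with"})

GLOSSES = {
    "car": "a motor vehicle with four wheels",
    "bus": "a motor vehicle for carrying many passengers",
    "fork": "cutlery used for serving and eating food",
}


def content_words(gloss):
    """Lower-case the gloss, split on whitespace, and drop stopwords."""
    return {word for word in gloss.lower().split() if word not in STOPWORDS}


def gloss_overlap(a, b):
    """Number of distinct content words shared by the two glosses."""
    return len(content_words(GLOSSES[a]) & content_words(GLOSSES[b]))


print(gloss_overlap("car", "bus"))
print(gloss_overlap("car", "fork"))
```

Because it needs only definitions, this kind of score can relate concepts that have no is-a path between them.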
Context/Gloss Vectors
Gloss/Context Vectors
Experiment
Results
Why this doesn’t solve the problem…
Knowledge Lean Methods
Word Sense Discrimination
Name Discrimination
Objective
Similarity of Context?
Feature Selection
Second Order Context Representation
2nd Order Context Vectors

              baseball   football     actor      movie      war     family   needle
won            18.5533    3324.98    30.520    51.7812   8.7399          0        0
Oscar                0          0    29.576   136.0441        0          0        0
guy           134.5102   205.5469         0          0        0   18818.55        0
O2 context      51.021    1176.84    20.032    62.6084   2.9133    6272.85        0
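The O2 context row appears to be the element-wise average of the word vectors for won, Oscar and guy; the sketch below recomputes it under that assumption. The column order and the names `FEATURES`, `WORD_VECTORS` and `second_order_vector` are choices made for this example.

```python
FEATURES = ["baseball", "football", "actor", "movie", "war", "family", "needle"]

# First-order co-occurrence scores for each word in the context.
WORD_VECTORS = {
    "won":   [18.5533, 3324.98, 30.520, 51.7812, 8.7399, 0.0, 0.0],
    "Oscar": [0.0, 0.0, 29.576, 136.0441, 0.0, 0.0, 0.0],
    "guy":   [134.5102, 205.5469, 0.0, 0.0, 0.0, 18818.55, 0.0],
}


def second_order_vector(words):
    """Average the word vectors of the context, one feature at a time."""
    rows = [WORD_VECTORS[word] for word in words]
    return [sum(column) / len(rows) for column in zip(*rows)]


o2_context = second_order_vector(["won", "Oscar", "guy"])
for feature, value in zip(FEATURES, o2_context):
    print(f"{feature:10s} {value:10.4f}")
```

Averaging lets a context pick up features such as movie or family that never occur in the context itself, only alongside its words elsewhere.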
Limitations of 2nd order
0 52.27 0 0.92 0 4.21 0 28.72 0 3.24 0 1.28 0 2.53
Weapon Missile Shoot Fire Destroy Murder Kill
17.77 0 14.6 46.2 22.1 0 34.2 19.23 2.36 0 72.7 0 1.28 2.56
Execute Command Bomb Pipe Fire CD Burn
Singular Value Decomposition
After context representation…
Evaluation (before mapping)

        c1   c2   c4   c3
C1       2    3    0   10
C2       1    7    1    1
C3       6    1    1    2
C4       2    1   15    2

Evaluation (after mapping)

                             Total
C1      10    3    2    0      15
C2       1    7    1    1      10
C3       2    1    6    1      10
C4       2    1    2   15      20
Total   15   12   11   17      55

Agreement = 38/55 = 0.69

Majority Sense Classifier
Maj. = 17/55 = 0.31
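The mapping step can be sketched as a search for the one-to-one pairing of rows and columns that maximizes the total on the diagonal. The brute-force search over permutations below is an assumption made for illustration (fine for a 4 x 4 table), not necessarily how the original evaluation was implemented. The majority figure of 17 matches the largest column total.

```python
from itertools import permutations

# Before-mapping table: rows C1..C4, columns in the order c1, c2, c4, c3.
TABLE = [
    [2, 3, 0, 10],
    [1, 7, 1, 1],
    [6, 1, 1, 2],
    [2, 1, 15, 2],
]


def best_mapping(table):
    """Return (column index for each row, agreement) with maximal agreement."""
    size = len(table)

    def agreement(columns):
        return sum(table[row][columns[row]] for row in range(size))

    best = max(permutations(range(size)), key=agreement)
    return best, agreement(best)


mapping, agreed = best_mapping(TABLE)
total = sum(sum(row) for row in TABLE)
majority = max(sum(column) for column in zip(*TABLE))

print(mapping, agreed, total)
print(round(agreed / total, 2))    # agreement after mapping
print(round(majority / total, 2))  # majority baseline
```

Comparing the two ratios shows how far the clustering improves on always choosing the most frequent class.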
Experimental Data
Name Conflated Data

Name             Count    Name                   Count    New        Total     Maj.
Ronaldo          1,652    David Beckham            740    RoBeck     2,452    69.3%
Tajik            3,002    Rolf Ekeus             1,071    JikRol     4,073    73.7%
Microsoft        3,401    IBM                    2,406    MSIIBM     5,807    58.6%
Shimon Peres     7,846    Slobodan Milosevic     6,176    MonSlo    13,734    56.0%
Jordan          25,539    Egyptian              21,762    JorGypt   46,431    53.9%
Japan          118,712    France               112,357    JapAnce  231,069    51.4%
                                 Cxt 5             Cxt 20
                 #     Maj.   Ft 5   Ft 20     Ft 5   Ft 20
RoBeck       2,452     69.3   57.3    72.7     85.9    54.7
JikRol       4,073     73.7   94.7    96.2     91.0    90.4
MSIIBM       5,807     58.6   47.7    51.3     68.0    60.0
MonSlo      13,734     56.0   62.8    96.6     54.6    91.4
JorGypt     46,431     53.9   56.6    59.1     57.0    53.0
JapAnce    231,069     51.4   51.1    51.1     50.3    50.3
Conclusions
Ongoing work
Thanks to…
