SlideShare ist ein Scribd-Unternehmen logo
1 von 14
Downloaden Sie, um offline zu lesen
Leveraging Sentiment to Compute
Word Similarity
Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak
Bhattacharyya
Dept. of Computer Science and Engineering, IIT Bombay
6th International Global Wordnet Conference
GWC 2011,
Matsue, Japan, Jan, 2012
Motivation
2










Motivation
3
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?







Motivation
4
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?
 Useful for replacing an unknown feature in test set with a similar
feature in training set






Motivation
5
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?
 Useful for replacing an unknown feature in test set with a similar
feature in training set
 Given a word in a sentence, create its Similarity Vector
 Use Word Sense Disambiguation on context to find its Synset-id
 Create a Gloss Vector (sparse) using its gloss
 Extend gloss using relevant WordNet Relations
 Learn the relations to use for different POS tags and the depth in WordNet
hierarchy
 Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
Motivation
6
 Introduce Sentiment as another feature in the Semantic Similarity
Measure
 “Among a set of a similar word pairs, a pair is more similar if their
sentiment content is the same”
 Is “enchant” (hold spellbound) more similar to “endear” (make
endearing or lovable) than to “delight ”(give pleasure to or be
pleasing to) ?
 Useful for replacing an unknown feature in test set with a similar
feature in training set
 Given a word in a sentence, create its Similarity Vector
 Use Word Sense Disambiguation on context to find its Synset-id
 Create a Gloss Vector (sparse) using its gloss
 Extend gloss using relevant WordNet Relations
 Learn the relations to use for different POS tags and the depth in WordNet
hierarchy
 Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
Sentiment-Semantic Correlation
7/23/2013Department of Computer Science and Engineering, IIT Bombay
7
Annotation Strategy Overall NOUN VERB ADJECTIVES ADVERBS
Meaning 0.768 0.803 0.750 0.527 0.759
Meaning + Sentiment 0.799 0.750 0.889 0.720 0.844
WordNet Relations used for
Expansion
7/23/2013Department of Computer Science and Engineering, IIT Bombay
8
POS WordNet relations used for expansion
Nouns hypernym, hyponym, nominalization
Verbs nominalization, hypernym, hyponym
Adjectives also see, nominalization, attribute
Adverbs derived
Scoring Formula
7/23/2013Department of Computer Science and Engineering, IIT Bombay
9
 ScoreSD(A) = SWNpos(A)- SWNneg(A)
 ScoreSM(A)= max(SWNpos(A), SWNneg(A))
 ScoreTM(A) =
sign(max(SWNpos(A), SWNneg(A)))∗ (1+abs(max(SWNpos(A),
SWNneg(A)))
SenSimx(A, B) = cosine (glossvec (sense(A)), glossvec (sense(B)))
Where,
glossvec =1:scorex(1) 2:scorex(2)… n:scorex(n)
scorex(Y) = Sentiment score of word Y using scoring function x
x = Scoring function of type SD/SM/TD/TM
Evaluation on Gold Standard Data:
Word Pair Similarity10
Evaluation on Gold Standard Data:
Word Pair Similarity11
• A set of 50 word pairs (with given context) manually marked
• Each word pair is given 3 scores in the form of ratings (1-5):
– Similarity based on meaning
– Similarity based on sentiment
– Similarity based on meaning + sentiment
Evaluation on Gold Standard Data:
Word Pair Similarity12
• A set of 50 word pairs (with given context) manually marked
• Each word pair is given 3 scores in the form of ratings (1-5):
– Similarity based on meaning
– Similarity based on sentiment
– Similarity based on meaning + sentiment
Agreement metric: Pearson correlation coefficient
Evaluation on Gold Standard Data:
Word Pair Similarity13
• A set of 50 word pairs (with given context) manually marked
• Each word pair is given 3 scores in the form of ratings (1-5):
– Similarity based on meaning
– Similarity based on sentiment
– Similarity based on meaning + sentiment
Agreement metric: Pearson correlation coefficient
Metric Used Overall NOUN VERB ADJECTIVES ADVERBS
LESK (Banerjee et al., 2003) 0.22 0.51 -0.91 0.19 0.37
LIN (Lin, 1998) 0.27 0.24 0.00 NA Na
LCH (Leacock et al., 1998) 0.36 0.34 0.44 NA NA
SenSim (SD) 0.46 0.73 0.55 0.08 0.76
SenSim (SM) 0.50 0.62 0.48 0.06 0.54
SenSim (TD) 0.45 0.73 0.55 0.08 0.59
SenSim (TM) 0.48 0.62 0.48 0.06 0.78
Evaluation on Travel Review Data:
Feature Replacement
7/23/2013Department of Computer Science and Engineering, IIT Bombay
14
Metric Used Accuracy
(%)
PP NP PR NR
Baseline 89.10 91.50 87.07 85.18 91.24
LESK
(Banerjee et al., 2003)
89.36 91.57 87.46 85.68 91.25
LIN (Lin, 1998) 89.27 91.24 87.61 85.85 90.90
LCH
(Leacock et al., 1998)
89.64 90.48 88.86 86.47 89.63
SenSim (SD) 89.95 91.39 88.65 87.11 90.93
SenSim (SM) 90.06 92.01 88.38 86.67 91.58
SenSim (TD) 90.11 91.68 88.69 86.97 91.23

Weitere ähnliche Inhalte

Ähnlich wie Leveraging Sentiment to Compute Word Similarity

Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skills
Caroline Chua
 
Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skills
guesta59df6
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
Andre Freitas
 

Ähnlich wie Leveraging Sentiment to Compute Word Similarity (20)

Sentiment+Analysis.ppt
Sentiment+Analysis.pptSentiment+Analysis.ppt
Sentiment+Analysis.ppt
 
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
EVALution 1.0 - An Evolving Semantic Dataset for Trainining and Evaluation of...
 
Lac presentation
Lac presentationLac presentation
Lac presentation
 
A Context-Based Algorithm For Sentiment Analysis
A Context-Based Algorithm For Sentiment AnalysisA Context-Based Algorithm For Sentiment Analysis
A Context-Based Algorithm For Sentiment Analysis
 
Analyzing Arguments during a Debate using Natural Language Processing in Python
Analyzing Arguments during a Debate using Natural Language Processing in PythonAnalyzing Arguments during a Debate using Natural Language Processing in Python
Analyzing Arguments during a Debate using Natural Language Processing in Python
 
Detecting Paraphrases in Marathi Language
Detecting Paraphrases in Marathi LanguageDetecting Paraphrases in Marathi Language
Detecting Paraphrases in Marathi Language
 
An Improved sentiment classification for objective word.
An Improved sentiment classification for objective word.An Improved sentiment classification for objective word.
An Improved sentiment classification for objective word.
 
Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skills
 
Sourcebased Questions Skills
Sourcebased Questions SkillsSourcebased Questions Skills
Sourcebased Questions Skills
 
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUESA SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
A SURVEY OF SENTIMENT CLASSSIFICTION TECHNIQUES
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Fyp ca2
Fyp ca2Fyp ca2
Fyp ca2
 
Word Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology ClassesWord Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology Classes
 
NLP
NLPNLP
NLP
 
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffnL6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
 
DETECTING OXYMORON IN A SINGLE STATEMENT
DETECTING OXYMORON IN A SINGLE STATEMENTDETECTING OXYMORON IN A SINGLE STATEMENT
DETECTING OXYMORON IN A SINGLE STATEMENT
 
N01741100102
N01741100102N01741100102
N01741100102
 
Introduction to Distributional Semantics
Introduction to Distributional SemanticsIntroduction to Distributional Semantics
Introduction to Distributional Semantics
 
Word sense disambiguation using wsd specific wordnet of polysemy words
Word sense disambiguation using wsd specific wordnet of polysemy wordsWord sense disambiguation using wsd specific wordnet of polysemy words
Word sense disambiguation using wsd specific wordnet of polysemy words
 
6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt
 

Mehr von Subhabrata Mukherjee

Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Subhabrata Mukherjee
 

Mehr von Subhabrata Mukherjee (18)

XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
XtremeDistil: Multi-stage Distillation for Massive Multilingual ModelsXtremeDistil: Multi-stage Distillation for Massive Multilingual Models
XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
 
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
 
Fact Checking from Text
Fact Checking from TextFact Checking from Text
Fact Checking from Text
 
OpenTag: Open Attribute Value Extraction From Product Profiles
OpenTag: Open Attribute Value Extraction From Product ProfilesOpenTag: Open Attribute Value Extraction From Product Profiles
OpenTag: Open Attribute Value Extraction From Product Profiles
 
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Co...
 
Continuous Experience-aware Language Model
Continuous Experience-aware Language ModelContinuous Experience-aware Language Model
Continuous Experience-aware Language Model
 
Experience aware Item Recommendation in Evolving Review Communities
Experience aware Item Recommendation in Evolving Review CommunitiesExperience aware Item Recommendation in Evolving Review Communities
Experience aware Item Recommendation in Evolving Review Communities
 
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construc...
 
Leveraging Joint Interactions for Credibility Analysis in News Communities
Leveraging Joint Interactions for Credibility Analysis in News CommunitiesLeveraging Joint Interactions for Credibility Analysis in News Communities
Leveraging Joint Interactions for Credibility Analysis in News Communities
 
People on Drugs: Credibility of User Statements in Health Forums
People on Drugs: Credibility of User Statements in Health ForumsPeople on Drugs: Credibility of User Statements in Health Forums
People on Drugs: Credibility of User Statements in Health Forums
 
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
Author-Specific Hierarchical Sentiment Aggregation for Rating Prediction of R...
 
Joint Author Sentiment Topic Model
Joint Author Sentiment Topic ModelJoint Author Sentiment Topic Model
Joint Author Sentiment Topic Model
 
TwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
TwiSent: A Multi-Stage System for Analyzing Sentiment in TwitterTwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
TwiSent: A Multi-Stage System for Analyzing Sentiment in Twitter
 
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
Adaptation of Sentiment Analysis to New Linguistic Features, Informal Languag...
 
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
WikiSent : Weakly Supervised Sentiment Analysis Through Extractive Summarizat...
 
Feature specific analysis of reviews
Feature specific analysis of reviewsFeature specific analysis of reviews
Feature specific analysis of reviews
 
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
YouCat : Weakly Supervised Youtube Video Categorization System from Meta Data...
 
Sentiment Analysis in Twitter with Lightweight Discourse Analysis
Sentiment Analysis in Twitter with Lightweight Discourse AnalysisSentiment Analysis in Twitter with Lightweight Discourse Analysis
Sentiment Analysis in Twitter with Lightweight Discourse Analysis
 

Kürzlich hochgeladen

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Leveraging Sentiment to Compute Word Similarity

  • 1. Leveraging Sentiment to Compute Word Similarity Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak Bhattacharyya Dept. of Computer Science and Engineering, IIT Bombay 6th International Global Wordnet Conference GWC 2011, Matsue, Japan, Jan, 2012
  • 3. Motivation 3  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?       
  • 4. Motivation 4  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?  Useful for replacing an unknown feature in test set with a similar feature in training set      
  • 5. Motivation 5  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?  Useful for replacing an unknown feature in test set with a similar feature in training set  Given a word in a sentence, create its Similarity Vector  Use Word Sense Disambiguation on context to find its Synset-id  Create a Gloss Vector (sparse) using its gloss  Extend gloss using relevant WordNet Relations  Learn the relations to use for different POS tags and the depth in WordNet hierarchy  Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
  • 6. Motivation 6  Introduce Sentiment as another feature in the Semantic Similarity Measure  “Among a set of a similar word pairs, a pair is more similar if their sentiment content is the same”  Is “enchant” (hold spellbound) more similar to “endear” (make endearing or lovable) than to “delight ”(give pleasure to or be pleasing to) ?  Useful for replacing an unknown feature in test set with a similar feature in training set  Given a word in a sentence, create its Similarity Vector  Use Word Sense Disambiguation on context to find its Synset-id  Create a Gloss Vector (sparse) using its gloss  Extend gloss using relevant WordNet Relations  Learn the relations to use for different POS tags and the depth in WordNet hierarchy  Incorporate SentiWordNet Scores in the Expanded Vector using Different Scoring
  • 7. Sentiment-Semantic Correlation 7/23/2013Department of Computer Science and Engineering, IIT Bombay 7 Annotation Strategy Overall NOUN VERB ADJECTIVES ADVERBS Meaning 0.768 0.803 0.750 0.527 0.759 Meaning + Sentiment 0.799 0.750 0.889 0.720 0.844
  • 8. WordNet Relations used for Expansion 7/23/2013Department of Computer Science and Engineering, IIT Bombay 8 POS WordNet relations used for expansion Nouns hypernym, hyponym, nominalization Verbs nominalization, hypernym, hyponym Adjectives also see, nominalization, attribute Adverbs derived
  • 9. Scoring Formula 7/23/2013Department of Computer Science and Engineering, IIT Bombay 9  ScoreSD(A) = SWNpos(A)- SWNneg(A)  ScoreSM(A)= max(SWNpos(A), SWNneg(A))  ScoreTM(A) = sign(max(SWNpos(A), SWNneg(A)))∗ (1+abs(max(SWNpos(A), SWNneg(A))) SenSimx(A, B) = cosine (glossvec (sense(A)), glossvec (sense(B))) Where, glossvec =1:scorex(1) 2:scorex(2)… n:scorex(n) scorex(Y) = Sentiment score of word Y using scoring function x x = Scoring function of type SD/SM/TD/TM
  • 10. Evaluation on Gold Standard Data: Word Pair Similarity10
  • 11. Evaluation on Gold Standard Data: Word Pair Similarity11 • A set of 50 word pairs (with given context) manually marked • Each word pair is given 3 scores in the form of ratings (1-5): – Similarity based on meaning – Similarity based on sentiment – Similarity based on meaning + sentiment
  • 12. Evaluation on Gold Standard Data: Word Pair Similarity12 • A set of 50 word pairs (with given context) manually marked • Each word pair is given 3 scores in the form of ratings (1-5): – Similarity based on meaning – Similarity based on sentiment – Similarity based on meaning + sentiment Agreement metric: Pearson correlation coefficient
  • 13. Evaluation on Gold Standard Data: Word Pair Similarity13 • A set of 50 word pairs (with given context) manually marked • Each word pair is given 3 scores in the form of ratings (1-5): – Similarity based on meaning – Similarity based on sentiment – Similarity based on meaning + sentiment Agreement metric: Pearson correlation coefficient Metric Used Overall NOUN VERB ADJECTIVES ADVERBS LESK (Banerjee et al., 2003) 0.22 0.51 -0.91 0.19 0.37 LIN (Lin, 1998) 0.27 0.24 0.00 NA Na LCH (Leacock et al., 1998) 0.36 0.34 0.44 NA NA SenSim (SD) 0.46 0.73 0.55 0.08 0.76 SenSim (SM) 0.50 0.62 0.48 0.06 0.54 SenSim (TD) 0.45 0.73 0.55 0.08 0.59 SenSim (TM) 0.48 0.62 0.48 0.06 0.78
  • 14. Evaluation on Travel Review Data: Feature Replacement 7/23/2013Department of Computer Science and Engineering, IIT Bombay 14 Metric Used Accuracy (%) PP NP PR NR Baseline 89.10 91.50 87.07 85.18 91.24 LESK (Banerjee et al., 2003) 89.36 91.57 87.46 85.68 91.25 LIN (Lin, 1998) 89.27 91.24 87.61 85.85 90.90 LCH (Leacock et al., 1998) 89.64 90.48 88.86 86.47 89.63 SenSim (SD) 89.95 91.39 88.65 87.11 90.93 SenSim (SM) 90.06 92.01 88.38 86.67 91.58 SenSim (TD) 90.11 91.68 88.69 86.97 91.23