SENTIMENT ANALYSIS
SENTIWORDNET AND POLARITY CLASSIFICATION

Roseline Antai
Sentiment Analysis
   Introduction
   Objectivity and Subjectivity detection
   Opinion extraction
   Polarity classification
   Approaches to Sentiment Analysis
   Issues in Sentiment Analysis
   SentiWordNet
   My Work
   References
Sentiment Analysis-Introduction


  Sentiment Analysis- “What’s the Point?”
Introduction
   People these days place so much value on what
    others think, and this shapes their decisions.
   In everyday life, when it comes to making
    decisions, people yearn to know the feelings, and
    the experiences of others, before making up their
    minds.
   People also consult political discussion forums to
    help decide how to vote, read consumer reports
    before buying appliances, and ask friends for their
    opinions before going to a certain restaurant.
   The World Wide Web has now created an avenue
    for the opinions of others to be expressed freely,
    and accessed by others.
Intro...
   Sentiment analysis has been referred to as:
     Subjectivity analysis
     Opinion mining

     Appraisal extraction

     Some connections to affective computing
Intro...
   Sentiment analysis is an area which tries to
    determine the mindset of an author, through a
    body of text.
   Sentiment analysis, or statement polarity
    classification, involves determining the
    relative positivity or negativity of a document, web
    page or selection of text. The task is difficult
    because of the variation and complexity of
    linguistic expression.
   Sentiment analysis has also been defined as the
    task of identifying positive and negative opinions,
    emotions and evaluations.
Intro...
   The important steps in sentiment analysis

     Objectivity/Subjectivity detection in text
     Opinion extraction
     Polarity classification
Objectivity and Subjectivity detection
   While some researchers have used algorithms
    for detecting subjective text, others have used
    syntactic rules.
   Morinaga et al. (2002) generated syntactic and
    linguistic rules in advance, on the basis of
    human-tested samples, to determine whether any given
    statement is an opinion or not. Statements about
    the target products whose reputations are being
    studied are collected from the web and passed
    through these rules, and the opinions are
    extracted.
Opinion Extraction
   Das and Chen (2007) extract sentiment from investor message boards
    using five different algorithms (naïve classifier, vector-distance
    classifier, discriminant-based classifier, adjective-adverb classifier
    and Bayesian classifier) to classify messages into three types:
    optimistic, pessimistic and neutral. The neutral statements are the
    objective statements which do not fall into either class.

   Turney (2002) applies the PMI-IR algorithm to extracted phrases of two
    consecutive words, where one word is an adjective or adverb and the
    second provides context. The context is needed because an adjective
    may have a different orientation depending on the review.

       An example is the adjective “unpredictable”, which in an automotive
        review would have a negative orientation if used in a phrase like
        “unpredictable steering”, while in a movie review it would have a
        positive orientation if used in a phrase such as “unpredictable plot”.
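
As a rough illustration of how PMI-IR can turn a two-word phrase into an orientation score, the sketch below follows the formulation as understood from Turney (2002) in the reference list: the phrase's association with the word "excellent" is compared against its association with the word "poor", using search-engine hit counts. The function name, the `smoothing` parameter and all hit counts are assumptions made for illustration; none of them come from the slides.

```python
import math


def semantic_orientation(hits_near_excellent, hits_near_poor,
                         hits_excellent, hits_poor, smoothing=0.01):
    """Estimate the semantic orientation of a phrase from hit counts.

    hits_near_excellent / hits_near_poor: hits for the phrase occurring
    near the reference word "excellent" / "poor".
    hits_excellent / hits_poor: hits for the reference words alone.
    smoothing: small constant added to avoid division by zero
    (an assumed default).
    """
    numerator = (hits_near_excellent + smoothing) * hits_poor
    denominator = (hits_near_poor + smoothing) * hits_excellent
    return math.log2(numerator / denominator)


# Made-up hit counts: the same adjective in two different contexts.
so_plot = semantic_orientation(120, 30, 10_000, 10_000)
so_steering = semantic_orientation(5, 60, 10_000, 10_000)
print("unpredictable plot:", "positive" if so_plot > 0 else "negative")
print("unpredictable steering:", "positive" if so_steering > 0 else "negative")
```

A score above zero suggests a positive orientation and a score below zero a negative one, which is how the same adjective can end up with opposite labels in different review domains.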
Polarity classification
   The first step in emotion classification research is the
    question, “Which emotions should be addressed?”
    (Danisman and Alpkocak, 2008).
   In the lexical approach, a dictionary or lexicon of pre-tagged
    words is used. Each word in the text is
    compared against the dictionary. If the word is
    present, its polarity value is added to the ‘total polarity
    score’. If the total polarity score of a text
    is positive, the text is classified as positive. Otherwise,
    it is classified as negative.
   In the machine learning approach, a set of feature
    vectors is chosen and a tagged corpus
    is provided for training a classifier, which can
    then be applied to an untagged corpus of text.
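
The lexical approach described on this slide can be sketched in a few lines. This is only a minimal illustration: `TOY_LEXICON` and its values are invented, the tokenisation is deliberately naive, and `classify_lexical` is a hypothetical name rather than part of any cited system.

```python
# Hypothetical toy lexicon of pre-tagged words; a real system would load a
# resource such as SentiWordNet instead.
TOY_LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "awful": -1.0}


def classify_lexical(text, lexicon=TOY_LEXICON):
    """Add up the polarity values of known words and label the text."""
    total = 0.0
    for token in text.lower().split():
        word = token.strip(".,!?;:\"'")  # naive punctuation stripping
        if word in lexicon:
            total += lexicon[word]
    # As on the slide: a positive total means positive, otherwise negative.
    return "positive" if total > 0 else "negative"


print(classify_lexical("A great plot and good acting."))
print(classify_lexical("An awful film."))
```

Note that a text with no lexicon words scores zero and therefore falls into the negative class under this rule, which is one reason practical systems often add a neutral class.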
Machine Learning Approach
   The machine learning approach utilizes
    machine learning algorithms such as:
     Naïve Bayes
     Maximum entropy
     SVM/SVM Light
     ADTree (Alternating Decision Tree)


   The lexical approach makes use of lexicons
    like the General Inquirer (GI) lexicon, WordNet,
    ConceptNet and SentiWordNet.
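
To make the machine learning approach concrete, here is a minimal sketch of one of the listed algorithms, Naïve Bayes, written in plain Python with add-one smoothing. The four training documents are invented, and the function names `train_naive_bayes` and `predict` are hypothetical; a real experiment would use a proper tagged corpus and an established library.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_docs):
    """labelled_docs: list of (list_of_words, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in labelled_docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return {"classes": class_counts, "words": word_counts, "vocab": vocab}


def predict(model, words):
    """Return the label with the highest log-probability for the words."""
    total_docs = sum(model["classes"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["classes"].items():
        counts = model["words"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # class prior
        for w in words:
            if w in model["vocab"]:  # words never seen in training are skipped
                score += math.log((counts[w] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented tagged corpus, for illustration only.
training = [
    (["great", "plot"], "pos"),
    (["good", "acting"], "pos"),
    (["awful", "plot"], "neg"),
    (["bad", "acting"], "neg"),
]
model = train_naive_bayes(training)
print(predict(model, ["great", "acting"]))
```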
Application Domains
   Reviews
   Political blogs
   News Articles/Editorials
   Business message boards
Issues in Sentiment Analysis
   Negations
   Thwarted expectations
   Domain transferability
Negations
   In using the naïve classifier in their work, Das and
    Chen (2007) handled negation by matching each
    lexical entry with a corresponding counterpart carrying a
    negation sign. Before each message was
    processed, it was treated by a parsing algorithm which
    negated words if the sentence context required it. As
    an example, in a sentence which read “this stock is not
    good”, the word “good” would be replaced by
    “good__n” to mark the negation.

   Following their example, Pang et al. (2002)
    added the tag ‘NOT_’ to every word between a
    negation word, like “not”, “isn’t”, “didn’t”, etc., and the
    first punctuation mark following the negation word.
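
The NOT_ tagging scheme described above can be sketched as follows. The list of negation words and the simple regular-expression tokeniser are assumptions for illustration, and `tag_negation` is a hypothetical name; the cited papers may differ in the details.

```python
import re

# Illustrative, incomplete list of negation words.
NEGATION_WORDS = {"not", "no", "never", "isn't", "didn't", "don't", "can't"}
PUNCTUATION = {".", ",", ";", ":", "!", "?"}


def tag_negation(text):
    """Prefix NOT_ to each word between a negation word and the next punctuation mark."""
    tokens = re.findall(r"[\w']+|[.,;:!?]", text.lower())
    tagged, negating = [], False
    for tok in tokens:
        if tok in PUNCTUATION:
            negating = False  # punctuation closes the negation scope
            tagged.append(tok)
        elif tok in NEGATION_WORDS:
            negating = True  # start (or continue) a negation scope
            tagged.append(tok)
        elif negating:
            tagged.append("NOT_" + tok)
        else:
            tagged.append(tok)
    return tagged


print(tag_negation("This stock is not good, but I like it."))
```

After tagging, “good” and “NOT_good” are distinct features, so a classifier can learn different weights for them.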
Negations
   Some researchers simply represent the negation with another
    word, forming a new word from the negated
    verb. For example, given the sentence “I don’t enjoy it”, they
    first replace the contracted form with the full version, “I do not
    enjoy it”, and then rewrite it as “I do NOTenjoy it.” The
    word “enjoy” is thus used to form a new word, “NOTenjoy”, and in this
    way they are able to distinguish the word “enjoy”, which
    has a positive meaning, from the word “NOTenjoy”, which has
    a negative meaning.

   Denecke (2009) deals with negation by first scanning a
    text and identifying negation terms like “not”, “no” and
    “nothing”. If one of these negation terms is found within two
    terms of an affective word, it is assumed that the word’s polarity is
    reversed. Hence, any positive word near a
    negation term is ranked as negative, and any negative word
    near a negation term is ranked as positive.
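
The window-based rule attributed to Denecke (2009) above can be sketched like this. The polarity lexicon is invented, `apply_negation` is a hypothetical name, and treating "within two terms" as a distance of at most two token positions on either side is an assumption.

```python
NEGATION_TERMS = {"not", "no", "nothing"}


def apply_negation(tokens, polarities, window=2):
    """Flip the polarity of affective words close to a negation term.

    tokens: list of lowercase words.
    polarities: dict mapping affective words to +1 or -1 (hypothetical lexicon).
    Returns (word, polarity) pairs for the affective words only.
    """
    neg_positions = [i for i, t in enumerate(tokens) if t in NEGATION_TERMS]
    result = []
    for i, tok in enumerate(tokens):
        if tok not in polarities:
            continue
        polarity = polarities[tok]
        # Reverse the polarity if any negation term lies within the window.
        if any(abs(i - j) <= window for j in neg_positions):
            polarity = -polarity
        result.append((tok, polarity))
    return result


toy_polarities = {"good": 1, "bad": -1}
print(apply_negation("this stock is not good".split(), toy_polarities))
```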
Thwarted Expectations
   The term thwarted expectations has been
    defined as expressions which contain a
    number of words having a polarity which is
    opposite to the polarity of the expression itself
    (Annett and Kondrak, 2008).
   Take the review:
        “This film should be brilliant. It sounds like
        a great plot, the actors are first grade, and
        the supporting cast is good as well, and
        Stallone is attempting to deliver a good
        performance. However, it can’t hold up.”
   Almost every sentiment-bearing word here is positive,
    yet the final sentence makes the review as a whole negative.
Domain transferability
   The differences which exist in product features across widely
    varying domains make automatic sentiment
    classification across a wide range of domains quite difficult to
    achieve (Blitzer et al., 2007).
   Take a scenario where developers annotate corpora for a
    small number of domains, train classifiers on these corpora, and
    subsequently apply them to other similar corpora.
   This raises two questions:

       First, how accurate is the trained classifier when the test
        data’s distribution is significantly different from the training
        distribution?
       Second, which notion of domain similarity should be used to
        select domains to annotate which would serve as good proxies
        for other domains?
   Denecke (2009) also reports on classification
    across domains using SentiWordNet, and
    concludes from results obtained that a classifier
    trained on one domain is not transferable to
    another domain without a significant drop in
    accuracy.
   This may be due to the linguistic characteristics of
    different domains.
   Also, average SentiWordNet scores per word
    class vary for different domains.
   A classifier trained on a mixture of texts of
    different domains is better suited.
SentiWordNet
   SentiWordNet provides for each synset of
    WordNet a triple of polarity scores (positivity,
    negativity and objectivity) whose values sum to 1.
    For example, the triple (0, 1, 0) (positivity,
    negativity, objectivity) is assigned to the synset of
    the term “bad” (Denecke, 2009).
   It is a lexical resource in which each synset of
    WordNet is associated with three numerical
    scores, ‘obj’, ‘neg’ and ‘pos’. Each of the scores
    ranges from 0 to 1, and their sum equals 1
    (Saggion and Funk, 2010).
   The score triplet is derived by combining the
    results which are produced by a committee of
    eight ternary classifiers, all characterised by
    similar accuracy levels.
   SentiWordNet has been created automatically by
    means of a combination of linguistic and statistical
    classifiers. Like WordNet 2.0, from which it has
    been derived, SentiWordNet consists of around
    207,000 word-sense pairs or 117,660 synsets. It
    provides entries for nouns (71%), verbs (12%),
    adjectives (14%) and adverbs (3%).
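
The constraint that each score lies between 0 and 1 and that the three scores sum to 1 can be captured in a small data structure. This is a sketch only: `SentiScore` and `from_pos_neg` are hypothetical names, and deriving objectivity as 1 - (pos + neg) is an assumption that follows from the sum constraint stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentiScore:
    """Polarity triple for one synset: positivity, negativity, objectivity."""
    pos: float
    neg: float
    obj: float

    def __post_init__(self):
        for value in (self.pos, self.neg, self.obj):
            if not 0.0 <= value <= 1.0:
                raise ValueError("each score must lie in [0, 1]")
        if abs(self.pos + self.neg + self.obj - 1.0) > 1e-9:
            raise ValueError("the three scores must sum to 1")

    @classmethod
    def from_pos_neg(cls, pos, neg):
        # Objectivity is whatever remains once pos and neg are accounted for.
        return cls(pos, neg, 1.0 - pos - neg)


# The two senses of "bad" quoted in the slides.
bad_sense_1 = SentiScore.from_pos_neg(0.0, 1.0)
bad_sense_2 = SentiScore.from_pos_neg(0.625, 0.125)
print(bad_sense_1, bad_sense_2)
```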
SentiWordNet Scores
   SentiWordNet scores have been combined in
    different ways to classify text into positive or
    negative polarities. Two of these are:
   Denecke (2009) tested
    the suitability of polarity scores for sentiment
    classification of documents in different
    domains, and analysed accuracies in cross-domain
    settings.
   Six different domains were used: four sets of
    Amazon product reviews
    (books, DVDs, electronics and kitchen
    equipment), one on drugs, and one of news
    articles.
   Each word is stemmed and looked up in
    SentiWordNet.
   As many entries may exist for a word, the scores
    for positivity, negativity and objectivity of the
    entries are averaged.
   The ambiguity which arises from a word having
    very different values for different senses is not
    addressed in this work. E.g. “bad”, which in one
    sense has pos=0, neg=1 and obj=0, and in another
    sense has pos=0.625, neg=0.125 and obj=0.25.
   Instead, a simpler method of calculating the
    average of the scores of all senses is used.
   The polarity score triple is used to determine
    the semantic orientation of the word.
   If the positive value is larger, the word is
    positive; if the negative value is larger, the word is
    negative; where both are equal, the word is ignored.
   An average polarity triple for the full document
    is determined by summing up the polarity
    score triples of all opinionated words.
   If the number of positive words is larger than
    the number of negative words, the document is
    classified as positive; if the number of negative words
    is larger, it is classified as negative.
   If there are equal numbers of positive and
    negative words, the average polarity triple
    is checked: if its positive value is larger than
    the negative, the document is classified as
    positive, and vice versa.
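
The document-level procedure summarised above can be sketched as below. Stemming is left out, the mini-lexicon `TOY_SWN` is hypothetical (only the two triples for "bad" are taken from the slide; the others are invented), and returning "undecided" when both the word counts and the summed scores tie is an assumption, since the slides do not cover that case.

```python
# Hypothetical mini-lexicon: word -> list of (pos, neg, obj) triples, one per sense.
TOY_SWN = {
    "bad": [(0.0, 1.0, 0.0), (0.625, 0.125, 0.25)],  # values quoted on the slide
    "good": [(0.75, 0.0, 0.25), (0.5, 0.0, 0.5)],    # invented values
    "plot": [(0.0, 0.0, 1.0)],                       # invented values
}


def average_triple(triples):
    """Average the (pos, neg, obj) triples of all senses of a word."""
    n = len(triples)
    return tuple(sum(t[k] for t in triples) / n for k in range(3))


def classify_document(words, lexicon=TOY_SWN):
    pos_words = neg_words = 0
    doc_pos = doc_neg = 0.0
    for word in words:
        senses = lexicon.get(word)
        if not senses:
            continue
        pos, neg, _obj = average_triple(senses)
        if pos > neg:
            pos_words += 1
        elif neg > pos:
            neg_words += 1
        else:
            continue  # equal scores: the word is ignored
        doc_pos += pos
        doc_neg += neg
    # First compare the counts of positive and negative words.
    if pos_words != neg_words:
        return "positive" if pos_words > neg_words else "negative"
    # On a tie, fall back to the accumulated polarity scores.
    if doc_pos > doc_neg:
        return "positive"
    if doc_neg > doc_pos:
        return "negative"
    return "undecided"  # assumption: not specified in the slides


print(average_triple(TOY_SWN["bad"]))
print(classify_document(["good", "plot", "good", "bad"]))
```

Averaging the two senses of "bad" shows the ambiguity problem directly: a sense that is strongly positive pulls the averaged negativity well below 1.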
   Saggion and Funk (2010) use an English data
    source and an Italian data source.
   Again, word sense disambiguation (WSD) was not carried out.
   For each word, three counts are computed over its
    entries in SentiWordNet: the number of entries in which
    the word is more positive than negative
    (positive > negative), the number in which it is more
    negative than positive, and the total number of
    entries.
   In each sentence, the number of words which are more
    positive than negative is counted, and likewise
    the number which are more negative.
   The sentiment score for the sentence is positive if
    most words in the sentence are positive,
    negative if there are more negative words, and
    neutral otherwise.
   The paper also reports using summarization as a
    pre-processing step before classification, and that this
    leads to a statistically significant increase in
    classification accuracy.
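
The counting scheme described above for Saggion and Funk (2010) can be sketched as follows. The slides do not say exactly how the three counts decide a word's orientation, so the rule used here (a word is positive when more of its entries are positive than negative) is an assumption, as is reading "most words" as "more positive than negative words". The lexicon is hypothetical apart from the two "bad" triples quoted earlier in the slides.

```python
# Hypothetical mini-lexicon: word -> list of (pos, neg, obj) triples, one per entry.
TOY_SWN = {
    "excellent": [(1.0, 0.0, 0.0)],                                       # invented
    "poor": [(0.0, 0.75, 0.25), (0.0, 0.5, 0.5), (0.25, 0.0, 0.75)],      # invented
    "bad": [(0.0, 1.0, 0.0), (0.625, 0.125, 0.25)],                       # from the slides
}


def entry_counts(entries):
    """Return (entries more positive, entries more negative, total entries)."""
    more_pos = sum(1 for pos, neg, _obj in entries if pos > neg)
    more_neg = sum(1 for pos, neg, _obj in entries if neg > pos)
    return more_pos, more_neg, len(entries)


def word_orientation(entries):
    """Assumed decision rule: compare the two per-entry counts."""
    more_pos, more_neg, _total = entry_counts(entries)
    if more_pos > more_neg:
        return "pos"
    if more_neg > more_pos:
        return "neg"
    return None


def classify_sentence(words, lexicon=TOY_SWN):
    labels = [word_orientation(lexicon[w]) for w in words if w in lexicon]
    n_pos, n_neg = labels.count("pos"), labels.count("neg")
    if n_pos > n_neg:
        return "positive"
    if n_neg > n_pos:
        return "negative"
    return "neutral"


print(entry_counts(TOY_SWN["poor"]))
print(classify_sentence(["excellent", "service"]))
```

Without WSD, a word such as "bad" with one positive-leaning and one negative-leaning entry ends up with no orientation under this rule, so a sentence containing only that word is labelled neutral.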
DEMO
http://sentiwordnet.isti.cnr.it/
My Work
   Generate a simple baseline system
   Incorporate WSD in my work
   Will summarization lead to better results?
   What if the document space was reduced?
    Will this lead to better results?
   How do I make it domain adaptable?
References
   Morinaga, S., Yamanishi, K., Tateishi, K., and Fukushima, T. (2002). Mining product reputations on the
    web. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data
    Mining.
   Blitzer, J., Dredze, M., and Pereira, F. (2007). Biographies, Bollywood, boom-boxes and blenders: Domain
    adaptation for sentiment classification. In Proceedings of the 45th Annual Meeting of the Association
    for Computational Linguistics, pp. 440-447. Retrieved from
    http://acl.ldc.upenn.edu/P/P07/P07-1056.pdf
   Annett, M. and Kondrak, G. (2008). A comparison of sentiment analysis techniques: Polarizing
    movie blogs. In Proceedings of the 21st Conference of the Canadian Society for Computational Studies
    of Intelligence (Canadian AI '08), Sabine Bergler (Ed.). Springer-Verlag, Berlin, Heidelberg, pp. 25-35.
   Das, S. and Chen, M. (2007). Yahoo! for Amazon: Sentiment extraction from small talk on the Web.
    Management Science, 53(9):1375-1388.
   Mejova, Y. (2009). Sentiment analysis: An overview. Computer Science Department, University of Iowa.
    www.cs.uiowa.edu/~ymejoya/publications/comps YelenaMejova.pdf
   Pang, B., Lee, L., and Vaithyanathan, S. (2002). Thumbs up? Sentiment classification using machine
    learning techniques. In Proceedings of the Conference on Empirical Methods in Natural Language
    Processing (EMNLP), pp. 79-86.
   Turney, P. (2002). Thumbs up or thumbs down? Semantic orientation applied to unsupervised
    classification of reviews. In Proceedings of the 40th Annual Meeting of the Association for Computational
    Linguistics (ACL '02). Association for Computational Linguistics, Stroudsburg, PA, USA, pp. 417-424.
    DOI: 10.3115/1073083.1073153 http://dx.doi.org/10.3115/1073083.1073153
References
   Danisman, T. and Alpkocak, A. (2008). Feeler: Emotion classification of text using vector space model.
    In AISB 2008 Convention, Communication, Interaction and Social Intelligence, vol. 2,
    Aberdeen, UK, April 2008.
   Wilson, T., Wiebe, J., and Hoffmann, P. (2009). Recognizing contextual polarity: An exploration of
    features for phrase-level sentiment analysis. Computational Linguistics, 35(3):399-433.
   Denecke, K. (2009). Are SentiWordNet scores suited for multi-domain sentiment classification? In ICDM.
   Saggion, H. and Funk, A. (2010). Interpreting SentiWordNet for opinion classification. In Proceedings of
    LREC.
   Baccianella, S., Esuli, A., and Sebastiani, F. (2010). SentiWordNet 3.0: An enhanced lexical resource
    for sentiment analysis and opinion mining. In Proceedings of the Seventh Conference on
    International Language Resources and Evaluation.
Thank you!
Questions?

Lac presentation

  • 1. SENTIMENT ANALYSIS SENTIWORDNET AND POLARITY CLASSIFICATION Roseline Antai
  • 2. Sentiment Analysis  Introduction  Objectivity and Subjectivity detection  Opinion extraction  Polarity classification  Approaches to Sentiment Analysis  Issues in Sentiment Analysis  SentiWordNet  My Work  References
  • 3. Sentiment Analysis-Introduction Sentiment Analysis- “What‟s the Point?”
  • 4. Introduction  People these days place so much value on what others think, and this shapes their decisions.  In everyday life, when it comes to making decisions, people yearn to know the feelings, and the experiences of others, before making up their minds.  People also consult political discussion forums to aid in making up their minds on the votes to cast, read consumer reports before buying appliances, ask the opinions of friends before going to a certain restaurant.  The World Wide Web has now created an avenue for the opinions of others to be expressed freely, and accessed by others.
  • 5. Intro...  Sentiment analysis has been referred to as:  Subjectivity analysis  Opinion mining  Appraisal extraction  Some connections to affective computing
  • 6. Intro...  Sentiment analysis is an area which tries to determine the mindset of an author, through a body of text.  Sentiment analysis or statement polarity classification has to do with determining the relative positivity or negativity of a document, web page or selection of text. This task is a difficult one because of the variations and complexities that exist in language expressions.  Sentiment analysis has also been defined as the task of identifying positive and negative opinions, emotions and evaluations.
  • 7. Intro...  The important steps in sentiment analysis  Objectivity/Subjectivity detection in text  Opinion extraction  Polarity classification
  • 8. Objectivity and Subjectivity detection  While some researchers have used algorithms for detecting subjective text, others have used syntactic rules.  Morinaga et al (2002), on the basis of human- test samples generated in advance syntactic and linguistic rules to determine if any given statement is an opinion or not. Statements are collected from the web about the target products whose reputations they are working on, and then passed over these rules, and opinions are extracted.
  • 9. Opinion Extraction  Sentiment extraction from investor message boards, using five different algorithms, (Naïve classifier, vector- distance classifier, discriminant-based classifier, adjective-adverb classifier and Bayesian classifiers), to classify messages into three types, Optimistic, Pessimistic, and neutral, where the neutral statements are the objective statements which do not fall in either class.  PMI-IR algorithm used to extract two consecutive words, where one was an adjective or adverb, while the second provided context. This is due to the fact that an adjective may have a different orientation, depending on the review.  An example is the adjective “unpredictable”, which in an automotive review would have a negative orientation, if used in a phrase like “unpredictable steering”, while in a movie review, it would have a positive orientation if used in a phrase such as “unpredictable plot” .
  • 10. Polarity classification  The first step in emotion classification research is the question, “Which emotions should be addressed?” (Danisman and Alpkocak, 2008).  In the lexical approach, a dictionary or lexicon of pre-tagged words is utilized. Each word in the text is compared against the dictionary. If the word is present in the dictionary, its polarity value is added to the ‘total polarity score’ of the text. If the total polarity score is positive, the text is classified as positive. Otherwise, it is classified as negative.  For the machine learning approach, a series of feature vectors is chosen and a collection of tagged corpora is provided for training a ‘classifier’, which can then be applied to an untagged corpus of text.
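The lexical approach described on this slide can be sketched in a few lines of Python. This is only an illustration: `TOY_LEXICON`, its scores and the function name `classify_lexical` are assumptions made for the example and do not come from any real lexicon.

```python
# Toy lexicon of pre-tagged words. The entries and scores are made up
# for illustration and are not taken from any published resource.
TOY_LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "awful": -1.0}


def classify_lexical(text, lexicon=TOY_LEXICON):
    """Sum the polarity of every word found in the lexicon."""
    total_polarity = 0.0
    for word in text.lower().split():
        word = word.strip(".,!?;:\"'")  # drop surrounding punctuation
        if word in lexicon:
            total_polarity += lexicon[word]
    # As on the slide: positive score -> positive, otherwise negative.
    return "positive" if total_polarity > 0 else "negative"
```

Note that a text with no lexicon words scores 0 and therefore falls into the negative class under this rule, which is one reason real systems often add a neutral class.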
  • 11. Machine Learning Approach  The machine learning approach utilizes machine learning algorithms such as:  Naïve Bayes  Maximum entropy  SVM/SVM Light  ADTree (Alternating Decision Tree)  The lexical approach makes use of lexicons like the General Inquirer (GI) lexicon, WordNet, ConceptNet and SentiWordNet.
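As an illustration of the first algorithm in the list, here is a minimal sketch of a Naïve Bayes text classifier with add-one smoothing, written in plain Python. The function names and the bag-of-words features are assumptions for the example; the slides do not specify which features or implementation were used.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_docs):
    """labelled_docs: list of (list_of_tokens, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in labelled_docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, word_counts, vocabulary


def predict_naive_bayes(model, tokens):
    """Return the label with the highest log posterior."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][token]
            # Add-one (Laplace) smoothing avoids log(0).
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

A possible usage, with a tiny hand-made training set:

```python
training = [
    ("great plot".split(), "pos"),
    ("good acting".split(), "pos"),
    ("bad plot".split(), "neg"),
    ("awful acting".split(), "neg"),
]
model = train_naive_bayes(training)
label = predict_naive_bayes(model, "great acting".split())
```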
  • 12. Application Domains  Reviews  Political blogs  News Articles/Editorials  Business message boards
  • 13. Issues in Sentiment Analysis  Negations  Thwarted expectations  Domain transferability
  • 14. Negations  In using the naïve classifier in their work, Das and Chen (2007) handled negation by matching each lexical entry with a corresponding counterpart carrying a negation sign. Before it was processed, each message was treated by a parsing algorithm which negates words if the sentence context requires it. As an example, in a sentence which reads “this stock is not good”, the word “good” would be replaced by “good__n” to signify a negation.  Following their example, Pang et al. (2002) added the tag ‘NOT_’ to every word between a negation word, like “not”, “isn’t”, “didn’t”, etc., and the first punctuation mark following the negation word.
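The ‘NOT_’ tagging scheme can be sketched as follows. This is an illustrative reading of the description above, not the original authors' code: the negation word list, the tokenizer and the function name `tag_negation` are assumptions.

```python
import re

# Illustrative list of negation words. The slide only names
# "not", "isn't" and "didn't"; the rest are added for the example.
NEGATION_WORDS = {"not", "no", "never", "isn't", "didn't", "don't", "can't"}
PUNCTUATION = {".", ",", ";", ":", "!", "?"}


def tag_negation(text):
    """Prefix NOT_ to every word between a negation word and the
    first punctuation mark that follows it."""
    tokens = re.findall(r"[\w']+|[.,;:!?]", text.lower())
    tagged, negating = [], False
    for token in tokens:
        if token in PUNCTUATION:
            negating = False          # punctuation ends the negated span
            tagged.append(token)
        elif negating:
            tagged.append("NOT_" + token)
        else:
            tagged.append(token)
            if token in NEGATION_WORDS:
                negating = True       # start tagging after this word
    return tagged
```

For the sentence “This stock is not good, but I like it.” only “good” falls between the negation word and the comma, so only that token receives the prefix.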
  • 15. Negations  Some researchers simply represent the negation with another word, forming a new word from the negated verb. For example, given the sentence “I don’t enjoy it”, they first replace the shortened form by the full version, “I do not enjoy it”, and then rewrite it as “I do NOTenjoy it.” Hence, the word “enjoy” is used to form a new word “NOTenjoy”, and this way they are able to discriminate the word “enjoy”, which has a positive meaning, from the word “NOTenjoy”, which has a negative meaning.  Denecke (2009) deals with negation by first scanning a text and identifying negation terms like “not”, “no” and “nothing”. If one of these negation terms is found within two terms of an affective word, it is assumed that the word’s polarity is reversed. Hence, any positive word near a negation term is ranked as negative, and any negative word near a negation term is ranked as positive.
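The window-based negation rule attributed to Denecke (2009) can be sketched like this. The toy polarity table and the function name are assumptions, and the window is applied on both sides of the affective word, which is one possible reading of “within two terms”.

```python
NEGATION_TERMS = {"not", "no", "nothing"}

# Hypothetical polarity table: +1 for positive, -1 for negative words.
TOY_POLARITY = {"good": 1, "enjoy": 1, "bad": -1, "boring": -1}


def polarities_with_negation(tokens, window=2):
    """Return (word, polarity) pairs, reversing the polarity of any
    affective word that has a negation term within `window` tokens."""
    results = []
    for i, token in enumerate(tokens):
        if token not in TOY_POLARITY:
            continue
        polarity = TOY_POLARITY[token]
        # Tokens up to `window` positions before and after the word.
        nearby = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if any(t in NEGATION_TERMS for t in nearby):
            polarity = -polarity
        results.append((token, polarity))
    return results
```

With the sentence “I do not enjoy it”, the term “not” sits one position before “enjoy”, so its polarity is reversed. In “nothing here was ever bad”, the negation term is four positions away and leaves “bad” unchanged, which shows the limits of a fixed window.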
  • 16. Thwarted Expectations  The term thwarted expectations has been defined as expressions which contain a number of words having a polarity which is opposite to the polarity of the expression itself (Annett and Kondrak, 2008).  Take the review: “This film should be brilliant. It sounds like a great plot, the actors are first grade, and the supporting cast is good as well, and Stallone is attempting to deliver a good performance. However, it can’t hold up.” Most of the words are positive, yet the final sentence makes the review as a whole negative.
  • 17. Domain transferability  The differences which exist in product features across widely varying domains make automatic sentiment classification across a wide range of domains quite difficult to achieve (Blitzer et al., 2007).  Take a scenario where developers annotate corpora for a small number of domains, train classifiers on these corpora, and subsequently apply them to other similar corpora.  This raises two questions:  First, how accurate is the trained classifier when the test data’s distribution is significantly different from the training distribution?  Second, which notion of domain similarity should be used to select domains to annotate, which would serve as good proxies for other domains?
  • 18. Denecke (2009) also reports on classification across domains using SentiWordNet, and concludes from results obtained that a classifier trained on one domain is not transferable to another domain without a significant drop in accuracy.  This may be due to the linguistic characteristics of different domains.  Also, average SentiWordNet scores per word class vary for different domains.  A classifier trained on a mixture of texts of different domains is better suited.
  • 19. SentiWordNet  SentiWordNet provides for each synset of WordNet a triple of polarity scores (positivity, negativity and objectivity) whose values sum up to 1. For example, the triple (0, 1, 0) (positivity, negativity, objectivity) is assigned to the synset of the term “bad” (Denecke, 2009).  It is a lexical resource in which each synset of WordNet is associated with three numerical scores, ‘obj’, ‘neg’ and ‘pos’. Each of the scores ranges from 0 to 1, and their sum equals 1 (Saggion and Funk, 2010).
  • 20. The score triplet is derived by combining the results produced by a committee of eight ternary classifiers, all characterised by similar accuracy levels.  SentiWordNet has been created automatically by means of a combination of linguistic and statistical classifiers. Like WordNet 2.0, from which it has been derived, SentiWordNet consists of around 207,000 word-sense pairs or 117,660 synsets. It provides entries for nouns (71%), verbs (12%), adjectives (14%) and adverbs (3%).
  • 21. SentiWordNet Scores  SentiWordNet scores have been combined in different ways to classify text into positive or negative polarities. Two of these are:  Denecke (2009), whose work tested the suitability of SentiWordNet polarity scores for sentiment classification of documents in different domains, and analyzed accuracies in cross-domain settings.
  • 22. Six different domains were used: four of Amazon product reviews (books, DVDs, electronics and kitchen equipment), one on drugs, and one of news articles.  Each word is stemmed and looked up in SentiWordNet.  As many entries may exist for a word, the scores for positivity, negativity and objectivity of the entries are averaged.  The ambiguity which arises from a word having very different values for different senses is not addressed in this work. E.g. ‘bad’, which in one sense has pos = 0, neg = 1 and obj = 0, and in another sense has pos = 0.625, neg = 0.125 and obj = 0.25.
  • 23. Instead, a simpler method of calculating the average of the scores of all senses is utilized.  The polarity score triple is used to determine the semantic orientation of the word.  If the positive value is larger than the negative value, the word is positive; if the negative value is larger, the word is negative. Where both are equal, the word is ignored.  An average polarity triple for the full document is determined by summing up the polarity score triples of all opinionated words.
  • 24. If the number of positive words is larger than the number of negative words, the document is classified as positive; if the number of negative words is larger, it is classified as negative.  If there are equal numbers of positive and negative words, the average polarity triple is checked: if its positive value is larger than its negative value, the document is classified as positive, and vice versa.
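The document-level procedure described for Denecke (2009) can be sketched as below. This is a simplified illustration, not the author's implementation: stemming is skipped, `TOY_SENSES` is a hand-made stand-in for SentiWordNet (only the two ‘bad’ triples come from the slides; the others are invented), and the handling of documents with no opinionated words or with an exactly tied average is an assumption.

```python
# word -> list of (pos, neg, obj) triples, one per sense.
# The 'bad' triples are the ones quoted on the slides; the rest are
# invented for illustration.
TOY_SENSES = {
    "bad": [(0.0, 1.0, 0.0), (0.625, 0.125, 0.25)],
    "good": [(0.75, 0.0, 0.25), (0.5, 0.0, 0.5)],
    "plot": [(0.0, 0.0, 1.0)],
}


def average_triple(triples):
    """Component-wise average of a list of (pos, neg, obj) triples."""
    n = len(triples)
    return tuple(sum(t[i] for t in triples) / n for i in range(3))


def classify_document(tokens, senses=TOY_SENSES):
    n_pos = n_neg = 0
    opinionated = []
    for token in tokens:
        if token not in senses:
            continue
        pos, neg, obj = average_triple(senses[token])  # no WSD: average all senses
        if pos > neg:
            n_pos += 1
        elif neg > pos:
            n_neg += 1
        else:
            continue  # equal scores: the word is ignored
        opinionated.append((pos, neg, obj))
    if n_pos > n_neg:
        return "positive"
    if n_neg > n_pos:
        return "negative"
    if not opinionated:
        return "negative"  # assumption: the slides do not cover this case
    # Tie on word counts: fall back to the average document triple.
    doc_pos, doc_neg, _ = average_triple(opinionated)
    return "positive" if doc_pos > doc_neg else "negative"
```

Averaging the two senses of ‘bad’ gives (0.3125, 0.5625, 0.125), so the word still counts as negative even though one of its senses is mostly positive. This is the ambiguity that averaging leaves unaddressed.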
  • 25. Saggion and Funk (2010) use an English data source and an Italian data source.  Again, word sense disambiguation (WSD) was not carried out.  For each word, three counts are computed over its entries in SentiWordNet: the number of entries in which the word is more positive than negative (positive > negative), the number of entries in which it is more negative than positive, and the total number of entries.
  • 26. In each sentence, the number of words which are more positive than negative is calculated, and likewise the number of words which are more negative.  The sentiment score for the sentence is positive if most words in the sentence are positive, negative if there are more negative words, and neutral otherwise.  The paper also reports using summarization as a pre-processing step before classification, which leads to a statistically significant increase in classification accuracy.
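The counting scheme described for Saggion and Funk (2010) might be sketched as follows. Several details are assumptions, because the slides do not spell them out: a word is treated as positive when more of its entries are positive than negative, “most words” is read as “more positive than negative words”, and `TOY_ENTRIES` is an invented stand-in for SentiWordNet.

```python
# word -> list of (pos, neg) score pairs, one per SentiWordNet entry.
# All values are invented for illustration.
TOY_ENTRIES = {
    "good": [(0.75, 0.0), (0.5, 0.0), (0.0, 0.125)],
    "bad": [(0.0, 1.0), (0.625, 0.125), (0.0, 0.75)],
    "plot": [(0.0, 0.0)],
}


def entry_counts(word, entries=TOY_ENTRIES):
    """Return (#entries with pos > neg, #entries with neg > pos, #entries)."""
    scores = entries.get(word, [])
    more_pos = sum(1 for pos, neg in scores if pos > neg)
    more_neg = sum(1 for pos, neg in scores if neg > pos)
    return more_pos, more_neg, len(scores)


def classify_sentence(tokens, entries=TOY_ENTRIES):
    pos_words = neg_words = 0
    for token in tokens:
        more_pos, more_neg, _ = entry_counts(token, entries)
        if more_pos > more_neg:
            pos_words += 1      # word is positive in most of its entries
        elif more_neg > more_pos:
            neg_words += 1      # word is negative in most of its entries
    if pos_words > neg_words:
        return "positive"
    if neg_words > pos_words:
        return "negative"
    return "neutral"
```

Unlike the averaging approach, this scheme looks only at how many entries lean each way and ignores the size of the scores.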
  • 28. My Work  Generate a simple baseline system  Incorporate WSD in my work  Will summarization lead to better results?  What if the document space were reduced? Will this lead to better results?  How do I make it domain adaptable?
  • 29. References  Morinaga, S., Yamanishi, K., Tateishi, K., and Fukushima, T. (2002). Mining product reputations on the web. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.  Blitzer, J., Dredze, M., and Pereira, F. (2007). Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pp. 440-447. Retrieved from http://acl.ldc.upenn.edu/P/P07/P07-1056.pdf  Annett, M. and Kondrak, G. (2008). A comparison of sentiment analysis techniques: Polarizing movie blogs. In Proceedings of the 21st Conference of the Canadian Society for Computational Studies of Intelligence on Advances in Artificial Intelligence (Canadian AI'08), Sabine Bergler (Ed.). Springer-Verlag, Berlin, Heidelberg, 25-35.  Das, S. and Chen, M. (2007). Yahoo! for Amazon: Sentiment extraction from small talk on the web. Management Science 53(9): 1375-1388.  Mejova, Y. (2009). Sentiment Analysis: An Overview. Computer Science Department, University of Iowa. www.cs.uiowa.edu/~ymejoya/publications/comps YelenaMejova.pdf  Pang, B., Lee, L., and Vaithyanathan, S. (2002). Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 79-86.  Turney, P. (2002). Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (ACL '02). Association for Computational Linguistics, Stroudsburg, PA, USA, 417-424. DOI=10.3115/1073083.1073153 http://dx.doi.org/10.3115/1073083.1073153
  • 30. References  Danisman, T. and Alpkocak, A. (2008). Feeler: Emotion classification of text using vector space model. In AISB 2008 Convention, Communication, Interaction and Social Intelligence, vol. 2, Aberdeen, UK, April 2008.  Wilson, T., Wiebe, J., and Hoffmann, P. (2009). Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis. Computational Linguistics, 35(3): 399-433.  Denecke, K. (2009). Are SentiWordNet scores suited for multi-domain sentiment classification? In ICDM.  Saggion, H. and Funk, A. (2010). Interpreting SentiWordNet for opinion classification. In Proceedings of LREC.  Baccianella, S., Esuli, A., and Sebastiani, F. (2010). SentiWordNet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining. In Proceedings of the Seventh Conference on International Language Resources and Evaluation.