ISCRAM 2013: A Fine-Grained Sentiment Analysis Approach for Detecting Crisis Related Microposts

  1. A Fine-Grained Sentiment Analysis Approach for Detecting Crisis Related Microposts
     Axel Schulz, Tung Dang Thanh, Dr. Heiko Paulheim, Dr. Immanuel Schweizer
     May 14, 2013
     Telecooperation Lab, Technische Universität Darmstadt
     The work is partly funded by a grant of the German Federal Ministry for Education and Research.
  2. Motivation
     Problem: fragmented situational picture
     Decision making is based on information from:
     • Onsite rescue squads
     • Traditional data sources
     Bystanders report additional information about the current situation.
     Decision makers must be aware of all relevant information in their environment.
     Valuable information from user-generated content is not usable for decision makers because of:
     • The heterogeneous and unstructured nature of the data
     • Lack of time to analyze the flood of data
  3. Vision
     • Vision: enhancing the situational picture by making user-generated content usable for decision makers
     • Sentiment analysis can help to distinguish important information from unimportant information
     • Current approaches focus on a three-class problem (Negative, Positive, Neutral)
     • A more fine-grained differentiation could help to detect relevant tweets
     • 7 classes [Ekman]: Anger, Disgust, Fear, Sadness, Surprise, Happiness, Neutral
     Enhanced situational picture: decision making based on information from:
     • Onsite rescue squads
     • Traditional data sources
     • Additional information about the current situation reported by bystanders
  4. Approach: Reference Pipeline
     Input: Tweets
     Preprocessing
     • Extraction of irrelevant words, links, and user mentions
     • Handling negations
     • Abbreviation resolution
     • Category extraction
     Feature Extraction
     • Unigram extraction
     • Extraction of part-of-speech features
     • Character trigram/fourgram
     • Syntactic features
     • Sentiment features
     Classification
     • Naïve Bayes Binary Model
     • Naïve Bayes Multinomial Model
     • Support Vector Machine
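The preprocessing and feature extraction stages of the pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup tables `ABBREVIATIONS`, `STOPWORDS`, and `NEGATIONS`, the `NOT_` prefix for negated tokens, and all function names are assumptions made for illustration, since the slides do not name the resources or the negation scheme that were actually used. Part-of-speech, syntactic, and sentiment features are left out because they need external resources.

```python
import re
from collections import Counter

# Hypothetical, tiny lookup tables; the slides do not list the real resources.
ABBREVIATIONS = {"u": "you", "pls": "please", "ppl": "people"}
STOPWORDS = {"a", "an", "the", "is", "to", "of", "at"}
NEGATIONS = {"no", "not", "never", "cannot"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9#']+")


def preprocess(tweet):
    """Lowercase, drop links and user mentions, expand abbreviations,
    drop stopwords, and prefix the next kept token after a negation."""
    text = URL_RE.sub(" ", tweet.lower())
    text = MENTION_RE.sub(" ", text)
    tokens = []
    negate_next = False
    for tok in TOKEN_RE.findall(text):
        tok = ABBREVIATIONS.get(tok, tok)
        if tok in NEGATIONS:
            negate_next = True
            continue
        if tok in STOPWORDS:
            continue
        tokens.append("NOT_" + tok if negate_next else tok)
        negate_next = False
    return tokens


def unigram_features(tokens):
    """Bag-of-words counts over the preprocessed tokens."""
    return Counter(tokens)


def char_ngram_features(tokens, n=3):
    """Counts of character n-grams (n=3 for trigrams, n=4 for fourgrams)."""
    text = " ".join(tokens)
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


if __name__ == "__main__":
    toks = preprocess("@bob No power at the shelter http://t.co/x #Sandy")
    print(toks)
    print(unigram_features(toks))
    print(char_ngram_features(toks, n=3).most_common(3))
```

The resulting count dictionaries could then be fed to any of the three classifiers listed on the slide.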
  5. Evaluation: Approach
     Testing all combinations of
     • Machine learning methods
     • Features
     Metrics for performance evaluation
     • Accuracy, Precision, Recall
     Results calculated using stratified 10-fold cross validation
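Stratified 10-fold cross validation and the three metrics can be sketched in plain Python as follows. This is only an illustration under assumptions: the function names are hypothetical, and the slides do not state which toolkit produced the reported numbers.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign every example index to one of k folds so that each class
    is spread as evenly as possible across the folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0  # carried over between classes to keep fold sizes balanced
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[(offset + pos) % k].append(idx)
        offset += len(indices)
    return folds


def accuracy(gold, predicted):
    """Share of examples whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def precision_recall(gold, predicted, target):
    """Precision and recall for a single target class."""
    tp = sum(g == target and p == target for g, p in zip(gold, predicted))
    predicted_pos = sum(p == target for p in predicted)
    actual_pos = sum(g == target for g in gold)
    precision = tp / predicted_pos if predicted_pos else 0.0
    recall = tp / actual_pos if actual_pos else 0.0
    return precision, recall


if __name__ == "__main__":
    labels = ["fear"] * 20 + ["neutral"] * 30
    for fold in stratified_folds(labels, k=10):
        # Each fold serves once as the test set; the rest is training data.
        print(sorted(labels[i] for i in fold))
```

In each round one fold is held out for testing and the remaining nine are used for training; the per-fold metrics are then averaged.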
  6. Evaluation: Datasets
     7 classes: 6 basic emotions + neutral
     • SET1: 114 English tweets (Seattle)
       • Labeled by at least eight persons and more than 50% agreement
     • SET2: 1951 English tweets (Seattle)
       • Surprise: positive surprise, negative surprise
       • Each tweet labeled by one person using MTurk
     3 classes: Negative, Positive, Neutral
     • SET2_GP: grouping SET2
       • “Disgust”, “Fear”, “Sadness”, “Surprise with negative meaning” into the negative class
       • “Happiness”, “Surprise with positive meaning” into the positive class
       • “None” into the neutral class
       • 872 positive tweets, 598 negative tweets, and 481 neutral tweets
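The grouping of SET2 into SET2_GP amounts to a label lookup, sketched below. The mapping copies the label names from the slide; the names `GROUPING`, `group_label`, and `class_distribution` are hypothetical. The slide does not say where “Anger” goes, so the sketch leaves it unmapped and raises a `KeyError` for it rather than guessing.

```python
from collections import Counter

# Fine-grained label -> coarse class, as listed on the slide.
# "Anger" is not mentioned there and is deliberately left out.
GROUPING = {
    "Disgust": "negative",
    "Fear": "negative",
    "Sadness": "negative",
    "Surprise with negative meaning": "negative",
    "Happiness": "positive",
    "Surprise with positive meaning": "positive",
    "None": "neutral",
}

# Class sizes of SET2_GP as reported on the slide.
REPORTED_COUNTS = {"positive": 872, "negative": 598, "neutral": 481}


def group_label(fine_label):
    """Map one fine-grained label to its coarse class."""
    return GROUPING[fine_label]


def class_distribution(fine_labels):
    """Count how many tweets fall into each coarse class."""
    return Counter(group_label(label) for label in fine_labels)


if __name__ == "__main__":
    print(class_distribution(["Fear", "Happiness", "None", "Sadness"]))
    # The reported class sizes add up to the size of SET2.
    print(sum(REPORTED_COUNTS.values()))
```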
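  7. Evaluation: Results
     NBB = Naïve Bayes Binary, NBM = Naïve Bayes Multinomial, SVM = Support Vector Machine

     | Dataset | Class. method | Accuracy | Avg. precision | Avg. recall | F-measure |
     |---|---|---|---|---|---|
     | SET1 (7 classes) | NBB | 0.658 | 0.615 | 0.658 | 0.61 |
     | SET1 (7 classes) | NBM | 0.605 | 0.519 | 0.605 | 0.525 |
     | SET1 (7 classes) | SVM | 0.657 | 0.597 | 0.658 | 0.598 |
     | SET2 (7 classes) | NBM | 0.564 | 0.482 | 0.564 | 0.492 |
     | SET2 (7 classes) | NBB | 0.503 | 0.45 | 0.504 | 0.394 |
     | SET2 (7 classes) | SVM | 0.535 | 0.489 | 0.535 | 0.505 |
     | SET2_GP (3 classes) | NBM | 0.641 | 0.645 | 0.641 | 0.64 |
     | SET2_GP (3 classes) | NBB | 0.566 | 0.565 | 0.566 | 0.564 |
     | SET2_GP (3 classes) | SVM | 0.626 | 0.625 | 0.625 | 0.624 |

     Features used, given as the number of the nine configurations marked with an x (the transcript does not preserve which configurations): Unigram 8, Syntactic Features 3, Sentiment Features 4, POS Tagging 5, Character tri-gram 1.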
  8. Evaluation: Crisis Related Results
     • Tweets collected during “hurricane” Sandy in October 2012
     • 60 situational awareness (SA) tweets, 140 random, non-contributing tweets
     • Results: using “Fear” tweets outperforms “Negative” and the random baseline (30% accuracy)
     • Found with 7 classes, but not with 3 classes
     Example: "Day 3. No power. Limited Food. Limited shelter. Must survive. #Sandy" [7 classes: Fear, 3 classes: Neutral]

     | Setting | Class | Detected | Contributing to SA | Accuracy | Recall |
     |---|---|---|---|---|---|
     | 7 classes | Fear | 96 | 38 | 0.395 | 0.633 |
     | 7 classes | Disgust | 41 | 10 | 0.243 | 0.166 |
     | 7 classes | Fear & Disgust | 137 | 48 | 0.35 | 0.80 |
     | 3 classes | Negative | 41 | 12 | 0.292 | 0.20 |
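The numbers in the crisis table can be recomputed from the counts on the slide. The sketch below rests on two readings that the slide does not state explicitly: the “Accuracy” column appears to be the share of detected tweets that contribute to SA (contributing / detected, often called precision), and the 30% random baseline appears to be the share of SA tweets in the sample (60 of 200). The function names are hypothetical.

```python
SA_TWEETS = 60        # situational awareness tweets in the sample
RANDOM_TWEETS = 140   # random, non-contributing tweets

# class -> (detected, contributing to SA), copied from the slide
RESULTS = {
    "Fear": (96, 38),
    "Disgust": (41, 10),
    "Fear & Disgust": (137, 48),
    "Negative": (41, 12),
}


def hit_rate(detected, contributing):
    """Share of detected tweets that contribute to SA
    (assumed reading of the slide's 'Accuracy' column)."""
    return contributing / detected


def recall(contributing, relevant=SA_TWEETS):
    """Share of all SA tweets that were detected."""
    return contributing / relevant


def random_baseline():
    """Chance that a randomly picked tweet contributes to SA."""
    return SA_TWEETS / (SA_TWEETS + RANDOM_TWEETS)


if __name__ == "__main__":
    print(f"random baseline: {random_baseline():.2f}")
    for name, (detected, contributing) in RESULTS.items():
        print(f"{name}: accuracy={hit_rate(detected, contributing):.3f} "
              f"recall={recall(contributing):.3f}")
```

Under these readings the recomputed values agree with the table up to rounding in the third decimal place, and only “Fear” and “Fear & Disgust” exceed the random baseline.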
  9. Example Application and Outlook
     • For detecting tweets contributing to situational awareness
     • Combination with geolocalization approaches, filtering, etc.
  10. Conclusion & Outlook
      Contribution
      • Novel sentiment analysis approach for detecting seven sentiment classes
      • Preliminary evaluation shows promising results towards detecting crisis related tweets
      Future Work
      • Larger training set needed
      • Combination of different means necessary for a more valuable pipeline
  11. Thank you! Questions?
      Questions can also be addressed to: aschulz@tk.informatik.tu-darmstadt.de
  12. Bibliography
      • Agarwal, A., Xie, B., Vovsha, I., Rambow, O., & Passonneau, R. (2011). Sentiment analysis of Twitter data. Proceedings of the Workshop on Languages in Social Media. Portland, Oregon.
      • Barbosa, L., & Feng, J. (2010). Robust sentiment detection on Twitter from biased and noisy data. Proceedings of the 23rd International Conference on Computational Linguistics. Beijing, China.
      • Ekman, P. (1992). An argument for basic emotions. Cognition & Emotion, 6(3-4), 169-200.
      • Esuli, A., & Sebastiani, F. (2006). SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining. Proceedings of the 5th Conference on Language Resources and Evaluation. Genova, IT.
      • Jiang, L., Yu, M., Zhou, M., Liu, X., & Zhao, T. (2011). Target-dependent Twitter Sentiment Classification. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. Portland, Oregon.
      • Nagy, A., & Stamberger, J. (2012). Proceedings of the 9th International ISCRAM Conference. Vancouver, CA.
      • Nielsen, A. (2011). A new ANEW: Evaluation of a word list for sentiment analysis in microposts. Journal of the International Linguistic Association, 93-98.
      • Pang, B., & Lee, L. (2006). Opinion Mining and Sentiment Analysis. Foundations and Trends in Information Retrieval, 91-231.
      • Vieweg, S., Hughes, A. L., Starbird, K., & Palen, L. (2010). Microblogging during two natural hazards events. Proceedings of the 28th International Conference on Human Factors in Computing Systems. Atlanta, GA.
      • Witten, I. H., & Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition. San Francisco: Morgan Kaufmann Publishers.