Language-Independent Twitter Sentiment Analysis
1. Language-Independent Twitter Sentiment Analysis
Sascha Narr, Michael Hülfenhaus, Sahin Albayrak
Sascha Narr
Competence Center Information Retrieval & Machine Learning
KDML 2012, LWA, Dortmund, Germany
2. Overview
►1. Sentiment analysis on social media
►2. Creation of a multilingual evaluation dataset of
tweets
►3. A language-independent sentiment labeling
heuristic for semi-supervised learning
►4. Experiments on the multilingual dataset
18. September 2012 Language-Independent Twitter Sentiment Analysis 2
4. 1. Sentiment Analysis on Social Media
► Why Sentiment Analysis?
People’s opinions and sentiments about products and events
in large numbers are invaluable:
Market research, product feedback and more
Sentiment analysis makes it possible to collect such data automatically
► Why Twitter?
400 million tweets posted each day [1]
Shorter text lengths encourage people to
“just write” what they think
Tweets are often informal and contain lots of opinions
[1] http://news.cnet.com/8301-1023_3-57448388-93/twitter-hits-400-million-tweets-per-day-mostly-mobile/
5. 1. Methods for Sentiment Classification
► Sentiment classification goals:
Subjectivity: “Does the tweet contain an opinion?”
Polarity: “Is the expressed opinion positive or negative?”
► Classifiers used:
Naive Bayes, Maximum Entropy, Support Vector Machines
► Features used:
n-grams, WordNet semantics, part-of-speech information
► Tweet texts have unique properties:
Informal, contain slang, emoticons, misspellings
6. 1. Multilingual Sentiment Analysis
►Fewer than 40% of tweets are in English [1]
►Natural language processing methods are often
designed specifically for one language
► Increase coverage of sentiment analysis by using a
language-independent approach:
No extra effort for additional languages
Is the approach really effective for all languages?
[1] http://semiocast.com/publications/2011_11_24_Arabic_highest_growth_on_Twitter
8. 2. Creation of a Multilingual Evaluation Dataset
► We created a hand-annotated sentiment evaluation
dataset of over 12000 tweets
4 languages: English, German, French, Portuguese
►Used the Amazon Mechanical Turk platform for
annotation
►Each tweet was annotated by 3 different workers:
Labels: “positive”, “neutral”, “negative”
Added validation tweets to try to ensure the quality of the
annotations
9. 2. Our Multilingual Evaluation Dataset
► Observed a low inter-annotator agreement in our dataset
Sentiment classification is a hard task, even for humans
Tweets that humans disagree on are also harder for a classifier to
label correctly
► The dataset is publicly available for research purposes
Table 1: Tweet counts for the complete annotated dataset
11. 3. A Language-Independent Heuristic
► To train a sentiment classifier, a large amount of labeled
training data is needed
Can be obtained without human effort using a previously
proposed heuristic
► The heuristic uses emoticons in tweets as noisy labels
► Heuristic: If a tweet contains only positive emoticons, label its
whole text as positive (and vice versa for negative).
► Examples of emoticons we used:
Positive: :) :-) =) ;) :] :D ^-^ ^_^
Negative: :( :-( :(( -.- >:-( D: :/
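A minimal sketch of the labeling heuristic above, in Python. The emoticon sets are the examples listed on this slide (written with ASCII carets); the whitespace tokenization and the function name `label_tweet` are assumptions for illustration, not the authors' implementation.

```python
# Emoticon examples from the slide; the full lists used in the work may differ.
POSITIVE = {":)", ":-)", "=)", ";)", ":]", ":D", "^-^", "^_^"}
NEGATIVE = {":(", ":-(", ":((", "-.-", ">:-(", "D:", ":/"}


def label_tweet(text):
    """Return 'positive', 'negative', or None if the tweet gets no noisy label."""
    # Assumption: emoticons are separate whitespace-delimited tokens.
    # Matching whole tokens also avoids hitting ":/" inside "http://...".
    tokens = text.split()
    has_pos = any(tok in POSITIVE for tok in tokens)
    has_neg = any(tok in NEGATIVE for tok in tokens)
    if has_pos and not has_neg:
        return "positive"
    if has_neg and not has_pos:
        return "negative"
    return None  # no emoticon at all, or both polarities present


# Hypothetical example tweets
examples = [
    "great day :)",
    "missed the bus :(",
    "see http://example.com",
    "not sure :) :(",
]
labels = [label_tweet(t) for t in examples]
```

Tweets that return `None` are simply left out of the training data, which is why the heuristic needs a very large pool of raw tweets.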
12. 3. Heuristic for Semi-Supervised Learning
► Heuristic can be applied to almost any language, since
emoticons are used extensively on Twitter
► The number of tweets with emoticons differs among languages
Possible causes include language-specific ways of
expressing sentiment and different proportions of “formal”
tweets
Table 2: Number of tweets containing emoticons for each language
14. 4. Experiments – Sentiment Classification
► Data:
Training: From ~ 800M random tweets of mixed languages:
Filter for languages: English, German, French, Portuguese
Use emoticon heuristic to select and label training data
Evaluation: 12597 hand-annotated tweets (4 languages)
► Setup:
Classification: Sentiment polarity only
Classifier: Naive Bayes
Features: 1-grams only, and 1-grams combined with 2-grams
Trained 4 classifiers for en, de, fr, pt
1 classifier for combined en+de+fr+pt
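The setup above can be sketched as a small multinomial Naive Bayes classifier over n-gram features. This is an illustration only: the whitespace tokenization, lowercasing, add-one smoothing, and the names `ngrams` and `NaiveBayes` are assumptions, and the training tweets are made up.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, max_n=1):
    """Return all 1-grams up to max_n-grams of the lowercased, whitespace-split text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats


class NaiveBayes:
    def __init__(self, max_n=1):
        self.max_n = max_n                        # 1: 1-grams; 2: 1-grams + 2-grams
        self.class_counts = Counter()             # number of tweets per label
        self.feat_counts = defaultdict(Counter)   # label -> feature -> count
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for feat in ngrams(text, self.max_n):
                self.feat_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            # Add-one smoothing over the training vocabulary (an assumption)
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            for feat in ngrams(text, self.max_n):
                if feat in self.vocab:  # skip features never seen in training
                    score += math.log((self.feat_counts[label][feat] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


# Hypothetical noisy-labeled training tweets (emoticons already used as labels)
train = [
    ("i love this", "positive"),
    ("so happy today", "positive"),
    ("i hate this", "negative"),
    ("so sad today", "negative"),
]
texts, labels = zip(*train)
clf = NaiveBayes(max_n=2).fit(texts, labels)
prediction = clf.predict("i love it")
```

One such classifier would be trained per language, plus one on the pooled en+de+fr+pt data; nothing in the sketch is language-specific apart from splitting on whitespace.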
15. 4. Experiments: Evaluation Dataset
► 2 variations of our evaluation set for the experiments:
agree-3: Tweets all 3 annotators agreed on for a sentiment
agree-2: Tweets at least 2 annotators agreed on
► Baseline: always guess “positive” (the dataset contains more positive than negative tweets)
Table 3: Tweet counts for the evaluation datasets
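A sketch of how the two evaluation sets and the baseline could be derived from the three annotations per tweet. The data is hypothetical, and dropping tweets whose majority label is “neutral” is an assumption based on the polarity-only setup.

```python
from collections import Counter


def majority_label(annotations, min_agree):
    """Return the label chosen by at least min_agree annotators, else None."""
    label, votes = Counter(annotations).most_common(1)[0]
    return label if votes >= min_agree else None


def build_eval_set(annotated, min_agree):
    """Keep (text, label) pairs with enough agreement on a polar label."""
    result = []
    for text, annotations in annotated:
        label = majority_label(annotations, min_agree)
        # Assumption: only positive/negative tweets are used for evaluation
        if label in ("positive", "negative"):
            result.append((text, label))
    return result


def baseline_accuracy(eval_set):
    """Accuracy of a classifier that always guesses 'positive'."""
    return sum(label == "positive" for _, label in eval_set) / len(eval_set)


# Hypothetical annotations: three worker labels per tweet
annotated = [
    ("t1", ["positive", "positive", "positive"]),
    ("t2", ["positive", "positive", "neutral"]),
    ("t3", ["negative", "negative", "negative"]),
    ("t4", ["positive", "neutral", "negative"]),
    ("t5", ["neutral", "neutral", "neutral"]),
]
agree_3 = build_eval_set(annotated, min_agree=3)  # all 3 annotators agree
agree_2 = build_eval_set(annotated, min_agree=2)  # at least 2 annotators agree
```

By construction agree-3 is a subset of agree-2, so agree-2 additionally contains the tweets on which one annotator dissented.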
16. 4. Results – English Classifier
► Best results: English classifier using 1-grams, on the agree-3 set
81.3% accuracy (trained on 500k tweets)
► Performance on the agree-2 set is consistently lower than on agree-3
[Figure: results plot for the English (en) classifier]
17. 4. Results – All Languages
[Figure: results plots, one panel per language: en, de, fr, pt]
18. 4. Evaluation – All Languages Compared
► Strong differences between languages
► Differences do not correlate with the number of emoticons in each
language
► The emoticon heuristic fits some languages better than others;
this may depend on how sentiment is typically expressed in each language
► Example: “muito engraçado kkkkkkkk” (Portuguese: “very funny”, followed by written laughter)
[Figure: results plots, one panel per language: en, de, fr, pt]
Table 3: Tweet counts containing emoticons for each language
19. 4. Evaluation – Multi-language Classifier
► Tested on the combined 4-language evaluation set
► Highest performance: 71.5% accuracy
Slightly lower than using 4 individual classifiers (73.9% accuracy)
► The usefulness of a single combined classifier can outweigh the
performance degradation
[Figure: results plot for the combined en+de+fr+pt classifier]
20. Conclusions
► We presented and evaluated a language-independent
sentiment classification approach on 4 languages
A language-independent classifier can be trained given only
raw tweets, using a noisy label heuristic
Good performance across languages, though accuracy varies for each
Classifiers need a very large number of tweets for training
Mixed-language classifiers are viable
► Future work:
Currently we only classify sentiment polarity
Classifying subjectivity in tweets is important, but finding a
good heuristic to label “neutral” tweets is a challenge
21. Language-Independent Twitter Sentiment Analysis
Thanks for your attention!
Questions?
22. Contact
Sascha Narr
Dipl.-Inform.
Competence Center Information Retrieval & Machine Learning
sascha.narr@dai-labor.de
Fon +49 (0) 30 / 314 – 74 138
Fax +49 (0) 30 / 314 – 74 003
DAI-Labor
Technische Universität Berlin
Fakultät IV – Elektrotechnik & Informatik
Sekretariat TEL 14
Ernst Reuter Platz 7
10587 Berlin
www.dai-labor.de