An Ensemble Model for Cross-Domain Polarity Classification on Twitter

An Ensemble Model for Cross-Domain Polarity
Classification on Twitter
Adam Tsakalidis, Symeon Papadopoulos, Yiannis Kompatsiaris
15th Web Information System Engineering (WISE 2014)
Thessaloniki, 14 October 2014
Information Technologies Institute (ITI)
Center for Research & Technology Hellas (CERTH)
#wiseconf2014

Overview
• The Problem
• Existing Approaches
• Proposed Ensemble Model
• Experimental Study
• Summary – Future Work
#wiseconf2014 #2

Polarity Classification
motivation & definition
#wiseconf2014 #3

Polarity Classification - Problem
Finding about whether a piece of text (e.g. tweet)
conveys a positive or negative sentiment about a
topic or entity of interest (person, party, brand, etc.)
#wiseconf2014 #4

Polarity Classification - Applications
• Monitoring sentiment about a brand could help take
appropriate measures in case of negative sentiment
increase (e.g. apology, assistance, compensation).
• Aggregating over many tweets could help form an
overall impression of the public opinion about the
topic/entity of interest.
#wiseconf2014
#5

The Cross-Domain Challenge
• Depending on the domain (and sometimes even the
topic) of interest, the vocabulary and phrasing of
sentiment may vary wildly.
• Most sentiment detection approaches require
manually created ground truth from the same
domain to achieve high performance  expensive!
• We propose a framework that leverages an
ensemble of sentiment detectors to tackle the cross-domain
#wiseconf2014
challenge.
#6

Related Work
#wiseconf2014
#7

Background: Sentiment Analysis
Two main problems: subjectivity & polarity
TTeexxtt RReepprreesseennttaattioionn LLeeaarrnniningg SSeenntitmimeenntt
NNBB SSVVMM
#wiseconf2014 #8
Hybrid-
Hybrid-
Ensemble
Unsupervised SSSSLL Ensemble UUnnigigrarammss BBigigrarammss PPOOSS OOththeerr SSuuppeervrvisiseedd Unsupervised

Hybrid Models
Barbosa and Feng, COLING 2010
• Combines sentiment labels from three independent sources
• Considers the quality of individual sources
• Considers the different bias of individual sources
Gonçalves et al., COSN 2013
• First evaluates 8 different methods with respect to coverage
and agreement
• Proposes a combined method, which combines individual
method outputs based on their performance (first in terms of
coverage and then in terms of accuracy) on an independent
dataset
#wiseconf2014
#9

Ensemble Model for Cross-Domain
Polarity Classification
approach description
#wiseconf2014
#10

Method Overview
#wiseconf2014
#11
TTBBRR
FFBBRR
CCRR
LLBBRR
CCLL__TTBBRR
CCLL__FFBBRR
TTwweeeett
CCLL__CCRR
CCLL__LLBBRR
HHCC
LLCC
CCOOMMBB
representation learning ensembling

Representation
• TBR
– binary n-grams
– n-grams with tf
– n-grams with tf-idf
– n = 1, 2, 3  9 representations
• FBR
• CR
– TBR + POS tagging
• LBR
– SentiWordNet (5D): sum of scores for Noun, Verbs, Adjectives,
Adverbs, sum of keywords (#pos - #neg)
– Opinion Lexicon: Sum of keywords (#pos - #neg)
#wiseconf2014
#12
TBR, FBR and CR with tf < 2
were removed!

Ensemble Learning
• Combine two types of classifiers: HC and LC
• Hybrid classifier (HC) [domain-dependent]
i: tweet, c: class (pos/neg), r: TBR, FBR, LR, wr: weights defined based on
accuracy on ED dataset, pr: individual classifier output
• Lexicon-based classifier (LC) [domain-independent]
• If the outputs of the two classifiers agree, then
assign the agreed class to the tweet.
• If the outputs disagree, use a domain-adapted
classifier (TBR) trained on the agreed tweets.
#wiseconf2014
#13

Practical aspects
• In order for the ensemble model to work well, the
two types of classifiers need to be individually
accurate at sufficient rates, since we trust their
outputs to be used as a new training set!
• In a real-time scenario, we need to allocate some
time for the domain adaptation process to take
place. This should be defined on a per case basis.
adaptation ensemble model operation
#wiseconf2014
#14

Experimental Study
#wiseconf2014
#15

Datasets
• Emoticons Dataset (ED)
– 250K tweets collected over 2-days (mid-march 2014) containing
happy/sad emoticons (removed retweets) [English only!]
– used for feature extraction, preliminary development
• Development set (DEV)
– 10K tweets (equally balanced) collected in the same way as ED
• Stanford Twitter dataset (STS)
– STS-1: 177 pos, 184 neg
– STS-2: 108 pos, 75 neg
• Obama Healthcare Reform (HCR)
– dev-set: 135 pos, 328 neg
– test-set: 116 pos, 383 neg
• Obama-McCain Debate (OMD)
– subset (agreement>50%): 707 pos, 1190 neg
#wiseconf2014
#16

Model building
• Performance of TBR variants, role of POS (on 33% of ED)
• Performance of LR features
• Comparative performance of all classifiers (on DEV)
#wiseconf2014
#17

Results
more realistic estimate of real-world performance / still, far better than other out-of-
domain methods
over-optimistic estimate of real-world performance since 90% of set used for training
#wiseconf2014
#18

Role of classifier agreement
• As expected, accuracy on disagreed tweets is lower
– results produced using a classifier trained on “noisy” data
– more challenging cases
• A similar case when training using automatically
collected data based on emoticons
#wiseconf2014
#19

Conclusions & Future Work
#wiseconf2014
#20

Conclusions & Future Work
Conclusion
•Ensemble model in tandem with domain adaptation is
an effective strategy for tackling the cross-domain
challenge!
Future Work
•Incorporate the “neutral” class and evaluate
•Test with more datasets
•Check with additional classifiers as input
#wiseconf2014
#21

Thank you!
#wiseconf2014
#22
Questions?
https://github.com/socialsensor/sentiment-analysis
http://socialsensor.eu/
adam.tsakalidis@gmail.com / papadop@iti.gr
a.tsakalidis@warwick.ac.uk

Key References (1/3)
• Alina Andreevskaia and Sabine Bergler. When specialists and generalists work
together: Overcoming domain dependence in sentiment tagging. In ACL, pages
290-298, 2008
• Luciano Barbosa and Junlan Feng. Robust sentiment detection on twitter from
biased and noisy data. In Proceedings of the 23rd International Conference on
Computational Linguistics: Posters, pages 36-44. ACL, 2010
• Adam Bermingham and Alan F Smeaton. Classifying sentiment in microblogs: is
brevity an advantage? In Proceedings of the 19th ACM international conference
on Information and knowledge management, pages 1833-1836. ACM, 2010
• Albert Bifet and Eibe Frank. Sentiment knowledge discovery in twitter
streaming data. In Discovery Science, pages 1-15. Springer, 2010
• Johan Bollen, Huina Mao, and Alberto Pepe. Modeling public mood and
emotion: Twitter sentiment and socio-economic phenomena. In ICWSM, 2011
• Samuel Brody and Nicholas Diakopoulos.
Cooooooooooooooollllllllllllll !!!!!!!!!!!!!!: using word lengthening to detect
sentiment in microblogs. In Proceedings of the Conference on EMNLP, pages
562-570. ACL, 2011.
#wiseconf2014
#23

• Andrea Esuli and Fabrizio Sebastiani. Sentiwordnet: A publicly available lexical
resource for opinion mining. In Proceedings of LREC, vol. 6, pages 417-422, 2006
• Alec Go, Richa Bhayani, and Lei Huang. Twitter sentiment classification using
distant supervision. CS224N Project Report, Stanford, pages 1-12, 2009
• Pollyanna Goncalves, Matheus Araujo, Fabricio Benevenuto, and Meeyoung Cha.
Comparing and combining sentiment analysis methods. In Proceedings of the first
ACM COSN, pages 27-38. ACM, 2013
• Minqing Hu and Bing Liu. Mining and summarizing customer reviews. In
Proceedings of the tenth ACM SIGKDD international conference on Knowledge
discovery and data mining, pages 168-177. ACM, 2004
• Long Jiang, Mo Yu, Ming Zhou, Xiaohua Liu, and Tiejun Zhao. Target-dependent
twitter sentiment classification. In Proceedings of the 49th Annual Meeting of the
ACL: HLT-Vol. 1, pages 151-160. ACL, 2011
• Jonathon Read. Using emoticons to reduce dependency in machine learning
techniques for sentiment classification. In Proceedings of the ACL Student Research
Workshop, pages 43-48. ACL, 2005
• Hassan Saif, Yulan He, and Harith Alani. Semantic sentiment analysis of twitter. In
The Semantic Web-ISWC 2012, pages 508-524. Springer, 2012
#wiseconf2014
#24

• Emmanouil Schinas, Symeon Papadopoulos, Sotiris Diplaris, Yiannis Kompatsiaris,
Yosi Mass, Jonathan Herzig, and Lazaros Boudakidis. Eventsense: Capturing the
pulse of large-scale events by mining social media streams. In Proceedings of the
17th Panhellenic Conference on Informatics, pages 17-24. ACM, 2013
• Michael Speriosu, Nikita Sudan, Sid Upadhyay, and Jason Baldridge. Twitter
polarity classification with label propagation over lexical links and the follower
graph. In Proceedings of the First workshop on Unsupervised Learning in NLP,
pages 53-63. ACL, 2011
• Kristina Toutanova, Dan Klein, Christopher D Manning, and Yoram Singer. Feature-rich
part-of-speech tagging with a cyclic dependency network. In Proceedings of
the 2003 Conference of the North American Chapter of the ACL on HLT-Vol. 1,
pages 173-180. ACL, 2003
• Andranik Tumasjan, Timm Oliver Sprenger, Philipp Sandner, and Isabell Welpe.
Predicting elections with twitter: What 140 characters reveal about political
sentiment. In ICWSM 2010, 10:178-185, 2010
• Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. Recognizing contextual
polarity in phrase-level sentiment analysis. In Proceedings of the conference on
HLT and EMNLP, pages 347-354. ACL, 2005
• Ley Zhang, Riddhiman Ghosh, Mohamed Dekhil, Meichun Hsu, and Bing Liu.
Combining Lexicon-based and learning-based methods for twitter sentiment
analysis. HP Laboratories, Technical Report HPL-2011, 89, 2011.
#wiseconf2014
#25

An Ensemble Model for Cross-Domain Polarity Classification on Twitter

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (6)

Ähnlich wie An Ensemble Model for Cross-Domain Polarity Classification on Twitter

Ähnlich wie An Ensemble Model for Cross-Domain Polarity Classification on Twitter (20)

Mehr von Symeon Papadopoulos

Mehr von Symeon Papadopoulos (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

An Ensemble Model for Cross-Domain Polarity Classification on Twitter