This paper describes the University of Sheffield's submission to the SemEval 2016 Twitter Stance Detection weakly supervised task (SemEval 2016 Task 6, Subtask B). In stance detection, the goal is to classify the stance of a tweet towards a target as "favor", "against", or "none". In Subtask B, the targets in the test data are different from the targets in the training data, thus rendering the task more challenging but also more realistic.
To address the lack of target-specific training data, we use a large set of unlabelled tweets containing all targets and train a bag-of-words autoencoder to learn how to produce feature representations of tweets. These feature representations are then used to train a logistic regression classifier on labelled tweets, with additional features such as an indicator of whether the target is contained in the tweet. Our submitted run on the test data achieved an F1 of 0.3270.
Paper: http://isabelleaugenstein.github.io/papers/SemEval2016-Stance.pdf
USFD at SemEval-2016 Task 6: Any-Target Stance Detection on Twitter with Autoencoders
Isabelle Augenstein, Andreas Vlachos, Kalina Bontcheva
i.augenstein@ucl.ac.uk, {a.vlachos | k.bontcheva}@sheffield.ac.uk
Stance Detection Subtask B
Classify attitude of tweet towards target as “favor”, “against”, “none”
Tweet: “No more Hillary Clinton” Target: Donald Trump Stance: FAVOR
Subtask A training targets: Climate Change is a Real Concern, Feminist
Movement, Atheism, Legalization of Abortion, Hillary Clinton
Subtask B testing target: Donald Trump
Challenges
• Labelled data not available for the test target
• Manual labelling of training data not allowed
• Target does not always appear in tweet
Feature Extraction
• Aut-twe: auto-encoded tweet, 100d feature vector
• targetInTweet: is (shortened) target contained in tweet
• Good indicator for non-neutral stance
• Other features tested (not used for final run): WordNet-
Affect gazetteers, emoticon detection
• Baselines: bag of words, word2vec (trained on same data
as autoencoder)
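The targetInTweet indicator can be sketched as follows. This is a hypothetical reconstruction for illustration, not the released code; the function name and the word-overlap shortening are assumptions.

```python
# Hypothetical sketch of the targetInTweet feature: 1.0 if any word of the
# (shortened) target appears in the tweet, else 0.0. Names are illustrative.

def target_in_tweet(tweet_tokens, target):
    """Binary indicator: does the tweet mention the target?"""
    # Shorten the target to its lowercased words,
    # e.g. "Donald Trump" -> {"donald", "trump"}
    target_words = {w.lower() for w in target.split()}
    tweet_words = {t.lower() for t in tweet_tokens}
    return 1.0 if target_words & tweet_words else 0.0

print(target_in_tweet(["No", "more", "Hillary", "Clinton"], "Hillary Clinton"))  # 1.0
print(target_in_tweet(["No", "more", "Hillary", "Clinton"], "Donald Trump"))     # 0.0
```

This single extra dimension is concatenated with the 100d Aut-twe vector before classification.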
Results
[Figure: Model comparison bar charts, Macro F1 (0–0.45) on Hillary Clinton (dev) and Donald Trump (test), comparing BoW, BoW+inTwe, Word2Vec, Aut-twe, and Aut-twe+inTwe]
Conclusions
• It is important to detect whether the target is mentioned in the tweet
• Hillary Clinton: 0.4538 F1 (inTwe) vs 0.3243 F1 (not inTwe)
• Donald Trump: 0.3745 F1 (inTwe) vs 0.2377 F1 (not inTwe)
• Autoencoder can help to detect stance towards unseen targets
• Developing a method for new targets without labelled training
data is challenging: what works on the dev set does not always
carry over to the test set
• Future work: better incorporate the target for stance detection
Acknowledgements
This work was partially supported by the European Union, grant agreement
No. 611233 PHEME (http://www.pheme.eu)
Data
• 5 628 labelled train tweets about Subtask A
targets
• 1 278 about Hillary Clinton, used for dev
• 278 013 unlabelled Donald Trump tweets
• 395 212 collected unlabelled tweets about all
targets
• Keywords: hillary, clinton, trump, climate,
femini, aborti
• 707 Donald Trump test tweets
Preprocessing
• Phrase detection: Train phrase detection model on unlabelled
+labelled tweets, e.g. “donald”, “trump” → “donald trump”
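The phrase-scoring rule from the cited Mikolov et al. (2013) paper can be sketched in a few lines; the counts, `delta` discount, and `threshold` below are illustrative, not the values used in the system.

```python
# Sketch of Mikolov et al. (2013) phrase detection:
# score(a, b) = (count(ab) - delta) / (count(a) * count(b)),
# and bigrams scoring above a threshold become phrases like "donald_trump".
from collections import Counter

def find_phrases(sentences, delta=1, threshold=0.1):
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        unigrams.update(sent)
        bigrams.update(zip(sent, sent[1:]))
    phrases = set()
    for (a, b), n_ab in bigrams.items():
        score = (n_ab - delta) / (unigrams[a] * unigrams[b])
        if score > threshold:
            phrases.add((a, b))
    return phrases

sents = [["donald", "trump", "2016"]] * 5 + [["vote", "trump"]]
# "donald" and "trump" co-occur consistently, so they score as a phrase
print(("donald", "trump") in find_phrases(sents))  # True
```

In practice the model is trained on the unlabelled+labelled tweets, and detected phrases are joined with an underscore before vectorisation.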
Autoencoder
• Bag-of-words autoencoder, using the 50 000 most
frequent words
• trained on unlabelled+labelled tweets
• Input vector: dimensionality 50 000. For each word
in vocabulary, does tweet contain the word or not
• One hidden layer (size 100), output size 100
• Trained encoder is applied to labelled train and
test data to obtain 100d features, decoder not used
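A minimal pure-Python sketch of such a bag-of-words autoencoder is below. Dimensions are shrunk (vocabulary 6, hidden size 3) from the poster's 50 000 and 100 for illustration, the real system used a neural-network library, and the sigmoid units, squared-error loss, and learning rate here are assumptions.

```python
# Minimal bag-of-words autoencoder sketch (illustrative dimensions and
# hyperparameters; the submitted system used 50 000d inputs and 100d codes).
import math
import random

random.seed(0)

V, H = 6, 3    # vocabulary size, hidden size
LR = 0.5       # learning rate (illustrative)

# encoder weights W1 (V x H), decoder weights W2 (H x V)
W1 = [[random.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(V)]
W2 = [[random.uniform(-0.1, 0.1) for _ in range(V)] for _ in range(H)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def encode(x):
    return [sigmoid(sum(x[i] * W1[i][j] for i in range(V))) for j in range(H)]

def decode(h):
    return [sigmoid(sum(h[j] * W2[j][i] for j in range(H))) for i in range(V)]

def train_step(x):
    h = encode(x)
    y = decode(h)
    # backprop of squared reconstruction error through sigmoid units
    d_out = [(y[i] - x[i]) * y[i] * (1 - y[i]) for i in range(V)]
    d_hid = [sum(d_out[i] * W2[j][i] for i in range(V)) * h[j] * (1 - h[j])
             for j in range(H)]
    for j in range(H):
        for i in range(V):
            W2[j][i] -= LR * d_out[i] * h[j]
    for i in range(V):
        for j in range(H):
            W1[i][j] -= LR * d_hid[j] * x[i]

# binary bag-of-words vectors: does the tweet contain each vocabulary word?
tweets = [[1, 1, 0, 0, 1, 0], [0, 0, 1, 1, 0, 1]]
for _ in range(200):
    for x in tweets:
        train_step(x)

# only the trained encoder is reused downstream; the decoder is discarded
features = encode(tweets[0])
print(len(features))  # 3
```

As on the poster, only `encode` is applied to the labelled train and test tweets to produce the classifier features.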
Model                      Macro F1
Majority class (official)  0.2972
SVM n-grams (official)     0.2843
BoW                        0.3453
Aut-twe (submitted)        0.3307
References
• Code: https://github.com/sheffieldnlp/stance-semeval2016
• Phrases: Mikolov et al. (2013). Distributed Representations
of Words and Phrases and Their Compositionality. NIPS.
Tweets
“No more Hillary Clinton”, “Donald Trump”, “FAVOR”
Preprocessing: [“No”, “more”, “Hillary_Clinton”]
Autoencoder Training
[america: 0, …, Hillary_Clinton: 1] 50 000d input
[0, 0, …, 1] 100d hidden layer
[0, 1, …, 1] 100d output layer
Feature Extraction
Autoencoder features: [0, 1, …, 1]   inTwe: 0
Logistic Regression Model
Predictions
“#voteTrump (…)”, “Donald Trump”, “FAVOR”
“youre fired (…)” “Donald Trump”, “AGAINST”
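The final step of the pipeline above, training a logistic regression classifier on the concatenated autoencoder and inTwe features, can be sketched as follows. For brevity this is a binary favor-vs-against classifier in pure Python with made-up 3d features; the actual system used 100d features and three stance classes, and all numbers below are illustrative.

```python
# Sketch of the classification step: logistic regression over concatenated
# [autoencoder features, inTwe indicator]. Binary (FAVOR vs AGAINST) and
# 3d features here for brevity; all feature values are made up.
import math
import random

random.seed(0)

def predict_prob(w, b, x):
    """P(FAVOR | x) under a logistic model."""
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=200):
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            g = predict_prob(w, b, x) - t   # gradient of the log loss
            b -= lr * g
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
    return w, b

# each row: 3 autoencoder features (100d in the paper) + inTwe indicator
X = [[0.9, 0.1, 0.2, 1.0],   # FAVOR, target mentioned
     [0.8, 0.2, 0.1, 1.0],   # FAVOR, target mentioned
     [0.1, 0.9, 0.8, 0.0],   # AGAINST, target absent
     [0.2, 0.8, 0.9, 0.0]]   # AGAINST, target absent
y = [1, 1, 0, 0]             # 1 = FAVOR, 0 = AGAINST

w, b = train_logreg(X, y)
print(predict_prob(w, b, [0.85, 0.15, 0.2, 1.0]) > 0.5)  # True
```

The trained model is then applied to the Donald Trump test tweets to produce the stance predictions shown above.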