1. Learning to Detect Misleading Content on Twitter
Christina Boididou, Symeon Papadopoulos,
Lazaros Apostolidis, Yiannis Kompatsiaris
Information Technologies Institute, CERTH, Thessaloniki, Greece
ACM International Conference on Multimedia Retrieval
June 6-9, Bucharest, Romania
2. REAL OR FAKE: THE VERIFICATION PROBLEM
FAKE PHOTO
Photoshopped!
3. REAL OR FAKE: THE VERIFICATION PROBLEM
REAL PHOTO
Captured in Dublin’s Olympia Theatre
BUT mislabeled on social media as showing the crowd at the Bataclan theatre just before gunmen began firing.
4. TYPES OF FAKE
• Reposting of real multimedia content
• Reposting of synthetic content (artworks)
• Digital tampering
• Speculations
Fake: any post (tweet) that shares multimedia content that does not faithfully represent the event it refers to.
5. FRAMEWORK OVERVIEW
[Diagram: in training, tweets from the verification corpus feed two parallel pipelines, one extracting tweet-based features and one user-based features; each trains a predictive model as an ensemble of classifiers (CL11 … CL1n, CL21 … CL2n) whose outputs are combined by majority vote. In testing, an incoming tweet passes through the same two feature extractors and models, the two predictions are fused into a final label, and the result is visualized.]
6. FEATURE EXTRACTION
TWEET-BASED
Features related to tweets
• Text-based
• Language-specific
• Twitter-specific
• Link-based
USER-BASED
Features related to users
• User-specific
• Link-based
[Annotated example tweet with feature values: num of uppercase characters: 13; num of words: 24; num of slang words: 1; contains first-order pronoun; num of retweets: 3; num of favorites: 13; num of mentions: 2; text readability: 73]
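To make these concrete, here is a minimal Python sketch of how a few of the tweet-based values above could be computed; the slang lexicon and the exact feature definitions are illustrative assumptions, not the paper's implementation:

```python
import re

SLANG = {"lol", "omg", "smh"}                      # hypothetical slang lexicon
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our"}

def tweet_features(text, num_retweets, num_favorites):
    """Compute a few illustrative tweet-based features."""
    words = re.findall(r"\w+", text.lower())
    return {
        "num_words": len(words),
        "num_uppercase_chars": sum(c.isupper() for c in text),
        "num_slang_words": sum(w in SLANG for w in words),
        "has_first_order_pronoun": any(w in FIRST_PERSON for w in words),
        "num_mentions": text.count("@"),           # rough approximation
        "num_retweets": num_retweets,              # Twitter-specific metadata
        "num_favorites": num_favorites,
    }
```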
7. FEATURE EXTRACTION
[Same feature taxonomy as the previous slide, here with an annotated example of a user-based feature: is the account verified?]
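Similarly, a minimal sketch of user-based feature extraction, assuming the author is available as a parsed Twitter API v1.1 user object; which fields to use is an illustrative choice:

```python
def user_features(user):
    """Compute a few illustrative user-based features from a user dict."""
    return {
        "is_verified": bool(user.get("verified")),        # shown on the slide
        "num_followers": user.get("followers_count", 0),
        "num_friends": user.get("friends_count", 0),
        "has_profile_url": bool(user.get("url")),         # link-based
    }
```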
9. VERIFICATION CORPUS
COLLECTION
Set of tweets T collected with a set of keywords K
Tweets contain multimedia content (Image or Video)
GROUND TRUTH
Reputable online resources which debunk images/videos
Publicly available corpus: https://github.com/MKLab-ITI/image-verification-corpus
193 real images & videos, shared in 6,225 real tweets
220 fake images & videos, shared in 9,596 fake tweets
17 events
10. EXPERIMENTAL STUDY
AIM
Evaluate the fake detection accuracy on samples from new events
Accuracy: a = N_c / N, where N_c is the number of correctly classified samples and N the total number of test samples (e.g. 90 correct out of 100 gives a = 0.9).
EXPERIMENTS
A form of event-based (leave-one-event-out) cross-validation
For each event Ei: train on the 16 remaining events, test on Ei
Additional split proposed in the MediaEval task [1]
Random Forest of 100 trees
[1] Christina Boididou, Katerina Andreadou, Symeon Papadopoulos, Duc-Tien Dang-Nguyen, Giulia Boato, Michael Riegler, and Yiannis
Kompatsiaris. 2015. Verifying Multimedia Use at MediaEval 2015. In MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany.
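A sketch of this leave-one-event-out protocol with scikit-learn; X, y and events are assumed to be the feature matrix, the fake/real labels and the per-tweet event ids derived from the corpus:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut

def event_cross_validation(X, y, events):
    """For each event E_i: train on the remaining events, test on E_i."""
    accuracies = []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=events):
        clf = RandomForestClassifier(n_estimators=100)  # 100 trees, as above
        clf.fit(X[train_idx], y[train_idx])
        accuracies.append(clf.score(X[test_idx], y[test_idx]))  # a = N_c / N
    return float(np.mean(accuracies))
```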
11. EXPERIMENTAL STUDY
[Bar chart: effect of bagging across the models and the feature groups; accuracy (%) of the tweet-based and user-based models, with and without bagging, for the Baseline Features and the Total Features.]
Baseline Features: proposed in our previous work
Total Features: Baseline Features + newly proposed ones
15. COMPARISON WITH OTHER METHODS

TASK            METHOD        F1-SCORE
MediaEval 2015  UoS-ITI       0.830
MediaEval 2015  MCG-ICT       0.942
MediaEval 2015  CERTH-UNITN   0.911
MediaEval 2016  Linkmedia     0.8246
MediaEval 2016  MMLAB@DISI    0.8283
MediaEval 2016  MCG-ICT       0.6761
MediaEval 2016  VMU           0.9116
                Proposed      0.934

MCG-ICT (2015) method:
• Approach tailored to the given MediaEval dataset
• Preprocessing step that first groups tweets by their multimedia content
• Difficult to apply in a realistic setting
16. TWEET VERIFICATION ASSISTANT
ABOUT
Visualize the verification result
Present the list of extracted features and their values
Compare feature values with those from the verification corpus
HOW TO USE
Provide a URL or tweet ID
Inspect the features and the verification result (fake/real)
Find the Tweet Verification Assistant here: http://reveal-mklab.iti.gr/reveal/fake/
18. CHALLENGES AND FUTURE WORK
CHALLENGES
Making the tool usable and easy to understand by non-computer scientists
• Interpretation of Machine Learning outputs is challenging
• Difficult to create an application that journalists could rely on and trust
FUTURE WORK
Test the Verification Assistant's usefulness with journalists/news editors
Extend the framework to other social media
Leverage method output for other verification problems [1]
[1] Olga Papadopoulou, Markos Zampoglou, Symeon Papadopoulos, and Yiannis Kompatsiaris. Web Video Verification using Contextual Cues.
19. Thank you!
Get in touch:
• Christina Boididou: christina.mpoid@gmail.com / @CMpoi
• Symeon Papadopoulos: papadop@iti.gr / @sympap
• Lazaros Apostolidis: laaposto@iti.gr
• Verification Corpus: https://github.com/MKLab-ITI/image-verification-corpus
• Tweet Verification Assistant: http://reveal-mklab.iti.gr/reveal/fake/
With the support of:
Editor's notes
In recent years, we have seen a tremendous increase in the use of social media platforms as a means of sharing content. The simplicity of sharing has led to large volumes of news content reaching huge numbers of readers in a short time. Multimedia content in particular can easily go viral, as it is easily consumed and carries entertainment value.
Given the speed of the news cycle and the competition among journalists to publish first, verification of content is neglected or carried out in a superficial manner. This leads to the online appearance of misleading multimedia content or, for the sake of brevity, fake content. For example, let's look at this picture: can you make a guess? Is it real or fake? Even though Saruman could well have attended this meeting, the image was ultimately found to be photoshopped.
Now, let’s have a look at this image. What is your guess now?
Here we deal with another type of fake photo. It is a real photo, but it was mislabeled on social media as showing the crowd at the Bataclan theatre just before gunmen started firing.
So, as misleading or fake we consider any Twitter post that shares multimedia content that does not faithfully represent the event it refers to. This could include reposting of real multimedia content, reposting of synthetic content/artworks, digital tampering (photoshopping), or speculations.
In order to deal with the verification problem, we present a robust approach for detecting in real time whether a tweet that shares a multimedia item is fake or real.
The proposed framework relies on two independent classification models built on the training data (the verification corpus) using different sets of features: tweet-based and user-based. A bagging technique is used when building each model: we draw n subsets of tweets, each containing an equal number of samples from each class, leading to n classifiers; the final prediction is the majority vote among the n predictions.
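A minimal sketch of this bagging step, assuming numpy feature arrays X and binary labels y (1 = fake); the number of subsets n, the seed, and all function names are illustrative, not the authors' code:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_bagged(X, y, n=9, seed=0):
    """Train n classifiers on balanced subsets (equal fake/real samples)."""
    rng = np.random.default_rng(seed)
    fake_idx, real_idx = np.where(y == 1)[0], np.where(y == 0)[0]
    size = min(len(fake_idx), len(real_idx))
    models = []
    for _ in range(n):
        # Each subset contains 'size' fake and 'size' real tweets.
        idx = np.concatenate([rng.choice(fake_idx, size, replace=False),
                              rng.choice(real_idx, size, replace=False)])
        models.append(RandomForestClassifier(n_estimators=100).fit(X[idx], y[idx]))
    return models

def predict_majority(models, X):
    """Final prediction is the majority vote among the n classifiers."""
    votes = np.stack([m.predict(X) for m in models])
    return (votes.mean(axis=0) >= 0.5).astype(int)
```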
At prediction time, an agreement-based retraining technique is employed, which combines the outputs of the two models. The outcome is then visualized for the users, using information from the labelled verification corpus.
The selection of our features was carried out following a thorough study of the way journalists verify content on the web.
We have defined two sets of features: the tweet-based ones, extracted from the tweet itself, and the user-based ones, extracted from the profile of the posting user. The link-based features assess the trustworthiness of any website linked from the tweet or the profile.
A key novelty in our approach is the agreement-based retraining (ABR) technique (the fusion block). We combine the outputs as follows: for each sample, we compare the two predictions and, depending on their agreement, divide the test set into agreed and disagreed samples. The agreed samples are assigned the agreed label (fake or real), assuming it is correct with high likelihood, and these become the final predictions for the agreed samples. Then we use a retraining technique: first, we select the more effective of the two independent classifiers based on cross-validation performance on the training set; then we use the agreed samples, together with the initial training samples of the verification corpus, to retrain it and predict labels for the disagreed samples. The goal is to adapt the initial model to the characteristics of the new, unseen event.
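A sketch of the ABR fusion, simplified to a single feature matrix (the paper's two models actually use different feature groups); best stands for whichever model won the training-set cross-validation, and all names are illustrative:

```python
import numpy as np

def abr_fuse(tweet_model, user_model, best, X_train, y_train, X_test):
    """Agreement-based retraining: fuse two models' predictions."""
    p1, p2 = tweet_model.predict(X_test), user_model.predict(X_test)
    agreed = p1 == p2
    y_pred = np.empty(len(X_test), dtype=int)
    y_pred[agreed] = p1[agreed]          # agreed samples keep the agreed label
    if (~agreed).any():
        # Retrain the best model on the training set plus the agreed samples,
        # adapting it to the new event, then predict the disagreed samples.
        X_aug = np.vstack([X_train, X_test[agreed]])
        y_aug = np.concatenate([y_train, p1[agreed]])
        best.fit(X_aug, y_aug)
        y_pred[~agreed] = best.predict(X_test[~agreed])
    return y_pred
```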
Our verification corpus is a publicly available dataset with fake and real tweets. It consists of tweets related to 17 events, comprising in total 220 fake and 193 real images and videos, shared in 9,596 fake and 6,225 real tweets. The tweets were collected using a set of keywords and were debunked using reputable online resources. Only tweets sharing one of these multimedia items were included in the dataset, and several manual steps were necessary to arrive at it.
The aim of the conducted experiments was to evaluate the fake detection accuracy on samples from new events. We consider this a very important aspect of a verification framework, as the nature of fake tweets may vary across different events. The employed scheme can be thought of as an event-based cross-validation.
We first assess the contribution of the features to the method's accuracy by comparing performance using the baseline and the full set of features; the baseline features are the subset proposed in our previous work. Then we assess the bagging applied in our method. We can see that the full set of features and bagging, in both the tweet-based and the user-based models, led to considerably improved accuracy.
In this graph, we present the agreement level and the accuracy of the classifiers on the agreed set. We note that the higher the agreement level, the higher the achieved accuracy. The last column is the average percentage across the different trials.
This bar chart shows the agreed accuracy, the disagreed accuracy and, finally, the overall accuracy across the trials. On the right chart, we can see their average levels in green, orange and grey respectively. The last columns, in blue, show the performance of each model when tested individually on the test set. One can see that the overall accuracy is a clear improvement (about 5%) over the individual models.
We also assessed the model on tweets written in different languages.
We show the five most used languages in the corpus; "no lang" covers tweets whose language could not be detected or that contained too little text. Accuracy is stable, independent of the language.
We also compare our model against the best runs of the methods submitted to the MediaEval 2015 verification task. Our proposed method achieves the second best performance, almost equal to the best run.
One of the biggest challenges we are facing is making the tool usable and easy to understand for non-computer scientists.
Our experience with media experts from Deutsche Welle & AFP (Agence France Presse) shows that the …