Evaluating the Utilization of Twitter Messages as a Source of Security Alerts
1. Evaluating the Utilization of Twitter Messages as a Source of Security Alerts
Authors:
Luiz Arthur F. Santos (luizsantos@utfpr.edu.br)
Daniel Macêdo Batista (batista@ime.usp.br)
Rodrigo Campiolo (rcampiolo@utfpr.edu.br)
Marco Aurélio Gerosa (gerosa@ime.usp.br)
These slides by Luiz Arthur Feitosa Santos, Rodrigo Campiolo, Daniel Macêdo Batista, and Marco Aurélio Gerosa
are licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license.
2. Introduction:
● Research Problem:
Delay in the propagation of information about new threats (zero-day
vulnerabilities).
Specialized applications are not fully effective against new
threats.
● Potential Solutions:
The problem can be mitigated by rapid propagation of alerts.
Use of social networks.
3. Objective:
Analyze a set of Twitter messages to verify whether these messages
can help in the identification and early warning of potential security
problems.
Contributions:
Confirm that users collaborate on computer security in social
networks.
Characterization of security messages.
4. Hypotheses:
H1 - There is information about computer security in Twitter
messages and many of these messages indicate potential threats.
H2 - Twitter reports information security issues before some
specialized sites.
H3 - Users on Twitter are concerned with warning other users about
security issues.
6. Methodology:
1. Get tweets
a. … Problem X …
b. ...PROBLEM Y … http...
c. ... Problem … X … http...
d. Threat Y ... #virus
e. … @user … Problem X …
f. New Malware Z...
g. X Solution.. http
Searches at 1-minute intervals for 132 days, using the query:
security AND (virus OR worm OR attack OR intrusion OR invasion
OR ddos OR hacker OR cracker OR exploit OR malware)
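The boolean structure of the query above can be illustrated with a small local filter. This is only a sketch: the function name `matches_query` and the whole-word, case-insensitive matching are assumptions for illustration, and the matching rules of the Twitter search service itself may differ.

```python
import re

# Term that must be present (left side of the AND in the query).
REQUIRED = "security"

# Terms of which at least one must be present (the OR group).
ANY_OF = {"virus", "worm", "attack", "intrusion", "invasion", "ddos",
          "hacker", "cracker", "exploit", "malware"}


def matches_query(text):
    """Return True if text satisfies: security AND (any term in ANY_OF).

    Assumption: matching is case-insensitive and on whole words only,
    so "hackers" would not match "hacker" in this sketch.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    return REQUIRED in words and bool(words & ANY_OF)
```

For example, `matches_query("Security firms warn of new malware")` is true, while a message that mentions malware without the word "security" is rejected.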
7. Methodology:
1. Get tweets
a. … Problem X …
b. ...PROBLEM Y … http...
c. ... Problem … X … http...
d. Threat Y ... #virus
e. … @user … Problem X …
f. New Malware Z...
g. X Solution.. http
3. Similarity and cluster
1a. … Problem X …
1c. ... Problem … X … http...
1e. … @user … Problem X …
2d. Threat Y ... #virus
2b. ...PROBLEM Y … http...
3f. New Malware Z...
4g. X Solution... http
Degree of similarity: 0.5 (tweets with tweets)
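The grouping step can be sketched as follows. The slides give only the threshold (0.5 for tweets compared with tweets), not the similarity measure or the clustering algorithm, so the Jaccard similarity over lowercased word sets and the greedy single-pass grouping below are assumptions chosen for illustration. The names `tokens`, `jaccard`, and `cluster` are hypothetical.

```python
import re


def tokens(text):
    """Lowercase a message and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two messages (assumed measure, not from the slides)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cluster(messages, threshold=0.5):
    """Greedy grouping: a message joins the first group whose first
    member is at least `threshold` similar, else it starts a new group."""
    groups = []
    for msg in messages:
        for group in groups:
            if jaccard(msg, group[0]) >= threshold:
                group.append(msg)
                break
        else:
            groups.append([msg])
    return groups
```

Lowercasing before comparison means that messages differing only in letter case, such as "Problem X" and "PROBLEM X", fall into the same group.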
8. Methodology:
1. Get tweets
a. … Problem X …
b. ...PROBLEM Y … http...
c. ... Problem … X … http...
d. Threat Y ... #virus
e. … @user … Problem X …
f. New Malware Z...
g. X Solution.. http
2. Get Feeds
a. Problem X... new exploit...
b. Problem Z...
Searches for 2 months using 30 security websites.
We also used a web crawler.
3. Similarity and cluster
1a. … Problem X …
1c. ... Problem … X … http...
1e. … @user … Problem X …
2d. Threat Y ... #virus
2b. ...PROBLEM Y … http...
3f. New Malware Z...
4g. X Solution... http
9. Methodology:
1. Get tweets
a. … Problem X …
b. ...PROBLEM Y … http...
c. ... Problem … X … http...
d. Threat Y ... #virus
e. … @user … Problem X …
f. New Malware Z...
g. X Solution.. http
2. Get Feeds
a. Problem X... new exploit...
b. Problem Z...
3. Similarity and cluster
1a. … Problem X …
1c. ... Problem … X … http...
1e. … @user … Problem X …
2d. Threat Y ... #virus
2b. ...PROBLEM Y … http...
3f. New Malware Z...
4g. X Solution... http
Degree of similarity: 0.2 (news with tweets)
4. Important messages
1a. … Problem X …
3f. New Malware Z...
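One plausible reading of the last step is that a tweet group counts as important when its representative message resembles at least one feed item at the 0.2 threshold. The sketch below follows that reading. The selection rule, the Jaccard measure, and the name `important_messages` are assumptions, since the slides state only the threshold.

```python
import re


def tokens(text):
    """Lowercase a message and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two texts (assumed measure, not from the slides)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def important_messages(cluster_heads, feed_items, threshold=0.2):
    """Keep the group representatives that resemble at least one feed item."""
    return [
        head for head in cluster_heads
        if any(jaccard(head, item) >= threshold for item in feed_items)
    ]
```

The lower threshold than in the tweet-to-tweet comparison fits the fact that news items are longer than tweets, so the share of overlapping words is naturally smaller.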
10. Data Collected:
Twitter - from 28/Apr/2012 to 06/Nov/2012
● Number of tweets: 82,355
● Average tweets per day: ~623
● Number of users: 42,340
● Tweets with URLs: 87.6%
● Tweets with user mentions (@): 37.7%
● Tweets with hashtags (#): 37%
Feeds - from 01/Apr/2012 to 15/Nov/2012
● Number of feeds: 4,546
11. Data Analysis:
Words most used in security tweets
Searched terms          Security terms
Qty      Word           Qty      Word
51,197   security       4,671    android
23,030   malware        4,536    flame
22,108   attack         4,214    infosec
10,196   hacker         4,200    news
9,893    virus          4,056    cyber
5,695    exploit        3,270    anti
2,359    ddos           2,788    computer
951      worm           2,637    hacking
816      intrusion      2,419    iran
699      invasion       2,398    apple
246      cracker        2,336    internet
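A word ranking like the one above can be produced by counting tokens across all collected tweets. The sketch below is a minimal illustration: the slides do not describe the tokenization used, so the URL stripping, the lowercase alphabetic tokens, and the name `word_counts` are assumptions.

```python
import re
from collections import Counter


def word_counts(tweets):
    """Count lowercase alphabetic words across tweets, ignoring URLs."""
    counts = Counter()
    for text in tweets:
        # Drop full URLs so fragments such as "t" and "co" are not counted.
        without_urls = re.sub(r"https?://\S+", " ", text)
        counts.update(re.findall(r"[a-z]+", without_urls.lower()))
    return counts
```

Calling `word_counts(tweets).most_common(10)` then yields the ten most frequent words with their counts, in descending order.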
12. Data Analysis:
Sample of relevant tweets:
Pos.   Tweets   Message excerpt
1      512      Malicious code on Adobe Flash player http...
2      463      How Flame virus has changed everything for online security firms ... http://t.co...
3      374      New Java Zero-Day Exploit Hits http...
4      373      Kaspersky Anti-Virus Internet Security ... http://t.co/D0Gqh3RR
438    37       Only 9 of 22 virus scanners block Java exploit http://t.co/rw1sa3jf
439    37       ...Microsoft Services Agreement email notifications lead to latest Java exploit http...
440    36       RT @CompuSec... Hackers, rootkit find place in new novel...
441    36       # Android Map Malware http://t.co/...
1735   10       ...Gevaarlijk wis-virus verwijdert brandende VS-vlag - Er is een nieuwe variant...
1736   10       Valse Amazon-bestelling bevat Java-exploit ... http://t.co/f1KIGG2s via @shareth...
1737   10       ...malware via Java-lek Op de website van de Telegraaf hebben aanvallers kwaadaardige...
1738   10       Mobile Malware On The Rise, Android Most At Risk, Says McAfee http://t.co/iyhKXaxE
13. Data Analysis:
Classification of tweets grouped with the specialized sites.
Classification   % Tweets
Relevant         62%
Irrelevant       20%
Spam             10%
Others           8%
82% are related to security!
14. Data Analysis:
Classification of tweets after clustering.
Evaluated a sample of 100 groups out of a total of 1,738.
Classification     % Tweets
Security alerts    60%
General security   31%
Others             9%
91% are related to security!
15. Evaluation of Hypotheses:
H1 - There is information about computer security in Twitter
messages and many of these messages indicate potential threats.
82,355 tweets in 132 days, an average of 623.90 tweets per day.
91% of tweets reported security issues.
60% of tweets reported security alerts.
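The daily average quoted for H1 follows directly from the two totals. A short check, using only the figures from the slides (the function name `average_per_day` is introduced here for illustration):

```python
def average_per_day(total_tweets, days):
    """Mean number of tweets collected per day."""
    return total_tweets / days


# Figures from the slides: 82,355 tweets over 132 days.
average = average_per_day(82_355, 132)
print(f"{average:.2f}")  # prints 623.90
```

Truncating this value to a whole number gives the ~623 reported on the data collection slide.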
16. Evaluation of Hypotheses:
H2 - Twitter reports information security issues before some
specialized sites.
43% of tweets have the most recent date.
Example:
PHP-CGI query string parameter vulnerability
➢ Posted by CERT on 02/May/2012.
➢ Posted on Twitter on 04/May/2012.
➢ Cataloged by NIST on 11/May/2012.
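The timing in this example can be made explicit by computing the gaps between the three dates. The sketch uses only the dates given on the slide; the variable and function names are introduced for illustration.

```python
from datetime import date

# Dates for the PHP-CGI query string parameter vulnerability, from the slide.
cert_post = date(2012, 5, 2)
twitter_post = date(2012, 5, 4)
nist_entry = date(2012, 5, 11)


def days_between(earlier, later):
    """Number of days from `earlier` to `later` (negative if reversed)."""
    return (later - earlier).days


print(days_between(cert_post, twitter_post))   # prints 2
print(days_between(twitter_post, nist_entry))  # prints 7
```

In this case the tweet appeared 2 days after the CERT post and 7 days before the NIST entry, which is the sense in which Twitter preceded some, but not all, specialized sites.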
18. Evaluation of Hypotheses:
H3 - Users on Twitter are concerned with warning other users about
security issues.
Average propagation time is 12 days.
10 retweets reached ~10,000 users. The last
two messages reached 22,468 and 52,074
Twitter users, respectively.
The most widely propagated message reached
~512,000 people.
19. Final Considerations:
● Difficulty selecting tweets (content and size).
● Social networks propagate security alerts.
● Alerts spread widely and quickly.
20. Future Work:
● Run new queries using other security terms.
● Improve the filter for spam and out-of-context messages.
● Evaluate security alerts on other social networks.
● Develop an automated security early-warning system based on social
networks.
21. Questions?
Luiz Arthur F. Santos (luizsantos@utfpr.edu.br)
Daniel Macêdo Batista (batista@ime.usp.br)
Rodrigo Campiolo (rcampiolo@utfpr.edu.br)
Marco Aurélio Gerosa (gerosa@ime.usp.br)
Thanks / Obrigado!