Broker Bots: Analyzing
automated activity during
High Impact Events on Twitter
Sudip Mittal
M.Tech Thesis Defense
6th June 2014
1
Committee
2
• Dr. Ponnurangam Kumaraguru (IIIT-Delhi)(Chair)
• Dr. Vinayak Naik (IIIT-Delhi)
• Dr. Sundeep Oberoi (TCS)
Introduction
•Twitter 101
•High impact events
•Automated activity/Twitter bot
•Research Motivation
3
Problem Statement
• Analyze and study automated activity (bots)
during High Impact Events on Twitter.
4
Twitter 101 - Users
5
Twitter 101 - Tweet
6
High Impact Events
• Events (some examples):
• Government policy changes,
• Elections,
• Natural calamities: earthquakes, tornadoes, etc.
• Celebrity gossip.
• Criteria for High Impact Events:
• Great political and economic impact;
• May also have moderate/high damage to life and/or property.
7
Sample Posts - Boston
Bombings (2013)
• I saw people’s legs blown off. Horrific. Two explosions.
Runners were coming in and saw unspeakable horror. –
Jackie Bruno (@JackieBrunoNECN) April 15, 2013
• At the ER. Not a comforting way to pass the time.
#boston So sad. – GanderHeroDog (@veterantraveler)
April 15, 2013
• An eyewitness during the explosions at the #Boston
#Marathon says “it sounded like a cannon blast.” Video:
on.cnn.com/1399A40 – CNN Video (@CNNVideo) April
15, 2013
8
Automated Activity on Web
• Wikipedia bot
• Twitter bot
• Spam bot
• IRC bot
• Automated tasks on the web
• Bots can perform tasks that are simple and repetitive,
at a higher rate than would be possible for a human.
9
Twitter Bots
• Posts updates to Twitter
automatically.
• Used in:
• Regularly updating
users about information,
• Spam campaigns,
• Directing Twitter users
to outside webpages.
10
Research Motivation
• Bots are becoming popular on Twitter.
• They will affect discussions, information flow, credibility, information
security, etc.
• People rely on Twitter for important updates and crucial information.
• Even more so during high impact events.
• How do bots impact discussion and information diffusion on
Twitter during high impact events?
11
Related Work
• Work on High Impact Events
• Work on Automated Activity
12
Related Work - Work on High Impact Events
13
Kwak et al. 2010: Is Twitter a social network or a news media?
Palen et al. 2008: Analyzed Twitter adoption and use during emergencies.
Zhao et al. 2011: Compared Twitter with traditional media (New York Times).
Mendoza et al. 2010: Use of Twitter during the 2010 earthquake in Chile.
Castillo et al. 2011: Information credibility on Twitter.
Gupta et al. 2012: A mechanism to rank credible information on Twitter.
Longueville et al. 2009: Location based mining to analyze forest fires in France.
Sakaki et al. 2010: Used Twitter to locate epicenter & impact of earthquakes.
Agrawal et al. 2011: Tracked the Mumbai terrorist attack on Twitter.
Verma et al. 2011: Used NLP to extract “situational awareness” tweets.
Vieweg et al. 2010: Use of Twitter during two natural hazards events.
Gupta et al. 2013: Analyzed fake content during the Boston marathon blast.
Related Work - Work on Automated Activity
14
Zhang et al. 2011: Analyzed time trends in bot activity on Twitter.
Chu et al. 2010: Distinguished between automated & human accounts on Twitter.
Roure et al. 2013: Observed bots in the “wild” on Twitter.
Boshmaf et al. 2011: Showed that online social networks are vulnerable to infiltration by bots on a large scale.
Messias et al. 2013: Studied social bots and their influence over the network in which they were active.
Tavares et al. 2013: Temporal analysis to compare activity of bots and humans.
Wald et al. 2013: Studied the users that were susceptible to bots on Twitter.
Research Findings
• Bots in high impact events spread information obtained from
“trusted and verified” sources.
• In the Boston marathon blasts we show that bots push updates
to at least 9.53% new users.
• We show that bots do not propagate rumors; when they
do, it is only after some delay.
• Bots are moving away from a Twitter API based approach to
web automation software/systems.
• We create a classifier based on user based features with an
accuracy of 85.10% in classifying bot and non-bot accounts.
15
Methodology
• Event Selection
• Event Data Collection
• Data Annotation
• Data Enrichment
16
Event Selection
Boston Marathon Blast:

2 bombs exploded, 3 people killed.
Bombing was followed by
shootings, manhunt and firefight.

(April 15, 2013)
17
Oklahoma Tornado: 

Tornado struck Moore; 24 people
were killed. Damages of about
$2 Billion. 

(May 20, 2013)
Navy Yard Shooting:

Lone gunman fatally shot 12 people and
injured 3 at NAVSEA, Washington
Navy Yard. 

(September 16, 2013)
18
Event Selection
Cyclone Phailin:

Cyclonic storm hit Bay of Bengal
causing damages of $696 Million
and 45 deaths. 

(October 4 - 14, 2013)
US Ice-storm 2014:

Fierce storm hit East and
South Coast of USA. 22
people were killed; damages of
about $15 Million.

(February 11 - 17, 2014)

19
Event Selection
Event Data Collection
Data collected
using Twitter API
endpoints.
20
Data Annotation
• Top 200 “tweeters” from each event
• 5 Events
• 1,000 users manually annotated
• Each user annotated by 3 annotators
• All annotators were Masters (CS) and PhD (CS)
students
• Criterion: must have been using Twitter for at least 1 year
21
Data Annotation
• Please see the document for annotation
instructions. (Cyborgs discouraged, different from
bots)
• Strict approach: 

All 3 annotators must have labelled an account as
either bot or nonbot. (100% agreement - for quality)
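The strict agreement rule can be illustrated with a short sketch. The account names and votes below are hypothetical, and the helper `strict_label` is only an illustration of a unanimous-vote filter, not the thesis's actual tooling.

```python
def strict_label(votes):
    """Return the shared label when every annotator agrees, else None."""
    return votes[0] if len(set(votes)) == 1 else None

# Hypothetical annotations: three annotators per account.
annotations = {
    "account_a": ["bot", "bot", "bot"],
    "account_b": ["bot", "nonbot", "bot"],      # disagreement, so it is dropped
    "account_c": ["nonbot", "nonbot", "nonbot"],
}

labelled = {user: strict_label(votes) for user, votes in annotations.items()}
kept = {user: label for user, label in labelled.items() if label is not None}
print(kept)  # {'account_a': 'bot', 'account_c': 'nonbot'}
```

Requiring unanimity trades dataset size for label quality: any account with a split vote is discarded rather than resolved by majority.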
22
Data Enrichment
• “Enrich” for further analysis.
• We collected their Timeline, Friends and Followers
• “Enriched” by collecting profile meta-data.
23
Analysis
• Twitter bot creation methodology
• Some characteristics
• Bot Friends
• Bot Followers
• Tweet Analysis
• Impact on Information Diffusion
• Features and Classification
• Detailed analysis of few bots
• Changes (2011 vs 2013)
24
Exploratory Numbers
• 377 bots, 5 events
• 309 distinct bots (26 active during multiple events)
• 115 distinct nonbots
• 5 events from 2013, 7 events from 2011 (later)
25
Bot Creation Methodology
• Twitter API rules:
• Don't post duplicate content.
• Don't post content unrelated to the topic.
• Don't repeatedly follow or unfollow.
• Don't send large number of @replies.
• Don't Retweet aggressively.
• 1,000 tweets per day limit: about 41 tweets per hour, about 0.69 tweets
per minute.
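The hourly and per-minute rates follow directly from the daily cap. A minimal sketch of the arithmetic, assuming the cap is spread evenly over 24 hours (the slide rounds the hourly figure down to whole tweets):

```python
DAILY_TWEET_LIMIT = 1_000  # daily cap quoted on the slide

def posting_rates(daily_limit):
    """Return (tweets per hour, tweets per minute) for an evenly spread daily cap."""
    per_hour = daily_limit / 24
    per_minute = daily_limit / (24 * 60)
    return per_hour, per_minute

per_hour, per_minute = posting_rates(DAILY_TWEET_LIMIT)
print(f"{per_hour:.2f} tweets/hour, {per_minute:.2f} tweets/minute")
# 41.67 tweets/hour, 0.69 tweets/minute
```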
26
Bot Creation Methodology
• 4 major creation methodologies:
• Popular tweet based: “listen” for popular tweets (via API,
Trends, etc.) and repost
• Keyword based: look for tweets containing certain keywords and
repost.
• Source based: content posted/reposted from particular
other Twitter users.
• Outside content based: look for updates outside Twitter like
RSS feeds, web feeds, blog updates, dedicated databases,
etc.
27
Characteristics - Friends and
Followers
• Data points clustered around the origin for bots,
more spaced out for nonbots (similar to Chu et al.).
28
[Figure panels: Nonbot, Bot]
29
Characteristics - Profile
Description
Bots
Nonbots
Bot Friends and Data Sources
• One way: the frequency of “@mentions”, which are used to cite sources and hold
conversations on Twitter (Retweets, @replies, etc.).
• Retweets in the API: “RT @mention tweet text” or “tweet text via
@mention”
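Counting which accounts a bot cites can be sketched with two regular expressions that match the retweet forms quoted above. The patterns, the helper `cited_source`, and the `@example_news` tweets are illustrative assumptions; only the `@pspoole` retweet text comes from the deck.

```python
import re
from collections import Counter

# Assumed patterns for the two forms: "RT @mention ..." and "... via @mention".
RT_PATTERN = re.compile(r"^RT @(\w+)")
VIA_PATTERN = re.compile(r"\bvia @(\w+)\s*$")

def cited_source(text):
    """Return the lower-cased handle a tweet credits, or None if it credits nobody."""
    match = RT_PATTERN.search(text) or VIA_PATTERN.search(text)
    return match.group(1).lower() if match else None

tweets = [
    "RT @pspoole: Dzhokhar Tsarnaev received US citizenship on Sept 11, 2012",
    "Roads closed near the finish line via @example_news",   # hypothetical
    "RT @example_news: Police ask people to avoid the area", # hypothetical
    "Just a plain status update with no source",             # hypothetical
]

source_counts = Counter(s for s in map(cited_source, tweets) if s)
print(source_counts.most_common())  # [('example_news', 2), ('pspoole', 1)]
```

Ranking the resulting counts gives a rough picture of the accounts a bot relies on as data sources.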
30
Bot Friends and Data Sources
31
Bots in high impact events spread information obtained
from “trusted and verified” sources.
Bot Followers - Profile
Description
Total number of bot followers collected: 623,198.
32
Bot Followers - Network

(Boston Marathon Blasts)
33
• Total number of bot followers collected: 623,198.
• Network created using Gephi (open graph visualization platform).
• Memory (RAM) limits made it impossible to create the whole network.
• We created 2 random subsets of 40,000 & 20,000 users.
• Created a network for each subset.
34
Average graph
degree: 1.01545
Bot Followers - Network

(Boston Marathon Blasts)
Bot Followers - Network

(Boston Marathon Blasts)
35
Bot Followers - Network

(Boston Marathon Blasts)
36
• Most users follow 1 bot account; very few follow
more than 1.
• Following more than one bot account would “flood”
their Timeline, because bots tweet a lot.
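The observation that most followers follow a single bot can be checked from a follower-to-bot edge list. The edges below are hypothetical, and this sketch is independent of how Gephi computes its own graph statistics.

```python
from collections import Counter

# Hypothetical (follower, bot) edges; a real run would load the collected follower lists.
follow_edges = [
    ("user1", "botA"), ("user2", "botA"), ("user3", "botB"),
    ("user4", "botB"), ("user4", "botA"),
]

# Number of distinct bots each follower follows (duplicate edges removed first).
bots_per_follower = Counter(follower for follower, _bot in set(follow_edges))

# How many followers follow exactly 1 bot, exactly 2 bots, and so on.
distribution = Counter(bots_per_follower.values())
multi_bot_share = sum(1 for n in bots_per_follower.values() if n > 1) / len(bots_per_follower)

print(dict(distribution))  # {1: 3, 2: 1}
print(multi_bot_share)     # 0.25
```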
Impact on Information Diffusion
during High Impact Events
• Focused on Boston blasts
dataset.
• In the Boston marathon
blasts we show that bots
push updates to at least
9.53% new users.
• These bots can be seen as
“brokering” information to
other users.
37
Role in Rumor Propagation

(Boston Blasts)
• High impact events: spread of misinformation, non-credible
information, fake images, hoaxes, malicious content, etc.
• Castillo et al. & Gupta et al.: Information credibility on
Twitter.
• Gupta et al. (separate paper): analyzed spread of fake
images during Boston bombings, working on the exact
same dataset.
• We look at some rumors in the Boston event and correlate
them with the Timelines of the 97 annotated bots from our dataset.
38
Role in Rumor Propagation

(Boston Blasts)
• Rumor 1:

Suspect became citizen on 9/11
• “RT @pspoole: Dzhokhar Tsarnaev received US
citizenship on Sept 11, 2012 – 11 years to the day after
9/11 attacks http://t.co/kHLL7mkjnn”
• 1st tweet was by @pspoole on Friday April 19, 2013, at
15:34:56 (+0000).
• Only 2 bots Retweeted this rumor, on Friday, April 19,
2013, at 15:41:35 and 15:47:58 (+0000).
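The lag between the original tweet and the two bot Retweets can be computed from the timestamps on this slide. A minimal sketch, using only the three UTC times quoted above (the helper `utc` is just a convenience):

```python
from datetime import datetime, timezone

def utc(year, month, day, hour, minute, second):
    """Build a timezone-aware UTC timestamp."""
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)

original_tweet = utc(2013, 4, 19, 15, 34, 56)
bot_retweets = [utc(2013, 4, 19, 15, 41, 35), utc(2013, 4, 19, 15, 47, 58)]

# Delay of each bot Retweet relative to the original, in seconds.
delays = [(t - original_tweet).total_seconds() for t in bot_retweets]
print(delays)  # [399.0, 782.0]
```

That is a lag of roughly 6.7 and 13 minutes for the two bots.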
39
Role in Rumor Propagation

(Boston Blasts)
• Rumor 2:

Sandy Hook child killed in bombing
• “RT @CommonGrandma: She ran for the Sandy Hook
children and was 8 years old. #prayforboston http://t.co/cLir6nI7tB”
• 1st tweet on Friday, April 19, 2013, at 09:56:45 (+0000).
• This rumor was never picked up by any of the 97
bots in our dataset.
40
Role in Rumor Propagation

(Boston Blasts)
• Rumor 3:

$1 donation tweet.
• A Tweet by a fake account @_BostonMarathon: “For each RT this gets, $1 will be
donated to the victims of the Boston Marathon Explosions. #DonateToBoston”
• Tweet time: Monday, April 15, 2013, at 11:29:23 (+0000).
• Only 1 bot in our dataset picked up this Tweet, and only on Wednesday, April
17, 2013, at 00:50:24 (+0000).
• Gupta et al. reported 28,350 Retweets, working on the same dataset.
• Following bots is comparatively safe: we show that bots do not propagate rumors, and
when they do, it is only after some delay.
• Bot sources: verified accounts.
41
Tweet Analysis - URL Analysis
• 44,071 tweets by bots.
• 7,099 tweets by nonbots.
• 36,672 (83%) bot tweets and
4,849 (68%) nonbot tweets
have URLs
• 188 bot and 27 nonbot URLs
were marked “malicious” by
“Google Safe Browsing API”
• Bots largely don't spread malicious
content.
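The percentages on this slide can be reproduced from the raw counts. The malicious share below assumes one URL per URL-bearing tweet, which the slide does not state, so treat that ratio as a rough indication only.

```python
# Counts quoted on the slide.
counts = {
    "bot":    {"tweets": 44_071, "with_url": 36_672, "malicious": 188},
    "nonbot": {"tweets": 7_099,  "with_url": 4_849,  "malicious": 27},
}

shares = {}
for group, c in counts.items():
    url_pct = 100 * c["with_url"] / c["tweets"]
    # Assumption: one URL per URL-bearing tweet, so this denominator is approximate.
    malicious_pct = 100 * c["malicious"] / c["with_url"]
    shares[group] = (url_pct, malicious_pct)
    print(f"{group}: {url_pct:.0f}% of tweets have URLs, "
          f"{malicious_pct:.2f}% of those flagged malicious")
# bot: 83% of tweets have URLs, 0.51% of those flagged malicious
# nonbot: 68% of tweets have URLs, 0.56% of those flagged malicious
```

Under that assumption the flagged share is about half a percent for both groups.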
42
Tweet Analysis - Source
Analysis
Twitter API gives the “source” of a tweet.
43
• Use of automation services.
• IFTTT (“IF This Then That”) and dlvr.it
• Make creating bots easy.
• Triggered by activity outside/inside Twitter and posted on Twitter.
44
Tweet Analysis - Source
Analysis
Tweet Analysis - Time Analysis
• Zhang et al.: automated accounts will exhibit timing patterns that are not found in
non-automated users.
• Automated activity is invoked by job schedulers that execute tasks at specific
times or intervals.
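One way to look for scheduler-like timing is to compare the mean and standard deviation of the gaps between consecutive tweets. The sketch below uses hypothetical timestamps and the population standard deviation; it is not the exact test used by Zhang et al.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev

def inter_tweet_stats(timestamps):
    """Return (mean, standard deviation) of gaps between consecutive tweets, in seconds."""
    ordered = sorted(timestamps)
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(ordered, ordered[1:])]
    return mean(gaps), pstdev(gaps)

start = datetime(2013, 4, 15, 19, 0, 0)
# Hypothetical account posting on a fixed 10-minute schedule.
scheduled = [start + timedelta(minutes=10 * i) for i in range(6)]
# Hypothetical account posting at irregular moments.
irregular = [start + timedelta(minutes=m) for m in (0, 2, 3, 40, 41, 95)]

print(inter_tweet_stats(scheduled))  # (600.0, 0.0)
print(inter_tweet_stats(irregular))
```

A standard deviation near zero points to a fixed schedule; a large spread is what one would expect from irregular posting.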
45
1,000 tweets per day limit: about 41 tweets per
hour, about 0.69 tweets per minute.
[Figure panels: Bot, Nonbot]
Tweet Analysis - Time Analysis
46
[Figure: during high impact event; panels: Bot, Nonbot]
Tweet Analysis - Time Analysis
47
[Figure: during high impact event, inter-tweet time mean vs. s.d.; panels: Bot, Nonbot]
Tweet Analysis - Time Analysis
48
[Figure: during high impact event, average tweet time; panels: Nonbot, Bot]
Temporal features of the kind proposed by Zhang et al. do not help us differentiate
bots and nonbots during high impact events.
Features
• Chu et al.’s Twitter features:
• Tweet Count: Bots > Humans.
• Long term hibernation: Bots show long periods of no activity.
• Ratio of Friends vs Followers: Bots have more friends; humans are balanced.
• Temporal tweeting pattern: Bots are more active during specific days of the
week.
• Account Creation Date: Bots are more “recent”.
• Tweet source: Bots use the API; humans use mobiles, web, etc.
• Presence of URLs: Bots have more.
• Time Trends: Zhang et al.’s argument on the Twitter API.
49
Features
• Temporal based features: Time trends, hibernation,
temporal patterns.
• User based features: Tweet count, ratio of friends
vs followers, account creation date, tweet source,
presence of URLs.
50
Features
• We argue that some features are not informative during high impact events.
• Long term hibernation: Bots are active and participate.
• Temporal tweeting pattern: An event can occur on any day of the week.
• Presence of URLs: 83% of bot tweets and 68% of nonbot tweets had URLs.
Small difference.
• Time Trends: We compared them in a previous section.
51
[Figure panels: Bot, Nonbot]
Classification
• Decision Trees (J48 using WEKA) to
classify whether an account is a bot or not.
• 2 feature sets:

One with only user based features (F1),
and one with user + temporal features
(F2).
• Results show that user based features
are better at predicting bots.
• Temporal features are not very
helpful.
• A similar result was obtained by Chu et al.
• These results are obtained when
time based features are removed.
52
Classification
• Features with the highest information gain:
• Tweet source.
• Presence of URLs.
• Ratio of Followers vs. Friends.
• Account creation date.
• Tweet count.
• Similar results were obtained by Chu et al.
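Assuming the ranking above refers to information gain, the quantity can be sketched in plain Python. The toy accounts and feature names below are hypothetical, and this computes plain information gain rather than reproducing WEKA's J48 implementation.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, feature, label="label"):
    """Entropy reduction obtained by splitting rows on one categorical feature."""
    base = entropy([row[label] for row in rows])
    remainder = 0.0
    for value in {row[feature] for row in rows}:
        subset = [row[label] for row in rows if row[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical toy accounts, only to exercise the functions.
accounts = [
    {"source": "api",    "has_url": True,  "label": "bot"},
    {"source": "api",    "has_url": True,  "label": "bot"},
    {"source": "web",    "has_url": True,  "label": "nonbot"},
    {"source": "mobile", "has_url": False, "label": "nonbot"},
]

for feature in ("source", "has_url"):
    print(feature, round(information_gain(accounts, feature), 3))
# source 1.0
# has_url 0.311
```

In this toy data the tweet source separates the classes perfectly, so it gets the maximum gain of 1 bit, while URL presence is only partly informative.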
53
Detailed Analysis of few Bots
• How do bots behave under normal conditions?
• We took 5 bots and monitored them from
5 March 2014 to 9 April 2014.
• We took daily snapshots of their Timeline, Friends,
Followers, Mentions, Retweets and @replies
(except Direct Messages, which are unavailable through the API).
54
Detailed Analysis of few Bots
55
Detailed Analysis of few Bots -
@Warframe_BOT
56
Changed itself after
the event.

From news aggregator to
game updates.

All previous tweets
deleted!
Detailed Analysis of few Bots -
@FintechBot & @DTNUSA
57
Spread financial and
general news.

Users tend to Retweet and
sometimes @reply.

These bots never
reply back.
They do not hold
conversations.
(Frequently observed
pattern)
Detailed Analysis of few Bots -
@BBCWeatherBot & @tipdoge
58
Engage in interaction with users.

Computer programs that require
certain input in a particular format.

@BBCWeatherBot: requires
place and day of the week.
Used to search for weather info during
events.

@tipdoge: different commands
like “balance”, “tip”, etc.
Used to donate money to
victims.
59
Detailed Analysis of few Bots -
@BBCWeatherBot & @tipdoge
Detailed Analysis of few Bots -
@BBCWeatherBot & @tipdoge pattern in
tweets
60
Average Levenshtein distance for @BBCWeatherBot: 0.678

Highly repetitive tweets with high similarity.

Programs that have
a limited output.
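A pairwise edit-distance score of this kind can be sketched as follows. The slide does not say how the value was normalized, so dividing by the longer string's length is an assumption, and the sample replies are hypothetical.

```python
from itertools import combinations

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def normalized_distance(a, b):
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0

def average_pairwise_distance(texts):
    """Mean normalized distance over all unordered pairs of texts."""
    pairs = list(combinations(texts, 2))
    return sum(normalized_distance(a, b) for a, b in pairs) / len(pairs)

# Hypothetical templated replies, similar in spirit to a weather bot's output.
replies = [
    "Forecast for London on Monday: light rain",
    "Forecast for Leeds on Monday: light rain",
    "Forecast for London on Friday: sunny",
]
print(round(average_pairwise_distance(replies), 3))
```

Templated output keeps the pairwise distances between a bot's tweets low compared with free-form text.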
Changes 2011 vs 2013
• How did bot participation in high impact events change with time?
• Used the 2011 crisis dataset (Gupta et al.).
• Similar annotation scheme (previously discussed).
• Note: Data annotation was done in March 2014. An account's behavior in 2014 does
not necessarily reflect its bot/nonbot activity in 2011.
61
Changes 2011 vs 2013
• Annotation results:
• A major difference was observed in tweet source.
Comparing it to 2013 reveals new methods used by
bots to post data to Twitter.
62
Take Aways
63
• Key Points
• Future Work
• Limitations
Take Aways
• We show that bots in high impact events spread
information from “trusted and verified” sources.
• We show that bots aid in information distribution. In
the Boston marathon blasts bots push updates to
at least 9.53% new users.
• We show that bots do not spread rumors; when
they do, it is only after some delay.
64
Take Aways
• Bots are moving away from a Twitter API based
approach to web automation software.
• We analyze the network of bots and bot followers.
• We show temporal based features don’t add value
in differentiating bot and non-bot accounts on
Twitter during high impact events.
65
Take Aways
• We create a classifier based on “user based
features” with an accuracy of 85.10%.
• We analyze 5 bots and give insights on working
and behavior.
• We analyze growth and changes in bot activity in
2011 vs 2013.
66
Future Work
67
• Analyze more events.

• Check and verify these results for other high impact
events

• Analyze bot impact on malicious and fake content on
Twitter.

• Develop a tool to differentiate between bots and nonbots.
Limitations
• Annotations were done a few months after the events.

• We can't claim that the collected data is representative.
68
References
69
Boshmaf, Y., Muslukhov, I., Beznosov, K., and Ripeanu, M. The socialbot network: when bots socialize for
fame and money. In Proceedings of the 27th Annual Computer Security Applications Conference (2011),
ACM, pp. 93–102.

Castillo, C., Mendoza, M., and Poblete, B. Information credibility on twitter. In Proceedings of the 20th
international conference on World wide web (New York, NY, USA, 2011), WWW ’11, ACM, pp. 675–684.

Chu, Z., Gianvecchio, S., Wang, H., and Jajodia, S. Who is tweeting on twitter: Human, bot, or cyborg? In
Proceedings of the 26th Annual Computer Security Applications Conference (New York, NY, USA, 2010),
ACSAC ’10, ACM, pp. 21–30.

De Longueville, B., Smith, R. S., and Luraschi, G. ”omg, from here, i can see the flames!”: a use case of
mining location based social networks to acquire spatio-temporal data on forest fires. In Proceedings of the
2009 International Workshop on Location Based Social Networks (New York, NY, USA, 2009), LBSN ’09,
ACM, pp. 73–80.

De Roure, D., Hooper, C., Meredith-Lobay, M., Page, K., Tarte, S., Cruickshank, D., and De Roure, C.
Observing social machines part 1: what to observe? In Proceedings of the 22nd international conference on
World Wide Web companion (2013), International World Wide Web Conferences Steering Committee, pp.
901–904.

Gupta, A., and Kumaraguru, P. Credibility ranking of tweets during high impact events. In Proceedings of
the 1st Workshop on Privacy and Security in Online Social Media (New York, NY, USA, 2012), PSOSM
’12, ACM, pp. 2:2–2:8.
70
Gupta, A., Lamba, H., and Kumaraguru, P. $1.00 per rt #bostonmarathon #prayforboston: Analyzing fake
content on twitter. In Eighth IEEE APWG eCrime Research Summit (eCRS) (2013), IEEE, p. 12.

Kwak, H., Lee, C., Park, H., and Moon, S. What is twitter, a social network or a news media? In Proceedings
of the 19th international conference on World wide web (New York, NY, USA, 2010), WWW ’10, ACM, pp.
591–600.

Mendoza, M., Poblete, B., and Castillo, C. Twitter under crisis: can we trust what we rt? In Proceedings of
the First Workshop on Social Media Analytics (New York, NY, USA, 2010), SOMA ’10, ACM, pp. 71–79.

Messias, J., Schmidt, L., Oliveira, R., and Benevenuto, F. You followed my bot! transforming robots into
influential users in twitter. First Monday 18, 7 (2013).

Oh, O., Agrawal, M., and Rao, H. R. Information control and terrorism: Tracking the mumbai terrorist attack
through twitter. Information Systems Frontiers 13, 1 (Mar. 2011), 33–43.

Palen, L., and Vieweg, S. The emergence of online widescale interaction in unexpected events: assistance,
alliance & retreat. In Proceedings of the 2008 ACM conference on Computer supported cooperative work
(New York, NY, USA, 2008), CSCW ’08, ACM, pp. 117–126.

71
Sakaki, T., Okazaki, M., and Matsuo, Y. Earthquake shakes twitter users: real-time event detection by social
sensors. In Proceedings of the 19th international conference on World wide web (New York, NY, USA,
2010), WWW ’10, ACM, pp. 851–860.

Tavares, G., and Faisal, A. Scaling-laws of human broadcast communication enable distinction between
human, corporate and robot twitter users. PloS one 8, 7 (2013), e65774.

Verma, S., Vieweg, S., Corvey, W., Palen, L., Martin, J. H., Palmer, M., Schram, A., and Anderson, K. M.
Natural language processing to the rescue? extracting ”situational awareness” tweets during mass
emergency. In ICWSM (2011), L. A. Adamic, R. A. Baeza-Yates, and S. Counts, Eds., The AAAI Press.

Vieweg, S., Hughes, A. L., Starbird, K., and Palen, L. Microblogging during two natural hazards events:
what twitter may contribute to situational awareness. In Proceedings of the 28th international conference on
Human factors in computing systems (New York, NY, USA, 2010), CHI ’10, ACM, pp. 1079–1088.

Wald, R., Khoshgoftaar, T., Napolitano, A., and Sumner, C. Predicting susceptibility to social bots on
twitter. In Information Reuse and Integration (IRI), 2013 IEEE 14th International Conference on (2013), pp.
6–13.

Zhang, C. M., and Paxson, V. Detecting and analyzing automated activity on twitter. In Proceedings of the
12th International Conference on Passive and Active Measurement (Berlin, Heidelberg, 2011), PAM’11,
Springer-Verlag, pp. 102–111.
72
Acknowledgement
• IIITD
• Dr. PK
• AG
• PJ & PJ
• AA
• NS
• AB
• MG
• SG
• To all my annotators
73
Thank You
sudipmittal@gmail.com | sudipmittal@umbc.edu
sudip09068@iiitd.ac.in
74
Bye bye IIITD!!! 5 years were fun!!!
Thank you for everything you have taught me!!
After July 2014, you can find me at UMBC ebiquity.

Weitere ähnliche Inhalte

Was ist angesagt?

Altmetrics: Listening & Giving Voice to Ideas with Social Media Data
Altmetrics: Listening & Giving Voice to Ideas with Social Media DataAltmetrics: Listening & Giving Voice to Ideas with Social Media Data
Altmetrics: Listening & Giving Voice to Ideas with Social Media Data
Toronto Metropolitan University
 
Real-Time Web Search: The Road Ahead
Real-Time Web Search: The Road AheadReal-Time Web Search: The Road Ahead
Real-Time Web Search: The Road Ahead
john park
 
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
Matthew Russell
 
A Dream of Predicting Elections and Trading Stocks using Twitter - Yelena Mej...
A Dream of Predicting Elections and Trading Stocks using Twitter - Yelena Mej...A Dream of Predicting Elections and Trading Stocks using Twitter - Yelena Mej...
A Dream of Predicting Elections and Trading Stocks using Twitter - Yelena Mej...
Yandex
 
Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용
Kyunghoon Kim
 

Was ist angesagt? (20)

Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...
Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...
Workshop on Data Collection & Network Analysis with @Netlytic & the iGraph R ...
 
Mining Social Web Data Like a Pro: Four Steps to Success
Mining Social Web Data Like a Pro: Four Steps to SuccessMining Social Web Data Like a Pro: Four Steps to Success
Mining Social Web Data Like a Pro: Four Steps to Success
 
Altmetrics: Listening & Giving Voice to Ideas with Social Media Data
Altmetrics: Listening & Giving Voice to Ideas with Social Media DataAltmetrics: Listening & Giving Voice to Ideas with Social Media Data
Altmetrics: Listening & Giving Voice to Ideas with Social Media Data
 
Why Twitter Is All the Rage: A Data Miner's Perspective
Why Twitter Is All the Rage: A Data Miner's PerspectiveWhy Twitter Is All the Rage: A Data Miner's Perspective
Why Twitter Is All the Rage: A Data Miner's Perspective
 
Research with Social Media Data: Stewardship & Ethical Considerations
Research with Social Media Data: Stewardship & Ethical ConsiderationsResearch with Social Media Data: Stewardship & Ethical Considerations
Research with Social Media Data: Stewardship & Ethical Considerations
 
Opinion mining for social media
Opinion mining for social mediaOpinion mining for social media
Opinion mining for social media
 
Groundhog Day: Near-Duplicate Detection on Twitter
Groundhog Day: Near-Duplicate Detection on Twitter Groundhog Day: Near-Duplicate Detection on Twitter
Groundhog Day: Near-Duplicate Detection on Twitter
 
#ICCSS2015 - Computational Human Security Analytics using "Big Data"
#ICCSS2015 - Computational Human Security Analytics using "Big Data"#ICCSS2015 - Computational Human Security Analytics using "Big Data"
#ICCSS2015 - Computational Human Security Analytics using "Big Data"
 
News construction from microblogging post using open data
News construction from microblogging post using open dataNews construction from microblogging post using open data
News construction from microblogging post using open data
 
Harnessing social signals to enhance a search
Harnessing social signals to enhance a searchHarnessing social signals to enhance a search
Harnessing social signals to enhance a search
 
Real-Time Web Search: The Road Ahead
Real-Time Web Search: The Road AheadReal-Time Web Search: The Road Ahead
Real-Time Web Search: The Road Ahead
 
Identifying Prominent Life Events on Twitter - K-Cap 2015
Identifying Prominent Life Events on Twitter - K-Cap 2015Identifying Prominent Life Events on Twitter - K-Cap 2015
Identifying Prominent Life Events on Twitter - K-Cap 2015
 
Practical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaPractical Opinion Mining for Social Media
Practical Opinion Mining for Social Media
 
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
Why Twitter Is All The Rage: A Data Miner's Perspective (PyTN 2014)
 
Designing and Evaluating Techniques to
 Mitigate Misinformation Spread on 
Mi...
Designing and Evaluating Techniques to
 Mitigate Misinformation Spread on 
Mi...Designing and Evaluating Techniques to
 Mitigate Misinformation Spread on 
Mi...
Designing and Evaluating Techniques to
 Mitigate Misinformation Spread on 
Mi...
 
A Dream of Predicting Elections and Trading Stocks using Twitter - Yelena Mej...
A Dream of Predicting Elections and Trading Stocks using Twitter - Yelena Mej...A Dream of Predicting Elections and Trading Stocks using Twitter - Yelena Mej...
A Dream of Predicting Elections and Trading Stocks using Twitter - Yelena Mej...
 
Fake news detection project
Fake news detection projectFake news detection project
Fake news detection project
 
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTOPredicting what gets ‘Likes’ on Facebook: case study of BlogTO
Predicting what gets ‘Likes’ on Facebook: case study of BlogTO
 
Liminality and Communitas in Social Media: The Case of Twitter
Liminality and Communitas in Social Media: The Case of TwitterLiminality and Communitas in Social Media: The Case of Twitter
Liminality and Communitas in Social Media: The Case of Twitter
 
Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용
 

Andere mochten auch

Exploration of gaps in Bitly's spam detection and relevant countermeasures
Exploration of gaps in Bitly's spam detection and relevant countermeasuresExploration of gaps in Bitly's spam detection and relevant countermeasures
Exploration of gaps in Bitly's spam detection and relevant countermeasures
Cybersecurity Education and Research Centre
 
Clotho : Saving Programs from Malformed Strings and Incorrect
Clotho : Saving Programs from Malformed Strings and IncorrectClotho : Saving Programs from Malformed Strings and Incorrect
Clotho : Saving Programs from Malformed Strings and Incorrect
Cybersecurity Education and Research Centre
 
Automated Methods for Identity Resolution across Online Social Networks
Automated Methods for Identity Resolution across Online Social NetworksAutomated Methods for Identity Resolution across Online Social Networks
Automated Methods for Identity Resolution across Online Social Networks
Cybersecurity Education and Research Centre
 
Video Inpainting detection using inconsistencies in optical Flow
Video Inpainting detection using inconsistencies in optical FlowVideo Inpainting detection using inconsistencies in optical Flow
Video Inpainting detection using inconsistencies in optical Flow
Cybersecurity Education and Research Centre
 
Twitter and Polls: What Do 140 Characters Say About India General Elections 2014
Twitter and Polls: What Do 140 Characters Say About India General Elections 2014Twitter and Polls: What Do 140 Characters Say About India General Elections 2014
Twitter and Polls: What Do 140 Characters Say About India General Elections 2014
Cybersecurity Education and Research Centre
 

Andere mochten auch (7)

Exploration of gaps in Bitly's spam detection and relevant countermeasures
Exploration of gaps in Bitly's spam detection and relevant countermeasuresExploration of gaps in Bitly's spam detection and relevant countermeasures
Exploration of gaps in Bitly's spam detection and relevant countermeasures
 
Clotho : Saving Programs from Malformed Strings and Incorrect
Clotho : Saving Programs from Malformed Strings and IncorrectClotho : Saving Programs from Malformed Strings and Incorrect
Clotho : Saving Programs from Malformed Strings and Incorrect
 
Web Application Security 101
Web Application Security 101Web Application Security 101
Web Application Security 101
 
Automated Methods for Identity Resolution across Online Social Networks
Automated Methods for Identity Resolution across Online Social NetworksAutomated Methods for Identity Resolution across Online Social Networks
Automated Methods for Identity Resolution across Online Social Networks
 
Novel Instruction Set Architecture Based Side Channels in popular SSL/TLS Imp...
Novel Instruction Set Architecture Based Side Channels in popular SSL/TLS Imp...Novel Instruction Set Architecture Based Side Channels in popular SSL/TLS Imp...
Novel Instruction Set Architecture Based Side Channels in popular SSL/TLS Imp...
 
Video Inpainting detection using inconsistencies in optical Flow
Video Inpainting detection using inconsistencies in optical FlowVideo Inpainting detection using inconsistencies in optical Flow
Video Inpainting detection using inconsistencies in optical Flow
 
Twitter and Polls: What Do 140 Characters Say About India General Elections 2014
Twitter and Polls: What Do 140 Characters Say About India General Elections 2014Twitter and Polls: What Do 140 Characters Say About India General Elections 2014
Twitter and Polls: What Do 140 Characters Say About India General Elections 2014
 

Ähnlich wie Broker Bots: Analyzing automated activity during High Impact Events on Twitter

DH 199 Social Media Analytics
DH 199 Social Media AnalyticsDH 199 Social Media Analytics
DH 199 Social Media Analytics
Stephanie Wong
 
[cb22] From Parroting to Echoing: The Evolution of China’s Bots-Driven Info...
[cb22]  From Parroting to Echoing:  The Evolution of China’s Bots-Driven Info...[cb22]  From Parroting to Echoing:  The Evolution of China’s Bots-Driven Info...
[cb22] From Parroting to Echoing: The Evolution of China’s Bots-Driven Info...
CODE BLUE
 

Broker Bots: Analyzing automated activity during High Impact Events on Twitter

  • 1. Broker Bots: Analyzing automated activity during High Impact Events on Twitter • Sudip Mittal • M.Tech Thesis Defense, 6th June 2014
  • 2. Committee • Dr. Ponnurangam Kumaraguru (IIIT-Delhi) (Chair) • Dr. Vinayak Naik (IIIT-Delhi) • Dr. Sundeep Oberoi (TCS)
  • 3. Introduction • Twitter 101 • High impact events • Automated activity/Twitter bot • Research Motivation
  • 4. Problem Statement • Analyze and study automated activity (bots) during High Impact Events on Twitter.
  • 5. Twitter 101 - Users
  • 6. Twitter 101 - Tweet
  • 7. High Impact Events • Events (some examples): • Government policy changes, • Elections, • Natural calamities: earthquakes, tornadoes, etc. • Celebrity gossip. • Criteria for High Impact Events: • Great political and economic impact; • May also have moderate/high damage to life and/or property.
  • 8. Sample Posts - Boston Bombings (2013) • I saw people’s legs blown off. Horrific. Two explosions. Runners were coming in and saw unspeakable horror. – Jackie Bruno (@JackieBrunoNECN) April 15, 2013 • At the ER. Not a comforting way to pass the time. #boston So sad. – GanderHeroDog (@veterantraveler) April 15, 2013 • An eyewitness during the explosions at the #Boston #Marathon says “it sounded like a cannon blast.” Video: on.cnn.com/1399A40 – CNN Video (@CNNVideo) April 15, 2013
  • 9. Automated Activity on Web • Wikipedia bot • Twitter bot • Spam bot • IRC bot • Automated tasks on the web • Bots can perform tasks that are simple and repetitive, at a higher rate than would be possible for a human.
  • 10. Twitter Bots • Post updates to Twitter automatically. • Used in: • Regularly updating users about information, • Spam campaigns, • Directing Twitter users to outside webpages.
  • 11. Research Motivation • Bots are becoming popular on Twitter. • Will impact discussions, information flow, credibility, information security, etc. • People rely on Twitter for important updates and crucial information. • More so during high impact events. • How do bots impact discussion and information diffusion on Twitter during high impact events?
  • 12. Related Work • Work on High Impact Events • Work on Automated Activity
  • 13. Related Work - Work on High Impact Events • Kwak et al. 2010: Twitter is a social network or a news media? • Palen et al. 2008: Analyzed Twitter adoption and use during emergencies. • Zhao et al. 2011: Compared Twitter with traditional media (New York Times). • Mendoza et al. 2010: Use of Twitter during the 2010 earthquake in Chile. • Castillo et al. 2011: Information credibility on Twitter. • Gupta et al. 2012: A mechanism to rank credible information on Twitter. • Longueville et al. 2009: Location based mining to analyze forest fires in France. • Sakaki et al. 2010: Used Twitter to locate epicenter & impact of earthquakes. • Agrawal et al. 2011: Tracked the Mumbai terrorist attack on Twitter. • Verma et al. 2011: Used NLP to extract “situational awareness tweets”. • Vieweg et al. 2010: Use of Twitter during two natural hazards events. • Gupta et al. 2013: Analyzed fake content during the Boston marathon blast.
  • 14. Related Work - Work on Automated Activity • Zhang et al. 2011: Analyzed time trends in bot activity on Twitter. • Chu et al. 2010: Distinguished between automated & human accounts on Twitter. • Roure et al. 2013: Observed bots in the “wild” on Twitter. • Boshmaf et al. 2011: Showed that online social networks are vulnerable to infiltration by bots on a large scale. • Messias et al. 2013: Studied social bots and their influence over the network in which they were active. • Tavares et al. 2013: Temporal analysis to compare activity of bots and humans. • Wald et al. 2013: Studied the users that were susceptible to bots on Twitter.
  • 15. Research Findings • Bots in high impact events spread information obtained from “trusted and verified” sources. • In the Boston marathon blasts, we show that bots push updates to at least 9.53% new users. • We show that bots do not propagate rumors, and even if they do, they do it after some time. • Bots are moving away from a Twitter API based approach to web automation software/systems. • We create a classifier based on user based features with an accuracy of 85.10% in classifying bot and non-bot accounts.
  • 16. Methodology • Event Selection • Event Data Collection • Data Annotation • Data Enrichment
  • 17. Event Selection • Boston Marathon Blast (April 15, 2013): 2 bombs exploded, 3 people killed. Bombing was followed by shootings, manhunt and firefight. • Oklahoma Tornado (May 20, 2013): Tornado struck Moore, 24 people were killed. Damages of about $2 Billion.
  • 18. Event Selection • Navy Yard Shooting (September 16, 2013): Lone gunman shot 12 people and injured 3 at NAVSEA, Washington Navy Yard. • Cyclone Phailin (October 4 - 14, 2013): Cyclonic storm hit Bay of Bengal causing damages of $696 Million and 45 deaths.
  • 19. Event Selection • US Ice-storm 2014 (February 11 - 17, 2014): Fierce storm hit East and South Coast of USA. 22 people were killed, with damages of about $15 Million.
  • 20. Event Data Collection • Data collected using Twitter API endpoints.
  • 21. Data Annotation • Top 200 “tweeters” from each event • 5 Events • 1,000 users manually annotated • Each user annotated by 3 annotators • All annotators were Masters (CS) and PhD (CS) students • Criteria: must have been using Twitter for at least 1 year
  • 22. Data Annotation • Please see the document for annotation instructions. (Cyborgs discouraged, different from bots) • Strict approach: all 3 annotators must have labelled an account as either bot or nonbot. (100% agreement - for quality)
  • 23. Data Enrichment • “Enrich” for further analysis. • We collected their Timeline, Friends and Followers. • “Enriched” by collecting profile meta-data.
  • 24. Analysis • Twitter bot creation methodology • Some characteristics • Bot Friends • Bot Followers • Tweet Analysis • Impact on Information Diffusion • Features and Classification • Detailed analysis of a few bots • Changes (2011 vs 2013)
  • 25. Exploratory Numbers • 377 bots, 5 events • 309 distinct bots (26 active during multiple events) • 115 distinct nonbots • 5 events from 2013, 7 events from 2011 (later)
  • 26. Bot Creation Methodology • Twitter API rules: • Don't post duplicate content. • Don't post content unrelated to the topic. • Don't repeatedly follow or unfollow. • Don't send a large number of @replies. • Don't Retweet aggressively. • 1,000 tweets per day limit: 41 tweets per hour, 0.68 tweets per minute.
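The per-hour and per-minute figures on this slide follow from dividing the daily cap evenly over the day. The short sketch below redoes that arithmetic; the constant name is an illustrative assumption, and the unrounded values come out at about 41.67 tweets per hour and 0.694 tweets per minute, so the slide's 41 and 0.68 appear to be truncated.

```python
# Daily posting cap quoted on the slide (tweets per account per day).
DAILY_TWEET_LIMIT = 1_000  # name is illustrative, not from the Twitter API

# Spread the cap evenly over 24 hours and over 24 * 60 minutes.
tweets_per_hour = DAILY_TWEET_LIMIT / 24
tweets_per_minute = DAILY_TWEET_LIMIT / (24 * 60)

print(f"{tweets_per_hour:.2f} tweets/hour")      # 41.67 tweets/hour
print(f"{tweets_per_minute:.3f} tweets/minute")  # 0.694 tweets/minute
```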
  • 27. Bot Creation Methodology • 4 major creation methodologies: • Popular tweet based: “listen” for popular tweets (via API, Trends, etc.) and repost. • Keyword based: look for tweets containing certain keywords and repost. • Source based: content posted/reposted from particular other Twitter users. • Outside content based: look for updates outside Twitter like RSS feeds, web feeds, blog updates, dedicated databases, etc.
  • 28. Characteristics - Friends and Followers • Data points clustered around the origin for bots, more spaced out for nonbots. (similar to Chu et al.) (Figure panels: Bot, Nonbot)
  • 30. Bot Friends and Data Sources • One way: frequency of the “@mention” used to cite a source or hold conversations on Twitter. (Retweets, @replies, etc.) • Retweets in API: “RT @mention tweet text” or “tweet text via @mention”
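The two retweet conventions named on this slide can be matched with simple patterns to count which accounts a bot cites most often. This is a minimal sketch, not the thesis code: the function names and the sample tweets are hypothetical, and real tweets may need looser patterns.

```python
import re
from collections import Counter

# "RT @mention tweet text": the mention follows a leading "RT".
RT_PREFIX = re.compile(r"^RT @(\w+)")
# "tweet text via @mention": the mention closes the tweet.
VIA_SUFFIX = re.compile(r"\bvia @(\w+)\s*$")

def cited_source(tweet_text):
    """Return the cited screen name in lowercase, or None if no pattern matches."""
    match = RT_PREFIX.search(tweet_text) or VIA_SUFFIX.search(tweet_text)
    return match.group(1).lower() if match else None

def source_frequencies(tweets):
    """Count how often each source account is cited across a list of tweets."""
    return Counter(s for s in map(cited_source, tweets) if s is not None)

# Hypothetical sample timeline, made up for illustration only.
sample = [
    "RT @pspoole: Dzhokhar Tsarnaev received US citizenship on Sept 11, 2012",
    "Explosions reported near the finish line via @CNNVideo",
    "Stay safe everyone #boston",
    "RT @CNNVideo: An eyewitness says it sounded like a cannon blast",
]
print(source_frequencies(sample).most_common())
```

Ranking the resulting counts gives a rough view of a bot's preferred data sources, which can then be checked against verified accounts.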
  • 31. Bot Friends and Data Sources • Bots in high impact events spread information obtained from “trusted and verified” sources.
  • 32. Bot Followers - Profile Description • Total number of bot followers collected: 623,198.
  • 33. Bot Followers - Network (Boston Marathon Blasts) • Total number of bot followers collected: 623,198. • Network created using Gephi (an open graph visualization platform). • RAM limits made it impossible to create the whole network. • We created 2 random subsets of 40,000 and 20,000 users. • We created a network for each subset.
  • 34. Bot Followers - Network (Boston Marathon Blasts) • Average graph degree: 1.01545
  • 35. Bot Followers - Network (Boston Marathon Blasts)
  • 36. Bot Followers - Network (Boston Marathon Blasts) • Most users follow 1 bot account; very few follow more than 1. • Following more than one bot account would “flood” their Timeline, because bots tweet a lot!
  • 37. Impact on Information Diffusion during High Impact Events • Focused on Boston blasts dataset. • In the Boston marathon blasts, we show that bots push updates to at least 9.53% new users. • These bots can be seen as “brokering” information to other users.
  • 38. Role in Rumor Propagation (Boston Blasts) • High impact events: spread of misinformation, non-credible information, fake images, hoaxes, malicious content, etc. • Castillo et al. & Gupta et al.: Information credibility on Twitter. • Gupta et al. (separate paper): Analyzed spread of fake images during Boston bombings, working on the exact same dataset. • We look at some rumors in the Boston event and correlate their timelines with the 97 annotated bots from our dataset.
  • 39. Role in Rumor Propagation (Boston Blasts) • Rumor 1: Suspect became citizen on 9/11. • “RT @pspoole: Dzhokhar Tsarnaev received US citizenship on Sept 11, 2012 – 11 years to the day after 9/11 attacks http://t.co/kHLL7mkjnn” • 1st tweet was by @pspoole on Friday, April 19, 2013, at 15:34:56 (+0000). • Only 2 bots retweeted this rumor, on Friday, April 19, 2013, at 15:41:35 and 15:47:58 (+0000).
  • 40. Role in Rumor Propagation (Boston Blasts) • Rumor 2: Sandy Hook child killed in bombing. • “RT @CommonGrandma: She ran for the Sandy Hook children and was 8 years old. #prayforboston http://t.co/cLir6nI7tB” • 1st tweet on Friday, April 19, 2013, at 09:56:45 (+0000). • This rumor was never picked up by any of the 97 bots in our dataset.
  • 41. Role in Rumor Propagation (Boston Blasts) • Rumor 3: Donating $1 tweet. • A tweet by a fake account, @_BostonMarathon: “For each RT this gets, $1 will be donated to the victims of the Boston Marathon Explosions. #DonateToBoston” • Tweet time: Monday, April 15, 2013, at 11:29:23 (+0000). • Only 1 bot in our dataset picked up this tweet, and only on Wednesday, April 17, 2013, at 00:50:24 (+0000). • Gupta et al. reported 28,350 Retweets, working on the same dataset. • Follow bots and you will be safe! We show that bots do not propagate rumors, and even if they do, they do it after some time. • Bot source: verified accounts.
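The claim that bots pick up rumors "after some time" can be made concrete by subtracting the first-tweet timestamp from each bot retweet timestamp. The sketch below uses only the UTC times quoted for Rumor 1 and Rumor 3; the helper names are illustrative assumptions.

```python
from datetime import datetime, timezone

def utc(year, month, day, hour, minute, second):
    """Build a timezone-aware UTC timestamp."""
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)

# (first tweet time, bot retweet times), as quoted on the rumor slides.
rumors = {
    "Rumor 1": (utc(2013, 4, 19, 15, 34, 56),
                [utc(2013, 4, 19, 15, 41, 35), utc(2013, 4, 19, 15, 47, 58)]),
    "Rumor 3": (utc(2013, 4, 15, 11, 29, 23),
                [utc(2013, 4, 17, 0, 50, 24)]),
}

def bot_delays(first_tweet, bot_retweets):
    """Return the lag of each bot retweet behind the first tweet."""
    return [retweet - first_tweet for retweet in bot_retweets]

for name, (first, retweets) in rumors.items():
    for delay in bot_delays(first, retweets):
        print(name, delay)
```

With these inputs the lags are 6 minutes 39 seconds and 13 minutes 2 seconds for Rumor 1, and 1 day 13 hours 21 minutes 1 second for Rumor 3.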
  • 42. Tweet Analysis - URL Analysis • 44,071 tweets by bots. • 7,099 tweets by nonbots. • 36,672 (83%) bot tweets and 4,849 (68%) nonbot tweets have URLs. • 188 bot and 27 nonbot URLs were marked “malicious” by the “Google Safe Browsing API”. • Bots don't spread malicious content.
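The percentages on this slide can be reproduced from the raw counts. The sketch below also estimates the flagged share, under the loudly stated assumption that each URL-bearing tweet carries roughly one URL; the slide counts flagged URLs, not flagged tweets, so that second ratio is only an approximation.

```python
# Counts quoted on the slide.
counts = {
    "bot": {"tweets": 44_071, "with_url": 36_672, "flagged": 188},
    "nonbot": {"tweets": 7_099, "with_url": 4_849, "flagged": 27},
}

def url_share(group):
    """Percentage of a group's tweets that contain a URL."""
    c = counts[group]
    return 100 * c["with_url"] / c["tweets"]

def flagged_share(group):
    """Approximate percentage of URLs flagged as malicious.

    Assumes about one URL per URL-bearing tweet (an assumption,
    not something stated in the source).
    """
    c = counts[group]
    return 100 * c["flagged"] / c["with_url"]

for group in counts:
    print(f"{group}: {url_share(group):.1f}% with URLs, "
          f"{flagged_share(group):.2f}% flagged")
```

Under that assumption, well under 1% of URLs are flagged in both groups, which is in line with the slide's conclusion.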
  • 43. Tweet Analysis - Source Analysis • Twitter API gives the “source” of a tweet.
  • 44. Tweet Analysis - Source Analysis • Use of automation services. • IFTTT (“IF This Then That”) and dlvr.it • Makes creating bots easy. • Triggered via activity outside/inside Twitter and posted on Twitter.
  • 45. Tweet Analysis - Time Analysis • Zhang et al.: automated accounts will exhibit timing patterns that are not found in non-automated users. • Automated activity is invoked by job schedulers that execute tasks at specific times or intervals. • 1,000 tweets per day limit: 41 tweets per hour, 0.68 tweets per minute. (Figure panels: Bot, Nonbot)
  • 46. Tweet Analysis - Time Analysis • During high impact event. (Figure panels: Bot, Nonbot)
  • 47. Tweet Analysis - Time Analysis • During high impact event: inter-tweet time mean vs s.d. (Figure panels: Bot, Nonbot)
  • 48. Tweet Analysis - Time Analysis • During high impact event: average tweet time. (Figure panels: Bot, Nonbot) • Temporal features, as proposed by Zhang et al., do not help us differentiate bots and nonbots during high impact events.
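The inter-tweet time mean and standard deviation plotted in the time analysis slides can be computed per account from its tweet timestamps. This is a minimal sketch with made-up timestamps; the function name is hypothetical, and the population standard deviation is used here although the thesis does not say which variant it used.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev

def inter_tweet_stats(timestamps):
    """Return (mean, standard deviation) of gaps between tweets, in seconds."""
    ordered = sorted(timestamps)
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two tweets to measure a gap")
    return mean(gaps), pstdev(gaps)

start = datetime(2013, 4, 15, 19, 0, 0)

# Hypothetical scheduler-driven account: one tweet every 10 minutes.
scheduled = [start + timedelta(minutes=10 * i) for i in range(4)]
# Hypothetical bursty account: irregular gaps.
bursty = [start + timedelta(seconds=s) for s in (0, 30, 1800, 1830)]

print(inter_tweet_stats(scheduled))  # regular gaps give zero deviation
print(inter_tweet_stats(bursty))     # irregular gaps give a large deviation
```

A scheduler-driven account would sit near zero deviation in such a plot; the slides report that during high impact events this separation does not hold up well.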
  • 49. Features • Chu et al.’s Twitter features: • Tweet Count: Bots > Humans. • Long term hibernation: Bots show long periods of no activity. • Ratio of Friends vs Followers: Bots have more friends; humans are balanced. • Temporal tweeting pattern: Bots are more active during specific days of the week. • Account Creation Date: Bots are more “recent”. • Tweet source: Bots use API; humans use mobiles, web, etc. • Presence of URLs: Bots have more. • Time Trends: Zhang et al.’s argument on Twitter API.
  • 50. Features • Temporal based features: Time trends, hibernation, temporal patterns. • User based features: Tweet count, ratio of friends vs followers, account creation date, tweet source, presence of URLs.
  • 51. Features • We argue that some features are not informative during high impact events. • Long term hibernation: Bots are active and participate. • Temporal tweeting pattern: An event can occur on any day of the week. • Presence of URLs: 83% of bot tweets and 68% of nonbot tweets had URLs. Small difference. • Time Trends: We compared them in a previous section. (Figure panels: Bot, Nonbot)
  • 52. Classification • Decision Trees (J48 using WEKA) to classify whether an account is a bot or not. • 2 feature sets: one with only user based features (F1), and one with user + temporal features (F2). • Results show that user based features are better at predicting bots. • Temporal based features are not so helpful. • A similar result was obtained by Chu et al. • We get these results when we remove time based features.
  • 53. Classification • Best information gain features: • Tweet source. • Presence of URLs. • Ratio of Followers vs. Friends. • Account creation date. • Tweet count. • Similar results were obtained by Chu et al.
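Ranking features by gain, as on this slide, rests on measuring how much a feature reduces label entropy. The sketch below computes plain information gain for categorical features on a tiny made-up dataset; it is not the WEKA J48 implementation (C4.5-style trees build on this quantity with further normalization), and every value in the toy data is hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting labels on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy accounts, for illustration only.
labels = ["bot", "bot", "bot", "nonbot", "nonbot", "nonbot"]
tweet_source = ["api", "api", "api", "web", "web", "mobile"]
has_url = [True, True, False, True, True, False]

print(information_gain(tweet_source, labels))  # source separates the classes
print(information_gain(has_url, labels))       # URL presence does not, here
```

In this toy data the tweet source splits bots from nonbots perfectly, while URL presence carries no information; real rankings depend entirely on the annotated dataset.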
  • 54. Detailed Analysis of few Bots • How do bots behave under normal conditions? • We took 5 bots and monitored them from 5 March 2014 to 9 April 2014. • We took daily snapshots of their Timeline, Friends, Followers, Mentions, Retweets and @replies. (except Direct Messages - unavailable through API)
  • 55. Detailed Analysis of few Bots
  • 56. Detailed Analysis of few Bots - @Warframe_BOT • Changed itself after the event. • From news aggregator to game updates. • All previous tweets deleted!
  • 57. Detailed Analysis of few Bots - @FintechBot & @DTNUSA • Spread financial and general news. • Users tend to Retweet and sometimes @reply. • These bots never reply back. They do not hold conversations. (Frequently observed pattern)
  • 58. Detailed Analysis of few Bots - @BBCWeatherBot & @tipdoge • Engage in interaction with users. • Computer programs that require certain input in a particular format. • @BBCWeatherBot: requires place and day of the week. Used to search weather info in events. • @tipdoge: different commands like “balance”, “tip”, etc. Used to donate money for victims.
  • 59. Detailed Analysis of few Bots - @BBCWeatherBot & @tipdoge
  • 60. Detailed Analysis of few Bots - @BBCWeatherBot & @tipdoge pattern in tweets • Average Levenshtein distance for @BBCWeatherBot: 0.678 • Highly repetitive tweets with high similarity. • Programs that have a limited output.
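A score between 0 and 1 such as the 0.678 on this slide suggests a normalized form of Levenshtein distance, but the slides do not state which normalization was used. The sketch below shows one common choice (1 minus distance divided by the longer length) averaged over all tweet pairs; the function names and sample tweets are hypothetical.

```python
from itertools import combinations

def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def normalized_similarity(a, b):
    """1.0 for identical strings, 0.0 for completely different ones."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest

def average_pairwise_similarity(tweets):
    """Mean normalized similarity over all unordered pairs of tweets."""
    pairs = list(combinations(tweets, 2))
    return sum(normalized_similarity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical templated replies, made up for illustration.
tweets = [
    "Weather for London on Monday: light rain",
    "Weather for Boston on Monday: light snow",
    "Weather for Delhi on Friday: sunny",
]
print(round(average_pairwise_similarity(tweets), 3))
```

Templated output from a program with limited responses tends to score high on such a measure, which matches the slide's reading of the bot as highly repetitive.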
  • 61. Changes 2011 vs 2013 • How did bot participation in high impact events change with time? • Used the 2011 crisis dataset. (Gupta et al.) • Similar annotation scheme (previously discussed). • Note: Data annotation was done in March 2014. Behavior in 2014 does not necessarily reflect bot/nonbot activity in 2011.
  • 62. Changes 2011 vs 2013 • Annotation results: • Major difference was observed in tweet source. Comparing it to 2013 reveals new methods used by bots to post data to Twitter.
  • 63. Take Aways • Key Points • Future Work • Limitations
  • 64. Take Aways • We show that bots in high impact events spread information from “trusted and verified” sources. • We show that bots aid in information distribution. In the Boston marathon blasts, bots push updates to at least 9.53% new users. • We show that bots do not spread rumors, and even if they do, they do it after some time.
  • 65. Take Aways • Bots are moving away from a Twitter API based approach to web automation software. • We analyze the network of bots and bot followers. • We show that temporal based features don’t add value in differentiating bot and non-bot accounts on Twitter during high impact events.
  • 66. Take Aways • We create a classifier based on “user based features” with an accuracy of 85.10%. • We analyze 5 bots and give insights on their working and behavior. • We analyze growth and changes in bot activity in 2011 vs 2013.
  • 67. Future Work • Analyze more events. • Check and verify these results for other high impact events. • Analyze bot impact on malicious and fake content on Twitter. • Develop a tool to differentiate between bots and nonbots.
  • 68. Limitations • Annotations were done after a few months. • We can't claim that the collected data is representative.
  • 70. References • Boshmaf, Y., Muslukhov, I., Beznosov, K., and Ripeanu, M. The socialbot network: when bots socialize for fame and money. In Proceedings of the 27th Annual Computer Security Applications Conference (2011), ACM, pp. 93–102. • Castillo, C., Mendoza, M., and Poblete, B. Information credibility on twitter. In Proceedings of the 20th international conference on World wide web (New York, NY, USA, 2011), WWW ’11, ACM, pp. 675–684. • Chu, Z., Gianvecchio, S., Wang, H., and Jajodia, S. Who is tweeting on twitter: Human, bot, or cyborg? In Proceedings of the 26th Annual Computer Security Applications Conference (New York, NY, USA, 2010), ACSAC ’10, ACM, pp. 21–30. • De Longueville, B., Smith, R. S., and Luraschi, G. “omg, from here, i can see the flames!”: a use case of mining location based social networks to acquire spatio-temporal data on forest fires. In Proceedings of the 2009 International Workshop on Location Based Social Networks (New York, NY, USA, 2009), LBSN ’09, ACM, pp. 73–80. • De Roure, D., Hooper, C., Meredith-Lobay, M., Page, K., Tarte, S., Cruickshank, D., and De Roure, C. Observing social machines part 1: what to observe? In Proceedings of the 22nd international conference on World Wide Web companion (2013), International World Wide Web Conferences Steering Committee, pp. 901–904. • Gupta, A., and Kumaraguru, P. Credibility ranking of tweets during high impact events. In Proceedings of the 1st Workshop on Privacy and Security in Online Social Media (New York, NY, USA, 2012), PSOSM ’12, ACM, pp. 2:2–2:8.
  • 71. References • Gupta, A., Lamba, H., and Kumaraguru, P. $1.00 per rt #bostonmarathon #prayforboston: Analyzing fake content on twitter. In Eighth IEEE APWG eCrime Research Summit (eCRS) (2013), IEEE, p. 12. • Kwak, H., Lee, C., Park, H., and Moon, S. What is twitter, a social network or a news media? In Proceedings of the 19th international conference on World wide web (New York, NY, USA, 2010), WWW ’10, ACM, pp. 591–600. • Mendoza, M., Poblete, B., and Castillo, C. Twitter under crisis: can we trust what we rt? In Proceedings of the First Workshop on Social Media Analytics (New York, NY, USA, 2010), SOMA ’10, ACM, pp. 71–79. • Messias, J., Schmidt, L., Oliveira, R., and Benevenuto, F. You followed my bot! transforming robots into influential users in twitter. First Monday 18, 7 (2013). • Oh, O., Agrawal, M., and Rao, H. R. Information control and terrorism: Tracking the mumbai terrorist attack through twitter. Information Systems Frontiers 13, 1 (Mar. 2011), 33–43. • Palen, L., and Vieweg, S. The emergence of online widescale interaction in unexpected events: assistance, alliance & retreat. In Proceedings of the 2008 ACM conference on Computer supported cooperative work (New York, NY, USA, 2008), CSCW ’08, ACM, pp. 117–126.
  • 72. References • Sakaki, T., Okazaki, M., and Matsuo, Y. Earthquake shakes twitter users: real-time event detection by social sensors. In Proceedings of the 19th international conference on World wide web (New York, NY, USA, 2010), WWW ’10, ACM, pp. 851–860. • Tavares, G., and Faisal, A. Scaling-laws of human broadcast communication enable distinction between human, corporate and robot twitter users. PloS one 8, 7 (2013), e65774. • Verma, S., Vieweg, S., Corvey, W., Palen, L., Martin, J. H., Palmer, M., Schram, A., and Anderson, K. M. Natural language processing to the rescue? extracting “situational awareness” tweets during mass emergency. In ICWSM (2011), L. A. Adamic, R. A. Baeza-Yates, and S. Counts, Eds., The AAAI Press. • Vieweg, S., Hughes, A. L., Starbird, K., and Palen, L. Microblogging during two natural hazards events: what twitter may contribute to situational awareness. In Proceedings of the 28th international conference on Human factors in computing systems (New York, NY, USA, 2010), CHI ’10, ACM, pp. 1079–1088. • Wald, R., Khoshgoftaar, T., Napolitano, A., and Sumner, C. Predicting susceptibility to social bots on twitter. In Information Reuse and Integration (IRI), 2013 IEEE 14th International Conference on (2013), pp. 6–13. • Zhang, C. M., and Paxson, V. Detecting and analyzing automated activity on twitter. In Proceedings of the 12th International Conference on Passive and Active Measurement (Berlin, Heidelberg, 2011), PAM’11, Springer-Verlag, pp. 102–111.
  • 73. Acknowledgement • IIITD • Dr. PK • AG • PJ & PJ • AA • NS • AB • MG • SG • To all my annotators
  • 74. Thank You • sudipmittal@gmail.com | sudipmittal@umbc.edu | sudip09068@iiitd.ac.in • Bye bye IIITD!!! 5 years were fun!!! Thank you for everything you have taught me!! • After July 2014, you can find me at UMBC ebiquity.