The Spammer, the Botmaster, and the Researcher: On the Arms Race in Spamming Botnet Mitigation

Gianluca Stringhini
Major Area Exam
December 5, 2011
What is spam?


Spam is a big problem
Everyone receives spam
90-95% of emails are spam

Organic vs. junk food
Spam vs. ham
We need a definition a computer can understand:
Unsolicited Bulk Email
Early days spam


Spam as a hobby
Businesses run from a home basement

CAN-SPAM Act (2003)
Does not forbid spamming, but the spammer
has to be nice.
Fine of up to $16k per violating email

The world is big
Not every country prosecutes spammers
Modern spam
Affiliate programs [Samosseiko 2009]
Are banks the weak link? [Levchenko 2011]
[Figure omitted. Source: Levchenko et al., Click Trajectories: End-to-End Analysis of the Spam Value Chain]
Is Spam Profitable?


Yes, it is
Estimates between $300k and $1M a month for large affiliate
programs [Kanich 2008, Kanich 2011]

Relatively low risk
   Small fish are the ones who get caught
   Geographic dispersion makes coordinated action difficult
How is Spam Delivered?
Botnets
Botnets are networks of compromised computers that act under the
control of a single entity (Botmaster)

What are botnets used for?
  Running DoS
  Stealing Information
  Solving Captchas
  Sending spam
Botnets are responsible for 85% of worldwide spam
Why botnets?
Botnets combine the best of two worlds: worms and IRC bots
Researchers and Botmasters are involved in an arms race
Botnet Evolution
Botnet Evolution - Structure
[Figure omitted: SDBot, 2002]
Botnet Evolution - Structure



IRC botnets
The C&C is an IRC server
Bots join a channel and get orders

Problems
  Researchers can join the channel too
  DNS sinkholing is possible
Botnet Evolution - Structure
[Figure omitted: MyDoom, 2004]
Botnet Evolution - Structure


Proprietary protocol botnets
The C&C uses a proprietary encrypted protocol
Two architectures:
  Pull architecture
  Push architecture

Problems
  Researchers can reverse engineer the protocol
  DNS sinkholing is still possible
Botnet Evolution - Structure
[Figure omitted: Lethic, 2007]
Botnet Evolution - Structure


Multi-tier botnets
The bots don’t connect directly to the C&C; they go through proxies
The domains used by the proxies use Fast Flux

Fast Flux
Technique similar to round-robin DNS and CDNs
Gives high reliability to the botnet backbone
  Many IP addresses associated with a domain
  Low TTL, so the records change all the time
Botnet Evolution - Structure

Problem
The domains used can still be sinkholed / blacklisted

The solution
Domain Generation Algorithms
Bots contact a domain according to a time-dependent algorithm
Used by Torpig (2008)

Problems
The algorithm can be reverse engineered [StoneGross 2009a]
Botmasters can add non-determinism (e.g., Twitter trends)
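
As a rough illustration of the time-dependent idea above, the sketch below derives candidate domains from a date with a hash. It is a toy, not Torpig's actual algorithm: the function name `candidate_domains`, the seed string, the SHA-256 construction, the label length, and the `.example` suffix are all assumptions made for illustration. Because the output depends only on the date, anyone who has recovered the algorithm can compute the same list ahead of time, which is what makes sinkholing possible.

```python
import hashlib
from datetime import date


def candidate_domains(day: date, count: int = 3, seed: str = "demo-seed") -> list[str]:
    """Return `count` deterministic domain names for the given day (toy example)."""
    domains = []
    for index in range(count):
        # The only inputs are a fixed seed, the date, and a counter.
        material = f"{seed}|{day.isoformat()}|{index}".encode("ascii")
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits to the letters a..p to form a label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(f"{label}.example")
    return domains
```

Calling `candidate_domains(date(2011, 12, 5))` twice yields the same list, while a different date yields a different one. Mixing in an input that cannot be known in advance (the slide mentions Twitter trends) removes that predictability.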
Botnet Evolution - Structure
[Figure omitted: Storm, 2007]
Botnet Evolution - Structure



Peer-to-peer botnets
Bots with private IPs act as workers
Bots with public IPs act as proxies
Workers find proxies through an Overnet-based protocol

Problem
Proxies are not under the control of the botmaster
Researchers can impersonate a proxy and infiltrate the botnet
Botnet Evolution - Infection model

Worm-like spread
The bot scans the network for vulnerabilities and propagates

Non-spreading bots
Infections are propagated through
  Drive-by-download websites [Provos 2008, StoneGross 2011]
  Email attachments

Pay-per-Install
The new trend is paying third parties for “installing” a certain number
of bots [Caballero 2011]
Botnet and Spam Mitigation
Many Possible Vantage Points
Host-based detection

Traditional anti-virus approach
Look for virus-specific instruction sequences in the binaries
Antivirus scanners can be fooled by simple obfuscations
[Christodorescu 2003, Christodorescu 2004]



Obfuscations
NOP insertion and code transposition are usually enough
  Metamorphic malware
  Polymorphic malware
Host-based detection
Static analysis
Take program semantics into account [Christodorescu 2003,
Christodorescu 2005]

Dynamic analysis
Model the behavior of a program (e.g., using system calls)
[Kolbitsch 2009]
Monitor access to sensitive information [Yin 2007]
Reverse engineer the C&C protocol [Caballero 2009]

Problems
Program equivalence is undecidable!
Analysis of samples takes time and resources
Malicious Web Pages Detection
Infections through browser exploits are a big problem
Detecting Drive-by-Download pages
Malicious JavaScript can be detected by:
  Emulation [Cova 2010]
  Monitoring system changes [Provos 2008]
  Hooking the runtime [Curtsinger 2011, Heiderich 2011]
  Looking for common attack patterns (e.g., heap spraying)
  [Ratanaworabhan 2009]

Problems
  The analysis could be detected
  These systems might not detect newer attacks
Command and Control-based Detection


IRC server infiltration [AbuRajab 2006]
Protocol Reverse Engineering
Protocol reverse engineering by active probing [Cho 2010a]
This enables botnet infiltration [Stock 2009, Kreibich 2009,
Cho 2010b]

Botnet Takeovers
Reverse engineering of DGAs [StoneGross 2009a]
This enables C&C impersonation [StoneGross 2009a]
Honeypots
Running bots in virtual machines makes it possible to learn important botnet
features [John 2009]

This can be used for
  Blacklisting the domains that host C&C servers
  [StoneGross 2009b]
  Performing botnet takedowns [StoneGross 2011]


Problems
  Bots might detect virtualization [Balzarotti 2010]
  Containment problems arise [Kreibich 2011]
DNS Based Detection

Detecting infected IPs
DNS sinkhole [Dagon 2006]
Look for cached DNS results [AbuRajab 2006]

Detect Fast-Flux Domains
Fast-flux domains have very different characteristics from
legitimate ones [Holz 2008, Passerini 2008, Hu 2009]
  IPs belong to different networks
  TTL is low
  Results change very frequently
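
The three characteristics listed above can be turned into a simple heuristic check. The sketch below is only an illustration under stated assumptions, not the method of any of the cited papers: the `DnsAnswer` type, the use of /16 prefixes as a stand-in for "different networks", and all thresholds are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class DnsAnswer:
    """One observed A-record response for a domain."""
    ips: frozenset[str]
    ttl: int  # seconds


def network_prefix(ip: str) -> int:
    """Return the /16 prefix of an IPv4 address as an integer."""
    return int(ip_address(ip)) >> 16


def looks_fast_flux(answers: list[DnsAnswer],
                    min_networks: int = 3,
                    max_ttl: int = 300,
                    min_distinct_answers: int = 2) -> bool:
    """Apply the three heuristics with illustrative thresholds."""
    if not answers:
        return False
    all_ips = set().union(*(a.ips for a in answers))
    # Heuristic 1: the IPs are spread over several networks.
    networks = {network_prefix(ip) for ip in all_ips}
    # Heuristic 2: the TTL is low.
    lowest_ttl = min(a.ttl for a in answers)
    # Heuristic 3: the returned set of IPs changes between queries.
    distinct_answers = len({a.ips for a in answers})
    return (len(networks) >= min_networks
            and lowest_ttl <= max_ttl
            and distinct_answers >= min_distinct_answers)
```

Legitimate CDNs can also show low TTLs and changing answers, so a real classifier would need more features than these three.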
DNS Based Detection


Detecting Malicious Domains
It is possible to build classifiers to detect malicious domains
  Passive analysis of queries at recursive DNS (RDNS) servers [Antonakakis 2010,
  Bilge 2011]
  Limitation: only a local view
  Analysis at the authoritative server or TLD level
  [Antonakakis 2011]
  Limitation: it can be evaded by using diverse DNS servers
SMTP based Detection
SMTP based Detection: Content Analysis
Rule-based Spam Detection
  The nature of spam changes over time
  Having a binary decision introduces problems.

Machine Learning
  Bayesian Filtering: uses naïve Bayes [Sahami 1998,
  Androutsopoulos 2000]
  Support Vector Machines [Drucker 1999]
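
To make the Bayesian filtering idea concrete, here is a minimal sketch of a multinomial naïve Bayes classifier over word counts with add-one smoothing. It is a generic textbook formulation, not the exact model of the cited papers; the class name `NaiveBayesFilter`, the whitespace tokenizer, and the training messages in the usage note are assumptions. It expects at least one training message per class.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Multinomial naive Bayes over word counts with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Record the words of one labeled message ('spam' or 'ham')."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def log_score(self, text: str, label: str) -> float:
        """Return log P(label) plus the sum of log P(word | label)."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        score = math.log(self.message_counts[label] / total_messages)
        for word in text.lower().split():
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        return score

    def classify(self, text: str) -> str:
        """Pick the label with the higher log score."""
        spam = self.log_score(text, "spam")
        ham = self.log_score(text, "ham")
        return "spam" if spam > ham else "ham"
```

After training on two spam messages ("cheap pills buy now", "buy cheap watches now") and two ham messages ("meeting agenda for monday", "lunch on monday"), `classify("buy cheap pills")` returns "spam" and `classify("monday meeting")` returns "ham". Since every word contributes independently to the score, padding a spam message with words that are common in ham pushes the result toward "ham".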

Problems
  Feature selection has to be performed
  “Good word” attacks are possible [Lowd 2005, Karlberger 2007]
SMTP based Detection: Content Analysis
Assign a Reputation to Received Emails
Different features between spam and ham [Hao 2009]

Building Signatures from Spam
[Pitsillidis 2010] ran bots and assigned templates to different botnets

Detect Spam by Looking at URLs
  Study the URL structure [Xie 2008, Ma 2009]
  Learning features from the landing page [Thomas 2011]

Problem
  In general, content analysis is expensive
SMTP based Detection: IP Blacklisting
DNS-based blacklists
Mail servers can query the service to learn whether an individual IP
address is a known spam source
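
As a sketch of how such a query is formed: by convention, the client reverses the octets of the IPv4 address and appends the blacklist's DNS zone, then looks up the resulting name. The helper name `dnsbl_query_name` and the zone `dnsbl.example.org` are placeholders, not a real service.

```python
from ipaddress import IPv4Address


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the name to look up: reversed octets followed by the list's zone."""
    # IPv4Address validates the input and raises ValueError on bad addresses.
    octets = str(IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example.org")` gives `99.2.0.192.dnsbl.example.org`. A mail server would then resolve that name: an A record in 127.0.0.0/8 conventionally means the address is listed, and a nonexistent-name response means it is not.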

Problems
  Low coverage [Ramachandran 2006a, Sinha 2008]
  Bot machines have dynamic IPs
  What happens when IPv6 takes over?

Better Approaches
  IP reputation [Ramachandran 2006b, Sinha 2010, Qian 2010]
  Behavioral blacklisting [Ramachandran 2007, Stringhini 2011]
SMTP based Detection: Policies
Greylisting
If a delivery fails temporarily, spambots will not try again
Easy to bypass and prone to false positives [Levine 2005]
Multi-level greylisting [Janecek 2008]
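
A minimal sketch of the basic greylisting mechanism, assuming the common design in which the server keys on the (client IP, envelope sender, recipient) triplet and answers the first attempt with a temporary failure. The class name, the 300-second delay, and the string return values are assumptions for illustration; this is not the multi-level scheme of [Janecek 2008].

```python
class Greylist:
    """Temporarily reject the first attempt from each unseen triplet."""

    def __init__(self, min_delay: float = 300.0) -> None:
        self.min_delay = min_delay
        # Maps a triplet to the time of its first delivery attempt.
        self.first_seen: dict[tuple[str, str, str], float] = {}

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        """Return 'defer' (temporary SMTP failure) or 'accept' for an attempt."""
        triplet = (client_ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.min_delay:
            return "defer"
        return "accept"
```

A legitimate mail server queues the message and retries later, so its second attempt after the delay is accepted. A bot that never retries is filtered out, while one that does retry gets through, which is why the technique is easy to bypass.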

Sender Validation
Spam pretends to come from legitimate addresses
SPF, DomainKeys, DKIM [Leiba 2007]

The solution chosen by Google
User voting on spam and ham [Taylor 2006]

Main problem: spam hurts server performance!
Mail prioritization systems [Twining 2004, Venkataraman 2007]
Social Network Detection
Online Social Networks are very successful
Users are not as risk aware as they are with email spam

Miscreants create fake profiles to spread spam
Systems to detect fake profiles have been developed
[Benvenuto 2010, Lee 2010, Stringhini 2010, Yang 2011a,
Yang 2011b]

Real accounts that get compromised are more valuable
45% of social network users click on any link posted by their friends
[Bilge 2009]
89% of profiles sending malicious content on Facebook are
compromised [Gao 2010]
Network Edge Detection
Intrusion Detection
Signature-based intrusion detection
Snort, Bro [Paxson 1998]

Problems
  Constant need for new rules
  Problems with encrypted traffic

Anomaly-based intrusion detection
The system learns the “normal” behavior of a network and flags
anomalies [Portnoy 2001, Kruegel 2002, Wang 2004]
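
As a deliberately simple illustration of the learn-then-flag idea, the sketch below learns the mean and standard deviation of one traffic feature from a baseline and flags observations that deviate strongly from it. The cited systems use much richer models; the single feature, the function name `flag_anomalies`, and the threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(baseline: list[float], observed: list[float],
                   threshold: float = 3.0) -> list[int]:
    """Return indices of observed values far from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline values
    if sigma == 0:
        # A constant baseline: anything different counts as anomalous.
        return [i for i, value in enumerate(observed) if value != mu]
    return [i for i, value in enumerate(observed)
            if abs(value - mu) / sigma > threshold]
```

The result is only as good as the baseline: if the training traffic already contains infected hosts, their behavior becomes part of what the model treats as normal.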

Problems
  What is “normal” behavior?
  It is hard to get traffic that is free of infections
Network Edge Detection
Detecting Successful Infections
A botnet infection can be modeled as a set of communication flows [Gu 2007]
Problem: what’s the infection model of a botnet?

Detecting Malicious Activity
Correlation between C&C commands and malicious activity
[Gu 2008a]
How to identify C&C traffic?
  Well-known protocols (e.g., IRC, HTTP) [Gu 2008b]
  Look for malicious activity first [Wurzinger 2010]

Leverage Previous Knowledge
Detect hosts that contact the same IPs as infected machines
[Coskun 2010]
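
The idea of leveraging known infections can be sketched as a set-overlap computation: collect the destinations contacted by hosts already known to be infected, then count how many of those destinations each other host also contacts. This is a simplified illustration, not the algorithm of [Coskun 2010]; the function name, the data layout, and the `min_shared` threshold are assumptions.

```python
def suspicious_hosts(contacts: dict[str, set[str]],
                     infected: set[str],
                     min_shared: int = 2) -> dict[str, int]:
    """Count destinations each unlabeled host shares with known-infected hosts."""
    infected_destinations: set[str] = set()
    for host in infected:
        infected_destinations |= contacts.get(host, set())
    shared = {}
    for host, destinations in contacts.items():
        if host in infected:
            continue
        overlap = len(destinations & infected_destinations)
        if overlap >= min_shared:
            shared[host] = overlap
    return shared
```

A practical system would also have to discount popular destinations such as large web sites, which infected and clean hosts contact alike.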
Conclusions
How About the Future?
The arms race between researchers and cybercriminals is far from
being over

Is security research like fighting the Hydra?
Future Directions


Botmasters will keep developing more sophisticated techniques


However, a functional botnet has to interact with legitimate services
  DNS servers
  SMTP servers
  Web servers
  Social Networks
This interaction cannot be obfuscated!
My Research

In my research, I focus on analyzing how bots interact with
legitimate, third party services


Bots can be distinguished from real users in the way they use such
services


The main reason is that bots have a different goal than real users:

            Fast interaction vs. Good user experience
My Research
So far, I have been looking at:
Social Networks
  How fake accounts differ from legitimate ones [ACSAC 2010]
  How users’ behavior changes once an account is compromised
  [In submission]

SMTP servers
Distinguishing bots:
  based on the destinations they target [USENIX 2011]
  based on the (wrong) way in which they implement SMTP
  [Work in progress]
My Research



Other interesting areas:
  Login patterns on Social Networks
  Interaction with search engines (e.g., SEO)

What if bots started behaving like legitimate users / programs?
This conflicts with their goal!
Thanks!


email: gianluca@cs.ucsb.edu
twitter: @gianlucaSB

The Spammer, the Botmaster, and the Researcher: On the Arms Race in Spamming Botnet Mitigation

  • 1. The Spammer, the Botmaster, and the Researcher: On the Arms Race in Spamming Botnet Mitigation Gianluca Stringhini Major Area Exam December 5, 2011
  • 2. What is spam? Spam is a big problem Everyone receives spam 90-95% of emails are spam Organic vs. Junk food Spam vs. Ham We need a definition a computer can understand Unsolicited Bulk Email
  • 3. Early days spam Spam as a hobby Businesses ran from home’s basement CAN-SPAM Act (2003) Doesn’t forbid to spam, but the spammer has to be nice. $16k fine per violating email The world is big Not every country prosecutes spammers
  • 5. Modern spam 1 Affiliate programs [Samosseiko 2009] Are banks the weak link? [Levchenko 2011] 1 source: Levchenko et al., Click Trajectories: End-to-End Analysis of the Spam Value Chain
  • 6. Is Spam Profitable? Yes, it is Estimates between $300k and $1M a month for large affiliate programs [Kanich 2008, Kanich 2011] Relatively low risk Small fishes are the ones who get caught The geographic dispersion makes coordinated actions difficult
  • 7. How is Spam Delivered? Botnets Botnets are networks of compromised computers that act under the control of a single entity (Botmaster) What are botnets used for? Running DoS Stealing Information Solving Captchas Sending spam Botnets are responsible for 85% of worldwide spam Why botnets? Botnets combine the best of two worlds: worms and IRC bots Researchers and Botmasters are involved in an arms race
  • 9. Botnet Evolution - Structure SDBot 2002
  • 10. Botnet Evolution - Structure IRC botnets The C&C is an IRC server Bots join a channel and get orders Problems Researchers can join the channel too DNS sinkholing is possible
  • 11. Botnet Evolution - Structure MyDoom 2004
  • 12. Botnet Evolution - Structure Proprietary protocol botnets The C&C uses a proprietary encrypted protocol Two architectures: Pull architecture Push architecture Problems Researchers can reverse engineer the protocol DNS sinkholing is still possible
  • 13. Botnet Evolution - Structure Lethic 2007
  • 14. Botnet Evolution - Structure Multiple tier botnets The bots don’t connect directly to the C&C The domains used by the proxies use Fast Flux Fast Flux Technique similar to Round-robin DNS and CDNs Gives high reliability for the botnet backbone Many IP addresses associated with a domain Low TTL, the record changes all the time
  • 15. Botnet Evolution - Structure Problem The domains used can still be sinkholed / blacklisted The solution Domain Generation Algorithms Bots contact a domain according to a time-dependent algorithm Used by Torpig (2008) Problems The algorithm can be reverse engineered [StoneGross 2009a] Botmasters can add non-determinism (e.g., Twitter trends)
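The idea of a time-dependent domain generation algorithm can be illustrated with a minimal sketch. This is not Torpig's actual algorithm: the seed, the `.example` TLD, the hashing scheme, and the function name `candidate_domains` are all assumptions made for illustration. It shows why a purely deterministic scheme is fragile for the botmaster: anyone who recovers the algorithm and seed can compute the same domains in advance.

```python
import hashlib
from datetime import date


def candidate_domains(day, seed="example-seed", count=3, tld=".example"):
    """Derive a deterministic list of domain names for a given day.

    Hypothetical scheme: hash the seed, the date, and a counter, then map
    the first hex digits of the digest to lowercase letters.
    """
    domains = []
    for index in range(count):
        material = f"{seed}|{day.isoformat()}|{index}".encode("ascii")
        digest = hashlib.sha256(material).hexdigest()
        # Each hex digit (0-15) becomes a letter in the range a-p.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


# Bot and researcher compute the same list for the same day.
todays_domains = candidate_domains(date(2011, 12, 5))
```

Mixing in an input that cannot be known ahead of time (the slide mentions Twitter trends) would prevent precomputing future domains, at the cost of depending on an external service.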
  • 16. Botnet Evolution - Structure Storm 2007
  • 17. Botnet Evolution - Structure Peer-to-peer botnets Bots with private IPs act as workers Bots with public IPs act as proxies Workers find proxies based on some overnet protocol Problem Proxies are not under the control of the botmaster Researchers can impersonate a proxy and infiltrate the botnet
  • 18. Botnet Evolution - Infection model Worm-like spread The bot scans the network for vulnerabilities and propagates Non-spreading bots Infections are propagated through Drive-by-download websites [Provos 2008, StoneGross 2011] Email attachments Pay-per-Install The new trend is paying third parties for “installing” a certain number of bots [Caballero 2011]
  • 19. Botnet and Spam Mitigation
  • 20. Botnet and Spam Mitigation Many Possible Vantage Points
  • 22. Host-based detection Traditional anti-virus approach Look for the presence of virus-specific instructions in the binaries Antiviruses can be fooled by simple obfuscations [Christodorescu 2003, Christodorescu 2004] Obfuscations NOP insertion and code transposition are usually enough Metamorphic malware Polymorphic malware
  • 23. Host-based detection Static analysis Take program semantics into account [Christodorescu 2003, Christodorescu 2005] Dynamic analysis Model the behavior of a program (e.g., using system calls) [Kolbitsch 2009] Monitor access to sensitive information [Yin 2007] Reverse engineering of the C&C protocol [Caballero 2009] Problems Program equivalence is undecidable! Analysis of samples takes time and resources
  • 24. Malicious Web Pages Detection
  • 25. Malicious Web Pages Detection Infections happening through browser exploits are a big problem Detecting Drive-by-Download pages Malicious JavaScript can be detected by: Emulation [Cova 2010] Monitoring system changes [Provos 2008] Hooking runtime [Curtsinger 2011, Heiderich 2011] Looking for common attack patterns (e.g., heap spray) [Ratanaworabhan 2009] Problems The analysis could be detected These systems might not detect newer attacks
  • 26. Command and Control based Detection
  • 27. Command and Control-based Detection IRC server infiltration [AbuRajab 2006] Protocol Reverse Engineering Protocol reverse engineering by active probing [Cho 2010a] This enables botnet infiltration [Stock 2009, Kreibich 2009, Cho 2010b] Botnet Takeovers Reverse engineering of DGAs [StoneGross 2009a] This enables C&C impersonation [StoneGross 2009a]
  • 28. Honeypots Running bots in virtual machines allows researchers to learn important botnet features [John 2009] This can be used for Blacklisting the domains that host C&C servers [StoneGross 2009b] Performing botnet takedowns [StoneGross 2011] Problems Bots might detect virtualization [Balzarotti 2010] Containment problems arise [Kreibich 2011]
  • 30. DNS Based Detection Detecting infected IPs DNS sinkhole [Dagon 2006] Look for DNS cached results [AbuRajab 2006] Detect Fast-Flux Domains Fast Flux domains present very different characteristics than legitimate ones [Holz 2008, Passerini 2008, Hu 2009] IPs belong to different networks TTL is low; results change very frequently
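The two fast-flux characteristics listed on this slide (answers spread over different networks, low TTL) can be turned into a simple check. The sketch below is only an illustration, not the classifier from any of the cited papers: the function name `looks_fast_flux`, the /16 grouping, and both thresholds are assumptions.

```python
import ipaddress


def looks_fast_flux(answers, max_ttl=300, min_networks=3):
    """Heuristic check on DNS answers observed for a single domain.

    answers: list of (ipv4_string, ttl_seconds) tuples.
    Returns True when every TTL is low and the addresses are spread
    over several distinct /16 networks (thresholds are illustrative).
    """
    if not answers:
        return False
    networks = {
        ipaddress.ip_network(f"{ip}/16", strict=False) for ip, _ in answers
    }
    all_ttls_low = all(ttl <= max_ttl for _, ttl in answers)
    return all_ttls_low and len(networks) >= min_networks


# Addresses from the reserved documentation ranges, used as sample data.
suspicious = [("192.0.2.1", 60), ("198.51.100.1", 60), ("203.0.113.1", 60)]
ordinary = [("192.0.2.10", 3600), ("192.0.2.11", 3600)]
```

A check this simple would also flag some legitimate CDNs, which the earlier slide notes use a similar technique; the cited systems rely on richer features.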
  • 31. DNS Based Detection Detecting Malicious Domains It is possible to build classifiers to detect malicious domains Passive analysis of RDNS queries [Antonakakis 2010, Bilge 2011] Limitation: only local view Analysis at the authoritative server level or TLDs [Antonakakis 2011] Limitation: it can be evaded using diverse DNS servers
  • 33. SMTP based Detection: Content Analysis Rule-based Spam Detection The nature of spam changes over time Having a binary decision introduces problems. Machine Learning Bayesian Filtering: uses naïve Bayes [Sahami 1998, Androutsopoulos 2000] Support Vector Machines [Drucker 1999] Problems Feature selection has to be performed “Good word” attacks are possible [Lowd 2005, Karlberger 2007]
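Bayesian filtering can be sketched in a few lines. The following is a minimal word-count naïve Bayes classifier with add-one smoothing, written for illustration only; the class name `NaiveBayesFilter`, the whitespace tokenizer, and the toy training messages are assumptions, and it expects at least one training message per class.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Toy naive Bayes spam filter over whitespace-separated words."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def log_score(self, text, label):
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        # Log prior of the class, then one log likelihood per word.
        score = math.log(self.message_counts[label] / total_messages)
        for word in text.lower().split():
            count = self.word_counts[label][word]
            # Add-one (Laplace) smoothing avoids log(0) for unseen words.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        return score

    def classify(self, text):
        return max(("spam", "ham"), key=lambda label: self.log_score(text, label))


spam_filter = NaiveBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("buy cheap watches now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch on monday", "ham")
```

Because every word contributes independently to the score, padding a spam message with enough words that are typical of ham can flip the decision, which is the intuition behind the “good word” attacks mentioned on the slide.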
  • 34. SMTP based Detection: Content Analysis Assign a Reputation to Received Emails Different features between spam and ham [Hao 2009] Building Signatures from Spam [Pitsillidis 2010] ran bots and assigned templates to different botnets Detect Spam by Looking at URLs Study the URL structure [Xie 2008, Ma 2009] Learning features from the landing page [Thomas 2011] Problem In general, content analysis is expensive
  • 35. SMTP based Detection: IP Blacklisting DNS-based blacklists Mailservers can query the service to know whether an individual IP is a known spammer Problems Low coverage [Ramachandran 2006a, Sinha 2008] Bot machines have dynamic IPs What happens when IPv6 takes over? Better Approaches IP reputation [Ramachandran 2006b, Sinha 2010, Qian 2010] Behavioral blacklisting [Ramachandran 2007, Stringhini 2011]
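A DNS-based blacklist lookup works by reversing the octets of the client's IPv4 address, appending the blacklist zone, and resolving the resulting name; any answer means the address is listed. A minimal sketch follows, assuming a placeholder zone `dnsbl.example` (not a real service) and hypothetical helper names.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="dnsbl.example"):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone="dnsbl.example"):
    """Return True if the blacklist zone has a record for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record (or resolution failure): treat as not listed.
        return False
```

The lookup is keyed on a single address, which is one way to see the coverage problem raised on the slide: a bot whose address changes often, or that was never reported, is simply not found.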
  • 36. SMTP based Detection: Policies Greylisting If a delivery temporarily fails, spambots will not try again Easy to bypass and prone to false positives [Levine 2005] Multi-level greylisting [Janecek 2008] Sender Validation Spam pretends to come from legitimate addresses SPF, DomainKeys, DKIM [Leiba 2007] The solution chosen by Google User voting on spam and ham [Taylor 2006] Main problem: Spam hits server performance! Mail prioritization systems [Twining 2004, Venkataraman 2007]
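Greylisting can be sketched as a table keyed on the (client IP, sender, recipient) triplet: the first attempt gets a temporary failure, and a retry after a minimum delay is accepted. The class name `Greylist`, the 300-second delay, and the reply strings below are illustrative assumptions rather than any particular mail server's behavior.

```python
class Greylist:
    """Toy greylisting policy keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay  # seconds a sender must wait before retrying
        self.first_seen = {}

    def check(self, client_ip, sender, recipient, now):
        """Return an SMTP-style reply for a delivery attempt at time `now`."""
        triplet = (client_ip, sender, recipient)
        # Remember when this triplet was first seen; keep the earliest time.
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.min_delay:
            return "250 accept"
        return "451 try again later"


policy = Greylist(min_delay=300)
first_try = policy.check("192.0.2.1", "a@example.org", "b@example.net", now=0)
early_retry = policy.check("192.0.2.1", "a@example.org", "b@example.net", now=10)
late_retry = policy.check("192.0.2.1", "a@example.org", "b@example.net", now=600)
```

The sketch also makes the bypass visible: a spambot only needs to retry once after the delay to get through.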
  • 38. Social Network Detection Online Social Networks are very successful Users are not as risk aware as they are with email spam Miscreants create fake profiles to spread spam Systems to detect fake profiles have been developed [Benevenuto 2010, Lee 2010, Stringhini 2010, Yang 2011a, Yang 2011b] Real accounts that get compromised are more valuable 45% of social network users click on any link posted by their friends [Bilge 2009] 89% of profiles sending malicious content on Facebook are compromised [Gao 2010]
  • 40. Intrusion Detection Signature-based intrusion detection Snort, Bro [Paxson 1998] Problems Constant need of new rules Problems with encrypted traffic Anomaly-based intrusion detection The system learns the “normal” behavior of a network and flags anomalies [Portnoy 2001, Kruegel 2002, Wang 2004] Problems What is “normal” behavior? It is hard to get traffic that is free of infections
  • 41. Network Edge Detection Detecting Successful Infections Botnet infection can be modeled as a set of communication flows [Gu 2007] Problem: what’s the infection model of a botnet? Detecting Malicious Activity Correlation between C&C commands and malicious activity [Gu 2008a] How to identify C&C traffic? Well-known protocols (e.g., IRC, HTTP) [Gu 2008b] Look for malicious activity first [Wurzinger 2010] Leverage Previous Knowledge Detect hosts that contact the same IPs as infected machines [Coskun 2010]
  • 43. How About the Future? The arms race between researchers and cybercriminals is far from being over Is security research like fighting the Hydra?
  • 44. Future Directions Botmasters will keep developing more sophisticated techniques However, a functional botnet has to interact with legitimate services DNS servers SMTP servers Web servers Social Networks This interaction cannot be obfuscated!
  • 45. My Research In my research, I focus on analyzing how bots interact with legitimate, third-party services Bots can be distinguished from real users by the way they use such services The main reason is that bots have a different goal than real users: Fast interaction vs. Good user experience
  • 46. My Research So far, I have been looking at: Social Networks How fake accounts differ from legitimate ones [ACSAC 2010] How user behavior changes once an account is compromised [In submission] SMTP servers Distinguishing bots: based on the destinations they target [USENIX 2011] based on the (wrong) way in which they implement SMTP [Work in progress]
  • 47. My Research Other interesting areas: Login patterns on Social Networks Interaction with search engines (e.g., SEO) What if bots started behaving like legitimate users / programs? This conflicts with their goal!