VxClass for Incident Response

zynamics
info@zynamics.com
Introduction

• Binary code is often left behind by attackers
  – Running processes
  – Dropped executables
  – Kernel memory snapshots
  – Network traffic
  – Crash dumps
Introduction

• Useful evidence but difficult to analyze
• Current methods:
  – Use AV scanner
  – Run executable to provoke/observe behavior
  – Remove packer/obfuscator code
  – Manual analysis using IDA Pro
Current Methods

•   Error-prone and time-consuming
•   AV signatures are brittle and often out of date
•   Behavior can be difficult to provoke
•   Removal of protection code is difficult
•   Manual analysis
    – Does not scale
    – No easy correlation of results
VxClass

•   Structural malware classification tool
•   Categorizes malware samples into families
•   Groups malware that shares code
•   Allows correlation between samples
    – Regardless of how they were obtained
VxClass

•   Upload of samples through a web server
•   Generic unpacking through emulation
•   Extraction of structural information
•   Comparison with known samples
•   Storage of the results in a SQL database
•   Visualization of the results in the browser
Uploading

• Upload samples
  – Through a web interface in your browser
  – Through XML-RPC
  – User-based access control to samples:
     • Public: All users can see and download
     • Limited: All users can see, but not download
     • Private: Only original uploader can see and download
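
A minimal sketch of the XML-RPC upload path and the three access levels, using only Python's standard library. The method name `vxclass.upload_sample` and its parameter order are hypothetical, since the slides do not document the actual API. The rule that the uploader can always download their own sample is also an assumption for the "Limited" level.

```python
import xmlrpc.client

# Access levels named on the slide.
ACCESS_LEVELS = ("public", "limited", "private")


def build_upload_request(filename, data, access="private"):
    """Serialize an upload call as an XML-RPC request body.

    The method name "vxclass.upload_sample" is hypothetical.
    """
    if access not in ACCESS_LEVELS:
        raise ValueError(f"unknown access level: {access}")
    params = (filename, xmlrpc.client.Binary(data), access)
    return xmlrpc.client.dumps(params, methodname="vxclass.upload_sample")


def can_see(access, user, uploader):
    # Public and limited samples are visible to all users.
    return access in ("public", "limited") or user == uploader


def can_download(access, user, uploader):
    # Assumption: the uploader can always download their own sample.
    return access == "public" or user == uploader
```

Building the request body separately from sending it keeps the sketch runnable without a server; a real client would hand the same parameters to `xmlrpc.client.ServerProxy`.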
Unpacking

• Generic unpacking is difficult
  – Anti-debugging tricks
  – Attempts to foil emulators
  – Creation of and interaction between multiple
    processes
  – Code obfuscation
Unpacking

• Our approach: Full system emulation
• Emulated Windows XP SP2 in Bochs
• Run the executable until it looks unpacked
• Acquire memory of all processes and dirty
  kernel pages
• Use code in acquired memory for classification
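
The slides do not say how "looks unpacked" is decided. One heuristic that is common in generic unpackers is write-then-execute: stop when execution reaches memory that the sample itself wrote earlier. The sketch below applies that heuristic to a toy trace of memory events. It is an illustration of the general idea, not the VxClass criterion, and the trace format is hypothetical.

```python
PAGE_SIZE = 0x1000


def first_unpacked_entry(trace):
    """Return the first executed address that lies in a previously written page.

    trace is an iterable of (kind, address) pairs, where kind is
    "write" or "exec". Returns None if no written page is ever executed.
    """
    written_pages = set()
    for kind, address in trace:
        page = address // PAGE_SIZE
        if kind == "write":
            written_pages.add(page)
        elif kind == "exec" and page in written_pages:
            return address
    return None
```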
Unpacking

• Solved problems
  – Anti-Debugging tricks
  – Legacy API calls
  – Multiple processes
  – Interprocess communication
  – Kernel memory analysis

• Result: Most packers can be unpacked
  automatically
Comparison

• Problem: Meaningful comparison of binary
  code
• Byte-by-byte comparison is useless
• Our approach: Structural comparison
  – Award-winning (German IT-Security Award 2006)
  – Uses industry-standard BinDiff engine
  – Uses patent-pending MD-Index (more later)
Structural Comparison

• Extract call graph and flow graph information
  from samples
• Compare the structure of these graphs instead
  of byte sequences
• Matches code derived from the same source
  – Regardless of compiler settings
  – Regardless of compiler
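
A toy sketch of what comparing structure instead of bytes can look like. Each function is reduced to a signature of (basic blocks, flow-graph edges, calls), and functions are paired when their signature is unique in both samples. This is a deliberately simplified stand-in and not the BinDiff algorithm; the input format is hypothetical.

```python
from collections import Counter


def signature(cfg_edges, call_count):
    """Structural signature: (basic blocks, edges, calls)."""
    nodes = {node for edge in cfg_edges for node in edge}
    return (len(nodes), len(cfg_edges), call_count)


def match_functions(sample_a, sample_b):
    """Pair functions whose signature is unique in both samples.

    Each sample maps a function name to (cfg_edges, call_count).
    """
    sigs_a = {name: signature(*func) for name, func in sample_a.items()}
    sigs_b = {name: signature(*func) for name, func in sample_b.items()}
    count_a = Counter(sigs_a.values())
    count_b = Counter(sigs_b.values())
    by_sig_b = {sig: name for name, sig in sigs_b.items()}
    return {
        name: by_sig_b[sig]
        for name, sig in sigs_a.items()
        if count_a[sig] == 1 and count_b.get(sig) == 1
    }
```

Block addresses never enter the signature, so the same function at different addresses or with different instruction encodings still matches.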
MD-Index

• Patent-pending
• Clever hash function for directed graphs
• Assigns 80-bit value to a directed graph
  – Allows keeping a database of flow graphs
  – Allows efficient queries into the database
• Is used within VxClass for several purposes:
  – Very fast approximate comparison
  – Code search
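
The slides do not describe how MD-Index is constructed. The sketch below only illustrates the general idea of an 80-bit value for a directed graph that does not depend on node names or addresses: it hashes the sorted degree profile of every edge and truncates a SHA-1 digest to 10 bytes. The choice of features and of SHA-1 are assumptions made for the example, not the MD-Index definition.

```python
import hashlib
from collections import Counter


def graph_hash_80(edges):
    """Return an 80-bit hex value that is invariant under node renaming.

    edges is a list of (source, target) pairs.
    """
    indeg, outdeg = Counter(), Counter()
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    # One feature tuple per edge, built only from degrees, then sorted
    # so that edge order and node labels do not matter.
    features = sorted(
        (indeg[src], outdeg[src], indeg[dst], outdeg[dst]) for src, dst in edges
    )
    digest = hashlib.sha1(repr(features).encode("ascii")).digest()
    return digest[:10].hex()  # 10 bytes = 80 bits
```

A fixed-size value like this can be used as a database key, which is what makes lookups of flow graphs cheap. Unlike this sketch, which only detects identical degree structure, the slides describe MD-Index as also supporting approximate comparison.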
Results

•   Memory dumps and recovered strings
•   IDA files (IDB) of the resulting disassemblies
•   Pairwise similarity scores
•   Visualisation:
    – Family trees
    – Top-10-most-similar list
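
A sketch of how pairwise scores can feed a most-similar list. It assumes each sample is summarized as a set of per-function graph hashes and uses Jaccard similarity; the slides do not state how VxClass computes its scores, so both choices are assumptions.

```python
def similarity(hashes_a, hashes_b):
    """Jaccard similarity between two sets of function hashes."""
    if not hashes_a and not hashes_b:
        return 0.0
    return len(hashes_a & hashes_b) / len(hashes_a | hashes_b)


def most_similar(query, database, n=10):
    """Return up to n (sample, score) pairs, best score first."""
    scored = [(similarity(query, hashes), name) for name, hashes in database.items()]
    # Sort by descending score, then by name for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored[:n]]
```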
Architecture

• Malware enters the system through
  – Automated upload: XMLRPC-Server
  – Manual upload: Web-Interface
• SQL Database
• Distributed workers
  – Unpacker: Bochs
  – Disassembly: IDA
  – Classifier: MD-Index, BinDiff
Case Studies

• Noise reduction
  – Automatically filter uninteresting samples
• Knowledge management
  – Share information between analysts
• Attacker correlation
  – Was a set of attacks performed with one toolset?
• Code searching
  – Find certain functions in known samples
Noise Reduction

• Upload new files to the system
• How similar are they to interesting samples?
  – Comparison to database of known samples
• Prioritize accordingly
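
A small sketch of the prioritization step, assuming similarity scores between new files and known interesting samples are already available. The input layout is hypothetical.

```python
def prioritize(scores):
    """Order new samples by their best match against interesting samples.

    scores maps each new sample to {interesting_sample: similarity}.
    Returns (sample, best_score) pairs, highest score first.
    """
    best = [
        (sample, max(matches.values(), default=0.0))
        for sample, matches in scores.items()
    ]
    best.sort(key=lambda item: (-item[1], item[0]))
    return best
```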
Knowledge management


• Each analyst uploads the samples they know to
  VxClass
• New malware comes in and gets uploaded
• VxClass determines which known samples it
  is similar to
• The analyst who knows the similar samples can then be found
Attacker Correlation

• A series of incidents is investigated
• On a large number of machines, code is found
• Classify the code using VxClass to find out:
  – Is this one group of attackers?
  – Is this similar to attacks seen in the past?
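
One way to turn pairwise scores into an answer to "is this one group?" is to link samples whose similarity exceeds a threshold and read off the connected groups. The sketch below does this with a small union-find. The threshold of 0.8 is an arbitrary assumption, and the slides do not say how VxClass builds its families.

```python
def group_samples(samples, scores, threshold=0.8):
    """Group samples that are linked by scores at or above the threshold.

    scores maps (sample_a, sample_b) pairs to a similarity value.
    Returns a sorted list of sorted groups.
    """
    parent = {sample: sample for sample in samples}

    def find(item):
        # Follow parent links to the group root, halving the path on the way.
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for sample in samples:
        groups.setdefault(find(sample), set()).add(sample)
    return sorted(sorted(group) for group in groups.values())
```

Samples that end up in one group share code directly or through a chain of similar samples, which is a hint (not proof) of a common toolset.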
Code Searching

• A particularly strange piece of code (just one
  function) is identified
  – Perhaps a strange encryption function
• Does this particular piece of code appear in
  other samples in the database?
• Search is not byte-based, but flow-graph-
  based (MD-Index)
• The answer is one click away
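
A sketch of why a per-function graph hash makes this search cheap: an inverted index from hash value to the functions that produce it turns the question into a single lookup. The hash values and data layout below are hypothetical placeholders.

```python
from collections import defaultdict


def build_index(samples):
    """Map each function graph hash to the (sample, function) pairs that have it.

    samples maps a sample name to {function_name: graph_hash}.
    """
    index = defaultdict(set)
    for sample, functions in samples.items():
        for function, graph_hash in functions.items():
            index[graph_hash].add((sample, function))
    return index


def search(index, graph_hash):
    """Return all (sample, function) pairs whose flow graph has this hash."""
    return sorted(index.get(graph_hash, ()))
```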
Performance

• One VxClass machine
  – 800-1600 samples per day
• Performance depends on
  – Obfuscation complexity
  – Size of the malware
  – Size of the database
• Can be fully parallelized
  – The only bottleneck is the central database
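
A back-of-the-envelope estimate based on the figures above. It assumes throughput scales linearly with the number of machines, which ignores the central-database bottleneck the slide mentions, so treat the result as an optimistic bound.

```python
import math


def days_to_process(backlog, machines, per_day_low=800, per_day_high=1600):
    """Return (best_case_days, worst_case_days) for a sample backlog."""
    best = math.ceil(backlog / (machines * per_day_high))
    worst = math.ceil(backlog / (machines * per_day_low))
    return best, worst


# 100,000 samples on 4 machines: between 16 and 32 days.
print(days_to_process(100_000, 4))
```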
Behavioral Analysis

• VxClass is not a behavioral-analysis tool
• VxClass is complementary to such tools
• We recommend combining VxClass with
  behavior-monitoring tools such as
  – CWSandbox (http://www.cwsandbox.org)
  – Anubis (Free) (http://anubis.iseclabs.org)
VxClass Options

• VxClass on a single machine
  – Run it inside your organisation
• VxClass distributed
  – Scale it to your needs
• VxClass as service
  – We host a machine for you
• VxClass as shared service
  – We host a machine for you
  – Multiple clients use a shared database
Existing Customers

• The German BSI
  – Agency for security in information systems
• Vodafone Germany
  – Pre-filters Symbian/ARM executables
• Other government entities and private
  companies
• Mostly used for attacker correlation and noise
  filtering
Limitations

• Heavy obfuscation of control flow
• Virtualizing packers
• Unpacking only works on 32-bit Windows
  – No Linux / OSX / Mobile unpacking
  – 64-bit support is in the works


• Upload of IDBs allows heavy manual
  intervention beforehand
FAQ

• What OS does it run on?
  – It runs on a 64-bit Debian Lenny install
• Does it have any network dependencies?
  – No
• How can we extend the system?
  – All generated data is accessible through XML-RPC
  – If needed, direct access to the SQL database can be used
  – The SQL schema is available on request
Other questions?
