The Future of
    Automated Malware Generation
                  Stephan Chenette
    Director of Security Research & Development



1
Who Am I?
    • Stephan Chenette @StephanChenette (twitter)
    • Currently Director of Security R&D @ IOActive
      •Building / Breaking / Hacking / Researching


    • R&D @ eEye Digital Security 4+ years
    • Head Security Researcher @ Websense 6+ years
    • (Graduate Student @ UCSD - Network Security)


2
What I hope you learn…
    • An understanding of the current malware landscape
    • Various malware/exploit defense techniques
    • Where I think detection/defense technologies are
      headed
    • How malware authors will most likely react and
       drive the future of automated malware generation




3
Statement
    This particular topic/area is a personal research interest
    of mine –

    I’m hoping to basically motivate you to think offensively
    when building or using defensive technologies…

    For Example: I’m currently helping on an open source
    automated detection technology for the cuckoo
    sandbox – and am trying to evade/bypass it at the same
    time

4
Agenda
    • Current State of Automated Malware Generation
    • Current State of Malware Defense (Tech.)
    • Malware Trends
    • The Future of Malware Defense
    • The Future of Automated Malware Generation




5
Malware Distribution Networks
              (MDNs)




6
Malware Distribution Networks
    Malware has evolved into a profitable business for
    cyber criminals

    •Complex/Organized/Distributed Network
    •Malware Distribution Network (MDNs)
      •Pay-per-install (PPI) clients (RogueAV, SpamBot, keylogger)
      •PPI Services
      •PPI Affiliates (landing pages, redirection services, etc.)



7
Malware Distribution Networks (MDNs)


     [Figure: malware distribution network, stages 1 to 4]




    Source: Microsoft Security Intelligence Threat Report (http://www.microsoft.com/sir)

8
Malware Distribution Networks (MDNs)

      Single Sample Repository
        A repository that does not update the malicious
        executable for the lifetime of the repository.


      Multiple Sample Repository
        A repository that performs updates to the malicious
        executable over time, but is not generating the
        samples for each request

      Polymorphic/Metamorphic Repository
        A repository that produces a unique malicious
        executable for every download request
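
The three repository types above can be told apart, roughly, by downloading from the same repository several times and comparing sample hashes. The sketch below is a heuristic under stated assumptions: `classify_repository` and its labels are hypothetical names, and a short observation window can mislabel a repository that simply has not updated yet.

```python
def classify_repository(observed_hashes):
    """Label a repository from the hashes of samples fetched from it, one per request."""
    if len(observed_hashes) < 2:
        raise ValueError("need at least two download observations")
    distinct = len(set(observed_hashes))
    if distinct == 1:
        # Same executable on every request.
        return "single sample"
    if distinct == len(observed_hashes):
        # A unique executable for every request.
        return "polymorphic/metamorphic"
    # Some repeats, some changes: updated over time, not per request.
    return "multiple sample"
```

For example, `classify_repository(["a", "a", "b", "b"])` returns `"multiple sample"`.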
9
Example: Blackhole Exploit Kit
     Blackhole contains an integrated AV scanner and will auto-repackage if
     malware is detected




     Figure: Blackhole exploit kit download chain

     Source: Manufacturing Compromise: The Emergence of Exploit-as-a-Service
     (http://cseweb.ucsd.edu/~voelker/pubs/eaas-ccs12.pdf)



10
Exploit Kits and Malware
        Exploit kits: Blackhole, Incognito | Malware: ZeroAccess, TDSS




     Source: Manufacturing Compromise: The Emergence of Exploit-as-a-Service
     (http://cseweb.ucsd.edu/~voelker/pubs/eaas-ccs12.pdf)


11
Agenda
     • Current State of Automated Malware Generation
     • Current State of Malware Defense (Tech.)
     • Malware Trends
     • The Future of Malware Defense
     • The Future of Automated Malware Generation




12
Current State of Malware
         Defense (Tech.)




13
Current Techniques
     • Hash
     • Signatures
     • Heuristics
     • Semantics-aware detection




14
Current Techniques
              Attacker             Defender
          Easier to bypass   Easier to implement




          Harder to change   Harder to implement

15
Hash-based detection
     • Full file hashing (cryptographic checksum)
       •MD5, SHA1, SHA256


     • Portable Executable (PE)
       •Sectional hashing
       •Custom hashing
       •Fuzzy hashing (ssdeep)

     • Err on the side of caution
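
A minimal sketch of full-file hash detection, using only Python's standard `hashlib`. The sample bytes and the blocklist are stand-ins made up for illustration. The second lookup shows why this is the easiest technique to implement and also the most brittle: any change to the file yields a different digest.

```python
import hashlib

def file_digests(data: bytes) -> dict:
    """Return the MD5, SHA-1 and SHA-256 hex digests of a byte string."""
    return {name: hashlib.new(name, data).hexdigest()
            for name in ("md5", "sha1", "sha256")}

def is_known_bad(data: bytes, blocklist: set) -> bool:
    """Exact-match lookup: True only if the SHA-256 digest is on the blocklist."""
    return file_digests(data)["sha256"] in blocklist

sample = b"example file contents"               # stand-in bytes, not a real sample
blocklist = {file_digests(sample)["sha256"]}    # hypothetical blocklist

print(is_known_bad(sample, blocklist))            # True
print(is_known_bad(sample + b"\x00", blocklist))  # False
```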

16
Defeating Hash-based detection
     • Create Unique malware sample per user request
       •Randomizing single byte in irrelevant file offset
       •Re-packaging binary (FSG, ASPack, Themida)
       •Re-building malware dynamically




17
Signature-based detection
     • Regular Expression based signatures (PCRE, RE2)
     • Byte-signatures
      rule ASPack
      {
              strings:
              $ = { 60 E8 ?? ?? ?? ?? 5D 81 ED ?? ?? (43 | 44) ?? B8 ?? ?? (43 | 44) ?? 03 C5 }
              $ = { 60 EB ?? 5D EB ?? FF ?? ?? ?? ?? ?? E9 }
              $ = { 60 EB 03 5D FF E5 E8 F8 FF FF FF 81 ED 1B 6A 44 00 BB 10 6A 44 00 03 DD 2B 9D 2A }
              $ = { 60 E8 00 00 00 00 5D ?? ?? ?? ?? ?? ?? BB ?? ?? ?? ?? 03 DD }
              $ = { 60 E8 41 06 00 00 EB 41 }
              $ = { 60 E8 7? 05 00 00 EB (33 | 4C) }
          
              condition:
          
                  for any of them : ($ at entrypoint)
      }


     • Deeper contextual content scanning with proprietary
       language
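
A rough sketch of how a byte signature with wildcards can be matched, using one of the hex strings from the rule above. This is not how YARA is implemented: `compile_hex_signature` and `matches_at` are hypothetical helpers, only plain bytes and `??` wildcards are supported (no alternation such as `(43 | 44)`, no nibble wildcards such as `7?`), and the entry point offset is assumed to be known already.

```python
import re

def compile_hex_signature(pattern: str):
    """Compile a space-separated hex string with ?? wildcards into a bytes regex."""
    parts = []
    for token in pattern.split():
        if token == "??":
            parts.append(b".")  # any single byte
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    # DOTALL lets the wildcard match every byte value, including 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)

def matches_at(signature, data: bytes, offset: int) -> bool:
    """True if the signature matches starting exactly at the given offset."""
    return signature.match(data, offset) is not None

SIG = compile_hex_signature(
    "60 E8 00 00 00 00 5D ?? ?? ?? ?? ?? ?? BB ?? ?? ?? ?? 03 DD"
)
```

Calling `matches_at(SIG, data, entry_offset)` mirrors the `$ at entrypoint` condition for that one string.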
18
Defeating Signature-based detection
     • Syntax mutation easily defeats this technique
              •    Garbage code insertion, e.g. NOP, “MOV ax, ax”, “SUB ax, 0”
              •    Register Renaming
              •    Subroutine Permutation
              •    Code Reordering through Jumps
              •    Equivalent instruction substitution
     Instruction          Equivalent instruction
     MOV EAX, EBX         PUSH EBX; POP EAX

     Call                 Emulated Call                            Misused Call
     CALL <target>        PUSH <PC + sizeof(PUSH) + sizeof(JMP)>   CALL <target>
                          JMP <target>
                                                                   .target
                                                                   POP <register-name>

              • Same behavior but different syntax
19
Heuristics are introduced…
     AV engines were forced to evolve and use heuristics by
     way of emulation/behavioral analysis due to:
       •Polymorphic engines
         • Encrypt body with randomly generated encryption
           algorithm
         • Private key normally in decoding engine
       •Metamorphic engines
         • Employs obfuscation/substitution techniques instead of encryption
           •   Junk insertion, equivalent instruction substitution, etc.




20
Heuristics-based detection
     General term for the different techniques used to
     detect malware by their behavior
        Emulation, API hooking, sand-boxing, file anomalies and other analysis techniques



     IF Rule A then Rule B then Rule C then Poison Ivy




     Source: (http://hooked-on-mnemonics.blogspot.com)
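
The "Rule A then Rule B then Rule C" idea can be sketched as an ordered match over a trace of observed events. Everything named here is an assumption for illustration: the event names and the three rules are hypothetical and are not real Poison Ivy indicators.

```python
def matches_rule_chain(events, chain):
    """True if some event satisfies each rule, in the order the rules are listed."""
    remaining = iter(events)
    # Each rule consumes events until one satisfies it; later rules only
    # see the events that come after that point.
    return all(any(rule(event) for event in remaining) for rule in chain)

# Hypothetical rules over hypothetical event names.
rule_a = lambda e: e == "create_mutex"
rule_b = lambda e: e == "write_autorun_key"
rule_c = lambda e: e == "connect_outbound"
CHAIN = [rule_a, rule_b, rule_c]

trace = ["open_file", "create_mutex", "write_autorun_key",
         "read_file", "connect_outbound"]

print(matches_rule_chain(trace, CHAIN))        # True
print(matches_rule_chain(trace[::-1], CHAIN))  # False
```

Unrelated events may sit between the matching ones, but the order of the rules has to hold.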

21
Defeating Heuristics-based detection
     • Detect emulation and execute different code path
     • Break emulation engine
     • Avoid the heuristics

     • Overall solid method
     • Possible false positives




22
Semantics-aware Detection
      • Captured execution trace is transformed into a higher-level
        representation capturing its semantic meaning, i.e., the trace
        is first abstracted before being compared to a malicious
        behavior
       • Make the time to build the code flow or extraction of a
         model infeasible for real-time AV using time lock puzzles


       • Intermediate representation (IR)
         •   Abstract Syntax Trees, Register Transfer Language



23
Semantics-aware detection




      Good idea in theory, but unknown (to me) how widely
      implemented this is in security products


24
Defeating Semantics-aware detection

      Implementation is difficult
      Limited support for equivalent code sequences

               a = b * 2
               a = b << 1
      A left arithmetic shift by n is equivalent to multiplying by 2^n
      (provided the value does not overflow)
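
To treat the two forms as the same behavior, a semantics-aware detector would need to normalize them first. A minimal sketch, assuming a toy intermediate representation of `(operation, operand, constant)` tuples; the tuple format and `canonicalize` are hypothetical and stand in for a real IR.

```python
def canonicalize(expr):
    """Rewrite a left shift by a constant as the equivalent multiplication."""
    op, operand, constant = expr
    if op == "shl":
        return ("mul", operand, 2 ** constant)
    return expr

# a = b << 1 and a = b * 2 normalize to the same tuple.
print(canonicalize(("shl", "b", 1)) == canonicalize(("mul", "b", 2)))  # True

# The identity itself, checked on Python's unbounded integers (no overflow).
print(all((b << n) == b * 2 ** n for b in range(256) for n in range(8)))  # True
```

Every equivalence that is not normalized this way is a gap, which is the "limited support" point above.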

      Focus on the same techniques used to defeat signatures
      and heuristics, plus the likelihood of limited support for
      less popular instructions
25
Recap




26
Agenda
     • Current State of Automated Malware Generation
     • Current State of Malware Defense (Tech.)
     • Malware Trends
     • The Future of Malware Defense
     • The Future of Automated Malware Generation




27
Malware Trends




28
Malware Detection Reality Check
     • How well are current detection techniques working?




                       33%!
29
Malware Samples
     Observation: the number of malware samples is increasing




     Source: McAfee Global Q1 2012 Threat Report
     (http://mcafee.com/us/resources/reports/rp-quarterly-threat-q1-2012.pdf)


30
Mobile Malware Samples
     Observation: the number of Android malware samples is
     increasing




     Source: Kaspersky Q1 2012 Threat Report
     (http://www.securelist.com/en/analysis/204792231/IT_Threat_Evolution_Q1_2012)

31
Use of Behavior Sandboxes
     Client binary is malware but isn’t detected.
     Suspicious files are sent back to “home base/cloud”
     lab for analysis
     1.Sent to sandbox system
     2.Meta data report is created for easier export of
     new rules
      a. Hash and blacklist entries are added
      b. Signatures are added
      c. Heuristic detection is added

32
The Overworked Malware Analyst




33
Solving the problem with people
      Malware Analysts      Malware Samples

                        OVERLOAD!!


34
Agenda
     • Current State of Automated Malware Generation
     • Current State of Malware Defense (Tech.)
     • Malware Trends
     • The Future of Malware Defense
     • The Future of Automated Malware Generation




35
The Future of Malware Defense




     Skynet? …probably not
     But some of the concepts aren’t too far fetched…



36
The Future of Malware Defense


       Perhaps malware detection should have more
                  science applied to it.




37
The Malware Infinity Problem
     Malware detection
     As the number of malware samples approaches ∞ we can’t
     manually add detection for every file. We must model WHAT
     actions malware takes, HOW it takes those actions
     and WHERE it connects.

     Malware Attribution
     As Attack Surface approaches ∞ we can’t defend
     everything from everyone. We must model WHO is
     after WHICH assets and HOW they attack.

38
The Future of Malware Defense
     IF we are going to start modeling we must make
     some assumptions:

     1.Attackers are going to change their code and
     techniques only enough to avoid detection
     2.The majority of current malware/exploit code and
     techniques will continue to represent future
     malware/exploit code and techniques


39
The Who is important…
     “Researchers at Symantec traced the group’s work after
     finding a number of similarities between the Google attack
     code and methods and those used against other
     companies and organizations over the last few years.

     The researchers, who describe their findings in a report
     published Friday, say the gang — which they have dubbed
     the “Elderwood gang” based on the name of a
     parameter used in the attack codes — appears to
     have breached more than 1,000 computers in
     companies spread throughout several sectors –
     including defense, shipping, oil and gas, financial,
     technology and ISPs. The group has also targeted non-
     governmental organizations, particularly ones connected
     to human rights activities related to Tibet and China”

     Source: http://www.wired.com/threatlevel/2012/09/google-
     hacker-gang-returns/




40
Statistics
     A discipline that helps you understand data and
     make decisions based on that data

         Data  ->  STATISTICS  ->  Decisions
41
Train the Machines
            •Classify
            •Cluster




42
Automatic Classification
     Steps:
     1.Extract features
     2.Train models using ML algorithms
     3.Feature Selection
     4.Use models as classifiers
     5.Use models to classify unknown files as 0 or 1




      Source: http://eval.symantec.com/mktginfo/enterprise/white_papers/b-dlp_machine_learning.WP_en-us.pdf
43
Machine learning
     Where we train computers to make statistical
     decisions on real-time data based on inputted data

     While machine learning as a concept has been
     around for decades and has been used in everything
     from anti-spam engines to Google™ algorithms for
     translating text, it is only now being applied to web
     filtering, DLP and malware content analysis.


44
Historical Observation
     Historically certain malware has
     •No icon
     •No description or company in resource section
     •Is packed
     •Lives in windows directory or user profile


     These are the type of “features” that expert humans
     would feed to machine learning classifiers to train on
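
The observations above can be turned into a binary feature vector. A minimal sketch, assuming the file metadata has already been parsed into a dict: the keys (`has_icon`, `description`, `company`, `is_packed`, `path`) and the two path prefixes are hypothetical, and a real extractor would read them from the PE resources and the file system.

```python
def extract_features(meta: dict) -> list:
    """Map parsed file metadata to [no_icon, no_description, packed, suspicious_dir]."""
    path = meta.get("path", "").lower()
    return [
        int(not meta.get("has_icon", False)),
        # 1 when neither a description nor a company name is present.
        int(not (meta.get("description") or meta.get("company"))),
        int(bool(meta.get("is_packed", False))),
        int(path.startswith("c:\\windows\\") or path.startswith("c:\\users\\")),
    ]
```

The human decides which columns exist; which ones matter, and in what combination, is left to the model.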

45
Expert Humans train Machines
     “You can’t effectively and consistently manage what you can’t
     measure, and you can’t measure what you haven’t defined…”
     SOURCE: http://fairwiki.riskmanagementinsight.com/?page_id=3




     •The job of the human
        •List features

     •The job of the machine
        •Model which features are important, in what grouping and in what order
     •Classify
     •Cluster


46
Machine Learning (ML) Algorithms

     • Naive Bayes classifier (assumes each feature is independent of
       the other features)
     • Support Vector Machine (SVM) when dimensionality is high (more
       than a thousand variables in the model)
     • Random Forest when you want an interpretable model (<
       2000 features)
     • Markov chains (Natural Language Processing) for when you
       want to assess the sequence probability
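
As an illustration of the first algorithm, here is a small Bernoulli Naive Bayes classifier in plain Python. It is a sketch, not a production model: the training rows are made-up toy data (columns: no icon, no description, packed, lives in a suspicious directory), and both classes are assumed to be present in the training set.

```python
import math

def train_naive_bayes(rows, labels):
    """Fit a Bernoulli Naive Bayes model on binary feature rows with 0/1 labels."""
    model = {}
    n_features = len(rows[0])
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        prior = len(members) / len(rows)
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        probs = [(sum(r[i] for r in members) + 1) / (len(members) + 2)
                 for i in range(n_features)]
        model[cls] = (prior, probs)
    return model

def predict(model, row):
    """Return the class (0 = benign, 1 = malicious) with the higher log-posterior."""
    scores = {}
    for cls, (prior, probs) in model.items():
        score = math.log(prior)
        for value, p in zip(row, probs):
            score += math.log(p if value else 1 - p)
        scores[cls] = score
    return max(scores, key=scores.get)

# Toy training data, invented for illustration only.
ROWS = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 0, 1, 1], [0, 1, 1, 1],
        [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

MODEL = train_naive_bayes(ROWS, LABELS)
print(predict(MODEL, [1, 1, 1, 1]))  # 1
print(predict(MODEL, [0, 0, 0, 0]))  # 0
```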


47
The Future of Malware Defense

                      Network
                     File System
                   Physical Memory




                                     Inspection Point

       Every Layer provides various degrees of
                 “features” to inspect

48
The Future of Malware Defense




49
Existing Academic work…
     • D. Plonka and P. Barford. Context-Aware Clustering of DNS Query
       Traffic. In Proceedings of the 8th ACM SIGCOMM conference on
       Internet Measurement, October 2008.

     • R. Perdisci, W. Lee, and N. Feamster. Behavioral Clustering of HTTP-
       Based Malware and Signature Generation Using Malicious Network
       Traces. In Proceedings of the 7th USENIX conference on Networked
       Systems Design and Implementation, April 2010.

     • K. Rieck, P. Trinius, C. Willems, T. Holz. Automatic Analysis of
       Malware Behavior using Machine Learning. Journal of Computer
       Security, 2011


50
Projects using machine learning
      •Razorbacktm -
       http://sourceforge.net/projects/razorbacktm/files/
      •Malheur - http://www.mlsec.org/malheur/
      •Malvic - http://www.malvic.org
      •Adobe Open Source Malware Classification Tool
       http://sourceforge.net/projects/malclassifier.adobe/
        • 98.21% accuracy
        • 6.7% false positive rate
        • 7 features = DebugSize, ImageVersion, IatRVA, ExportSize,
          ResourceSize, VirtualSize2, NumberOfSections
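
Accuracy and false positive rate measure different things, so both can look as they do above at the same time. A short sketch of the two formulas; the counts are hypothetical, chosen only to show the arithmetic, and are not the Adobe tool's evaluation data.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all files that were labeled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def false_positive_rate(fp, tn):
    """Fraction of benign files that were wrongly flagged as malicious."""
    return fp / (fp + tn)

# Hypothetical test set: 9,000 malicious files and 1,000 benign files.
tp, fn = 8900, 100   # malicious files caught / missed
tn, fp = 930, 70     # benign files passed / wrongly flagged

print(accuracy(tp, tn, fp, fn))     # 0.983
print(false_positive_rate(fp, tn))  # 0.07
```

When the benign set is the smaller one, a high accuracy figure says little about how many clean files get flagged.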


51
Statistics Based Detection Tools




52
The Future of Malware Defense
      •Using Machine learning for malware detection is only as
       useful as the features you create and the good and bad
       sample sets it’s trained on.
        • Features
        • Good Sample Set
        • Bad Sample Set

        • If you have thousands of samples but they are all the same
          malware or the same exploit…not good!!!




53
PDF Example Features
     • Compressed JavaScript
     • PDF header location, e.g. %PDF- within first 1024 bytes
     • Does it contain an embedded file (e.g. flash, sound file)
     • Signed by a trusted certificate
     • Encoded/Encrypted Streams e.g. FlateDecode
     • Names hex escaped
     • Bogus xref table

     Reference: http://blog.fireeye.com/files/27c3_julia_wolf_omg-wtf-pdf.pdf
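
A few of the features above can be pulled out with plain substring and regex checks. This is a deliberately crude sketch: `pdf_features` is a hypothetical helper, it does not parse the PDF object structure, and it cannot see inside compressed streams, so it illustrates the idea rather than a usable extractor.

```python
import re

# A PDF name containing a #xx hex escape, e.g. /J#61vaScript.
HEX_ESCAPED_NAME = re.compile(rb"/[A-Za-z0-9]*#[0-9A-Fa-f]{2}")

def pdf_features(data: bytes) -> dict:
    """Extract a handful of boolean features from raw PDF bytes."""
    header_at = data.find(b"%PDF-")
    return {
        "header_offset": header_at,
        "header_in_first_1024": 0 <= header_at < 1024,
        "has_javascript": b"/JavaScript" in data or b"/JS" in data,
        "has_embedded_file": b"/EmbeddedFile" in data,
        "has_flate_stream": b"/FlateDecode" in data,
        "has_hex_escaped_name": HEX_ESCAPED_NAME.search(data) is not None,
    }
```

Note how a hex-escaped name hides `/JavaScript` from the naive substring check, which is why escaped names are a feature in their own right.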



54
Detecting shellcode
     • Markov chains
       To determine probability of
       instruction sequences

     • Technique clustering

     [Figure: state-transition diagram with edge probabilities 0.3, 0.7, 0.4, 0.6]


          XOR   ECX, ECX               ;   ECX = 0
          MOV   ESI, [FS:ECX + 0x30]   ;   ESI = &(PEB) ([FS:0x30])
          MOV   ESI, [ESI + 0x0C]      ;   ESI = PEB->Ldr
          MOV   ESI, [ESI + 0x1C]      ;   ESI = PEB->Ldr.InInitOrder
      next_module:
          MOV   EBP, [ESI + 0x08]      ;   EBP = InInitOrder[X].base_address
          MOV   EDI, [ESI + 0x20]      ;   EDI = InInitOrder[X].module_name (unicode)
          MOV   ESI, [ESI]             ;   ESI = InInitOrder[X].flink (next module)
          CMP   [EDI + 12*2], CL       ;   modulename[12] == 0 ?
          JNE   next_module            ;   No: try next module.
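
A first-order Markov chain over instruction mnemonics can be sketched in a few lines. The training input here is just the mnemonic sequence of the listing above, so the resulting probabilities are toy values; a real model would be trained on a large corpus, and the `unseen` floor is an arbitrary assumption.

```python
from collections import Counter, defaultdict

def train_transitions(sequences):
    """Estimate P(next mnemonic | current mnemonic) from training sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
            for cur, c in counts.items()}

def sequence_probability(model, seq, unseen=1e-6):
    """Multiply transition probabilities along seq; unseen transitions get a small floor."""
    prob = 1.0
    for current, nxt in zip(seq, seq[1:]):
        prob *= model.get(current, {}).get(nxt, unseen)
    return prob

# Mnemonics of the listing above, in order.
LISTING = ["XOR", "MOV", "MOV", "MOV", "MOV", "MOV", "MOV", "CMP", "JNE"]
model = train_transitions([LISTING])
```

A sequence that follows the learned transitions scores far higher than one that does not, and that score can itself be used as a feature.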


55
Shellcode detection


     Decoder routine clustering
     Measure the entropy of bytes to indicate an encoded
     payload

     ...features =]
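
Byte entropy is straightforward to compute. A minimal sketch of Shannon entropy in bits per byte; what threshold counts as "probably encoded" is left open here, since that depends on the data being inspected.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n)
               for n in Counter(data).values())
```

Plain text and code usually land well below 8.0, while compressed or encrypted data sits close to it.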


56
Malware features in action …
     • Features:
       •Static:
          • Packed
          • File size
          • Origin
       •Dynamic (Network)
          • Makes a connection
          • Number of DNS requests
          • Encrypted Communication
          • Burst/length of communication
       •Dynamic (File)
          • Registry keys
          • File level modifications
57
The Future of Malware Defense
     • Choose features that are harder for the attacker to
       change.
       •E.g. bot network communication protocol
        (if not encrypted)




58
Agenda
     • Current State of Automated Malware Generation
     • Current State of Malware Defense (Tech.)
     • Malware Trends
     • The Future of Malware Defense
     • The Future of Automated Malware Generation




59
The Future of Automated
       Malware Generation




60
The Future of Malware Offense
     The Attacker has a few things in their favor:
     1.Prone to False Positives
          Machine learning can be prone to false positives and false negatives
          if feature and sample sets aren’t extensive enough
     2.Avoid Feature Indicators
          Detection via machine learning can be defeated if an attacker can
          find out where the features are and avoid them
     3.New Features Come Out…
          You can't protect yourself from a new weapon if you don't know it
          exists


61
Prone to false positives
     If the defense side creates models based on a small sample
     set, or on a sample set that isn’t diverse enough, then
     the model will be too restrictive: false negatives


     If the defense creates models based only on malicious files
     and not enough good files there will be tons of false positives


     An attacker can always try to poison the sample sets if they have
     enough manipulation power and resources (VirusTotal)
62
Avoid feature indicators
     • Attackers can always do the same research and model generic
       malware and avoid features that are being used by most
       malware
     • …to instead use features that are more popular in benign
       software
     • This will also avoid being placed in known clusters




63
New features come out…
     • If format changes, or gets updated:

       •A new file/protocol parser must be created/updated to
        understand and extract features

       •The model must be retrained and shipped out




64
…OR just keep it simple
     Encrypt binaries with a user-specific key so that AV
     can’t decrypt it

     •Targeted binary like Gauss
       •Encrypted DLL with user key


     •Zeus
       •Encrypted the downloaded binary with user key




65
Conclusion
     • Complex/Organized Network
     • Malware distribution network (MDNs)
       •Pay-per-install (PPI) clients
       •Malware crypt services will include
         • Feature verification
            •   anti-clustering technology -> the future?
            •   anti-classification technology -> the future?

     Will this be the future of automated
     malware generation? Or will it just be more
                              of the same?
66
Conclusion

     Today, what I hope that you learned is that if
     you want to truly understand your defensive
     technology you have to understand its
     limitations and look at things from an
     attacker/offensive viewpoint.




67
Conclusion

     Proper security is all about a defense-in-depth
     strategy. Create multiple layers of defense.
     Every layer presenting a different set of
     challenges, requiring different skill sets and
     technology.
     So every layer will increase the time and effort
     to compromise your environment and
     exfiltrate data.
68
Conclusion

     External reconnaissance
     Penetration
     Internal reconnaissance + stage persistent state
     Exfiltration

     If security strategy is successful:
     via your layered defenses the attack is stopped
     before exfiltration of data can happen.
69
Questions?

     questions.py:
     while len(questions) > 0:
       if time <= 0:
           break
       print(answers[questions.pop()])

70
Thanks PacSec!
            Stephan Chenette | @StephanChenette
            Director of Research and Development


               IOActive, Inc. http://ioactive.com




71

Demystifying Binary Reverse Engineering - Pixels CampDemystifying Binary Reverse Engineering - Pixels Camp
Demystifying Binary Reverse Engineering - Pixels Camp
 

The Future of Automated Malware Generation

  • 1. The Future of Automated Malware Generation Stephan Chenette Director of Security Research & Development 1
  • 2. Who Am I? • Stephan Chenette @StephanChenette (twitter) • Currently Director of Security R&D @ IOActive •Building / Breaking / Hacking / Researching • R&D @ eEye Digital Security 4+ years • Head Security Researcher @ Websense 6+ years • (Graduate Student @ UCSD - Network Security) 2
  • 3. What I hope you learn… • An understanding of the current malware landscape • Various malware/exploit defense techniques • Where I think detection/defense technologies are headed • How malware authors will most likely react, and how that will drive the future of automated malware generation
  • 4. Statement This particular topic/area is a personal research interest of mine – I’m hoping to basically motivate you to think offensively when building or using defensive technologies… For Example: I’m currently helping on an open source automated detection technology for the cuckoo sandbox – and am trying to evade/bypass it at the same time 4
  • 5. Agenda • Current State of Automated Malware Generation • Current State of Malware Defense (Tech.) • Malware Trends • The Future of Malware Defense • The Future of Automated Malware Generation 5
  • 7. Malware Distribution Networks Malware has evolved into a profitable business for cyber criminals •Complex/Organized/Distributed Network •Malware Distribution Network (MDNs) •Pay-per-install (PPI) clients (RogueAV, SpamBot, keylogger) •PPI Services •PPI Affiliates (landing pages, redirection services, etc.) 7
  • 8. Malware Distribution Networks (MDNs) 2 3 4 1 Source: Microsoft Security Intelligence Threat Report (http://www.microsoft.com/sir ) 8
  • 9. Malware Distribution Networks (MDNs) Single Sample Repository A repository that does not update the malicious executable for the lifetime of the repository. Multiple Sample Repository A repository that performs updates to the malicious executable over time, but is not generating the samples for each request Polymorphic/Metamorphic Repository A repository that produces a unique malicious executable for every download request 9
  • 10. Example: Blackhole Exploit Kit Blackhole contains an integrated AV scanner and will auto-repackage if malware is detected Figure: Blackhole exploit kit download chain Source: Manufacturing Compromise: The Emergence of Exploit-as-a-Service (http://cseweb.ucsd.edu/~voelker/pubs/eaas-ccs12.pdf) 10
  • 11. Exploit Kits and Malware Blackhole | Incognito || ZeroAccess | TDSS Source: Manufacturing Compromise: The Emergence of Exploit-as-a-Service (http://cseweb.ucsd.edu/~voelker/pubs/eaas-ccs12.pdf)
  • 12. Agenda • Current State of Automated Malware Generation • Current State of Malware Defense (Tech.) • Malware Trends • The Future of Malware Defense • The Future of Automated Malware Generation 12
  • 13. Current State of Malware Defense (Tech.) 13
  • 14. Current Techniques • Hash • Signatures • Heuristics • Semantics-aware detection 14
  • 15. Current Techniques Attacker Defender Easier to bypass Easier to implement Harder to change Harder to implement 15
  • 16. Hash-based detection • Full file hashing (cryptographic checksum) •MD5, SHA1, SHA256 • Portable Executable (PE) •Sectional hashing •Custom hashing •Fuzzy hashing (ssdeep) • Err on the side of caution
  • 17. Defeating Hash-based detection • Create Unique malware sample per user request •Randomizing single byte in irrelevant file offset •Re-packaging binary (FSG, ASPack, Themida) •Re-building malware dynamically 17
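A minimal sketch of why full-file hashing is brittle, assuming a SHA-256 blacklist. The helper name `is_blacklisted` and the byte string standing in for a file are illustrative and not from the talk.

```python
import hashlib


def is_blacklisted(data: bytes, blacklist: set) -> bool:
    """Full-file hash lookup: flag the file only on an exact digest match."""
    return hashlib.sha256(data).hexdigest() in blacklist


# Harmless stand-in for file contents (hypothetical, for illustration only).
original = b"MZ" + bytes(62) + b"stand-in for the rest of a binary"
blacklist = {hashlib.sha256(original).hexdigest()}

# Flip one bit at an arbitrary offset: the digest no longer matches.
mutated = bytearray(original)
mutated[10] ^= 0x01
variant = bytes(mutated)

print(is_blacklisted(original, blacklist))
print(is_blacklisted(variant, blacklist))
```

Sectional and fuzzy hashing, listed on the previous slide, tolerate this kind of small change better because they do not require the whole file to be byte-identical.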
  • 18. Signature-based detection • Regular Expression based signatures (PCRE, RE2) • Byte-signatures
      rule ASPack {
          strings:
              $ = { 60 E8 ?? ?? ?? ?? 5D 81 ED ?? ?? (43 | 44) ?? B8 ?? ?? (43 | 44) ?? 03 C5 }
              $ = { 60 EB ?? 5D EB ?? FF ?? ?? ?? ?? ?? E9 }
              $ = { 60 EB 03 5D FF E5 E8 F8 FF FF FF 81 ED 1B 6A 44 00 BB 10 6A 44 00 03 DD 2B 9D 2A }
              $ = { 60 E8 00 00 00 00 5D ?? ?? ?? ?? ?? ?? BB ?? ?? ?? ?? 03 DD }
              $ = { 60 E8 41 06 00 00 EB 41 }
              $ = { 60 E8 7? 05 00 00 EB (33 | 4C) }
          condition:
              for any of them : ($ at entrypoint)
      }
    • Deeper contextual content scanning with proprietary language
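As a rough illustration of how a byte signature with `??` wildcards can be matched, the sketch below translates one of the hex strings above into a Python bytes regex. It is a simplification, not YARA itself: alternations such as `(43 | 44)` and nibble wildcards such as `7?` are not handled, and the function names are made up for this example.

```python
import re


def compile_hex_signature(pattern: str) -> re.Pattern:
    """Translate a hex signature with ?? wildcards into a bytes regex."""
    parts = []
    for token in pattern.split():
        if token == "??":
            parts.append(b".")  # any single byte (DOTALL includes 0x0A)
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    return re.compile(b"".join(parts), re.DOTALL)


def matches_at(data: bytes, signature: re.Pattern, offset: int) -> bool:
    """Check the signature at one fixed offset, e.g. a PE entry point."""
    return signature.match(data, offset) is not None


SIG = compile_hex_signature(
    "60 E8 00 00 00 00 5D ?? ?? ?? ?? ?? ?? BB ?? ?? ?? ?? 03 DD"
)

# Synthetic bytes that fit the pattern; the wildcard positions are arbitrary.
sample = (bytes.fromhex("60E8000000005D") + b"\x0a\x01\x02\x03\x04\x05"
          + b"\xbb" + b"\x10\x20\x30\x40" + b"\x03\xdd")
print(matches_at(sample, SIG, 0))
```

Because the match is purely syntactic, changing any fixed byte of the stub is enough to miss it, which is the weakness the next slide exploits.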
  • 19. Defeating Signature-based detection • Syntax mutation easily defeats this technique • Garbage Code Insertion e.g. NOP, “MOV ax, ax”, “SUB ax, 0” • Register Renaming • Subroutine Permutation • Code Reordering through Jumps • Equivalent instruction substitution
      Instruction      Equivalent instruction
      MOV EAX, EBX     PUSH EBX; POP EAX

      Call             Emulated Call                             Misused Call
      CALL <target>    PUSH <PC + sizeof(PUSH) + sizeof(JMP)>    CALL <target>
                       JMP <target>                              .target
                                                                 POP <register-name>
    • Same behavior but different syntax
  • 20. Heuristics are introduced… AV engines were forced to evolve and use heuristics by way of emulation/behavioral analysis due to: •Polymorphic engines • Encrypt body with randomly generated encryption algorithm • Private key normally in decoding engine •Metamorphic engines • Employs obfuscation/substitution techniques instead of encryption • Junk insertion, equivalent instruction substitution, etc. 20
  • 21. Heuristics-based detection General term for the different techniques used to detect malware by their behavior Emulation, API hooking, sand-boxing, file anomalies and other analysis techniques Rule A Rule B Rule C IF Rule A then Rule B then Rule C then Poison Ivy Source: (http://hooked-on-mnemonics.blogspot.com)
  • 22. Defeating Heuristics-based detection • Detect emulation and execute different code path • Break emulation engine • Avoid the heuristics • Overall solid method • Possible false positives 22
  • 23. Semantics-aware Detection • Captured execution trace is transformed into a higher-level representation capturing its semantic meaning, i.e., the trace is first abstracted before being compared to a malicious behavior • Make the time to build the code flow or extraction of a model infeasible for real-time AV using time lock puzzles • Intermediate representation (IR) • Abstract Syntax Trees, Register Transfer Language 23
  • 24. Semantics-aware detection Good idea in theory, but unknown (to me) how widely implemented this is in security products 24
  • 25. Defeating Semantics-aware detection Implementation is difficult Limited support for equivalent code sequences a = b * 2 a = b << 1 A left arithmetic shift by n is equivalent to multiplying by 2^n (provided the value does not overflow) Focus on the same techniques used to defeat signatures and heuristics + likelihood of limited support for less popular instructions
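To make the "equivalent code sequences" point concrete, here is a toy normalizer that rewrites a left shift by a constant into the equivalent multiplication, so both forms compare equal. It works on Python source through the standard `ast` module purely for illustration; a real semantics-aware engine would operate on an intermediate representation of machine code, and the class and function names here are hypothetical.

```python
import ast


class ShiftToMultiply(ast.NodeTransformer):
    """Rewrite `x << n` (n a non-negative int constant) as `x * 2**n`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # normalize nested expressions first
        right = node.right
        if (isinstance(node.op, ast.LShift)
                and isinstance(right, ast.Constant)
                and isinstance(right.value, int)
                and not isinstance(right.value, bool)
                and right.value >= 0):
            factor = ast.Constant(value=2 ** right.value, kind=None)
            return ast.BinOp(left=node.left, op=ast.Mult(), right=factor)
        return node


def canonical_form(source: str) -> str:
    """Return a structural dump of the normalized syntax tree."""
    tree = ShiftToMultiply().visit(ast.parse(source))
    return ast.dump(tree)


print(canonical_form("a = b << 1") == canonical_form("a = b * 2"))
```

Every such equivalence has to be taught to the normalizer one rule at a time, which is why limited coverage of equivalent sequences is a practical evasion route.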
  • 27. Agenda • Current State of Automated Malware Generation • Current State of Malware Defense (Tech.) • Malware Trends • The Future of Malware Defense • The Future of Automated Malware Generation 27
  • 29. Malware Detection Reality Check • How well are current detection techniques working? 33%! 29
  • 30. Malware Samples Observation: the number of malware samples is increasing Source: McAfee Global Q1 2012 Threat Report (http://mcafee.com/us/resources/reports/rp-quarterly-threat-q1-2012.pdf)
  • 31. Mobile Malware Samples Observation: the number of Android malware samples is increasing Source: Kaspersky Q1 2012 Threat Report (http://www.securelist.com/en/analysis/204792231/IT_Threat_Evolution_Q1_2012)
  • 32. Use of Behavior Sandboxes Client binary is malware but isn’t detected. Suspicious files are sent back to “home base/cloud” lab for analysis 1.Sent to sandbox system 2.Meta data report is created for easier export of new rules a. Hash and blacklist entries are added b. Signatures are added c. Heuristic detection is added 32
  • 34. Solving the problem with people [Figure: malware analysts vs. malware samples: OVERLOAD!!]
  • 35. Agenda • Current State of Automated Malware Generation • Current State of Malware Defense (Tech.) • Malware Trends • The Future of Malware Defense • The Future of Automated Malware Generation 35
  • 36. The Future of Malware Defense Skynet? …probably not But some of the concepts aren’t too far fetched… 36
  • 37. The Future of Malware Defense Perhaps malware detection should have more science applied to it. 37
  • 38. The Malware Infinity Problem Malware detection As the number of malware samples approaches ∞ we can’t manually add detection for every file. We must model WHAT actions malware takes, HOW it takes those actions and WHERE it connects. Malware Attribution As the attack surface approaches ∞ we can’t defend everything from everyone. We must model WHO is after WHICH assets and HOW they attack.
  • 39. The Future of Malware Defense IF we are going to start modeling we must make some assumptions: 1.Attackers are going to change their code and techniques only enough to avoid detection 2.The majority of malware/exploits code and techniques will continue to represent future malware/exploits code and techniques 39
  • 40. The Who is important… “Researchers at Symantec traced the group’s work after finding a number of similarities between the Google attack code and methods and those used against other companies and organizations over the last few years. The researchers, who describe their findings in a report published Friday, say the gang — which they have dubbed the “Elderwood gang” based on the name of a parameter used in the attack codes — appears to have breached more than 1,000 computers in companies spread throughout several sectors – including defense, shipping, oil and gas, financial, technology and ISPs. The group has also targeted non-governmental organizations, particularly ones connected to human rights activities related to Tibet and China” Source: http://www.wired.com/threatlevel/2012/09/google-hacker-gang-returns/
  • 41. Statistics A discipline that makes you understand data and makes you make decisions based on data [Figure: Data → STATISTICS → Decisions]
  • 42. Train the Machines •Classify •Cluster 42
  • 43. Automatic Classification Steps: 1.Extract features 2.Train models using ML algorithms 3.Feature Selection 4.Use models as classifiers 5.Use models to classify unknown files as 0 or 1 Source: http://eval.symantec.com/mktginfo/enterprise/white_papers/b-dlp_machine_learning.WP_en-us.pdf 43
  • 44. Machine learning Where we train computers to make statistical decisions on real-time data based on inputted data While machine learning as a concept has been around for decades and has been used in everything from anti-spam engines to Google™ algorithms for translating text, it is only now being applied to web filtering, DLP and malware content analysis. 44
  • 45. Historical Observation Historically certain malware has •No icon •No description or company in resource section •Is packed •Lives in windows directory or user profile These are the type of “features” that expert humans would feed to machine learning classifiers to train on 45
  • 46. Expert Humans train Machines “You can’t effectively and consistently manage what you can’t measure, and you can’t measure what you haven’t defined…” SOURCE: http://fairwiki.riskmanagementinsight.com/?page_id=3 •The job of the human •List features •The job of the machine •Model which features are important, in what grouping and in what order •Classify •Cluster 46
  • 47. Machine Learning (ML) Algorithms • Naive Bayesian Classifier (assumes each feature is independent of the other features) • Support Vector Machine (SVM) for high dimensionality (i.e. more than a thousand variables in the model) • Random Forest when you want an interpretable model (< 2000 features) • Markov Chains (Natural Language Processing) for when you want to assess the sequence probability
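A small sketch of the first algorithm in the list: a Bernoulli Naive Bayes classifier over the binary features from the "Historical Observation" slide. The training rows are invented for illustration and carry no statistical weight; the feature names and function names are assumptions, not taken from any product.

```python
import math

# Binary features suggested by the "Historical Observation" slide.
FEATURES = ("no_icon", "no_description", "is_packed", "in_windows_dir")


def train(samples):
    """samples: list of (feature tuple of 0/1, label); label 1 = malicious."""
    model = {}
    for label in (0, 1):
        rows = [features for features, y in samples if y == label]
        prior = len(rows) / len(samples)
        # Laplace smoothing keeps every probability strictly inside (0, 1).
        probs = [(sum(row[i] for row in rows) + 1) / (len(rows) + 2)
                 for i in range(len(FEATURES))]
        model[label] = (prior, probs)
    return model


def classify(model, features):
    """Return the label with the highest log-posterior."""
    scores = {}
    for label, (prior, probs) in model.items():
        score = math.log(prior)
        for value, p in zip(features, probs):
            score += math.log(p if value else 1.0 - p)
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training rows: four "malicious" and four "benign" samples.
TRAINING = [
    ((1, 1, 1, 1), 1), ((1, 1, 1, 0), 1), ((1, 0, 1, 1), 1), ((0, 1, 1, 1), 1),
    ((0, 0, 0, 0), 0), ((0, 0, 1, 0), 0), ((0, 1, 0, 0), 0), ((0, 0, 0, 1), 0),
]
model = train(TRAINING)
print(classify(model, (1, 1, 1, 1)), classify(model, (0, 0, 0, 0)))
```

The human supplies the feature list; the model learns how much each feature should count, which is the division of labor described on the "Expert Humans train Machines" slide.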
  • 48. The Future of Malware Defense Network File System Physical Memory Inspection Point Every Layer provides various degrees of “features” to inspect 48
  • 49. The Future of Malware Defense 49
  • 50. Existing Academic work… • D. Plonka and P. Barford. Context-Aware Clustering of DNS Query Traffic. In Proceedings of the 8th ACM SIGCOMM conference on Internet Measurement, October 2008. • R. Perdisci, W. Lee, and N. Feamster. Behavioral Clustering of HTTP-Based Malware and Signature Generation Using Malicious Network Traces. In Proceedings of the 7th USENIX conference on Networked Systems Design and Implementation, April 2010. • K. Rieck, P. Trinius, C. Willems, T. Holz. Automatic Analysis of Malware Behavior using Machine Learning. Journal of Computer Security, 2011
  • 51. Projects using machine learning •Razorback™ - http://sourceforge.net/projects/razorbacktm/files/ •Malheur - http://www.mlsec.org/malheur/ •Malvic - http://www.malvic.org •Adobe Open Source Malware Classification Tool http://sourceforge.net/projects/malclassifier.adobe/ • 98.21% accuracy • 6.7% false positive rate • 7 features = DebugSize, ImageVersion, IatRVA, ExportSize, ResourceSize, VirtualSize2, NumberOfSections
  • 53. The Future of Malware Defense •Using machine learning for malware detection is only as useful as the features you create and the good and bad sample sets it’s trained on. • Features • Good Sample Set • Bad Sample Set • If you have 1000’s of samples but they are all the same malware or the same exploit…not good!!!
  • 54. PDF Example Features • Compressed JavaScript • PDF header location e.g. %PDF- within first 1024 bytes • Does it contain an embedded file (e.g. flash, sound file) • Signed by a trusted certificate • Encoded/Encrypted Streams e.g. FlateDecode • Names hex escaped • Bogus xref table Reference: http://blog.fireeye.com/files/27c3_julia_wolf_omg-wtf-pdf.pdf
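A hedged sketch of how a few of the PDF features above could be turned into a feature dictionary with plain byte searches. It deliberately does not parse the PDF structure, so it is a rough approximation only; the feature keys and the function name are invented for this example.

```python
import re

# A PDF name containing a #xx hex escape, e.g. /J#61vaScript.
HEX_ESCAPED_NAME = re.compile(rb"/[A-Za-z0-9]*#[0-9A-Fa-f]{2}")


def pdf_features(data: bytes) -> dict:
    """Extract a few coarse boolean features from raw PDF bytes."""
    header_at = data.find(b"%PDF-")
    return {
        "header_in_first_1024": 0 <= header_at < 1024,
        "header_at_offset_zero": header_at == 0,
        "has_javascript": b"/JavaScript" in data or b"/JS" in data,
        "has_embedded_file": b"/EmbeddedFile" in data,
        "has_flate_stream": b"/FlateDecode" in data,
        "has_hex_escaped_name": HEX_ESCAPED_NAME.search(data) is not None,
    }


# Tiny synthetic fragments, not valid PDF documents.
suspicious = b"%PDF-1.4\n1 0 obj << /S /J#61vaScript /JS (1) >> endobj"
late_header = b"junk" * 300 + b"%PDF-1.7\n"
print(pdf_features(suspicious))
```

Hex-escaped names matter because they hide keywords from exactly this kind of substring search: `/J#61vaScript` is read as `/JavaScript` by a PDF parser but does not contain that literal byte string.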
  • 55. Detecting shellcode • Markov chains to determine probability of instruction sequences [Figure: state diagram with transition probabilities 0.3, 0.7, 0.4, 0.6] • Technique clustering
          XOR ECX, ECX                ; ECX = 0
          MOV ESI, [FS:ECX + 0x30]    ; ESI = &(PEB) ([FS:0x30])
          MOV ESI, [ESI + 0x0C]       ; ESI = PEB->Ldr
          MOV ESI, [ESI + 0x1C]       ; ESI = PEB->Ldr.InInitOrder
      next_module:
          MOV EBP, [ESI + 0x08]       ; EBP = InInitOrder[X].base_address
          MOV EDI, [ESI + 0x20]       ; EDI = InInitOrder[X].module_name (unicode)
          MOV ESI, [ESI]              ; ESI = InInitOrder[X].flink (next module)
          CMP [EDI + 12*2], CL        ; modulename[12] == 0 ?
          JNE next_module             ; No: try next module.
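The Markov-chain idea can be sketched as a first-order model over instruction mnemonics: count which mnemonic follows which in known samples, then score a new sequence by multiplying transition probabilities. The training sequence below is simply the mnemonics of the listing above; the function names and the `floor` value for unseen transitions are assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_transitions(sequences):
    """Estimate P(next mnemonic | current mnemonic) from example sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in counts.items()
    }


def sequence_probability(transitions, seq, floor=1e-6):
    """Multiply transition probabilities; unseen transitions get `floor`."""
    prob = 1.0
    for current, following in zip(seq, seq[1:]):
        prob *= transitions.get(current, {}).get(following, floor)
    return prob


# Mnemonics of the PEB-walking listing on this slide.
PEB_WALK = ["XOR", "MOV", "MOV", "MOV", "MOV", "MOV", "MOV", "CMP", "JNE"]
transitions = train_transitions([PEB_WALK])

print(sequence_probability(transitions, ["XOR", "MOV", "CMP", "JNE"]))
print(sequence_probability(transitions, ["JNE", "XOR"]))
```

A real model would be trained on many samples and would likely include operands or operand classes, since mnemonics alone are too coarse to separate shellcode from ordinary code.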
  • 56. Shellcode detection Decoder routine clustering Detect entropy of bytes to indicate an encoded payload ...features =]
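The entropy feature mentioned above can be computed with the standard Shannon formula over byte frequencies. This is a minimal sketch; the function name is illustrative, and any cut-off for "looks encoded" would have to be chosen empirically.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, between 0.0 and 8.0."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(data).values())


nop_sled = b"\x90" * 64           # a single repeated byte value
all_values = bytes(range(256))    # every byte value exactly once

print(shannon_entropy(nop_sled), shannon_entropy(all_values))
```

Encrypted or compressed payloads tend toward the high end of the range, so a sliding window of this measure over a buffer is one way to locate an encoded region next to a short, low-entropy decoder stub.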
  • 57. Malware features in action … • Features: •Static: • Packed • File size • Origin •Dynamic (Network) • Makes a connection • Number of DNS requests • Encrypted Communication • Burst/length of communication •Dynamic (File) • Registry keys • File level modifications
  • 58. The Future of Malware Defense • Choose features that are harder for the attacker to change. •E.g. bot network communication protocol (if not encrypted) 58
  • 59. Agenda • Current State of Automated Malware Generation • Current State of Malware Defense (Tech.) • Malware Trends • The Future of Malware Defense • The Future of Automated Malware Generation 59
  • 60. The Future of Automated Malware Generation 60
  • 61. The Future of Malware Offense The Attacker has a few things in their favor: 1.Prone to False Positives Machine learning can be prone to false positives and false negatives if feature and sample sets aren’t extensive enough 2.Avoid Feature Indicators Detection via machine learning can be defeated if an attacker can find out what the features are and avoid them 3.New Features Come Out… You can't protect yourself from a new weapon if you don't know it exists
  • 62. Prone to false positives If the defense side creates models based on a small sample set or a sample set that isn’t diverse enough, then the model will be too restrictive – false negatives If the defense creates models based only on malicious files and not enough good files there will be tons of false positives An attacker can always try to poison the sample sets if they have enough manipulation power and resources (VirusTotal)
  • 63. Avoid feature indicators • Attackers can always do the same research and model generic malware and avoid features that are being used by most malware • …to instead use features that are more popular in benign software • This will also avoid being placed in known clusters
  • 64. New features come out… • If format changes, or gets updated: •A new file/protocol parser must be created/updated to understand and extract features •The model must be retrained and shipped out 64
  • 65. …OR Just keep it simple Encrypt binaries with a user-specific key so that AV can’t decrypt it •Targeted binary like Gauss •Encrypted DLL with user key •Zeus •Encrypted the downloaded binary with user key
  • 66. Conclusion • Complex/Organized Network • Malware distribution network (MDNs) •Pay-per-install (PPI) clients •Malware crypt services will include • Feature verification • anti-clustering technology → the future? • anti-classification technology → the future? Will this be the future of automated malware generation? Or will it just be more of the same?
  • 67. Conclusion Today, what I hope that you learned is that if you want to truly understand your defensive technology you have to understand its limitations and look at things from an attacker/offensive viewpoint.
  • 68. Conclusion Proper security is all about a defense-in-depth strategy. Create multiple layers of defense, every layer presenting a different set of challenges and requiring different skill sets and technology, so that every layer increases the time and effort needed to compromise your environment and exfiltrate data.
  • 69. Conclusion External reconnaissance Penetration Internal reconnaissance + stage persistent state Exfiltration If security strategy is successful: via your layered defenses the attack is stopped before exfiltration of data can happen. 69
  • 70. Questions? questions.py:
      while len(questions) > 0:
          if time <= 0:
              break
          print(answers[questions.pop()])
  • 71. Thanks Pacsec! Stephan Chenette | @StephanChenette Director of Research and Development IOActive, Inc. http://ioactive.com 71