2012 B-Sides and ToorCon Talk Offensive Defense
Blog Post - http://blog.ioactive.com/2013/01/offensive-defense.html
Cyber-criminals have for many years had back-end infrastructures, equivalent to VirusTotal, to test whether malware and exploits are effective against AV scanners, showing that attackers proactively avoid detection when building malware. In this day and age malicious binaries are generated on demand by server-side kits when a victim visits a malicious web page, making reliance solely on hash-based solutions inadequate. Over the last 15 years detection techniques have evolved in an attempt to keep up with attack trends. In the last few years security companies have looked to supplemental solutions such as machine learning to detect and mitigate attacks by cyber criminals. Let's not pretend attackers can't bypass each and every detection technique currently deployed. Join me as I present and review the detection methods found in most host and network security solutions today. We will revisit the defense-in-depth strategy, keeping in mind that a solid security strategy forces an attacker to spend as much time and effort as possible, and to master a variety of skills and technologies, in order to pull off a successful attack. In the end I hope to convince you that thinking defensively requires thinking offensively.
B-Sides Seattle 2012 Offensive Defense
1. Offensive Defense
Stephan Chenette,
Director of Security Research & Development
2. Who Am I?
• Stephan Chenette
• Director of Security R&D @ IOActive
•Building / Breaking / Hacking / Researching
• R&D @ eEye Digital Security 4+ years
• Head Security Researcher @ Websense ~6 years
• (Graduate Student @ UCSD - Network Security)
3. What I will NOT talk about
• Offensive Defense
•Active Defense
• Retaliating during an attack
• Striking back against adversaries
• Technical/Legal
• HoneyTraps
•CrowdStrike, Cylance, …Facebook, etc.
Recent Discussions:
• http://www.forbes.com/sites/jodywestby/2012/11/29/caution-active-response-to-cyber-attacks-has-high-risk/
• http://blogs.csoonline.com/security-leadership/2469/caution-not-executing-offensive-actions-against-our-adversaries-high-risk
• http://www.honeynet.org/node/1004
4. What I WILL talk about
• Offensive Defense
•“Smart Defense”
•Understanding of malware/exploit defense techniques
• Ability to question and call BS on unrealistic marketing/sales claims
•Current Malware Distribution Networks (MDNs)
•Explanation of defense techniques
•Attacking defense techniques
•(Note: This is similar to a talk I did at EkoParty Argentina 2012)
5. Statement
Research in evading defensive technology is a personal research interest of mine.
7. Malware Distribution Networks
Malware has evolved into a profitable business for
cyber criminals
•Complex/Organized/Distributed Network
•Malware Distribution Networks (MDNs)
•Pay-per-install (PPI) clients (RogueAV, SpamBot, keylogger)
•PPI Services
•PPI Affiliates (landing pages, redirection services, etc.)
8. Malware Distribution Networks (MDNs)
Source: Microsoft Security Intelligence Threat Report (http://www.microsoft.com/sir )
9. Malware Distribution Networks (MDNs)
Single Sample Repository
A repository that does not update the malicious
executable for the lifetime of the repository.
Multiple Sample Repository
A repository that performs updates to the malicious
executable over time, but is not generating the
samples for each request
Polymorphic/Metamorphic Repository
A repository that produces a unique malicious
executable for every download request
15. Attacking Defense Techniques
1. Defense technologies need to keep latency low
…so they sacrifice analysis to that end
(if a connection/analysis is taking too long they will in
some cases fail open)
2. Correct Implementation is difficult
16. Current Techniques
Attacker Defender
Easier to bypass Easier to implement
Harder to change Harder to implement
17. Hash detection
• Full file hashing
•MD5, SHA1, SHA256
• Portable Executable (PE)
•Sectional hashing
•Custom hashing
•Fuzzy hashing (ssdeep)
• Err on the side of caution
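Why sufficiently altering a binary defeats full-file hashing can be sketched in a few lines of Python (the stand-in bytes below are hypothetical, not a real PE):

```python
import hashlib

def file_hashes(data: bytes) -> dict:
    """Full-file hashes of the kind used for blacklisting."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

original = b"MZ\x90\x00" + b"\x00" * 60   # stand-in for a PE image
mutated = original[:-1] + b"\x01"         # attacker flips a single byte

# One changed byte invalidates every full-file hash in the blacklist.
for alg, digest in file_hashes(original).items():
    assert digest != file_hashes(mutated)[alg]
```

This fragility is exactly why sectional and fuzzy hashing (ssdeep) exist: they tolerate small changes by hashing sections or rolling windows rather than the whole file.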
19. Signatures
• Syntax mutation easily defeats this technique
• Garbage Code Insertion e.g. NOP, "MOV ax, ax", "SUB ax, 0"
• Register Renaming
• Subroutine Permutation
• Code Reordering through Jumps
• Equivalent instruction substitution
Instruction       Equivalent instruction
MOV EAX, EBX      PUSH EBX
                  POP EAX

Call              Emulated Call                             Misused Call
CALL <target>     PUSH <PC + sizeof(PUSH) + sizeof(JMP)>    CALL <target>
                  JMP <target>                              .target:
                                                            POP <register-name>
• Same behavior but different syntax
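A toy byte-signature matcher makes the effect concrete; the instruction encodings are standard 32-bit x86, while the signature itself is a hypothetical example:

```python
# Standard 32-bit x86 encodings:
#   MOV EAX, EBX      -> 89 D8
#   PUSH EBX; POP EAX -> 53 58   (same effect, different bytes)
SIGNATURE = bytes.fromhex("89d8")   # hypothetical AV byte signature

def signature_match(code: bytes) -> bool:
    return SIGNATURE in code

original_stub = bytes.fromhex("89d8c3")  # mov eax, ebx ; ret
mutated_stub = bytes.fromhex("5358c3")   # push ebx ; pop eax ; ret

assert signature_match(original_stub)
assert not signature_match(mutated_stub)  # same behavior, signature missed
```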
20. Signatures
AV engines were forced to evolve and use heuristics by
way of emulation/behavioral analysis due to:
•Polymorphic engines
• Encrypt body with randomly generated encryption
algorithm
• Decryption key normally embedded in the decoder stub
•Metamorphic engines
• Employs obfuscation/substitution techniques instead of encryption
• Junk insertion, equivalent instruction substitution, etc.
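A minimal polymorphic-packer sketch, using XOR with a random per-sample key as a stand-in for a randomly generated encryption algorithm:

```python
import os

KEY_LEN = 16

def pack(payload: bytes) -> bytes:
    """Encrypt the body with a fresh random key, prepended to the blob.
    A real engine would also emit a mutated decoder stub."""
    key = os.urandom(KEY_LEN)
    body = bytes(b ^ key[i % KEY_LEN] for i, b in enumerate(payload))
    return key + body

def unpack(blob: bytes) -> bytes:
    key, body = blob[:KEY_LEN], blob[KEY_LEN:]
    return bytes(b ^ key[i % KEY_LEN] for i, b in enumerate(body))

payload = b"malicious payload"
a, b = pack(payload), pack(payload)
assert a != b                              # every sample is byte-unique
assert unpack(a) == unpack(b) == payload   # but behaves identically
```

Because every emitted sample has different bytes, both hash blacklists and static byte signatures fail; only the tiny decoder stub is stable, which is why AV engines moved to emulation.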
21. Heuristics
General term for the different techniques used to
detect malware by their behavior
Emulation, API hooking, sand-boxing, file anomalies and other analysis techniques
Rule A
Rule B
Rule C
IF Rule A then Rule B then Rule C then Poison Ivy
Source: http://hooked-on-mnemonics.blogspot.com
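The rule chain above can be sketched as follows; the rule names and behavior events are hypothetical placeholders for what an engine would observe during emulation:

```python
# Hypothetical rules mapping to behaviors observed during emulation.
RULES = {
    "Rule A": "create_mutex",
    "Rule B": "inject_remote_thread",
    "Rule C": "connect_c2",
}

def matches_chain(trace, chain):
    """True only if the rules' behaviors appear in the trace in order."""
    events = iter(trace)  # membership tests consume the iterator, enforcing order
    return all(RULES[rule] in events for rule in chain)

trace = ["create_mutex", "read_registry", "inject_remote_thread", "connect_c2"]
if matches_chain(trace, ["Rule A", "Rule B", "Rule C"]):
    print("verdict: Poison Ivy")
```

The ordering requirement is also the weakness the next slide exploits: break any single rule (or detect the emulator and take a different code path) and the whole chain never fires.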
22. Heuristics
• Defeating heuristics
• Detect emulation and execute different code path
• Break emulation engine
• Avoid the heuristics if you can
• Overall solid method
• Possible false positives
23. Semantics-aware Detection
• Captured execution trace is transformed into a higher-level
representation capturing its semantic meaning, i.e., the trace
is first abstracted before being compared to a malicious
behavior
• Make the time to build the code flow or extract a model infeasible for real-time AV using time-lock puzzles
• A common anti-emulation trick is to introduce loops that take a relatively long time to compute. The loop may in fact take so long to emulate that the antivirus scanner gives up.
• A packed binary can be quickly created by an attacker that is guaranteed to require a predefined and easily adjustable number of computationally expensive operations to rebuild a cryptographic key. This key is then used in a strong cryptographic cipher to decrypt the next stage.
• Intermediate representation (IR)
• Abstract Syntax Trees, Register Transfer Language
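A minimal sketch of the key-rebuilding trick described above, assuming iterated SHA-256 as the computationally expensive operation:

```python
import hashlib

def derive_key(seed: bytes, iterations: int) -> bytes:
    """Rebuild the decryption key via a fixed number of expensive
    operations. The attacker tunes `iterations` so real hardware
    finishes quickly but a slow emulator gives up first."""
    key = seed
    for _ in range(iterations):
        key = hashlib.sha256(key).digest()
    return key

# The derived key would then feed a strong cipher to decrypt the next stage.
key = derive_key(b"embedded-seed", 200_000)
assert len(key) == 32
```

There is no shortcut: the scanner must perform every iteration (or time out and fail open), while the real CPU pays the cost only once at run time.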
24. Semantics-aware Detection
Good idea in theory, but unknown (to me) how widely
implemented this is in security products
25. Semantics-aware Detection
And how correctly is it implemented?
(e.g. it took Veracode 10+ years to get right)
Limited support for equivalent code sequences
a = b * 2
a = b << 1
A left arithmetic shift by n is equivalent to multiplying by 2^n
(provided the value does not overflow)
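A detector with limited support for equivalent code sequences can be sketched as a pattern list that only knows one spelling of the operation (the pattern set is hypothetical):

```python
import re

# The engine only knows the multiplication spelling of "double b"...
SEMANTIC_PATTERNS = [r"\w+\s*=\s*\w+\s*\*\s*2"]

def detects(source: str) -> bool:
    return any(re.search(p, source) for p in SEMANTIC_PATTERNS)

print(detects("a = b * 2"))   # True
print(detects("a = b << 1"))  # False -- equivalent code sequence, missed
```

A true semantics-aware engine would normalize both spellings into the same intermediate representation before matching; every equivalence it fails to translate is an evasion opportunity.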
27. Recap
Technology           Attack Technique
Hash-detection       Sufficiently altering binary/exploit
Signature-detection  Garbage Code Insertion
                     Register Renaming
                     Subroutine Permutation
                     Code Reordering through Jumps
                     Equivalent instruction substitution
                     Content Fragmentation
Heuristic-detection  Avoid matching the heuristic-detection
                     decision tree; add enough benign
                     functionality that heuristic detection
                     fails open due to false-positive mitigation
Semantic-detection   Avoid matching the semantic-detection
                     decision tree, or find a semantic which the
                     semantic-detection engine has not translated
                     properly (see heuristic-detection for more attacks)
28. Looking Beyond…
Too often the assumption is that when analyzing malware or a file exploit, all the malicious content to be found is within the file boundaries and available all at one moment in time.
This is not always the case…
Web pages – Script Fragmentation [link]
Mobile Apps (Java/JavaScript bridges [link][link])
30. Malware Detection Reality Check
Imperva Blog:
“Assessing the Effectiveness of Antivirus Solutions”
Excerpt:
'....Imperva collected and analyzed more than 80 previously
non-cataloged viruses against more than 40 antivirus solutions.
They found that less than 5% of anti-virus solutions in the study
were able to initially detect previously non-cataloged viruses and
that many solutions took up to a month or longer following the
initial scan to update their signatures......'
32. Trend of Malware Creation
Observation: # of Malware Samples are increasing
Source: McAfee Q1 2012 Threats Report
(http://mcafee.com/us/resources/reports/rp-quarterly-threat-q1-2012.pdf)
33. Trend of Malware Creation
Observation: # of Android Malware Samples are
increasing
Source: Kaspersky Q1 2012 Threat Report
(http://www.securelist.com/en/analysis/204792231/IT_Threat_Evolution_Q1_2012)
34. How I interpret those results…
current techniques
aren’t really
succeeding
35. Who qualifies AV defense products
AV-Test
Microsoft Security Essentials failed to recognize enough zero-day threats, with detection rates of only 69%, where the average is 89% [link]
AV-Comparatives
ICSA Labs
NSS Labs
EICAR
Etc.
36. Malware Defense is really hard…
• Benign software can look like malware
ClamAV itself is detected as malware
Why: its signature database isn't encrypted, so its strings match other scanners' signatures
37. Malware Defense is really hard…
• Malware can look/act like benign software
• ~70,000 new pieces of malware a day
• Gauss – Encrypted DLL
• Zeus – Downloading encrypted binaries
• Java or .NET malware – e.g. the recent "Japanese remote control malware virus" used to make death threats on web forums
• Starting with Vista and Windows Server 2008 and continuing into Windows
7, .NET is now a native part of the OS installation.
•Analysis of the byte code of an interpreted language
38. Typical Scenario
Client binary is malware but isn’t detected.
If considered suspicious, files are sent back to “home
base/cloud” lab for analysis (feedback mechanism)
1. Sent to sandbox system
2. Metadata report is created for easier export of new rules
a. Hash and blacklist entries are added
b. Signatures are added
c. Heuristic detection is added
40. Solving the problem with people
Malware Analysts vs. Malware Samples
OVERLOAD!!
41. The Future of Malware Defense
Perhaps there should be more science and statistical
modeling applied to malware defense – as an
additional layer.
42. Modeling attacks and attackers
Malware detection
As malware approaches ∞ we can't manually add detection for every file. We must model WHAT actions malware takes, WHERE it connects to, and HOW it performs its actions.
Attribution
As Attack Surface approaches ∞ we can’t defend
everything from everyone. We must model WHO is
after WHICH assets and HOW they attack.
43. The Future of Malware Defense
IF we are going to start modeling we must make
some assumptions:
•Attackers are lazy, they are going to change their
code and techniques only enough to avoid detection
•The majority of malware/exploits code and
techniques will continue to represent future
malware/exploits
44. Machine learning
Machine learning is where we train computers to make statistical decisions on real-time data based on previously seen training data.
While machine learning as a concept has been
around for decades and has been used in everything
from anti-spam engines to Google™ algorithms for
translating text, it is only now being applied to web
filtering, DLP and malware content analysis.
45. Statistics
Manual observation:
Historically, certain malware has:
•No icon
•No description or company in the resource section
•Is packed
•Lives in the Windows directory or user profile
These are the types of "features" that human experts would feed to machine learning classifiers
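A hand-rolled scorer over those features illustrates the idea; the weights and threshold below are made up for illustration, where a real system would learn them from labeled samples:

```python
# Hypothetical weights an analyst might assign to the static features above;
# a real classifier would learn these from labeled samples.
WEIGHTS = {
    "no_icon": 2,
    "no_version_info": 2,
    "is_packed": 4,
    "in_user_profile": 2,
}
THRESHOLD = 6

def malware_score(features: dict) -> int:
    """Sum the weights of every feature the sample exhibits."""
    return sum(w for name, w in WEIGHTS.items() if features.get(name))

sample = {"no_icon": True, "no_version_info": False,
          "is_packed": True, "in_user_profile": True}
print(malware_score(sample) >= THRESHOLD)  # True -- flagged as suspicious
```

No single feature is damning on its own (plenty of benign software is packed); it is the combination that pushes a sample over the threshold.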
46. The Future of Malware Defense
Inspection Points: Network, File System, Physical Memory
Every layer provides various degrees of "features" to inspect
47. Malware features in action …
• Features:
•Static:
• Packed
• File size
• Origin
•Dynamic (Network)
• Makes a connection
• Number of DNS requests
• Encrypted communication
• Burst/length of communication
•Dynamic (File)
• Registry keys
• File-level modifications
49. PDF Example Features
• Compressed JavaScript
• PDF header location e.g. %PDF within the first 1024 bytes
• Does it contain an embedded file (e.g. flash, sound file)
• Signed by a trusted certificate
• Encoded/Encrypted streams e.g. FlateDecode
• Names hex escaped
• Bogus xref table
Reference: http://blog.fireeye.com/files/27c3_julia_wolf_omg-wtf-pdf.pdf
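A few of these features can be approximated with nothing but byte scanning; the function below is a crude sketch with hypothetical feature names, not a real PDF parser:

```python
def pdf_features(data: bytes) -> dict:
    """Crude byte-scanning sketch of the features above (not a PDF parser)."""
    return {
        "header_in_first_1024": b"%PDF" in data[:1024],
        "has_javascript": b"/JavaScript" in data or b"/JS" in data,
        "has_embedded_file": b"/EmbeddedFile" in data,
        "has_flate_stream": b"/FlateDecode" in data,
        "hex_escaped_names": b"#" in data,  # '#' begins a hex escape in names
    }

sample = b"%PDF-1.4\n1 0 obj << /OpenAction << /S /JavaScript >> >>"
features = pdf_features(sample)
print(features["header_in_first_1024"], features["has_javascript"])  # True True
```

A real extractor must also decompress streams and decode hex-escaped names first, since attackers hide exactly these markers behind the encodings listed above.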
51. Machine Learning
Just another layer in the defenses
Each existing technology can itself be used as a feature:
• Signatures
• Heuristics
• Semantics
53. Offensive Thinking - AI
Technology                    Attack Technique
Machine Learning /            1. Machine learning can be prone to false
Natural Language                 positives and false negatives if feature
Processing                       and sample sets aren't extensive enough
                              2. Detection via machine learning can be
                                 defeated if an attacker can find out
                                 what the features are and avoid them
                              3. Classifier training can be poisoned if
                                 an attacker can influence the training set
                              4. Functionality typically used for benign
                                 actions can be used to conduct
                                 malicious actions
                              5. Machine learning can't detect a new
                                 weapon if it doesn't know it exists or
                                 doesn't know how to interpret/parse it
                                 (HTML5 objects for heap spraying or
                                 improvements to a file format)
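Attack technique 2 above (learn the features, then avoid them) can be sketched as a greedy evasion loop against a toy linear classifier; the weights and feature names are hypothetical:

```python
# Toy linear classifier with hypothetical weights, and a greedy evasion
# loop: the attacker drops the cheapest detectable feature (add an icon,
# leave the file unpacked, ...) until the score falls under the threshold.
WEIGHTS = {"is_packed": 4, "no_icon": 2, "connects_out": 3, "writes_autorun": 3}
THRESHOLD = 6

def score(features: set) -> int:
    return sum(WEIGHTS[f] for f in features)

sample = {"is_packed", "no_icon", "connects_out", "writes_autorun"}
while score(sample) >= THRESHOLD:
    sample.discard(min(sample, key=WEIGHTS.get))  # cheapest feature first

print(sorted(sample))  # ['is_packed'] -- still malicious, no longer flagged
```

The sample keeps its malicious behavior yet slips under the decision boundary, which is why feature sets (and their weights) must stay secret and be retrained continuously.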
55. Test defenses on your network
• Get Samples…
•Private logs, Setting up Honeypots, Infected
machines, Private Security Mailing lists
•Open Malware - http://offensivecomputing.net/
• Another researcher and I are working on an API…
•Research websites e.g. malr, zeustracker
• DMZ’d / replicated network
• Establish a role for Security Architect
• Hire a Consulting Firm to architect a security framework for your organization
56. Conclusion
• Independent testing should attempt to bypass the file/network layers of defense
• External reconnaissance
• Penetration
• Internal reconnaissance + stage persistent state
• Exfiltration
• An understanding of the limitations of each
defensive layer should be part of deciding how to
build your network
• OS – ASLR, DEP, HIPS, FIREWALL, etc.
• NETWORK - FILTERING, IPS, IDS, FIREWALL, etc.
• LOGGING and CORRELATION
57. Conclusion
Proper security is all about a defense-in-depth strategy: create multiple layers of defense, every layer presenting a different set of challenges and requiring different skill sets and technology.
Every layer then increases the time and effort needed to compromise your environment and exfiltrate data.
58. Conclusion
If your security strategy is successful, the attack is stopped by your layered defenses before exfiltration of data can happen.
59. Questions?
questions.py:
while len(questions) > 0:
    if time <= 0:
        break
    print(answers[questions.pop()])
60. Thanks!
Stephan Chenette | @StephanChenette
Director of Research and Development
IOActive, Inc. http://ioactive.com