2. INTRODUCTION
MOTIVATION
DETECTION TECHNIQUES
Signature Based
Anomaly Based
Specification Based
MALWARE OBFUSCATION
3. Malware
Malware, short for "malicious software," refers to a type of computer program designed
to infect a legitimate user's computer and inflict harm on it in multiple ways
Antimalware
Antimalware software protects against infections caused by many types of malware,
including viruses, worms, Trojan horses, rootkits, spyware, keyloggers,
ransomware, and adware.
Obfuscation
Obfuscation is a technique that makes programs harder to understand
4. Why do we need to study malware?
So is it only computers that can be affected?
Wait
Does it look fake?
Virus 666
US patent 6506148 B2
Why do we need antimalware?
6. Techniques used for detecting malware can be categorized broadly into two
categories:
anomaly-based detection
and signature-based detection
An anomaly-based detection technique uses its knowledge of what constitutes
normal behavior to decide the maliciousness of a program under inspection
Specification-based techniques leverage some specification or rule set of what is valid
behavior in order to decide the maliciousness of a program under inspection
Signature-based detection uses its characterization of what is known to be
malicious to decide the maliciousness of a program under inspection
7. Static
Static analysis uses syntax or structural properties
A static approach attempts to detect malware before the program under inspection executes
Example: the strings utility (naive way)
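A minimal sketch of what a strings-style static pass does, assuming that runs of at least four printable ASCII bytes are worth reporting. The function name `extract_strings` and the sample buffer are illustrative only and are not taken from any particular tool.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII characters at least min_len long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

# Hypothetical buffer: binary noise around two readable strings.
sample = b"\x00\x01http://example.test/payload\x00\xff\xfeab\x00CreateRemoteThread\x00"
for s in extract_strings(sample):
    print(s)
```

The program under inspection is never executed; only its bytes are read, which is what makes the approach static.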
Dynamic
A dynamic approach leverages runtime information
A dynamic approach attempts to detect malicious behavior during program execution or after
program execution
Example: the Sysinternals Suite (naive way)
Hybrid
In this case, static and dynamic information is used to detect malware
8. What is a signature?
The signatures are typically hashes or byte-streams that are used to determine whether
a file or buffer contains a malicious payload
Hashes are generated using algorithms like CRC or MD5 which are typically fast and
can be calculated many times per second
This is the most typical and preferred method employed by antimalware/antivirus products
There is a tradeoff between being fast and being accurate
9. Byte-Streams
Simplest form of signatures
Signature is a byte-stream that is specific to a malware file and that does not normally
appear on non-malicious files
Example: to detect the European Institute for Computer Anti-Virus Research (EICAR)
antivirus testing file, an antivirus engine may simply search for this entire string:
X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*
Easiest and fastest approach for detection
Many robust and efficient algorithms exist for string matching
Examples: Aho-Corasick, Knuth-Morris-Pratt, Boyer-Moore, etc.
This approach is error-prone
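A byte-stream scan can be sketched as a plain substring search. The signature names and the `Demo-Pattern` bytes below are made up for illustration, and only a fragment of the EICAR string is used as the pattern; a real engine would match many patterns at once with an algorithm such as Aho-Corasick rather than looping over them.

```python
# Hypothetical signature database: name -> byte pattern.
SIGNATURES = {
    "EICAR-Test-Marker": b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE",
    "Demo-Pattern": bytes.fromhex("deadbeef4d5a9000"),
}

def scan_buffer(buf: bytes) -> list[str]:
    """Return the names of all signatures whose bytes occur in buf."""
    return [name for name, pattern in SIGNATURES.items() if pattern in buf]

clean = b"just an ordinary document"
flagged = b"header " + b"$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$" + b" trailer"
print(scan_buffer(clean))    # []
print(scan_buffer(flagged))  # ['EICAR-Test-Marker']
```

The second buffer is flagged even though it is not a valid test file, which illustrates why matching on raw byte-streams is error-prone.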
11. Checksums
The most typical signature-matching algorithm, used by almost all existing AV engines,
is based on calculating CRCs.
An antivirus engine may detect the EICAR test file by calculating the CRC32 checksum of
the entire buffer, of chunks of data, or of specific parts of a file format
that can be divided
Fast, but produces many false positives due to collisions
Modified CRCs are also used for detection, but they still give false positives
Example:
“petfood” and “eisenhower” have the same CRC32 hash 0xD0132158
Use of custom checksums
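The checksum approach can be sketched with the standard `zlib.crc32` function. The helper names `crc32_hex` and `chunk_checksums` and the 2048-byte chunk size are assumptions made for this example; the loop simply prints the CRC32 of the two words quoted above so the values can be compared.

```python
import zlib

def crc32_hex(data: bytes) -> str:
    """CRC32 of data as an 8-digit hexadecimal string."""
    return f"{zlib.crc32(data) & 0xFFFFFFFF:08x}"

def chunk_checksums(buf: bytes, chunk_size: int = 2048) -> list[str]:
    """CRC32 of each fixed-size chunk of buf (the last chunk may be shorter)."""
    return [crc32_hex(buf[i:i + chunk_size]) for i in range(0, len(buf), chunk_size)]

# Compare the two words quoted above; a collision shows up as equal values.
for word in (b"petfood", b"eisenhower"):
    print(word.decode(), crc32_hex(word))

print(chunk_checksums(b"A" * 5000))  # three chunks: 2048, 2048 and 904 bytes
```

Because a CRC32 value has only 32 bits, unrelated buffers can share a checksum, which is the source of the false positives noted above.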
12. Cryptographic hashes
Follows the 3 main properties of cryptographic hash functions
Generates a “signature” that uniquely identifies one buffer and just one buffer
Reduces false positives
More expensive than calculating a CRC32 hash
A single-bit change in the file requires a new signature to be computed
They are used for recently discovered malware that is considered critical, while
stronger signatures are being developed
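The sensitivity to a single-bit change can be shown with `hashlib`. This is a sketch under the assumption that the engine keeps a set of SHA-256 digests of known-bad buffers; the sample bytes and the names `known_bad` and `is_known_bad` are hypothetical.

```python
import hashlib

def sha256_hex(buf: bytes) -> str:
    """SHA-256 digest of buf as a hexadecimal string."""
    return hashlib.sha256(buf).hexdigest()

# Hypothetical set of digests of known-bad buffers.
known_bad = {sha256_hex(b"malicious sample v1")}

def is_known_bad(buf: bytes) -> bool:
    """True if the digest of buf is in the known-bad set."""
    return sha256_hex(buf) in known_bad

original = b"malicious sample v1"
# Flip the lowest bit of the final byte: '1' (0x31) becomes '0' (0x30).
variant = original[:-1] + bytes([original[-1] ^ 0x01])

print(is_known_bad(original))  # True
print(is_known_bad(variant))   # False
```

The variant differs by one bit, yet it no longer matches, so every new variant needs its own signature.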
13. The aim is to identify a whole family of malware and reduce false positives
Fuzzy Hashing
Minimal or no diffusion at all
No confusion at all
A good collision rate (depends on application)
Some available hashes
ssdeep, DeepToad, SpamSum, etc.
False positives are possible, but fewer than with the techniques discussed earlier
They are not used on their own but together with more sophisticated techniques such as
Bloom filters
14. Bypassing such filters is not easy
Attacker needs to change many parts because changing just one bit will not work
The number of changes required to bypass the fuzzy signature depends on the
block size and how the block size is chosen
If block size depends on the size of given buffer and is not fixed then it is easier to
bypass
Fixed block size based fuzzy signatures are difficult to bypass
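The effect of block size can be illustrated with a toy piecewise hash. This is not how ssdeep, DeepToad, or SpamSum work internally; it is a simplified sketch that assumes a fixed 64-byte block and emits one character per block, so that a one-bit change can alter at most one character of the signature.

```python
import hashlib
import string

ALPHABET = string.ascii_letters + string.digits + "+/"  # 64 symbols

def toy_fuzzy_hash(buf: bytes, block_size: int = 64) -> str:
    """One character per fixed-size block, derived from that block's digest."""
    chars = []
    for i in range(0, len(buf), block_size):
        digest = hashlib.sha256(buf[i:i + block_size]).digest()
        chars.append(ALPHABET[digest[0] % 64])
    return "".join(chars)

def similarity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length signatures agree."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)

original = bytes(range(256)) * 8      # 2048 bytes -> 32 blocks
patched = bytearray(original)
patched[100] ^= 0x01                  # flip one bit inside the second block
sig_a = toy_fuzzy_hash(original)
sig_b = toy_fuzzy_hash(bytes(patched))
print(sig_a)
print(sig_b)
print(similarity(sig_a, sig_b))
```

With 32 blocks, the attacker would have to touch many blocks before the similarity score drops noticeably, which is the property the slide describes. Because this toy uses fixed offsets, inserting a single byte would shift every later block; real fuzzy hashes pick block boundaries differently to cope with that.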
15. Graph-Based Hashes for Executables
A software program can be represented as two different kinds of graphs
Call graph: directed graph showing the relationship between all the functions in the program
Flow graph: directed graph showing the relationship between basic blocks
Antimalware products with code analysis engines may use signatures in the form of graphs,
using information extracted from the call graphs or the flow graphs
This approach is expensive but effective
For better performance, analysis is limited to a number of instructions or basic blocks, or bounded by time-outs
These techniques are powerful for detecting polymorphic viruses: the
instructions differ from one generation to the next, but the call graphs usually
remain stable.
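One way to see why a graph signature survives changes to the instructions is to hash only the shape of the call graph. The sketch below assumes the call graph has already been extracted into a dictionary, and it uses the sorted list of (in-degree, out-degree) pairs as a deliberately simple structural fingerprint; the function and graph names are hypothetical.

```python
import hashlib

def call_graph_signature(graph: dict[str, list[str]]) -> str:
    """Hash the sorted (in-degree, out-degree) pairs of a call graph.

    Function names and instruction bytes are ignored, so renaming or
    re-encoding functions leaves the signature unchanged.
    """
    nodes = set(graph) | {callee for callees in graph.values() for callee in callees}
    in_deg = {n: 0 for n in nodes}
    for callees in graph.values():
        for callee in set(callees):
            in_deg[callee] += 1
    pairs = sorted((in_deg[n], len(set(graph.get(n, [])))) for n in nodes)
    return hashlib.sha256(repr(pairs).encode()).hexdigest()[:16]

# Two hypothetical generations: same call structure, different names.
gen1 = {"main": ["decrypt", "infect"], "decrypt": ["xor_loop"], "infect": ["find_file", "write"]}
gen2 = {"f0": ["f7", "f3"], "f7": ["f9"], "f3": ["f2", "f5"]}
other = {"main": ["a"], "a": ["b"], "b": ["c"]}

print(call_graph_signature(gen1) == call_graph_signature(gen2))   # True
print(call_graph_signature(gen1) == call_graph_signature(other))  # False
```

A degree sequence is a weak invariant: unrelated programs with the same shape would collide, so a fingerprint this coarse can still produce false positives.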
16. False positive cases are still possible
Evasion techniques
Change the layout of the call graph
Implement anti-disassembly tricks
Mix anti-disassembly techniques with opaque predicates
Use time-out tricks (make the flow graph as complex as possible)
Example of control flow graph tool
http://github.com/joxeankoret/pyew
17. Dynamic signature-based detection is characterized by using solely information
gathered during the execution of a program to decide its maliciousness
It looks for patterns of behavior that would reveal the true malicious intent of a
program.
Signature-based method for worm detection that is based on known malicious
behaviors
A state transition based technique for detection
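A state-transition detector can be sketched as a small automaton that advances whenever the next expected event is observed. The event names and the `WORM_LIKE` sequence are invented for illustration and do not describe any specific worm or published detector.

```python
# Hypothetical behaviour signature: an ordered sequence of event names.
WORM_LIKE = ["open_self", "scan_network", "connect_remote", "send_copy"]

def matches_behaviour(events: list[str], signature: list[str]) -> bool:
    """Advance one state per matching event; accept when the final state is reached.

    The signature must be non-empty.
    """
    state = 0
    for event in events:
        if event == signature[state]:
            state += 1
            if state == len(signature):
                return True
    return False

trace_benign = ["open_file", "read", "connect_remote", "send_data", "close"]
trace_worm = ["open_self", "read", "scan_network", "sleep", "connect_remote", "send_copy"]
print(matches_behaviour(trace_benign, WORM_LIKE))  # False
print(matches_behaviour(trace_worm, WORM_LIKE))    # True
```

Unrelated events between the steps are ignored, so the signature describes an ordering of behaviors rather than an exact trace.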
18. Uses static and dynamic properties to determine the maliciousness
First executes the program and then applies static signature detection
Example
Worm vs. Worm
Malicious Code Filter
19. Anomaly based detection usually occurs in two phases:
Training (learning) phase and
Detection (monitoring) phase
During the training phase, the detector attempts to learn the normal behavior.
The detector could be learning the behavior of the system, the program, or both
The key advantage of anomaly-based detection is its ability to detect zero-day attacks
Two fundamental problems associated with this approach are
High false alarm rate
Complexity of choosing the features to be learned in training phase
20. In dynamic anomaly-based detection, information gathered from the program’s
execution is used to detect malicious code
The detection phase monitors the program under inspection during its execution,
checking for inconsistencies with what was learned during the training phase
Examples
IDS, using computer forensic methods for Privacy-Invasive Software, monitoring system
call sequences, process call sequences
Choosing a threshold that keeps false positives low is a challenging problem
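The two phases can be sketched with system call sequences. This example assumes traces are lists of call names, uses windows of three consecutive calls as the learned feature, and picks a threshold of 0.5 arbitrarily; the traces themselves are made up.

```python
def ngrams(trace: list[str], n: int = 3) -> set[tuple[str, ...]]:
    """All length-n windows of consecutive calls in a trace."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}

def train(normal_traces: list[list[str]], n: int = 3) -> set[tuple[str, ...]]:
    """Training phase: collect every n-gram seen in normal behaviour."""
    profile: set[tuple[str, ...]] = set()
    for trace in normal_traces:
        profile |= ngrams(trace, n)
    return profile

def anomaly_score(trace: list[str], profile: set[tuple[str, ...]], n: int = 3) -> float:
    """Detection phase: fraction of the trace's n-grams never seen in training."""
    grams = ngrams(trace, n)
    if not grams:
        return 0.0
    return len(grams - profile) / len(grams)

normal = [["open", "read", "write", "close"], ["open", "read", "read", "close"]]
profile = train(normal)
THRESHOLD = 0.5  # assumed value; choosing it well is the hard part

seen = ["open", "read", "write", "close"]
odd = ["open", "mmap", "mprotect", "exec", "close"]
print(anomaly_score(seen, profile) > THRESHOLD)  # False
print(anomaly_score(odd, profile) > THRESHOLD)   # True
```

A legitimate program that behaves in a way not covered by the training traces would also score high, which is where the false alarms come from.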
21. In static anomaly-based detection, characteristics about the file structure of the
program under inspection are used to detect malicious code
A key advantage of static anomaly-based detection is that its use may make it
possible to detect malware without having to allow the malware-carrying program
to execute on the host system
Data-mining and machine learning approaches are used to detect malware
Hybrid anomaly based detection
22. Specification-based detection is a type of anomaly-based detection that tries to
address the typical high false alarm rate associated with most anomaly-based
detection techniques
Specification-based detection attempts to approximate the requirements for an
application or system
Training phase is the attainment of some rule set
The main limitation of specification-based detection is that it is often difficult to
specify completely and accurately the entire set of valid behaviors a system should
exhibit
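Specification-based checking can be sketched as a lookup against a rule set of permitted operations. The programs, operation names, and the `SPEC` table are hypothetical; the example only shows the idea that anything outside the specification is treated as a violation.

```python
# Hypothetical specification: the operations each program is allowed to perform.
SPEC = {
    "text_editor": {"open_file", "read_file", "write_file", "close_file"},
    "web_server": {"open_file", "read_file", "accept_conn", "send_data", "close_file"},
}

def violations(program: str, observed: list[str]) -> list[str]:
    """Return observed operations that the specification does not permit."""
    allowed = SPEC.get(program, set())
    return [op for op in observed if op not in allowed]

print(violations("text_editor", ["open_file", "read_file", "close_file"]))
# []
print(violations("text_editor", ["open_file", "accept_conn", "spawn_shell"]))
# ['accept_conn', 'spawn_shell']
```

If the rule set omits a legitimate operation, that operation is reported as a violation, which reflects the difficulty of specifying valid behavior completely.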
23. Approaches classified as dynamic specification-based use behavior observed at
runtime to determine the maliciousness of an executable
Example
Monitoring Security-Critical Programs (using monitored system call events)
Using Dynamic Information Flow to Protect Applications
Process Behavior Monitoring
Using Instruction Block Signatures
24. Structural properties of programs are used for detection
Example
Static Detection of Malicious Code in Executables (API graph)
Compiler Approach to Malcode Detection (certifying compiler)
Detecting Malcode in Firmware
Hybrid specification based detection
Example
26. Encryption
The first approach to evading signature-based antivirus scanners is to use encryption
Exclusive OR
Perform XOR operation with some byte
Base64 Encoding
Base64 is commonly used in malware to disguise text strings
ROT13
ROT13 (rotate by 13) is a simple letter substitution used to jumble text
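The three encodings above can be tried with the Python standard library. The marker string and the XOR key 0x5A are arbitrary choices for this sketch; the point is only that the plain bytes a signature would look for no longer appear after encoding, while the original is trivially recoverable.

```python
import base64
import codecs

def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)

marker = b"http://evil.example/c2"   # hypothetical string a signature might target

xored = xor_bytes(marker, 0x5A)
b64 = base64.b64encode(marker)
rot13 = codecs.encode(marker.decode("ascii"), "rot13")

for name, blob in (("xor", xored), ("base64", b64), ("rot13", rot13.encode())):
    print(name, marker in blob)      # the plain marker no longer appears

print(xor_bytes(xored, 0x5A) == marker)                  # True
print(base64.b64decode(b64) == marker)                   # True
print(codecs.decode(rot13, "rot13").encode() == marker)  # True
```

None of these is encryption in the cryptographic sense; they are cheap, reversible transforms that are enough to defeat a naive byte-stream match.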
27. Code Packing
A packer is a piece of software that takes the original malware file and compresses it
Dead-Code Insertion
Dead-code insertion is a simple technique that adds some ineffective instructions to a
program to change its appearance, but keep its behavior
Register Reassignment
Switches registers from generation to generation while keeping the program behavior the same
Subroutine Reordering
Obfuscates the original code by changing the order of its subroutines in a random way.
Example: Win32/Ghost
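Subroutine reordering can be sketched by permuting a list of byte strings: n subroutines can be arranged in n! orders, and each order yields a different whole-file hash. The three subroutine byte strings and both helper functions are hypothetical; the order-independent hash is one illustrative countermeasure, not a technique attributed to any product.

```python
import hashlib
import itertools
import math

# Hypothetical subroutines, each represented by its raw bytes.
subroutines = [b"\x55\x89\xe5A", b"\x55\x89\xe5B", b"\x55\x89\xe5C"]

def whole_file_hash(parts: list[bytes]) -> str:
    """Hash of the subroutines concatenated in the given order."""
    return hashlib.sha256(b"".join(parts)).hexdigest()

def order_independent_hash(parts: list[bytes]) -> str:
    """Hash each part separately and combine the sorted digests."""
    digests = sorted(hashlib.sha256(p).hexdigest() for p in parts)
    return hashlib.sha256("".join(digests).encode()).hexdigest()

generations = [list(p) for p in itertools.permutations(subroutines)]
print(len(generations) == math.factorial(len(subroutines)))   # True: 3! = 6
print(len({whole_file_hash(g) for g in generations}))         # 6
print(len({order_independent_hash(g) for g in generations}))  # 1
```

With ten subroutines the same reasoning gives 10! = 3,628,800 possible generations, far too many to cover with one whole-file signature each.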
28. Instruction Substitution
Evolves the original code by replacing some instructions with other equivalent ones
Code Transposition
Code transposition reorders the sequence of instructions of the original code without any
impact on its behavior.
Code Integration
Introduced by the Win32/Zmist malware
Malware knits itself to the code of its target program
It decompiles the target program into manageable objects, adds itself between them, and
reassembles the integrated code into a new generation.
29. The Antivirus Hacker's Handbook, Joxean Koret and Elias Bachaalany, Wiley.
Practical Malware Analysis, Michael Sikorski and Andrew Honig, No Starch Press.
Nwokedi Idika and Aditya P. Mathur, A Survey of Malware Detection Techniques.
Ilsun You and Kangbin Yim, Malware Obfuscation Techniques: A Brief Survey, 2010 International Conference on
Broadband, Wireless Computing, Communication and Applications.
defcon-17-sean_taylor-binary_obfuscation.pdf, DEF CON 17.