This document discusses recovering malware assembly codes from self-modifying programs. It presents the problem of reconstructing the control flow graph (CFG) of an encrypted program based on a memory snapshot and execution trace. Key challenges include indirect jumps, junk code insertion, and misaligned overlapping code. The proposed method constructs a partial CFG from the trace, then searches for missing codes and splits blocks to resolve misalignments. Dynamic typing of memory with reading, writing, and execution levels is used to model self-modifying behavior. The overall goal is to synchronize conflicting partial CFGs using traces and statistical recognition to fully recover encrypted assembly codes.
1. How to recover malware assembly codes
Jean-Yves Marion
LORIA
!
2. Jean-Yves Marion - Laboratoire de Haute Sécurité (LHS)
Duqu : The precursor to the next Stuxnet
Duqu is Targeted attacks
Start in June 2010 ?
Discover in Sept. 2011 by Crysys (Budapest)
See white paper of Symantec
Code injection
Duqu is similar to Stuxnet
➡ Same installation mechanisms and Similar functionalities
➡ But Anti-Virus companies detect it in Sept 2011 !
➡ None of 43 anti-virus of VirusTotal was able to detect Duqu knowing Stuxnet.
0-day exploit
Driver File
(.sys)
Installer
(.dll)
Decryption
DUQU
Main DLL Service.exe
3. The decryption routine!
of the payload installer
Unpack a UPX-file
The main DLL code is !
now decrypted !
and depacked !
in memory only
Wave 1
Wave 2
Decryption
Duqu is a self-modifying program
4. A common protection scheme for malware
Wave 1
payload
Decrypt
..........
Decrypt
P33C7 18+&01223
Decrypt
Wave 2
Self-modifying program schema
5. Self-modifying codes
A bare semantics
µ0[c] : binary c loaded into memory µ0
µn : memory n : registers
n(ip) returns the address of the next instruction to run
Traces
➡ Traces are obtained by code instrumentation : we use Pin (intel)
We collect an execution trace of P :
For each run instruction, we gather
– its memory address
– its machine instruction
(µ0[c], 0) ! (µ1, 1) ! . . . ! (µn, n) ! . . .
6. Self-modifying codes
Dynamic typing of memories
(m) = (kr, kw, kx) where m is a memory adress
kw is the writing level
kr is the reading level
kx is the execution level
0(m) = (0, 0, 0)
(µ0[c], 0, 0) ! (µ1, 1, 1) ! . . . ! (µn n, n) ! . . .
The execution level is the level of n(ip) given by n( n(ip))
7. Self-modifying codes
Dynamic typing of memories
(m) = (kr, kw, kx) where m is a memory adress
kw is the writing level
kx is the execution level
0(m) = (0, 0, 0)
(µ0[c], 0, 0) ! (µ1, 1, 1) ! . . . ! (µn n, n) ! . . .
An instruction written at level k has an execution level of k+1
@a: mov esi,$index
@b: xor [@offset+esi],$key
@c: sub esi,4
@d: jnz @b
@offset: [encrypted data]
Wave 1
@a,…,@d
Decrypt
Wave 2
@offset
kw is 1
The execution level of @offset is 2 because it is written by instructions in wave 1
So kx is 2
8. Self-modifying codes
kw is the writing level
kx is the execution level
A self-modifying program c is a program such that its execution level is > 1 for an input
i+1(m) =
8
><
>:
(kr, k + 1, kx) if m is written
and i(m) = (kr, kw, kx)
i(m) otherwise
i+1(m) =
(
(kr, k, k + 1) m = (ip) & i( i(ip)) = (kr, k, kx)
i(m) otherwise
Similar to Phase semantics of Preda, Giacobazzi, and Debray
The execution level is k + 1
i(ip) points to an instruction like mov [m],eax
10. Fig. 7.2: Résultats de l’analyse
Nom du binaire k k Blind Decrypt Check Scrambled
hostname.exe (original) 1 1
!EPack_1..exe 2 2 X X
acprotect-hostname.exe 18 882 X X X X
aspack-hostname.exe 2 3 X X X X
enigma_protector_1.16.exe 5 24 X X X X
exefog_1.1.exe 3 5 X X X
expressor-hostname.exe 2 3 X X
fsg.exe 2 2 X X X
mew11.exe 2 2 X X X
molebox-hostname.exe 3 5 X X X X
morphine_1.9.exe 3 3 X X X
nakedpack.exe 2 2 X
npack-hostname.exe 2 2 X
nspack.exe 3 4 X X X
packman_1..exe 2 2 X X X
pec2-hostname.exe 3 4 X X X X
pelock-hostname.exe 9 16 X X X X
pepack.exe 1 1 X
pespin-hostname.exe 4 38 X X X X
petite.exe 2 2 X X
rlpack_1.17_full_version.exe 2 2 X X X
rlpack-hostname.exe 2 2 X
telock_.98.exe 2 2 X
themida_1.8.5.2.exe 11 164 X X X X
upx-hostname.exe 2 2 X X
vmprotect-hostname.exe 1 1 X
winupack-hostname.exe 3 4 X X X X
Yodas_Crypter_v1.3.exe 4 4 X X X X
yp-1.02-hostname.exe 4 6 X X X X
Légende :
11. Where are we ?
(µ0[c], 0, 0) ! (µ1, 1, 1) ! . . . ! (µn, n, n) ! . . .
Dynamic typed memory trace
which defines a sequence of waves
Wave 1
Decrypt
..........
DecryptDecrypt
Wave 2 Wave
K
Can we recover the assembly code of the wave K ?
Can we reconstruct the full CFG ?
The inputs:
An execution trace inside wave K
The snapshot of the memory at wave K
12. The problem and its inputs
Can we recover the assembly code of this wave ?
The inputs:
An execution trace inside wave K
The snapshot of the memory at wave K
Snapshot of the memory at the beginning of wave 5
13. Dynamic vs Static analysis
A trace obtained by dynamic analysis
Dynamic typed memory trace
Undiscovered code in white boxes
15. Indirect jumps
100: jmp eax
– Fuzzing !
– We need to have a robust approximation of x86 semantics!
– Abstract interpretation!
– SMT Solver
What is the set of possible values of eax ?
16. Junk code insertion
Junk code insertion at the expected return adress
!
!
100 : call @a
junk code
@a : …
How to determine the return address of a call ?
125 : pop esi
Modify the return
address (125)
See Debray’s paper
17. Yet another difficulty
mis-alignment
01006 e7a f e 04 0b inc byte [ ebx+ecx ]
01006 e7d eb f f jmp +1
01006 e7e f f c9 dec ecx
01006 e80 7 f e6 jg 01006 e68
01006 e82 8b c1 mov eax , ecx
Figure 1. Overlapping assembly in tELock.
010059 f0 89 f9 mov ecx , edi
,=< 010059 f2 79 07 jns +9
| 010059 f4 0 f b7 07 movzx eax , word [ edi ]
| 010059 f7 47 inc edi
| 010059 f8 50 push eax
| 010059 f9 47 inc edi
| 010059 fa b9 57 48 f2 ae mov ecx , aef24857
‘ > 010059 fb 57 push edi
010059 f c 48 dec eax
010059 fd f2 ae repne scasb
010059 f f 55 push ebp
Figure 2. Overlapping assembly in UPX.
2.2.1 tELock0.99
tELock0.99 uses an overlapping technique to simply obfuscate the code as follows. Figure 1 sh
disassembly taken from the address 01006e7a. There is a jmp +1 instruction at address 01006e7
teLock
Figure 2. Overlapping assembly in UPX.
2.2.1 tELock0.99
tELock0.99 uses an overlapping technique to simply obfuscate the code as follows. Figure 1 shows a recursive
disassembly taken from the address 01006e7a. There is a jmp +1 instruction at address 01006e7d and coded on
the two bytes eb ff, that jumps to the address 01006e7d + 1, which is a dec ecx instruction (ff c9) which shares the
byte ff at address 01006e7d + 1 with the jmp instruction.
2.2.2 UPX
UPX uses overlapping to optimize the size of the final packed binary (figure 2). The unpacker part uses a conditional
jump to separate the control flow into two overlapping blocks which both realign after a few instructions.
(TODO: expliquer les deux branches, rapidement en quoi elles sont utiles)
2.2.3 Overlapping in state-of-the-art disassemblers
Existing disassemblers, even when doing recursive traversal, assume that code cannot overlap and fail at displaying
the resulting disassembly.
With IDA Pro (v6.3), the tELock example looks as follows:
01006E7A inc byte ptr [ ebx+ecx ]
01006E7D jmp short near ptr loc_1006E7D+1
01006E7D ;
01006E7F db 0C9h ;
01006E80 db 7Fh ;
01006E81 db 0E6h ;
01006E82 db 8Bh ;
01006E83 db 0C1h ;
With Radare (TODO: recursive?), the tELock example is disassembled as follows:
01006 e7a fe040b inc byte [ ebx+ecx ]
01006 e7d e b f f jmp 6 e7e
01006 e7f c9 leave
01006 e80 7 fe6 jg 6e68
01006 e82 8bc1 mov eax , ecx
Both are not able to follow the jmp: the target of the jmp is already disassembled in another assembly instruction
and is thus deemed invalid.
2
IDA fails
because of jmp +1
BB [0x4 -> 0x5] (0x2)
0x4 dec ecx
BB [0x3 -> 0x4] (0x2)
0x3 jmp 0x4
BB [0x6 -> 0x7] (0x2)
0x6 jg 0x ee
BB [0x0 -> 0x2] (0x3)
0x0 inc byte [ebx+ecx]
BB [0x8 -> 0x9] (0x2)
0x8 mov eax, ecx
Figure 4. Control flow graph for the tELock sample
18. Another example of mis-alignment
01006 e7d eb f f jmp +1
01006 e7e f f c9 dec ecx
01006 e80 7 f e6 jg 01006 e68
01006 e82 8b c1 mov eax , ecx
Figure 1. Overlapping assembly in tELock.
010059 f0 89 f9 mov ecx , edi
,=< 010059 f2 79 07 jns +9
| 010059 f4 0 f b7 07 movzx eax , word [ edi ]
| 010059 f7 47 inc edi
| 010059 f8 50 push eax
| 010059 f9 47 inc edi
| 010059 fa b9 57 48 f2 ae mov ecx , aef24857
‘ > 010059 fb 57 push edi
010059 f c 48 dec eax
010059 fd f2 ae repne scasb
010059 f f 55 push ebp
Figure 2. Overlapping assembly in UPX.
2.2.1 tELock0.99
tELock0.99 uses an overlapping technique to simply obfuscate the code as follows. Figure 1 shows a recurs
disassembly taken from the address 01006e7a. There is a jmp +1 instruction at address 01006e7d and coded
the two bytes eb ff, that jumps to the address 01006e7d + 1, which is a dec ecx instruction (ff c9) which shares
byte ff at address 01006e7d + 1 with the jmp instruction.
2.2.2 UPX
UPX uses overlapping to optimize the size of the final packed binary (figure 2). The unpacker part uses a conditio
jump to separate the control flow into two overlapping blocks which both realign after a few instructions.
(TODO: expliquer les deux branches, rapidement en quoi elles sont utiles)
UPX
Re-synchronized
bytes in common !!
mov ecx,edi
jnz +9
movzx eax, [edi]
inc edi
push eax
inc edi
mov ecx, aef24857
push ebp
push edu
dec eax
repine scasb
Share 4 bytes
19. Let’s recap the problem
First
instruc*on
Last
instruc*on
TRACER
W2 W4W1 W3 W5
Snapshot of the memory at the beginning of wave 5
Goal : Reconstruct the full CFG
21. Junk codes insertion after a call
100 : call @a
junk code
125 : pop esi
@a : pop ebp
Modify the return
address
@b: ret
100:call @a, @a:pop esi,…, @b:ret;125:pop esi;…
A trace will provide automatically the address 125
It is junk codes only if it is not reachable
Trace:
See the paper of Krugel and al, Usenix 2004 for another approach
22. Method for mis alignment
… 89 F9 79 07 0F B7 07 47 50 47 B9 57 48 F2 AE 55 …
mov ecx,edi
jnz +9
push edi
dec eax
repne scasb
push ebp
movzx eax, [edi]
inc edi
push eax
inc edi
mov ecx, aef24857
push ebp
An obfuscation similar to UPX
The CFG construction follow the trace
Then, we search for missing codes
3/ We split blocks
23. Method for mis alignment
… 89 F9 79 07 0F B7 07 47 50 47 B9 57 48 F2 AE 55 …
mov ecx,edi
jnz +9
push edi
dec eax
repne scasb
movzx eax, [edi]
inc edi
push eax
inc edi
mov ecx, aef24857
push ebp
An obfuscation similar to UPX
1/ The CFF construction follows the trace
2/ Then, we search for missing codes
3/ We split blocks
Misalignment can come from indirect jump … traces are then useful !
24. The overall method (work in progress)
• A partial CFG is an un-complete CFG
• Two partial CFG are in conflict if there are two mis-aligned instructions.
• Traces define a set of partial CFG which are in conflict.
mov ecx,edi
jnz +9
movzx eax, [edi]
inc edi
push eax
inc edi
mov ecx, aef24857
push edi
dec eax
repne scasb
push ebp
push ebp
push ebp
Share the !
same adresses
• Edges between partial CFG
indicate mis-alignement
• Then we can synchronize
partial CFG
• There are orphan partial CFG
• There are ok if there is an
edge to a valid address
• Statistic recognition is useful
at this stage
25. Conclusion and Questions
• We develop a disassembler for self-modified codes :
BinVizz : Visualization of each code wave from a trace