After starting our analysis, we found a non-obfuscated sample – so we weren’t motivated to fully deobfuscate the code
Focusing on reverse engineering challenging malware samples since 2017
Computer Science student at Comenius University, in the first year of a master’s degree
BlackHat Arsenal, AVAR
VMs work like standard machines – the machine code is bytecode here
The PC is an offset into the bytecode here – it is not strictly necessary
Transfer of control from one virtual instruction to the next during interpretation needs to be performed by every VM. This process is generally known as dispatching
A centralized interpreter (dispatcher) is called direct call threading; it can also be a simple switch-case
Dispatching can also happen inside the handlers – direct threading
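The two dispatching styles mentioned above can be sketched roughly as follows. This is only an illustration on a made-up three-instruction stack VM: the opcodes, handler names, and bytecode layout are hypothetical and have nothing to do with the VM found in Wslink.

```python
# Toy VM: opcodes and handlers are hypothetical, for illustration only.
PUSH, ADD, HALT = 0, 1, 2


def run_central(bytecode):
    """Centralized dispatcher: one loop fetches, decodes, and executes."""
    stack, pc = [], 0
    while True:
        op = bytecode[pc]
        if op == PUSH:
            stack.append(bytecode[pc + 1])
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


def run_threaded(bytecode):
    """Dispatch inside handlers: each handler selects its own successor."""
    stack = []

    def h_push(pc):
        stack.append(bytecode[pc + 1])
        return table[bytecode[pc + 2]], pc + 2

    def h_add(pc):
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)
        return table[bytecode[pc + 1]], pc + 1

    def h_halt(pc):
        return None, pc

    table = {PUSH: h_push, ADD: h_add, HALT: h_halt}
    handler, pc = table[bytecode[0]], 0
    # Python has no computed goto, so this trampoline stands in for the
    # direct jump that a native handler would perform at its end.
    while handler is not None:
        handler, pc = handler(pc)
    return stack.pop()


program = [PUSH, 2, PUSH, 3, ADD, HALT]
print(run_central(program), run_threaded(program))
```

In the first variant, every handler returns to a single fetch-decode loop. In the second, the lookup of the next handler is duplicated at the end of each handler, which is what removes the single central dispatch point an analyst could otherwise hook.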
Stored in a separate section
Need context switch
Bytecode is generated from a function
Original function starts interpreter
Interpreter is embedded in the executable
Context switch needs to be performed
The strength of this obfuscation technique resides in the fact that the ISA of the VM is unknown to any prospective reverse engineer – a thorough analysis of the VM, which can be very time-consuming, is required
Additionally, obfuscating VMs usually virtualize only certain “interesting” functions
Symbolic execution (SE) in the intermediate representation (IR)
IRDst is the IR instruction pointer, i.e. the destination of the IR block; e.g. cmov is lifted into multiple blocks
X bit value dereference
Self-explanatory dereference
The SE engine in Miasm automatically performs common optimization techniques
The VM used in Wslink is a variant of CV (Themida) that supports multiple VMs and polymorphic layers of obfuscation techniques. We found this out after publishing our research
The right side contains a part of the function
Why 1071? They are duplicated!
Vm_init – Sync – shared virtual context in the memory
Virtual program counter (VPC) register
Base address register
Instruction table register
Emphasize the nested VM – we start with VM2
Rolling decryption – described in a blog post about VMProtect’s structure
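The idea of rolling decryption can be sketched as follows: each encrypted operand is decrypted with a key that is then updated, so later operands cannot be decoded without processing the earlier ones in order. The XOR step, the additive key update, the 32-bit width, and the function names are all assumptions chosen for illustration; they are not the operations used by the VM in Wslink.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit operands, for illustration only


def encrypt_operands(operands, key):
    """Encrypt operands with a key that rolls forward after each value."""
    out = []
    for value in operands:
        out.append((value ^ key) & MASK)
        key = (key + value) & MASK  # hypothetical key-update rule
    return out


def decrypt_operands(encrypted, key):
    """Mirror of encrypt_operands: decrypt, then roll the key the same way."""
    out = []
    for enc in encrypted:
        value = (enc ^ key) & MASK
        out.append(value)
        key = (key + value) & MASK
    return out


operands = [0x10, 0x20, 0x1234]
encrypted = encrypt_operands(operands, 0xDEADBEEF)
print([hex(v) for v in decrypt_operands(encrypted, 0xDEADBEEF)])
```

Because the key depends on every previously decrypted operand, a handler taken out of context cannot be decoded on its own, which is why the rolling decryption registers have to be tracked during analysis.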
Make values relative to the bytecode pointer concrete
We’ve just deobfuscated one of the VM1 instructions
Let’s start with the CFG
Reminder – atomic operations in blocks
Laser pointer – these IR blocks are the effects of VM2 virtual instructions; together they represent one VM1 virtual instruction
Zeroes out a register, stores the RSP, initializes rolling decryption registers
A part of the instruction. When we inspected the code, we realized what the problem was
The memory range of the VMs is known – a separate section
Let’s take a closer look at one of the branches
<CLICK>
The branches check a known value
We can apply the value and simplify it
These are a sort of opaque predicate
We will show the results on the first executed bytecode, but before that, we will show some statistics
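Applying a known value to a branch condition can be sketched like this. The tuple-based expression format and the names `simplify`, `resolve_branch`, and `vm_flag` are hypothetical stand-ins; a real symbolic execution engine such as the one in Miasm works on its own expression types.

```python
def simplify(expr, known):
    """Substitute known symbols and fold constant sub-expressions."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):  # a symbol, e.g. a memory cell inside the VM
        return known.get(expr, expr)
    op, lhs, rhs = expr
    lhs, rhs = simplify(lhs, known), simplify(rhs, known)
    if isinstance(lhs, int) and isinstance(rhs, int):
        if op == "+":
            return lhs + rhs
        if op == "^":
            return lhs ^ rhs
        if op == "==":
            return int(lhs == rhs)
    return (op, lhs, rhs)


def resolve_branch(cond, if_true, if_false, known):
    """Return the only reachable target when the condition folds to a constant."""
    value = simplify(cond, known)
    if isinstance(value, int):
        return if_true if value else if_false
    return ("branch", value, if_true, if_false)


cond = ("==", ("^", "vm_flag", 0x5A), 0x5A)
print(resolve_branch(cond, "block_a", "block_b", {"vm_flag": 0}))
print(resolve_branch(cond, "block_a", "block_b", {}))
```

With the value known, the condition collapses to a constant and only one successor remains; without it, the branch stays symbolic and both successors have to be kept.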
Before we look at the devirtualized bytecode, let’s see what the execution time is approximately like
Let’s look at the deobfuscated bytecode block of VM1 <CLICK>
*We discovered the sample after starting our analysis
No ARM R0! – pseudocode close to assembly
Advanced – use junk code, duplicate and merge handlers, encrypt operands
Right values – operands and registers for encryption