Function Argument Detection
Proposed on: 09.04.2016
Ahmed mohamed abd el Mawgood <IRC/telegram/Github @oddcoder>
email: <ahmedsoliman0x666[at]gmail[dot]com>
<ahmedsoliman[at]oddcoder[dot]com>
Blog: <oddcoder.com>
Abstract
Functions are the basic building blocks of code. However, Radare2 is not good at
detecting anything beyond basic functions: structures, OS-specific data types, and
function names mostly go unrecognized (with some exceptions). Adding this support
will make Radare2 well suited to static malware analysis and reverse engineering.
More generally, it will attract more users and make Radare2 a solid alternative to
commercial reverse engineering tools such as Hex-Rays' IDA Pro and Hopper.
Motivation
The benefits Google advertises apply equally to the tasks offered by every other
organization. What makes this task special for me is that it is the one task I have always
wanted to see done, and now I have the chance to do it myself. I will be building something
that many in the Radare2 community, myself included, will probably use on a daily basis.
I am also looking forward to preparing myself to be a GSoC mentor for radare2.
If this task is taken, I don't think there is another task that fits me, but suggestions will
be taken into consideration.
Specifications
● All the work will be available online on my forks of Radare2 and
Radare2-regressions.
● I am based in Alexandria, Egypt; my time zone is UTC+02:00.
● Work will be pushed to the main repo as soon as a functional piece of code is
written and debugged.
● I will start working on the task early (typically I will start coding on 22 April 2016,
once Google announces the results, to save time, as my final exams will start on 28
May and end on 16 June).
● Progress will be tracked on my personal blog oddcoder.com every Friday and in real
time on the IRC/Telegram channel.
● I will work 7-9 hours daily, except on Fridays (my weekend day off).
Major Goals
1. Writing tests for the t command family and the corresponding pf commands, and
enhancing them (issues #287, #2189, #3115).
2. Enhancing support for variations of the fastcall calling convention (issue #4204).
3. Supporting naming of local variables on the stack (issue #3735).
4. Type propagation (MAIN INITIAL TASK) (issue #4291)
a. When a local variable is passed as an argument, the type must be inferred.
b. This must be done in a separate analysis command, because it needs the
functions to be analyzed already before it can run: for example `aft` (analyze
function types) or `afp` (analyze function propagations).
i. The input for this command is the offset of a function. It must walk the
function's local variable usages and the arguments it passes to calls.
ii. The output must be a list of r2 commands that register those argument
types for the functions called.
c. To analyze all type propagations, just run `aft @@ fcn*` (for example).
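The propagation step in goal 4 can be pictured with a small sketch. This is only an illustration under stated assumptions, not the planned r2 implementation: the `Call` record, the `propagate_types` helper, the example function names, and the `set-arg-type` output format are all hypothetical placeholders (the real command would emit actual r2 commands, whose syntax is not fixed by this proposal).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One call site inside the analyzed function (hypothetical record)."""
    callee: str   # name of the called function
    args: tuple   # names of the local variables passed, in argument order


def propagate_types(local_types, calls):
    """Walk the call sites and emit one placeholder command per known type.

    local_types maps a local variable name to its already-known type.
    The 'set-arg-type' lines are placeholders, not real r2 syntax.
    """
    commands = []
    for call in calls:
        for index, var in enumerate(call.args):
            var_type = local_types.get(var)
            if var_type is None:
                continue  # type of this local is unknown: nothing to propagate
            commands.append(f"set-arg-type {call.callee} {index} {var_type}")
    return commands


# Hypothetical input: two typed locals, two call sites.
local_types = {"buf": "char *", "len": "int"}
calls = [Call("sym.copy", ("buf", "len")), Call("sym.log", ("msg",))]
commands = propagate_types(local_types, calls)
```

Here `sym.log` receives no command because the type of `msg` is unknown, which mirrors the requirement that only inferred types are registered.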
Optional Goals
1. Creating pre-compiled headers for Windows types and functions (issues #1883,
#3654, #3925).
2. Creating pre-compiled headers for POSIX types and functions.
Timeline
Before 23 April 2016
I will study the parts I will be working on. This covers:
● Understanding how the pf command works and how it is implemented (only the
subset related to the t command family).
● Understanding how the t commands work and how they are implemented
in the code.
● Understanding anal (the analysis subsystem) found in libr/anal/*.c
(libr/anal/types.c, libr/anal/fcn.c, libr/anal/var.c) and libr/core/cmd_anal.c.
● Understanding Radare2 lists (libr/include/r_list.h).
● Researching the fastcall calling conventions to refresh my memory of
them.
● OPTIONAL: Understanding SDB (the docs).
23 April-30 April:
● Writing tests for, and fixing, all of the following: t, to, t-, tf, td, tb, te, tl, tk
● Implementing td
● Fixing t*
● Writing tests for both td and t*
● Implementing (or fixing the existing implementation of) ts
1 May-10 May:
● Creating afA, which extracts function parameters from registers (fastcall
only), with behavior similar to that of afa.
● Creating test cases to verify the implementation, using existing
binaries in radare2-regressions and possibly (but not necessarily) specially
crafted binaries.
● OPTIONAL: Merging afa and afA so the user does not need to think about which
to use.
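As a rough illustration of what afA has to decide, the sketch below flags a fastcall register as an argument when it is read before anything in the function writes to it. It assumes the Microsoft x86 `__fastcall` convention (first two arguments in ecx and edx), a linear instruction list with no control flow, and a hypothetical `(reads, writes)` tuple per instruction; none of these names come from the Radare2 codebase.

```python
# Microsoft x86 __fastcall passes the first two arguments in ecx and edx.
FASTCALL_ARG_REGS = ("ecx", "edx")


def fastcall_args(instructions, arg_regs=FASTCALL_ARG_REGS):
    """Return the argument registers that are read before being written.

    instructions is a list of (reads, writes) tuples of register names,
    in program order. Control flow is ignored in this simplified sketch.
    """
    written = set()
    found = []
    for reads, writes in instructions:
        for reg in reads:
            # A read of a never-written argument register means the value
            # came from the caller, so treat it as an incoming argument.
            if reg in arg_regs and reg not in written and reg not in found:
                found.append(reg)
        written.update(writes)
    # Report in calling-convention order rather than discovery order.
    return [reg for reg in arg_regs if reg in found]


# Hypothetical function body:
#   mov eax, ecx   reads ecx, writes eax
#   mov edx, 1     writes edx before any read of it
#   add eax, edx   reads eax and edx, writes eax
example = [(("ecx",), ("eax",)), ((), ("edx",)), (("eax", "edx"), ("eax",))]
detected = fastcall_args(example)
```

In this example only ecx is reported, because edx is overwritten before its first use. Other fastcall variations would need a different `arg_regs` tuple.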
11 May-23 May:
● Studying the behaviour of -fomit-frame-pointer in gcc.
● Adding support for naming local variables in code compiled with
-fomit-frame-pointer under gcc.
● Writing tests for functions compiled with -fomit-frame-pointer.
● OPTIONAL: Adding similar support for other compilers, typically
MSVC.
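The difficulty with -fomit-frame-pointer is that locals are addressed relative to esp, and esp moves as the function pushes, pops, and adjusts the stack, so the same variable appears under different displacements. A minimal sketch of one way to handle this, assuming 32-bit x86, straight-line code, and a hypothetical simplified operation list, is to track the running esp delta and normalize every access to an offset from esp at function entry:

```python
def normalize_stack_accesses(ops):
    """Map each [esp+disp] access to an offset from esp at function entry.

    ops is a list of (kind, value) pairs (a hypothetical simplified form):
      ("sub_esp", n) / ("add_esp", n)  explicit stack pointer adjustment
      ("push", n) / ("pop", n)         n is the operand size in bytes
      ("access", disp)                 a memory access at [esp+disp]
    """
    delta = 0   # current esp minus esp at function entry
    slots = []
    for kind, value in ops:
        if kind in ("sub_esp", "push"):
            delta -= value          # stack grows towards lower addresses
        elif kind in ("add_esp", "pop"):
            delta += value
        elif kind == "access":
            slots.append(delta + value)
    return slots


# sub esp, 16 ; [esp+4] ; push eax ; [esp+8] ; pop eax ; [esp+4]
ops = [("sub_esp", 16), ("access", 4), ("push", 4),
       ("access", 8), ("pop", 4), ("access", 4)]
slots = normalize_stack_accesses(ops)
```

All three accesses normalize to the same entry-relative offset, so they can share one variable name even though the displacements in the instructions differ.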
23 May-16 June:
● I will have to stop and prepare for my final exams
17 June-5 August:
● Researching the best way to store the list of local variables (pros/cons).
● Creating a list of the usages of ebp-based local variables/formal
arguments.
● Creating a list of the usages of esp-based local variables/formal arguments.
● Creating a list of the usages of fastcall-style local variables/formal arguments.
● Implementing a type inference algorithm for variables (decision making).
● Re-implementing type inference and variable enumeration, but recursively.
● Implementing an Intel x86 architecture-specific type inference algorithm.
● Writing tests for everything
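The usage lists in this phase could be collected along the following lines. This is a sketch only, assuming 32-bit x86 with the standard `push ebp; mov ebp, esp` prologue, where the saved ebp sits at [ebp+0], the return address at [ebp+4], formal arguments start at [ebp+8], and locals live at negative offsets. The helper names and the `(address, offset)` input format are hypothetical.

```python
def classify_ebp_slot(offset):
    """Classify an [ebp+offset] access under a standard 32-bit x86 frame."""
    if offset >= 8:
        return "arg"     # formal arguments start above the return address
    if offset < 0:
        return "local"   # locals are allocated below the saved ebp
    return "frame"       # saved ebp (0) or return address (4)


def collect_ebp_usage(accesses):
    """Group instruction addresses by the ebp-based slot they touch.

    accesses is a list of (instruction_address, ebp_offset) pairs.
    Returns {(kind, offset): [addresses...]} for arguments and locals.
    """
    usage = {}
    for address, offset in accesses:
        kind = classify_ebp_slot(offset)
        if kind == "frame":
            continue  # not a variable, skip frame bookkeeping slots
        usage.setdefault((kind, offset), []).append(address)
    return usage


# Hypothetical accesses found while walking a function.
accesses = [(0x1000, 8), (0x1004, -4), (0x1008, 8), (0x100C, 4)]
usage = collect_ebp_usage(accesses)
```

A per-slot list like this gives the type inference step every place a variable is used, which is the evidence it needs to decide on a type.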
5 August-15 August:
● Cleaning the code.
● Fixing Coverity Scan errors.
● Adding missing tests.
● Responding to issues related to my code.
15 August-20 August:
● OPTIONAL: Creating pre-compiled (into SDB) headers for Windows types and
functions.
● OPTIONAL: Creating pre-compiled headers for POSIX-compatible OS types and
functions.
Micro-Tasks solved:
I have made a number of commits to the Radare2 and Radare2-regressions codebases. My
goal was to demonstrate both my coding speed and my ability to adapt to and work with
an unfamiliar codebase in a short time.
All my commits are referenced here and here; the most notable are:
● Adding initial support for pic18c disassembler
● Adding a pic18c analysis plugin (coloring)
● Enhancing the jump instruction family
● Adding tests for pic
GSoC experience:
This is my first time applying to Google Summer of Code. Last year I wanted to apply, but I
didn't because I thought I was not prepared enough. This year I applied for this one
task only, because there is no other project based on reverse engineering and code
analysis.