Fan of & Fun with Assembly language
Researcher
Scientist
Teaching reverse engineering since 2001
Candidate of Technical Sciences
Lecturer at Samara State Technical University
and Samara State Aerospace University
Intro
Simple
Trace & Coverage
Graph
Program Slicing
All Together
Iterative process
Understand a small piece of code – form an
abstraction in mind
Understand all pieces of code in a procedure –
unite all the abstractions – form an abstraction
of the function
And so on
Good visualization is important
Many routine tasks
Code localization
Data flow dependencies
Code flow dependencies
Local variables checking
Checking input and output parameters of procedures
Variables range checking
Labels naming
Function naming
Function prototyping
Largest research school - Professor Thomas W.
Reps - University of Wisconsin-Madison -
http://pages.cs.wisc.edu/~reps/
In Russia – Institute for System Programming
of the Russian Academy of Sciences -
http://www.ispras.ru
Also called Execution Trace
Trace of program execution
Simplest case - just a list of the addresses that the
instruction pointer takes on during a single run
Originally used as a measure of the
degree to which the source code of a
program is tested by a particular test suite.
List of the instructions that were executed during
a single run
List of unique addresses from program trace
Differences between code coverage sets can help to
locate the code that implements a given functionality
Common code coverage – common
functionality
More runs – more differences between coverage sets
– more precise code localization
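The coverage-difference idea above can be sketched with plain set operations. This is only an illustration: the addresses and the two groups of runs are made up, and `localize` is a hypothetical helper name, not part of any tool mentioned in these slides.

```python
# Hypothetical coverage sets: unique instruction addresses seen in each run.
runs_with_feature = [
    {0x401000, 0x401010, 0x401020, 0x401200, 0x401210},
    {0x401000, 0x401010, 0x401200, 0x401210, 0x401300},
]
runs_without_feature = [
    {0x401000, 0x401010, 0x401020},
    {0x401000, 0x401010, 0x401300},
]


def localize(with_feature, without_feature):
    """Addresses hit in every 'with' run and in no 'without' run."""
    common = set.intersection(*with_feature)   # common coverage
    noise = set.union(*without_feature)        # code unrelated to the feature
    return common - noise


candidates = localize(runs_with_feature, runs_without_feature)
print(sorted(hex(address) for address in candidates))
```

Each extra run added to either group can only shrink the candidate set, which is why more runs tend to give more precise localization.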
The collection of all memory accesses
performed by an application in a single run
Includes both writes and reads
Includes the code trace
Includes all register values and memory
values at every execution point
May be absolute – saves all values
Or relative – saves only the values that changed at
this execution point
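The difference between an absolute and a relative trace can be shown in a few lines. The sketch below assumes each execution point is a dictionary of register or memory names to values with a fixed set of keys; the names `to_relative` and `to_absolute` and the sample values are illustrative only.

```python
def to_relative(absolute_trace):
    """Keep, for every execution point, only the values that changed."""
    relative, previous = [], {}
    for snapshot in absolute_trace:
        delta = {k: v for k, v in snapshot.items()
                 if k not in previous or previous[k] != v}
        relative.append(delta)
        previous = snapshot
    return relative


def to_absolute(relative_trace):
    """Rebuild full snapshots by replaying the deltas in order."""
    state, absolute = {}, []
    for delta in relative_trace:
        state = {**state, **delta}
        absolute.append(state)
    return absolute


# Hypothetical three-step trace.
absolute = [
    {"eip": 0x1000, "eax": 0, "ebx": 5},
    {"eip": 0x1002, "eax": 1, "ebx": 5},
    {"eip": 0x1004, "eax": 1, "ebx": 6},
]
relative = to_relative(absolute)
print(relative)
```

The relative form is smaller, but reading the state at a given point requires replaying all earlier deltas.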
Directed graph that shows control
dependencies between blocks of commands
Each node represents basic block
Basic block – a piece of code that starts at a jump
target and ends with a jump, with no jump or
jump target inside the block
Two special blocks – entry block and exit
block
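The basic block definition above can be sketched as the classic "leader" computation. The instruction listing is a made-up toy, not real disassembly, and the sketch is simplified: it treats only instructions with an explicit target as jumps, so returns and indirect jumps would need extra handling.

```python
# Hypothetical toy listing: (address, mnemonic, jump target or None).
code = [
    (0, "mov", None),
    (1, "cmp", None),
    (2, "jz", 5),
    (3, "add", None),
    (4, "jmp", 1),
    (5, "ret", None),
]


def basic_blocks(instructions):
    """Split a linear listing into basic blocks (lists of addresses)."""
    addresses = [address for address, _, _ in instructions]
    leaders = {addresses[0]}                  # entry point starts a block
    for index, (_, _, target) in enumerate(instructions):
        if target is not None:
            leaders.add(target)               # a jump target starts a block
            if index + 1 < len(instructions):
                leaders.add(addresses[index + 1])  # so does the fall-through
    blocks, current = [], []
    for address in addresses:
        if address in leaders and current:
            blocks.append(current)
            current = []
        current.append(address)
    blocks.append(current)
    return blocks


blocks = basic_blocks(code)
print(blocks)
```

Adding an edge from each block to the blocks its last instruction can reach would turn this list into the control flow graph.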
Directed graph that represents calling
relationships between subroutines in a computer
program
Each node represents procedure
Each edge (a, b) indicates that procedure a calls
procedure b
Cycle in the graph indicates recursive procedure
calls
Static call graph represents every possible run of
the program
Dynamic call graph is a record of an execution of
the program
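As a small illustration of the call graph properties above, the sketch below stores a graph as an adjacency dictionary and reports every procedure that lies on a cycle, that is, every recursive procedure. The procedure names are invented for the example.

```python
# Hypothetical static call graph: caller -> list of callees.
call_graph = {
    "main": ["parse", "report"],
    "parse": ["read_token", "parse"],   # direct recursion
    "read_token": [],
    "report": ["format"],
    "format": ["report"],               # mutual recursion
}


def recursive_procedures(graph):
    """Return the procedures that can reach themselves via call edges."""
    def reachable(start):
        seen, stack = set(), list(graph.get(start, []))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph.get(node, []))
        return seen

    return {procedure for procedure in graph
            if procedure in reachable(procedure)}


recursive = recursive_procedures(call_graph)
print(sorted(recursive))
```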
Directed graph that represents data
dependencies between a number of
operations
Each node represents operation
Each edge represents variable
Ottenstein & Ottenstein – PDG, 1984
In effect a procedure dependence graph, because it was
introduced for programs with a single procedure
Each node represents a statement
Two types of edges
Control Dependence – between a predicate and
the statements it controls
Data Dependence – between statements
modifying a variable and those that may
reference it
Special “Entry” node is connected to all nodes
that are not control dependent on any predicate
Horwitz, Reps & Binkley – SDG, 1990
Includes a PDG for each procedure
New nodes: Call Site, Procedure Entry, Actual-in-
argument, Actual-out-argument, Formal-in-
parameter, Formal-out-parameter
Three new edge types
Call Edge – connects “call site” and “procedure
entry”
Parameter-In Edge – connects “Actual-in” with
“Formal-in”
Parameter-Out Edge – connects “Actual-out” with
“Formal-out”
Large programs must be decomposed for
understanding and manipulation.
Usually the decomposition is into procedures and
abstract data types.
Program Slicing is decomposition based on
data flow and control flow analysis.
A study showed that experienced programmers
mentally slice programs while debugging.
“The mental abstraction people make when
they are debugging a program” [Weiser]
All the statements of a program that may
affect the values of some variables in a set V
at some point of interest i.
A slicing criterion of a program P is a tuple (i,
V), where i is a statement in P and V is a
subset of variables in P.
Slicing Criterion:
C = (i , V)
Direction of slicing
◦ Backward
◦ Forward
Slicing techniques
◦ Static
◦ Dynamic
◦ Conditioned
Levels of slices
◦ Intraprocedural slicing
◦ Interprocedural slicing
Original Slicing Method
Backward slice of a program with respect to a
program point i and set of program variables V
consists of all statements and predicates in the
program that may affect the value of variables in
V at i
Answers the question “what program components
might affect a selected computation?”
Preserves the meaning of the variable(s) in the
slicing criterion for all possible inputs to the
program
Slice criterion <12,i>
◦ 1 main( )
◦ 2 {
◦ 3 int i, sum;
◦ 4 sum = 0;
◦ 5 i = 1;
◦ 6 while(i <= 10)
◦ 7 {
◦ 8 sum = sum + 1;
◦ 9 ++ i;
◦ 10 }
◦ 11 cout << sum;
◦ 12 cout << i;
◦ 13 }
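A backward slice for this example can be sketched as reachability over dependence edges. The `depends_on` table below was written by hand for the numbered program above (it is an assumption of this sketch, not the output of a tool). Statement 12 uses only `i`, so starting from node 12 matches the criterion <12, i>.

```python
# depends_on[s] = statements that s is data- or control-dependent on.
# Hand-built for the numbered example program (illustrative assumption).
depends_on = {
    4: set(),          # sum = 0
    5: set(),          # i = 1
    6: {5, 9},         # while (i <= 10) reads i
    8: {4, 8, 6},      # sum = sum + 1, controlled by the loop
    9: {5, 9, 6},      # ++i, controlled by the loop
    11: {4, 8},        # cout << sum
    12: {5, 9},        # cout << i
}


def backward_slice(dependencies, criterion):
    """Collect every statement the criterion statement may depend on."""
    result, worklist = set(), [criterion]
    while worklist:
        statement = worklist.pop()
        if statement not in result:
            result.add(statement)
            worklist.extend(dependencies.get(statement, ()))
    return sorted(result)


print(backward_slice(depends_on, 12))
print(backward_slice(depends_on, 11))
```

For statement 12 the slice drops every statement that touches `sum`; for statement 11 it keeps the loop and the updates of `i`, because the loop predicate controls how often `sum` is updated.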
• Forward slice of a program with respect to a program
point i and set of program variables V consists of all
statements and predicates in the program that may
be affected by the value of variables in V at i
• Answers the question “what program components
might be affected by a selected computation?”
• Can show the code affected by a modification to a
single statement
Slice criterion <3,sum>
◦ 1 main( )
◦ 2 {
◦ 3 int i, sum;
◦ 4 sum = 0;
◦ 5 i = 1;
◦ 6 while(i <= 10)
◦ 7 {
◦ 8 sum = sum + 1;
◦ 9 ++ i;
◦ 10 }
◦ 11 cout << sum;
◦ 12 cout << i;
◦ 13 }
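A forward slice follows the same dependence edges in the opposite direction. In the sketch below the `affects` table is hand-built for the numbered program above. Linking the declaration in statement 3 to every statement that mentions the declared variable is an assumption made here so that the criterion <3, sum> has somewhere to start.

```python
# affects[s] = list of (variable carried or None for control, dependent statement).
# Hand-built for the numbered example program (illustrative assumption).
affects = {
    3: [("sum", 4), ("sum", 8), ("sum", 11),
        ("i", 5), ("i", 6), ("i", 9), ("i", 12)],   # declaration edges
    4: [("sum", 8), ("sum", 11)],
    5: [("i", 6), ("i", 9), ("i", 12)],
    6: [(None, 8), (None, 9)],                      # control dependence
    8: [("sum", 8), ("sum", 11)],
    9: [("i", 6), ("i", 9), ("i", 12)],
}


def forward_slice(edges, statement, variables):
    """Statements that may be affected by `variables` at `statement`."""
    result = {statement}
    # Only edges carrying a criterion variable leave the starting statement.
    worklist = [t for v, t in edges.get(statement, []) if v in variables]
    while worklist:
        current = worklist.pop()
        if current not in result:
            result.add(current)
            worklist.extend(t for _, t in edges.get(current, []))
    return sorted(result)


print(forward_slice(affects, 3, {"sum"}))
print(forward_slice(affects, 5, {"i"}))
```

Under these assumptions the slice for <3, sum> contains only statements that touch `sum`, while a slice for <5, i> also reaches the `sum` statements through the loop predicate.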
Static Slicing does not make any assumptions
regarding the input.
Slices derived from the source code for all
possible input values
May lead to relatively big slices
Contains all statements that may affect a
variable for every possible execution
Current static methods can only compute
approximations
Slice criterion (12,i)
◦ 1 main( )
◦ 2 {
◦ 3 int i, sum;
◦ 4 sum = 0;
◦ 5 i = 1;
◦ 6 while(i <= 10)
◦ 7 {
◦ 8 sum = sum + 1;
◦ 9 ++ i;
◦ 10 }
◦ 11 cout << sum;
◦ 12 cout << i;
◦ 13 }
First introduced by Korel and Laski
Dynamic Slicing assumes a fixed input for a
program.
Only the dependences that occur in a specific
execution of the program are taken into account
Computed on a given input
Dynamic slicing criterion is a triple (input,
occurrence of a statement, variable) – it specifies
the input, and distinguishes between different
occurrences of a statement in the execution
history
1. read (n)
2. for i := 1 to n do
3.     a := 2
4.     if c1==1 then
5.         if c2==1 then
6.             a := 4
7.         else
8.             a := 6
9.     z := a
10. write (z)
• Assumptions
– Input n is 1
– c1, c2 are both true
– Execution history (superscript = occurrence number) is
1¹, 2¹, 3¹, 4¹, 5¹, 6¹, 9¹, 2²,
10¹
– Slice criterion <1, 10¹,
z>
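The dynamic slice for this criterion can be sketched by replaying the execution history and following only the dependences that actually occurred. The sets of defined and used variables and the controlling statement of each entry are hand-written here from one reading of the program's nesting (statements 3, 4 and 9 inside the loop, the else bound to the inner if). `c1` and `c2` are treated as inputs because the program never assigns them.

```python
# (statement, occurrence, defined vars, used vars, controlling statement)
history = [
    (1, 1, {"n"}, set(), None),
    (2, 1, {"i"}, {"n"}, None),
    (3, 1, {"a"}, set(), 2),
    (4, 1, set(), {"c1"}, 2),
    (5, 1, set(), {"c2"}, 4),
    (6, 1, {"a"}, set(), 5),
    (9, 1, {"z"}, {"a"}, 2),
    (2, 2, {"i"}, {"n", "i"}, None),
    (10, 1, set(), {"z"}, None),
]


def dynamic_slice(events, statement, occurrence):
    """Statements the given statement occurrence depended on in this run."""
    last_def, last_seen, deps, target = {}, {}, [], None
    for index, (stmt, occ, defs, uses, parent) in enumerate(events):
        current = {last_def[v] for v in uses if v in last_def}  # data deps
        if parent in last_seen:
            current.add(last_seen[parent])                      # control dep
        deps.append(current)
        for variable in defs:
            last_def[variable] = index
        last_seen[stmt] = index
        if (stmt, occ) == (statement, occurrence):
            target = index
    if target is None:
        raise ValueError("criterion not found in execution history")
    visited, worklist = set(), [target]
    while worklist:
        index = worklist.pop()
        if index not in visited:
            visited.add(index)
            worklist.extend(deps[index])
    return sorted({events[index][0] for index in visited})


print(dynamic_slice(history, 10, 1))
```

Statement 3 is left out because its value of `a` was overwritten by statement 6 before statement 9 read it, and statement 8 is left out because it never executed. A static slice for `z` at statement 10 would have to keep both.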
Assumption - input ‘a’ is a positive number
1. read(a)
2. if (a < 0)
3.     a = -a
4. x = 1/a
Computes a slice within a single procedure
Consists basically of two steps:
A single slice of the procedure containing the
slicing criterion is made.
Procedure calls from within this procedure
are sliced using new criteria.
Computes a slice over an entire program
Two ways of crossing a procedure boundary
Up – going from the sliced procedure into the
calling procedure
Down – going from the sliced procedure into the
called procedure
Must be context-sensitive
CodeSurfer
◦ Commercial product by GrammaTech, Inc.
◦ GUI Based
◦ Scripting language-Tk
Unravel
◦ Static program slicer developed at NIST
◦ Slices ANSI C programs
◦ Limitations are in the treatment of unions, forks
and pointers to functions
Slicing of Register on Code Coverage
Graph based view of file reading and moves
between memory blocks