Control-flow analysis discovers the flow of control within code by building basic blocks and a control-flow graph (CFG). It identifies loops by computing dominators. Basic blocks are sequences of straight-line code with a single entry and exit. A CFG represents control flow with nodes for basic blocks and edges for control transfers. Loops are detected by finding natural loops defined by back edges where the target dominates the source.
1. Control-Flow Analysis
Last time
- Undergraduate compilers in a day
Today
- Control-flow analysis
- Building basic blocks
- Building control-flow graphs
- Loops
CIS570 Lecture 3 Control-Flow Analysis 2
Context
Data-flow
- Flow of data values from defs to uses
Control-flow
- Sequencing of operations
  e.g., evaluation of then-code and else-code depends on the if-test
2. Representing Control-Flow
High-level representation
- Control flow is implicit in an AST
Low-level representation
- Use a control-flow graph (CFG)
- Nodes represent statements (low-level linear IR)
- Edges represent explicit flow of control
Other options
- Control dependences in program dependence graph (PDG) [Ferrante87]
- Dependences on explicit state in value dependence graph (VDG) [Weise 94]
What Is Control-Flow Analysis?
Control-flow analysis discovers the flow of control within a procedure
(e.g., builds a CFG, identifies loops)
Example
 1      a := 0
 2      b := a * b
 3  L1: c := b/d
 4      if c < x goto L2
 5      e := b / c
 6      f := e + 1
 7  L2: g := f
 8      h := t - g
 9      if e > 0 goto L3
10      goto L1
11  L3: return
[Figure: CFG for this code. Nodes: (1: a := 0; b := a * b), (3: c := b/d; c < x?), (5: e := b / c; f := e + 1), (7: g := f; h := t - g; e > 0?), (10: goto), (11: return). The e > 0? test leads to 10 on No and to 11 on Yes.]
3. Basic Blocks
Definition
- A basic block is a sequence of straight-line code that can be entered only
  at the beginning and exited only at the end
[Figure: example basic block (g := f; h := t - g; e > 0?) with No and Yes exits]
Building basic blocks
- Identify leaders
  - The first instruction in a procedure, or
  - The target of any branch, or
  - An instruction immediately following a branch (implicit target)
- Gobble all subsequent instructions until the next leader
Algorithm for Building Basic Blocks
Input: List of n instructions (instr[i] = ith instruction)
Output: Set of leaders & list of basic blocks
        (block[x] is block with leader x)
leaders = {1}                        // First instruction is a leader
for i = 1 to n                       // Find all leaders
    if instr[i] is a branch
        leaders = leaders ∪ set of potential targets of instr[i]
foreach x ∈ leaders
    block[x] = { x }
    i = x+1                          // Fill out x's basic block
    while i ≤ n and i ∉ leaders
        block[x] = block[x] ∪ { i }
        i = i+1
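The leader-finding and block-filling steps can be sketched in Python as follows. This is a minimal sketch, not the course's reference implementation: the names `find_leaders`, `build_blocks`, and `branch_targets` are hypothetical, and the instruction encoding (a map from the 1-based index of each branch to the indices of its explicit targets) is an assumption made for illustration. The instruction after any branch is added as a leader, following the leader definition.

```python
from typing import Dict, List, Set


def find_leaders(n: int, branch_targets: Dict[int, List[int]]) -> Set[int]:
    """Return the set of leader indices for instructions 1..n.

    branch_targets (assumed encoding) maps the index of each branch
    instruction to the indices of its explicit targets.
    """
    leaders = {1}                          # first instruction is a leader
    for i in range(1, n + 1):
        if i in branch_targets:            # instr[i] is a branch
            leaders.update(branch_targets[i])   # explicit targets
            if i + 1 <= n:
                leaders.add(i + 1)         # instruction following a branch
    return leaders


def build_blocks(n: int, leaders: Set[int]) -> Dict[int, List[int]]:
    """Map each leader x to the list of instruction indices in its block."""
    blocks = {}
    for x in sorted(leaders):
        block = [x]
        i = x + 1
        # Gobble instructions until the next leader or the end of the code.
        while i <= n and i not in leaders:
            block.append(i)
            i += 1
        blocks[x] = block
    return blocks


# The 11-instruction example: branches at 4 (to L2 = 7),
# 9 (to L3 = 11), and 10 (to L1 = 3).
example_targets = {4: [7], 9: [11], 10: [3]}
example_leaders = find_leaders(11, example_targets)
example_blocks = build_blocks(11, example_leaders)
```

Sorting the leaders is a design choice that makes the resulting blocks appear in program order, which is convenient when fall-through edges are added later.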
4. Building Basic Blocks Example
Example
 1      a := 0
 2      b := a * b
 3  L1: c := b/d
 4      if c < x goto L2
 5      e := b / c
 6      f := e + 1
 7  L2: g := f
 8      h := t - g
 9      if e > 0 goto L3
10      goto L1
11  L3: return
Leaders?
- {1, 3, 5, 7, 10, 11}
Blocks?
- {1, 2}
- {3, 4}
- {5, 6}
- {7, 8, 9}
- {10}
- {11}
Extended Basic Blocks
Extended basic blocks
- A maximal sequence of instructions that has no merge points in it
  (except perhaps in the leader)
- Single entry, multiple exits
How are these useful?
- Increases the scope of any local analysis or transformation that
  "flows forwards" (e.g., copy propagation, register renaming,
  instruction scheduling)
Reverse extended basic blocks
- Useful for "backward flow" problems
5. Building a CFG from Basic Blocks
Construction
- Each CFG node represents a basic block
- There is an edge from node i to j if
  - Last statement of block i branches to the first statement of j, or
  - Block i does not end with an unconditional branch and is
    immediately followed in program order by block j (fall through)
Input: A list of m basic blocks (block)
Output: A CFG where each node is a basic block
for i = 1 to m
    x = last instruction of block[i]
    if instr x is a branch
        for each target (to block j) of instr x
            create an edge from block i to block j
    if instr x is not an unconditional branch and i < m
        create an edge from block i to block i+1
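The edge-creation loop can be sketched in Python as below. The function name `build_cfg` and the parameters `branch_targets` and `no_fallthrough` are hypothetical; the sketch assumes blocks are given in program order and that `no_fallthrough` lists the instructions that never fall through (unconditional branches and, by assumption here, returns).

```python
from typing import Dict, List, Set, Tuple


def build_cfg(blocks: List[List[int]],
              branch_targets: Dict[int, List[int]],
              no_fallthrough: Set[int]) -> Set[Tuple[int, int]]:
    """Return CFG edges (i, j) over 0-based block indices."""
    # Map each leader instruction to the index of the block it starts.
    block_of_leader = {b[0]: idx for idx, b in enumerate(blocks)}
    edges = set()  # a set keeps at most one edge per pair of nodes
    for i, block in enumerate(blocks):
        x = block[-1]                       # last instruction of block i
        for target in branch_targets.get(x, []):
            edges.add((i, block_of_leader[target]))   # branch edge
        if x not in no_fallthrough and i + 1 < len(blocks):
            edges.add((i, i + 1))           # fall-through edge
    return edges


# Blocks of the running example, in program order.
example_blocks = [[1, 2], [3, 4], [5, 6], [7, 8, 9], [10], [11]]
example_edges = build_cfg(
    example_blocks,
    branch_targets={4: [7], 9: [11], 10: [3]},
    no_fallthrough={10, 11},                # goto L1 and return
)
```

For the running example this yields the edges 0→1, 1→2, 1→3, 2→3, 3→4, 3→5, and 4→1, where block indices 0 to 5 correspond to leaders 1, 3, 5, 7, 10, and 11.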
Details
Multiple edges between two nodes
        ...
        if (a<b) goto L2
    L2: ...
- Combine these edges into one edge
Unreachable code
        ...
        goto L1
    L0: a = 10
    L1: ...
- Perform DFS from entry node
- Mark each basic block as it is visited
- Unmarked blocks are unreachable
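The DFS marking step for unreachable code might look like the following Python sketch. The function name `reachable_blocks`, the successor-map encoding, and the block names `B0`, `L0`, `L1` are assumptions chosen to mirror the small example above.

```python
from typing import Dict, List, Set


def reachable_blocks(succ: Dict[str, List[str]], entry: str) -> Set[str]:
    """Return the set of blocks marked by a depth-first search from entry."""
    marked = set()
    stack = [entry]
    while stack:
        node = stack.pop()
        if node in marked:
            continue                  # already visited
        marked.add(node)              # mark the block as it is visited
        stack.extend(succ.get(node, []))
    return marked


# B0 ends with "goto L1", so the block labeled L0 is never reached.
succ = {"B0": ["L1"], "L0": ["L1"], "L1": []}
unreachable = set(succ) - reachable_blocks(succ, "B0")
```

An explicit stack is used instead of recursion so that large CFGs do not hit Python's recursion limit.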
6. Challenges
When is CFG construction more challenging?
Languages where jump targets may be unknown
- e.g., Executable code
      ld $8, 104($7)
      jmp $8
Languages with user-defined control structures
- e.g., Cecil
      if ( &{x = 3}, &{a := a + 1}, &{a := a - 1} );
Loop Concepts
Loop: Strongly connected component of CFG
Loop entry edge: Source not in loop & target in loop
Loop exit edge: Source in loop & target not in loop
Loop header node: Target of loop entry edge
Natural loop: Loop with only a single loop header
Back edge: Target is loop header & source is in the loop
Loop tail node: Source of back edge
Loop preheader node: Single node that's the source of the loop entry edge
Nested loop: Loop whose header is inside another loop
Reducible flow graph: CFG whose loops are all natural loops
7. Picturing Loop Terminology
[Figure: a natural loop annotated with its preheader, entry edge, head, tail, back edge, and exit edge]
The Value of Preheader Nodes
Not all loops have preheaders
- Sometimes it is useful to create them
Without preheader node
- There can be multiple entry edges
With single preheader node
- There is only one entry edge
Useful when moving code outside the loop
- Don't have to replicate code for multiple entry edges
[Figure: a loop with header h and tail t, shown without a preheader and with a preheader p]
8. Identifying Loops
Why?
- Most execution time is spent in loops, so optimizing loops will often
  give the most benefit
Many approaches
- Interval analysis
  - Exploit the natural hierarchical structure of programs
  - Decompose the program into nested regions called intervals
- Structural analysis: a generalization of interval analysis
- Identify dominators to discover loops
We'll look at the dominator-based approach
Dominator Terminology
Dominators
    d dom i if all paths from entry to node i include d
Strict dominators
    d sdom i if d dom i and d ≠ i
Immediate dominators
    a idom b if a sdom b and there does not exist a node c
    such that c ≠ a, c ≠ b, a dom c, and c dom b
    (not ∃ c such that a sdom c and c sdom b)
Post dominators
    p pdom i if every possible path from i to exit includes
    p (p dom i in the flow graph whose arcs are reversed
    and entry and exit are interchanged)
[Figures: d dom i (every path from entry to i passes through d); a idom b (a is the closest strict dominator of b); p pdom i (every path from i to exit passes through p)]
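The definitions above can be made concrete with a small Python sketch. It uses the standard iterative set-intersection formulation of dominators, which is one common way to compute them and is not necessarily the algorithm presented in the lecture. The function names and the successor-map encoding are hypothetical, and the sketch assumes every node appears as a key of `succ` and is reachable from `entry`.

```python
from typing import Dict, List, Set, Tuple


def dominators(succ: Dict[int, List[int]], entry: int) -> Dict[int, Set[int]]:
    """Return dom[n], the set of nodes d such that d dom n."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in pred[n]]
            common = set.intersection(*pred_doms) if pred_doms else set()
            new = {n} | common
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def immediate_dominators(dom: Dict[int, Set[int]], entry: int) -> Dict[int, int]:
    """Return idom[n] for every node except the entry."""
    idom = {}
    for n, ds in dom.items():
        if n != entry:
            # Strict dominators of n form a chain; the one closest to n
            # is the one that itself has the most dominators.
            idom[n] = max(ds - {n}, key=lambda d: len(dom[d]))
    return idom


def back_edges(succ: Dict[int, List[int]],
               dom: Dict[int, Set[int]]) -> List[Tuple[int, int]]:
    """Return edges (tail, head) whose target dominates their source."""
    return [(t, h) for t, targets in succ.items() for h in targets if h in dom[t]]


# CFG of the running example, with each block named by its leader.
cfg = {1: [3], 3: [5, 7], 5: [7], 7: [10, 11], 10: [3], 11: []}
dom = dominators(cfg, 1)
idom = immediate_dominators(dom, 1)
loops = back_edges(cfg, dom)
```

Here the only back edge is 10→3, so block 3 is the header of the single natural loop. Post dominators can be obtained with the same `dominators` function by reversing every edge and starting from the exit node.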
11. Handling Irreducible CFGs
Node splitting
- Can turn irreducible CFGs into reducible CFGs
[Figure: an irreducible CFG on nodes a, b, c, d, e, and the reducible CFG obtained by splitting d into d and d′]
General idea
- Reduce graph (iteratively remove self edges, merge nodes with single pred)
- More than one node left => irreducible
- Split any multi-parent node and start over
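The reduction step in the general idea can be sketched in Python as a reducibility test. This is a minimal sketch under stated assumptions: the name `is_reducible` and the successor-set encoding are hypothetical, every node is assumed to be reachable from `entry`, and the node-splitting step itself is not shown.

```python
from typing import Dict, Set


def is_reducible(succ: Dict[str, Set[str]], entry: str) -> bool:
    """Collapse the graph; it is reducible if a single node remains."""
    # Work on a private copy so the caller's graph is untouched.
    g = {n: set(targets) for n, targets in succ.items()}
    changed = True
    while changed:
        changed = False
        # Remove self edges.
        for n in g:
            if n in g[n]:
                g[n].discard(n)
                changed = True
        # Merge one non-entry node that has a single predecessor into it.
        for n in list(g):
            if n == entry:
                continue
            preds = [p for p in g if n in g[p]]
            if len(preds) == 1:
                p = preds[0]
                g[p].discard(n)
                g[p] |= g.pop(n)      # p inherits n's out-edges
                changed = True
                break                 # restart so new self edges are removed
    return len(g) == 1


# The running example's CFG (blocks named by leader) is reducible.
reducible = {"1": {"3"}, "3": {"5", "7"}, "5": {"7"},
             "7": {"10", "11"}, "10": {"3"}, "11": set()}
# A loop with two entries (a -> b and a -> c) is irreducible.
irreducible = {"a": {"b", "c"}, "b": {"c"}, "c": {"b"}}
```

In the irreducible example, both b and c have two predecessors, so no merge applies and three nodes remain; splitting one of them would let the reduction continue.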
Why Go To All This Trouble?
Modern languages provide structured control flow
- Shouldn't the compiler remember this information rather than throw it
  away and then re-compute it?
Answers?
- We may want to work on the binary code, in which case such information
  is unavailable
- Most modern languages still provide a goto statement
- Languages typically provide multiple types of loops. This analysis lets us
  treat them all uniformly
- We may want a compiler with multiple front ends for multiple languages;
  rather than translate each language to a CFG, translate each language to a
  canonical LIR, and translate that representation once to a CFG
12. Concepts
Control-flow analysis
Basic blocks
- Computing basic blocks
- Extended basic blocks
Control-flow graph (CFG)
Loop terminology
Identifying loops
Dominators
Reducibility
Next Time
Discussion
- "Binary Translation" by Sites et al.
- Discussion questions online
- Come prepared
After that: lecture
- Introduction to data-flow analysis