The document discusses intermediate code generation in compilers. It describes how compilers generate intermediate code representations after parsing source code. Intermediate representations allow separating the front-end and back-end of compilers, facilitating code optimization and retargeting compilers to different architectures. Common intermediate representations discussed include abstract syntax trees, postfix notation, static single assignment form, and three-address instructions. The document also provides examples of generating three-address code using syntax-directed translation.
2. CS 540 GMU Spring 2007 2
Compiler Architecture
[Figure: compiler pipeline]
Source language → Scanner (lexical analysis) → tokens → Parser (syntax analysis) → syntactic structure → Semantic Analysis (IC generator) → intermediate code → Code Optimizer → intermediate code → Code Generator → target language
The Symbol Table sits alongside these phases.
3. Joey Paquet, 2000, 2002 3
Introduction to Code Generation
• Front end:
– Lexical Analysis
– Syntactic Analysis
– Intermediate Code Generation
• Back end:
– Intermediate Code Optimization
– Object Code Generation
• The front end is machine-independent, i.e. it can be reused to build
compilers for different architectures
• The back end is machine-dependent, i.e. these steps are related to the
nature of the assembly or machine language of the target architecture
4. 08/31/13 4
Introduction to Code Generation
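[Figure: several front ends and back ends sharing one intermediate code]
Source program in Language-1 → Language-1 Front End → Non-optimized Intermediate Code
Source program in Language-2 → Language-2 Front End → Non-optimized Intermediate Code
Non-optimized Intermediate Code → Intermediate-code Optimizer → Optimized Intermediate Code
Optimized Intermediate Code → Target-1 Code Generator → Target-1 machine code
Optimized Intermediate Code → Target-2 Code Generator → Target-2 machine code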
5. Introduction to Code Generation
• After syntactic analysis, we have a number of options to choose from:
– generate object code directly from the parse
– generate intermediate code, and then generate object code from it
– generate an intermediate abstract representation, and then generate code
directly from it
– generate an intermediate abstract representation, generate intermediate code,
and then the object code
• All these options have one thing in common: they are all based on
syntactic information gathered in the semantic analysis
6. Introduction to Code Generation
[Figure: the four organizations, front end on the left and back end on the right]
– Lexical Analyzer → Syntactic Analyzer → Object Code
– Lexical Analyzer → Syntactic Analyzer → Intermediate Code → Object Code
– Lexical Analyzer → Syntactic Analyzer → Intermediate Representation → Object Code
– Lexical Analyzer → Syntactic Analyzer → Intermediate Representation → Intermediate Code → Object Code
7. Intermediate Representation (IR)
A kind of abstract machine language that can express the target machine's operations without committing to too many machine details.
• Why IR?
11. Advantages of Using an Intermediate Language
1. Retargeting - Build a compiler for a new machine by
attaching a new code generator to an existing front-end.
2. Optimization - reuse intermediate code optimizers in
compilers for different languages and different
machines.
Note: the terms “intermediate code”, “intermediate
language”, and “intermediate representation” are all
used interchangeably.
12. Issues in Designing an IR
• Whether to use an existing IR
– if the target machine architecture is similar
– if the new language is similar
• Whether the IR is appropriate for the kind of optimizations to be performed
– e.g. speculation and predication
– some transformations may take much longer than they would on a different IR
13. Issues in Designing an IR
• Designing a new IR needs to consider
– Level (how machine-dependent it is)
– Structure
– Expressiveness
– Appropriateness for general and special optimizations
– Appropriateness for code generation
– Whether multiple IRs should be used
14. What are the IRs in actual compilers?
• gcc is a widely used compiler on many platforms. It uses two IRs, AST (Abstract Syntax Tree) and RTL (Register Transfer Language), and some development paths use Tree-SSA.
[SSA: Static Single Assignment: each name is assigned once. We will talk about this later!]
• A VM can be seen as a new type of IR
– Java Bytecode
– .NET IL
• Some programming languages have well-defined intermediate languages.
– Java: Java Virtual Machine
– Prolog: Warren Abstract Machine
In fact, there are byte-code emulators that execute instructions in these intermediate languages.
15. Intermediate Code Generation
• Direct Translation
– Using SDT scheme
– Parse tree to Three-Address Instructions
– Can be done while parsing in a single pass
– Needs to be able to deal with Syntactic Errors and Recovery
• Indirect Translation
– First validate the parse by constructing an AST
– Uses an SDT scheme to build the AST
– Traverse the AST and generate Three-Address Instructions
[Figure: Parse tree or AST (IR) → Intermediate Code Generation, O(n) → Three-Address Instructions (IR, ∞ regs)]
16. Syntax-directed definition to produce AST for assignment statements
production          semantic rules
S → id := E         S.nptr := mknode('assign', mkleaf(id, id.entry), E.nptr)
E → E1 + E2         E.nptr := mknode('+', E1.nptr, E2.nptr)
E → E1 ∗ E2         E.nptr := mknode('∗', E1.nptr, E2.nptr)
E → −E1             E.nptr := mkunode('uminus', E1.nptr)
E → (E1)            E.nptr := E1.nptr
E → id              E.nptr := mkleaf(id, id.entry)
1. Syntax Tree vs DAG
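17. Syntax Tree vs DAG
[Figure: syntax tree for a := (−b + c∗d) + c∗d. The root assign has children a and +. That + has children + and ∗ (c, d). The inner + has children uminus (b) and a second, separate ∗ (c, d).]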
18. Syntax Tree vs DAG
• If mknode returns a pointer to an existing node whenever possible, a DAG can be produced
[Figure: (a) syntax tree and (b) DAG for a := (−b + c∗d) + c∗d. In the DAG, both + nodes share a single ∗ node with children c and d.]
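One way to get the "return an existing node whenever possible" behaviour is to key every node on its operator and children. The following is a minimal sketch, not the slides' implementation: the class name DagBuilder and the tuple encoding of nodes are assumptions made for illustration.

```python
class DagBuilder:
    """Hypothetical node store: identical (op, left, right) triples share one node."""

    def __init__(self):
        self.nodes = []   # node id -> (op, left, right)
        self.index = {}   # (op, left, right) -> node id

    def mknode(self, op, left=None, right=None):
        key = (op, left, right)
        if key not in self.index:          # allocate only when no identical node exists
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

    def mkleaf(self, name):
        return self.mknode("id", name)


# a := (-b + c*d) + c*d
b = DagBuilder()
cd1 = b.mknode("*", b.mkleaf("c"), b.mkleaf("d"))
inner = b.mknode("+", b.mknode("uminus", b.mkleaf("b")), cd1)
cd2 = b.mknode("*", b.mkleaf("c"), b.mkleaf("d"))   # reuses the node built for cd1
root = b.mknode("assign", b.mkleaf("a"), b.mknode("+", inner, cd2))
```

Here the second c∗d resolves to the node already built, so the store ends up with 9 nodes where a plain syntax tree would allocate 12.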
19. 2. Postfix Notation
A mathematical notation wherein every operator follows all of its operands.
Form Rules:
1. If E is a variable/constant, the PN of E is E itself.
2. If E is an expression of the form E1 op E2, the PN of E is E1' E2' op (E1' and E2' are the PN of E1 and E2, respectively).
3. If E is a parenthesized expression of the form (E1), the PN of E is the same as the PN of E1.
The PN of the expression 9 * (5 + 2) is 952+*
How about (a+b)/(c-d)? ab+cd-/
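The three form rules translate almost directly into a recursive function. This is a small sketch under the assumption that an expression is given as a tree of nested tuples (op, left, right) with strings at the leaves; the name to_postfix is hypothetical.

```python
def to_postfix(expr):
    """Return the postfix notation of an expression tree."""
    if isinstance(expr, str):        # rule 1: a variable or constant is its own PN
        return expr
    op, left, right = expr           # rule 2: PN(E1) PN(E2) op
    # Rule 3 needs no code: parentheses only shape the tree.
    return to_postfix(left) + to_postfix(right) + op


first = to_postfix(("*", "9", ("+", "5", "2")))
second = to_postfix(("/", ("+", "a", "b"), ("-", "c", "d")))
```

The two calls reproduce the slide's answers, 952+* and ab+cd-/.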
20. Intermediate-Code Generation 20
3. Static Single-Assignment Form
• Static single assignment form (SSA) is an
intermediate representation that facilitates certain
code optimization.
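• Two distinct aspects distinguish SSA from three-address code.
– All assignments in SSA are to variables with distinct names; hence the term static single-assignment.
– Where control-flow paths that define the same variable meet, SSA uses a φ-function to combine the definitions.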
21. 3. Static Single-Assignment Form
Original code:
if (flag) x = -1; else x = 1;
y = x * a;
SSA form:
if (flag) x1 = -1; else x2 = 1;
x3 = φ(x1, x2);
y = x3 * a;
22. 4. Three Address Instructions IR
• Construct mapped to Three-Address Instructions
– Register-based IR for expression evaluation
– Infinite number of virtual registers
– Still independent of target architecture
• Generic Statement Format:
Label: x = y op z or if exp goto L
– Statements can have symbolic labels
– Compiler inserts temporary variables
– Types and conversions are dealt with in other phases of code generation
23. Types of Three-address Statements
• Assignment
– Binary: x := y op z
– Unary: x := op y
– “op” can be any reasonable arithmetic or logic
operator.
• Copy
– Simple: x := y
– Indexed: x := y[i] or x[i] := y
– Address and pointer manipulation:
• x := &y
• x := * y
• *x := y
24. Types of Three-address Statements
• Jump
– Unconditional: goto L
– Conditional: if x relop y goto L1 [else goto L2], where relop is <, =, >, ≥, ≤, or ≠.
• Procedure call
– Call procedure P(X1,X2, . . . ,Xn)
PARAM X1
PARAM X2
...
PARAM Xn
CALL P, n
26. Quadruples
• A quadruple is a record structure with four fields:
op, arg1, arg2, and result
– The op field contains an internal code for an operator
– Statements with unary operators do not use arg2
– Operators like param use neither arg2 nor result
– The target label for conditional and unconditional jumps
are in result
• The contents of fields arg1, arg2, and result
are typically pointers to symbol table entries
– If so, temporaries must be entered into the symbol table
as they are created
– Obviously, constants need to be handled differently
27. Quadruples Example
a := b * -c + b * -c
     op      arg1  arg2  result
(0)  uminus  c           t1
(1)  *       b     t1    t2
(2)  uminus  c           t3
(3)  *       b     t3    t4
(4)  +       t2    t4    t5
(5)  :=      t5          a
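The table above can be produced by a post-order walk of the expression tree that emits one quadruple per operator. This is a sketch only: the Quad record, the tuple encoding of expressions, and the name gen_quads are assumptions, and operands are kept as plain names rather than symbol-table pointers.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    op: str
    arg1: str
    arg2: Optional[str]
    result: str


def gen_quads(target, expr):
    """expr is a name (str), a unary tuple (op, e), or a binary tuple (op, e1, e2)."""
    quads = []

    def walk(e):
        if isinstance(e, str):
            return e
        args = [walk(sub) for sub in e[1:]]          # operands first (post-order)
        temp = f"t{len(quads) + 1}"                  # one new temporary per quadruple
        arg2 = args[1] if len(args) == 2 else None   # unary operators leave arg2 empty
        quads.append(Quad(e[0], args[0], arg2, temp))
        return temp

    value = walk(expr)
    quads.append(Quad(":=", value, None, target))
    return quads


neg_c = ("uminus", "c")
quads = gen_quads("a", ("+", ("*", "b", neg_c), ("*", "b", neg_c)))
```

For a := b * -c + b * -c this yields the same six rows as the table, with t1 through t5 as results.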
28. Triples
• Triples refer to a temporary value by the position of
the statement that computes it
– Statements can be represented by a record with only three
fields: op, arg1, and arg2
– Avoids the need to enter temporary names into the
symbol table
• Contents of arg1 and arg2:
– Pointer into symbol table (for programmer defined
names)
– Pointer into triple structure (for temporaries)
– Of course, still need to handle constants differently
29. Triples Example
a := b * -c + b * -c
     op      arg1  arg2
(0)  uminus  c
(1)  *       b     (0)
(2)  uminus  c
(3)  *       b     (2)
(4)  +       (1)   (3)
(5)  assign  a     (4)
Result is implicit in triples
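With triples, a walk of the expression tree returns the position of the triple that computes a value instead of a temporary name. In this sketch (the function name gen_triples and the encoding are assumptions) a string operand stands for a program name and an integer operand stands for a reference to an earlier triple.

```python
def gen_triples(target, expr):
    """Return a list of (op, arg1, arg2); int arguments refer to earlier triples."""
    triples = []

    def walk(e):
        if isinstance(e, str):
            return e                              # programmer-defined name
        args = [walk(sub) for sub in e[1:]]
        arg2 = args[1] if len(args) == 2 else None
        triples.append((e[0], args[0], arg2))
        return len(triples) - 1                   # position stands in for the temporary

    value = walk(expr)
    triples.append(("assign", target, value))
    return triples


neg_c = ("uminus", "c")
triples = gen_triples("a", ("+", ("*", "b", neg_c), ("*", "b", neg_c)))
```

No temporary names are created, which is the point of the representation: triple 4 refers to positions 1 and 3 directly.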
30. An indexed assignment requires two triples:
x[i] := y
     op    arg1  arg2
(0)  []=   x     i
(1)  :=    (0)   y
31. Indirect triples
• Indirect triples add a list of pointers to triples, so that triples can be shared and moved easily
a := b * -c + b * -c
Statement list:
     statement
(0)  (14)
(1)  (15)
(2)  (16)
(3)  (17)
(4)  (18)
(5)  (19)
Triples:
      op      arg1  arg2
(14)  uminus  c
(15)  *       b     (14)
(16)  uminus  c
(17)  *       b     (16)
(18)  +       (15)  (17)
(19)  assign  a     (18)
33. Syntax-directed translation into three-address code
production: S → while E do S1
semantic rules:
S.begin := newlabel;
S.after := newlabel;
S.code := gen(S.begin ':') ‖
          E.code ‖
          gen('if' E.place '=' '0' 'goto' S.after) ‖
          S1.code ‖
          gen('goto' S.begin) ‖
          gen(S.after ':')
34. Declarations
• enter symbols in a symbol table
• allocate space and record it in the symbol table
• emit appropriate code
35. Declarations in a procedure
• computing types and relative address of names
P → {offset := 0} D
D → D ; D
D → id : T {enter(id.name, T.type, offset); offset := offset + T.width}
T → integer {T.type := integer; T.width := 4}
T → real {T.type := real; T.width := 8}
T → array [ num ] of T1 {T.type := array(num.val, T1.type); T.width := num.val × T1.width}
T → ↑T1 {T.type := pointer(T1.type); T.width := 4}
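The width and offset rules above can be exercised with a short sketch. The type encoding (strings for basic types, tuples for array and pointer) and the names width_of and layout are assumptions for illustration; the widths 4, 8 and 4 are the ones given in the grammar actions.

```python
def width_of(t):
    """Width in bytes of a type, following the T productions."""
    if t == "integer":
        return 4
    if t == "real":
        return 8
    if t[0] == "array":                # ("array", num, element_type)
        return t[1] * width_of(t[2])
    if t[0] == "pointer":              # ("pointer", target_type)
        return 4
    raise ValueError(f"unknown type: {t!r}")


def layout(decls):
    """Mimic D -> id : T: enter each name at the current offset, then advance."""
    table, offset = {}, 0              # P -> {offset := 0} D
    for name, t in decls:
        table[name] = (t, offset)      # enter(id.name, T.type, offset)
        offset += width_of(t)          # offset := offset + T.width
    return table, offset


table, total = layout([
    ("i", "integer"),
    ("x", "real"),
    ("a", ("array", 10, "integer")),
    ("p", ("pointer", "real")),
])
```

The four names land at relative addresses 0, 4, 12 and 52, and the procedure needs 56 bytes in total.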
36. Syntax-Directed Translation to Three-Address Code
• Attributes for the Non-Terminals, say E and S
– Location of the value of an expression: E.place
– The code that evaluates the expression or statement: E.code
– Markers for the beginning and end of sections of the code: S.begin, S.end
• Semantic Actions in Productions of the Grammar
– Functions to create temporaries (newtemp) and labels (newlabel)
– Use auxiliary functions to enter symbols and consult the types corresponding to declarations in a side data structure that can be built as the code is being parsed: a symbol table.
– To generate the code we use the emit function gen, which creates a list of instructions to be emitted later and can generate symbolic labels corresponding to the next instruction of a list.
– Use of an append function on lists of instructions.
– Synthesized and Inherited Attributes
37. Assignment Statements: Grammar and
Actions
S → id = E { p = lookup(id.name);
if (p != NULL)
S.code = append(E.code, gen(p '=' E.place));
else error;
}
E → E1 + E2 { E.place = newtemp();
E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '+' E2.place)); }
E → E1 * E2 { E.place = newtemp();
E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '*' E2.place)); }
38. Assignment Statements: Grammar and
Actions
E → - E1 {E.place = newtemp();
E.code = append(E1.code,gen(E.place ‘=‘ ‘-’ E1.place)); }
E → (E1) {E.place = E1.place; E.code = E1.code; }
E → id {p = lookup(id.name);
if (p != NULL)
E.place = p;
else
error;
E.code = nulllist;
}
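The actions for assignments can be mimicked by a recursive function that returns the pair (place, code) for each expression. This is a sketch under stated assumptions: the Translator class, the tuple encoding of the tree, and the use of NameError for the error case are illustrative choices, and the symbol table is reduced to a set of declared names.

```python
class Translator:
    def __init__(self, declared):
        self.declared = set(declared)   # stand-in for the symbol table
        self.ntemps = 0

    def newtemp(self):
        self.ntemps += 1
        return f"t{self.ntemps}"

    def lookup(self, name):
        if name not in self.declared:   # the 'error' branch of the actions
            raise NameError(f"undeclared identifier: {name}")
        return name

    def expr(self, e):
        """Return (E.place, E.code) for an expression tree."""
        if isinstance(e, str):                      # E -> id
            return self.lookup(e), []
        if len(e) == 2:                             # E -> - E1
            p1, c1 = self.expr(e[1])
            t = self.newtemp()
            return t, c1 + [f"{t} = - {p1}"]
        op, left, right = e                         # E -> E1 op E2
        p1, c1 = self.expr(left)
        p2, c2 = self.expr(right)
        t = self.newtemp()
        return t, c1 + c2 + [f"{t} = {p1} {op} {p2}"]

    def assign(self, name, e):                      # S -> id = E
        p = self.lookup(name)
        place, code = self.expr(e)
        return code + [f"{p} = {place}"]


tr = Translator("abcdefx")
tree = ("+", ("*", "a", "b"), ("-", ("*", "c", "d"), ("*", "e", "f")))
code = tr.assign("x", tree)
```

The tree mirrors the grouping used in the walkthrough that follows, a * b + (c * d - e * f). Because this sketch visits operands left to right, its temporaries are numbered in a different order than in the slides.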
39. Assignment: Example
x = a * b + c * d - e * f;
[Figure: parse tree for the statement. S → id (x) = E, where E → E + E; the left operand is a * b and the right operand is E - E over c * d and e * f.]
40. Assignment: Example
x = a * b + c * d - e * f;
Production:
E → id { p = lookup(id.name);
if (p != NULL)
E.place = p;
else
error;
E.code = nulllist;
}
Parse tree annotation after reducing id (e):
E (e): place = loc(e), code = null
41. Assignment: Example
x = a * b + c * d - e * f;
Production:
E → id { p = lookup(id.name);
if (p != NULL)
E.place = p;
else
error;
E.code = nulllist;
}
Parse tree annotations after reducing id (f):
E (e): place = loc(e), code = null
E (f): place = loc(f), code = null
42. Assignment: Example
x = a * b + c * d - e * f;
Production:
E → E1 * E2 { E.place = newtemp();
E.code = gen(E.place '=' E1.place '*' E2.place); }
Parse tree annotations:
E (e): place = loc(e), code = null
E (f): place = loc(f), code = null
E (e * f): place = loc(t1), code = {t1 = e * f;}
43. Assignment: Example
x = a * b + c * d - e * f;
Production:
E → E1 * E2 { E.place = newtemp();
E.code = gen(E.place '=' E1.place '*' E2.place); }
Parse tree annotations:
E (e): place = loc(e), code = null
E (f): place = loc(f), code = null
E (e * f): place = loc(t1), code = {t1 = e * f;}
E (c): place = loc(c), code = null
E (d): place = loc(d), code = null
E (c * d): place = loc(t2), code = {t2 = c * d;}
44. Assignment: Example
x = a * b + c * d - e * f;
Production:
S → id = E { p = lookup(id.name);
if (p != NULL)
S.code = append(E.code, gen(p '=' E.place));
else
error;
}
Parse tree annotations (bottom-up):
E (e): place = loc(e), code = null
E (f): place = loc(f), code = null
E (e * f): place = loc(t1), code = {t1 = e * f;}
E (c): place = loc(c), code = null
E (d): place = loc(d), code = null
E (c * d): place = loc(t2), code = {t2 = c * d;}
E (c * d - e * f): place = loc(t3), code = {t1 = e * f; t2 = c * d; t3 = t2 - t1;}
E (a): place = loc(a), code = null
E (b): place = loc(b), code = null
E (a * b): place = loc(t4), code = {t4 = a * b;}
E (a * b + c * d - e * f): place = loc(t5), code = {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3;}
id (x): place = loc(x), code = null
S: code = {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3; x = t5;}
45. Assignment: Example
x = a * b + c * d - e * f;
Generated code:
t1 = e * f;
t2 = c * d;
t3 = t2 - t1;
t4 = a * b;
t5 = t4 + t3;
x = t5;
46. Reusing Temporary Variables
• Temporary Variables
– Short lived
– Used for Evaluation of Expressions
– Clutter the Symbol Table
• Change the newtemp Function
– Keep track of when the value created in a temporary is used
– Use a counter to keep track of the number of active temps
– When a temporary is used in an expression decrement counter
– When a temporary is generated by newtemp increment counter
– Initialize counter to zero
47. Assignment: Example
• Only 2 Registers Needed
x = a * b + c * d - e * f;
// c = 0
t1 = e * f; // c = 1
t2 = c * d; // c = 2
t1 = t2 - t1; // c = 1
t2 = a * b; // c = 2
t1 = t2 + t1; // c = 1
x = t1; // c = 0
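The counter scheme can be sketched as follows. The function name translate and the tuple encoding of the tree are assumptions; to reproduce the numbering shown on the slide, this sketch also assumes the right operand is evaluated before the left one.

```python
def translate(target, expr):
    """Generate three-address code, reusing temporaries via a counter of active temps."""
    code = []
    active = 0                        # the counter c from the slides

    def newtemp():
        nonlocal active
        active += 1                   # generating a temporary increments the counter
        return f"t{active}"

    def walk(e):
        nonlocal active
        if isinstance(e, str):
            return e, False           # (place, is_temporary)
        op, left, right = e
        rplace, rtemp = walk(right)   # right operand first, to match the slide
        lplace, ltemp = walk(left)
        active -= int(ltemp) + int(rtemp)   # each temporary used as an operand is released
        t = newtemp()
        code.append(f"{t} = {lplace} {op} {rplace}")
        return t, True

    place, is_temp = walk(expr)
    if is_temp:
        active -= 1                   # the final copy uses the last temporary
    code.append(f"{target} = {place}")
    return code, active


tree = ("+", ("*", "a", "b"), ("-", ("*", "c", "d"), ("*", "e", "f")))
code, remaining = translate("x", tree)
```

Only the names t1 and t2 are ever generated, and the counter returns to zero after the final copy.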
48. Boolean & Relational Values
How should the compiler represent them?
• Answer depends on the target machine
Two classic approaches
• Numerical representation
• Positional (implicit) representation
Correct choice depends on both context and ISA
49. • Issue: Control Flow Introduces Complications
– In Both Representations
– Need to Know Address to Jump To in Some Cases
• Solution: Two Additional Attributes
– nextstat (Inherited) Indicates the next location to be generated
– laststat (Synthesized) Indicates the last location filled
– As code is generated the attributes are filled with the correct value
SDT Scheme for Boolean Expressions
55. Boolean Expressions: Example
a < b or c < d and e < f
00: if a < b goto 03
01: t1 = 0
02: goto 04
03: t1 = 1
04: if c < d goto 07
05: t2 = 0
06: goto 08
07: t2 = 1
08: if e < f goto 11
09: t3 = 0
10: goto 12
11: t3 = 1
12: t4 = t2 and t3
13: t5 = t1 or t4
[Figure: parse tree for a < b or (c < d and e < f), built from three E → id relop id nodes]
56. Control Flow Statements: Code Layout
S → if E then S1
          E.code      (jumps to E.true or to E.false)
E.true:   S1.code
E.false:  ...
S → if E then S1 else S2
          E.code      (jumps to E.true or to E.false)
E.true:   S1.code
          goto S.next
E.false:  S2.code
S.next:   ...
• Attributes:
– E.true: the label to which control flows if E is true
– E.false: the label to which control flows if E is false
– S.next: an inherited attribute with the symbolic label of the code following S
57. Control Flow Statements: Code Layout
S → while E do S1
S.begin:  E.code      (jumps to E.true or to E.false)
E.true:   S1.code
          goto S.begin
E.false:  ...
• Difficulty: Need to know where to jump to
– Introduce symbolic labels using the newlabel function
– Use inherited attributes
– Backpatch it later with the actual value (later…)
58. Control Flow Statements: Grammar
and Actions
S → if E then S1
{
E.true = newlabel
E.false = S.next
S1.next = S.next
S.code = append(E.code,gen(E.true:),S1.code)
}
59. Control Flow Statements: Grammar
and Actions
S → if E then S1 else S2
{ E.true = newlabel
E.false = newlabel
S1.next = S.next
S2.next = S.next
S.code = append(E.code,gen(E.true:),S1.code,
gen(goto S.next),gen(E.false :),S2.code)
}
60. Control Flow Statements: Grammar
and Actions
S → while E do S1
{ S.begin = newlabel
E.true = newlabel
E.false = S.next
S1.next = S.begin
S.code = append(gen(S.begin:), E.code, gen(E.true:),
S1.code, gen(goto S.begin))
}
61. Control Flow Translation of Boolean
Expressions
• Short-Circuit Evaluation
– No Need to Evaluate portions of the expression if the outcome is
already determined
– Examples:
• E1 or E2 need not evaluate E2 if E1 is known to be true.
• E1 and E2 need not evaluate E2 if E1 is known to be false.
• Use Control Flow
– Jump over code that evaluates boolean terms of the expression
– Use Inherited E.false and E.true attributes and link evaluation of E
62. Control Flow Translation of Boolean
Expressions
E → E1 or E2
{ E1.true = E.true
E1.false = newlabel
E2.true = E.true
E2.false = E.false
E.code = append(E1.code,gen(E1.false:),E2.code)
}
E → E1 and E2
{E1.false = E.false
E1.true = newlabel
E2.true = E.true
E2.false = E.false
E.code = append(E1.code,gen(E1.true:),E2.code)
}
63. Control Flow Translation of Boolean
Expressions
E → id1 relop id2
{E.code = append(gen(if id1.place relop id2.place goto E.true),
gen(goto E.false)) }
E → true {E.code = gen(goto E.true) }
E → false {E.code = gen(goto E.false) }
E → not E1 {E1.true = E.false
E1.false = E.true
E.code = E1.code }
E → ( E1 ) { E1.true = E.true
E1.false = E.false
E.code = E1.code }
64. Boolean Expression: Short Circuit Evaluation
a < b or c < d and e < f
[Figure: parse tree for a < b or (c < d and e < f), built from three E → id relop id nodes]
65. Boolean Expression: Short Circuit Evaluation
a < b or c < d and e < f
Productions applied:
E → E1 or E2 ‖
E1.true = E.true
E1.false = newlabel
E2.true = E.true
E2.false = E.false
E.code = append(E1.code, gen(E1.false:), E2.code)
E → id1 relop id2 ‖
E.code = append(
gen(if id1.place relop id2.place goto E.true),
gen(goto E.false))
Attribute values:
E (whole expression): E.true = Ltrue, E.false = Lfalse
E1 (a < b): E1.true = Ltrue, E1.false = L1
E2 (c < d and e < f): E2.true = Ltrue, E2.false = Lfalse
Code so far:
if a < b goto Ltrue
goto L1
L1:
66. Boolean Expression: Short Circuit Evaluation
a < b or c < d and e < f
Productions applied:
E → E1 and E2 ‖
E1.false = E.false
E1.true = newlabel
E2.true = E.true
E2.false = E.false
E.code = append(E1.code, gen(E1.true:), E2.code)
E → id1 relop id2 ‖
E.code = append(
gen(if id1.place relop id2.place goto E.true),
gen(goto E.false))
Attribute values:
E (whole expression): E.true = Ltrue, E.false = Lfalse
or node: E1.true = Ltrue, E1.false = L1, E2.true = Ltrue, E2.false = Lfalse
and node: E1.true = L2, E1.false = Lfalse, E2.true = Ltrue, E2.false = Lfalse
Code so far:
if a < b goto Ltrue
goto L1
L1: if c < d goto L2
goto Lfalse
L2:
67. Boolean Expression: Short Circuit Evaluation
a < b or c < d and e < f
Productions applied: E → E1 and E2 and E → id1 relop id2, with the same actions as before.
Attribute values:
E (whole expression): E.true = Ltrue, E.false = Lfalse
or node: E1.true = Ltrue, E1.false = L1, E2.true = Ltrue, E2.false = Lfalse
and node: E1.true = L2, E1.false = Lfalse, E2.true = Ltrue, E2.false = Lfalse
Code so far:
if a < b goto Ltrue
goto L1
L1: if c < d goto L2
goto Lfalse
L2: if e < f goto Ltrue
goto Lfalse
68. Boolean Expression: Short Circuit Evaluation
a < b or c < d and e < f
Generated code:
if a < b goto Ltrue
goto L1
L1: if c < d goto L2
goto Lfalse
L2: if e < f goto Ltrue
goto Lfalse
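The inherited E.true and E.false attributes map naturally onto function parameters. The sketch below makes that concrete; the JumpCode class and the tuple encoding of expressions are assumptions, and each label is emitted on its own line rather than in front of the next instruction.

```python
class JumpCode:
    def __init__(self):
        self.nlabels = 0

    def newlabel(self):
        self.nlabels += 1
        return f"L{self.nlabels}"

    def cond(self, e, true, false):
        """Return jumping code for e; 'true' and 'false' play E.true and E.false."""
        op = e[0]
        if op == "or":
            mid = self.newlabel()                  # E1.false = newlabel
            return (self.cond(e[1], true, mid) + [f"{mid}:"]
                    + self.cond(e[2], true, false))
        if op == "and":
            mid = self.newlabel()                  # E1.true = newlabel
            return (self.cond(e[1], mid, false) + [f"{mid}:"]
                    + self.cond(e[2], true, false))
        if op == "not":
            return self.cond(e[1], false, true)    # swap the two targets
        _, a, b = e                                # id1 relop id2
        return [f"if {a} {op} {b} goto {true}", f"goto {false}"]


expr = ("or", ("<", "a", "b"), ("and", ("<", "c", "d"), ("<", "e", "f")))
code = JumpCode().cond(expr, "Ltrue", "Lfalse")
```

The result has the same jumps and labels L1 and L2 as the code derived step by step above. Note that e < f is never tested when a < b already holds.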
69. Combining Boolean and Control Flow Statements
while a < b do
if c < d then
x = y + z
else
x = y - z
S → while E do S1 ‖
S.begin = newlabel
E.true = newlabel
E.false = S.next
S1.next = S.begin
S.code = append(gen(S.begin:), E.code,
gen(E.true:), S1.code,
gen(goto S.begin))
[Figure: parse tree. The outer S is while E do S with E → a < b; the inner S is if E then S else S with E → c < d.]
70. Combining Boolean and Control Flow Statements
Attribute values after applying S → while E do S1:
while statement: S.next = Lnext, S.begin = L1
E (a < b): E.true = L2, E.false = Lnext
if statement (S1): S.next = L1
71. Combining Boolean and Control Flow Statements
Attribute values:
while statement: S.next = Lnext, S.begin = L1
E (a < b): E.true = L2, E.false = Lnext
if statement (S1): S.next = L1
Code so far:
L1: if a < b goto L2
goto Lnext
L2:
72. Combining Boolean and Control Flow Statements
Attribute values:
while statement: S.next = Lnext, S.begin = L1
E (a < b): E.true = L2, E.false = Lnext
if statement: S.next = L1, S1.next = L1, S2.next = L1
E (c < d): E.true = L3, E.false = L4
Code so far:
L1: if a < b goto L2
goto Lnext
L2: if c < d goto L3
goto L4
L3:
73. Combining Boolean and Control Flow Statements
Attribute values:
while statement: S.next = Lnext, S.begin = L1
E (a < b): E.true = L2, E.false = Lnext
if statement: S.next = L1, S1.next = L1, S2.next = L1
E (c < d): E.true = L3, E.false = L4
Code so far:
L1: if a < b goto L2
goto Lnext
L2: if c < d goto L3
goto L4
L3: t1 = y + z
x = t1
goto L1
L4:
74. Combining Boolean and Control Flow Statements
Attribute values:
while statement: S.next = Lnext, S.begin = L1
E (a < b): E.true = L2, E.false = Lnext
if statement: S.next = L1, S1.next = L1, S2.next = L1
E (c < d): E.true = L3, E.false = L4
Code so far:
L1: if a < b goto L2
goto Lnext
L2: if c < d goto L3
goto L4
L3: t1 = y + z
x = t1
goto L1
L4: t2 = y - z
x = t2
goto L1
Lnext:
75. Combining Boolean and Control Flow Statements
while a < b do
if c < d then
x = y + z
else
x = y - z
Generated code:
L1: if a < b goto L2
goto Lnext
L2: if c < d goto L3
goto L4
L3: t1 = y + z
x = t1
goto L1
L4: t2 = y - z
x = t2
goto L1
Lnext:
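The statement rules can be sketched the same way, with S.next passed down as a parameter. Everything named here (make_translator, the tuple encoding of statements, labels on their own lines) is an assumption for illustration, and conditions are limited to a single relational test to keep the block short.

```python
import itertools


def make_translator():
    labels = (f"L{i}" for i in itertools.count(1))   # newlabel
    temps = (f"t{i}" for i in itertools.count(1))    # newtemp

    def cond(e, true, false):
        op, a, b = e                                 # E -> id1 relop id2
        return [f"if {a} {op} {b} goto {true}", f"goto {false}"]

    def stmt(s, nxt):
        """Return code for statement s; nxt plays the inherited S.next."""
        kind = s[0]
        if kind == "assign":                         # ("assign", x, (op, y, z))
            _, x, (op, y, z) = s
            t = next(temps)
            return [f"{t} = {y} {op} {z}", f"{x} = {t}"]
        if kind == "if":                             # ("if", E, S1, S2)
            _, e, s1, s2 = s
            t_lab, f_lab = next(labels), next(labels)
            return (cond(e, t_lab, f_lab) + [f"{t_lab}:"] + stmt(s1, nxt)
                    + [f"goto {nxt}", f"{f_lab}:"] + stmt(s2, nxt))
        if kind == "while":                          # ("while", E, S1)
            _, e, s1 = s
            begin, t_lab = next(labels), next(labels)
            return ([f"{begin}:"] + cond(e, t_lab, nxt) + [f"{t_lab}:"]
                    + stmt(s1, begin) + [f"goto {begin}"])
        raise ValueError(f"unknown statement kind: {kind}")

    return stmt


program = ("while", ("<", "a", "b"),
           ("if", ("<", "c", "d"),
            ("assign", "x", ("+", "y", "z")),
            ("assign", "x", ("-", "y", "z"))))
code = make_translator()(program, "Lnext")
```

For the example program the sketch allocates L1 through L4 in the same roles as in the walkthrough. The caller is expected to place Lnext after the returned code.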
76. Loop Constructs
Loops
• Evaluate condition before loop (if needed)
• Evaluate condition after loop
• Branch back to the top (if needed)
Why this structure?
• Merges test with last block of loop body
• Pre-test block to hold loop-invariant code
• Post-test for increment instructions and test
while, for, do, & until all fit this basic model
[Figure: loop control-flow graph with blocks Pre-test, Loop head, loop body blocks B1 and B2, Post-test, and Next block]
77. Break & Skip Statements
Many modern programming languages include a break
• Exits from the innermost control-flow statement
– Out of the innermost loop
– Out of a case statement
Translates into a jump
• Targets a statement outside the control-flow construct
• Creates a multiple-exit construct
• skip in a loop goes to the next iteration
Only makes sense if the loop has > 1 block
[Figure: the loop control-flow graph (Pre-test, Loop head, B1, B2, Post-test, Next block) with a break in B1 and a skip in B2]
78. Break and Skip Statements
• Need to Keep track of enclosing control-flow
constructs
• Harder to have clean SDT scheme…
– Keep a Stack of control-flow constructs
– Use the S.next of the construct on top of the stack as the target for the break statement
– For skip statements need to keep track of the label of the code of
the post-test block to advance to the next iteration. This is harder
since the code has not been generated yet.
• Backpatching helps
– Use a breaklist and a skiplist to be patched later.
79. Backpatching
• Single Pass Solution to Code Generation?
– No more symbolic labels - symbolic addresses instead
– Emit code directly into an array of instructions
– Actions associated with Productions
– Executed when Bottom-Up Parser “Reduces” a production
• Problem
– Need to know the labels for target branches before actually generating the
code for them.
• Solution
– Leave Branches undefined and patch them later
– Requires: carrying around a list of the places that need to be patched until
the value to be patched with is known.
80. Boolean Expressions Revisited
• Use Additional ε-Production
– Just a Marker M
– Label Value M.addr
• Attributes:
– E.truelist: code places that need to be
filled-in corresponding to the
evaluation of E as “true”.
– E.falselist: same for “false”
(1) E → E1 or M E2
(2) | E1 and M E2
(3) | not E1
(4) | ( E1 )
(5) | id1 relop id2
(6) | true
(7) | false
(8) M → ε
82. Action
(8) M → ε { M.Addr := nextAddr; }
(1) E → E1 or M E2
{ backpatch(E1.falselist,M.Addr);
E.truelist := merge(E1.truelist,E2.truelist);
E.falselist := E2.falselist; }
(2) E → E1 and M E2 { backpatch(E1.truelist,M.Addr);
E.truelist := E2.truelist;
E.falselist := merge(E1.falselist, E2.falselist); }
83. More Actions
(3) E → not E1 {E.truelist := E1.falselist; E.falselist := E1.truelist;}
(4) E → ( E1 ) {E.truelist := E1.truelist; E.falselist := E1.falselist;}
(6) E → true {E.truelist := makelist(nextquad); emit('goto _');}
(7) E → false {E.falselist := makelist(nextquad); emit('goto _');}
84. Backpatching Example
a < b or c < d and e < f
Actions used:
E → id1 relop id2 { E.truelist := makelist(nextquad());
E.falselist := makelist(nextquad() + 1);
emit('if id1.place relop.op id2.place goto _');
emit('goto _'); }
M → ε { M.quad = nextquad(); }
E → E1 and M E2 { backpatch(E1.truelist, M.quad);
E.truelist := E2.truelist;
E.falselist := merge(E1.falselist, E2.falselist); }
E → E1 or M E2 { backpatch(E1.falselist, M.quad);
E.truelist := merge(E1.truelist, E2.truelist);
E.falselist := E2.falselist; }
Steps, in the order a bottom-up parser reduces:
1. a < b: E.truelist = {100}, E.falselist = {101}
100: if a < b goto _
101: goto _
2. M (after or): M.quad = 102
3. c < d: E.truelist = {102}, E.falselist = {103}
102: if c < d goto _
103: goto _
4. M (after and): M.quad = 104
5. e < f: E.truelist = {104}, E.falselist = {105}
104: if e < f goto _
105: goto _
6. E → E1 and M E2: backpatch({102}, 104); E.truelist = {104}, E.falselist = {103, 105}
102: if c < d goto 104
103: goto _
7. E → E1 or M E2: backpatch({101}, 102); E.truelist = {100, 104}, E.falselist = {103, 105}
100: if a < b goto _
101: goto 102
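The example can be replayed with a small sketch of the backpatching helpers. The Backpatcher class and its method names are assumptions; makelist and merge are represented by plain Python lists, and the calls at the bottom are made by hand in the order a bottom-up parser would reduce.

```python
class Backpatcher:
    def __init__(self, start=100):
        self.start = start
        self.code = []                      # each entry: [text, target]; None = unfilled

    def nextquad(self):
        return self.start + len(self.code)

    def emit(self, text):
        self.code.append([text, None])
        return self.nextquad() - 1          # address of the instruction just emitted

    def backpatch(self, addresses, target):
        for addr in addresses:
            self.code[addr - self.start][1] = target

    def relop(self, a, op, b):              # E -> id1 relop id2
        truelist = [self.emit(f"if {a} {op} {b} goto")]
        falselist = [self.emit("goto")]
        return truelist, falselist

    def or_(self, e1, m, e2):               # E -> E1 or M E2
        self.backpatch(e1[1], m)
        return e1[0] + e2[0], e2[1]

    def and_(self, e1, m, e2):              # E -> E1 and M E2
        self.backpatch(e1[0], m)
        return e2[0], e1[1] + e2[1]

    def listing(self):
        return [f"{self.start + i}: {text} {'_' if tgt is None else tgt}"
                for i, (text, tgt) in enumerate(self.code)]


bp = Backpatcher()
e_ab = bp.relop("a", "<", "b")
m_or = bp.nextquad()                        # marker M after 'or'
e_cd = bp.relop("c", "<", "d")
m_and = bp.nextquad()                       # marker M after 'and'
e_ef = bp.relop("e", "<", "f")
e_and = bp.and_(e_cd, m_and, e_ef)
truelist, falselist = bp.or_(e_ab, m_or, e_and)
```

The jumps at 100, 103, 104 and 105 stay unfilled until the enclosing statement knows where true and false should lead.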
85. Control Flow Code Structures
if E then S1
          E.code
E.true:   S1.code
E.false:  ...
if E then S1 else S2
          E.code
E.true:   S1.code
          goto S.next
E.false:  S2.code
S.next:   ...
while E do S1
S.begin:  E.code
E.true:   S1.code
          goto S.begin
E.false:  ...