Intermediate Code Generation
Sarfaraz Masood
Asst. Prof., Department of Computer Engineering
Jamia Millia University
New Delhi
CS 540 GMU Spring 2007 2
Compiler Architecture
[Figure: Source language → Scanner (lexical analysis) → tokens → Parser (syntax analysis) → syntactic structure → Semantic Analysis (IC generator) → intermediate code → Code Optimizer → intermediate code → Code Generator → target language. The Symbol Table is connected to the phases.]
Joey Paquet, 2000, 2002 3
Introduction to Code Generation
• Front end:
– Lexical Analysis
– Syntactic Analysis
– Intermediate Code Generation
• Back end:
– Intermediate Code Optimization
– Object Code Generation
• The front end is machine-independent, i.e. it can be reused to build
compilers for different architectures
• The back end is machine-dependent, i.e. these steps are related to the
nature of the assembly or machine language of the target architecture
08/31/13 4
Introduction to Code Generation
[Figure: source programs in Language-1 and Language-2 pass through the Language-1 Front End and the Language-2 Front End, which produce non-optimized intermediate code. A shared Intermediate-code Optimizer produces optimized intermediate code, which the Target-1 Code Generator and the Target-2 Code Generator translate into Target-1 machine code and Target-2 machine code.]
Introduction to Code Generation
• After syntactic analysis, we have a number of options to choose from:
– generate object code directly from the parse
– generate intermediate code, and then generate object code from it
– generate an intermediate abstract representation, and then generate code
directly from it
– generate an intermediate abstract representation, generate intermediate code,
and then the object code
• All these options have one thing in common: they are all based on
syntactic information gathered in the semantic analysis
Introduction to Code Generation
[Figure: four possible pipelines, one per option listed above, with the stages grouped into Front End and Back End.
(1) Lexical Analyzer → Syntactic Analyzer → Object Code
(2) Lexical Analyzer → Syntactic Analyzer → Intermediate Code → Object Code
(3) Lexical Analyzer → Syntactic Analyzer → Intermediate Representation → Object Code
(4) Lexical Analyzer → Syntactic Analyzer → Intermediate Representation → Intermediate Code → Object Code]
Intermediate Representation (IR)
A kind of abstract machine language that can express the target machine operations without committing to too many machine details.
• Why IR?
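Without IR
[Figure: each source language (C, Pascal, FORTRAN, C++) is connected directly to each target (SPARC, HP PA, x86, IBM PPC).]
With IR
[Figure: C, Pascal, FORTRAN and C++ all translate to one IR, and the IR is translated to SPARC, HP PA, x86 and IBM PPC.]
With IR
[Figure: C, Pascal, FORTRAN and C++ → IR → Common Backend → ?]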
Advantages of Using an Intermediate
Language
1. Retargeting - Build a compiler for a new machine by
attaching a new code generator to an existing front-end.
2. Optimization - reuse intermediate code optimizers in
compilers for different languages and different
machines.
Note: the terms “intermediate code”, “intermediate
language”, and “intermediate representation” are all
used interchangeably.
Issues in Designing an IR
• Whether to use an existing IR
– if target machine architecture is similar
– if the new language is similar
• Whether the IR is appropriate for the kind of optimizations to be performed
– e.g. speculation and predication
– some transformations may take much longer than they would on a different IR
Issues in Designing an IR
• Designing a new IR needs to consider
– Level (how machine dependent it is)
– Structure
– Expressiveness
– Appropriateness for general and special optimizations
– Appropriateness for code generation
– Whether multiple IRs should be used
What are the IRs in actual compilers?
• gcc is a widely used compiler on many platforms. It uses two IRs, AST (Abstract Syntax Tree) and RTL (Register Transfer Language), and some development paths are using Tree-SSA.
[SSA: Static Single Assignment: each name is assigned once. We will talk about this later!]
• A VM can be seen as a new type of IR
– Java Bytecode
– .NET IL
• Some programming languages have well-defined intermediate languages.
– Java: Java Virtual Machine
– Prolog: Warren Abstract Machine
– In fact, there are byte-code emulators to execute instructions in these intermediate languages.
Intermediate Code Generation
• Direct Translation
– Using SDT scheme
– Parse tree to Three-Address Instructions
– Can be done while parsing in a single pass
– Needs to be able to deal with Syntactic Errors and Recovery
• Indirect Translation
– First validate the parse while constructing the AST
– Uses SDT scheme to build AST
– Traverse the AST and generate Three Address Instructions
[Figure: the parse tree or AST (IR) goes through Intermediate Code Generation, O(n), to produce Three-Address Instructions (IR) with ∞ registers.]
Syntax-directed definition to produce AST for assignment statements
production       semantic rules
S → id := E      S.nptr := mknode(‘assign’, mkleaf(id, id.entry), E.nptr)
E → E1 + E2      E.nptr := mknode(‘+’, E1.nptr, E2.nptr)
E → E1 ∗ E2      E.nptr := mknode(‘∗’, E1.nptr, E2.nptr)
E → −E1          E.nptr := mkunode(‘uminus’, E1.nptr)
E → (E1)         E.nptr := E1.nptr
E → id           E.nptr := mkleaf(id, id.entry)
1. Syntax Tree vs DAG
a := (−b + c∗d) + c∗d
• if mknode returns a pointer to an existing node whenever possible, a DAG can be produced
(a) syntax tree: assign(a, +(+(uminus(b), ∗(c, d)), ∗(c, d))), with a separate ∗ subtree for each occurrence of c∗d
(b) DAG: the same structure, but both + nodes point to one shared ∗(c, d) node
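A minimal Python sketch of that idea, assuming a node is identified by its operator and operands. The hypothetical DagBuilder class keeps a dictionary of the nodes built so far, so mkleaf, mkunode and mknode hand back an existing node when an identical one exists. The class name and the integer node ids are illustrative choices, not part of the slides.

```python
class DagBuilder:
    """Builds expression nodes, reusing an existing node whenever possible."""

    def __init__(self):
        self.nodes = []   # node id -> (op, operand, ...)
        self.index = {}   # (op, operand, ...) -> node id

    def _intern(self, key):
        # Return the id of an identical existing node, or create a new one.
        if key not in self.index:
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

    def mkleaf(self, name):
        return self._intern(("id", name))

    def mkunode(self, op, child):
        return self._intern((op, child))

    def mknode(self, op, left, right):
        return self._intern((op, left, right))


# a := (-b + c*d) + c*d
dag = DagBuilder()
first_product = dag.mknode("*", dag.mkleaf("c"), dag.mkleaf("d"))
inner_sum = dag.mknode("+", dag.mkunode("uminus", dag.mkleaf("b")), first_product)
second_product = dag.mknode("*", dag.mkleaf("c"), dag.mkleaf("d"))
root = dag.mknode("assign", dag.mkleaf("a"),
                  dag.mknode("+", inner_sum, second_product))
```

Both products resolve to the same node id, so the builder holds 9 nodes where the plain syntax tree would have 12.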
2. Postfix Notation
A mathematical notation wherein every operator follows all of its operands.
Form Rules:
1. If E is a variable/constant, the PN of E is E itself.
2. If E is an expression of the form E1 op E2, the PN of E is E1′ E2′ op (E1′ and E2′ are the PN of E1 and E2, respectively).
3. If E is a parenthesized expression of form (E1), the PN of E is the same as the PN of E1.
The PN of expression 9 * (5 + 2) is 952+*
How about (a+b)/(c-d)? ab+cd-/
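The three form rules translate almost directly into a recursive function. The sketch below assumes the expression has already been parsed into nested tuples of the form (op, left, right); that representation and the name to_postfix are illustrative assumptions. Rule 3 needs no code because parentheses do not survive into the tree.

```python
def to_postfix(expr):
    """Return the postfix notation (PN) of an expression tree.

    expr is either a variable/constant (a string) or a tuple (op, left, right).
    """
    if isinstance(expr, str):
        return expr                                   # rule 1: E itself
    op, left, right = expr
    return to_postfix(left) + to_postfix(right) + op  # rule 2: E1' E2' op


pn_first = to_postfix(("*", "9", ("+", "5", "2")))
pn_second = to_postfix(("/", ("+", "a", "b"), ("-", "c", "d")))
```

Here pn_first is "952+*" and pn_second is "ab+cd-/", matching the two examples on the slide.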
Intermediate-Code Generation 20
3. Static Single-Assignment Form
• Static single assignment form (SSA) is an intermediate representation that facilitates certain code optimizations.
• Two distinct aspects distinguish SSA from three-address code.
– All assignments in SSA are to variables with distinct names; hence the term static single-assignment.
– Where two control-flow paths define the same variable, SSA uses a φ-function to combine the two definitions.
3. Static Single-Assignment Form
Source:
if (flag) x = -1; else x = 1;
y = x * a;
SSA form:
if (flag) x1 = -1; else x2 = 1;
x3 = φ(x1, x2);
y = x3 * a;
4. Three Address Instructions IR
• Constructs are mapped to Three-Address Instructions
– Register-based IR for expression evaluation
– Infinite number of virtual registers
– Still independent of target architecture
• Generic Statement Format:
Label: x = y op z or if exp goto L
– Statements can have symbolic labels
– Compiler inserts temporary variables
– Types and conversions are dealt with in other phases of code generation
Types of Three-address Statements
• Assignment
– Binary: x := y op z
– Unary: x := op y
– “op” can be any reasonable arithmetic or logic
operator.
• Copy
– Simple: x := y
– Indexed: x := y[i] or x[i] := y
– Address and pointer manipulation:
• x := &y
• x := * y
• *x := y
Types of Three-address Statements
• Jump
– Unconditional: goto L
– Conditional: if x relop y goto L1 [else goto L2], where relop is <, =, >, ≥, ≤, or ≠.
• Procedure call
– Call procedure P(X1,X2, . . . ,Xn)
PARAM X1
PARAM X2
...
PARAM Xn
CALL P, n
Implementations of Three-address Statements
• Common implementations:
– Quadruples
– Triples
– Indirect triples
Consider the code:
a := b * -c + b * -c
Quadruples
• A quadruple is a record structure with four fields:
op, arg1, arg2, and result
– The op field contains an internal code for an operator
– Statements with unary operators do not use arg2
– Operators like param use neither arg2 nor result
– The target label for conditional and unconditional jumps
are in result
• The contents of fields arg1, arg2, and result
are typically pointers to symbol table entries
– If so, temporaries must be entered into the symbol table
as they are created
– Obviously, constants need to be handled differently
Quadruples Example
a := b * -c + b * -c
     op      arg1  arg2  result
(0)  uminus  c           t1
(1)  *       b     t1    t2
(2)  uminus  c           t3
(3)  *       b     t3    t4
(4)  +       t2    t4    t5
(5)  :=      t5          a
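A rough Python sketch of how such a table can be produced from an expression tree. It assumes the expression is given as nested tuples, (op, operand) for unary and (op, left, right) for binary operators, and it stores names as plain strings rather than symbol-table pointers; the function name gen_quads is hypothetical.

```python
from itertools import count


def gen_quads(target, expr):
    """Return a list of (op, arg1, arg2, result) quadruples for target := expr."""
    quads = []
    temps = count(1)

    def walk(e):
        if isinstance(e, str):            # a name: no code, its place is itself
            return e
        op, *operands = e
        places = [walk(operand) for operand in operands]
        result = f"t{next(temps)}"        # every operator gets a fresh temporary
        arg2 = places[1] if len(places) == 2 else None   # unary ops leave arg2 empty
        quads.append((op, places[0], arg2, result))
        return result

    place = walk(expr)
    quads.append((":=", place, None, target))
    return quads


product = ("*", "b", ("uminus", "c"))
quads = gen_quads("a", ("+", product, product))
```

For this input the six quadruples come out in the same order, and with the same temporaries t1 to t5, as in the table above.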
Triples
• Triples refer to a temporary value by the position of
the statement that computes it
– Statements can be represented by a record with only three
fields: op, arg1, and arg2
– Avoids the need to enter temporary names into the
symbol table
• Contents of arg1 and arg2:
– Pointer into symbol table (for programmer defined
names)
– Pointer into triple structure (for temporaries)
– Of course, still need to handle constants differently
Triples Example
a := b * -c + b * -c
     op      arg1  arg2
(0)  uminus  c
(1)  *       b     (0)
(2)  uminus  c
(3)  *       b     (2)
(4)  +       (1)   (3)
(5)  assign  a     (4)
Result is implicit in triples
• an indexed assignment requires two triples:
x[i] := y
     op    arg1  arg2
(0)  []=   x     i
(1)  :=    (0)   y
Indirect triples
• indirect triples add a list of pointers to triples, so that triples can be shared and moved easily
a := b * -c + b * -c
     statement
(0)  (14)
(1)  (15)
(2)  (16)
(3)  (17)
(4)  (18)
(5)  (19)
      op      arg1  arg2
(14)  uminus  c
(15)  *       b     (14)
(16)  uminus  c
(17)  *       b     (16)
(18)  +       (15)  (17)
(19)  assign  a     (18)
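The relationship between the three representations can be sketched in Python as a conversion from quadruples. This is only an illustration under two assumptions: every result other than the target of := is a temporary, and a reference to another triple is stored as its integer position. The function name quads_to_triples is hypothetical.

```python
# Quadruples for a := b * -c + b * -c, as (op, arg1, arg2, result).
quads = [
    ("uminus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("uminus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    (":=", "t5", None, "a"),
]


def quads_to_triples(quads):
    position = {}   # temporary name -> position of the triple that computes it
    triples = []

    def ref(arg):
        # A temporary is replaced by the position of its defining triple;
        # program names (and empty fields) are kept unchanged.
        return position.get(arg, arg)

    for op, arg1, arg2, result in quads:
        if op == ":=":
            triples.append(("assign", result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            position[result] = len(triples) - 1   # assumed: result is a temporary
    return triples


triples = quads_to_triples(quads)

# Indirect triples: execution order is a separate list of positions, so
# statements can be reordered without renumbering the triples themselves.
statements = list(range(len(triples)))
```

No temporary names remain in triples, which is why they do not need symbol-table entries.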
Syntax-directed translation into three-address code
production       semantic rules
S → id := E      S.code := E.code ‖ gen(id.place ‘:=’ E.place)
E → E1 + E2      E.place := newtemp;
                 E.code := E1.code ‖ E2.code ‖ gen(E.place ‘:=’ E1.place ‘+’ E2.place)
E → E1 ∗ E2      ......
E → −E1          E.place := newtemp;
                 E.code := E1.code ‖ gen(E.place ‘:=’ ‘uminus’ E1.place)
E → (E1)         E.place := E1.place; E.code := E1.code
E → id           E.place := id.place; E.code := ‘’
Syntax-directed translation into three-address code
production            semantic rules
S → while E do S1     S.begin := newlabel;
                      S.after := newlabel;
                      S.code := gen(S.begin ‘:’) ‖
                                E.code ‖
                                gen(‘if’ E.place ‘=’ ‘0’ ‘goto’ S.after) ‖
                                S1.code ‖
                                gen(‘goto’ S.begin) ‖
                                gen(S.after ‘:’)
Declarations
• enter symbols in a symbol table
• allocate space and record it in the symbol table
• emit appropriate code
Declarations in a procedure
• computing types and relative address of names
P → {offset := 0} D
D → D ; D
D → id : T {enter ( id.name, T.type, offset);
offset := offset + T.width }
T → integer {T.type := integer; T.width := 4 }
T→ real {T.type := real; T.width := 8 }
T → array [ num ] of T1 {T.type := array (num.val, T1.type);
T.width := num.val × T1.width}
T → ↑T1 {T.type := pointer (T1.type); T.width := 4 }
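The width and offset bookkeeping in these rules can be sketched in Python as follows. The tuple encoding of types, ("array", n, element_type) and ("pointer", target_type), and the names type_width and layout are assumptions made for the illustration; the widths 4, 8 and 4 are the ones used in the rules above.

```python
def type_width(t):
    """Width in bytes of a type, following the T productions."""
    if t == "integer":
        return 4
    if t == "real":
        return 8
    if t[0] == "array":                 # ("array", num.val, T1.type)
        _, length, element = t
        return length * type_width(element)
    if t[0] == "pointer":               # ("pointer", T1.type)
        return 4
    raise ValueError(f"unknown type: {t!r}")


def layout(declarations):
    """Assign relative addresses to declared names, as in D -> id : T."""
    table = {}
    offset = 0                          # P -> {offset := 0} D
    for name, t in declarations:
        table[name] = (t, offset)       # enter(id.name, T.type, offset)
        offset += type_width(t)         # offset := offset + T.width
    return table, offset


table, total = layout([
    ("i", "integer"),
    ("x", "real"),
    ("a", ("array", 10, "integer")),
    ("p", ("pointer", "real")),
])
```

With these four declarations the offsets are 0, 4, 12 and 52, and the procedure needs 56 bytes in total.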
Syntax-Directed Translation to Three Address Code
• Attributes for the Non-Terminals, say E and S
– Location of the value of an expression: E.place
– The code that evaluates the expression or statement: E.code
– Markers for the beginning and end of sections of the code: S.begin, S.end
• Semantic Actions in Productions of the Grammar
– Functions to create temporaries (newtemp) and labels (newlabel)
– Auxiliary functions enter symbols and consult the types corresponding to declarations in a side data structure that is built as the code is being parsed: a symbol table.
– To generate the code we use the emit function gen, which creates a list of instructions to be emitted later and can generate symbolic labels corresponding to the next instruction of a list.
– An append function concatenates lists of instructions.
– Synthesized and Inherited Attributes
Assignment Statements: Grammar and Actions
S → id = E { p = lookup(id.name);
if (p != NULL)
S.code = append(E.code, gen(p ‘=’ E.place));
else {
error;
S.code = nulllist;
}
}
E → E1 + E2 { E.place = newtemp();
E.code = append(E1.code, E2.code, gen(E.place ‘=’ E1.place ‘+’ E2.place)); }
E → E1 * E2 { E.place = newtemp();
E.code = append(E1.code, E2.code, gen(E.place ‘=’ E1.place ‘*’ E2.place)); }
Assignment Statements: Grammar and Actions
E → - E1 { E.place = newtemp();
E.code = append(E1.code, gen(E.place ‘=’ ‘-’ E1.place)); }
E → (E1) { E.place = E1.place; E.code = E1.code; }
E → id { p = lookup(id.name);
if (p != NULL)
E.place = p;
else
error;
E.code = nulllist;
}
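The actions above can be sketched as a small Python translator. This is an illustration under several assumptions: the parser has already produced nested tuples, (op, operand) or (op, left, right); places are plain strings; the symbol table is a set of declared names; and an undeclared name raises an exception instead of calling error. The class name Translator is hypothetical.

```python
class Translator:
    def __init__(self, symbols):
        self.symbols = set(symbols)     # stands in for the symbol table
        self.ntemps = 0

    def newtemp(self):
        self.ntemps += 1
        return f"t{self.ntemps}"

    def lookup(self, name):
        if name not in self.symbols:
            raise NameError(f"undeclared identifier: {name}")
        return name

    def expr(self, e):
        """Return (E.place, E.code) for an expression tree."""
        if isinstance(e, str):                        # E -> id
            return self.lookup(e), []
        if len(e) == 2:                               # E -> - E1
            op, e1 = e
            place1, code1 = self.expr(e1)
            place = self.newtemp()
            return place, code1 + [f"{place} = {op} {place1}"]
        op, e1, e2 = e                                # E -> E1 op E2
        place1, code1 = self.expr(e1)
        place2, code2 = self.expr(e2)
        place = self.newtemp()
        return place, code1 + code2 + [f"{place} = {place1} {op} {place2}"]

    def assign(self, target, e):                      # S -> id = E
        p = self.lookup(target)
        place, code = self.expr(e)
        return code + [f"{p} = {place}"]


# x = a * b + c * d - e * f, grouped here as ((a * b) + (c * d)) - (e * f)
tree = ("-", ("+", ("*", "a", "b"), ("*", "c", "d")), ("*", "e", "f"))
code = Translator("abcdefx").assign("x", tree)
```

The numbering of the temporaries depends on the order in which the subtrees are visited, so it need not match a hand-worked trace of the same statement.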
Assignment: Example
x = a * b + c * d - e * f;
[Figure: parse tree for the statement. The root is S → id = E with id = x. The expression is grouped as (a * b) + ((c * d) - (e * f)), and each of a, b, c, d, e, f is an E → id leaf.]
The attributes are filled in as each production is applied:
Production      place      code
E → id (e)      loc(e)     null
E → id (f)      loc(f)     null
E → E1 * E2     loc(t1)    {t1 = e * f;}
E → id (d)      loc(d)     null
E → id (c)      loc(c)     null
E → E1 * E2     loc(t2)    {t2 = c * d;}
E → E1 - E2     loc(t3)    {t1 = e * f; t2 = c * d; t3 = t2 - t1;}
E → id (b)      loc(b)     null
E → id (a)      loc(a)     null
E → E1 * E2     loc(t4)    {t4 = a * b;}
E → E1 + E2     loc(t5)    {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3;}
id (x)          loc(x)     null
S → id = E                 {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3; x = t5;}
Resulting code:
t1 = e * f;
t2 = c * d;
t3 = t2 - t1;
t4 = a * b;
t5 = t4 + t3;
x = t5;
Reusing Temporary Variables
• Temporary Variables
– Short lived
– Used for Evaluation of Expressions
– Clutter the Symbol Table
• Change the newtemp Function
– Keep track of when the value created in a temporary is used
– Use a counter to keep track of the number of active temps
– When a temporary is used in an expression decrement counter
– When a temporary is generated by newtemp increment counter
– Initialize counter to zero
Assignment: Example
• Only 2 Registers Needed
x = a * b + c * d - e * f;
[Figure: parse tree for the statement, as in the previous example]
// c = 0
t1 = e * f; // c = 1
t2 = c * d; // c = 2
t1 = t2 - t1; // c = 1
t2 = a * b; // c = 2
t1 = t2 + t1; // c = 1
x = t1; // c = 0
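A Python sketch of the counter scheme. It assumes expression trees as nested (op, left, right) tuples and visits the left operand first, so the order of the emitted statements differs from the trace above, although the number of temporaries is the same. The function name gen_with_reuse is hypothetical.

```python
def gen_with_reuse(target, tree):
    """Emit three-address code for target = tree, recycling temporaries."""
    code = []
    active = 0                      # number of temporaries holding a live value

    def walk(e):
        nonlocal active
        if isinstance(e, str):
            return e, False         # a program name, not a temporary
        op, left, right = e
        left_place, left_is_temp = walk(left)
        right_place, right_is_temp = walk(right)
        # Using a temporary as an operand ends its life: decrement the counter.
        active -= left_is_temp + right_is_temp
        # newtemp: increment the counter and name the temporary after it.
        active += 1
        place = f"t{active}"
        code.append(f"{place} = {left_place} {op} {right_place}")
        return place, True

    place, is_temp = walk(tree)
    code.append(f"{target} = {place}")
    active -= is_temp
    return code, active


tree = ("-", ("+", ("*", "a", "b"), ("*", "c", "d")), ("*", "e", "f"))
code, live_at_end = gen_with_reuse("x", tree)
```

Naming a temporary after the counter value works because temporaries created during expression evaluation die in last-in, first-out order.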
Boolean & Relational Values
How should the compiler represent them?
• Answer depends on the target machine
Two classic approaches
• Numerical representation
• Positional (implicit) representation
Correct choice depends on both context and ISA
• Issue: Control Flow Introduces Complications
– In Both Representations
– Need to Know Address to Jump To in Some Cases
• Solution: Two Additional Attributes
– nextstat (Inherited) Indicates the next location to be generated
– laststat (Synthesized) Indicates the last location filled
– As code is generated the attributes are filled with the correct value
SDT Scheme for Boolean Expressions
Boolean Expression: Grammar and
Actions
E → false {E.place = newtemp()
E.code = {gen(E.place = 0)}
E.laststat = E.nextstat + 1
}
E → true {E.place = newtemp()
E.code = {gen(E.place = 1)}
E.laststat = E.nextstat + 1
}
Boolean Expression: Grammar and
Actions
E → (E1) {E.place = E1.place;
E.code = E1.code;
E1.nextstat = E.nextstat
E.laststat = E1.laststat
}
E → not E1 {E.place = newtemp()
E.code = append(E1.code,gen(E.place = not E1.place))
E1.nextstat = E.nextstat
E.laststat = E1.laststat + 1
}
Boolean Expression: Grammar and Actions
E → E1 or E2
{E.place = newtemp()
E.code = append(E1.code,E2.code,gen(E.place = E1.place or E2.place))
E1.nextstat = E.nextstat
E2.nextstat = E1.laststat
E.laststat = E2.laststat + 1
}
Boolean Expression: Grammar and Actions
E → E1 and E2
{E.place = newtemp()
E.code = append(E1.code,E2.code,gen(E.place = E1.place and E2.place))
E1.nextstat = E.nextstat
E2.nextstat = E1.laststat
E.laststat = E2.laststat + 1
}
Boolean Expression: Grammar and Actions
E → id1 relop id2
{
E.place = newtemp()
E.code = gen(if id1.place relop id2.place goto E.nextstat+3)
E.code = append(E.code,gen(E.place = 0))
E.code = append(E.code,gen(goto E.nextstat+4))
E.code = append(E.code,gen(E.place = 1))
E.laststat = E.nextstat + 4
}
Boolean Expressions: Example
a < b or c < d and e < f
[Figure: parse tree in which the root E is (a < b) or ((c < d) and (e < f)); each comparison is an E → id relop id node.]
00: if a < b goto 03
01: t1 = 0
02: goto 04
03: t1 = 1
04: if c < d goto 07
05: t2 = 0
06: goto 08
07: t2 = 1
08: if e < f goto 11
09: t3 = 0
10: goto 12
11: t3 = 1
12: t4 = t2 and t3
13: t5 = t1 or t4
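The listing can be reproduced with a short Python sketch of the numerical representation. It assumes the code is kept in a list whose current length plays the role of E.nextstat, and it hard-codes the order in which the three comparisons are reduced; the helper name gen_relop is hypothetical.

```python
def gen_relop(code, temp, left, relop, right):
    """Append the four-statement numeric encoding of `left relop right`."""
    nextstat = len(code)                          # location of the first new statement
    code.append(f"if {left} {relop} {right} goto {nextstat + 3:02d}")
    code.append(f"{temp} = 0")
    code.append(f"goto {nextstat + 4:02d}")       # skip over the "true" assignment
    code.append(f"{temp} = 1")


code = []
gen_relop(code, "t1", "a", "<", "b")
gen_relop(code, "t2", "c", "<", "d")
gen_relop(code, "t3", "e", "<", "f")
code.append("t4 = t2 and t3")                     # and binds tighter than or
code.append("t5 = t1 or t4")

listing = [f"{location:02d}: {statement}" for location, statement in enumerate(code)]
```

Every comparison is always evaluated in this scheme; the control-flow translation later in the deck avoids that.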
Control Flow Statements: Code Layout
S → if E then S1
         E.code      (jumps to E.true or to E.false)
E.true:  S1.code
E.false: ...
S → if E then S1 else S2
         E.code      (jumps to E.true or to E.false)
E.true:  S1.code
         goto S.next
E.false: S2.code
S.next:  ...
• Attributes:
– E.true: the label to which control flows if E is true
– E.false: the label to which control flows if E is false
– S.next: an inherited attribute with the symbolic label of the code following S
Control Flow Statements: Code Layout
S → while E do S1
S.begin: E.code      (jumps to E.true or to E.false)
E.true:  S1.code
         goto S.begin
E.false: ...
• Difficulty: Need to know where to jump to
– Introduce symbolic labels using the newlabel function
– Use inherited attributes
– Backpatch it later with the actual value (later…)
Control Flow Statements: Grammar
and Actions
S → if E then S1
{
E.true = newlabel
E.false = S.next
S1.next = S.next
S.code = append(E.code,gen(E.true:),S1.code)
}
Control Flow Statements: Grammar
and Actions
S → if E then S1 else S2
{ E.true = newlabel
E.false = newlabel
S1.next = S.next
S2.next = S.next
S.code = append(E.code,gen(E.true:),S1.code,
gen(goto S.next),gen(E.false :),S2.code)
}
Control Flow Statements: Grammar
and Actions
S → while E do S1
{ S.begin = newlabel
E.true = newlabel
E.false = S.next
S1.next = S.begin
S.code = append(gen(S.begin:), E.code, gen(E.true:),
S1.code, gen(goto S.begin))
}
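A Python sketch of these three actions, passing the inherited attribute S.next down as a function argument. Several things are assumed for the illustration: statements are tuples tagged "assign", "if", "ifelse" or "while"; a condition is a single comparison (left, relop, right); assignments are emitted as opaque text without temporaries; and the names translate and Lnext are hypothetical.

```python
from itertools import count


def translate(stmt, next_label="Lnext"):
    labels = count(1)

    def newlabel():
        return f"L{next(labels)}"

    def gen_cond(cond, true_label, false_label):
        left, relop, right = cond
        return [f"if {left} {relop} {right} goto {true_label}",
                f"goto {false_label}"]

    def gen(s, s_next):
        kind = s[0]
        if kind == "assign":
            return [s[1]]
        if kind == "if":                          # S -> if E then S1
            _, cond, s1 = s
            e_true = newlabel()
            return gen_cond(cond, e_true, s_next) + [f"{e_true}:"] + gen(s1, s_next)
        if kind == "ifelse":                      # S -> if E then S1 else S2
            _, cond, s1, s2 = s
            e_true, e_false = newlabel(), newlabel()
            return (gen_cond(cond, e_true, e_false)
                    + [f"{e_true}:"] + gen(s1, s_next)
                    + [f"goto {s_next}", f"{e_false}:"] + gen(s2, s_next))
        if kind == "while":                       # S -> while E do S1
            _, cond, s1 = s
            s_begin, e_true = newlabel(), newlabel()
            return ([f"{s_begin}:"] + gen_cond(cond, e_true, s_next)
                    + [f"{e_true}:"] + gen(s1, s_begin) + [f"goto {s_begin}"])
        raise ValueError(f"unknown statement kind: {kind}")

    return gen(stmt, next_label)


loop = ("while", ("a", "<", "b"),
        ("ifelse", ("c", "<", "d"),
         ("assign", "x = y + z"),
         ("assign", "x = y - z")))
code = translate(loop)
```

Because the loop body receives S.begin as its S.next, both branches of the nested if jump straight back to the loop test.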
Control Flow Translation of Boolean
Expressions
• Short-Circuit Evaluation
– No Need to Evaluate portions of the expression if the outcome is
already determined
– Examples:
• E1 or E2 need not evaluate E2 if E1 is known to be true.
• E1 and E2 need not evaluate E2 if E1 is known to be false.
• Use Control Flow
– Jump over code that evaluates boolean terms of the expression
– Use Inherited E.false and E.true attributes and link evaluation of E
Control Flow Translation of Boolean
Expressions
E → E1 or E2
{ E1.true = E.true
E1.false = newlabel
E2.true = E.true
E2.false = E.false
E.code = append(E1.code,gen(E1.false:),E2.code)
}
E → E1 and E2
{E1.false = E.false
E1.true = newlabel
E2.true = E.true
E2.false = E.false
E.code = append(E1.code,gen(E1.true:),E2.code)
}
Control Flow Translation of Boolean
Expressions
E → id1 relop id2
{E.code = append(gen(if id1.place relop id2.place goto E.true),
gen(goto E.false)) }
E → true {E.code = gen(goto E.true) }
E → false {E.code = gen(goto E.false) }
E → not E1 {E1.true = E.false
E1.false = E.true
E.code = E1.code }
E → ( E1 ) { E1.true = E.true
E1.false = E.false
E.code = E1.code }
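The rules for or, and, not, the constants and relop can be sketched in Python with the inherited attributes E.true and E.false passed as arguments. The tuple encoding of expressions, for example ("rel", "a", "<", "b") and ("or", e1, e2), and the name jumping_code are assumptions made for the illustration.

```python
from itertools import count


def jumping_code(expr, true_label, false_label):
    """Translate a boolean expression into short-circuit jumping code."""
    labels = count(1)

    def gen(e, e_true, e_false):
        kind = e[0]
        if kind == "or":                   # E1.false is a new label placed before E2
            e1_false = f"L{next(labels)}"
            return (gen(e[1], e_true, e1_false) + [f"{e1_false}:"]
                    + gen(e[2], e_true, e_false))
        if kind == "and":                  # E1.true is a new label placed before E2
            e1_true = f"L{next(labels)}"
            return (gen(e[1], e1_true, e_false) + [f"{e1_true}:"]
                    + gen(e[2], e_true, e_false))
        if kind == "not":                  # swap the two targets, emit no code
            return gen(e[1], e_false, e_true)
        if kind == "true":
            return [f"goto {e_true}"]
        if kind == "false":
            return [f"goto {e_false}"]
        _, left, relop, right = e          # ("rel", id1, relop, id2)
        return [f"if {left} {relop} {right} goto {e_true}", f"goto {e_false}"]

    return gen(expr, true_label, false_label)


expr = ("or", ("rel", "a", "<", "b"),
        ("and", ("rel", "c", "<", "d"), ("rel", "e", "<", "f")))
code = jumping_code(expr, "Ltrue", "Lfalse")
```

If a < b holds, control reaches Ltrue without evaluating either of the other two comparisons.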
Boolean Expression: Short Circuit Evaluation
a < b or c < d and e < f
[Figure: parse tree in which the root E is (a < b) or ((c < d) and (e < f)); each comparison is an E → id relop id node.]
E → id1 relop id2 ‖
E.code = append(
gen(if id1.place relop id2.place goto E.true),
gen(goto E.false))
E → E1 or E2 ‖
E1.true = E.true
E1.false = newlabel
E2.true = E.true
E2.false = E.false
E.code = append(E1.code,
gen(E1.false:),E2.code)
E → E1 and E2 ‖
E1.false = E.false
E1.true = newlabel
E2.true = E.true
E2.false = E.false
E.code = append(E1.code,
gen(E1.true:),E2.code)
Inherited labels:
root (or):            E.true = Ltrue, E.false = Lfalse
a < b:                E1.true = Ltrue, E1.false = L1
c < d and e < f:      E2.true = Ltrue, E2.false = Lfalse
c < d:                E1.true = L2, E1.false = Lfalse
e < f:                E2.true = Ltrue, E2.false = Lfalse
Generated code:
if a < b goto Ltrue
goto L1
L1: if c < d goto L2
goto Lfalse
L2: if e < f goto Ltrue
goto Lfalse
Combining Boolean and Control Flow Statements
while a < b do
if c < d then
x = y + z
else
x = y - z
[Figure: parse tree. The outer S is while E do S1 with E: a < b. Its body S1 is if E then S1 else S2 with E: c < d.]
S → while E do S1 ‖
S.begin = newlabel
E.true = newlabel
E.false = S.next
S1.next = S.begin
S.code = append(gen(S.begin:),E.code,
gen(E.true:),S1.code,
gen(goto S.begin))
Attributes of the while statement:
S.next = Lnext
S.begin = L1
E.true = L2
E.false = Lnext
Attributes of the nested if statement:
S.next = L1
E.true = L3
E.false = L4
S1.next = L1
S2.next = L1
Generated code:
L1: if a < b goto L2
goto Lnext
L2: if c < d goto L3
goto L4
L3: t1 = y + z
x = t1
goto L1
L4: t2 = y - z
x = t2
goto L1
Lnext:
Loop Constructs
Loops
• Evaluate condition before loop (if needed)
• Evaluate condition after loop
• Branch back to the top (if needed)
Why this structure?
• Merges test with last block of loop body
• Pre-test block to hold loop-invariant code
• Post-test for increment instructions and test
while, for, do, & until all fit this basic model
[Figure: loop control-flow graph with blocks Pre-test, Loop head, B1, B2, Post-test, and Next block]
Break & Skip Statements
Many modern programming languages include a break
• Exits from the innermost control-flow statement
– Out of the innermost loop
– Out of a case statement
Translates into a jump
• Targets statement outside control-flow construct
• Creates multiple-exit construct
• skip in loop goes to next iteration
Only makes sense if loop has > 1 block
[Figure: the loop control-flow graph (Pre-test, Loop head, B1, B2, Post-test, Next block) with a break in B1 and a skip in B2]
Break and Skip Statements
• Need to Keep track of enclosing control-flow
constructs
• Harder to have clean SDT scheme…
– Keep a Stack of control-flow constructs
– Use the S.next of the construct on top of the stack as the target for the break statement
– For skip statements need to keep track of the label of the code of
the post-test block to advance to the next iteration. This is harder
since the code has not been generated yet.
• Backpatching helps
– Use a breaklist and a skiplist to be patched later.
Backpatching
• Single Pass Solution to Code Generation?
– No more symbolic labels - symbolic addresses instead
– Emit code directly into an array of instructions
– Actions associated with Productions
– Executed when Bottom-Up Parser “Reduces” a production
• Problem
– Need to know the labels for target branches before actually generating the
code for them.
• Solution
– Leave Branches undefined and patch them later
– Requires: carrying around a list of the places that need to be patched until
the value to be patched with is known.
Boolean Expressions Revisited
• Use Additional ε-Production
– Just a Marker M
– Label Value M.addr
• Attributes:
– E.truelist: code places that need to be
filled-in corresponding to the
evaluation of E as “true”.
– E.falselist: same for “false”
(1) E → E1 or M E2
(2) | E1 and M E2
(3) | not E1
(4) | ( E1 )
(5) | id1 relop id2
(6) | true
(7) | false
(8) M → ε
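Boolean Expressions: Code Outline
E1 and E2
E1.code     (true: falls into E2.code; false: target not yet known)
E2.code     (true: target not yet known; false: target not yet known)
E1 or E2
E1.code     (true: target not yet known; false: falls into E2.code)
E2.code     (true: target not yet known; false: target not yet known)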
Actions
(8) M → ε { M.Addr := nextAddr; }
(1) E → E1 or M E2 { backpatch(E1.falselist, M.Addr);
E.truelist := merge(E1.truelist, E2.truelist);
E.falselist := E2.falselist; }
(2) E → E1 and M E2 { backpatch(E1.truelist, M.Addr);
E.truelist := E2.truelist;
E.falselist := merge(E1.falselist, E2.falselist); }
More Actions
(3) E → not E1 { E.truelist := E1.falselist;
E.falselist := E1.truelist; }
(4) E → ( E1 ) { E.truelist := E1.truelist;
E.falselist := E1.falselist; }
(6) E → true { E.truelist := makelist(nextquad); emit(‘goto _’); }
(7) E → false { E.falselist := makelist(nextquad); emit(‘goto _’); }
Backpatching Example
a < b or c < d and e < f
[Figure: annotated parse tree. The root is E → E1 or M E2, where E1 is a < b and E2 is E → E1 and M E2 over c < d and e < f; each E node carries E.truelist and E.falselist, and each M carries M.quad.]
Action for E → id1 relop id2:
{ E.truelist := makelist(nextquad());
E.falselist := makelist(nextquad() + 1);
emit(“if id1.place relop.op id2.place goto _”);
emit(“goto _”); }
Action for M → ε:
{ M.quad = nextquad(); }
Steps, in order of execution:
1. a < b: emits 100 and 101. E.truelist = {100}, E.falselist = {101}
100: if a < b goto _
101: goto _
2. M → ε: M.quad = 102
3. c < d: emits 102 and 103. E.truelist = {102}, E.falselist = {103}
102: if c < d goto _
103: goto _
4. M → ε: M.quad = 104
5. e < f: emits 104 and 105. E.truelist = {104}, E.falselist = {105}
104: if e < f goto _
105: goto _
6. E → E1 and M E2:
{ backpatch(E1.truelist,M.quad);
E.truelist := E2.truelist;
E.falselist := merge(E1.falselist,E2.falselist); }
102: if c < d goto 104
E.truelist = {104}, E.falselist = {103, 105}
7. E → E1 or M E2:
{ backpatch(E1.falselist,M.quad);
E.truelist := merge(E1.truelist,E2.truelist);
E.falselist := E2.falselist; }
101: goto 102
E.truelist = {100, 104}, E.falselist = {103, 105}
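The same example can be replayed with a small Python sketch of backpatching. It makes several assumptions for the illustration: instructions are stored as strings in a list starting at address 100, an unfilled target is a trailing "_", the reductions are written out by hand in the order a bottom-up parser would perform them, and the names Emitter, relop, bool_and and bool_or are hypothetical. Each expression is represented by its (truelist, falselist) pair.

```python
class Emitter:
    def __init__(self, start=100):
        self.start = start
        self.code = []

    def nextquad(self):
        return self.start + len(self.code)

    def emit(self, text):
        self.code.append(text)

    def backpatch(self, places, target):
        # Fill the pending "_" of every listed instruction with the target.
        for place in places:
            i = place - self.start
            assert self.code[i].endswith("_"), "instruction already patched"
            self.code[i] = self.code[i][:-1] + str(target)


def relop(em, left, op, right):             # E -> id1 relop id2
    truelist = [em.nextquad()]
    falselist = [em.nextquad() + 1]
    em.emit(f"if {left} {op} {right} goto _")
    em.emit("goto _")
    return truelist, falselist


def bool_and(em, e1, m_quad, e2):           # E -> E1 and M E2
    em.backpatch(e1[0], m_quad)
    return e2[0], e1[1] + e2[1]


def bool_or(em, e1, m_quad, e2):            # E -> E1 or M E2
    em.backpatch(e1[1], m_quad)
    return e1[0] + e2[0], e2[1]


em = Emitter()
e_ab = relop(em, "a", "<", "b")
m_or = em.nextquad()                        # M before the right operand of or
e_cd = relop(em, "c", "<", "d")
m_and = em.nextquad()                       # M before the right operand of and
e_ef = relop(em, "e", "<", "f")
e_and = bool_and(em, e_cd, m_and, e_ef)
truelist, falselist = bool_or(em, e_ab, m_or, e_and)
```

The instructions left with a "_" target are exactly those in the final truelist and falselist; they are patched once the enclosing statement knows where true and false should lead.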
Control Flow Code Structures
if E then S1
         E.code
E.true:  S1.code
E.false: ...
if E then S1 else S2
         E.code
E.true:  S1.code
         goto S.next
E.false: S2.code
S.next:  ...
while E do S1
S.begin: E.code
E.true:  S1.code
         goto S.begin
E.false: ...
Theory1&amp;2Theory1&amp;2
Theory1&amp;2
 
DConf 2016: Keynote by Walter Bright
DConf 2016: Keynote by Walter Bright DConf 2016: Keynote by Walter Bright
DConf 2016: Keynote by Walter Bright
 
02 functions, variables, basic input and output of c++
02   functions, variables, basic input and output of c++02   functions, variables, basic input and output of c++
02 functions, variables, basic input and output of c++
 
Compiler chapter six .ppt course material
Compiler chapter six .ppt course materialCompiler chapter six .ppt course material
Compiler chapter six .ppt course material
 
Introduction to c
Introduction to cIntroduction to c
Introduction to c
 
Unit 1 cd
Unit 1 cdUnit 1 cd
Unit 1 cd
 
Cpcs302 1
Cpcs302  1Cpcs302  1
Cpcs302 1
 
07 140430-ipp-languages used in llvm during compilation
07 140430-ipp-languages used in llvm during compilation07 140430-ipp-languages used in llvm during compilation
07 140430-ipp-languages used in llvm during compilation
 
C program compiler presentation
C program compiler presentationC program compiler presentation
C program compiler presentation
 
Syntaxdirected
SyntaxdirectedSyntaxdirected
Syntaxdirected
 
Syntaxdirected
SyntaxdirectedSyntaxdirected
Syntaxdirected
 
Syntaxdirected (1)
Syntaxdirected (1)Syntaxdirected (1)
Syntaxdirected (1)
 
C programming language tutorial
C programming language tutorial C programming language tutorial
C programming language tutorial
 
gayathri.p.pptx
gayathri.p.pptxgayathri.p.pptx
gayathri.p.pptx
 
Compilers Design
Compilers DesignCompilers Design
Compilers Design
 

Mehr von Anshul Sharma (12)

Understanding concurrency
Understanding concurrencyUnderstanding concurrency
Understanding concurrency
 
Programming using Open Mp
Programming using Open MpProgramming using Open Mp
Programming using Open Mp
 
Open MPI 2
Open MPI 2Open MPI 2
Open MPI 2
 
Open MPI
Open MPIOpen MPI
Open MPI
 
Paralle programming 2
Paralle programming 2Paralle programming 2
Paralle programming 2
 
Parallel programming
Parallel programmingParallel programming
Parallel programming
 
Cuda 3
Cuda 3Cuda 3
Cuda 3
 
Cuda 2
Cuda 2Cuda 2
Cuda 2
 
Cuda intro
Cuda introCuda intro
Cuda intro
 
Des
DesDes
Des
 
Intoduction to Linux
Intoduction to LinuxIntoduction to Linux
Intoduction to Linux
 
GCC
GCCGCC
GCC
 

Kürzlich hochgeladen

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Kürzlich hochgeladen (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Interm codegen

  • 1. Intermediate Code Generation. Sarfaraz Masood, Asstt Prof, Department of Computer Engineering, Jamia Millia University, New Delhi
  • 2. CS 540 GMU Spring 2007. Compiler Architecture [diagram]: Source language → Scanner (lexical analysis) → tokens → Parser (syntax analysis) → syntactic structure → Semantic Analysis (IC generator) → intermediate code → Code Optimizer → intermediate code → Code Generator → target language, with a Symbol Table alongside the phases.
  • 3. Joey Paquet, 2000, 2002 3 Introduction to Code Generation • Front end: – Lexical Analysis – Syntactic Analysis – Intermediate Code Generation • Back end: – Intermediate Code Optimization – Object Code Generation • The front end is machine-independent, i.e. it can be reused to build compilers for different architectures • The back end is machine-dependent, i.e. these steps are related to the nature of the assembly or machine language of the target architecture
  • 4. 08/31/13. Introduction to Code Generation [diagram]: source programs in Language-1 and Language-2 pass through the Language-1 and Language-2 front ends, which produce non-optimized intermediate code; the intermediate-code optimizer turns it into optimized intermediate code; the Target-1 and Target-2 code generators then produce Target-1 and Target-2 machine code.
  • 5. Joey Paquet, 2000, 2002 5 Introduction to Code Generation • After syntactic analysis, we have a number of options to choose from: – generate object code directly from the parse – generate intermediate code, and then generate object code from it – generate an intermediate abstract representation, and then generate code directly from it – generate an intermediate abstract representation, generate intermediate code, and then the object code • All these options have one thing in common: they are all based on syntactic information gathered in the semantic analysis
  • 6. Introduction to Code Generation [diagram of the four options]: (1) Lexical Analyzer → Syntactic Analyzer → Object Code; (2) Lexical Analyzer → Syntactic Analyzer → Intermediate Code → Object Code; (3) Lexical Analyzer → Syntactic Analyzer → Intermediate Representation → Object Code; (4) Lexical Analyzer → Syntactic Analyzer → Intermediate Representation → Intermediate Code → Object Code. The diagram marks the split between Front End and Back End.
  • 7. Intermediate Representation (IR): a kind of abstract machine language that can express the target machine operations without committing to too many machine details. • Why IR?
  • 11. 08/31/13 11 Advantages of Using an Intermediate Language 1. Retargeting - Build a compiler for a new machine by attaching a new code generator to an existing front-end. 2. Optimization - reuse intermediate code optimizers in compilers for different languages and different machines. Note: the terms “intermediate code”, “intermediate language”, and “intermediate representation” are all used interchangeably.
  • 12. Issues in Designing an IR • Whether to use an existing IR – if the target machine architecture is similar – if the new language is similar • Whether the IR is appropriate for the kind of optimizations to be performed – e.g. speculation and predication – some transformations may take much longer than they would on a different IR
  • 13. Issues in Designing an IR • Designing a new IR needs to consider – Level (how machine dependent it is) – Structure – Expressiveness – Appropriateness for general and special optimizations – Appropriateness for code generation – Whether multiple IRs should be used
  • 14. What are the IRs in actual compilers? • gcc is a widely used compiler on many platforms. It uses two IRs: AST (Abstract Syntax Tree) and RTL (Register Transfer Language), and some development paths use Tree-SSA [SSA, Static Single Assignment: each name is assigned once. We will talk about this later!] • A VM can be seen as a new type of IR: Java bytecode, .NET IL • Some programming languages have well-defined intermediate languages: – Java: Java Virtual Machine – Prolog: Warren Abstract Machine • In fact, there are byte-code emulators that execute instructions in these intermediate languages.
  • 15. Intermediate Code Generation • Direct Translation – Using an SDT scheme – Parse tree to Three-Address Instructions – Can be done while parsing, in a single pass – Needs to be able to deal with syntactic errors and recovery • Indirect Translation – First validate the parse while constructing the AST – Uses an SDT scheme to build the AST – Traverse the AST and generate Three-Address Instructions [diagram: parse tree → AST → IR of three-address instructions with ∞ registers; intermediate code generation is O(n)]
  • 16. 1. Syntax Tree vs DAG. Syntax-directed definition to produce an AST for assignment statements:
      production        semantic rules
      S → id := E       S.nptr := mknode('assign', mkleaf(id, id.entry), E.nptr)
      E → E1 + E2       E.nptr := mknode('+', E1.nptr, E2.nptr)
      E → E1 ∗ E2       E.nptr := mknode('∗', E1.nptr, E2.nptr)
      E → −E1           E.nptr := mkunode('uminus', E1.nptr)
      E → (E1)          E.nptr := E1.nptr
      E → id            E.nptr := mkleaf(id, id.entry)
  • 17. Syntax Tree vs DAG [figure: syntax tree for a := (−b + c∗d) + c∗d, rooted at assign, with a separate subtree for each occurrence of c∗d]
  • 18. Syntax Tree vs DAG • If mknode returns a pointer to an existing node whenever possible, a DAG can be produced. [figure: (a) syntax tree and (b) DAG for a := (−b + c∗d) + c∗d; in the DAG both uses of c∗d point to one shared node]
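The remark about mknode can be illustrated with a small Python sketch. It is only one possible reading of the slide: the node table `nodes`, the lookup dictionary `index`, and the single-argument `mkleaf` are assumptions made for this example, not part of the original deck.

```python
# Assumed representation: a node is a tuple (op, left, right) stored in a list;
# its position in the list plays the role of the pointer returned by mknode.
nodes = []   # node id -> (op, left, right)
index = {}   # (op, left, right) -> node id, used to find existing nodes


def mknode(op, left=None, right=None):
    """Return the id of an existing identical node, creating one only if needed."""
    key = (op, left, right)
    if key not in index:
        index[key] = len(nodes)
        nodes.append(key)
    return index[key]


def mkleaf(name):
    """Leaves are nodes too, so repeated identifiers are shared as well."""
    return mknode("id", name)


# a := (-b + c*d) + c*d
a, b, c, d = (mkleaf(n) for n in "abcd")
first = mknode("*", c, d)
second = mknode("*", c, d)   # same key, so the node built above is reused
root = mknode("assign", a,
              mknode("+", mknode("+", mknode("uminus", b), first), second))
```

Because both products map to the same key, `first` and `second` are the same node id, which is exactly what turns the tree into a DAG.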
  • 19. 2. Postfix Notation: a mathematical notation wherein every operator follows all of its operands. Form rules: 1. If E is a variable/constant, the PN of E is E itself. 2. If E is an expression of the form E1 op E2, the PN of E is E1' E2' op (E1' and E2' are the PN of E1 and E2, respectively). 3. If E is a parenthesized expression of the form (E1), the PN of E is the same as the PN of E1. The PN of the expression 9*(5+2) is 952+*. How about (a+b)/(c-d)? ab+cd-/
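The three form rules translate almost directly into a recursive function. The sketch below assumes expressions are already parsed into nested tuples `(op, e1, e2)` with strings as leaves (an encoding chosen for this example), so rule 3 needs no code: parentheses disappear once the tree is built.

```python
def postfix(e):
    """Return the postfix notation of e, where e is a str or a tuple (op, e1, e2)."""
    if isinstance(e, str):
        return e                              # rule 1: variable or constant
    op, e1, e2 = e
    return postfix(e1) + postfix(e2) + op     # rule 2: E1' E2' op


nine = postfix(("*", "9", ("+", "5", "2")))               # 9*(5+2)
frac = postfix(("/", ("+", "a", "b"), ("-", "c", "d")))   # (a+b)/(c-d)
print(nine, frac)
```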
  • 20. Intermediate-Code Generation. 3. Static Single-Assignment Form • Static single-assignment form (SSA) is an intermediate representation that facilitates certain code optimizations. • Two distinct aspects distinguish SSA from three-address code. – All assignments in SSA are to variables with distinct names; hence the term static single-assignment.
  • 21. 3. Static Single-Assignment Form. Original code: if (flag) x = -1; else x = 1; y = x * a; In SSA form: if (flag) x1 = -1; else x2 = 1; x3 = φ(x1, x2); y = x3 * a;
  • 22. 4. Three Address Instructions IR • Constructs are mapped to Three-Address Instructions – Register-based IR for expression evaluation – Infinite number of virtual registers – Still independent of target architecture • Generic Statement Format: Label: x = y op z or if exp goto L – Statements can have symbolic labels – Compiler inserts temporary variables – Types and conversions are dealt with in other phases of code generation
  • 23. Types of Three-address Statements • Assignment – Binary: x := y op z – Unary: x := op y – “op” can be any reasonable arithmetic or logic operator. • Copy – Simple: x := y – Indexed: x := y[i] or x[i] := y – Address and pointer manipulation: • x := &y • x := * y • *x := y
  • 24. Types of Three-address Statements • Jump – Unconditional: goto L – Conditional: if x relop y goto L1 [else goto L2], where relop is <, =, >, ≥, ≤, or ≠. • Procedure call – Call procedure P(X1, X2, ..., Xn): PARAM X1; PARAM X2; ...; PARAM Xn; CALL P, n
  • 25. implementations of three-address statements • common implementations: – Quadruples – Triples – indirect triples Consider the code: a := b * -c + b * -c
  • 26. Quadruples • A quadruple is a record structure with four fields: op, arg1, arg2, and result – The op field contains an internal code for an operator – Statements with unary operators do not use arg2 – Operators like param use neither arg2 nor result – The target label for conditional and unconditional jumps is in result • The contents of fields arg1, arg2, and result are typically pointers to symbol table entries – If so, temporaries must be entered into the symbol table as they are created – Obviously, constants need to be handled differently
  • 27. Quadruples Example: a := b * -c + b * -c
            op      arg1  arg2  result
      (0)   uminus  c           t1
      (1)   *       b     t1    t2
      (2)   uminus  c           t3
      (3)   *       b     t3    t4
      (4)   +       t2    t4    t5
      (5)   :=      t5          a
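A minimal Python sketch of how such a quadruple list could be produced from an expression tree. The tuple encoding of expressions and the helper name `to_quads` are assumptions for illustration; a real compiler would store symbol-table pointers rather than strings.

```python
from itertools import count


def to_quads(target, expr):
    """expr is a str, ('uminus', e) or (op, e1, e2).
    Returns a list of quadruples (op, arg1, arg2, result)."""
    quads = []
    temps = count(1)

    def walk(e):
        if isinstance(e, str):
            return e                          # a name is its own place
        arg1 = walk(e[1])
        arg2 = walk(e[2]) if len(e) == 3 else None   # unary ops leave arg2 empty
        result = f"t{next(temps)}"            # every operation gets a fresh temporary
        quads.append((e[0], arg1, arg2, result))
        return result

    value = walk(expr)
    quads.append((":=", value, None, target))
    return quads


term = ("*", "b", ("uminus", "c"))
quads = to_quads("a", ("+", term, term))
for i, q in enumerate(quads):
    print(f"({i})", *("" if f is None else f for f in q))
```

For the running example this yields the six quadruples of the table, with t1 to t5 created in the same order.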
  • 28. Triples • Triples refer to a temporary value by the position of the statement that computes it – Statements can be represented by a record with only three fields: op, arg1, and arg2 – Avoids the need to enter temporary names into the symbol table • Contents of arg1 and arg2: – Pointer into symbol table (for programmer defined names) – Pointer into triple structure (for temporaries) – Of course, still need to handle constants differently
  • 29. Triples Example: a := b * -c + b * -c (the result is implicit in triples)
            op      arg1  arg2
      (0)   uminus  c
      (1)   *       b     (0)
      (2)   uminus  c
      (3)   *       b     (2)
      (4)   +       (1)   (3)
      (5)   assign  a     (4)
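The same tree walk can emit triples instead. In this sketch (again with an assumed tuple encoding, and the hypothetical helper `to_triples`), a Python int in an argument field stands for a pointer into the triple structure and a str stands for a symbol-table name, so no temporary names are needed.

```python
def to_triples(target, expr):
    """expr is a str, ('uminus', e) or (op, e1, e2).
    Returns a list of triples (op, arg1, arg2)."""
    triples = []

    def walk(e):
        if isinstance(e, str):
            return e                              # programmer-defined name
        args = [walk(sub) for sub in e[1:]]
        args += [None] * (2 - len(args))          # unary operators leave arg2 empty
        triples.append((e[0], args[0], args[1]))
        return len(triples) - 1                   # position of the triple just added

    value = walk(expr)
    triples.append(("assign", target, value))
    return triples


term = ("*", "b", ("uminus", "c"))
triples = to_triples("a", ("+", term, term))
```

Note that the addition refers to positions 1 and 3, the two multiplications, rather than to any temporary name.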
  • 30. An indexed assignment requires two triples: x[i] := y
            op    arg1  arg2
      (0)   []=   x     i
      (1)   :=    (0)   y
  • 31. Indirect triples • Indirect triples add a list of pointers to triples, so that triples can be shared and moved easily. Example: a := b * -c + b * -c
      statement list        triples
      (0)  (14)                   op      arg1  arg2
      (1)  (15)             (14)  uminus  c
      (2)  (16)             (15)  *       b     (14)
      (3)  (17)             (16)  uminus  c
      (4)  (18)             (17)  *       b     (16)
      (5)  (19)             (18)  +       (15)  (17)
                            (19)  assign  a     (18)
  • 32. Syntax-directed translation into three-address code
      production        semantic rules
      S → id := E       S.code := E.code ‖ gen(id.place ':=' E.place)
      E → E1 + E2       E.place := newtemp; E.code := E1.code ‖ E2.code ‖ gen(E.place ':=' E1.place '+' E2.place)
      E → E1 ∗ E2       ...
      E → −E1           E.place := newtemp; E.code := E1.code ‖ gen(E.place ':=' 'uminus' E1.place)
      E → (E1)          E.place := E1.place; E.code := E1.code
      E → id            E.place := id.place; E.code := ''
  • 33. Syntax-directed translation into three-address code
      production        semantic rules
      S → while E do S1 S.begin := newlabel; S.after := newlabel; S.code := gen(S.begin ':') ‖ E.code ‖ gen('if' E.place '=' '0' 'goto' S.after) ‖ S1.code ‖ gen('goto' S.begin) ‖ gen(S.after ':')
  • 34. Declarations • enter symbols in a symbol table • allocate space and record it in the symbol table • emit appropriate code
  • 35. Declarations in a procedure • computing types and relative addresses of names
      P → {offset := 0} D
      D → D ; D
      D → id : T                {enter(id.name, T.type, offset); offset := offset + T.width}
      T → integer               {T.type := integer; T.width := 4}
      T → real                  {T.type := real; T.width := 8}
      T → array [ num ] of T1   {T.type := array(num.val, T1.type); T.width := num.val × T1.width}
      T → ↑T1                   {T.type := pointer(T1.type); T.width := 4}
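The offset computation in these actions can be sketched in Python. The type encoding (strings for basic types, tuples for array and pointer types) and the function names `width` and `layout` are assumptions made for this example; the widths 4, 8 and 4 are the ones given in the productions.

```python
WIDTH = {"integer": 4, "real": 8}


def width(t):
    """t is 'integer', 'real', ('array', n, elem_type) or ('pointer', target_type)."""
    if isinstance(t, str):
        return WIDTH[t]
    if t[0] == "array":
        return t[1] * width(t[2])     # T.width := num.val × T1.width
    if t[0] == "pointer":
        return 4
    raise ValueError(f"unknown type {t!r}")


def layout(decls):
    """Mimic enter(id.name, T.type, offset) for a sequence of declarations."""
    table, offset = {}, 0             # P → {offset := 0} D
    for name, t in decls:
        table[name] = (t, offset)
        offset += width(t)            # offset := offset + T.width
    return table, offset


table, total = layout([
    ("i", "integer"),
    ("x", "real"),
    ("a", ("array", 10, "real")),
    ("p", ("pointer", "integer")),
])
print({name: off for name, (_, off) in table.items()}, total)
```

Here `i`, `x`, `a` and `p` land at relative addresses 0, 4, 12 and 92, and the procedure needs 96 bytes in total.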
  • 36. Syntax-Directed Translation to Three Address Code • Attributes for the Non-Terminals, say E and S – Location of the value of an expression: E.place – The code that evaluates the expression or statement: E.code – Markers for the beginning and end of sections of the code: S.begin, S.end • Semantic Actions in Productions of the Grammar – Functions to create temporaries (newtemp) and labels (newlabel) – Auxiliary functions to enter symbols and consult types corresponding to declarations in a side data structure that can be built as the code is being parsed: a symbol table. – To generate the code we use the emit function gen, which creates a list of instructions to be emitted later and can generate symbolic labels corresponding to the next instruction of a list. – Use of the append function on lists of instructions. – Synthesized and Inherited Attributes
  • 37. Assignment Statements: Grammar and Actions S → id = E { p = lookup(id.name); if (p != NULL) S.code = append(E.code, gen(p '=' E.place)); else error; } E → E1 + E2 { E.place = newtemp(); E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '+' E2.place)); } E → E1 * E2 { E.place = newtemp(); E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '*' E2.place)); }
  • 38. Assignment Statements: Grammar and Actions E → - E1 {E.place = newtemp(); E.code = append(E1.code,gen(E.place ‘=‘ ‘-’ E1.place)); } E → (E1) {E.place = E1.place; E.code = E1.code; } E → id {p = lookup(id.name); if (p != NULL) E.place = p; else error; E.code = nulllist; }
  • 39. Assignment: Example x = a * b + c * d - e * f; [figure: parse tree for S → id = E with id = x; in the figure the right-hand side is grouped as (a * b) + ((c * d) - (e * f))]
  • 40. Assignment: Example x = a * b + c * d - e * f; Production: E → id { p = lookup(id.name); if (p != NULL) E.place = p; else error; E.code = null list; } Applied to e: place = loc(e), code = null.
  • 41. Assignment: Example. The same production applied to f: place = loc(f), code = null.
  • 42. Assignment: Example. Production: E → E1 * E2 { E.place = newtemp(); E.code = gen(E.place '=' E1.place '*' E2.place); } Applied to e * f: place = loc(t1), code = {t1 = e * f;}
  • 43. Assignment: Example. After c and d get place = loc(c) and place = loc(d) with code = null, the same production E → E1 * E2 applied to c * d gives place = loc(t2), code = {t2 = c * d;}
  • 44. Assignment: Example. Remaining steps: c * d - e * f: place = loc(t3), code = {t1 = e * f; t2 = c * d; t3 = t2 - t1;}. a and b: place = loc(a), place = loc(b), code = null. a * b: place = loc(t4), code = {t4 = a * b;}. Whole right-hand side: place = loc(t5), code = {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3;}. x: place = loc(x), code = null. Production: S → id = E { p = lookup(id.name); if (p != NULL) E.code = append(E.code, gen(p '=' E.place)); else error; } Result: code = {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3; x = t5;}
  • 45. Assignment: Example x = a * b + c * d - e * f; Generated code:
      t1 = e * f;
      t2 = c * d;
      t3 = t2 - t1;
      t4 = a * b;
      t5 = t4 + t3;
      x = t5;
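The E.place / E.code scheme can be tried out with a short Python sketch. It is a simplification under stated assumptions: expressions are pre-built tuples rather than parser output, the symbol-table lookup is omitted, and operands are visited left to right (as append(E1.code, E2.code, ...) suggests), so the temporaries are numbered differently from the listing on the slide even though the computed value is the same.

```python
from itertools import count

_temps = count(1)


def newtemp():
    return f"t{next(_temps)}"


def gen(*parts):
    """Build a one-instruction code list."""
    return [" ".join(parts)]


def expr(e):
    """Return (place, code) for e: a str, ('-', e1) or (op, e1, e2)."""
    if isinstance(e, str):
        return e, []                              # E → id: empty code
    if len(e) == 2:                               # E → - E1
        p1, c1 = expr(e[1])
        place = newtemp()
        return place, c1 + gen(place, "=", e[0], p1)
    p1, c1 = expr(e[1])                           # E → E1 op E2
    p2, c2 = expr(e[2])
    place = newtemp()
    return place, c1 + c2 + gen(place, "=", p1, e[0], p2)


def assign(name, e):
    """S → id = E: append the final copy to E.code."""
    place, code = expr(e)
    return code + gen(name, "=", place)


tree = ("+", ("*", "a", "b"), ("-", ("*", "c", "d"), ("*", "e", "f")))
code = assign("x", tree)
print("\n".join(code))
```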
  • 46. Reusing Temporary Variables • Temporary Variables – Short lived – Used for Evaluation of Expressions – Clutter the Symbol Table • Change the newtemp Function – Keep track of when the value created in a temporary is used – Use a counter to keep track of the number of active temps – When a temporary is used in an expression decrement counter – When a temporary is generated by newtemp increment counter – Initialize counter to zero
  • 47. Assignment: Example • Only 2 Registers Needed. x = a * b + c * d - e * f; Generated code, with the counter c of active temporaries:
                      // c = 0
      t1 = e * f;     // c = 1
      t2 = c * d;     // c = 2
      t1 = t2 - t1;   // c = 1
      t2 = a * b;     // c = 2
      t1 = t2 + t1;   // c = 1
      x = t1;         // c = 0
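A possible Python rendering of the counter-based newtemp. It assumes, as the listing does, that the right operand is evaluated before the left one and that every temporary is used exactly once; `translate` and the tuple encoding of the expression are names invented for this sketch.

```python
def translate(target, tree):
    """Return (code, peak) where peak is the largest number of live temporaries."""
    code = []
    active = 0     # the counter c from the slide
    peak = 0

    def newtemp():
        nonlocal active, peak
        active += 1                   # a temporary is generated: increment
        peak = max(peak, active)
        return f"t{active}"

    def release(is_temp):
        nonlocal active
        if is_temp:
            active -= 1               # a temporary is used as an operand: decrement

    def walk(e):
        """Return (place, is_temp) for a str leaf or a tuple (op, left, right)."""
        if isinstance(e, str):
            return e, False
        op, left, right = e
        p2, temp2 = walk(right)       # right operand first, to match the slide
        p1, temp1 = walk(left)
        release(temp1)
        release(temp2)
        place = newtemp()
        code.append(f"{place} = {p1} {op} {p2}")
        return place, True

    place, is_temp = walk(tree)
    release(is_temp)
    code.append(f"{target} = {place}")
    return code, peak


tree = ("+", ("*", "a", "b"), ("-", ("*", "c", "d"), ("*", "e", "f")))
code, peak = translate("x", tree)
```

Because temporaries are released in the reverse order of their creation, the counter behaves like a stack pointer, and the whole statement needs only t1 and t2.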
  • 48. Boolean & Relational Values How should the compiler represent them? • Answer depends on the target machine Two classic approaches • Numerical representation • Positional (implicit) representation Correct choice depends on both context and ISA
  • 49. SDT Scheme for Boolean Expressions • Issue: control flow introduces complications – In both representations – Need to know the address to jump to in some cases • Solution: two additional attributes – nextstat (inherited): indicates the next location to be generated – laststat (synthesized): indicates the last location filled – As code is generated, the attributes are filled with the correct values
  • 50. Boolean Expression: Grammar and Actions E → false {E.place = newtemp() E.code = {gen(E.place = 0)} E.laststat = E.nextstat + 1 } E → true {E.place = newtemp() E.code = {gen(E.place = 1)} E.laststat = E.nextstat + 1 }
  • 51. Boolean Expression: Grammar and Actions E → (E1) {E.place = E1.place; E.code = E1.code; E1.nextstat = E.nextstat E.laststat = E1.laststat } E → not E1 {E.place = newtemp() E.code = append(E1.code,gen(E.place = not E1.place)) E1.nextstat = E.nextstat E.laststat = E1.laststat + 1 }
  • 52. Boolean Expression: Grammar and Actions E → E1 or E2 { E.place = newtemp() E.code = append(E1.code, E2.code, gen(E.place = E1.place or E2.place)) E1.nextstat = E.nextstat E2.nextstat = E1.laststat E.laststat = E2.laststat + 1 }
  • 53. Boolean Expression: Grammar and Actions E → E1 and E2 { E.place = newtemp() E.code = append(E1.code, E2.code, gen(E.place = E1.place and E2.place)) E1.nextstat = E.nextstat E2.nextstat = E1.laststat E.laststat = E2.laststat + 1 }
  • 54. Boolean Expression: Grammar and Actions E → id1 relop id2 { E.place = newtemp() E.code = gen(if id1.place relop id2.place goto E.nextstat+3) E.code = append(E.code, gen(E.place = 0)) E.code = append(E.code, gen(goto E.nextstat+4)) E.code = append(E.code, gen(E.place = 1)) E.laststat = E.nextstat + 4 }
  • 55. Boolean Expressions: Example a < b or c < d and e < f [figure: parse tree, with and binding tighter than or]
      00: if a < b goto 03
      01: t1 = 0
      02: goto 04
      03: t1 = 1
      04: if c < d goto 07
      05: t2 = 0
      06: goto 08
      07: t2 = 1
      08: if e < f goto 11
      09: t3 = 0
      10: goto 12
      11: t3 = 1
      12: t4 = t2 and t3
      13: t5 = t1 or t4
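The numerical representation can be reproduced with a short Python sketch. Instead of threading nextstat and laststat through the tree, this version simply uses the current length of the instruction list as the next statement number, which is an implementation shortcut assumed for this example. The unconditional jump in the relational case targets nextstat + 4, which is what `02: goto 04` in the listing requires.

```python
def translate(e):
    """e is ('or'|'and', e1, e2), ('not', e1) or (relop, id1, id2).
    Returns the numbered three-address listing."""
    code = []          # index in this list = statement number
    ntemps = 0

    def newtemp():
        nonlocal ntemps
        ntemps += 1
        return f"t{ntemps}"

    def walk(e):
        op = e[0]
        if op in ("or", "and"):
            p1 = walk(e[1])
            p2 = walk(e[2])
            place = newtemp()
            code.append(f"{place} = {p1} {op} {p2}")
        elif op == "not":
            p1 = walk(e[1])
            place = newtemp()
            code.append(f"{place} = not {p1}")
        else:                                   # relational test: four statements
            nextstat = len(code)
            place = newtemp()
            code.append(f"if {e[1]} {op} {e[2]} goto {nextstat + 3:02d}")
            code.append(f"{place} = 0")
            code.append(f"goto {nextstat + 4:02d}")
            code.append(f"{place} = 1")
        return place

    walk(e)
    return [f"{i:02d}: {line}" for i, line in enumerate(code)]


listing = translate(("or", ("<", "a", "b"),
                     ("and", ("<", "c", "d"), ("<", "e", "f"))))
print("\n".join(listing))
```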
  • 56. Control Flow Statements: Code Layout • S → if E then S1: E.code (jumps to E.true or to E.false); E.true: S1.code; E.false: (code following S) • S → if E then S1 else S2: E.code (jumps to E.true or to E.false); E.true: S1.code; goto S.next; E.false: S2.code; S.next: (code following S) • Attributes: – E.true: the label to which control flows if E is true – E.false: the label to which control flows if E is false – S.next: an inherited attribute with the symbolic label of the code following S
  • 57. Control Flow Statements: Code Layout • S → while E do S1: S.begin: E.code (jumps to E.true or to E.false); E.true: S1.code; goto S.begin; E.false: (code following S) • Difficulty: need to know where to jump to – Introduce symbolic labels using the newlabel function – Use inherited attributes – Backpatch them later with the actual values (later…)
  • 58. Control Flow Statements: Grammar and Actions S → if E then S1 { E.true = newlabel E.false = S.next S1.next = S.next S.code = append(E.code,gen(E.true:),S1.code) }
  • 59. Control Flow Statements: Grammar and Actions S → if E then S1 else S2 { E.true = newlabel E.false = newlabel S1.next = S.next S2.next = S.next S.code = append(E.code,gen(E.true:),S1.code, gen(goto S.next),gen(E.false :),S2.code) }
  • 60. Control Flow Statements: Grammar and Actions S → while E do S1 { S.begin = newlabel E.true = newlabel E.false = S.next S1.next = S.begin S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }
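The three statement schemes can be combined into one Python sketch. The inherited attribute S.next becomes a function parameter, conditions are restricted to a single relational test, and the tuple encoding of statements (`'assign'`, `'if'`, `'ifelse'`, `'while'`) is an assumption made for this example.

```python
from itertools import count

_labels, _temps = count(1), count(1)


def newlabel():
    return f"L{next(_labels)}"


def newtemp():
    return f"t{next(_temps)}"


def cond(e, true, false):
    """Jumping code for e = (relop, id1, id2)."""
    relop, a, b = e
    return [f"if {a} {relop} {b} goto {true}", f"goto {false}"]


def stmt(s, next_):
    """Return the code list for statement s; next_ plays the role of S.next."""
    kind = s[0]
    if kind == "assign":                       # ('assign', x, (op, y, z))
        _, x, (op, y, z) = s
        t = newtemp()
        return [f"{t} = {y} {op} {z}", f"{x} = {t}"]
    if kind == "if":                           # ('if', E, S1)
        _, e, s1 = s
        true = newlabel()
        return cond(e, true, next_) + [f"{true}:"] + stmt(s1, next_)
    if kind == "ifelse":                       # ('ifelse', E, S1, S2)
        _, e, s1, s2 = s
        true, false = newlabel(), newlabel()
        return (cond(e, true, false) + [f"{true}:"] + stmt(s1, next_)
                + [f"goto {next_}", f"{false}:"] + stmt(s2, next_))
    if kind == "while":                        # ('while', E, S1)
        _, e, s1 = s
        begin, true = newlabel(), newlabel()
        return ([f"{begin}:"] + cond(e, true, next_) + [f"{true}:"]
                + stmt(s1, begin) + [f"goto {begin}"])
    raise ValueError(f"unknown statement kind {kind!r}")


program = ("while", ("<", "a", "b"),
           ("ifelse", ("<", "c", "d"),
            ("assign", "x", ("+", "y", "z")),
            ("assign", "x", ("-", "y", "z"))))
code = stmt(program, "Lnext")
print("\n".join(code))
```

The caller is expected to place the label `Lnext:` after the returned code, since S.next names the code that follows S.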
  • 61. Control Flow Translation of Boolean Expressions • Short-Circuit Evaluation – No Need to Evaluate portions of the expression if the outcome is already determined – Examples: • E1 or E2 need not evaluate E2 if E1 is known to be true. • E1 and E2 need not evaluate E2 if E1 is known to be false. • Use Control Flow – Jump over code that evaluates boolean terms of the expression – Use Inherited E.false and E.true attributes and link evaluation of E
  • 62. Control Flow Translation of Boolean Expressions E → E1 or E2 { E1.true = E.true E1.false = newlabel E2.true = E.true E2.false = E.false E.code = append(E1.code,gen(E1.false:),E2.code) } E → E1 and E2 {E1.false = E.false E1.true = newlabel E2.true = E.true E2.false = E.false E.code = append(E1.code,gen(E1.true:),E2.code) }
  • 63. Control Flow Translation of Boolean Expressions E → id1 relop id2 {E.code = append(gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) } E → true {E.code = gen(goto E.true) } E → false {E.code = gen(goto E.false) } E → not E1 {E1.true = E.false E1.false = E.true E.code = E1.code } E → ( E1 ) { E1.true = E.true E1.false = E.false E.code = E1.code }
  • 64. Boolean Expression: Short Circuit Evaluation a < b or c < d and e < f [figure: parse tree, E → E1 or E2 with E1 = a < b and E2 = c < d and e < f]
  • 65. Boolean Expression: Short Circuit Evaluation a < b or c < d and e < f. Productions applied: E → id1 relop id2 { E.code = append(gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) } and E → E1 or E2 { E1.true = E.true E1.false = newlabel E2.true = E.true E2.false = E.false E.code = append(E1.code, gen(E1.false:), E2.code) }. Attributes at the or node: E.true = Ltrue, E.false = Lfalse; E1.true = Ltrue, E1.false = L1; E2.true = Ltrue, E2.false = Lfalse. Code so far: if a < b goto Ltrue; goto L1; L1:
  • 66. Boolean Expression: Short Circuit Evaluation. Production applied: E → E1 and E2 { E1.false = E.false E1.true = newlabel E2.true = E.true E2.false = E.false E.code = append(E1.code, gen(E1.true:), E2.code) }. Attributes at the and node: E1.true = L2, E1.false = Lfalse; E2.true = Ltrue, E2.false = Lfalse. Code so far: if a < b goto Ltrue; goto L1; L1: if c < d goto L2; goto Lfalse; L2:
  • 67. Boolean Expression: Short Circuit Evaluation. E → id1 relop id2 applied to e < f with E.true = Ltrue and E.false = Lfalse. Code so far: if a < b goto Ltrue; goto L1; L1: if c < d goto L2; goto Lfalse; L2: if e < f goto Ltrue; goto Lfalse
  • 68. Boolean Expression: Short Circuit Evaluation a < b or c < d and e < f. Final code:
          if a < b goto Ltrue
          goto L1
      L1: if c < d goto L2
          goto Lfalse
      L2: if e < f goto Ltrue
          goto Lfalse
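The short-circuit scheme is compact enough to sketch in Python. The inherited attributes E.true and E.false are passed down as parameters; the tuple encoding of expressions and the function name `jumping_code` are assumptions of this example rather than anything defined in the slides.

```python
from itertools import count


def jumping_code(e, true="Ltrue", false="Lfalse"):
    """e is ('or'|'and', e1, e2), ('not', e1), ('true',), ('false',)
    or (relop, id1, id2). Returns a list of instructions and labels."""
    labels = count(1)

    def newlabel():
        return f"L{next(labels)}"

    def walk(e, true, false):
        op = e[0]
        if op == "or":
            mid = newlabel()                   # E1.false: fall into E2
            return walk(e[1], true, mid) + [f"{mid}:"] + walk(e[2], true, false)
        if op == "and":
            mid = newlabel()                   # E1.true: fall into E2
            return walk(e[1], mid, false) + [f"{mid}:"] + walk(e[2], true, false)
        if op == "not":
            return walk(e[1], false, true)     # swap the two targets
        if op == "true":
            return [f"goto {true}"]
        if op == "false":
            return [f"goto {false}"]
        return [f"if {e[1]} {op} {e[2]} goto {true}", f"goto {false}"]

    return walk(e, true, false)


code = jumping_code(("or", ("<", "a", "b"),
                     ("and", ("<", "c", "d"), ("<", "e", "f"))))
print("\n".join(code))
```

No temporaries are created at all: the value of the expression is represented only by which label control reaches.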
  • 69. Combining Boolean and Control Flow Statements
    Statement: while a < b do if c < d then x = y + z else x = y - z
    [Figure: parse tree for S → while E do S1, where E is a < b and S1 is the if-then-else statement with condition c < d]
    Rule applied:
      S → while E do S1 ‖ S.begin = newlabel; E.true = newlabel; E.false = S.next; S1.next = S.begin; S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin))
  • 70. Combining Boolean and Control Flow Statements
    Statement: while a < b do if c < d then x = y + z else x = y - z
    [Figure: parse tree for S → while E do S1, where E is a < b and S1 is the if-then-else statement with condition c < d]
    Rule applied:
      S → while E do S1 ‖ S.begin = newlabel; E.true = newlabel; E.false = S.next; S1.next = S.begin; S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin))
    Attributes:
      while statement: S.next = Lnext, S.begin = L1, E.true = L2, E.false = Lnext
      loop body (the if statement): S.next = L1
  • 71. Combining Boolean and Control Flow Statements
    Statement: while a < b do if c < d then x = y + z else x = y - z
    [Figure: parse tree for S → while E do S1, where E is a < b and S1 is the if-then-else statement with condition c < d]
    Rule applied:
      S → while E do S1 ‖ S.begin = newlabel; E.true = newlabel; E.false = S.next; S1.next = S.begin; S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin))
    Attributes:
      while statement: S.next = Lnext, S.begin = L1, E.true = L2, E.false = Lnext
      loop body (the if statement): S.next = L1
    Code generated so far:
      L1: if a < b goto L2
      goto Lnext
      L2:
  • 72. Combining Boolean and Control Flow Statements
    Statement: while a < b do if c < d then x = y + z else x = y - z
    [Figure: parse tree for S → while E do S1, where E is a < b and S1 is the if-then-else statement with condition c < d]
    Rule applied:
      S → while E do S1 ‖ S.begin = newlabel; E.true = newlabel; E.false = S.next; S1.next = S.begin; S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin))
    Attributes:
      while statement: S.next = Lnext, S.begin = L1, E.true = L2, E.false = Lnext
      loop body (the if statement): S.next = L1, E.true = L3, E.false = L4, S1.next = L1, S2.next = L1
    Code generated so far:
      L1: if a < b goto L2
      goto Lnext
      L2: if c < d goto L3
      goto L4
      L3:
  • 73. Combining Boolean and Control Flow Statements
    Statement: while a < b do if c < d then x = y + z else x = y - z
    [Figure: parse tree for S → while E do S1, where E is a < b and S1 is the if-then-else statement with condition c < d]
    Rule applied:
      S → while E do S1 ‖ S.begin = newlabel; E.true = newlabel; E.false = S.next; S1.next = S.begin; S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin))
    Attributes:
      while statement: S.next = Lnext, S.begin = L1, E.true = L2, E.false = Lnext
      loop body (the if statement): S.next = L1, E.true = L3, E.false = L4, S1.next = L1, S2.next = L1
    Code generated so far:
      L1: if a < b goto L2
      goto Lnext
      L2: if c < d goto L3
      goto L4
      L3: t1 = y + z
      x = t1
      goto L1
      L4:
  • 74. Combining Boolean and Control Flow Statements
    Statement: while a < b do if c < d then x = y + z else x = y - z
    [Figure: parse tree for S → while E do S1, where E is a < b and S1 is the if-then-else statement with condition c < d]
    Rule applied:
      S → while E do S1 ‖ S.begin = newlabel; E.true = newlabel; E.false = S.next; S1.next = S.begin; S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin))
    Attributes:
      while statement: S.next = Lnext, S.begin = L1, E.true = L2, E.false = Lnext
      loop body (the if statement): S.next = L1, E.true = L3, E.false = L4, S1.next = L1, S2.next = L1
    Code generated so far:
      L1: if a < b goto L2
      goto Lnext
      L2: if c < d goto L3
      goto L4
      L3: t1 = y + z
      x = t1
      goto L1
      L4: t2 = y - z
      x = t2
      goto L1
      Lnext:
  • 75. Combining Boolean and Control Flow Statements
    Statement: while a < b do if c < d then x = y + z else x = y - z
    [Figure: parse tree for S → while E do S1, where E is a < b and S1 is the if-then-else statement with condition c < d]
    Rule applied:
      S → while E do S1 ‖ S.begin = newlabel; E.true = newlabel; E.false = S.next; S1.next = S.begin; S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin))
    Final code:
      L1: if a < b goto L2
      goto Lnext
      L2: if c < d goto L3
      goto L4
      L3: t1 = y + z
      x = t1
      goto L1
      L4: t2 = y - z
      x = t2
      goto L1
      Lnext:
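The while rule, combined with an if-then-else rule that follows the same inherited-label pattern, can be sketched as below. This is an illustration under assumptions: the classes `Rel`, `Assign`, `IfElse`, `While` and `Gen` are hypothetical, and the if-then-else layout (E.code, E.true: S1.code, goto S.next, E.false: S2.code) is taken from the control-flow outline on slide 85.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Rel:
    left: str
    op: str
    right: str


@dataclass
class Assign:
    """target = left op right (hypothetical AST node)."""
    target: str
    left: str
    op: str
    right: str


@dataclass
class IfElse:
    cond: Rel
    then: object
    els: object


@dataclass
class While:
    cond: Rel
    body: object


class Gen:
    def __init__(self):
        self._labels = count(1)
        self._temps = count(1)

    def newlabel(self):
        return f"L{next(self._labels)}"

    def newtemp(self):
        return f"t{next(self._temps)}"

    def cond(self, e, true, false):
        return [f"if {e.left} {e.op} {e.right} goto {true}", f"goto {false}"]

    def stmt(self, s, next_label):
        """next_label plays the role of the inherited attribute S.next."""
        if isinstance(s, Assign):
            t = self.newtemp()
            return [f"{t} = {s.left} {s.op} {s.right}", f"{s.target} = {t}"]
        if isinstance(s, IfElse):
            e_true, e_false = self.newlabel(), self.newlabel()
            return (self.cond(s.cond, e_true, e_false)
                    + [f"{e_true}:"] + self.stmt(s.then, next_label)
                    + [f"goto {next_label}"]
                    + [f"{e_false}:"] + self.stmt(s.els, next_label))
        if isinstance(s, While):
            begin, e_true = self.newlabel(), self.newlabel()
            return ([f"{begin}:"]
                    + self.cond(s.cond, e_true, next_label)   # E.false = S.next
                    + [f"{e_true}:"] + self.stmt(s.body, begin)  # S1.next = S.begin
                    + [f"goto {begin}"])
        raise TypeError(f"unsupported node: {s!r}")


# while a < b do if c < d then x = y + z else x = y - z
loop = While(Rel("a", "<", "b"),
             IfElse(Rel("c", "<", "d"),
                    Assign("x", "y", "+", "z"),
                    Assign("x", "y", "-", "z")))
program = Gen().stmt(loop, "Lnext") + ["Lnext:"]
print("\n".join(program))
```

Passing `begin` as the body's `next_label` is what makes the then-branch jump straight back to the loop test instead of to a separate join point.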
  • 76. Loop Constructs
    Loops
      • Evaluate condition before loop (if needed)
      • Evaluate condition after loop
      • Branch back to the top (if needed)
    Why this structure?
      • Merges test with last block of loop body
      • Pre-test block to hold loop-invariant code
      • Post-test for increment instructions and test
    while, for, do, and until all fit this basic model
    [Figure: control-flow graph with blocks Pre-test, Loop head, B1, B2, Post-test, and Next block]
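As a rough illustration of this loop shape, the sketch below lays out a loop as an optional pre-test, the body, a post-test block holding any increment code, and a conditional branch back to the head. The function `gen_loop`, the `NEGATE` table, and the label names `Lhead` and `Lnext` are assumptions made for this example.

```python
# Negation of each relational operator, used to build the pre-test branch.
NEGATE = {"<": ">=", "<=": ">", ">": "<=", ">=": "<", "==": "!=", "!=": "=="}


def gen_loop(cond, body, post=(), head="Lhead", nxt="Lnext", pretest=True):
    """Lay out a loop as: [pre-test] head: body; post-test; branch back."""
    left, op, right = cond
    code = []
    if pretest:
        # Evaluate the condition before the loop and skip it if it is false.
        code.append(f"if {left} {NEGATE[op]} {right} goto {nxt}")
    code.append(f"{head}:")
    code.extend(body)
    code.extend(post)  # increment instructions live in the post-test block
    # Evaluate the condition after the loop body and branch back to the top.
    code.append(f"if {left} {op} {right} goto {head}")
    code.append(f"{nxt}:")
    return code


# for (...; i < n; i = i + 1) s = s + i
for_code = gen_loop(("i", "<", "n"), ["s = s + i"], post=["i = i + 1"])
# do s = s + i while i < n  (no pre-test needed)
do_code = gen_loop(("i", "<", "n"), ["s = s + i"], pretest=False)
print("\n".join(for_code))
```

With this layout each iteration executes a single conditional branch, since the test is merged into the last block of the loop.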
  • 77. Break & Skip Statements
    Many modern programming languages include a break
      • Exits from the innermost control-flow statement
        – Out of the innermost loop
        – Out of a case statement
    Translates into a jump
      • Targets a statement outside the control-flow construct
      • Creates a multiple-exit construct
      • skip in a loop goes to the next iteration
    Only makes sense if the loop has more than one block
    [Figure: control-flow graph with blocks Pre-test, Loop head, B1, B2, Post-test, and Next block; a break in B1 and a skip in B2]
  • 78. Break and Skip Statements
    • Need to keep track of enclosing control-flow constructs
    • Harder to have a clean SDT scheme…
      – Keep a stack of control-flow constructs
      – Use the S.next of the construct on top of the stack as the target of the break statement
      – For skip statements, keep track of the label of the post-test block, which advances to the next iteration. This is harder, since that code has not been generated yet.
    • Backpatching helps
      – Use a breaklist and a skiplist to be patched later.
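A minimal sketch of the stack-plus-lists idea follows. The `Emitter` class and its method names are hypothetical, and instruction addresses start at 0 for simplicity. Each enclosing loop gets its own breaklist and skiplist, and the unfinished jumps are patched once the post-test address and the exit address are known.

```python
class Emitter:
    """Emits instructions into an array and tracks enclosing loops."""

    def __init__(self):
        self.code = []
        self.loops = []  # stack of enclosing control-flow constructs

    def nextquad(self):
        return len(self.code)

    def emit(self, instr):
        self.code.append(instr)
        return len(self.code) - 1  # address of the emitted instruction

    def backpatch(self, places, target):
        for p in places:
            if not self.code[p].endswith("goto _"):
                raise ValueError(f"instruction {p} has no open jump target")
            self.code[p] = self.code[p][:-1] + str(target)

    def begin_loop(self):
        self.loops.append({"breaklist": [], "skiplist": []})

    def emit_break(self):
        self.loops[-1]["breaklist"].append(self.emit("goto _"))

    def emit_skip(self):
        self.loops[-1]["skiplist"].append(self.emit("goto _"))

    def end_loop(self, post_test_addr, next_addr):
        ctx = self.loops.pop()
        self.backpatch(ctx["skiplist"], post_test_addr)  # skip: next iteration
        self.backpatch(ctx["breaklist"], next_addr)      # break: leave the loop


e = Emitter()
e.begin_loop()
head = e.nextquad()
e.emit(f"if x != 0 goto {e.nextquad() + 2}")  # jump over the skip
e.emit_skip()
e.emit(f"if y != 0 goto {e.nextquad() + 2}")  # jump over the break
e.emit_break()
e.emit("s = s + x")
post_test = e.nextquad()                      # post-test block starts here
e.emit("i = i + 1")
e.emit(f"if i < n goto {head}")
e.end_loop(post_test, e.nextquad())
for addr, instr in enumerate(e.code):
    print(f"{addr}: {instr}")
```

The skip is emitted before the post-test block exists, which is exactly the case the slide calls harder; recording its address in the skiplist defers the decision until `end_loop`.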
  • 79. Backpatching
    • Single-pass solution to code generation?
      – No more symbolic labels; symbolic addresses instead
      – Emit code directly into an array of instructions
      – Actions associated with productions
      – Executed when the bottom-up parser "reduces" a production
    • Problem
      – Need to know the labels of branch targets before actually generating the code for them.
    • Solution
      – Leave branches undefined and patch them later
      – Requires carrying around a list of the places that need to be patched until the value to patch them with is known.
  • 80. Boolean Expressions Revisited
    • Use an additional ε-production
      – Just a marker M
      – Label value M.addr
    • Attributes:
      – E.truelist: code places that need to be filled in, corresponding to the evaluation of E as "true".
      – E.falselist: same for "false"
    Grammar:
      (1) E → E1 or M E2
      (2)   | E1 and M E2
      (3)   | not E1
      (4)   | ( E1 )
      (5)   | id1 relop id2
      (6)   | true
      (7)   | false
      (8) M → ε
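  • 81. Boolean Expressions: Code Outline
    [Figure: code layouts for "E1 and E2" and "E1 or E2". In both, E1.code is followed by E2.code. For "and", the true exit of E1 leads into E2.code; for "or", the false exit of E1 leads into E2.code. Exits marked "?" are targets that are not yet known.]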
  • 82. Actions
    (8) M → ε { M.addr := nextAddr; }
    (1) E → E1 or M E2 { backpatch(E1.falselist, M.addr); E.truelist := merge(E1.truelist, E2.truelist); E.falselist := E2.falselist; }
    (2) E → E1 and M E2 { backpatch(E1.truelist, M.addr); E.truelist := E2.truelist; E.falselist := merge(E1.falselist, E2.falselist); }
  • 83. More Actions
    (3) E → not E1 { E.truelist := E1.falselist; E.falselist := E1.truelist; }
    (4) E → ( E1 ) { E.truelist := E1.truelist; E.falselist := E1.falselist; }
    (6) E → true { E.truelist := makelist(nextquad); emit('goto _'); }
    (7) E → false { E.falselist := makelist(nextquad); emit('goto _'); }
  • 84. Backpatching Example
    Expression: a < b or c < d and e < f
    [Figure: parse tree annotated with E.truelist, E.falselist, and M.addr at each node, shown next to the action being executed and the generated code]
    Step 1: reduce each E → id1 relop id2 and each M → ε
      { E.truelist := makelist(nextquad()); E.falselist := makelist(nextquad() + 1); emit("if id1.place relop.op id2.place goto _"); emit("goto _"); }
      { M.quad = nextquad(); }
      100: if a < b goto _
      101: goto _
      102: if c < d goto _
      103: goto _
      104: if e < f goto _
      105: goto _
      a < b: E.truelist = {100}, E.falselist = {101}
      c < d: E.truelist = {102}, E.falselist = {103}
      e < f: E.truelist = {104}, E.falselist = {105}
      M before c < d: M.addr = 102; M before e < f: M.addr = 104
    Step 2: reduce E → E1 and M E2
      { backpatch(E1.truelist, M.quad); E.truelist := E2.truelist; E.falselist := merge(E1.falselist, E2.falselist); }
      102: if c < d goto 104
      103: goto _
      E.truelist = {104}, E.falselist = {103, 105}
    Step 3: reduce E → E1 or M E2
      { backpatch(E1.falselist, M.quad); E.truelist := merge(E1.truelist, E2.truelist); E.falselist := E2.falselist; }
      100: if a < b goto _
      101: goto 102
      E.truelist = {100, 104}, E.falselist = {103, 105}
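The example can be replayed with a short sketch that calls the semantic actions in the order a bottom-up parser would reduce. The class `Quads`, the record `E`, and the function names `relop`, `reduce_and`, `reduce_or` are hypothetical; the starting address 100 is taken from the slide.

```python
from dataclasses import dataclass


class Quads:
    """Instruction array with symbolic addresses starting at `base`."""

    def __init__(self, base=100):
        self.base = base
        self.code = []

    def nextquad(self):
        return self.base + len(self.code)

    def emit(self, instr):
        self.code.append(instr)

    def backpatch(self, places, target):
        for p in places:
            i = p - self.base
            if not self.code[i].endswith("goto _"):
                raise ValueError(f"instruction {p} has no open jump target")
            self.code[i] = self.code[i][:-1] + str(target)


def makelist(addr):
    return [addr]


def merge(a, b):
    return a + b


@dataclass
class E:
    truelist: list
    falselist: list


def relop(q, left, op, right):            # (5) E -> id1 relop id2
    e = E(makelist(q.nextquad()), makelist(q.nextquad() + 1))
    q.emit(f"if {left} {op} {right} goto _")
    q.emit("goto _")
    return e


def reduce_and(q, e1, m, e2):             # (2) E -> E1 and M E2
    q.backpatch(e1.truelist, m)
    return E(e2.truelist, merge(e1.falselist, e2.falselist))


def reduce_or(q, e1, m, e2):              # (1) E -> E1 or M E2
    q.backpatch(e1.falselist, m)
    return E(merge(e1.truelist, e2.truelist), e2.falselist)


# a < b or c < d and e < f, reduced bottom-up
q = Quads()
e_ab = relop(q, "a", "<", "b")
m_or = q.nextquad()                        # (8) M -> ε
e_cd = relop(q, "c", "<", "d")
m_and = q.nextquad()                       # (8) M -> ε
e_ef = relop(q, "e", "<", "f")
e_and = reduce_and(q, e_cd, m_and, e_ef)
e_all = reduce_or(q, e_ab, m_or, e_and)
for offset, instr in enumerate(q.code):
    print(f"{q.base + offset}: {instr}")
```

The jumps left as `goto _` at 100, 103, 104, and 105 are the entries of the final truelist and falselist; they stay open until an enclosing statement supplies their targets.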
  • 85. Control Flow Code Structures
    if E then S1
               E.code
      E.true:  S1.code
      E.false: . . .
    if E then S1 else S2
               E.code
      E.true:  S1.code
               goto S.next
      E.false: S2.code
      S.next:  . . .
    while E do S1
      S.begin: E.code
      E.true:  S1.code
               goto S.begin
      E.false: . . .