Course Overview
PART I: overview material
1 Introduction
2 Language processors (tombstone diagrams, bootstrapping)
3 Architecture of a compiler
PART II: inside a compiler
4 Syntax analysis
5 Contextual analysis
6 Runtime organization
7 Code generation
PART III: conclusion
8 Interpretation
9 Review

Chapter 3 Compilation
So far we have treated language processors (including compilers) as “black boxes”.
GOAL this lecture:
– A first look "inside the box": how to build compilers.
– Different “phases” and their relationships.
Compilation (Chapter 3) 1 Compilation (Chapter 3) 2
The Major “Phases” of a Compiler
Source Program
  --> Syntax Analysis        --> Error Reports
Abstract Syntax Tree
  --> Contextual Analysis    --> Error Reports
Decorated Abstract Syntax Tree
  --> Code Generation
Object Code

Different Phases of a Compiler
The different phases can be seen as different transformation steps to transform source code into object code.
The different phases correspond roughly to the different parts of the language specification:
• Syntax analysis <-> Syntax
• Contextual analysis <-> Contextual constraints
• Code generation <-> Semantics
Example Program
We now look at each of the three different phases in a little more detail. We look at each of the steps in transforming an example Triangle program into TAM code.

! This program is useless except for
! illustration
let var n: integer;
    var c: char
in begin
  c := '&';
  n := n+1
end

1) Syntax Analysis
Source Program
  --> Syntax Analysis        --> Error Reports
Abstract Syntax Tree
Note: Not all compilers construct an explicit representation of an AST (e.g. in a “single pass compiler” there is generally no need to construct an AST).
1) Syntax Analysis --> AST
Program
  LetCommand
    SequentialDeclaration
      VarDecl
        Ident n
        SimpleT
          Ident Integer
      VarDecl
        Ident c
        SimpleT
          Ident Char
    SequentialCommand
      AssignCommand
        SimpleV
          Ident c
        Char.Expr
          Char.Lit '&'
      AssignCommand
        SimpleV
          Ident n
        BinaryExpr
          VNameExp
            SimpleV
              Ident n
          Op +
          Int.Expr
            Int.Lit 1

2) Contextual Analysis --> Decorated AST
Abstract Syntax Tree
  --> Contextual Analysis    --> Error Reports
Decorated Abstract Syntax Tree
Contextual analysis:
• Scope checking: verify that all applied occurrences of identifiers are declared.
• Type checking: verify that all operations in the program are used according to their type rules.
Annotate AST:
• Applied identifier occurrences => declaration
• Expressions => Type
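The two checks above can be sketched in Java on a cut-down AST. This is a minimal illustration, not the Triangle Checker: every class and method name here (ContextualAnalysisSketch, declare, lookup, typeOf, check) is hypothetical, the only command is assignment, and the sketch returns types instead of storing them on the tree nodes as a real decorator would.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical miniature of contextual analysis (assumed names throughout).
class ContextualAnalysisSketch {
    enum Type { INT, CHAR, ERROR }

    // Expressions: literals, applied occurrences of variables, and '+'.
    interface Expr {}
    record IntLit(int value) implements Expr {}
    record CharLit(char value) implements Expr {}
    record VarRef(String name) implements Expr {}
    record Plus(Expr left, Expr right) implements Expr {}

    // The only command in this sketch: target := value
    record Assign(String target, Expr value) {}

    final Map<String, Type> declared = new HashMap<>(); // identification table
    final List<String> errors = new ArrayList<>();

    void declare(String name, Type type) { declared.put(name, type); }

    // Scope checking: an applied occurrence must have a declaration.
    Type lookup(String name) {
        Type t = declared.get(name);
        if (t == null) {
            errors.add("SCOPE ERROR: undeclared variable " + name);
            return Type.ERROR;
        }
        return t;
    }

    // Type checking: compute the type of an expression bottom-up.
    Type typeOf(Expr e) {
        if (e instanceof IntLit) return Type.INT;
        if (e instanceof CharLit) return Type.CHAR;
        if (e instanceof VarRef v) return lookup(v.name());
        Plus p = (Plus) e;
        Type l = typeOf(p.left()), r = typeOf(p.right());
        if (l == Type.ERROR || r == Type.ERROR) return Type.ERROR; // already reported
        if (l != Type.INT || r != Type.INT) {
            errors.add("TYPE ERROR: '+' needs two int operands");
            return Type.ERROR;
        }
        return Type.INT;
    }

    // Both sides of an assignment must have the same type.
    void check(Assign a) {
        Type lhs = lookup(a.target());
        Type rhs = typeOf(a.value());
        if (lhs != Type.ERROR && rhs != Type.ERROR && lhs != rhs) {
            errors.add("TYPE ERROR: incompatible types in assignment to " + a.target());
        }
    }

    public static void main(String[] args) {
        ContextualAnalysisSketch c = new ContextualAnalysisSketch();
        c.declare("n", Type.INT);
        c.declare("c", Type.CHAR);
        c.check(new Assign("c", new CharLit('&')));                          // well formed
        c.check(new Assign("n", new Plus(new VarRef("n"), new IntLit(1))));  // well formed
        c.check(new Assign("c", new Plus(new VarRef("n"), new IntLit(1))));  // type error
        c.check(new Assign("n", new VarRef("foo")));                         // scope error
        c.errors.forEach(System.out::println);
    }
}
```

The two well-formed assignments are those of the example program; the last two calls are made-up ill-formed commands that trigger one type error and one scope error.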
2) Contextual Analysis --> Decorated AST
Program
  LetCommand
    SequentialDeclaration
      VarDecl
        Ident n
        SimpleT
          Ident Integer
      VarDecl
        Ident c
        SimpleT
          Ident Char
    SequentialCommand
      AssignCommand
        SimpleV :char
          Ident c
        Char.Expr :char
          Char.Lit '&'
      AssignCommand
        SimpleV :int
          Ident n
        BinaryExpr :int
          VNameExp :int
            SimpleV :int
              Ident n
          Op +
          Int.Expr :int
            Int.Lit 1

Contextual Analysis
Finds scope and type errors.
Example 1:
  AssignCommand              ***TYPE ERROR (incompatible types in AssignCommand)
    :char
    :int
Example 2:
  SimpleV                    ***SCOPE ERROR (undeclared variable foo)
    Ident foo                foo not found
3) Code Generation
Decorated Abstract Syntax Tree
  --> Code Generation
Object Code
• Assumes that program has been thoroughly checked and is well formed (scope & type rules)
• Takes into account semantics of the source language as well as the target language.
• Transforms source program into target code.

3) Code Generation
let var n: integer;
    var c: char
in begin
  c := '&';
  n := n+1
end

PUSH 2
LOADL 38
STORE 1[SB]
LOAD 0[SB]
LOADL 1
CALL add
STORE 0[SB]
POP 2
HALT

VarDecl (address = 0[SB])
  Ident n
  SimpleT
    Ident Integer
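As a rough illustration of this phase, the Java sketch below walks a tiny AST for the example program and emits the TAM-style listing shown on the slide. It is not the Triangle Encoder: the class CodeGenSketch and its methods are hypothetical, every variable is assumed to occupy one word, and instructions are produced as plain strings in the slide's notation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical miniature code generator emitting TAM-style text (assumed names).
class CodeGenSketch {
    interface Expr {}
    record IntLit(int value) implements Expr {}
    record CharLit(char value) implements Expr {}
    record VarRef(String name) implements Expr {}
    record Plus(Expr left, Expr right) implements Expr {}
    record Assign(String target, Expr value) {}

    final Map<String, Integer> address = new LinkedHashMap<>(); // variable -> offset from SB
    final List<String> code = new ArrayList<>();

    // Emit code that leaves the value of the expression on top of the stack.
    void encode(Expr e) {
        if (e instanceof IntLit i) code.add("LOADL " + i.value());
        else if (e instanceof CharLit c) code.add("LOADL " + (int) c.value());
        else if (e instanceof VarRef v) code.add("LOAD " + address.get(v.name()) + "[SB]");
        else {
            Plus p = (Plus) e;
            encode(p.left());
            encode(p.right());
            code.add("CALL add");
        }
    }

    // Evaluate the right-hand side, then store it at the target's address.
    void encode(Assign a) {
        encode(a.value());
        code.add("STORE " + address.get(a.target()) + "[SB]");
    }

    // let <vars> in begin <commands> end, assuming one word per variable.
    List<String> encodeLet(List<String> vars, List<Assign> commands) {
        for (String v : vars) address.put(v, address.size());
        code.add("PUSH " + vars.size());
        for (Assign a : commands) encode(a);
        code.add("POP " + vars.size());
        code.add("HALT");
        return code;
    }

    public static void main(String[] args) {
        List<String> listing = new CodeGenSketch().encodeLet(
            List.of("n", "c"),
            List.of(new Assign("c", new CharLit('&')),
                    new Assign("n", new Plus(new VarRef("n"), new IntLit(1)))));
        listing.forEach(System.out::println);
    }
}
```

The generator does no checking of its own: it relies on the earlier phases having rejected ill-formed programs, which is why a lookup in the address table is assumed to succeed.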
Compiler Passes
• A “pass” is a complete traversal of the source program, or a complete traversal of some internal representation of the source program (such as an AST).
• A pass can correspond to a “phase” but it does not have to!
• Sometimes a single pass corresponds to several phases that are interleaved in time.
• What and how many passes a compiler does over the source program is an important design decision.

Single Pass Compiler
A single pass compiler makes a single pass over the source text, parsing, analyzing, and generating code all at once.
Dependency diagram of a typical Single Pass Compiler:
Compiler Driver
  calls Syntactic Analyzer
    calls Contextual Analyzer
    calls Code Generator
Multi Pass Compiler
A multi pass compiler makes several passes over the program. The output of a preceding phase is stored in a data structure and used by subsequent phases.
Dependency diagram of a typical Multi Pass Compiler:
Compiler Driver
  calls Syntactic Analyzer     (input: Source Text,   output: AST)
  calls Contextual Analyzer    (input: AST,           output: Decorated AST)
  calls Code Generator         (input: Decorated AST, output: Object Code)

Example: Single Pass Compilation of ...
let var n: integer;
    var c: char
in begin
  c := '&';
  n := n+1
end

PUSH 2
LOADL 38
STORE 1[SB]
LOAD 0[SB]
LOADL 1
CALL add
STORE 0[SB]
POP 2
HALT

Ident   Type   Address
n       int    0[SB]
c       char   1[SB]
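The single pass idea can be sketched in Java for a tiny Triangle-like subset. The sketch below is an assumption-laden illustration, not the Triangle compiler: the class SinglePassSketch and all its methods are hypothetical, tokens must be separated by whitespace, only integer and char variables, assignment and + are supported, and the first error aborts compilation. The parser drives everything: each declaration is entered into the (Ident, Type, Address) table as soon as it is parsed, and each expression is checked and translated while it is being parsed, so no AST is built.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single pass compiler for a tiny subset (assumed names and grammar).
class SinglePassSketch {
    record Entry(String type, int address) {}              // one row of the table

    private final String[] tokens;
    private int pos = 0;
    private final Map<String, Entry> table = new HashMap<>();
    final List<String> code = new ArrayList<>();

    SinglePassSketch(String source) { tokens = source.trim().split("\\s+"); }

    private String next() { return tokens[pos++]; }
    private boolean accept(String t) {
        if (pos < tokens.length && tokens[pos].equals(t)) { pos++; return true; }
        return false;
    }
    private void expect(String t) {
        String got = next();
        if (!got.equals(t)) throw new IllegalStateException("expected " + t + " but found " + got);
    }
    private Entry lookup(String name) {                     // scope check
        Entry e = table.get(name);
        if (e == null) throw new IllegalStateException("SCOPE ERROR: undeclared variable " + name);
        return e;
    }

    // Program ::= let Decl (; Decl)* in begin Cmd (; Cmd)* end
    List<String> compile() {
        expect("let");
        do parseDecl(); while (accept(";"));
        code.add("PUSH " + table.size());                   // one word per variable
        expect("in");
        expect("begin");
        do parseCmd(); while (accept(";"));
        expect("end");
        code.add("POP " + table.size());
        code.add("HALT");
        return code;
    }

    // Decl ::= var Ident : (integer | char)
    private void parseDecl() {
        expect("var");
        String name = next();
        expect(":");
        String type = next();
        if (!type.equals("integer") && !type.equals("char"))
            throw new IllegalStateException("unknown type " + type);
        table.put(name, new Entry(type, table.size()));
    }

    // Cmd ::= Ident := Expr   (type check, then emit STORE)
    private void parseCmd() {
        Entry target = lookup(next());
        expect(":=");
        String type = parseExpr();
        if (!type.equals(target.type()))
            throw new IllegalStateException("TYPE ERROR: incompatible types in assignment");
        code.add("STORE " + target.address() + "[SB]");
    }

    // Expr ::= Primary (+ Primary)*   returns the type of the expression
    private String parseExpr() {
        String type = parsePrimary();
        while (accept("+")) {
            String right = parsePrimary();
            if (!type.equals("integer") || !right.equals("integer"))
                throw new IllegalStateException("TYPE ERROR: '+' needs integer operands");
            code.add("CALL add");
        }
        return type;
    }

    // Primary ::= IntLit | CharLit | Ident
    private String parsePrimary() {
        String t = next();
        if (t.matches("\\d+")) { code.add("LOADL " + t); return "integer"; }
        if (t.matches("'.'")) { code.add("LOADL " + (int) t.charAt(1)); return "char"; }
        Entry e = lookup(t);
        code.add("LOAD " + e.address() + "[SB]");
        return e.type();
    }

    public static void main(String[] args) {
        String src = "let var n : integer ; var c : char in begin c := '&' ; n := n + 1 end";
        new SinglePassSketch(src).compile().forEach(System.out::println);
    }
}
```

Because a variable's table entry must exist before code for a use of it can be emitted, this design only works when every identifier is declared before its first use.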
Compiler Design Issues
                        Single Pass                  Multi Pass
Speed                   better                       worse
Memory                  better for large programs    (potentially) better for small programs
Modularity              worse                        better
Flexibility             worse                        better
“Global” optimization   impossible                   possible
Source Language         single pass compilers are not possible for many programming languages

Language Issues
Example Pascal:
Pascal was explicitly designed to be easy to implement with a single pass compiler:
– Every identifier must be declared before its first use.

var n:integer;
procedure inc;
begin
  n:=n+1
end

?

procedure inc;
begin
  n:=n+1          Undeclared Variable!
end;
var n:integer;
Language Issues
Example Pascal:
– Every identifier must be declared before it is used.
– How to handle mutual recursion then?

procedure ping(x:integer);
begin
  ... pong(x-1); ...
end;
procedure pong(x:integer);
begin
  ... ping(x-1); ...
end;

With a forward declaration of pong:

procedure pong(x:integer); forward;
procedure ping(x:integer);
begin
  ... pong(x-1); ...          OK!
end;
procedure pong(x:integer);
begin
  ... ping(x-1); ...
end;
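Why the forward declaration helps a single pass compiler can be sketched in Java. This is an illustrative model only: the class DeclareBeforeUseSketch, the Proc record and the check method are hypothetical, and a program is reduced to a list of procedure headings in textual order, each with the names its body calls. A forward declaration is modelled as a heading with no calls.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the declare-before-use rule in one pass (assumed names).
class DeclareBeforeUseSketch {
    // A procedure as met in textual order; 'calls' lists the procedures its body calls.
    record Proc(String name, List<String> calls) {}

    // A forward declaration introduces the name but has no body yet.
    static Proc forward(String name) { return new Proc(name, List.of()); }

    // One pass in textual order: a call is legal only if the callee is already known.
    static List<String> check(List<Proc> program) {
        Set<String> known = new HashSet<>();
        List<String> errors = new ArrayList<>();
        for (Proc p : program) {
            known.add(p.name()); // the heading makes the name visible, also to its own body
            for (String callee : p.calls()) {
                if (!known.contains(callee)) {
                    errors.add("undeclared identifier " + callee + " in " + p.name());
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Proc ping = new Proc("ping", List.of("pong"));
        Proc pong = new Proc("pong", List.of("ping"));
        System.out.println(check(List.of(ping, pong)));                  // pong is used before it is declared
        System.out.println(check(List.of(forward("pong"), ping, pong))); // no errors
    }
}
```

The forward declaration costs the programmer one extra line but lets the compiler resolve the call to pong at the moment it parses the body of ping, without a second pass.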
Example: The Triangle Compiler Driver
public class Compiler {
  public static void compileProgram(...) {
    Parser parser = new Parser(...);
    Checker checker = new Checker(...);
    Encoder generator = new Encoder(...);
    Program theAST = parser.parse();   // first pass
    checker.check(theAST);             // second pass
    generator.encode(theAST);          // third pass
  }
  public static void main(String[] args) {
    ... compileProgram(...); ...
  }
}