2. UNIT I Compiling TAMILSELVI T 2
COMPILERS
Compilers are basically translators.
A compiler is a program that takes a program written in a source
language and translates it into an equivalent program in a target
language.
Analysis of Source Program
Analysis consists of three phases:
Linear Analysis (Lexical Analysis or Scanning)
Group the source string into Tokens
Hierarchical Analysis (Syntax Analysis or Parsing)
Tokens are grouped into nested collections
Semantic Analysis
Checks that the components of the source program fit together meaningfully
Lexical Analysis
Linear Analysis (Lexical Analysis or Scanning)
Scans the complete source code
The source program is broken up into groups of characters called tokens
Example: position := initial+rate*60
The identifier position
The assignment symbol :=
The identifier initial
The plus sign
The identifier rate
The multiplication sign
The constant 60
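The token list above can be reproduced with a small scanner. This is a minimal sketch, not a full lexical analyzer: the token class names (ID, NUMBER, ASSIGN, PLUS, TIMES) and the function name tokenize are assumptions made for illustration, and only the symbols that occur in the example statement are handled.

```python
import re

# Hypothetical token classes; each pair is (class name, regular expression).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z][A-Za-z0-9]*"),
    ("ASSIGN", r":="),
    ("PLUS",   r"\+"),
    ("TIMES",  r"\*"),
    ("SKIP",   r"\s+"),   # white space separates tokens and is discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Scan source left to right and return a list of (class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for token in tokenize("position := initial+rate*60"):
    print(token)
```

Each printed pair corresponds to one entry of the list: three identifiers, the assignment symbol, the plus sign, the multiplication sign, and the constant 60.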
Syntax Analysis
Hierarchical Analysis (Parsing)
Groups the tokens into grammatical phrases
Grammatical phrases are represented by a parse tree or syntax tree
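As an illustration of hierarchical analysis, the sketch below parses the statement position := initial+rate*60 into a syntax tree written as nested tuples. It assumes a tiny grammar chosen for this example only (assignment, +, *, identifiers, integer constants, with * binding tighter than +), and the function name parse_assignment is hypothetical. The built-in scanner is deliberately simplified and silently skips characters it does not know.

```python
import re

def parse_assignment(source):
    """Parse 'id := expr' and return the syntax tree as nested tuples."""
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9]*|\d+|:=|[+*]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():                      # factor -> id | number
        token = peek()
        if token is None or not token[0].isalnum():
            raise SyntaxError(f"expected identifier or constant, got {token!r}")
        return advance()

    def term():                        # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                        # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = peek()
    if target is None or not target[0].isalpha():
        raise SyntaxError("expected an identifier on the left of :=")
    advance()
    if peek() != ":=":
        raise SyntaxError("expected :=")
    advance()
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse_assignment("position := initial+rate*60"))
```

The nesting of the tuples mirrors the tree: the multiplication is a child of the addition, which is the right child of the assignment.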
Semantic Analysis
Type checking
Checks the source program for semantic errors
Collects type information
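A minimal sketch of type checking on an expression tree follows. It assumes only two types, int and real, a tree given as nested tuples, and a hypothetical helper named annotate; where an integer operand meets a real one, the integer side is wrapped in an inttoreal conversion node. Assignment nodes are not handled in this sketch.

```python
def annotate(node, symbol_types):
    """Return (node, type), inserting inttoreal where an int meets a real."""
    if isinstance(node, str):
        if node.isdigit():
            return node, "int"                 # integer constant
        if node not in symbol_types:
            raise NameError(f"undeclared identifier {node!r}")
        return node, symbol_types[node]
    op, left, right = node
    left, left_type = annotate(left, symbol_types)
    right, right_type = annotate(right, symbol_types)
    if left_type != right_type:
        # Mixed int/real operands: convert the integer side to real.
        if left_type == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (op, left, right), "real"
    return (op, left, right), left_type

# Assumption for this example: both identifiers were declared as real.
declared = {"initial": "real", "rate": "real"}
print(annotate(("+", "initial", ("*", "rate", "60")), declared))
```

An identifier missing from the declared types is reported as a semantic error through NameError.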
Intermediate Code Generation
Three Address Code
Sequence of instructions, each of which has at most three
operands
Properties:
Easy to produce
Easy to translate into target program
Example
t1 := inttoreal(60)
t2 := id3*t1
t3 := id2+t2
id1 := t3
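The example above can be generated by walking a syntax tree bottom-up and giving every interior node a fresh temporary. The following is a sketch under assumptions: the tree is written as nested tuples, the function name three_address_code is hypothetical, and identifiers already appear as id1, id2, id3.

```python
def three_address_code(tree):
    """Translate (':=', target, expr) into a list of three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        if isinstance(node, str):          # identifier or constant: no code needed
            return node
        if node[0] == "inttoreal":         # unary conversion node
            argument = emit(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({argument})")
            return temp
        op, left, right = node             # binary operator node
        left_name = emit(left)
        right_name = emit(right)
        temp = new_temp()
        code.append(f"{temp} := {left_name}{op}{right_name}")
        return temp

    _, target, expr = tree
    result = emit(expr)
    code.append(f"{target} := {result}")
    return code

tree = (":=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", "60"))))
for instruction in three_address_code(tree):
    print(instruction)
```

For this tree the printed instructions are the four shown in the example.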
Code Optimization
Improves the intermediate code so that the target program
runs faster
Example
t1 := id3*60.0
id1 := id2+t1
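A sketch of how this improvement could be obtained from the four-instruction intermediate code (t1 := inttoreal(60), t2 := id3*t1, t3 := id2+t2, id1 := t3). It assumes instructions are stored as (destination, operator, argument1, argument2) tuples, that temporaries are named t followed by digits, and that every temporary is used exactly once; the function name optimize and the operator name copy are hypothetical. Two simple steps are applied: the conversion of a constant is done at compile time, and a copy from a temporary is merged into the instruction that computed it.

```python
import re

def is_temp(name):
    return name is not None and re.fullmatch(r"t\d+", name) is not None

def optimize(code):
    # Step 1: evaluate inttoreal on integer constants at compile time.
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttoreal" and a.isdigit():
            constants[dest] = f"{a}.0"
            continue
        folded.append((dest, op, a, b))

    # Step 2: merge 'x := t' into the preceding instruction that defined t.
    merged = []
    for dest, op, a, b in folded:
        if op == "copy" and merged and is_temp(a) and merged[-1][0] == a:
            previous = merged.pop()
            merged.append((dest,) + previous[1:])
            continue
        merged.append((dest, op, a, b))

    # Step 3: renumber the remaining temporaries from t1.
    rename = {}
    for dest, _, _, _ in merged:
        if is_temp(dest) and dest not in rename:
            rename[dest] = f"t{len(rename) + 1}"
    return [tuple(rename.get(field, field) for field in instr) for instr in merged]

code = [
    ("t1", "inttoreal", "60", None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
for dest, op, a, b in optimize(code):
    print(f"{dest} := {a}{op}{b}")
```

The two printed instructions match the optimized example.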
Code Generation
Intermediate code instructions are translated into a sequence of
machine instructions
Example
MOVF id3,R2
MULF #60.0,R2
MOVF id2,R1
ADDF R2,R1
MOVF R1,id1
Symbol Table Management
A data structure containing a record for each identifier
Stores the information about the attributes of all identifiers
Attributes: name, type, size, etc.
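A symbol table of this kind can be sketched as a dictionary keyed by identifier name. The class name SymbolTable, its methods, and the sizes used below are assumptions for illustration; real compilers also track scope, storage location, and other attributes.

```python
class SymbolTable:
    """Holds one record (name, type, size) per identifier."""

    def __init__(self):
        self._records = {}

    def insert(self, name, type_name, size):
        if name in self._records:
            raise KeyError(f"identifier {name!r} is already declared")
        self._records[name] = {"name": name, "type": type_name, "size": size}

    def lookup(self, name):
        """Return the record for name, or None if it was never declared."""
        return self._records.get(name)

table = SymbolTable()
for identifier in ("position", "initial", "rate"):
    table.insert(identifier, "real", 8)   # size of 8 is an assumed value

print(table.lookup("rate"))
print(table.lookup("speed"))
```

The lexical analyzer typically enters the name, and later phases fill in or read the remaining attributes.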
Error Detection and Handling
Detects and handles errors (syntax errors, semantic errors)
Errors are reported in the form of messages
Cousins of Compiler
Preprocessors
Macro processing
File inclusion
Assemblers
Loaders and Link-Editors
Grouping of Phases
Front end
Depends primarily on the source language and is largely independent
of the target language
It includes lexical analysis, syntax analysis and semantic
analysis
Back end
Depends primarily on the target language and is largely independent
of the source language
It includes code optimization and code generation
Compiler Construction Tools
Parser Generator
Scanner Generator
Syntax directed translation Engine
Automatic code generator
Data flow engine
Lexical Analysis
Input is scanned completely to identify the tokens
Tokens (logical units)
Identifiers, keywords, operators, etc.
Specification of Tokens
Strings and Languages
A finite sequence of symbols is called a string
A set of strings over some alphabet is called a language
Operations on Languages
Concatenation
L1L2 = { s1s2 | s1 ∈ L1 and s2 ∈ L2 }
Union
L1 ∪ L2 = { s | s ∈ L1 or s ∈ L2 }
Kleene Closure
L* = ∪_{i=0}^{∞} L^i
Positive Closure
L+ = ∪_{i=1}^{∞} L^i
Regular Expressions
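The operations on languages can be tried out on small finite languages represented as Python sets of strings. This is a sketch: the function names are made up for illustration, and because L* and L+ are infinite in general, the closures here are cut off at an assumed maximum power.

```python
def concat(l1, l2):
    """L1L2: every string of L1 followed by every string of L2."""
    return {s1 + s2 for s1 in l1 for s2 in l2}

def union(l1, l2):
    """L1 ∪ L2: strings that are in L1 or in L2."""
    return l1 | l2

def power(language, i):
    """L^i: L concatenated with itself i times; L^0 holds only the empty string."""
    result = {""}
    for _ in range(i):
        result = concat(result, language)
    return result

def kleene_closure(language, max_power):
    """Union of L^0 .. L^max_power, a finite approximation of L*."""
    result = set()
    for i in range(max_power + 1):
        result |= power(language, i)
    return result

def positive_closure(language, max_power):
    """Union of L^1 .. L^max_power, a finite approximation of L+."""
    result = set()
    for i in range(1, max_power + 1):
        result |= power(language, i)
    return result

letters = {"a", "b"}
digits = {"0", "1"}
print(sorted(concat(letters, digits)))
print(sorted(union(letters, digits)))
print(sorted(kleene_closure({"a"}, 3)))
print(sorted(positive_closure({"a"}, 3)))
```

The only difference between the two closures is the starting index: L* includes L^0, so it always contains the empty string, while L+ contains it only if L does.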
Regular Expression
Notation for representing Tokens
Ex: Identifiers in Pascal
letter → A | B | ... | Z | a | b | ... | z
digit → 0 | 1 | ... | 9
id → letter (letter | digit ) *
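The definition of id can be written directly as a pattern for Python's re module. In this sketch the character class [A-Za-z] stands for letter, [A-Za-z0-9] stands for (letter | digit), and the helper name is_identifier is hypothetical.

```python
import re

# id -> letter (letter | digit)*
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def is_identifier(text):
    """True if the whole string matches letter (letter | digit)*."""
    return IDENTIFIER.fullmatch(text) is not None

for candidate in ("position", "rate2", "60", "2rate", ""):
    print(repr(candidate), is_identifier(candidate))
```

fullmatch is used so that the entire string must fit the pattern; a string that merely starts with a valid identifier is rejected.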
Recognition of Tokens
Finite Automata
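DFA (Deterministic Finite Automaton) – [Figure: transition diagram with states 0, 1, 2 and edges labeled a and b]
NFA (Nondeterministic Finite Automaton) – [Figure: transition diagram with states 0, 1, 2 and edges labeled a and b]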
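Recognition by a DFA amounts to following one transition per input symbol and checking whether the final state is accepting. The sketch below uses an assumed three-state DFA over the alphabet {a, b} that accepts strings ending in ab; it is chosen for illustration and is not necessarily the automaton drawn on the slide.

```python
# Assumed transition table: (state, symbol) -> next state.
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
START_STATE = 0
ACCEPTING_STATES = {2}

def accepts(text):
    """Run the DFA on text and report whether it stops in an accepting state."""
    state = START_STATE
    for symbol in text:
        if (state, symbol) not in TRANSITIONS:
            return False               # symbol outside the alphabet
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING_STATES

for word in ("ab", "aab", "bab", "abb", ""):
    print(repr(word), accepts(word))
```

Because the automaton is deterministic, each (state, symbol) pair has exactly one successor, so the loop never has to backtrack. An NFA would need a set of current states instead of a single one.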
Questions
Part -A
What are the issues in lexical analysis?
Define compiler.
What is a preprocessor?
What is a finite automaton?
Draw the transition diagram to represent relational operators.
Define lexeme and pattern.
Part -B
Explain the various phases of a compiler.
Explain the compiler construction tools.
Explain input buffering in detail.
Write about and explain the specification of tokens.
Construct an NFA from the regular expression.