SlideShare ist ein Scribd-Unternehmen logo
1 von 91
Language processing:
introduction to compiler
construction
Andy D. Pimentel
Computer Systems Architecture group
andy@science.uva.nl
http://www.science.uva.nl/~andy/taalverwerking.html
About this course
• This part will address compilers for programming
languages
• Depth-first approach
– Instead of covering all compiler aspects very briefly,
we focus on particular compiler stages
– Focus: optimization and compiler back issues

• This course is complementary to the compiler
course at the VU
• Grading: (heavy) practical assignment and one
or two take-home assignments
About this course (cont’d)
• Book
– Recommended, not compulsory: Seti, Aho and
Ullman,”Compilers Principles, Techniques and Tools”
(the Dragon book)
– Old book, but still more than sufficient
– Copies of relevant chapters can be found in the library

• Sheets are available at the website
• Idem for practical/take-home assignments,
deadlines, etc.
Topics
• Compiler introduction
– General organization

• Scanning & parsing
– From a practical viewpoint: LEX and YACC

• Intermediate formats
• Optimization: techniques and algorithms
–
–
–
–
–

Local/peephole optimizations
Global and loop optimizations
Recognizing loops
Dataflow analysis
Alias analysis
Topics (cont’d)
• Code generation
– Instruction selection
– Register allocation
– Instruction scheduling: improving ILP

• Source-level optimizations
– Optimizations for cache behavior
Compilers:
general organization
Compilers: organization
Source

Frontend IR

Optimizer IR

Machine
Backend
code

• Frontend
–
–
–
–

Dependent on source language
Lexical analysis
Parsing
Semantic analysis (e.g., type checking)
Compilers: organization (cont’d)
Source

Frontend IR

Optimizer IR

Machine
Backend
code

• Optimizer
–
–
–
–

Independent part of compiler
Different optimizations possible
IR to IR translation
Can be very computational intensive part
Compilers: organization (cont’d)
Source

Frontend IR

Optimizer IR

• Backend
–
–
–
–
–

Dependent on target processor
Code selection
Code scheduling
Register allocation
Peephole optimization

Machine
Backend
code
Frontend
Introduction to parsing
using LEX and YACC
Overview
• Writing a compiler is difficult requiring lots of time
and effort
• Construction of the scanner and parser is routine
enough that the process may be automated
Lexical Rules
Grammar
Semantics

Compiler
Compiler

Scanner
--------Parser
--------Code
generator
YACC
• What is YACC ?
– Tool which will produce a parser for a given
grammar.
– YACC (Yet Another Compiler Compiler) is a program
designed to compile a LALR(1) grammar and to
produce the source code of the syntactic analyzer of
the language produced by this grammar
– Input is a grammar (rules) and actions to take upon
recognizing a rule
– Output is a C program and optionally a header file of
tokens
LEX
• Lex is a scanner generator
– Input is description of patterns and actions
– Output is a C program which contains a function yylex()
which, when called, matches patterns and performs
actions per input
– Typically, the generated scanner performs lexical
analysis and produces tokens for the (YACC-generated)
parser
LEX and YACC: a team

How to
work ?
LEX and YACC: a team
call yylex()

[0-9]+

next token is NUM

NUM ‘+’ NUM
Availability
•
•
•
•
•

lex, yacc on most UNIX systems
bison: a yacc replacement from GNU
flex: fast lexical analyzer
BSD yacc
Windows/MS-DOS versions exist
YACC

Basic Operational Sequence
gram.y
yacc

y.tab.c
cc
or gcc

a.out

File containing desired
grammar in YACC format
YACC program
C source program created by YACC

C compiler
Executable program that will parse
grammar given in gram.y
YACC File Format
Definitions
%%
Rules
%%
Supplementary Code

The identical LEX format was
actually taken from this...
Rules Section
• Is a grammar
• Example
expr : expr '+' term | term;
term : term '*' factor | factor;
factor : '(' expr ')' | ID | NUM;
Rules Section
• Normally written like this
• Example:
expr

:
|
;
term
:
|
;
factor :
|
|
;

expr '+' term
term
term '*' factor
factor
'(' expr ')'
ID
NUM
Definitions Section
Example

%{
#include <stdio.h>
#include <stdlib.h>
%}
This is called a
terminal
%token ID NUM
%start expr
The start
symbol
(non-terminal)
Sidebar
• LEX produces a function called yylex()
• YACC produces a function called yyparse()
• yyparse() expects to be able to call yylex()
• How to get yylex()?
• Write your own!
• If you don't want to write your own: Use LEX!!!
Sidebar
int yylex()
{
if(it's a num)
return NUM;
else if(it's an id)
return ID;
else if(parsing is done)
return 0;
else if(it's an error)
return -1;
}
Semantic actions
expr :
|
;
term :
|
;
factor

expr '+' term
term

{ $$ = $1 + $3; }
{ $$ = $1; }

term '*' factor
factor

{ $$ = $1 * $3; }
{ $$ = $1; }

: '(' expr ')'
| ID
| NUM
;

{ $$ = $2; }
Semantic actions (cont’d)
$1
expr :
|
;
term :
|
;
factor

expr '+' term
term

{ $$ = $1 + $3; }
{ $$ = $1; }

term '*' factor
factor

{ $$ = $1 * $3; }
{ $$ = $1; }

: '(' expr ')'
| ID
| NUM
;

{ $$ = $2; }
Semantic actions (cont’d)
expr :
|
;
term :
|
;
factor

expr '+' term
term

{ $$ = $1 + $3; }
{ $$ = $1; }

term '*' factor
factor

{ $$ = $1 * $3; }
{ $$ = $1; }

: '(' expr ')'
| ID
| NUM
;

{ $$ = $2; }

$2
Semantic actions (cont’d)
expr :
|
;
term :
|
;
factor

expr '+' term
term

{ $$ = $1 + $3; }
{ $$ = $1; }

term '*' factor
factor

{ $$ = $1 * $3; }
{ $$ = $1; }

: '(' expr ')'
| ID
| NUM
;

{ $$ = $2; }

$3
Default: $$ = $1;
Bored, lonely? Try this!
yacc -d gram.y
• Will produce:
y.tab.h

Look at this and you'll
Look at this and you'll
never be unhappy again!
never be unhappy again!

yacc -v gram.y
• Will produce:
y.output

Shows "State
Shows "State
Machine"®
Machine"®
Example: LEX
%{
#include <stdio.h>
#include "y.tab.h"
%}
id
[_a-zA-Z][_a-zA-Z0-9]*
wspc
[ tn]+
semi
[;]
comma
[,]
%%
int
{ return INT; }
char
{ return CHAR; }
float
{ return FLOAT; }
{comma}
{ return COMMA; }
{semi}
{ return SEMI; }
{id}
{ return ID;}
{wspc}
{;}

/* Necessary? */

scanner.l
Example: Definitions
%{
#include <stdio.h>
#include <stdlib.h>
%}
%start line
%token CHAR, COMMA, FLOAT, ID, INT, SEMI
%%

decl.y
Example: Rules
/*
*
*
*

decl.y

This production is not part of the "official"
grammar. It's primary purpose is to recover from
parser errors, so it's probably best if you leave
it here. */

line : /* lambda */
| line decl
| line error {
printf("Failure :-(n");
yyerrok;
yyclearin;
}
;
Example: Rules
decl :

type ID list { printf("Success!n"); } ;

list : COMMA ID list
| SEMI
;
type :
INT | CHAR | FLOAT
;
%%

decl.y
decl.y

Example: Supplementary Code
extern FILE *yyin;
main()
{
do {
yyparse();
} while(!feof(yyin));
}
yyerror(char *s)
{
/* Don't have to do anything! */
}
Bored, lonely? Try this!
yacc -d decl.y
• Produced
y.tab.h
#
#
#
#
#
#

define
define
define
define
define
define

CHAR 257
COMMA 258
FLOAT 259
ID 260
INT 261
SEMI 262
Symbol attributes
• Back to attribute grammars...
• Every symbol can have a value
– Might be a numeric quantity in case of a number (42)
– Might be a pointer to a string ("Hello, World!")
– Might be a pointer to a symbol table entry in case of a
variable

• When using LEX we put the value into yylval
– In complex situations yylval is a union

• Typical LEX code:
[0-9]+

{yylval = atoi(yytext); return NUM}
Symbol attributes (cont’d)
• YACC allows symbols to have multiple types of
value symbols
%union {
double dval;
int
vblno;
char* strval;
}
Symbol attributes (cont’d)
%union {
double dval;
int
vblno;
char* strval;
}

yacc -d

y.tab.h
…
extern YYSTYPE yylval;

[0-9]+

{ yylval.vblno = atoi(yytext);
return NUM;}
[A-z]+ { yylval.strval =
strdup(yytext);
return STRING;}

LEX file
include “y.tab.h”
Precedence / Association
(1) 1 – 2 - 3
(2) 1 – 2 * 3

1. 1-2-3 = (1-2)-3? or 1-(2-3)?
Define ‘-’ operator is left-association.
2. 1-2*3 = 1-(2*3)
Define “*” operator is precedent to “-” operator
Precedence /
Association
%left '+' '-'
%left '*' '/'
%noassoc UMINUS

expr :
|
|
|

expr
expr
expr
expr

‘+’
‘-’
‘*’
‘/’

expr
{ $$ = $1 + $3; }
expr
{ $$ = $1 - $3; }
expr
{ $$ = $1 * $3; }
expr { if($3==0)
yyerror(“divide 0”);
else
$$ = $1 / $3;
}
| ‘-’ expr %prec UMINUS {$$ = -$2; }
Precedence / Association
%right
%left
%left
%left

‘=‘
'<' '>' NE LE GE
'+' '-‘
'*' '/'
highest precedence
Big trick
Getting YACC & LEX to work together!
LEX & YACC

lex.yy.c

y.tab.c

cc/
gcc

a.out
Building Example
• Suppose you have a lex file called scanner.l
and a yacc file called decl.y and want parser
• Steps to build...
lex scanner.l
yacc -d decl.y
gcc -c lex.yy.c y.tab.c
gcc -o parser lex.yy.o y.tab.o -ll
Note: scanner should include in the definitions
section: #include "y.tab.h"
YACC
• Rules may be recursive
• Rules may be ambiguous
• Uses bottom-up Shift/Reduce parsing
– Get a token
– Push onto stack
– Can it be reduced (How do we know?)
• If yes: Reduce using a rule
• If no: Get another token

• YACC cannot look ahead more than one token
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

stack:
<empty>

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
a = 7; b = 3 + a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

SHIFT!
stack:
NAME

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
= 7; b = 3 + a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

SHIFT!
stack:
NAME ‘=‘

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
7; b = 3 + a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

SHIFT!
stack:
NAME ‘=‘ 7

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
; b = 3 + a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp

REDUCE!
stack:
NAME ‘=‘ exp

| exp ‘-’ exp
| NAME
| NUMBER

input:
; b = 3 + a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

REDUCE!
stack:
stmt

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
; b = 3 + a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

SHIFT!
stack:
stmt ‘;’

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
b = 3 + a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

SHIFT!
stack:
stmt ‘;’ NAME

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
= 3 + a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

SHIFT!
stack:
stmt ‘;’ NAME ‘=‘

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
3 + a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp

SHIFT!
stack:
stmt ‘;’ NAME ‘=‘
NUMBER

| exp ‘-’ exp
| NAME
| NUMBER

input:
+ a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp

REDUCE!
stack:
stmt ‘;’ NAME ‘=‘
exp

| exp ‘-’ exp
| NAME
| NUMBER

input:
+ a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp

SHIFT!
stack:
stmt ‘;’ NAME ‘=‘
exp ‘+’

| exp ‘-’ exp
| NAME
| NUMBER

input:
a + 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp

SHIFT!
stack:
stmt ‘;’ NAME ‘=‘
exp ‘+’ NAME

| exp ‘-’ exp
| NAME
| NUMBER

input:
+ 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp

REDUCE!
stack:
stmt ‘;’ NAME ‘=‘
exp ‘+’ exp

| exp ‘-’ exp
| NAME
| NUMBER

input:
+ 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp

REDUCE!
stack:
stmt ‘;’ NAME ‘=‘
exp

| exp ‘-’ exp
| NAME
| NUMBER

input:
+ 2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp

SHIFT!
stack:
stmt ‘;’ NAME ‘=‘
exp ‘+’

| exp ‘-’ exp
| NAME
| NUMBER

input:
2
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp

SHIFT!
stack:
stmt ‘;’ NAME ‘=‘
exp ‘+’ NUMBER

| exp ‘-’ exp
| NAME
| NUMBER

input:
<empty>
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

REDUCE!
stack:
stmt ‘;’ NAME ‘=‘
exp ‘+’ exp
input:
<empty>
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

REDUCE!
stack:
stmt ‘;’ NAME ‘=‘
exp
input:
<empty>
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

REDUCE!
stack:
stmt ‘;’ stmt

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
<empty>
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

REDUCE!
stack:
stmt

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
<empty>
Shift and reducing
stmt: stmt ‘;’ stmt
| NAME ‘=‘ exp

DONE!
stack:
stmt

exp: exp ‘+’ exp
| exp ‘-’ exp
| NAME
| NUMBER

input:
<empty>
IF-ELSE Ambiguity
• Consider following rule:

Following state : IF expr IF expr stmt . ELSE stmt
• Two possible derivations:
IF expr IF expr stmt . ELSE stmt
IF expr IF expr stmt ELSE . stmt
IF expr IF expr stmt ELSE stmt .
IF expr stmt

IF expr IF expr stmt . ELSE stmt
IF expr stmt . ELSE stmt
IF expr stmt ELSE . stmt
IF expr stmt ELSE stmt .
IF-ELSE Ambiguity
• It is a shift/reduce conflict
• YACC will always do shift first
• Solution 1 : re-write grammar
stmt

: matched
| unmatched
;
matched: other_stmt
| IF expr THEN matched ELSE matched
;
unmatched: IF expr THEN stmt
| IF expr THEN matched ELSE unmatched
;
IF-ELSE Ambiguity
• Solution 2:
the rule has the
same precedence as
token IFX
Shift/Reduce Conflicts
• shift/reduce conflict
– occurs when a grammar is written in such a way that
a decision between shifting and reducing can not be
made.
– e.g.: IF-ELSE ambiguity

• To resolve this conflict, YACC will choose to shift
Reduce/Reduce Conflicts
• Reduce/Reduce Conflicts:
start : expr | stmt
;
expr : CONSTANT;
stmt : CONSTANT;

• YACC (Bison) resolves the conflict by
reducing using the rule that occurs earlier in
the grammar. NOT GOOD!!
• So, modify grammar to eliminate them
Error Messages
• Bad error message:
– Syntax error
– Compiler needs to give programmer a good advice

• It is better to track the line number in LEX:
void yyerror(char *s)
{
fprintf(stderr, "line %d: %sn:", yylineno, s);
}
Recursive Grammar
• Left recursion

• Right recursion

list:
item
| list ',' item
;

list:
item
| item ',' list
;

• LR parser prefers left recursion
• LL parser prefers right recursion
YACC Example
• Taken from LEX & YACC
• Simple calculator
a = 4 + 6
a
a=10
b = 7
c = a + b
c
c = 17
pressure = (78 + 34) * 16.4
$
Grammar
expression ::= expression '+' term |
expression '-' term |
term
term

::= term '*' factor |
term '/' factor |
factor

factor

::= '(' expression ')' |
'-' factor |
NUMBER |
NAME
parser.h
2

name

value

3

name

value

name

value

5

name

value

6

name

value

name

value

name

value

name

value

name

value

11

name

value

12

parser.h

value

10

struct symtab *symlook();

name

9

struct symtab {
char *name;
double value;
} symtab[NSYMS];

1

8

/* maximum number
of symbols */

value

7

#define NSYMS 20

name

4

/*
* Header for calculator program
*/

0

name

value

13

name

value

14

name

value
parser.y
%{
#include "parser.h"
#include <string.h>
%}
%union {
double dval;
struct symtab *symp;
}
%token <symp> NAME
%token <dval> NUMBER
%type <dval> expression
%type <dval> term
%type <dval> factor
%%

parser.y
statement_list:
|
;
statement:
|
;

statement 'n'
statement_list statement 'n‘

NAME '=' expression
{ $1->value = $3; }
expression
{ printf("= %gn", $1); }

expression: expression '+' term { $$ = $1 + $3; }
| expression '-' term { $$ = $1 - $3; }
term
;

parser.y
term:
|

term '*' factor { $$ = $1 * $3; }
term '/' factor { if($3 == 0.0)
yyerror("divide by zero");
else
$$ = $1 / $3;
}
| factor
;

factor:
|
|
|
;
%%

'(' expression ')' { $$ = $2; }
'-' factor
{ $$ = -$2; }
NUMBER
NAME
{ $$ = $1->value; }

parser.y
/* look up a symbol table entry, add if not present */
struct symtab *symlook(char *s) {
char *p;
struct symtab *sp;
for(sp = symtab; sp < &symtab[NSYMS]; sp++) {
/* is it already here? */
if(sp->name && !strcmp(sp->name, s))
return sp;
if(!sp->name) { /* is it free */
sp->name = strdup(s);
return sp;
}
/* otherwise continue to next */
}
yyerror("Too many symbols");
exit(1); /* cannot continue */
} /* symlook */
parser.y
yyerror(char *s)
{
printf( "yyerror: %sn", s);
}

parser.y
typedef union
{
double dval;
struct symtab *symp;
} YYSTYPE;
extern YYSTYPE yylval;
# define NAME 257
# define NUMBER 258
y.tab.h
calclexer.l
%{
#include "y.tab.h"
#include "parser.h"
#include <math.h>
%}
%%

calclexer.l
%%
([0-9]+|([0-9]*.[0-9]+)([eE][-+]?[0-9]+)?) {
yylval.dval = atof(yytext);
return NUMBER;
}
[ t] ;

/* ignore white space */

[A-Za-z][A-Za-z0-9]*

{ /* return symbol pointer */
yylval.symp = symlook(yytext);
return NAME;
}

"$"

{ return 0; /* end of input */ }

n |.
%%

return yytext[0];

calclexer.l
Makefile
LEX = lex
YACC = yacc
CC = gcc

Makefile

calcu:
y.tab.o lex.yy.o
$(CC) -o calcu y.tab.o lex.yy.o -ly -ll
y.tab.c y.tab.h: parser.y
$(YACC)
-d parser.y
y.tab.o: y.tab.c parser.h
$(CC) -c y.tab.c
lex.yy.o: y.tab.h lex.yy.c
$(CC) -c lex.yy.c
lex.yy.c: calclexer.l parser.h
$(LEX) calclexer.l

clean:

rm *.o
rm *.c
rm calcu
YACC Declaration
Summary
`%start'

Specify the grammar's start symbol

`%union‘ Declare the collection of data types that
semantic values may have
`%token‘ Declare a terminal symbol (token type
name) with no precedence or associativity specified
`%type‘ Declare the type of semantic values for a
nonterminal symbol
YACC Declaration Summary
`%right‘ Declare a terminal symbol (token type name)
that is right-associative
`%left‘ Declare a terminal symbol (token type name)
that is left-associative
`%nonassoc‘ Declare a terminal symbol (token type
name) that is nonassociative (using it in a way that
would be associative is a syntax error, e.g.:
x op. y op. z is syntax error)

Weitere ähnliche Inhalte

Was ist angesagt?

Yacc (yet another compiler compiler)
Yacc (yet another compiler compiler)Yacc (yet another compiler compiler)
Yacc (yet another compiler compiler)omercomail
 
Lexical Analysis
Lexical AnalysisLexical Analysis
Lexical AnalysisMunni28
 
Lecture2 general structure of a compiler
Lecture2 general structure of a compilerLecture2 general structure of a compiler
Lecture2 general structure of a compilerMahesh Kumar Chelimilla
 
Cs6660 compiler design
Cs6660 compiler designCs6660 compiler design
Cs6660 compiler designhari2010
 
Different phases of a compiler
Different phases of a compilerDifferent phases of a compiler
Different phases of a compilerSumit Sinha
 
Structure of the compiler
Structure of the compilerStructure of the compiler
Structure of the compilerSudhaa Ravi
 
Lecture 15 run timeenvironment_2
Lecture 15 run timeenvironment_2Lecture 15 run timeenvironment_2
Lecture 15 run timeenvironment_2Iffat Anjum
 
Lecture 01 introduction to compiler
Lecture 01 introduction to compilerLecture 01 introduction to compiler
Lecture 01 introduction to compilerIffat Anjum
 
Compiler Construction
Compiler ConstructionCompiler Construction
Compiler ConstructionAhmed Raza
 
Compiler Engineering Lab#1
Compiler Engineering Lab#1Compiler Engineering Lab#1
Compiler Engineering Lab#1MashaelQ
 
Chapter One
Chapter OneChapter One
Chapter Onebolovv
 

Was ist angesagt? (20)

Yacc (yet another compiler compiler)
Yacc (yet another compiler compiler)Yacc (yet another compiler compiler)
Yacc (yet another compiler compiler)
 
Lex & yacc
Lex & yaccLex & yacc
Lex & yacc
 
Lex
LexLex
Lex
 
Compiler Design
Compiler DesignCompiler Design
Compiler Design
 
Lexical Analysis
Lexical AnalysisLexical Analysis
Lexical Analysis
 
Lecture2 general structure of a compiler
Lecture2 general structure of a compilerLecture2 general structure of a compiler
Lecture2 general structure of a compiler
 
Compiler unit 1
Compiler unit 1Compiler unit 1
Compiler unit 1
 
Cpcs302 1
Cpcs302  1Cpcs302  1
Cpcs302 1
 
Cs6660 compiler design
Cs6660 compiler designCs6660 compiler design
Cs6660 compiler design
 
Different phases of a compiler
Different phases of a compilerDifferent phases of a compiler
Different phases of a compiler
 
Lecture1 introduction compilers
Lecture1 introduction compilersLecture1 introduction compilers
Lecture1 introduction compilers
 
Structure of the compiler
Structure of the compilerStructure of the compiler
Structure of the compiler
 
Lexical analyzer
Lexical analyzerLexical analyzer
Lexical analyzer
 
Lecture 15 run timeenvironment_2
Lecture 15 run timeenvironment_2Lecture 15 run timeenvironment_2
Lecture 15 run timeenvironment_2
 
Lecture 01 introduction to compiler
Lecture 01 introduction to compilerLecture 01 introduction to compiler
Lecture 01 introduction to compiler
 
Compiler Construction
Compiler ConstructionCompiler Construction
Compiler Construction
 
Compiler Chapter 1
Compiler Chapter 1Compiler Chapter 1
Compiler Chapter 1
 
Compiler Engineering Lab#1
Compiler Engineering Lab#1Compiler Engineering Lab#1
Compiler Engineering Lab#1
 
Compiler1
Compiler1Compiler1
Compiler1
 
Chapter One
Chapter OneChapter One
Chapter One
 

Andere mochten auch

Compiler Design Lecture Notes
Compiler Design Lecture NotesCompiler Design Lecture Notes
Compiler Design Lecture NotesFellowBuddy.com
 
Compiler Design(NANTHU NOTES)
Compiler Design(NANTHU NOTES)Compiler Design(NANTHU NOTES)
Compiler Design(NANTHU NOTES)guest251d9a
 
Compiler Design
Compiler DesignCompiler Design
Compiler DesignMir Majid
 
Compiler Design Basics
Compiler Design BasicsCompiler Design Basics
Compiler Design BasicsAkhil Kaushik
 
compiler and their types
compiler and their typescompiler and their types
compiler and their typespatchamounika7
 
Compiler Design - Introduction to Compiler
Compiler Design - Introduction to CompilerCompiler Design - Introduction to Compiler
Compiler Design - Introduction to CompilerIffat Anjum
 
Introduction to compiler
Introduction to compilerIntroduction to compiler
Introduction to compilerAbha Damani
 
Code Optimization
Code OptimizationCode Optimization
Code Optimizationguest9f8315
 
Compiler vs Interpreter-Compiler design ppt.
Compiler vs Interpreter-Compiler design ppt.Compiler vs Interpreter-Compiler design ppt.
Compiler vs Interpreter-Compiler design ppt.Md Hossen
 
what is compiler and five phases of compiler
what is compiler and five phases of compilerwhat is compiler and five phases of compiler
what is compiler and five phases of compileradilmehmood93
 
Phases of the Compiler - Systems Programming
Phases of the Compiler - Systems ProgrammingPhases of the Compiler - Systems Programming
Phases of the Compiler - Systems ProgrammingMukesh Tekwani
 
JOSA TechTalks - Compilers, Transpilers, and Why You Should Care
JOSA TechTalks - Compilers, Transpilers, and Why You Should CareJOSA TechTalks - Compilers, Transpilers, and Why You Should Care
JOSA TechTalks - Compilers, Transpilers, and Why You Should CareJordan Open Source Association
 
Computer system
Computer systemComputer system
Computer systemzamzulaiha
 
Types and components of computer system
Types and components of computer systemTypes and components of computer system
Types and components of computer systemhome
 

Andere mochten auch (20)

Compiler Design Lecture Notes
Compiler Design Lecture NotesCompiler Design Lecture Notes
Compiler Design Lecture Notes
 
Compiler Design(NANTHU NOTES)
Compiler Design(NANTHU NOTES)Compiler Design(NANTHU NOTES)
Compiler Design(NANTHU NOTES)
 
Compiler Design
Compiler DesignCompiler Design
Compiler Design
 
Compilers
CompilersCompilers
Compilers
 
Compiler Design Material
Compiler Design MaterialCompiler Design Material
Compiler Design Material
 
Compiler Design Basics
Compiler Design BasicsCompiler Design Basics
Compiler Design Basics
 
compiler and their types
compiler and their typescompiler and their types
compiler and their types
 
Compiler Design - Introduction to Compiler
Compiler Design - Introduction to CompilerCompiler Design - Introduction to Compiler
Compiler Design - Introduction to Compiler
 
Introduction to compiler
Introduction to compilerIntroduction to compiler
Introduction to compiler
 
Interpreter
InterpreterInterpreter
Interpreter
 
Translators(Compiler, Assembler) and interpreter
Translators(Compiler, Assembler) and interpreterTranslators(Compiler, Assembler) and interpreter
Translators(Compiler, Assembler) and interpreter
 
Code Optimization
Code OptimizationCode Optimization
Code Optimization
 
Compiler vs Interpreter-Compiler design ppt.
Compiler vs Interpreter-Compiler design ppt.Compiler vs Interpreter-Compiler design ppt.
Compiler vs Interpreter-Compiler design ppt.
 
what is compiler and five phases of compiler
what is compiler and five phases of compilerwhat is compiler and five phases of compiler
what is compiler and five phases of compiler
 
Phases of Compiler
Phases of CompilerPhases of Compiler
Phases of Compiler
 
Phases of the Compiler - Systems Programming
Phases of the Compiler - Systems ProgrammingPhases of the Compiler - Systems Programming
Phases of the Compiler - Systems Programming
 
What is Compiler?
What is Compiler?What is Compiler?
What is Compiler?
 
JOSA TechTalks - Compilers, Transpilers, and Why You Should Care
JOSA TechTalks - Compilers, Transpilers, and Why You Should CareJOSA TechTalks - Compilers, Transpilers, and Why You Should Care
JOSA TechTalks - Compilers, Transpilers, and Why You Should Care
 
Computer system
Computer systemComputer system
Computer system
 
Types and components of computer system
Types and components of computer systemTypes and components of computer system
Types and components of computer system
 

Ähnlich wie Compiler Design Tutorial

Ähnlich wie Compiler Design Tutorial (20)

Lexyacc
LexyaccLexyacc
Lexyacc
 
Lex (lexical analyzer)
Lex (lexical analyzer)Lex (lexical analyzer)
Lex (lexical analyzer)
 
LEX & YACC TOOL
LEX & YACC TOOLLEX & YACC TOOL
LEX & YACC TOOL
 
Writing Parsers and Compilers with PLY
Writing Parsers and Compilers with PLYWriting Parsers and Compilers with PLY
Writing Parsers and Compilers with PLY
 
Should i Go there
Should i Go thereShould i Go there
Should i Go there
 
Scala Refactoring for Fun and Profit
Scala Refactoring for Fun and ProfitScala Refactoring for Fun and Profit
Scala Refactoring for Fun and Profit
 
Introduction to Python , Overview
Introduction to Python , OverviewIntroduction to Python , Overview
Introduction to Python , Overview
 
Speaking Scala: Refactoring for Fun and Profit (Workshop)
Speaking Scala: Refactoring for Fun and Profit (Workshop)Speaking Scala: Refactoring for Fun and Profit (Workshop)
Speaking Scala: Refactoring for Fun and Profit (Workshop)
 
Unix
UnixUnix
Unix
 
sonam Kumari python.ppt
sonam Kumari python.pptsonam Kumari python.ppt
sonam Kumari python.ppt
 
x86
x86x86
x86
 
Learn c++ Programming Language
Learn c++ Programming LanguageLearn c++ Programming Language
Learn c++ Programming Language
 
Java gets a closure
Java gets a closureJava gets a closure
Java gets a closure
 
17641.ppt
17641.ppt17641.ppt
17641.ppt
 
Slides on introduction to R by ArinBasu MD
Slides on introduction to R by ArinBasu MDSlides on introduction to R by ArinBasu MD
Slides on introduction to R by ArinBasu MD
 
17641.ppt
17641.ppt17641.ppt
17641.ppt
 
The Fuss about || Haskell | Scala | F# ||
The Fuss about || Haskell | Scala | F# ||The Fuss about || Haskell | Scala | F# ||
The Fuss about || Haskell | Scala | F# ||
 
How to obtain and install R.ppt
How to obtain and install R.pptHow to obtain and install R.ppt
How to obtain and install R.ppt
 
A Replay Approach to Software Validation
A Replay Approach to Software ValidationA Replay Approach to Software Validation
A Replay Approach to Software Validation
 
lex.ppt
lex.pptlex.ppt
lex.ppt
 

Kürzlich hochgeladen

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 

Kürzlich hochgeladen (20)

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 

Compiler Design Tutorial

  • 1. Language processing: introduction to compiler construction Andy D. Pimentel Computer Systems Architecture group andy@science.uva.nl http://www.science.uva.nl/~andy/taalverwerking.html
  • 2. About this course • This part will address compilers for programming languages • Depth-first approach – Instead of covering all compiler aspects very briefly, we focus on particular compiler stages – Focus: optimization and compiler back issues • This course is complementary to the compiler course at the VU • Grading: (heavy) practical assignment and one or two take-home assignments
  • 3. About this course (cont’d) • Book – Recommended, not compulsory: Seti, Aho and Ullman,”Compilers Principles, Techniques and Tools” (the Dragon book) – Old book, but still more than sufficient – Copies of relevant chapters can be found in the library • Sheets are available at the website • Idem for practical/take-home assignments, deadlines, etc.
  • 4. Topics • Compiler introduction – General organization • Scanning & parsing – From a practical viewpoint: LEX and YACC • Intermediate formats • Optimization: techniques and algorithms – – – – – Local/peephole optimizations Global and loop optimizations Recognizing loops Dataflow analysis Alias analysis
  • 5. Topics (cont’d) • Code generation – Instruction selection – Register allocation – Instruction scheduling: improving ILP • Source-level optimizations – Optimizations for cache behavior
  • 7. Compilers: organization Source Frontend IR Optimizer IR Machine Backend code • Frontend – – – – Dependent on source language Lexical analysis Parsing Semantic analysis (e.g., type checking)
  • 8. Compilers: organization (cont’d) Source Frontend IR Optimizer IR Machine Backend code • Optimizer – – – – Independent part of compiler Different optimizations possible IR to IR translation Can be very computational intensive part
  • 9. Compilers: organization (cont’d) Source Frontend IR Optimizer IR • Backend – – – – – Dependent on target processor Code selection Code scheduling Register allocation Peephole optimization Machine Backend code
  • 11. Overview • Writing a compiler is difficult requiring lots of time and effort • Construction of the scanner and parser is routine enough that the process may be automated Lexical Rules Grammar Semantics Compiler Compiler Scanner --------Parser --------Code generator
  • 12. YACC • What is YACC ? – Tool which will produce a parser for a given grammar. – YACC (Yet Another Compiler Compiler) is a program designed to compile a LALR(1) grammar and to produce the source code of the syntactic analyzer of the language produced by this grammar – Input is a grammar (rules) and actions to take upon recognizing a rule – Output is a C program and optionally a header file of tokens
  • 13. LEX • Lex is a scanner generator – Input is description of patterns and actions – Output is a C program which contains a function yylex() which, when called, matches patterns and performs actions per input – Typically, the generated scanner performs lexical analysis and produces tokens for the (YACC-generated) parser
  • 14. LEX and YACC: a team How to work ?
  • 15. LEX and YACC: a team call yylex() [0-9]+ next token is NUM NUM ‘+’ NUM
  • 16. Availability • • • • • lex, yacc on most UNIX systems bison: a yacc replacement from GNU flex: fast lexical analyzer BSD yacc Windows/MS-DOS versions exist
  • 17. YACC Basic Operational Sequence gram.y yacc y.tab.c cc or gcc a.out File containing desired grammar in YACC format YACC program C source program created by YACC C compiler Executable program that will parse grammar given in gram.y
  • 18. YACC File Format Definitions %% Rules %% Supplementary Code The identical LEX format was actually taken from this...
  • 19. Rules Section • Is a grammar • Example expr : expr '+' term | term; term : term '*' factor | factor; factor : '(' expr ')' | ID | NUM;
  • 20. Rules Section • Normally written like this • Example: expr : | ; term : | ; factor : | | ; expr '+' term term term '*' factor factor '(' expr ')' ID NUM
  • 21. Definitions Section Example %{ #include <stdio.h> #include <stdlib.h> %} This is called a terminal %token ID NUM %start expr The start symbol (non-terminal)
  • 22. Sidebar • LEX produces a function called yylex() • YACC produces a function called yyparse() • yyparse() expects to be able to call yylex() • How to get yylex()? • Write your own! • If you don't want to write your own: Use LEX!!!
  • 23. Sidebar int yylex() { if(it's a num) return NUM; else if(it's an id) return ID; else if(parsing is done) return 0; else if(it's an error) return -1; }
  • 24. Semantic actions expr : | ; term : | ; factor expr '+' term term { $$ = $1 + $3; } { $$ = $1; } term '*' factor factor { $$ = $1 * $3; } { $$ = $1; } : '(' expr ')' | ID | NUM ; { $$ = $2; }
  • 25. Semantic actions (cont’d) $1 expr : | ; term : | ; factor expr '+' term term { $$ = $1 + $3; } { $$ = $1; } term '*' factor factor { $$ = $1 * $3; } { $$ = $1; } : '(' expr ')' | ID | NUM ; { $$ = $2; }
  • 26. Semantic actions (cont’d) expr : | ; term : | ; factor expr '+' term term { $$ = $1 + $3; } { $$ = $1; } term '*' factor factor { $$ = $1 * $3; } { $$ = $1; } : '(' expr ')' | ID | NUM ; { $$ = $2; } $2
  • 27. Semantic actions (cont’d) expr : | ; term : | ; factor expr '+' term term { $$ = $1 + $3; } { $$ = $1; } term '*' factor factor { $$ = $1 * $3; } { $$ = $1; } : '(' expr ')' | ID | NUM ; { $$ = $2; } $3 Default: $$ = $1;
  • 28. Bored, lonely? Try this! yacc -d gram.y • Will produce: y.tab.h Look at this and you'll Look at this and you'll never be unhappy again! never be unhappy again! yacc -v gram.y • Will produce: y.output Shows "State Shows "State Machine"® Machine"®
  • 29. Example: LEX %{ #include <stdio.h> #include "y.tab.h" %} id [_a-zA-Z][_a-zA-Z0-9]* wspc [ tn]+ semi [;] comma [,] %% int { return INT; } char { return CHAR; } float { return FLOAT; } {comma} { return COMMA; } {semi} { return SEMI; } {id} { return ID;} {wspc} {;} /* Necessary? */ scanner.l
  • 30. Example: Definitions %{ #include <stdio.h> #include <stdlib.h> %} %start line %token CHAR, COMMA, FLOAT, ID, INT, SEMI %% decl.y
  • 31. Example: Rules /* * * * decl.y This production is not part of the "official" grammar. It's primary purpose is to recover from parser errors, so it's probably best if you leave it here. */ line : /* lambda */ | line decl | line error { printf("Failure :-(n"); yyerrok; yyclearin; } ;
  • 32. Example: Rules decl : type ID list { printf("Success!n"); } ; list : COMMA ID list | SEMI ; type : INT | CHAR | FLOAT ; %% decl.y
  • 33. decl.y Example: Supplementary Code extern FILE *yyin; main() { do { yyparse(); } while(!feof(yyin)); } yyerror(char *s) { /* Don't have to do anything! */ }
  • 34. Bored, lonely? Try this! yacc -d decl.y • Produced y.tab.h # # # # # # define define define define define define CHAR 257 COMMA 258 FLOAT 259 ID 260 INT 261 SEMI 262
  • 35. Symbol attributes • Back to attribute grammars... • Every symbol can have a value – Might be a numeric quantity in case of a number (42) – Might be a pointer to a string ("Hello, World!") – Might be a pointer to a symbol table entry in case of a variable • When using LEX we put the value into yylval – In complex situations yylval is a union • Typical LEX code: [0-9]+ {yylval = atoi(yytext); return NUM}
  • 36. Symbol attributes (cont’d) • YACC allows symbols to have multiple types of value symbols %union { double dval; int vblno; char* strval; }
  • 37. Symbol attributes (cont’d) %union { double dval; int vblno; char* strval; } yacc -d y.tab.h … extern YYSTYPE yylval; [0-9]+ { yylval.vblno = atoi(yytext); return NUM;} [A-z]+ { yylval.strval = strdup(yytext); return STRING;} LEX file include “y.tab.h”
  • 38. Precedence / Association (1) 1 – 2 - 3 (2) 1 – 2 * 3 1. 1-2-3 = (1-2)-3? or 1-(2-3)? Define ‘-’ operator is left-association. 2. 1-2*3 = 1-(2*3) Define “*” operator is precedent to “-” operator
  • 39. Precedence / Association %left '+' '-' %left '*' '/' %noassoc UMINUS expr : | | | expr expr expr expr ‘+’ ‘-’ ‘*’ ‘/’ expr { $$ = $1 + $3; } expr { $$ = $1 - $3; } expr { $$ = $1 * $3; } expr { if($3==0) yyerror(“divide 0”); else $$ = $1 / $3; } | ‘-’ expr %prec UMINUS {$$ = -$2; }
  • 40. Precedence / Association %right %left %left %left ‘=‘ '<' '>' NE LE GE '+' '-‘ '*' '/' highest precedence
  • 41. Big trick Getting YACC & LEX to work together!
  • 43. Building Example • Suppose you have a lex file called scanner.l and a yacc file called decl.y and want parser • Steps to build... lex scanner.l yacc -d decl.y gcc -c lex.yy.c y.tab.c gcc -o parser lex.yy.o y.tab.o -ll Note: scanner should include in the definitions section: #include "y.tab.h"
  • 44. YACC • Rules may be recursive • Rules may be ambiguous • Uses bottom-up Shift/Reduce parsing – Get a token – Push onto stack – Can it be reduced (How do we know?) • If yes: Reduce using a rule • If no: Get another token • YACC cannot look ahead more than one token
  • 45. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp stack: <empty> exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: a = 7; b = 3 + a + 2
  • 46. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp SHIFT! stack: NAME exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: = 7; b = 3 + a + 2
  • 47. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp SHIFT! stack: NAME ‘=‘ exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: 7; b = 3 + a + 2
  • 48. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp SHIFT! stack: NAME ‘=‘ 7 exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: ; b = 3 + a + 2
  • 49. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp REDUCE! stack: NAME ‘=‘ exp | exp ‘-’ exp | NAME | NUMBER input: ; b = 3 + a + 2
  • 50. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp REDUCE! stack: stmt exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: ; b = 3 + a + 2
  • 51. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp SHIFT! stack: stmt ‘;’ exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: b = 3 + a + 2
  • 52. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp SHIFT! stack: stmt ‘;’ NAME exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: = 3 + a + 2
  • 53. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp SHIFT! stack: stmt ‘;’ NAME ‘=‘ exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: 3 + a + 2
  • 54. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp SHIFT! stack: stmt ‘;’ NAME ‘=‘ NUMBER | exp ‘-’ exp | NAME | NUMBER input: + a + 2
  • 55. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp REDUCE! stack: stmt ‘;’ NAME ‘=‘ exp | exp ‘-’ exp | NAME | NUMBER input: + a + 2
  • 56. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp SHIFT! stack: stmt ‘;’ NAME ‘=‘ exp ‘+’ | exp ‘-’ exp | NAME | NUMBER input: a + 2
  • 57. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp SHIFT! stack: stmt ‘;’ NAME ‘=‘ exp ‘+’ NAME | exp ‘-’ exp | NAME | NUMBER input: + 2
  • 58. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp REDUCE! stack: stmt ‘;’ NAME ‘=‘ exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: + 2
  • 59. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp REDUCE! stack: stmt ‘;’ NAME ‘=‘ exp | exp ‘-’ exp | NAME | NUMBER input: + 2
  • 60. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp SHIFT! stack: stmt ‘;’ NAME ‘=‘ exp ‘+’ | exp ‘-’ exp | NAME | NUMBER input: 2
  • 61. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp SHIFT! stack: stmt ‘;’ NAME ‘=‘ exp ‘+’ NUMBER | exp ‘-’ exp | NAME | NUMBER input: <empty>
  • 62. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER REDUCE! stack: stmt ‘;’ NAME ‘=‘ exp ‘+’ exp input: <empty>
  • 63. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER REDUCE! stack: stmt ‘;’ NAME ‘=‘ exp input: <empty>
  • 64. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp REDUCE! stack: stmt ‘;’ stmt exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: <empty>
  • 65. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp REDUCE! stack: stmt exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: <empty>
  • 66. Shift and reducing stmt: stmt ‘;’ stmt | NAME ‘=‘ exp DONE! stack: stmt exp: exp ‘+’ exp | exp ‘-’ exp | NAME | NUMBER input: <empty>
  • 67. IF-ELSE Ambiguity • Consider following rule: Following state : IF expr IF expr stmt . ELSE stmt • Two possible derivations: IF expr IF expr stmt . ELSE stmt IF expr IF expr stmt ELSE . stmt IF expr IF expr stmt ELSE stmt . IF expr stmt IF expr IF expr stmt . ELSE stmt IF expr stmt . ELSE stmt IF expr stmt ELSE . stmt IF expr stmt ELSE stmt .
  • 68. IF-ELSE Ambiguity • It is a shift/reduce conflict • YACC will always do shift first • Solution 1 : re-write grammar stmt : matched | unmatched ; matched: other_stmt | IF expr THEN matched ELSE matched ; unmatched: IF expr THEN stmt | IF expr THEN matched ELSE unmatched ;
  • 69. IF-ELSE Ambiguity • Solution 2: the rule has the same precedence as token IFX
  • 70. Shift/Reduce Conflicts • shift/reduce conflict – occurs when a grammar is written in such a way that a decision between shifting and reducing can not be made. – e.g.: IF-ELSE ambiguity • To resolve this conflict, YACC will choose to shift
  • 71. Reduce/Reduce Conflicts • Reduce/Reduce Conflicts: start : expr | stmt ; expr : CONSTANT; stmt : CONSTANT; • YACC (Bison) resolves the conflict by reducing using the rule that occurs earlier in the grammar. NOT GOOD!! • So, modify grammar to eliminate them
  • 72. Error Messages • Bad error message: – Syntax error – Compiler needs to give programmer a good advice • It is better to track the line number in LEX: void yyerror(char *s) { fprintf(stderr, "line %d: %sn:", yylineno, s); }
  • 73. Recursive Grammar • Left recursion • Right recursion list: item | list ',' item ; list: item | item ',' list ; • LR parser prefers left recursion • LL parser prefers right recursion
  • 74. YACC Example • Taken from LEX & YACC • Simple calculator a = 4 + 6 a a=10 b = 7 c = a + b c c = 17 pressure = (78 + 34) * 16.4 $
  • 75. Grammar expression ::= expression '+' term | expression '-' term | term term ::= term '*' factor | term '/' factor | factor factor ::= '(' expression ')' | '-' factor | NUMBER | NAME
  • 77. 2 name value 3 name value name value 5 name value 6 name value name value name value name value name value 11 name value 12 parser.h value 10 struct symtab *symlook(); name 9 struct symtab { char *name; double value; } symtab[NSYMS]; 1 8 /* maximum number of symbols */ value 7 #define NSYMS 20 name 4 /* * Header for calculator program */ 0 name value 13 name value 14 name value
  • 79. %{ #include "parser.h" #include <string.h> %} %union { double dval; struct symtab *symp; } %token <symp> NAME %token <dval> NUMBER %type <dval> expression %type <dval> term %type <dval> factor %% parser.y
  • 80. statement_list: | ; statement: | ; statement 'n' statement_list statement 'n‘ NAME '=' expression { $1->value = $3; } expression { printf("= %gn", $1); } expression: expression '+' term { $$ = $1 + $3; } | expression '-' term { $$ = $1 - $3; } term ; parser.y
  • 81. term: | term '*' factor { $$ = $1 * $3; } term '/' factor { if($3 == 0.0) yyerror("divide by zero"); else $$ = $1 / $3; } | factor ; factor: | | | ; %% '(' expression ')' { $$ = $2; } '-' factor { $$ = -$2; } NUMBER NAME { $$ = $1->value; } parser.y
  • 82. /* look up a symbol table entry, add if not present */ struct symtab *symlook(char *s) { char *p; struct symtab *sp; for(sp = symtab; sp < &symtab[NSYMS]; sp++) { /* is it already here? */ if(sp->name && !strcmp(sp->name, s)) return sp; if(!sp->name) { /* is it free */ sp->name = strdup(s); return sp; } /* otherwise continue to next */ } yyerror("Too many symbols"); exit(1); /* cannot continue */ } /* symlook */ parser.y
  • 83. yyerror(char *s) { printf( "yyerror: %sn", s); } parser.y
  • 84. typedef union { double dval; struct symtab *symp; } YYSTYPE; extern YYSTYPE yylval; # define NAME 257 # define NUMBER 258 y.tab.h
  • 87. %% ([0-9]+|([0-9]*.[0-9]+)([eE][-+]?[0-9]+)?) { yylval.dval = atof(yytext); return NUMBER; } [ t] ; /* ignore white space */ [A-Za-z][A-Za-z0-9]* { /* return symbol pointer */ yylval.symp = symlook(yytext); return NAME; } "$" { return 0; /* end of input */ } n |. %% return yytext[0]; calclexer.l
  • 89. LEX = lex YACC = yacc CC = gcc Makefile calcu: y.tab.o lex.yy.o $(CC) -o calcu y.tab.o lex.yy.o -ly -ll y.tab.c y.tab.h: parser.y $(YACC) -d parser.y y.tab.o: y.tab.c parser.h $(CC) -c y.tab.c lex.yy.o: y.tab.h lex.yy.c $(CC) -c lex.yy.c lex.yy.c: calclexer.l parser.h $(LEX) calclexer.l clean: rm *.o rm *.c rm calcu
  • 90. YACC Declaration Summary `%start' Specify the grammar's start symbol `%union‘ Declare the collection of data types that semantic values may have `%token‘ Declare a terminal symbol (token type name) with no precedence or associativity specified `%type‘ Declare the type of semantic values for a nonterminal symbol
  • 91. YACC Declaration Summary `%right‘ Declare a terminal symbol (token type name) that is right-associative `%left‘ Declare a terminal symbol (token type name) that is left-associative `%nonassoc‘ Declare a terminal symbol (token type name) that is nonassociative (using it in a way that would be associative is a syntax error, e.g.: x op. y op. z is syntax error)