Regular Expression
R.Rajkumar
Asst.Professor
CSE
Lexical analyzer
• Lexical analysis, also called scanning, is the phase of the compilation
process that works on the actual program text, character by
character. The higher-level parts of the compiler call the lexical
analyzer with the request "get the next word from the input", and it is
the scanner's job to sort through the input characters and find this word.
• The types of "words" commonly found in a program are:
• programming language keywords, such as if, while, struct, int etc.
• operator symbols like =, +, -, &&, !, <= etc.
• other special symbols like: ( ), { }, [ ], ;, & etc.
• constants like 1, 2, 3, 'a', 'b', 'c', "any quoted string" etc.
• variable and function names (called identifiers) such as x, i, t1 etc.
• Some languages (such as C) are case sensitive, in that they differentiate
between e.g. if and IF; thus the former is a keyword, while the latter can be
used as a variable name.
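The word categories listed above can be sketched as a small scanner. This is a minimal illustration in Python, not a full C lexer: the keyword set, the category names and the function name words are assumptions made for the example, and only the symbols mentioned in the slide are recognised.

```python
import re

# Hypothetical keyword set, taken from the examples on the slide.
KEYWORDS = {"if", "while", "struct", "int"}

# Order matters: two-character operators (&&, <=) are tried before
# the one-character symbols that they start with.
WORD_PATTERN = re.compile(r"""
    (?P<constant>\d+|'[^']'|"[^"]*")     # 1, 'a', "any quoted string"
  | (?P<name>[A-Za-z_][A-Za-z0-9_]*)     # keyword or identifier
  | (?P<operator>&&|<=|[=+\-!])          # =, +, -, &&, !, <=
  | (?P<special>[(){}\[\];&])            # ( ) { } [ ] ; &
  | (?P<space>\s+)                       # separators, not words
""", re.VERBOSE)


def words(source):
    """Yield (category, text) pairs for each word found in source."""
    pos = 0
    while pos < len(source):
        match = WORD_PATTERN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind == "space":
            continue
        if kind == "name":
            # Case sensitive, as in C: "if" is a keyword, "IF" is an identifier.
            kind = "keyword" if text in KEYWORDS else "identifier"
        yield kind, text
```

For the input `if (IF <= 1)`, the generator yields a keyword `if`, the special symbol `(`, an identifier `IF`, the operator `<=`, the constant `1` and the special symbol `)`.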
Tokens
• Also, most languages insist that identifiers cannot be any of the keywords or
contain operator symbols (some versions of Fortran do not, which makes lexical
analysis quite difficult).
• In addition to the basic grouping process, lexical analysis usually performs the
following tasks:
• Since there are only a finite number of types of words, instead of passing the actual
word to the next phase we can save space by passing a suitable representation. This
representation is known as a token.
• If the language isn't case sensitive, we can eliminate differences between case at this
point by using just one token per keyword, irrespective of case; e.g.
    #define IF_TOKEN 1
    #define WHILE_TOKEN 2
    .....
    if we meet "IF", "If", "iF", "if" then return IF_TOKEN
    if we meet "WHILE", "While", "WHile", ... then return WHILE_TOKEN
• We can pick out mistakes in the lexical syntax of the program, such as using a
character which is not valid in the language. (Note that we do not worry about the
combination of patterns; e.g. the sequence of characters "+*" would be returned
as PLUS_TOKEN, MULT_TOKEN, and it would be up to the next phase to see that
these should not follow in sequence.)
• We can eliminate pieces of the program that are no longer relevant, such as spaces,
tabs, carriage-returns (in most languages), and comments.
• In order to specify the lexical analysis process, what we need is some method of
describing which patterns of characters correspond to which words.
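The tasks above (compact tokens, case folding, lexical error detection, and dropping whitespace and comments) can be sketched together. This is a minimal sketch in Python: the token codes, the IDENT_TOKEN name, the `/* ... */` comment syntax and the tokenize function are assumptions for illustration, not part of the original slides.

```python
import re

# Hypothetical token codes, mirroring the #define style used above.
IF_TOKEN, WHILE_TOKEN, PLUS_TOKEN, MULT_TOKEN, IDENT_TOKEN = range(1, 6)

KEYWORD_TOKENS = {"if": IF_TOKEN, "while": WHILE_TOKEN}
OPERATOR_TOKENS = {"+": PLUS_TOKEN, "*": MULT_TOKEN}

# Whitespace and /* ... */ comments carry no meaning for later phases.
SKIP = re.compile(r"\s+|/\*.*?\*/", re.DOTALL)
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(source, case_sensitive=False):
    """Return the list of token codes for source."""
    tokens = []
    pos = 0
    while pos < len(source):
        skipped = SKIP.match(source, pos)
        if skipped:
            pos = skipped.end()
            continue
        name = NAME.match(source, pos)
        if name:
            word = name.group()
            # One token per keyword, irrespective of case, unless the
            # language is case sensitive.
            key = word if case_sensitive else word.lower()
            tokens.append(KEYWORD_TOKENS.get(key, IDENT_TOKEN))
            pos = name.end()
        elif source[pos] in OPERATOR_TOKENS:
            tokens.append(OPERATOR_TOKENS[source[pos]])
            pos += 1
        else:
            # A character that is not valid in the language.
            raise ValueError(f"invalid character {source[pos]!r} at offset {pos}")
    return tokens
```

Note that `tokenize("+*")` returns `[PLUS_TOKEN, MULT_TOKEN]` without complaint: deciding that these two tokens should not follow each other is left to the next phase.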
Regular Expressions
• Regular expressions are used to define patterns of characters; they are used in UNIX tools
such as awk, grep, vi and, of course, lex.
• A regular expression is just a form of notation, used for describing sets of words. For any
given set of characters Σ, a regular expression over Σ is defined by:
• The empty string, ε, which denotes a string of length zero, and means "take nothing from
the input". It is most commonly used in conjunction with other regular expressions, e.g. to
denote optionality.
• Any character in Σ may be used in a regular expression. For instance, if we write a as a
regular expression, this means "take the letter a from the input"; i.e. it denotes the
(singleton) set of words {"a"}.
• The union operator, "|", which denotes the union of two sets of words. Thus the regular
expression a|b denotes the set {"a", "b"}, and means "take either the letter a or the
letter b from the input".
• Writing two regular expressions side-by-side is known as concatenation; thus the regular
expression ab denotes the set {"ab"} and means "take the character a followed by the
character b from the input".
• The Kleene closure of a regular expression, denoted by "*", indicates zero or more
occurrences of that expression. Thus a* is the (infinite) set {ε, "a", "aa", "aaa", ...} and
means "take zero or more a's from the input".
• Brackets may be used in a regular expression to enforce precedence or increase clarity.
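Since a regular expression denotes a set of words, the operators above can be illustrated directly with Python sets. This is a sketch under stated assumptions: the function names are invented for the example, and because the Kleene closure is infinite, star takes a hypothetical max_len parameter and returns only the words up to that length.

```python
EPSILON = ""  # the empty string


def char(c):
    """The singleton set denoted by a single character."""
    return {c}


def union(r, s):
    """r|s: every word that is in r or in s."""
    return r | s


def concat(r, s):
    """rs: a word from r followed by a word from s."""
    return {x + y for x in r for y in s}


def star(r, max_len):
    """r*: zero or more words from r, truncated to length max_len."""
    result = {EPSILON}
    frontier = {EPSILON}
    while frontier:
        longer = {w for w in concat(frontier, r) if len(w) <= max_len}
        frontier = longer - result   # keep only words not seen before
        result |= frontier
    return result


a, b = char("a"), char("b")
a_or_b = union(a, b)               # {"a", "b"}
ab = concat(a, b)                  # {"ab"}
optional_a = union({EPSILON}, a)   # {"", "a"}: optionality via the empty string
```

Brackets change the set that is denoted: `star(union(a, b), 2)` corresponds to (a|b)* and contains "ab" and "ba", whereas `union(a, star(b, 2))` corresponds to a|b* and contains only "a", the empty string, "b" and "bb".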
Thompson's Algorithm for converting RE to NFA
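The slide gives only the name of the algorithm, so the following is a sketch of the usual textbook form of Thompson's construction rather than the author's own presentation. Each regular expression operator builds an NFA fragment with one start state and one accepting state, glued together with ε-moves. The class and function names are assumptions made for the example, and concatenation here links the two fragments with an ε-move instead of merging states, which is a common variant.

```python
import itertools

_fresh = itertools.count()  # supply of fresh state numbers


class Fragment:
    """NFA fragment: edges are (source, label, target); label None is an ε-move."""

    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges


def char(c):
    s, a = next(_fresh), next(_fresh)
    return Fragment(s, a, [(s, c, a)])


def union(r, t):
    s, a = next(_fresh), next(_fresh)
    glue = [(s, None, r.start), (s, None, t.start),
            (r.accept, None, a), (t.accept, None, a)]
    return Fragment(s, a, r.edges + t.edges + glue)


def concat(r, t):
    return Fragment(r.start, t.accept, r.edges + t.edges + [(r.accept, None, t.start)])


def star(r):
    s, a = next(_fresh), next(_fresh)
    glue = [(s, None, r.start), (s, None, a),
            (r.accept, None, r.start), (r.accept, None, a)]
    return Fragment(s, a, r.edges + glue)


def closure(states, edges):
    """All states reachable from states using ε-moves only."""
    seen, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for source, label, target in edges:
            if source == state and label is None and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def accepts(nfa, word):
    """Simulate the NFA on word by tracking the set of possible states."""
    current = closure({nfa.start}, nfa.edges)
    for ch in word:
        moved = {t for (s, label, t) in nfa.edges if s in current and label == ch}
        current = closure(moved, nfa.edges)
    return nfa.accept in current


# (a|b)*ab; each fragment must be built fresh and used only once.
nfa = concat(star(union(char("a"), char("b"))), concat(char("a"), char("b")))
```

With this NFA, `accepts(nfa, "babab")` is true and `accepts(nfa, "ba")` is false, since every accepted word must end in "ab".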