Regular Expressions
Languages
• Recall.
– A language is a set of strings made up of
symbols from a given alphabet.
Computer Science Theory 2
String Recognition Machine
• Given a string and a definition of a language
(set of strings), is the string a member of the
language?
[Diagram: input string → language recognition machine → "YES, string is in language" or "NO, string is not in language"]
Language Classes
• We will be looking at classes of languages:
– Each class will have its own means for
describing the language
– Each class will have its own machine model for
string recognition
– Languages and machines get more complex as
we move forward in the course.
Languages
• A language is a set of strings.
• A class of languages is nothing more than a
set of languages
Regular Languages
• Today we start looking at our first class of
languages: Regular languages
– Means of defining: Regular Expressions
– Machine for accepting: Finite Automata
• We’ll be studying different aspects of
regular languages until the midterm.
Specifying Languages
• Recall: how do we specify languages?
– If language is finite, you can list all of its
strings.
• L = {a, aa, aba, aca}
– Descriptive:
• L = {x | n_a(x) = n_b(x)}
– Using basic language operations
• L = {aa, ab}* ∪ {b}{bb}*
– Regular languages are described using this last
method
Regular Languages
• A regular language over Σ is a language that
can be expressed using only the set
operations of
– Union
– Concatenation
– Kleene Star
Kleene Star Operation
• The set of strings that can be obtained by
concatenating any number of elements of a
language L is called the Kleene star, L*:
$$L^* = \bigcup_{i=0}^{\infty} L^i = L^0 \cup L^1 \cup L^2 \cup L^3 \cup L^4 \cup \dots$$
• Note that since L* contains L^0 = {ε}, ε is an
element of L*.
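A minimal Python sketch of this definition, assuming a finite language given as a set of strings. L* is infinite as soon as L contains a non-empty string, so the hypothetical helper `kleene_star_up_to` only enumerates the members of L* up to a chosen length bound.

```python
def concat(a, b):
    """Concatenation of two languages: every x in a followed by every y in b."""
    return {x + y for x in a for y in b}


def kleene_star_up_to(lang, max_len):
    """Strings of L* whose length is at most max_len."""
    result = {""}      # L^0 contains only the empty string
    frontier = {""}    # strings found in the previous round
    while frontier:
        # Append one more element of L; drop strings that are too long
        # or that were already found.
        frontier = {w for w in concat(frontier, lang) if len(w) <= max_len} - result
        result |= frontier
    return result


print(sorted(kleene_star_up_to({"aa", "ab"}, 4), key=lambda w: (len(w), w)))
# ['', 'aa', 'ab', 'aaaa', 'aaab', 'abaa', 'abab']
```

The empty string appears in the result even when L is empty, which matches the note that L^0 is always part of L*.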
Regular Expressions
• Regular expressions are the mechanism by
which regular languages are described:
– Take the “set operation” definition of the
language and:
• Replace ∪ with +
• Replace {} with ()
– And you have a regular expression
Regular expressions
Language                     Regular expression
{ε}                          ε
{011}                        011
{0, 1}                       0 + 1
{0, 01}                      0 + 01
{110}*{0, 1}                 (110)*(0 + 1)
{10, 11, 01}*                (10 + 11 + 01)*
{0, 11}*({11}* ∪ {101, ε})   (0 + 11)*((11)* + 101 + ε)
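Programming tools usually write union as `|` rather than `+`. The sketch below translates the textbook notation into Python's `re` syntax so the expressions in the table can be tried out. The helpers `to_python_regex` and `describes` are hypothetical, and the translation assumes that `+` always means union, that the empty string is written `ε`, and that every other character is an alphabet symbol, a parenthesis, or `*`.

```python
import re


def to_python_regex(textbook: str) -> str:
    """Translate textbook regular-expression notation to Python `re` syntax."""
    return (
        textbook.replace(" ", "")    # spacing carries no meaning
        .replace("+", "|")           # union
        .replace("ε", "(?:)")        # empty string becomes an empty group
    )


def describes(textbook: str, s: str) -> bool:
    """True if the whole string s belongs to the language of the expression."""
    return re.fullmatch(to_python_regex(textbook), s) is not None


print(to_python_regex("(110)*(0+1)"))                    # (110)*(0|1)
print(describes("(110)*(0+1)", "1101"))                  # True
print(describes("(110)*(0+1)", "110"))                   # False
print(describes("(0 + 11)*((11)* + 101 + ε)", "0101"))   # True
```

`re.fullmatch` is used instead of `re.search` because membership in a language is a statement about the whole string, not about a substring.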
Regular Expression
• Recursive definition of regular languages /
expressions over an alphabet Σ:
1. ∅ is a regular language and its regular
expression is ∅
2. {ε} is a regular language and ε is its regular
expression
3. For each a ∈ Σ, {a} is a regular language and
its regular expression is a
Regular Expression
4. If L1 and L2 are regular languages with regular
expressions r1 and r2 then
-- L1 ∪ L2 is a regular language with regular
expression (r1 + r2)
-- L1L2 is a regular language with regular
expression (r1r2)
-- L1* is a regular language with regular
expression (r1*)
Only languages obtainable by using rules 1-4 are
regular languages.
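The recursive definition maps directly onto a small data structure with one case per rule. The sketch below is illustrative only: the class names (`Empty`, `Epsilon`, `Sym`, `Alt`, `Concat`, `Star`) and the function `language` are assumptions, alphabet symbols are assumed to be single characters, and the language is only computed up to a length bound n because starred languages are usually infinite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty:        # rule 1: the empty language
    pass


@dataclass(frozen=True)
class Epsilon:      # rule 2: the language containing only the empty string
    pass


@dataclass(frozen=True)
class Sym:          # rule 3: a single alphabet symbol
    a: str


@dataclass(frozen=True)
class Alt:          # rule 4: union, written (r1 + r2)
    r1: object
    r2: object


@dataclass(frozen=True)
class Concat:       # rule 4: concatenation, written (r1r2)
    r1: object
    r2: object


@dataclass(frozen=True)
class Star:         # rule 4: Kleene star, written (r*)
    r: object


def language(r, n):
    """Strings of length <= n in the language described by r."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Epsilon):
        return {""}
    if isinstance(r, Sym):
        return {r.a} if n >= 1 else set()
    if isinstance(r, Alt):
        return language(r.r1, n) | language(r.r2, n)
    if isinstance(r, Concat):
        left, right = language(r.r1, n), language(r.r2, n)
        return {x + y for x in left for y in right if len(x + y) <= n}
    if isinstance(r, Star):
        base = language(r.r, n)
        result, frontier = {""}, {""}
        while frontier:  # keep appending elements until nothing new fits
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= n} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")


# (110)*(0 + 1)
r = Concat(Star(Concat(Sym("1"), Concat(Sym("1"), Sym("0")))),
           Alt(Sym("0"), Sym("1")))
print(sorted(language(r, 4)))   # ['0', '1', '1100', '1101']
```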
Regular Expressions
• Some shorthand
– If we assign precedence to the operators, we can
relax the fully parenthesized definition:
• Kleene star has the highest precedence
• Concatenation has middle precedence
• + has the lowest precedence
– Thus
• a + b*c is the same as (a + ((b*)c))
• (a + b)* is not the same as a + b*
Regular Expressions
• More shorthand
– Equating regular expressions.
• Two regular expressions are considered equal if they
describe the same language
• 1*1* =1*
• (a + b)* ≠ a + b*
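One way to probe whether two expressions describe the same language is to compare them on every string up to some length. The sketch below does this with Python's `re` module, where union is written `|`. The helper names are hypothetical, and agreement up to the bound is only evidence of equality, not a proof; a single disagreement, however, does prove the expressions are different.

```python
import re
from itertools import product


def strings_up_to(alphabet, max_len):
    """All strings over the alphabet, shortest first."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def first_difference(p1, p2, alphabet, max_len=6):
    """A shortest string on which the two patterns disagree, or None."""
    c1, c2 = re.compile(p1), re.compile(p2)
    for s in strings_up_to(alphabet, max_len):
        if bool(c1.fullmatch(s)) != bool(c2.fullmatch(s)):
            return s
    return None


print(first_difference("1*1*", "1*", "1"))        # None
print(first_difference("(a|b)*", "a|b*", "ab"))   # aa
```

The string `aa` is in the language of (a + b)* but not in the language of a + b*, which contains only `a` and strings made entirely of `b`.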
Regular Expressions
• Even more shorthand
– Sometimes you might see in the book:
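• r^n, where n indicates the number of concatenations of
r (e.g. r^6 = rrrrrr)
• r^+ to indicate one or more concatenations of r (that is, rr*).
– Note that this is only shorthand!
– r^6 and r^+ are not part of the formal definition; each abbreviates a regular expression built from rules 1-4.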
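Both shorthands can be expanded mechanically into the basic operators. The sketch below does so for patterns in Python's `re` syntax; `power` and `plus` are hypothetical helper names, and `(?:...)` is Python's non-capturing group, used here purely as parentheses.

```python
import re


def power(r: str, n: int) -> str:
    """Expand r^n into n explicit concatenations of r (r^0 is the empty string)."""
    return f"(?:{r})" * n if n > 0 else "(?:)"


def plus(r: str) -> str:
    """Expand r^+ into r followed by r*."""
    return f"(?:{r})(?:{r})*"


print(power("ab", 3))                                 # (?:ab)(?:ab)(?:ab)
print(bool(re.fullmatch(power("ab", 3), "ababab")))   # True
print(bool(re.fullmatch(plus("0|1"), "")))            # False
```

Python's own `{n}` and `+` operators are the same kind of convenience: `(?:ab){3}` accepts exactly the strings that `power("ab", 3)` accepts.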
Regular Expressions
• Important thing to remember
– A regular expression is not a language
– A regular expression is used to describe a
language.
• It is incorrect to say that for a language L,
– L = (a + b + c)*
• But it’s okay to say that L is described by
– (a + b + c)*
Examples of Regular Languages
• All finite languages are regular
– Can anyone tell why?
Examples of Regular Languages
• All finite languages are regular
– A finite language L can be expressed as the
union of languages, each containing a single
string of L
– Example:
• L = {a, aa, aba, aca}
• L = {a} ∪ {aa} ∪ {aba} ∪ {aca}
• Regular expression: (a + aa + aba + aca)
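The construction is mechanical, so it can be sketched in a few lines of Python. The helper name `finite_language_regex` is an assumption, the output uses Python's `|` for union, and the never-matching pattern `(?!)` stands in for the empty language.

```python
import re


def finite_language_regex(language):
    """Build a pattern that is the union of one alternative per string."""
    if not language:
        return "(?!)"   # empty language: a pattern that matches nothing
    # Sorting makes the output deterministic; re.escape protects characters
    # that have a special meaning in Python's regex syntax.
    return "|".join(re.escape(s) for s in sorted(language))


L = {"a", "aa", "aba", "aca"}
pattern = finite_language_regex(L)
print(pattern)                                  # a|aa|aba|aca
print(bool(re.fullmatch(pattern, "aca")))       # True
print(bool(re.fullmatch(pattern, "abc")))       # False
```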
Examples of Regular Languages
• L = {x ∈ {0,1}* | |x| is even}
– Any string of even length can be obtained by
concatenating strings of length 2.
– Any concatenation of strings of length 2 has
even length.
– L = {00, 01, 10, 11}*
– Regular expressions describing L:
• (00 + 01 + 10 +11)*
• ((0 + 1)(0 +1))*
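As a sanity check, both expressions can be compared against the definition of L on every binary string up to a small length. This is a sketch in Python's `re` syntax (union written `|`); the names `EVEN_A`, `EVEN_B`, and `agrees_with_definition` are hypothetical, and a bounded check is not a proof.

```python
import re
from itertools import product

EVEN_A = re.compile(r"(00|01|10|11)*")
EVEN_B = re.compile(r"((0|1)(0|1))*")


def agrees_with_definition(max_len=8):
    """True if both patterns accept exactly the even-length strings up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            expected = len(x) % 2 == 0
            if bool(EVEN_A.fullmatch(x)) != expected:
                return False
            if bool(EVEN_B.fullmatch(x)) != expected:
                return False
    return True


print(agrees_with_definition())   # True
```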
Examples of Regular Languages
• L = {x ∈ {0,1}* | x does not end in 01}
– If x does not end in 01, then either
• |x| < 2, or
• x ends in 00, 10, or 11
– A regular expression that describes L is:
• ε + 0 + 1 + (0 + 1)*(00 + 10 + 11)
Examples of Regular Languages
• L = {x ∈ {0,1}* | x contains an odd number
of 0s}
– Express x = yz
– y is a string of the form y = 1^i 0 1^j
– In z, there must be an even number of additional
0s, so z is a concatenation of zero or more strings of the form 0 1^k 0 1^m
– x can be described by (1*01*)(01*01*)*
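The same language can be recognized by a machine that only remembers whether the number of 0s seen so far is odd or even, a preview of the finite automata mentioned earlier. The sketch below compares such a two-state scan with the expression above; `has_odd_zeros`, `all_binary_strings`, and `ODD_ZEROS` are hypothetical names, and the comparison covers strings up to length 8 only.

```python
import re
from itertools import product

ODD_ZEROS = re.compile(r"(1*01*)(01*01*)*")


def has_odd_zeros(x: str) -> bool:
    """Two-state scan: flip the state on every 0, stay put on every 1."""
    odd = False
    for symbol in x:
        if symbol == "0":
            odd = not odd
        elif symbol != "1":
            raise ValueError(f"not a binary string: {x!r}")
    return odd


def all_binary_strings(max_len):
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


agree = all(bool(ODD_ZEROS.fullmatch(x)) == has_odd_zeros(x)
            for x in all_binary_strings(8))
print(agree)   # True
```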
Practical uses for regular expressions
• grep
– Global (search for) Regular Expressions and
Print
– Finds patterns of characters in a text file.
– grep man foo.txt
– grep -E '[ab]*c[de]?' foo.txt
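What grep does can be sketched in a few lines of Python: print every line that contains a match for the pattern. This is an illustration, not grep's implementation; the function name `grep` and the sample lines are assumptions, and Python's `re` syntax is close to, but not identical with, grep's extended syntax.

```python
import re


def grep(pattern, lines):
    """Yield the lines that contain a match for pattern anywhere in the line."""
    compiled = re.compile(pattern)
    for line in lines:
        if compiled.search(line):   # search, not fullmatch: a substring suffices
            yield line


text = ["manual", "human", "abc", "xyz"]
print(list(grep("man", text)))            # ['manual', 'human']
print(list(grep("[ab]*c[de]?", text)))    # ['abc']
```

Unlike language membership, grep asks whether some substring of the line is in the language, which is why `re.search` is the right call here.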
Practical uses for regular expressions
• How a compiler works
Source file → lexer → stream of tokens → parser → parse tree → codegen → object code
Practical uses for regular expressions
• How a compiler works
– The Lexical Analyzer (lexer) reads source code
and generates a stream of tokens
– What is a token?
• Identifier
• Keyword
• Number
• Operator
• Punctuation
Practical uses for regular expressions
• How a compiler works
– Tokens can be described using regular
expressions!
Examples of Regular Languages
• L = set of valid C keywords
– This is a finite set
– L can be described by
• if + else + while + do + goto + break + switch
+ …
Examples of Regular Languages
• L = set of valid C identifiers
– A valid C identifier begins with a letter or _
– A valid C identifier contains only letters, digits,
and _
– If we let:
• l = {a, b, …, z, A, B, …, Z}
• d = {0, 1, …, 9}
– Then a regular expression for L:
• (l + _)(l + d + _)*
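In Python's `re` syntax the letter and digit sets become character classes, so the expression can be written and tested directly. A minimal sketch, assuming ASCII letters only; `IDENTIFIER` and `is_identifier` are hypothetical names, and the check follows the slide in not excluding keywords.

```python
import re

# (l + _)(l + d + _)*  with l = [A-Za-z] and d = [0-9]
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(s: str) -> bool:
    """True if the whole string has the shape of a C identifier."""
    return IDENTIFIER.fullmatch(s) is not None


print(is_identifier("_tmp1"))   # True
print(is_identifier("1abc"))    # False
```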
Practical uses for regular expressions
• lex
– Program that will create a lexical analyzer.
– Input: set of valid tokens
– Tokens are given by regular expressions.
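The idea behind lex can be sketched with Python's `re` module: give each token class a regular expression, combine them into one pattern, and scan the input. This is an illustration of the approach, not lex itself. The token names, the operator and punctuation sets, and the keyword list (a small subset of C's keywords) are assumptions.

```python
import re

KEYWORDS = {"if", "else", "while", "do", "goto", "break", "switch"}

# One (name, regular expression) pair per token class; order matters.
TOKEN_SPEC = [
    ("NUMBER",      r"[0-9]+"),
    ("IDENTIFIER",  r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OPERATOR",    r"==|<=|>=|!=|[+\-*/<>=]"),
    ("PUNCTUATION", r"[(){};,]"),
    ("SKIP",        r"\s+"),     # whitespace separates tokens
    ("MISMATCH",    r"."),       # any other character is an error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})"
                             for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Yield (kind, text) pairs for each token in the source string."""
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r}")
        if kind == "IDENTIFIER" and text in KEYWORDS:
            kind = "KEYWORD"     # keywords have the same shape as identifiers
        yield kind, text


for token in tokenize("while (x1 <= 10) x1 = x1 + 1;"):
    print(token)
```

Keywords are recognized with the identifier expression and then reclassified by a table lookup, a common design choice that keeps the combined pattern small.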
Summary
• Regular languages can be expressed using
only the set operations of union,
concatenation, and Kleene star.
• Regular languages
– Means of describing: Regular Expression
– Machine for accepting: Finite Automata
• Practical uses
– Text search (grep)
– Compilers / Lexical Analysis (lex)
