By Niko Adrianus Yuwono
BUZOO PHP TEAM
REGULAR EXPRESSIONS LECTURE
What Are Regular Expressions?
• Regular expressions, or regex (the term used for most of this presentation), are a powerful tool for examining and modifying text.
• Regex uses a general pattern notation that lets you describe and parse text.
• PHP has supported two types of regular expressions: POSIX-extended and Perl-Compatible Regular Expressions (PCRE). The POSIX-extended functions (ereg and related) were deprecated in PHP 5.3 and removed in PHP 7.0. This lecture focuses on PCRE.
Delimiters
• When using the PCRE functions, the pattern must be enclosed in delimiters.
• Commonly used delimiters are forward slashes (/), hash signs (#), and tildes (~).
• If the delimiter also appears inside the pattern, it must be escaped with a backslash (\).
• Examples of usage:
• /([^\/ | ^-]+)\.html/
• /<\/span>(.*?)<\/span>/
Literal-Characters
• Literal characters are normal characters that match themselves. Alphanumeric characters and most symbols are examples of literal characters.
• To match a meta-character as a literal character, put a backslash (\) before it. The backslash tells the regex engine to treat that character literally rather than as a meta-character.
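The effect of the backslash can be sketched as follows. The sample strings are invented for illustration; `preg_quote` is PHP's built-in helper for escaping meta-characters in text that should be matched literally.

```php
<?php
// Hypothetical sample input.
$price = 'Total: 3.50 USD';

// Unescaped, the dot is a meta-character and matches any character.
$loose = preg_match('/3.50/', 'Total: 3x50 USD');

// Escaped with a backslash, the dot only matches a literal ".".
$strict = preg_match('/3\.50/', 'Total: 3x50 USD');
$exact  = preg_match('/3\.50/', $price);

// preg_quote() adds the backslashes for you; the second argument
// names the delimiter so that it gets escaped as well.
$quoted = preg_quote('3.50 (net)', '/');

echo $loose, ' ', $strict, ' ', $exact, "\n"; // 1 0 1
echo $quoted, "\n";                           // 3\.50 \(net\)
```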
Meta-characters
• Meta-characters are the main source of power in regular expressions: they make it possible to encode alternatives and repetitions in a pattern.
• Meta-characters fall into two groups: those recognized outside a character class and those recognized inside a character class.
Meta-characters Cont’d
• Meta-characters recognized outside a class:
• \ , ^ , $ , . , [ , ] , | , ( , ) , ? , * , + , { , }
• Meta-characters recognized inside a class:
• \ , ^ , -
Character Classes
• A character class starts with an opening square bracket ([) and ends with a closing square bracket (]).
• A character class matches a single character in the subject; that character must be in the set of characters defined by the class.
• Examples:
• [a-z] matches any lowercase letter
• [^A-Z] matches any character that is not an uppercase letter
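The two example classes can be tried out with `preg_match_all`, which collects every match. This is only a sketch; the subject strings are made up for illustration.

```php
<?php
// Hypothetical sample input.
$subject = 'Regex 101';

// [a-z] matches one lowercase letter at a time.
preg_match_all('/[a-z]/', $subject, $lower);

// [^A-Z] matches one character that is NOT an uppercase letter
// (this includes the digits and the space).
preg_match_all('/[^A-Z]/', $subject, $notUpper);

// Inside a class most meta-characters lose their special meaning:
// here "." and "+" are plain literals.
preg_match_all('/[.+]/', '1.5+2', $symbols);

echo implode('', $lower[0]), "\n";    // egex
echo implode('', $notUpper[0]), "\n"; // egex 101
echo implode('', $symbols[0]), "\n";  // .+
```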
Subpatterns
• Subpatterns are delimited by parentheses (round brackets) and can be nested.
• A subpattern does two things:
1. It localizes a set of alternatives. For example, the pattern hen(dy|rio|ri) matches one of the words "hendy", "henrio", or "henri". Without the parentheses, hendy|rio|ri would match "hendy", "rio", or "ri".
2. It sets up the subpattern as a capturing subpattern: the part of the subject that matched the subpattern is returned separately, alongside the overall match.
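A small sketch of how parentheses localize alternatives: the same three words are tested against the grouped and the ungrouped pattern. The variable names are illustrative only.

```php
<?php
// Grouped: "hen" must be followed by one of the three alternatives.
$grouped = '/hen(dy|rio|ri)/';

// Ungrouped: the whole pattern is split into "hendy", "rio", or "ri".
$ungrouped = '/hendy|rio|ri/';

foreach (['hendy', 'henrio', 'rio'] as $word) {
    printf(
        "%-7s grouped=%d ungrouped=%d\n",
        $word,
        preg_match($grouped, $word),
        preg_match($ungrouped, $word)
    );
}
```

Only the ungrouped pattern accepts the bare word "rio", because without the parentheses "hen" belongs to the first alternative alone.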
Subpatterns Cont’d
• For example, if the string "kafji tinggi" is matched against the pattern ((kafji|niko) (tinggi|tampan)), the captured substrings are "kafji tinggi", "kafji", and "tinggi", numbered 1, 2, and 3. Groups are numbered by counting opening parentheses from left to right.
• Often we need grouping but not capturing. In that case, add ?: right after the opening parenthesis, as in (?:kafji|niko).
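The numbering of captured substrings can be checked with `preg_match`, which stores the whole match at index 0 and the groups after it. A minimal sketch, with the second pattern added here purely to illustrate the non-capturing form:

```php
<?php
$subject = 'kafji tinggi';

// Every "(" opens a capturing group, numbered by the position of its "(".
preg_match('/((kafji|niko) (tinggi|tampan))/', $subject, $captured);
// $captured[0] is the whole match; [1], [2], [3] are the groups.
print_r($captured);

// "(?:" groups without capturing, so only the second word gets a number.
preg_match('/(?:kafji|niko) (tinggi|tampan)/', $subject, $partial);
print_r($partial);
```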
Optional Items
• The question mark makes the preceding token in the regular expression optional.
• Example: colou?r matches both colour and color.
• You can also wrap a set of characters in parentheses to make the whole group optional.
• Example: Jan(uary)? matches both Jan and January.
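Both optional-item examples can be sketched with `preg_grep`, which filters an array down to the entries that match. The anchors ^ and $ and the extra non-matching words are additions for this illustration, so that only whole words are accepted.

```php
<?php
// "u?" makes the single preceding character optional.
$spellings = preg_grep('/^colou?r$/', ['color', 'colour', 'colouur']);

// "(uary)?" makes the whole group optional.
$months = preg_grep('/^Jan(uary)?$/', ['Jan', 'January', 'Janu']);

print_r($spellings); // color, colour
print_r($months);    // Jan, January
```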
Repetition
• There are two repetition characters: star (*) and plus (+).
• The star (*) tries to match the preceding token zero or more times.
• The plus (+) tries to match the preceding token one or more times.
• Examples:
• [\s\S]+ matches any character, one or more times
• [\s\S]* matches any character, zero or more times
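The difference between zero-or-more and one-or-more shows up at the lower bound. The following sketch uses invented sample words and adds the anchors ^ and $ so the whole string has to match.

```php
<?php
$words = ['ac', 'abc', 'abbbc'];

// "*" allows zero occurrences of "b", so "ac" still matches.
$star = preg_grep('/^ab*c$/', $words);

// "+" needs at least one "b", so "ac" is rejected.
$plus = preg_grep('/^ab+c$/', $words);

// [\s\S] means "whitespace or non-whitespace", i.e. any character,
// including newlines. Tested here against the empty string.
$anyPlus = preg_match('/^[\s\S]+$/', ''); // 0: "+" needs one character
$anyStar = preg_match('/^[\s\S]*$/', ''); // 1: zero characters is allowed

print_r($star);
print_r($plus);
```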
Limiting Repetition
• Sometimes we need to limit how often a token repeats. For that we use curly braces { }.
• The syntax is {min,max}. min is required. If max is left empty, as in {min,}, there is no upper limit. If both the comma and max are omitted, as in {min}, the token is repeated exactly min times.
• Example:
• ([A-Z]{3}|[0-9]{4}) matches three uppercase letters or four digits
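The three forms of the curly-brace syntax can be compared side by side. This is a sketch with made-up sample codes; the anchors ^ and $ are an addition so that the counts apply to the whole string.

```php
<?php
// Hypothetical sample input.
$codes = ['ABC', '1234', 'AB', '12345', 'ABCD'];

// {3} and {4}: exactly three letters or exactly four digits.
// Without the anchors, "12345" would match on its first four digits.
$exact = preg_grep('/^([A-Z]{3}|[0-9]{4})$/', $codes);

// {2,4}: between two and four letters.
$range = preg_grep('/^[A-Z]{2,4}$/', $codes);

// {2,}: two or more digits, no upper limit.
$openEnd = preg_grep('/^[0-9]{2,}$/', $codes);

print_r($exact);   // ABC, 1234
print_r($range);   // ABC, AB, ABCD
print_r($openEnd); // 1234, 12345
```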
Greediness
• Greediness concerns tokens that give the regex engine a choice, such as an optional item that it may either match or skip.
• Given that choice, the engine always tries to match first, and skips the item only if matching fails. This can cause trouble and return an unexpected result.
• For example, when the regex Feb 23(rd)? is applied to the string "Today is Feb 23rd, 2003", the match is always Feb 23rd, never Feb 23.
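A quick way to observe this is to run the same pattern against a string with and without the "rd" suffix. The second subject string is invented for this sketch.

```php
<?php
// The optional group is taken whenever the text allows it...
preg_match('/Feb 23(rd)?/', 'Today is Feb 23rd, 2003', $withSuffix);

// ...and skipped only when "rd" is not there to match.
preg_match('/Feb 23(rd)?/', 'Today is Feb 23, 2003', $withoutSuffix);

echo $withSuffix[0], "\n";    // Feb 23rd
echo $withoutSuffix[0], "\n"; // Feb 23
```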
Greediness Cont’d
• Example with repetition:
• Suppose you want to extract HTML tags while crawling a website. Beginners often use <.+> to match a tag, but it returns a different result than expected. Try matching that pattern against the string "Saya <b>suka</b> makan".
• The result is <b>suka</b>
• Why?
Greediness Cont’d
• The cause is greediness: in the pattern <.+>, the .+ tries to match as many characters as possible.
• Let's go through it step by step.
• First, the regex engine searches the string "Saya <b>suka</b> makan" for a <, so "Saya " is skipped.
• After finding <, it runs .+, which means "any character, one or more times", so it consumes everything from b to the end of the string. The pattern still needs a >, so the engine backtracks one character at a time until it reaches the last > in the string. The result is therefore <b>suka</b>, not <b> and </b> separately.
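The walk-through above can be reproduced with `preg_match` and `preg_match_all`. This is a minimal sketch that uses the same sample string as the slides.

```php
<?php
$html = 'Saya <b>suka</b> makan';

// Greedy ".+" runs to the end of the string, then backtracks
// only as far as the LAST ">".
preg_match('/<.+>/', $html, $match);
echo $match[0], "\n"; // <b>suka</b>

// preg_match_all confirms there is just one (oversized) match,
// not one match per tag.
$count = preg_match_all('/<.+>/', $html, $all);
echo $count, "\n"; // 1
```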
Laziness
• How do we fix the greediness problem? Make the quantifier lazy by adding a question mark (?) after the repetition operator (*, +, or { }) or after the optional question mark itself, giving *?, +?, {min,max}?, and ??.
• An alternative to laziness is a negated character class.
• Example for the previous problem:
• <[^>]+> matches a <, then one or more characters that are not >, then the closing >, so <b> and </b> are matched separately
