MODULE 3 – PART 4
REGULAR EXPRESSIONS
By,
Ravi Kumar B N
Assistant professor, Dept. of CSE
BMSIT & M
INTRODUCTION
➢ A regular expression is a sequence of characters that defines a search pattern.
➢ Such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation.
➢ The regular expression library "re" must be imported into a program before it can be used.
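The three uses named above (find, find and replace, input validation) can be sketched as follows. This is only an illustration: re.sub() and re.fullmatch() are standard functions of the "re" library that the slides do not cover, and the sample strings are chosen here for the example.

```python
import re

line = "From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008"

# Find: is the literal text "From:" present anywhere in the line?
found = re.search("From:", line) is not None

# Find and replace: swap the literal text "From:" for "Sender:".
replaced = re.sub("From:", "Sender:", line)

# Input validation: accept a string only if it consists entirely of digits.
is_year = re.fullmatch("[0-9]+", "2008") is not None

print(found)
print(replaced)
print(is_year)
```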
SEARCH() FUNCTION
➢ search() function: searches a string for a pattern and returns only the first occurrence that matches. This function is available in the "re" library.
➢ The caret character (^): matches the beginning of a line.
➢ The dollar character ($): matches the end of a line.
Example: program to match only lines where "From:" is at the beginning of the line
mbox1.txt
From:stephen Sat Jan 5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
Subject: [sakai] svn commit:
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
Return-Path: <postmaster@collab.sakaiproject.org>
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    # keep only lines that begin with "From:"
    if re.search('^From:', line):
        print(line)
#Output
From:stephen Sat Jan 5 09:14:16 2008
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
✓ The test re.search('^From:', line) is equivalent to line.startswith('From:'), which uses the startswith() string method.
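A small sketch of the two anchors, using three lines copied from mbox1.txt into a list so that no file is needed. The pattern '2008$' is an illustrative choice made here, not one taken from the slides.

```python
import re

lines = [
    "From:stephen Sat Jan 5 09:14:16 2008",
    "Return-Path: <postmaster@collab.sakaiproject.org>",
    "Subject: [sakai] svn commit:",
]

# '^From:' and startswith('From:') select the same lines.
by_regex = [ln for ln in lines if re.search("^From:", ln)]
by_method = [ln for ln in lines if ln.startswith("From:")]

# '2008$' keeps only the lines that end with 2008.
ends_2008 = [ln for ln in lines if re.search("2008$", ln)]

print(by_regex == by_method)
print(ends_2008)
```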
CHARACTER MATCHING IN REGULAR EXPRESSIONS
➢ The dot character (.): the most commonly used special character is the period ("dot" or full stop), which matches any single character except a newline.
The regular expression "F..m:" matches any of the following strings, since each period in the regular expression matches any character:
"From:", "Fxxm:", "F12m:", or "F!@m:"
➢ The program from the previous slide, rewritten using the dot character, gives the same output:
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    # F, any two characters, then "m:" at the start of the line
    if re.search('^F..m:', line):
        print(line)
#Output
From:stephen Sat Jan 5 09:14:16 2008
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
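The four strings listed for "F..m:" can be checked directly. In this sketch the two extra candidates, "Fm:" and "Form", are assumptions added here to show strings that do not match.

```python
import re

candidates = ["From:", "Fxxm:", "F12m:", "F!@m:", "Fm:", "Form"]

# Each dot must consume exactly one character, so "Fm:" (no characters
# between F and m) and "Form" (no "m:" after two characters) are rejected.
matched = [c for c in candidates if re.search("^F..m:", c)]

print(matched)
```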
A character can be repeated any number of times using the "*" or "+" characters in a regular expression.
➢ The asterisk character (*): matches zero or more occurrences of the preceding character
➢ The plus character (+): matches one or more occurrences of the preceding character
Example: Program to match lines that start with "From:", followed by a mail id
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    # "From:" at the start, at least one character, then an at-sign
    if re.search('^From:.+@', line):
        print(line)
#Output
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
✓ The search string "^From:.+@" matches lines that start with "From:", followed by one or more characters (".+"), followed by an at-sign. The ".+" wildcard matches all the characters between the colon and the at-sign.
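The difference between "+" and "*" shows up when nothing sits between the colon and the at-sign. A minimal sketch; the string "From:@umich.edu" is a made-up edge case, not a line from mbox1.txt.

```python
import re

with_name = "From:zqian@umich.edu"
empty_name = "From:@umich.edu"   # hypothetical line with no name before '@'

# '.+' needs at least one character between ':' and '@'.
plus_with_name = bool(re.search("^From:.+@", with_name))
plus_empty_name = bool(re.search("^From:.+@", empty_name))

# '.*' also accepts zero characters between ':' and '@'.
star_empty_name = bool(re.search("^From:.*@", empty_name))

print(plus_with_name, plus_empty_name, star_empty_name)
```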
➢ Non-whitespace character (\S): matches one non-whitespace character.
➢ findall() function: searches for all occurrences that match a given pattern. In contrast, search() returns only the first occurrence that matches the pattern.
Example1: Program returns a list of all the strings that look like email addresses in a given line.
import re
s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
# the r prefix (raw string) keeps the backslash in \S intact
lst = re.findall(r'\S+@\S+', s)
print(lst)
#Output
['csev@umich.edu', 'cwen@iupui.edu']
✓ The regular expression '\S+@\S+' matches substrings that have at least one non-whitespace character, followed by an at-sign, followed by at least one more non-whitespace character.
# The same program using search() displays only the first mail id (the first matching string)
import re
s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
lst = re.search(r'\S+@\S+', s)
print(lst)
#Output
<re.Match object; span=(11, 25), match='csev@umich.edu'>
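search() returns a match object rather than a string (or None when nothing matches). A short sketch of how the matched text and its position could be read from that object, using the group() and span() methods of the standard match object:

```python
import re

s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
match = re.search(r'\S+@\S+', s)

if match:
    first = match.group()        # the matched text itself
    start, end = match.span()    # where the match sits inside s
    print(first, start, end)
else:
    print('no match')
```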
Example2: Program returns a list of all the strings that look like email addresses in a given file.
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'\S+@\S+', line)
    # print only the lines that produced at least one match
    if len(x) > 0:
        print(x)
#Output
['<postmaster@collab.sakaiproject.org>']
['louis@media.berkeley.edu']
['zqian@umich.edu']
['<postmaster@collab.sakaiproject.org>']
➢ Square brackets "[]": square brackets indicate a set of acceptable characters, any one of which may match at that position.
Example: [a-z] matches a single lowercase letter
[A-Z] matches a single uppercase letter
[a-zA-Z] matches a single lowercase or uppercase letter
[a-zA-Z0-9] matches a single lowercase letter, uppercase letter, or digit
Some of our email addresses have unwanted characters like "<" or ";" at the beginning or end. We are only interested in the portion of the string that starts and ends with a letter or a number. To get the proper output, we use character sets.
[amk] matches 'a', 'm', or 'k'
[(+*)] matches any of the literal characters '(', '+', '*', or ')'
[0-5][0-9] matches all the two-digit numbers from 00 to 59
➢ Characters that are not within a range can be matched by complementing the set. If the first character of the set is '^', any character that is not in the set will be matched.
For example, [^5] will match any character except '5'.
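Each of the sets above can be tried with findall(). This is a sketch only; the input strings are invented here to keep the results short.

```python
import re

# [amk]: any one of the listed characters.
letters = re.findall(r"[amk]", "make")

# [(+*)]: inside brackets, '(', '+', '*' and ')' are ordinary characters.
symbols = re.findall(r"[(+*)]", "(x+y)*5")

# [0-5][0-9]: two-digit values from 00 to 59, so "60" is skipped.
minutes = re.findall(r"[0-5][0-9]", "07:59:60")

# [^5]: any character except '5'.
not_five = re.findall(r"[^5]", "1525")

print(letters, symbols, minutes, not_five)
```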
Ex: Program returns a list of all email addresses in the proper format.
import re
hand = open('mbox.txt')
for line in hand:
    line = line.rstrip()
    # must start with a letter or digit and end with a letter
    x = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', line)
    if len(x) > 0:
        print(x)
#Output
['postmaster@collab.sakaiproject.org']
['louis@media.berkeley.edu']
['zqian@umich.edu']
['postmaster@collab.sakaiproject.org']
✓ [a-zA-Z0-9]\S*@\S*[a-zA-Z]: matches substrings that start with a single lowercase letter, uppercase letter, or number "[a-zA-Z0-9]", followed by zero or more non-blank characters "\S*", followed by an at-sign, followed by zero or more non-blank characters "\S*", followed by an uppercase or lowercase letter "[a-zA-Z]".
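To see what the stricter pattern changes, both patterns can be applied to the same Return-Path line from the sample data. A side-by-side sketch:

```python
import re

line = "Return-Path: <postmaster@collab.sakaiproject.org>"

# The loose pattern keeps the surrounding angle brackets.
loose = re.findall(r"\S+@\S+", line)

# The stricter pattern must begin with a letter or digit and end with
# a letter, so '<' and '>' fall outside the match.
strict = re.findall(r"[a-zA-Z0-9]\S*@\S*[a-zA-Z]", line)

print(loose)
print(strict)
```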
SEARCH AND EXTRACT
mbox2.txt
From: stephen.marquard@uct.ac.za
Subject: [sakai] svn commit: r39772 - content/branches/sakai_2-5-x/conten
impl/impl/src/java/org
X-Content-Type-Outer-Envelope: text/plain; charset=UTF-8
X-Content-Type-Message-Body: text/plain; charset=UTF-8
Content-Type: text/plain; charset=UTF-8
X-DSPAM-Result: Innocent
X-DSPAM-Processed: Sat Jan 5 09:14:16 2008
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.9245
Example1: Find numbers on lines that start with the string "X-", such as: X-DSPAM-Confidence: 0.8475
import re
hand = open('mbox2.txt')
for line in hand:
    line = line.rstrip()
    # X, any non-blank characters, a colon and space, then digits or dots
    if re.search(r'^X\S*: [0-9.]+', line):
        print(line)
#Output
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.9245
The output above contains entire lines; we only want to extract the numbers from lines that have this syntax.
➢ Parentheses "()" in a regular expression: used to extract a portion of the substring that matches the regular expression.
import re
hand = open('mbox2.txt')
for line in hand:
    line = line.rstrip()
    # the parentheses mark the part that findall() returns
    x = re.findall(r'^X\S*: ([0-9.]+)', line)
    if len(x) > 0:
        print(x)
#Output
['0.8475']
['0.9245']
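findall() returns the extracted numbers as strings. A sketch of one possible next step, converting them to floats; the three lines are copied from mbox2.txt into a list, and the conversion step is an addition made here rather than something shown in the slides.

```python
import re

lines = [
    "X-DSPAM-Result: Innocent",
    "X-DSPAM-Confidence: 0.8475",
    "X-DSPAM-Probability: 0.9245",
]

values = []
for line in lines:
    found = re.findall(r"^X\S*: ([0-9.]+)", line)
    if found:
        # found[0] is the text captured by the parentheses
        values.append(float(found[0]))

print(values)
```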
Example2: Program to print the hour at which each mail was received
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    # capture the two digits that follow a space and precede a colon
    x = re.findall('^From.* ([0-3][0-9]):', line)
    if len(x) > 0:
        print(x)
#Output
['09']
['16']
['16']
RANDOM EXECUTION
>>> import re
>>> s = " 0.9 .90 1.0 1. 138 pqr"
>>> re.findall('[0-9.]+', s)
['0.9', '.90', '1.0', '1.', '138']
>>> re.findall('[0-9]+[.][0-9]', s)
['0.9', '1.0']
>>> re.findall('[0-9]+[.][0-9]+', s)
['0.9', '1.0']
>>> re.findall('[0-9]*[.][0-9]+', s)
['0.9', '.90', '1.0']
>>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1byis112, 1byee190"
>>> re.findall('1bycs...', usn)
['1bycs123', '1bycs009']
>>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
['1bycs123', '1bycs009']
>>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1vecs112, 1svcs190"
>>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
['1bycs123', '1bycs009', '1vecs112', '1svcs190']
>>> re.findall('[0-9]+cs[0-9]+', usn)
[]
>>> re.findall('[a-zA-Z0-9]+cs([0-9]+)', usn)
['123', '009', '112', '190']
ESCAPE CHARACTER
➢ The escape character (backslash "\") is a metacharacter in regular expressions. It allows special characters to be used without invoking their special meaning.
If you want to match 1+1=2, the correct regex is 1\+1=2. Otherwise, the plus sign has a special meaning.
For example, we can find money amounts with the following regular expression.
>>> import re
>>> x = 'We just received $10.00 for cookies.'
>>> y = re.findall(r'\$[0-9.]+', x)
>>> y
['$10.00']
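The 1+1=2 case can be tried directly. In this sketch the sample sentence is made up, and re.escape() is a standard "re" helper, not covered in the slides, that adds the backslashes automatically.

```python
import re

text = "We know 1+1=2 and 11=2 is wrong"

# Unescaped: '1+' means "one or more 1s", so the pattern finds "11=2".
unescaped = re.findall("1+1=2", text)

# Escaped: '\+' is a literal plus sign, so the pattern finds "1+1=2".
escaped = re.findall(r"1\+1=2", text)

# re.escape() builds the escaped pattern from the literal text.
auto = re.findall(re.escape("1+1=2"), text)

print(unescaped, escaped, auto)
```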
SUMMARY
Character   Meaning
^           Matches the beginning of the line
$           Matches the end of the line
.           Matches any character (a wildcard)
\s          Matches a whitespace character
\S          Matches a non-whitespace character (opposite of \s)
*           Applies to the immediately preceding character and matches zero or more occurrences of it
*?          Applies to the immediately preceding character and matches zero or more occurrences of it in "non-greedy mode"
+           Applies to the immediately preceding character and matches one or more occurrences of it
+?          Applies to the immediately preceding character and matches one or more occurrences of it in "non-greedy mode"
[aeiou]     Matches a single character as long as that character is in the specified set. In this example, it would match "a", "e", "i", "o", or "u", but no other characters.
[a-z0-9]    You can specify ranges of characters using the minus sign. This example is a single character that must be a lowercase letter or a digit.
[^A-Za-z]   When the first character in the set notation is a caret, it inverts the logic. This example matches a single character that is anything other than an uppercase or lowercase letter.
( )         When parentheses are added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall()
\b          Matches the empty string, but only at the start or end of a word
\B          Matches the empty string, but not at the start or end of a word
\d          Matches any decimal digit; equivalent to the set [0-9]
\D          Matches any non-digit character; equivalent to the set [^0-9]
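Several entries in the summary (\s, \d, \b, and the non-greedy forms) are not exercised in the earlier examples. A small sketch with sample strings invented for the purpose:

```python
import re

text = "From: Using the : character"

# Greedy '.+' runs to the last colon; non-greedy '.+?' stops at the first.
greedy = re.findall(r"^F.+:", text)
lazy = re.findall(r"^F.+?:", text)

# \d+ collects runs of digits.
digits = re.findall(r"\d+", "Jan 5 09:14:16 2008")

# \b restricts the match to whole words, so "concat" is skipped.
words = re.findall(r"\bcat\b", "cat concat cat.")

# \s matches each whitespace character (here a space and a tab).
spaces = re.findall(r"\s", "a b\tc")

print(greedy, lazy, digits, words, spaces)
```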
ASSIGNMENT
1) Write a Python program to check the validity of a password. In this program, the password is a combination of alphanumeric characters along with special characters, and the program checks whether it is valid against the conditions below.
Primary conditions for password validation:
1. Minimum 8 characters.
2. The alphabets must be between [a-z].
3. At least one alphabet should be upper case [A-Z].
4. At least 1 number or digit between [0-9].
5. At least 1 character from [ _ or @ or $ ].
2) Write a pattern for each of the following:
Pattern to extract lines starting with the word From (or from) and ending with edu.
Pattern to extract lines ending with any digit.
Pattern to extract lines that start with upper case letters and end with digits.
Search for the first white-space character in the string and display its position.
Replace every white-space character with the number 9; consider the sample text txt = "The rain in Spain".
THANK
YOU

Weitere ähnliche Inhalte

Was ist angesagt? (20)

Namespaces
NamespacesNamespaces
Namespaces
 
Modules and packages in python
Modules and packages in pythonModules and packages in python
Modules and packages in python
 
Adv. python regular expression by Rj
Adv. python regular expression by RjAdv. python regular expression by Rj
Adv. python regular expression by Rj
 
Python File Handling | File Operations in Python | Learn python programming |...
Python File Handling | File Operations in Python | Learn python programming |...Python File Handling | File Operations in Python | Learn python programming |...
Python File Handling | File Operations in Python | Learn python programming |...
 
Methods in Java
Methods in JavaMethods in Java
Methods in Java
 
File Handling Python
File Handling PythonFile Handling Python
File Handling Python
 
Python : Data Types
Python : Data TypesPython : Data Types
Python : Data Types
 
File handling in Python
File handling in PythonFile handling in Python
File handling in Python
 
Datastructures in python
Datastructures in pythonDatastructures in python
Datastructures in python
 
Strings in python
Strings in pythonStrings in python
Strings in python
 
Oop concepts in python
Oop concepts in pythonOop concepts in python
Oop concepts in python
 
Java Streams
Java StreamsJava Streams
Java Streams
 
Constructor in java
Constructor in javaConstructor in java
Constructor in java
 
Java Collections Tutorials
Java Collections TutorialsJava Collections Tutorials
Java Collections Tutorials
 
Print input-presentation
Print input-presentationPrint input-presentation
Print input-presentation
 
Functions in python slide share
Functions in python slide shareFunctions in python slide share
Functions in python slide share
 
Python Exception Handling
Python Exception HandlingPython Exception Handling
Python Exception Handling
 
Chapter 03 python libraries
Chapter 03 python librariesChapter 03 python libraries
Chapter 03 python libraries
 
Python Datatypes by SujithKumar
Python Datatypes by SujithKumarPython Datatypes by SujithKumar
Python Datatypes by SujithKumar
 
02 c++ Array Pointer
02 c++ Array Pointer02 c++ Array Pointer
02 c++ Array Pointer
 

Ähnlich wie Python Regular Expressions

Pythonlearn-11-Regex.pptx
Pythonlearn-11-Regex.pptxPythonlearn-11-Regex.pptx
Pythonlearn-11-Regex.pptxDave Tan
 
Regular_Expressions.pptx
Regular_Expressions.pptxRegular_Expressions.pptx
Regular_Expressions.pptxDurgaNayak4
 
scanf function in c, variations in conversion specifier
scanf function in c, variations in conversion specifierscanf function in c, variations in conversion specifier
scanf function in c, variations in conversion specifierherosaikiran
 
Regular expressions in oracle
Regular expressions in oracleRegular expressions in oracle
Regular expressions in oracleLogan Palanisamy
 
Regular Expressions 2007
Regular Expressions 2007Regular Expressions 2007
Regular Expressions 2007Geoffrey Dunn
 
Regular expressions
Regular expressionsRegular expressions
Regular expressionsRaj Gupta
 
For this assignment, download the A6 code pack. This zip fil.docx
For this assignment, download the A6 code pack. This zip fil.docxFor this assignment, download the A6 code pack. This zip fil.docx
For this assignment, download the A6 code pack. This zip fil.docxalfred4lewis58146
 
Shad_Cryptography_PracticalFile_IT_4th_Year (1).docx
Shad_Cryptography_PracticalFile_IT_4th_Year (1).docxShad_Cryptography_PracticalFile_IT_4th_Year (1).docx
Shad_Cryptography_PracticalFile_IT_4th_Year (1).docxSonu62614
 
Python programming: Anonymous functions, String operations
Python programming: Anonymous functions, String operationsPython programming: Anonymous functions, String operations
Python programming: Anonymous functions, String operationsMegha V
 
regular-expression.pdf
regular-expression.pdfregular-expression.pdf
regular-expression.pdfDarellMuchoko
 
Beginning with vi text editor
Beginning with vi text editorBeginning with vi text editor
Beginning with vi text editorJose Pla
 
Programming in lua STRING AND ARRAY
Programming in lua STRING AND ARRAYProgramming in lua STRING AND ARRAY
Programming in lua STRING AND ARRAYvikram mahendra
 
String in programming language in c or c++
String in programming language in c or c++String in programming language in c or c++
String in programming language in c or c++Azeemaj101
 
Maxbox starter20
Maxbox starter20Maxbox starter20
Maxbox starter20Max Kleiner
 
Python regular expressions
Python regular expressionsPython regular expressions
Python regular expressionsKrishna Nanda
 
Python (regular expression)
Python (regular expression)Python (regular expression)
Python (regular expression)Chirag Shetty
 

Ähnlich wie Python Regular Expressions (20)

Pythonlearn-11-Regex.pptx
Pythonlearn-11-Regex.pptxPythonlearn-11-Regex.pptx
Pythonlearn-11-Regex.pptx
 
Regular_Expressions.pptx
Regular_Expressions.pptxRegular_Expressions.pptx
Regular_Expressions.pptx
 
P3 2017 python_regexes
P3 2017 python_regexesP3 2017 python_regexes
P3 2017 python_regexes
 
scanf function in c, variations in conversion specifier
scanf function in c, variations in conversion specifierscanf function in c, variations in conversion specifier
scanf function in c, variations in conversion specifier
 
Regular expressions in oracle
Regular expressions in oracleRegular expressions in oracle
Regular expressions in oracle
 
Regular Expressions 2007
Regular Expressions 2007Regular Expressions 2007
Regular Expressions 2007
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
P3 2018 python_regexes
P3 2018 python_regexesP3 2018 python_regexes
P3 2018 python_regexes
 
For this assignment, download the A6 code pack. This zip fil.docx
For this assignment, download the A6 code pack. This zip fil.docxFor this assignment, download the A6 code pack. This zip fil.docx
For this assignment, download the A6 code pack. This zip fil.docx
 
Shad_Cryptography_PracticalFile_IT_4th_Year (1).docx
Shad_Cryptography_PracticalFile_IT_4th_Year (1).docxShad_Cryptography_PracticalFile_IT_4th_Year (1).docx
Shad_Cryptography_PracticalFile_IT_4th_Year (1).docx
 
php string part 4
php string part 4php string part 4
php string part 4
 
Python programming: Anonymous functions, String operations
Python programming: Anonymous functions, String operationsPython programming: Anonymous functions, String operations
Python programming: Anonymous functions, String operations
 
regular-expression.pdf
regular-expression.pdfregular-expression.pdf
regular-expression.pdf
 
lecture_lex.pdf
lecture_lex.pdflecture_lex.pdf
lecture_lex.pdf
 
Beginning with vi text editor
Beginning with vi text editorBeginning with vi text editor
Beginning with vi text editor
 
Programming in lua STRING AND ARRAY
Programming in lua STRING AND ARRAYProgramming in lua STRING AND ARRAY
Programming in lua STRING AND ARRAY
 
String in programming language in c or c++
String in programming language in c or c++String in programming language in c or c++
String in programming language in c or c++
 
Maxbox starter20
Maxbox starter20Maxbox starter20
Maxbox starter20
 
Python regular expressions
Python regular expressionsPython regular expressions
Python regular expressions
 
Python (regular expression)
Python (regular expression)Python (regular expression)
Python (regular expression)
 

Mehr von BMS Institute of Technology and Management (11)

Software Engineering and Introduction, Activities and ProcessModels
Software Engineering and Introduction, Activities and ProcessModels Software Engineering and Introduction, Activities and ProcessModels
Software Engineering and Introduction, Activities and ProcessModels
 
Pytho_tuples
Pytho_tuplesPytho_tuples
Pytho_tuples
 
Pytho dictionaries
Pytho dictionaries Pytho dictionaries
Pytho dictionaries
 
Pytho lists
Pytho listsPytho lists
Pytho lists
 
File handling in Python
File handling in PythonFile handling in Python
File handling in Python
 
Introduction to the Python
Introduction to the PythonIntroduction to the Python
Introduction to the Python
 
15CS562 AI VTU Question paper
15CS562 AI VTU Question paper15CS562 AI VTU Question paper
15CS562 AI VTU Question paper
 
weak slot and filler
weak slot and fillerweak slot and filler
weak slot and filler
 
strong slot and filler
strong slot and fillerstrong slot and filler
strong slot and filler
 
Problems, Problem spaces and Search
Problems, Problem spaces and SearchProblems, Problem spaces and Search
Problems, Problem spaces and Search
 
Introduction to Artificial Intelligence and few examples
Introduction to Artificial Intelligence and few examplesIntroduction to Artificial Intelligence and few examples
Introduction to Artificial Intelligence and few examples
 

Kürzlich hochgeladen

Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayEpec Engineered Technologies
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdfAldoGarca30
 
Verification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptxVerification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptxchumtiyababu
 
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxOrlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxMuhammadAsimMuhammad6
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXssuser89054b
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdfKamal Acharya
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxmaisarahman1
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaOmar Fathy
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdfKamal Acharya
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network DevicesChandrakantDivate1
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdfKamal Acharya
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Call Girls Mumbai
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEselvakumar948
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiessarkmank1
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 

Kürzlich hochgeladen (20)

Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
 
Verification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptxVerification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptx
 
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxOrlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 

Python Regular Expressions

  • 1. MODULE 3 – PART 4 REGULAR EXPRESSIONS By, Ravi Kumar B N Assistant professor, Dept. of CSE BMSIT & M
  • 2. ➢ Regular expression is a sequence of characters that define a search pattern. ➢ patterns are used by string searching algorithms for "find" or "find and replace" operations on strings, or for input validation. ➢ The regular expression library “re” must be imported into our program before we can use it. INTRODUCTION
  • 3. ➢ search() function: used to search for a particular string. will only return the first occurrence that matches the specified pattern. This function is available in “re” library. ➢ the caret character (^) : is used in regular expressions to match the beginning of a line. ➢ The dollar character ($) : is used in regular expressions to match the end of a line. Example: program to match only lines where “From:” is at the beginning of the line import re hand = open('mbox1.txt') for line in hand: line = line.rstrip() if re.search('^From:', line) : print(line) #Output From:stephen Sat Jan 5 09:14:16 2008 From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008 From:zqian@umich.edu Fri Jan 4 16:10:39 2008 mbox1.txt From:stephen Sat Jan 5 09:14:16 2008 Return-Path: <postmaster@collab.sakaiproject.org> From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008 Subject: [sakai] svn commit: From:zqian@umich.edu Fri Jan 4 16:10:39 2008 Return-Path: <postmaster@collab.sakaiproject.org> ✓ The instruction re.search('^From:', line) equivalent with the startswith() method from the string library. SEARCH() FUNCTION:
  • 4. ➢ The dot character (.) : The most commonly used special character is the period (”dot”) or full stop, which matches any character. The regular expression “F..m:” would match any of the following strings since the period characters in the regular expression match any character. “From:”, “Fxxm:”, “F12m:”, or “F!@m:” ➢ The program in the previous slide is rewritten using dot character which gives the same output CHARACTER MATCHING IN REGULAR EXPRESSIONS import re hand = open('mbox1.txt') for line in hand: line = line.rstrip() if re.search(‘^F..m:', line) : print(line) #Output From:stephen Sat Jan 5 09:14:16 2008 From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008 From:zqian@umich.edu Fri Jan 4 16:10:39 2008
  • 5. Character can be repeated any number of times using the “*” or “+” characters in a regular expression. ➢ The Asterisk character (*) : matches zero-or-more characters ➢ The Plus character (+) : matches one-or-more characters Example: Program to match lines that start with “From:”, followed by mail-id import re hand = open('mbox1.txt') for line in hand: line = line.rstrip() if re.search(‘^From:.+@', line) : print(line) #Output From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008 From:zqian@umich.edu Fri Jan 4 16:10:39 2008 ✓ The search string “ˆFrom:.+@” will successfully match lines that start with “From:”, followed by one or more characters (“.+”), followed by an at-sign. The “.+” wildcard matches all the characters between the colon character and the at-sign.
  • 6. ➢ non-whitespace character (S) - matches one non-whitespace character ➢findall() function: It is used to search for “all” occurrences that match a given pattern. In contrast, search() function will only return the first occurrence that matches the specified pattern. import re s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM' lst = re.findall('S+@S+', s) print(lst) #output ['csev@umich.edu', 'cwen@iupui.edu'] Example1: Program returns a list of all of the strings that look like email addresses from a given line. # same program using search() it will display only first mail id or first matching string import re s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM' lst = re.search('S+@S+', s) print(lst) #output <re.Match object; span=(11, 25), match='csev@umich.edu'> 'S+@S+’ this regular expression matches substrings that have at least one non-whitespace character, followed by an at-sign, followed by at least one more non-whitespace character
  • 7. Example 2: Program returns a list of all of the strings that look like email addresses in a given file.
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall(r'\S+@\S+', line)
            if len(x) > 0:
                print(x)
        #Output
        ['<postmaster@collab.sakaiproject.org>']
        ['louis@media.berkeley.edu']
        ['zqian@umich.edu']
        ['<postmaster@collab.sakaiproject.org>']
    ➢ Square brackets "[]": square brackets indicate a set of acceptable characters, any one of which may match at that position.
    Example:
    [a-z] matches a single lowercase letter
    [A-Z] matches a single uppercase letter
    [a-zA-Z] matches a single lowercase or uppercase letter
    [a-zA-Z0-9] matches a single lowercase letter, uppercase letter, or digit
    Some of the email addresses above carry unwanted characters such as "<" or ";" at the beginning or end. We are only interested in the portion of the string that starts and ends with a letter or a number. To get clean output, the pattern must begin and end with a character set.
  • 8. [amk] matches 'a', 'm', or 'k'
    [(+*)] matches any of the literal characters '(', '+', '*', or ')'
    [0-5][0-9] matches all the two-digit numbers from 00 to 59
    ➢ Characters that are not within a range can be matched by complementing the set. If the first character of the set is '^', any character that is not in the set will be matched. For example, [^5] will match any character except '5'.
    Ex: Program returns a list of all email addresses in the proper format.
        import re
        hand = open('mbox.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', line)
            if len(x) > 0:
                print(x)
        #Output
        ['postmaster@collab.sakaiproject.org']
        ['louis@media.berkeley.edu']
        ['zqian@umich.edu']
        ['postmaster@collab.sakaiproject.org']
    [a-zA-Z0-9]\S*@\S*[a-zA-Z]: matches substrings that start with a single lowercase letter, uppercase letter, or digit "[a-zA-Z0-9]", followed by zero or more non-blank characters "\S*", followed by an at-sign, followed by zero or more non-blank characters "\S*", followed by an uppercase or lowercase letter "[a-zA-Z]".
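The set notation on this slide can be tried out on short strings before using it on a mail file. The input strings in this sketch are made up for illustration; only the patterns come from the slide.

```python
import re

# [amk]: any single 'a', 'm', or 'k'
set_hits = re.findall(r"[amk]", "make a mark")

# [(+*)]: inside a set, these characters are literal, not special
literal_hits = re.findall(r"[(+*)]", "(1+2)*3")

# [0-5][0-9]: two-digit numbers from 00 to 59, so "75" is skipped
two_digit_hits = re.findall(r"[0-5][0-9]", "09:14:16 75")

# [^5]: any single character except '5'
not_five = re.findall(r"[^5]", "1525")

print(set_hits, literal_hits, two_digit_hits, not_five)
```

Note that "+", "*", and the parentheses lose their special meaning inside square brackets, so no escaping is needed there.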
  • 9. SEARCH AND EXTRACT
    Example 1: Find numbers on lines that start with the string "X-", such as:
    X-DSPAM-Confidence: 0.8475
    mbox2.txt
        From: stephen.marquard@uct.ac.za
        Subject: [sakai] svn commit: r39772 - content/branches/sakai_2-5-x/conten impl/impl/src/java/org
        X-Content-Type-Outer-Envelope: text/plain; charset=UTF-8
        X-Content-Type-Message-Body: text/plain; charset=UTF-8
        Content-Type: text/plain; charset=UTF-8
        X-DSPAM-Result: Innocent
        X-DSPAM-Processed: Sat Jan 5 09:14:16 2008
        X-DSPAM-Confidence: 0.8475
        X-DSPAM-Probability: 0.9245
    Search:
        import re
        hand = open('mbox2.txt')
        for line in hand:
            line = line.rstrip()
            if re.search(r'^X\S*: [0-9.]+', line):
                print(line)
        #Output
        X-DSPAM-Confidence: 0.8475
        X-DSPAM-Probability: 0.9245
    The output above contains the entire line; we only want to extract the numbers from lines that have this syntax.
    ➢ Parentheses "()" in a regular expression: used to extract only the portion of the match that falls inside the parentheses.
    Extract:
        import re
        hand = open('mbox2.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall(r'^X\S*: ([0-9.]+)', line)
            if len(x) > 0:
                print(x)
        #Output
        ['0.8475']
        ['0.9245']
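findall() returns the extracted numbers as strings, so they need converting before any arithmetic. The sketch below assumes the header lines are held in a list (taken from mbox2.txt) instead of being read from a file; the names `lines` and `values` are illustrative.

```python
import re

# A few header lines copied from mbox2.txt, kept in memory for this sketch.
lines = [
    "X-DSPAM-Result: Innocent",
    "X-DSPAM-Processed: Sat Jan 5 09:14:16 2008",
    "X-DSPAM-Confidence: 0.8475",
    "X-DSPAM-Probability: 0.9245",
]

values = []
for line in lines:
    # The parentheses make findall() return only the number, not the whole line.
    found = re.findall(r'^X\S*: ([0-9.]+)', line)
    values.extend(float(v) for v in found)

print(values)
print(max(values))
```

Lines such as "X-DSPAM-Processed: Sat ..." are skipped because the text after the colon and space does not begin with a digit or a period.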
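  • 10. Example 2: Program to print the hour of the day at which each mail was received
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall('^From.* ([0-3][0-9]):', line)
            if len(x) > 0:
                print(x)
        #Output
        ['09']
        ['16']
        ['16']
    The pattern matches lines that start with "From", followed by any characters, a space, two digits, and a colon. The parentheses extract only the two digits, which are the hour part of a time such as 09:14:16.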
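One possible use of the extracted hour is counting how many mails arrived in each hour. This is a sketch only: it uses the three "From" lines of mbox1.txt held in a list, and the use of `collections.Counter` is an addition for illustration, not something the slides prescribe.

```python
import re
from collections import Counter

# The three "From" lines of mbox1.txt, kept in memory for this sketch.
lines = [
    "From:stephen Sat Jan 5 09:14:16 2008",
    "From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008",
    "From:zqian@umich.edu Fri Jan 4 16:10:39 2008",
]

hours = Counter()
for line in lines:
    # group(1) is the text captured by the parentheses: the two-digit hour.
    m = re.search(r'^From.* ([0-3][0-9]):', line)
    if m:
        hours[m.group(1)] += 1

print(dict(hours))
```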
  • 11. RANDOM EXECUTION
        >>> import re
        >>> s = " 0.9 .90 1.0 1. 138 pqr"
        >>> re.findall('[0-9.]+', s)
        ['0.9', '.90', '1.0', '1.', '138']
        >>> re.findall('[0-9]+[.][0-9]', s)
        ['0.9', '1.0']
        >>> re.findall('[0-9]+[.][0-9]+', s)
        ['0.9', '1.0']
        >>> re.findall('[0-9]*[.][0-9]+', s)
        ['0.9', '.90', '1.0']
        >>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1byis112, 1byee190"
        >>> re.findall('1bycs...', usn)
        ['1bycs123', '1bycs009']
        >>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
        ['1bycs123', '1bycs009']
        >>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1vecs112, 1svcs190"
        >>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
        ['1bycs123', '1bycs009', '1vecs112', '1svcs190']
        >>> re.findall('[0-9]+cs[0-9]+', usn)
        []
        >>> re.findall('[a-zA-Z0-9]+cs([0-9]+)', usn)
        ['123', '009', '112', '190']
  • 12. ESCAPE CHARACTER
    ➢ The escape character (backslash "\") is a metacharacter in regular expressions. It allows special characters to be used without invoking their special meaning.
    If you want to match 1+1=2, the correct regex is 1\+1=2. Without the backslash, the plus sign keeps its special meaning.
    For example, we can find money amounts with the following regular expression. The dollar sign is escaped so that it matches a literal "$" instead of the end of the line.
        >>> import re
        >>> x = 'We just received $10.00 for cookies.'
        >>> y = re.findall(r'\$[0-9.]+', x)
        >>> y
        ['$10.00']
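The 1+1=2 case can be checked directly. The sketch below compares the escaped and unescaped patterns on a made-up sentence, and also shows `re.escape()`, a standard-library helper that adds the backslashes automatically; the sentence and variable names are illustrative.

```python
import re

text = 'Everyone knows 1+1=2 and 11=2 is wrong'  # made-up sample text

# Escaped plus: matches the literal text "1+1=2".
escaped_hits = re.findall(r'1\+1=2', text)

# Unescaped plus: "1+" means "one or more 1s", so this finds "11=2" instead.
unescaped_hits = re.findall(r'1+1=2', text)

# re.escape() builds a pattern that matches its argument literally.
auto_pattern = re.escape('1+1=2')
auto_hits = re.findall(auto_pattern, text)

print(escaped_hits, unescaped_hits, auto_hits)
```

Writing patterns as raw strings (the `r'...'` prefix) keeps Python from interpreting the backslash before the `re` module sees it.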
  • 13. SUMMARY
    Character    Meaning
    ^            Matches the beginning of the line
    $            Matches the end of the line
    .            Matches any character (a wildcard)
    \s           Matches a whitespace character
    \S           Matches a non-whitespace character (opposite of \s)
    *            Applies to the immediately preceding character and indicates to match zero or more of the preceding character(s)
    *?           Applies to the immediately preceding character and indicates to match zero or more of the preceding character(s) in "non-greedy mode"
    +            Applies to the immediately preceding character and indicates to match one or more of the preceding character(s)
    +?           Applies to the immediately preceding character and indicates to match one or more of the preceding character(s) in "non-greedy mode"
    [aeiou]      Matches a single character as long as that character is in the specified set. In this example, it would match "a", "e", "i", "o", or "u", but no other characters.
    [a-z0-9]     You can specify ranges of characters using the minus sign. This example is a single character that must be a lowercase letter or a digit.
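The "non-greedy mode" rows are easiest to understand side by side with the greedy form. A minimal sketch, using a made-up sample line that contains two colons:

```python
import re

s = 'From: Using the : character'  # made-up line with two colons

# Greedy: ".+" grabs as much as possible, so the match runs to the last colon.
greedy = re.findall(r'^F.+:', s)

# Non-greedy: ".+?" grabs as little as possible, stopping at the first colon.
lazy = re.findall(r'^F.+?:', s)

print(greedy)
print(lazy)
```

Both patterns match the same line; the "?" only changes how much of the line the match covers.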
  • 14. Character    Meaning
    [^A-Za-z]    When the first character in the set notation is a caret, it inverts the logic. This example matches a single character that is anything other than an uppercase or lowercase letter.
    ( )          When parentheses are added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall()
    \b           Matches the empty string, but only at the start or end of a word
    \B           Matches the empty string, but not at the start or end of a word
    \d           Matches any decimal digit; equivalent to the set [0-9]
    \D           Matches any non-digit character; equivalent to the set [^0-9]
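The word-boundary and digit entries in the table can be seen in action on a short string. The sample text and variable names below are made up for illustration.

```python
import re

text = "cat concat cats 42"  # made-up sample text

# Start positions of every "cat", with no boundary restriction.
anywhere = [m.start() for m in re.finditer(r'cat', text)]

# \b on both sides: only "cat" as a whole word.
whole_word = [m.start() for m in re.finditer(r'\bcat\b', text)]

# \B before "cat": only where "cat" does NOT begin a word (inside "concat").
inside_word = [m.start() for m in re.finditer(r'\Bcat', text)]

# \d+ collects runs of digits; \D+ collects runs of non-digits.
digits = re.findall(r'\d+', text)
non_digits = re.findall(r'\D+', 'a1b22c')

print(anywhere, whole_word, inside_word, digits, non_digits)
```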
  • 15. ASSIGNMENT
    1) Write a Python program to check the validity of a password.
    In this program, the password is a combination of alphanumeric characters and special characters, and the program checks whether it is valid against a few conditions.
    Primary conditions for password validation:
    1. Minimum 8 characters.
    2. Letters must be in the range [a-z].
    3. At least one letter must be uppercase [A-Z].
    4. At least one digit in the range [0-9].
    5. At least one character from [ _ or @ or $ ].
    2) Write a pattern for each of the following:
    ➢ Extract lines starting with the word From (or from) and ending with edu.
    ➢ Extract lines ending with any digit.
    ➢ Extract lines that start with uppercase letters and end with digits.
    ➢ Search for the first white-space character in the string and display its position.
    ➢ Replace every white-space character with the number 9.
    Consider the sample text txt = "The rain in Spain"