How to check valid email?
Find using RegEx(p?)
Not only in Ruby
brought to you by Piotr Wasiak
Agenda
1. RegEx overview
2. Recommendations
3. Ruby quirks / amenities
4. Tools / Resources
5. Advanced RE(2)
Who am I?
Piotr Wasiak
Ruby, Rails developer
Current PRUG organiser
Interests:
● climbing, hiking, squash
● contract bridge, chess
● ruby, programming, crypto
Regular Expression
A regular expression is a character sequence that defines a search pattern.
Its purposes are to:
● validate a string against the pattern
● extract parts of the content (e.g. find, or find and replace, in text editors)
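Both purposes can be sketched in a few lines of Ruby. The order-number format below is a hypothetical example, not something from the talk:

```ruby
# Hypothetical order-number format, used only for illustration.
ORDER_ID_ONLY = /\AORD-\d{4}\z/ # anchored: the whole string must match
ORDER_ID      = /ORD-(\d{4})/   # unanchored: finds the pattern inside text

# 1. Validation
valid   = "ORD-2024".match?(ORDER_ID_ONLY)
invalid = "ORD-20245".match?(ORDER_ID_ONLY)

# 2. Extraction
text         = "Shipped ORD-1001 and ORD-1002 today"
first_id     = text[ORDER_ID]             # first full match
first_number = text[ORDER_ID, 1]          # first capture group of that match
all_numbers  = text.scan(ORDER_ID).flatten # every capture in the text

puts valid, invalid, first_id, first_number, all_numbers.inspect
```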
RegEx history
● The concept of regular languages arose in the 1950s
● Different syntaxes (1980+):
○ POSIX (Basic or Extended Regular Expressions)
○ Perl (influenced/imported into other languages as PCRE in 1997 and PCRE2 in 2015)
RegEx as a state machine
Statement validation: /(?<name>ADAM|PIOTR)\s?[=><]{1,2}\s*"(?:PIENIĄDZ|KUKU)"/g
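A minimal sketch of how the statement pattern behaves in Ruby, assuming the escapes are `\s` (the `/g` flag is editor notation and is not needed in Ruby). The sample inputs are made up for illustration:

```ruby
STATEMENT = /(?<name>ADAM|PIOTR)\s?[=><]{1,2}\s*"(?:PIENIĄDZ|KUKU)"/

match = STATEMENT.match('PIOTR >= "KUKU"')
name  = match[:name] # value of the named group

# Walks off the "state machine": unknown name, or too many operator characters.
unknown_name   = STATEMENT.match?('EWA = "KUKU"')
long_operator  = STATEMENT.match?('ADAM==="KUKU"')
compact_syntax = STATEMENT.match?('ADAM<"PIENIĄDZ"')

puts name, unknown_name, long_operator, compact_syntax
```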
Basics
Find RegEx
In a replacement we can use the whole matched phrase or its groups.
Groups are numbered by the position of their opening bracket, and numbered references are limited to 1 - 9.
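In Ruby the same idea looks roughly like this with `String#sub`; the sample strings are illustrative only:

```ruby
date    = "2024-05-17"
pattern = /(\d{4})-(\d{2})-(\d{2})/

# \1..\9 refer to groups, \0 to the whole match (single quotes keep the backslashes).
reordered = date.sub(pattern, '\3.\2.\1')
wrapped   = date.sub(pattern, '[\0]')

# Groups are numbered by the position of their opening bracket.
numbering = "abc".sub(/((a)(b))(c)/, '1=\1 2=\2 3=\3 4=\4')

# Named groups avoid counting brackets altogether.
named     = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
by_name   = date.sub(named, '\k<day>/\k<month>/\k<year>')

puts reordered, wrapped, numbering, by_name
```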
Valid email (1/3)
A popular Rails gem's solution:
Valid email (2/3)
Email validation:
/(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])/g
2. Recommendations
Simplify valid email
original_regexp = %r{(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f!#-\x5b\]-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[[:alnum:]](?:[a-z0-9-]*[[:alnum:]])?\.)+[[:alnum:]](?:[a-z0-9-]*[[:alnum:]])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[[:alnum:]]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f!-Z\]-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])}
alnum_with_hypen = /[a-z0-9-]/.source # posix alternative /[-[:alnum:]]/
ip_number_type = /25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?/.source
common_parts = /[\x01-\x08\x0b\x0c\x0e-\x1f\]-\x7f]/.source
username_without_backslash_prepended_set = /[#{common_parts}!#-\x5b]/.source
domain_port_unescaped_set = /[#{common_parts}!-Z]/.source
domain_port_escaped_chars_set = /[#{common_parts}\x0e-\x7f]/.source
non_ending_chars = %r{[a-z0-9!#$%&'*+/=?^_`{|}~-]+}.source
final_with_variables = /(?:#{non_ending_chars}(?:\.#{non_ending_chars})*|"(?:#{username_without_backslash_prepended_set}|\\#{domain_port_escaped_chars_set})*")@(?:(?:[[:alnum:]](?:#{alnum_with_hypen}*[[:alnum:]])?\.)+[[:alnum:]](?:#{alnum_with_hypen}*[[:alnum:]])?|\[(?:(?:#{ip_number_type})\.){3}(?:#{ip_number_type}|#{alnum_with_hypen}*[[:alnum:]]:(?:#{domain_port_unescaped_set}|\\#{domain_port_escaped_chars_set})+)\])/
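The same "name the pieces, then interpolate" technique is easier to follow on a smaller pattern. The following is a sketch on an IPv4 address, reusing only the 0-255 fragment from the email pattern; the variable names are chosen here for illustration:

```ruby
# One named building block: a number from 0 to 255.
octet = /25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?/.source

# Compose the final pattern from the named piece instead of repeating it four times.
ipv4 = /\A(?:(?:#{octet})\.){3}(?:#{octet})\z/

private_address = ipv4.match?("192.168.0.1")
octet_too_large = ipv4.match?("256.1.1.1")
too_few_parts   = ipv4.match?("1.2.3")

puts private_address, octet_too_large, too_few_parts
```

Using `.source` keeps the fragment a plain string, so the flags of the outer literal apply to the whole pattern.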
Simplify valid email (more ruby version)
original_regexp = %r{(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f!#-\x5b\]-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[[:alnum:]](?:[a-z0-9-]*[[:alnum:]])?\.)+[[:alnum:]](?:[a-z0-9-]*[[:alnum:]])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[[:alnum:]]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f!-Z\]-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])}
alnum_with_hypen = /[a-z0-9-]/.source # posix alternative /[-[:alnum:]]/
ip_number_type = /25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?/.source
ascii_wo_tabs_cr_nl = /[[:ascii:]&&[^\x09-\x0a\x0d]]/.source
domain_port_escaped_chars_set = /[#{ascii_wo_tabs_cr_nl}\x09\x20"]/.source
domain_port_unescaped_set = /[#{ascii_wo_tabs_cr_nl}&&[^\x20]]/.source
username = /[#{domain_port_unescaped_set}&&[^"]]/.source
non_ending_chars = %r{[a-z0-9!#$%&'*+/=?^_`{|}~-]+}.source
final_with_variables = /(?:#{non_ending_chars}(?:\.#{non_ending_chars})*|"(?:#{username}|\\#{domain_port_escaped_chars_set})*")@(?:(?:[[:alnum:]](?:#{alnum_with_hypen}*[[:alnum:]])?\.)+[[:alnum:]](?:#{alnum_with_hypen}*[[:alnum:]])?|\[(?:(?:#{ip_number_type})\.){3}(?:#{ip_number_type}|#{alnum_with_hypen}*[[:alnum:]]:(?:#{domain_port_unescaped_set}|\\#{domain_port_escaped_chars_set})+)\])/
/x comments mode
original_regexp = %r{ # there is no heredoc for regexp
  (?: # strings with some special chars, but not ending with .
    [a-z0-9!#$%&'*+/=?^_`{|}~-]+
    (?:
      \.[a-z0-9!#$%&'*+/=?^_`{|}~-]+
    )*
  |
    "
    (?: # special chars in quotes
      [\x01-\x08\x0b\x0c\x0e-\x1f!#-\x5b\]-\x7f]
    |
      \\ # prepended with backslash, here escaped
      [\x01-\x09\x0b\x0c\x0e-\x7f] # more special chars
    )*
    " # closing quote
  )
  @ # the crucial at sign
  (?: # domain regexp
    (?: # at least one subdomain joined and finished with .
      [[:alnum:]]
      (?:
        [a-z0-9-]* # subdomain can have many alphanumerics or - inside
        [[:alnum:]] # subdomain has to finish with an alphanumeric char
      )?
      \. # dot separator
    )+
    [[:alnum:]] # domain has to start with an alphanumeric char
    (?:
      [a-z0-9-]* # domain can have many alphanumerics or - inside
      [[:alnum:]] # domain has to finish with an alphanumeric char
    )?
  | # or a direct IP: 3 numbers with . suffix and some special use cases
    \[ # enclosed in square brackets
    (?:
      (?: # numbers are quite complex in RegEx
        25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]? # 0-255
      )\. # . suffix
    ){3} # 3 times
    (?:
      25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]? # 0-255
    | # or some special use cases
      [a-z0-9-]* # alnums, also starting with -
      [[:alnum:]] # finishing without -
      :
      (?:
        [\x01-\x08\x0b\x0c\x0e-\x1f!-Z\]-\x7f] # many chars
      |
        \\ # more ASCII chars prefixed with backslash
        [\x01-\x09\x0b\x0c\x0e-\x7f]
      )+
    )
    \] # closing square bracket
  )
}x # switch to treat spaces/new lines and `# ` suffix as comments
Do not overuse regular expressions (1/2)
Ruby's simple string methods are faster and more meaningful:
● .start_with? / .end_with?
● .include?('some substring')
● .chomp
● .strip
● .lines
● .split(' ') # without regexp
● .tr('from chars', '1-9')
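A few of these methods in action, as a quick sketch; the sample strings are invented for illustration:

```ruby
filename = "report_2024.csv\n"

is_csv      = filename.chomp.end_with?(".csv") # chomp drops the trailing newline
is_report   = filename.start_with?("report")
has_year    = filename.include?("2024")
trimmed     = "  padded  ".strip
rows        = "a,b\nc,d\n".lines               # keeps the line endings
words       = "one two  three".split(" ")      # a single space splits on any whitespace run
transformed = "hello".tr("el", "ip")           # maps e -> i and l -> p

puts is_csv, is_report, has_year, trimmed, rows.inspect, words.inspect, transformed
```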
Do not overuse regular expressions (2/2)
Libraries and gems for common concepts:
● URI(url)
+ .host / .path / .query / .fragment
● File
+ .dirname(path_to_file) / .basename(path_to_file) / .extname(path_to_file)
● Nokogiri::HTML(
URI.open('https://nokogiri.org/') # requires open-uri
)
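The standard-library part of that list can be tried without installing anything. This sketch uses a made-up URL and path:

```ruby
require "uri"

# Example URL and path are placeholders for illustration.
uri = URI("https://example.com/docs/index.html?lang=en#top")
url_parts = [uri.host, uri.path, uri.query, uri.fragment]

path = "/var/log/app/server.log"
path_parts = [
  File.dirname(path),        # directory part
  File.basename(path),       # file name with extension
  File.extname(path),        # extension only
  File.basename(path, ".*")  # file name without extension
]

puts url_parts.inspect, path_parts.inspect
```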
Do not use REGEX as language parser
Programming languages depend on their syntax tree (nodes) more than on flat text.
A RegEx will always run into problems with some exceptions and different coding styles.
In Ruby we need Ripper or other tools to decompose Ruby code into pieces.
Markup and data formats can be parsed more easily and more securely with gems such as Nokogiri, Ox, or Oj.
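A small sketch of why a tokenizer beats a RegEx here: it compares a naive pattern with `Ripper.lex` on a made-up snippet. The snippet and the way method names are collected are assumptions for illustration, not a complete parser:

```ruby
require "ripper"

source = <<~'RUBY'
  # def commented_out; end
  def real_method
    "def inside_string"
  end
RUBY

# Naive RegEx: also finds "def" inside the comment and the string literal.
naive_names = source.scan(/def (\w+)/).flatten

# Ripper.lex returns tokens shaped like [[line, column], type, text, state].
tokens = Ripper.lex(source)
lexed_names = []
tokens.each_with_index do |token, index|
  next unless token[1] == :on_kw && token[2] == "def"
  # The method name is the first non-space token after the `def` keyword.
  name_token = tokens.drop(index + 1).find { |t| t[1] != :on_sp }
  lexed_names << name_token[2] if name_token
end

puts naive_names.inspect, lexed_names.inspect
```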
Clear RegEx
● extract common parts out of alternations
● put the alternatives most likely to appear at the front of an alternation
● use comments and whitespace with the /x modifier
● name captured groups, and use non-capturing groups where no capture is needed
● split the pattern into smaller logical pieces
● lint the code with ruby -w for warnings
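Several of these tips combined in one sketch. The version-string format is a simplified assumption chosen for illustration, not a full semantic-versioning grammar:

```ruby
VERSION = /
  \A
  (?<major>\d+) \. (?<minor>\d+) \. (?<patch>\d+)  # numeric core, named groups
  (?: - (?<prerelease> [0-9A-Za-z.-]+ ) )?         # optional tag, wrapper is non-capturing
  \z
/x

release    = VERSION.match("3.2.1")
candidate  = VERSION.match("3.2.1-rc.1")
incomplete = VERSION.match("3.2")

puts release[:major], release[:prerelease].inspect
puts candidate[:prerelease], incomplete.inspect
```

With `/x`, whitespace and `#` comments outside character classes are ignored, so the pattern can be laid out like ordinary code.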
3. Ruby quirks / flavor
mix? Interpolation of RegEx
m: MULTILINE
i: IGNORECASE
x: EXTENDED
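Where the "mix" comes from can be seen by converting a Regexp to a string, which is what interpolation does. A short sketch with illustrative patterns:

```ruby
case_insensitive = /ruby/i

# Regexp#to_s wraps the source in a group that records which of m, i, x are on or off.
as_string = case_insensitive.to_s

# Interpolating a Regexp embeds that group, so the inner flags travel with it.
combined      = /I like #{case_insensitive}!/
inner_ignored = combined.match?("I like RUBY!") # inner part is case-insensitive
outer_strict  = combined.match?("i like ruby!") # outer part is still case-sensitive

# Interpolating only the source drops the flags.
plain = /I like #{case_insensitive.source}!/
flags_dropped = plain.match?("I like RUBY!")

all_flags = /a/mix.options

puts as_string, combined.source, inner_ignored, outer_strict, flags_dropped
```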
Joke
Scrabble: what is the longest word you can make from the combined RE switch letters?
I M N O X
Quirks in Ruby RegEx engine (1/3)
- In most other flavors, "dot matches line breaks" mode is turned on with the s flag; Ruby uses the m flag instead.
- In Ruby, ^ and $ always match at every line.
If you want to specify the beginning of the string, use \A.
For the very end of the string, use \z (or \Z, which also matches before a final line break).
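These quirks matter for validation in particular. A minimal sketch with a deliberately simplified email-like pattern and made-up input:

```ruby
# Simplified pattern for illustration only, not a real email validator.
loose  = /^\w+@\w+\.\w+$/   # ^ and $ match at line boundaries
strict = /\A\w+@\w+\.\w+\z/ # \A and \z match only at the string boundaries

input = "attacker@example.com\n<script>"

loose_result  = loose.match?(input)  # the first line alone satisfies the pattern
strict_result = strict.match?(input)

# \z versus \Z with a trailing line break
z_lower = "abc\n".match?(/abc\z/)
z_upper = "abc\n".match?(/abc\Z/)

# In Ruby the m flag makes the dot match line breaks
dot_default   = "a\nb".match?(/a.b/)
dot_multiline = "a\nb".match?(/a.b/m)

puts loose_result, strict_result, z_lower, z_upper, dot_default, dot_multiline
```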
Quirks in Ruby RegEx engine (2/3)
Ruby does not allow
● look-ahead
● negative look-behind
inside a look-behind.
Quirks in Ruby RegEx engine (3/3)
Character class operators
- Intersection […&&[…]]
- Subtraction […&&[^…]]
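Both operators in a short sketch; the sample strings are illustrative:

```ruby
# Intersection: characters that are in a-m AND in h-z, i.e. h to m.
intersection = "ahmz".scan(/[a-m&&[h-z]]/)

# Subtraction: lowercase letters except vowels.
consonants = "hello world".scan(/[a-z&&[^aeiou]]/)

# Subtraction from a POSIX class: alphanumerics except digits.
letters = "a1b2".scan(/[[:alnum:]&&[^0-9]]/)

puts intersection.inspect, consonants.inspect, letters.inspect
```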
Ruby amenities (1/3)
Ruby amenities (2/3)
Ruby amenities (3/3)
4. Tools / Resources
Tools / Websites
● regex101.com/
nicest editor, explanation on hover, cheat sheet, performance analysis
● www.debuggex.com/ visualized graphs with a cheat sheet
● Visualization plugins for Visual Studio Code
● rubocop and rubocop-performance have some rules for regex
● rubular.com/ check if RegEx works in Ruby 2.5. Other with 2.1
● rubyapi.org/3.1/o/regexp good Ruby docs
5. Advanced RE(2)
Backtracking problem
/\d-\d+$/g
Catastrophic backtracking case: /a?ⁿaⁿ/ =~ aⁿ (for n = 3: /a?a?a?aaa/ =~ "aaa")
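The pathological pattern can be generated for any n and timed. This is only a sketch: the helper name is made up, and the timings depend heavily on the Ruby version, since newer releases mitigate some of these cases, so no particular numbers should be expected:

```ruby
require "benchmark"

# Builds a?ⁿaⁿ, e.g. /a?a?a?aaa/ for n = 3 (helper name is hypothetical).
def pathological_regexp(n)
  Regexp.new("a?" * n + "a" * n)
end

[5, 10, 15, 18].each do |n|
  pattern = pathological_regexp(n)
  input   = "a" * n
  matched = nil
  seconds = Benchmark.realtime { matched = pattern.match?(input) }
  puts format("n=%2d matched=%s time=%.6fs", n, matched, seconds)
end
```

A backtracking engine may have to try up to 2ⁿ ways of distributing the optional `a?` parts before it finds the one that matches, which is why the time can grow so quickly with n.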
PCRE like solutions
“Most modern engines are regex-directed because this is the only way to implement useful features such as lazy quantifiers and backreferences; and atomic grouping and possessive quantifiers that give extra control to backtracking.”
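The four features named in the quote are all available in Ruby. A brief sketch with illustrative strings:

```ruby
html = "<a><b>"

greedy = html[/<.+>/]   # greedy: takes as much as possible
lazy   = html[/<.+?>/]  # lazy quantifier: takes as little as possible

doubled = "hello"[/(\w)\1/] # backreference: the same character twice in a row

# a+ can give one "a" back so the final "a" still matches...
backtracks = "aaa".match?(/a+a/)
# ...while a possessive quantifier or an atomic group refuses to give anything back.
possessive = "aaa".match?(/a++a/)
atomic     = "aaa".match?(/(?>a+)a/)

puts greedy, lazy, doubled, backtracks, possessive, atomic
```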
Back to Finite Automaton - (D/N) FA
RegEx to Deterministic Finite Automaton
What RegEx is it? /abb*a/
/(0|1)*1/ matches: [ 1010101, 1, 10101]
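A hand-written DFA for /(0|1)*1/ needs only two states: "last character was not 1" and "last character was 1". The sketch below treats the pattern as a full-string match; the state names and the helper are chosen here for illustration:

```ruby
# Transition table: state -> input character -> next state.
TRANSITIONS = {
  q0: { "0" => :q0, "1" => :q1 }, # start state, or last character was 0
  q1: { "0" => :q0, "1" => :q1 }  # last character was 1
}.freeze
ACCEPTING = [:q1].freeze

def dfa_accepts?(input)
  state = :q0
  input.each_char do |char|
    state = TRANSITIONS.fetch(state)[char]
    return false if state.nil? # character outside the alphabet {0, 1}
  end
  ACCEPTING.include?(state)
end

samples = %w[1010101 1 10101 10 1012] + [""]
samples.each do |sample|
  puts format("%-8s dfa=%-5s regex=%s", sample.inspect, dfa_accepts?(sample), sample.match?(/\A(0|1)*1\z/))
end
```

Each input character is read exactly once and there is no backtracking, so the running time is linear in the input length.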
RE2
PCRE2
Sources
● devopedia.org/regex-engines
● patshaughnessy.net/2012/4/3/ (...) rubys-regular-expression-algorithm
● github.com/google/re2/wiki/Syntax
● optimized re2 called hyperscan
● wiki/Determinizacja_automatu_skonczonego
● regular-expressions.info/refrepeat.html
● rexegg.com/regex-optimizations.html
Thanks for listening
What’s your question?

Weitere ähnliche Inhalte

Was ist angesagt?

Intermediate code generation1
Intermediate code generation1Intermediate code generation1
Intermediate code generation1Shashwat Shriparv
 
Assembly Language Compiler Implementation
Assembly Language Compiler ImplementationAssembly Language Compiler Implementation
Assembly Language Compiler ImplementationRAVI TEJA KOMMA
 
Declarative Type System Specification with Statix
Declarative Type System Specification with StatixDeclarative Type System Specification with Statix
Declarative Type System Specification with StatixEelco Visser
 
Three address code In Compiler Design
Three address code In Compiler DesignThree address code In Compiler Design
Three address code In Compiler DesignShine Raj
 
Intermediate code generation
Intermediate code generationIntermediate code generation
Intermediate code generationRamchandraRegmi
 
C++20 the small things - Timur Doumler
C++20 the small things - Timur DoumlerC++20 the small things - Timur Doumler
C++20 the small things - Timur Doumlercorehard_by
 
Chapter 6 intermediate code generation
Chapter 6   intermediate code generationChapter 6   intermediate code generation
Chapter 6 intermediate code generationVipul Naik
 
Lecture 12 intermediate code generation
Lecture 12 intermediate code generationLecture 12 intermediate code generation
Lecture 12 intermediate code generationIffat Anjum
 
cs241-f06-final-overview
cs241-f06-final-overviewcs241-f06-final-overview
cs241-f06-final-overviewColin Bell
 
Lecture 03 lexical analysis
Lecture 03 lexical analysisLecture 03 lexical analysis
Lecture 03 lexical analysisIffat Anjum
 
Chapter 6 Flow control Instructions
Chapter 6 Flow control InstructionsChapter 6 Flow control Instructions
Chapter 6 Flow control Instructionswarda aziz
 

Was ist angesagt? (20)

Intermediate code generation1
Intermediate code generation1Intermediate code generation1
Intermediate code generation1
 
Assembly Language Compiler Implementation
Assembly Language Compiler ImplementationAssembly Language Compiler Implementation
Assembly Language Compiler Implementation
 
Declarative Type System Specification with Statix
Declarative Type System Specification with StatixDeclarative Type System Specification with Statix
Declarative Type System Specification with Statix
 
Three address code In Compiler Design
Three address code In Compiler DesignThree address code In Compiler Design
Three address code In Compiler Design
 
Assembler
AssemblerAssembler
Assembler
 
Compiler Design Unit 3
Compiler Design Unit 3Compiler Design Unit 3
Compiler Design Unit 3
 
Lexicalanalyzer
LexicalanalyzerLexicalanalyzer
Lexicalanalyzer
 
Optimization of dfa
Optimization of dfaOptimization of dfa
Optimization of dfa
 
Assembler
AssemblerAssembler
Assembler
 
Ch9a
Ch9aCh9a
Ch9a
 
Intermediate code generation
Intermediate code generationIntermediate code generation
Intermediate code generation
 
Compiler Design Unit 5
Compiler Design Unit 5Compiler Design Unit 5
Compiler Design Unit 5
 
Ch8b
Ch8bCh8b
Ch8b
 
C++20 the small things - Timur Doumler
C++20 the small things - Timur DoumlerC++20 the small things - Timur Doumler
C++20 the small things - Timur Doumler
 
Chapter 6 intermediate code generation
Chapter 6   intermediate code generationChapter 6   intermediate code generation
Chapter 6 intermediate code generation
 
Lecture 12 intermediate code generation
Lecture 12 intermediate code generationLecture 12 intermediate code generation
Lecture 12 intermediate code generation
 
cs241-f06-final-overview
cs241-f06-final-overviewcs241-f06-final-overview
cs241-f06-final-overview
 
[ASM]Lab4
[ASM]Lab4[ASM]Lab4
[ASM]Lab4
 
Lecture 03 lexical analysis
Lecture 03 lexical analysisLecture 03 lexical analysis
Lecture 03 lexical analysis
 
Chapter 6 Flow control Instructions
Chapter 6 Flow control InstructionsChapter 6 Flow control Instructions
Chapter 6 Flow control Instructions
 

Ähnlich wie How to check valid Email? Find using regex.

How to check valid Email? Find using regex.
How to check valid Email? Find using regex.How to check valid Email? Find using regex.
How to check valid Email? Find using regex.Poznań Ruby User Group
 
How to check valid email? Find using regex(p?)
How to check valid email? Find using regex(p?)How to check valid email? Find using regex(p?)
How to check valid email? Find using regex(p?)Visuality
 
Linux fundamental - Chap 06 regx
Linux fundamental - Chap 06 regxLinux fundamental - Chap 06 regx
Linux fundamental - Chap 06 regxKenny (netman)
 
Regular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And BeyondRegular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And BeyondMax Shirshin
 
And now you have two problems. Ruby regular expressions for fun and profit by...
And now you have two problems. Ruby regular expressions for fun and profit by...And now you have two problems. Ruby regular expressions for fun and profit by...
And now you have two problems. Ruby regular expressions for fun and profit by...Codemotion
 
Regex Presentation
Regex PresentationRegex Presentation
Regex Presentationarnolambert
 
Regex Presentation
Regex PresentationRegex Presentation
Regex Presentationarnolambert
 
Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009Aslak Hellesøy
 
Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009Aslak Hellesøy
 
Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009Aslak Hellesøy
 
CS4200 2019 | Lecture 4 | Syntactic Services
CS4200 2019 | Lecture 4 | Syntactic ServicesCS4200 2019 | Lecture 4 | Syntactic Services
CS4200 2019 | Lecture 4 | Syntactic ServicesEelco Visser
 
Regular expressions
Regular expressionsRegular expressions
Regular expressionsEran Zimbler
 
OISF: Regular Expressions (Regex) Overview
OISF: Regular Expressions (Regex) OverviewOISF: Regular Expressions (Regex) Overview
OISF: Regular Expressions (Regex) OverviewCiNPA Security SIG
 
Ballerina Tech Talk - May 2023
Ballerina Tech Talk - May 2023Ballerina Tech Talk - May 2023
Ballerina Tech Talk - May 2023WSO2
 
Domain Specific Languages In Scala Duse3
Domain Specific Languages In Scala Duse3Domain Specific Languages In Scala Duse3
Domain Specific Languages In Scala Duse3Peter Maas
 
Coffee 'n code: Regexes
Coffee 'n code: RegexesCoffee 'n code: Regexes
Coffee 'n code: RegexesPhil Ewels
 
Regular expressions and php
Regular expressions and phpRegular expressions and php
Regular expressions and phpDavid Stockton
 

Ähnlich wie How to check valid Email? Find using regex. (20)

How to check valid Email? Find using regex.
How to check valid Email? Find using regex.How to check valid Email? Find using regex.
How to check valid Email? Find using regex.
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expression
 
Regular expression for everyone
Regular expression for everyoneRegular expression for everyone
Regular expression for everyone
 
How to check valid email? Find using regex(p?)
How to check valid email? Find using regex(p?)How to check valid email? Find using regex(p?)
How to check valid email? Find using regex(p?)
 
Linux fundamental - Chap 06 regx
Linux fundamental - Chap 06 regxLinux fundamental - Chap 06 regx
Linux fundamental - Chap 06 regx
 
Regular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And BeyondRegular Expressions: JavaScript And Beyond
Regular Expressions: JavaScript And Beyond
 
And now you have two problems. Ruby regular expressions for fun and profit by...
And now you have two problems. Ruby regular expressions for fun and profit by...And now you have two problems. Ruby regular expressions for fun and profit by...
And now you have two problems. Ruby regular expressions for fun and profit by...
 
Regex Presentation
Regex PresentationRegex Presentation
Regex Presentation
 
Regex Presentation
Regex PresentationRegex Presentation
Regex Presentation
 
Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009
 
Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009
 
Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009Ruby presentasjon på NTNU 22 april 2009
Ruby presentasjon på NTNU 22 april 2009
 
CS4200 2019 | Lecture 4 | Syntactic Services
CS4200 2019 | Lecture 4 | Syntactic ServicesCS4200 2019 | Lecture 4 | Syntactic Services
CS4200 2019 | Lecture 4 | Syntactic Services
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
OISF: Regular Expressions (Regex) Overview
OISF: Regular Expressions (Regex) OverviewOISF: Regular Expressions (Regex) Overview
OISF: Regular Expressions (Regex) Overview
 
Ballerina Tech Talk - May 2023
Ballerina Tech Talk - May 2023Ballerina Tech Talk - May 2023
Ballerina Tech Talk - May 2023
 
Domain Specific Languages In Scala Duse3
Domain Specific Languages In Scala Duse3Domain Specific Languages In Scala Duse3
Domain Specific Languages In Scala Duse3
 
Coffee 'n code: Regexes
Coffee 'n code: RegexesCoffee 'n code: Regexes
Coffee 'n code: Regexes
 
Regular expressions and php
Regular expressions and phpRegular expressions and php
Regular expressions and php
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 

Kürzlich hochgeladen

UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 

Kürzlich hochgeladen (20)

UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 

How to check valid Email? Find using regex.

  • 1. How to check valid email? Not only in Ruby brought to you by Piotr Wasiak Find using RegEx(p?)
  • 2. Agenda 2 1. RegEx overview 2. Recommendations 3. Ruby quirks / amenities 4. Tools / Resources 5. Advanced RE(2)
  • 3. Who am I? Piotr Wasiak Ruby, Rails developer Current PRUG organiser 3 Interests: ● climbing, hiking, squash ● contract bridge, chess ● ruby, programming, crypto
  • 4. Regular Expression is a character sequence, that defines a search pattern The purpose is: ● validate the string by the pattern ● get parts of the content (e.g. find or find_and_replace in text editors) 4
  • 5. RegEx history ● Concept of language arose in the 1950s ● Different syntaxes (1980+): ○ POSIX (Basic - or Extended Regular Expressions) ○ Perl (influenced/imported to other languages as PCRE 1997, PCRE2 2015) 5
  • 6. RegEx as a state machine 6 Statement validation: /(?<name>ADAM|PIOTR)s?[=><]{1,2}s*"(?:PIENIĄDZ|KUKU)"/g
  • 8. Find RegEx In replace we can use matched whole phrase or groups. Group number is ordered by starting bracket index and is limited to 1 - 9 8
  • 9. Valid email (1/3) Rails popular gem solution: 9
  • 10. Valid email (2/3) 10 Email validation: /(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|" (?:[x01-x08x0bx0cx0e-x1fx21x23-x5bx5d-x7f]|[x01-x09x0bx0c x0e-x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?.)+[a-z0-9] (?:[a-z0-9-]*[a-z0-9])?|[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3} (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[x01-x08x0 bx0cx0e-x1fx21-x5ax5d-x7f]|[x01-x09x0bx0cx0e-x7f])+)])/g
  • 11. Valid email (3/3) 11 Email validation: /(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|" (?:[x01-x08x0bx0cx0e-x1fx21x23-x5bx5d-x7f]|[x01-x09x0bx0c x0e-x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?.)+[a-z0-9] (?:[a-z0-9-]*[a-z0-9])?|[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3} (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[x01-x08x0 bx0cx0e-x1fx21-x5ax5d-x7f]|[x01-x09x0bx0cx0e-x7f])+)])/g
  • 13. original_regexp = %r{(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[x01-x08x0bx0cx0e-x1f!#-x5b]-x7f]|[x01-x09x0bx0cx0e-x7f])*")@(?:(?:[[:alnum:]](?:[a-z0-9 -]*[[:alnum:]])?.)+[[:alnum:]](?:[a-z0-9-]*[[:alnum:]])?|[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[[:alnum:]]:(?:[x01-x08x0bx 0cx0e-x1f!-Z]-x7f]|[x01-x09x0bx0cx0e-x7f])+)])} alnum_with_hypen = /[a-z0-9-]/.source # posix alternative /[-[:alnum:]]/ ip_number_type = /25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?/.source common_parts = /[x01-x08x0bx0cx0e-x1f]-x7f]/.source username_without_backslash_prepended_set = /[#{common_parts}!#-x5b]/.source domain_port_unescaped_set = /[#{common_parts}!-Z]/.source domain_port_escaped_chars_set = /[#{common_parts}x0e-x7f]/.source non_ending_chars = %r{[a-z0-9!#$%&'*+/=?^_`{|}~-]+}.source final_with_variables = /(?:#{non_ending_chars}(?:.#{non_ending_chars})*|"(?:#{username_without_backslash _prepended_set}|#{domain_port_escaped_chars_set})*")@(?:(?:[[:alnum:]](?:#{alnum _with_hypen}*[[:alnum:]])?.)+[[:alnum:]](?:#{alnum_with_hypen}*[[:alnum:]])?|[(? :(?:#{ip_number_type}).){3}(?:#{ip_number_type}|#{alnum_with_hypen}*[[:alnum:]]:( ?:#{domain_port_unescaped_set}|#{domain_port_escaped_chars_set})+)])/ 13 Simplify valid email
  • 14. Simplify valid email (more ruby version)
    original_regexp = %r{(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f!#-\x5b\]-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[[:alnum:]](?:[a-z0-9-]*[[:alnum:]])?\.)+[[:alnum:]](?:[a-z0-9-]*[[:alnum:]])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[[:alnum:]]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f!-Z\]-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])}
    alnum_with_hypen = /[a-z0-9-]/.source # POSIX alternative: /[-[:alnum:]]/
    ip_number_type = /25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?/.source
    ascii_wo_tabs_cr_nl = /[[:ascii:]&&[^\x09-\x0a\x0d]]/.source
    domain_port_escaped_chars_set = /[#{ascii_wo_tabs_cr_nl}\x09\x20"]/.source
    domain_port_unescaped_set = /[#{ascii_wo_tabs_cr_nl}&&[^\x20]]/.source
    username = /[#{domain_port_unescaped_set}&&[^"]]/.source
    non_ending_chars = %r{[a-z0-9!#$%&'*+/=?^_`{|}~-]+}.source
    final_with_variables = /(?:#{non_ending_chars}(?:\.#{non_ending_chars})*|"(?:#{username}|\\#{domain_port_escaped_chars_set})*")@(?:(?:[[:alnum:]](?:#{alnum_with_hypen}*[[:alnum:]])?\.)+[[:alnum:]](?:#{alnum_with_hypen}*[[:alnum:]])?|\[(?:(?:#{ip_number_type})\.){3}(?:#{ip_number_type}|#{alnum_with_hypen}*[[:alnum:]]:(?:#{domain_port_unescaped_set}|\\#{domain_port_escaped_chars_set})+)\])/
  • 15. /x comments mode
    original_regexp = %r{ # there is no heredoc for regexp
      (?: # strings with some special chars, but not ending with .
        [a-z0-9!#$%&'*+/=?^_`{|}~-]+
        (?:
          \.[a-z0-9!#$%&'*+/=?^_`{|}~-]+
        )*
      |
        " (?: # special chars in quotes
          [\x01-\x08\x0b\x0c\x0e-\x1f!#-\x5b\]-\x7f]
        | # prepended with backslash, here escaped
          \\[\x01-\x09\x0b\x0c\x0e-\x7f] # more special chars
        )* " # closing quote
      )
      @ # the most crucial at sign
      (?: # domain regexp
        (?: # at least one subdomain joined and finished with .
          [[:alnum:]]
          (?:
            [a-z0-9-]* # subdomain can have many alphanumeric or - inside
            [[:alnum:]] # subdomain has to finish with an alphanumeric char
          )?
          \. # dot separator
        )+
        [[:alnum:]] # domain has to start with an alphanumeric char
        (?:
          [a-z0-9-]* # domain can have many alphanumeric or - inside
          [[:alnum:]] # domain has to finish with an alphanumeric char
        )?
      | # or a direct IP address, or 3 numbers with . suffix and some special use cases
        \[ # enclosed in square brackets
          (?:
            (?: # numbers are quite complex in RegEx
              25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]? # 0-255
            )\. # . suffix
          ){3} # 3 times
          (?:
            25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]? # 0-255
          | # or 3 numbers with . suffix and some special use cases
            [a-z0-9-]* # alnums, also starting with -
            [[:alnum:]] # finishing without -
            :
            (?:
              [\x01-\x08\x0b\x0c\x0e-\x1f!-Z\]-\x7f] # many chars
            | # more ASCII chars prefixed with backslash
              \\[\x01-\x09\x0b\x0c\x0e-\x7f]
            )+
          )
        \] # closing square bracket
      )
    }x # switch to treat spaces/new lines and `# ` suffix as comments
  • 16. Do not overuse regular expression (1/2) Ruby's simple string methods are faster and more meaningful: ● .start_with? / .end_with? ● .include?('some substring') ● .chomp ● .strip ● .lines ● .split(' ') # without regexp ● .tr('from chars', '1-9')
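A minimal sketch of the point above, pairing a few regular expressions with the plain string methods that do the same job. The `filename` value is an invented example, not taken from the talk.

```ruby
filename = "  report_2024.csv\n"

# Trimming whitespace: regex vs. String#strip.
clean_regex  = filename.gsub(/\A\s+|\s+\z/, '')
clean_string = filename.strip

# Checking a suffix: regex vs. String#end_with?.
csv_regex  = clean_string.match?(/\.csv\z/)
csv_string = clean_string.end_with?('.csv')

# Looking for a substring: regex vs. String#include?.
has_year_regex  = clean_string.match?(/2024/)
has_year_string = clean_string.include?('2024')

# Splitting on a fixed separator: regex vs. a plain string argument.
parts_regex  = clean_string.split(/_/)
parts_string = clean_string.split('_')

# String#tr maps characters one-to-one, with no pattern involved.
dashed = clean_string.tr('_', '-')
```

Each pair returns the same result here; the string-method version states the intent directly and leaves no pattern to decode.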
  • 17. Do not overuse regular expression (2/2) Libraries and gems for common concepts: ● URI(url) + .host / .path / .query / .fragment ● File + .dirname(path) / .basename(path) / .extname(path) ● Nokogiri::HTML( URI.open('https://nokogiri.org/') )
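As a small illustration of the standard-library helpers listed above, the sketch below takes apart a URL and a file path without any regular expression. The URL and the path are made-up sample values.

```ruby
require 'uri'

# Kernel#URI parses the string into a URI object with named parts.
url = URI('https://example.com/docs/guide.html?lang=en#intro')
host     = url.host
path     = url.path
query    = url.query
fragment = url.fragment

# File class methods work on the path string; the file does not need to exist.
log_path = '/var/log/app/server.log'
dir  = File.dirname(log_path)
base = File.basename(log_path)
ext  = File.extname(log_path)
```

Here `host` is `"example.com"`, `path` is `"/docs/guide.html"`, and `ext` is `".log"`.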
  • 18. Do not use RegEx as a language parser. Programming languages are better described by their syntax nodes/tree. With RegEx there will always be problems with exceptions and different coding styles. In Ruby we need to use Ripper or other tools to decompose Ruby code into pieces. Markup and data formats can be parsed more easily and more securely with gems such as Nokogiri, Ox (XML) or Oj (JSON).
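The sketch below shows one way the problem can appear, assuming we want the names of methods defined in a snippet. A naive pattern is fooled by a string literal, while Ripper's lexer (part of CRuby's standard library) is not. The sample source and the variable names are invented for illustration.

```ruby
require 'ripper'

source = <<~RUBY
  def real_method; end
  text = "def fake_method; end"
RUBY

# Naive regex: also picks up the "def" inside the string literal.
regex_names = source.scan(/def\s+(\w+)/).flatten

# Ripper.lex returns tokens as [[line, column], type, text, state].
# The string body arrives as a single :on_tstring_content token,
# so only a real `def` keyword followed by an identifier is counted.
tokens = Ripper.lex(source)
lexer_names = tokens.each_cons(3).select do |kw, space, name|
  kw[1] == :on_kw && kw[2] == 'def' && space[1] == :on_sp && name[1] == :on_ident
end.map { |_, _, name| name[2] }
```

`regex_names` contains both `real_method` and `fake_method`, while `lexer_names` contains only `real_method`.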
  • 19. Clear RegEx ● extract common parts of an alternation ● put the alternatives that are more likely to appear first ● use comments and whitespace with the /x modifier ● name captured groups, and use non-capturing groups too ● split the code into smaller logical pieces ● lint the code with ruby -w for warnings
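A short sketch applying several of these recommendations at once: the /x modifier with comments, named groups, and a non-capturing group. The log-line format and the `LOG_LINE` constant are hypothetical, chosen only to have something to match.

```ruby
LOG_LINE = /
  \A
  (?<level>INFO|WARN|ERROR)    # the level assumed to be most frequent goes first
  \s+
  (?<date>\d{4}-\d{2}-\d{2})   # ISO-style date
  (?:T|\s)                     # separator, grouped but not captured
  (?<time>\d{2}:\d{2})         # hours and minutes
  \z
/x

match = LOG_LINE.match('WARN 2024-03-15T10:30')
level = match[:level]
date  = match[:date]
time  = match[:time]
```

Reading `match[:date]` says more at the call site than `match[2]`, and the non-capturing group keeps the separator out of the results.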
  • 20. 3. Ruby quirks / flavor
  • 21. Interpolation of RegEx: mix? MULTILINE (m), IGNORECASE (i), EXTENDED (x)
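A brief sketch of where the letters m, i, x show up: Regexp#to_s wraps the pattern in an option group, and that is what gets embedded when one Regexp is interpolated into another. The variable names are illustrative.

```ruby
word = /ruby/i

# to_s keeps the options as an inline group: enabled letters before "-", disabled after.
as_string = word.to_s

# Interpolating the Regexp object carries its options along.
embedded = /I like #{word}/

# Interpolating only .source drops the options.
source_only = /I like #{word.source}/

embedded_hit    = embedded.match?('I like RUBY')
source_only_hit = source_only.match?('I like RUBY')
```

`as_string` is `"(?i-mx:ruby)"`, so `embedded` still ignores case for the interpolated part, while `source_only` does not.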
  • 22. Joke Scrabble: what is the longest word made from the combined RE switch letters? I M N O X
  • 24. Quirks in Ruby RegEx engine (1/3) - In general, "dot matches line breaks" mode is turned on with the s flag; Ruby uses the m flag instead. - In Ruby, ^ and $ always match at every line. To match the beginning of the string, use \A. For the very end of the string, use \z (or \Z, which also matches before a final line break).
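The anchor and flag behaviour described above can be checked with a few one-liners. This is a minimal sketch; the input strings are invented.

```ruby
input = "admin\nmalicious"

# ^ and $ match at line boundaries, so a multi-line input slips through.
line_anchored = /^admin$/.match?(input)

# \A and \z anchor to the whole string.
string_anchored = /\Aadmin\z/.match?(input)

# The dot does not match a newline unless the m flag is set.
dot_default   = /a.b/.match?("a\nb")
dot_multiline = /a.b/m.match?("a\nb")

# \Z tolerates one trailing line break, \z does not.
upper_z = /\Aadmin\Z/.match?("admin\n")
lower_z = /\Aadmin\z/.match?("admin\n")
```

For validation of user input, the `\A` ... `\z` pair is usually the one that expresses "the whole string must match".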
  • 25. Quirks in Ruby RegEx engine (2/3) Inside a look-behind, Ruby does not allow: ● look-ahead ● negative look-behind
  • 26. Quirks in Ruby RegEx engine (3/3) Character class operators: - Intersection […&&[…]] - Subtraction […&&[^…]]
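Both operators can be tried on a short word. A minimal sketch, with the sample word chosen arbitrarily:

```ruby
word = 'ruby'

# Intersection: lowercase letters that are also vowels.
vowels = word.scan(/[a-z&&[aeiou]]/)

# Subtraction: lowercase letters except the vowels.
consonants = word.scan(/[a-z&&[^aeiou]]/)
```

Here `vowels` is `["u"]` and `consonants` is `["r", "b", "y"]`.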
  • 30. 4. Tools / Resources
  • 31. Tools / Websites ● regex101.com/ nicest editor, explanation on hover, cheat sheet, performance analysis ● www.debuggex.com/ visualized graphs with a cheat sheet ● Visualization plugins for Visual Studio Code ● rubocop and rubocop-performance have some rules for regex ● rubular.com/ check if a RegEx works in Ruby 2.5 (others with 2.1) ● rubyapi.org/3.1/o/regexp good Ruby docs
  • 35. Catastrophic backtracking case: /a?ⁿaⁿ/ =~ aⁿ (n optional a's followed by n a's, matched against a string of n a's)
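A small sketch of how this test case can be generated for a given n. The helper name `pathological_pattern` is hypothetical. On a backtracking engine the number of attempts can grow exponentially with n; actual timings depend on the engine and the Ruby version, so none are shown here.

```ruby
# Builds a?a?...a?aa...a with n optional and n mandatory "a" characters.
def pathological_pattern(n)
  Regexp.new(('a?' * n) + ('a' * n))
end

n = 5
pattern = pathological_pattern(n)
subject = 'a' * n

# The only successful match leaves every optional "a?" empty,
# which a backtracking engine may discover only after trying the other combinations.
matched = pattern.match?(subject)
```

Raising `n` step by step and timing `match?` is one way to observe how a given engine behaves on this input.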
  • 36. PCRE like solutions: “Most modern engines are regex-directed because this is the only way to implement useful features such as lazy quantifiers and backreferences; and atomic grouping and possessive quantifiers that give extra control to backtracking.”
  • 39. Back to Finite Automaton - (D/N) FA: /abb*a/
  • 40. RegEx to Deterministic Finite Automaton: what RegEx is it?
  • 41. RegEx to Deterministic Finite Automaton /(0|1)*1/ matches: [1010101, 1, 10101]
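One possible hand-built DFA for /(0|1)*1/, sketched as a transition table. It assumes whole-string matching (as if the pattern were anchored with \A and \z); the state names `q0`/`q1` and the method `dfa_accepts?` are illustrative, not taken from the slides.

```ruby
# q0: input so far is empty or ends with 0; q1: input so far ends with 1.
TRANSITIONS = {
  q0: { '0' => :q0, '1' => :q1 },
  q1: { '0' => :q0, '1' => :q1 }
}.freeze

ACCEPTING = [:q1].freeze

# Reads each character exactly once; there is no backtracking.
def dfa_accepts?(input)
  state = :q0
  input.each_char do |char|
    state = TRANSITIONS.fetch(state)[char]
    return false if state.nil? # character outside the alphabet {0, 1}
  end
  ACCEPTING.include?(state)
end

results = %w[1010101 1 10101 10].map { |s| [s, dfa_accepts?(s)] }.to_h
```

The three strings from the slide are accepted and `10` is rejected, in time linear in the input length.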
  • 42. RegEx to Deterministic Finite Automaton /(0|1)*1/
  • 45. Sources ● devopedia.org/regex-engines ● patshaughnessy.net/2012/4/3/ (...) rubys-regular-expression-algorithm ● github.com/google/re2/wiki/Syntax ● optimized re2 called hyperscan ● wiki/Determinizacja_automatu_skonczonego ● regular-expressions.info/refrepeat.html ● rexegg.com/regex-optimizations.html
  • 46. Thanks for listening. What’s your question?