2. Introductions
● Working with web technologies for 10 years
● Former HUB supervisor
● Tour de jobs: http://tinyurl.com/kmsns38
● Graduated from CSU with a BAS in
Technology Management 2013
● Husband and proud father
● Presenter on regular expressions!
4. What Could I Do With a RegExp?
● Searching
● Syntax highlighting
● Data validation
● Sanitization
● Data queries / extraction
● Many tasks that require matching a pattern
8. We Pattern Match Every Day
● Telephone numbers follow a pattern that we
recognize
● This pattern has rules (3 digit area code, 7 digit
number, numeric only)
● There are often many variations to a pattern
(optional intl code)
10. Regular Expressions in Javascript
var haystack = "The cat in the hat";
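var needle = /cat/; // a regex literal; new RegExp("cat") is equivalent
haystack.match(needle); // truthy: returns an array describing the match
needle = /dog/;
haystack.match(needle); // falsy: returns null when nothing matches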
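A minimal sketch of a few other standard ways to run the same kind of match. The variable names are only for illustration.

```javascript
const haystack = "The cat in the hat";

// test() returns a plain boolean, which reads well in an if-statement
const hasCat = /cat/.test(haystack);
const hasDog = /dog/.test(haystack);

// match() returns an array for the first match (or null),
// and the array's index property says where the match starts
const found = haystack.match(/cat/);
const position = found.index;

// flags change the behavior: "i" ignores case, "g" finds every match
const allThe = haystack.match(/the/gi);

console.log(hasCat, hasDog, position, allThe);
```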
12. Special Characters (Metacharacters)
● \ - escape character
● ^ - beginning of line (not
inside brackets)
● $ - end of line
● . - wildcard (any single character)
● | - or (alternation)
● ? - zero or one
● * - zero or more
● + - one or more
● () - grouping
● [] - character set
● {} - repetition
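The list above can be tried out one metacharacter at a time. This is an illustrative sketch only: the patterns and sample strings are made up for this example, and each pair is chosen so that the pattern matches its string.

```javascript
// Each entry pairs a pattern with a string it is expected to match.
const checks = [
  [/^The/, "The cat"],      // ^ anchors the match to the beginning
  [/hat$/, "the hat"],      // $ anchors the match to the end
  [/c.t/, "cut"],           // . stands in for any single character
  [/cat|dog/, "hot dog"],   // | matches either alternative
  [/colou?r/, "color"],     // ? makes the "u" optional (zero or one)
  [/ab*c/, "ac"],           // * allows zero or more "b"
  [/ab+c/, "abbc"],         // + requires one or more "b"
  [/(ha)+/, "hahaha"],      // () groups "ha" so that + repeats the group
  [/[aeiou]/, "sky blue"],  // [] matches any one character in the set
  [/^[^aeiou]/, "sky"],     // ^ inside brackets negates the set
  [/^a{3}$/, "aaa"],        // {} gives an exact repetition count
  [/3\.14/, "pi is 3.14"],  // \ escapes the . so it is taken literally
];

const allMatch = checks.every(([pattern, text]) => pattern.test(text));

// A counter-example: + needs at least one "b", so "ac" does not match
const plusNeedsOne = /ab+c/.test("ac");

console.log(allMatch, plusNeedsOne);
```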
14. Demonstration of Special Characters
String: ...To log in to your email use the
username: "ben.simpson@mail.com" with a
password "password123"...
RegExp: /username: "(.*)" .* password "(.*)"/
Results: 1. ben.simpson@mail.com
2. password123
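The demonstration can be run roughly as follows. This sketch assumes the sample string uses straight double quotes and that the pattern includes the colon after "username" so that it lines up with the string; the variable names are hypothetical.

```javascript
const text =
  '...To log in to your email use the username: "ben.simpson@mail.com" ' +
  'with a password "password123"...';

// Each pair of parentheses captures whatever sits between the quotes
const pattern = /username: "(.*)" .* password "(.*)"/;
const result = text.match(pattern);

// result[0] is the whole match; numbered groups start at index 1
const username = result[1];
const password = result[2];

console.log(username, password);
```

Because `.*` is greedy, a pattern like this only behaves when the surrounding text is as predictable as it is here.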
15. Shorthand Character Classes
● \d - digit [0-9]
● \w - word character [A-Za-z0-9_]
● \s - whitespace
● \D - non-digit [^\d]
● \W - non-word character [^\w]
● \S - non-whitespace [^\s]
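A short sketch of the shorthand classes in use. The sample string is invented for illustration.

```javascript
const sample = "Room 42, Floor 7";

// \d+ collects runs of digits
const digits = sample.match(/\d+/g);

// \w+ collects runs of word characters (letters, digits, underscore)
const words = sample.match(/\w+/g);

// \s matches whitespace, so replacing it removes the spaces
const noSpaces = sample.replace(/\s/g, "");

// \D is the opposite of \d, so replacing it leaves only the digits
const digitsOnly = sample.replace(/\D/g, "");

console.log(digits, words, noSpaces, digitsOnly);
```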
17. Thinking about a Telephone Pattern
● Optional international code
● 3 digit area code
● 7 digit number
● Optional extension
● What about alpha phrases? (e.g. 678 466-HELP)
● What is the length of intl codes? (e.g. 358 for Finland)
● Are parentheses optional?
● Is spacing optional?
● Country specific formats (e.g. France 06 87 71 23 45)
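One possible sketch of the first few rules above, not a complete telephone validator. It assumes an international code of 1 to 3 digits, an extension of 1 to 5 digits written as "x" or "ext", and space, dot, or hyphen as separators. It does not check that parentheses are balanced, and it deliberately ignores alpha phrases and country specific formats.

```javascript
const phonePattern = new RegExp(
  "^" +
    "(\\+?\\d{1,3}[ .-]?)?" +      // optional international code (assumed 1 to 3 digits)
    "\\(?\\d{3}\\)?[ .-]?" +       // 3 digit area code, parentheses optional
    "\\d{3}[ .-]?\\d{4}" +         // 7 digit number, separator optional
    "( ?(x|ext\\.?) ?\\d{1,5})?" + // optional extension (assumed 1 to 5 digits)
    "$"
);

const accepted = [
  "678-466-4357",
  "(678) 466-4357",
  "6784664357",
  "+1 678 466 4357",
  "678 466-4357 ext. 12",
].every((number) => phonePattern.test(number));

// The open questions from the list are exactly where this pattern gives up
const alphaPhrase = phonePattern.test("678 466-HELP");
const frenchFormat = phonePattern.test("06 87 71 23 45");

console.log(accepted, alphaPhrase, frenchFormat);
```

Even this limited version is already hard to read at a glance, which is the point of the questions in the list.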
22. Surprisingly Difficult
● Seemingly simple patterns can become very
complex.
● It’s best to work against data that is
consistent, or regular, in its implementation of
patterns
● If the data is too dirty, a regular expression
won’t be much help
23. When RegExps Go Bad
● Websites that don’t accept special
characters in email addresses, URLs,
telephone numbers, etc.
● The cause may be RegExps that are too restrictive
● Such RegExps don’t take into account all
variations of a pattern
● Longer expressions are difficult to grok
25. In a Nutshell
“Some people, when confronted with a
problem, think ‘I know, I'll use regular
expressions.’ Now they have two problems.”
-Jamie Zawinski
26. Brain Teaser
Which of the following is a valid email address?
1. thehoagie@gmail.com
2. ben.simpson+work@analoganalytics.com
3. ben+email
4. http://www.clayton.edu
5. abc."defghi".xyz@example.com
27. Thinking about Email Address
● Has a local part (e.g. thehub in thehub@clayton.edu)
● Has a domain part (e.g. clayton.edu in
thehub@clayton.edu)
● Has an @ symbol in the middle
● Do we need to support special characters?
● Can we verify based on minimum /
maximum length?
28. Best to Keep It Simple!
String: thehoagie@gmail.com
RegExp: .*@.*
Yeah, but isn’t there an official email Regex that
takes all the patterns into account? Yes...
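A quick sketch of how the simple pattern behaves against the candidates from the email brain teaser. It only reports what the pattern accepts, not what is officially valid. The stricter variant at the end is an assumption added for comparison, not part of the original slide.

```javascript
const simpleEmail = /.*@.*/;

const emailCandidates = [
  "thehoagie@gmail.com",
  "ben.simpson+work@analoganalytics.com",
  "ben+email",
  "http://www.clayton.edu",
  'abc."defghi".xyz@example.com',
];

// Anything containing an @ passes; the two entries without one do not
const acceptedEmails = emailCandidates.filter((c) => simpleEmail.test(c));

// .* also matches nothing at all, so a lone "@" slips through
const loneAt = simpleEmail.test("@");

// Hypothetical variant: require at least one character on each side
const slightlyStricter = /^.+@.+$/;
const loneAtStricter = slightlyStricter.test("@");

console.log(acceptedEmails, loneAt, loneAtStricter);
```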
32. Brain Teaser
Which is a valid zipcode?
1. 30022
2. 30022-7155
3. 300131
4. -7155
5. AB123XY
33. Thinking About a Zipcode
● Digits only
● 5 digits mandatory plus optional 4 digit code
● 4 digit code prefixed with a hyphen
● Do other countries use zip codes?
● Pattern is easier because there is less
variation (Thank USPS!)
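The rules above translate fairly directly into a pattern. This is a minimal sketch for US zip codes only, run against the candidates from the zipcode brain teaser; the variable names are illustrative.

```javascript
// 5 digits, then optionally a hyphen followed by 4 more digits
const zipPattern = /^\d{5}(-\d{4})?$/;

const zipCandidates = ["30022", "30022-7155", "300131", "-7155", "AB123XY"];

const validZips = zipCandidates.filter((zip) => zipPattern.test(zip));

console.log(validZips);
```

The ^ and $ anchors matter here: without them, "300131" would pass because it contains five digits in a row.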
34. Brain Teaser
Which is a valid URL?
1. http://www.clayton.edu
2. www.clayton.edu
3. clayton.edu
4. thehub.clayton.edu
5. ben:pass@clayton.edu:80/foo?bar=baz#qux