Summary: This document covers regex patterns and constructs in Ruby, including character classes, repetition, grouping, alternation, substitution, and captures through special variables. Examples demonstrate matching text and extracting substrings, with explanations of how each pattern works. Methods that work with regular expressions, such as sub, gsub, match, and scan, are also illustrated.
A quick Ruby Tutorial, Part 3
COMP313
Source: Programming Ruby, The Pragmatic Programmers’ Guide by Dave Thomas, Chad Fowler, and Andy Hunt

Index by each word
class WordIndex
  def initialize
    @index = {}
  end
  def add_to_index(obj, *phrases)
    phrases.each do |phrase|
      phrase.scan(/\w[-\w']+/) do |word|   # extract each word
        word.downcase!
        @index[word] = [] if @index[word].nil?
        @index[word].push(obj)
      end
    end
  end
  def lookup(word)
    @index[word.downcase]
  end
end
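A minimal usage sketch of the word index. The `Song` struct and the sample song are hypothetical stand-ins, since the tutorial's own Song class is not shown here; the class itself is repeated with the regex backslashes written out so the block runs on its own.

```ruby
class WordIndex
  def initialize
    @index = {}
  end

  def add_to_index(obj, *phrases)
    phrases.each do |phrase|
      phrase.scan(/\w[-\w']+/) do |word|   # a word character, then one or more of - ' or word characters
        word = word.downcase
        @index[word] = [] if @index[word].nil?
        @index[word].push(obj)
      end
    end
  end

  def lookup(word)
    @index[word.downcase]
  end
end

# Hypothetical stand-in for the tutorial's Song class.
Song = Struct.new(:name, :artist)

index = WordIndex.new
song  = Song.new("Ain't Misbehavin'", "Fats Waller")
index.add_to_index(song, song.name, song.artist)

found   = index.lookup("FATS")   # lookup downcases its argument
missing = index.lookup("a")      # the pattern never indexes one-letter words

puts found.first.name
p missing
```

Because the pattern requires at least two characters, single-letter words such as "a" are never indexed, so looking them up returns nil.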

Add full index to SongList
class SongList
  def initialize
    @songs = Array.new
    @index = WordIndex.new
  end
  def append(song)
    @songs.push(song)
    @index.add_to_index(song, song.name, song.artist)
    self
  end
  def lookup(word)
    @index.lookup(word)
  end
end

Ranges
1..10
'a'..'z'
my_array = [ 1, 2, 3 ]
0...my_array.length

(1..10).to_a                          # => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
('bar'..'bat').to_a                   # => ["bar", "bas", "bat"]

digits = 0..9
digits.include?(5)                    # => true
digits.min                            # => 0
digits.max                            # => 9
digits.reject {|i| i < 5 }            # => [5, 6, 7, 8, 9]
digits.each {|digit| dial(digit) }    # => 0..9
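The two-dot and three-dot range forms differ only in whether the end point is included. The short sketch below reuses `my_array` from the slide; the other variable names are illustrative.

```ruby
my_array = [1, 2, 3]

inclusive = (0..my_array.length).to_a    # two dots: end point included
exclusive = (0...my_array.length).to_a   # three dots: end point excluded

# The exclusive form yields exactly the valid indices of the array.
pairs = exclusive.map { |i| [i, my_array[i]] }

p inclusive
p exclusive
p pairs
```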

Range example: VU meter
class VU
  include Comparable
  attr :volume
  def initialize(volume)   # 0..9
    @volume = volume
  end
  def inspect
    '#' * @volume
  end
  # Support for ranges
  def <=>(other)
    self.volume <=> other.volume
  end
  def succ
    raise(IndexError, "Volume too big") if @volume >= 9
    VU.new(@volume.succ)
  end
end

VU Ranges / Ranges in conditions
medium_volume = VU.new(4)..VU.new(7)
medium_volume.to_a                   # => [####, #####, ######, #######]
medium_volume.include?(VU.new(3))    # => false

(1..10) === 5          # => true
(1..10) === 15         # => false
(1..10) === 3.14159    # => true
('a'..'j') === 'c'     # => true
('a'..'j') === 'z'     # => false
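The result `(1..10) === 3.14159` being true can look surprising. For a numeric range, `===` compares the value against the two end points rather than walking through the integers the range generates. A small sketch of the difference, with illustrative variable names:

```ruby
range = 1..10

covered = (range === 3.14159)            # end-point comparison: 1 <= 3.14159 <= 10
listed  = range.to_a.include?(3.14159)   # membership in the generated list 1, 2, ..., 10
outside = (range === 15)                 # 15 is beyond the upper end point

p covered
p listed
p outside
```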

Regular expressions
a = Regexp.new('^\s*[a-z]')    # => /^\s*[a-z]/
b = /^\s*[a-z]/                # => /^\s*[a-z]/
c = %r{^\s*[a-z]}              # => /^\s*[a-z]/

name = "Fats Waller"
name =~ /a/                    # => 1
name =~ /z/                    # => nil
/a/ =~ name                    # => 1

Special vars $` $& $'
def show_regexp(a, re)
  if a =~ re
    "#{$`}<<#{$&}>>#{$'}"
  else
    "no match"
  end
end

show_regexp('very interesting', /t/)    # => very in<<t>>eresting
show_regexp('Fats Waller', /a/)         # => F<<a>>ts Waller
show_regexp('Fats Waller', /ll/)        # => Fats Wa<<ll>>er
show_regexp('Fats Waller', /z/)         # => no match

Special chars, anchors
Special characters: ., |, (, ), [, ], {, }, +, \, ^, $, *, and ?

show_regexp('kangaroo', /angar/)              # => k<<angar>>oo
show_regexp('!@%&-_=+', /%&/)                 # => !@<<%&>>-_=+

show_regexp("this is\nthe time", /^the/)      # => this is\n<<the>> time
show_regexp("this is\nthe time", /is$/)       # => this <<is>>\nthe time
show_regexp("this is\nthe time", /\Athis/)    # => <<this>> is\nthe time
show_regexp("this is\nthe time", /\Athe/)     # => no match

More RegExps
show_regexp('Price $12.', /[aeiou]/)             # => Pr<<i>>ce $12.
show_regexp('Price $12.', /[\s]/)                # => Price<< >>$12.
show_regexp('Price $12.', /[[:digit:]]/)         # => Price $<<1>>2.
show_regexp('Price $12.', /[[:space:]]/)         # => Price<< >>$12.
show_regexp('Price $12.', /[[:punct:]aeiou]/)    # => Pr<<i>>ce $12.

a = 'see [Design Patterns-page 123]'
show_regexp(a, /[A-F]/)          # => see [<<D>>esign Patterns-page 123]
show_regexp(a, /[A-Fa-f]/)       # => s<<e>>e [Design Patterns-page 123]
show_regexp(a, /[0-9]/)          # => see [Design Patterns-page <<1>>23]
show_regexp(a, /[0-9][0-9]/)     # => see [Design Patterns-page <<12>>3]
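The anchor examples hinge on the difference between line anchors and string anchors. The sketch below uses the same two-line string and reports match positions with `=~`. The `\z` anchor (end of string) does not appear on the slides and is added here only for symmetry with `\A`.

```ruby
text = "this is\nthe time"

line_start   = (text =~ /^the/)     # ^ matches at the start of any line
string_start = (text =~ /\Athe/)    # \A matches only at the start of the string
line_end     = (text =~ /is$/)      # $ matches at the end of any line
string_end   = (text =~ /time\z/)   # \z matches only at the end of the string

p line_start
p string_start
p line_end
p string_end
```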

character classes
\d    [0-9]             Digit character
\D    [^0-9]            Any character except a digit
\s    [ \t\r\n\f]       Whitespace character
\S    [^ \t\r\n\f]      Any character except whitespace
\w    [A-Za-z0-9_]      Word character
\W    [^A-Za-z0-9_]     Any character except a word character

Repetition
r*        matches zero or more occurrences of r.
r+        matches one or more occurrences of r.
r?        matches zero or one occurrence of r.
r{m,n}    matches at least m and at most n occurrences of r.
r{m,}     matches at least m occurrences of r.
r{m}      matches exactly m occurrences of r.

/ab+/ matches ab, abb, abbbb, …
/(ab)+/ matches ab, abab, ababab, …
/a*/ matches everything (why?)
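One way to answer the "why?" is to look at what `/a*/` actually matches. Zero occurrences count as a match, so the pattern succeeds with an empty string at position 0 of any input. A small sketch, with illustrative variable names:

```ruby
star = /a*/.match("banana")
star_text  = star[0]          # the matched text: empty, because "b" is not an "a"
star_start = star.begin(0)    # the match starts at the very first position

star_no_a = /a*/.match("xyz")[0]        # still a match, again empty
star_run  = /a*/.match("aardvark")[0]   # when the string starts with a's, they are consumed

plus      = /a+/.match("banana")        # + needs at least one "a"
plus_none = /a+/.match("xyz")           # so this fails

p star_text, star_start, star_no_a, star_run, plus[0], plus_none
```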

Greedy matching/Alternatives
a = "The moon is made of cheese"
show_regexp(a, /\w+/)                # => <<The>> moon is made of cheese
show_regexp(a, /\s.*\s/)             # => The<< moon is made of >>cheese
show_regexp(a, /\s.*?\s/)            # => The<< moon >>is made of cheese
show_regexp(a, /[aeiou]{2,99}/)      # => The m<<oo>>n is made of cheese

a = "red ball blue sky"
show_regexp(a, /d|e/)                  # => r<<e>>d ball blue sky
show_regexp(a, /al|lu/)                # => red b<<al>>l blue sky
show_regexp(a, /red ball|angry sky/)   # => <<red ball>> blue sky

Groupings
show_regexp('banana', /an*/)         # => b<<an>>ana
show_regexp('banana', /(an)*/)       # => <<>>banana
show_regexp('banana', /(an)+/)       # => b<<anan>>a

a = 'red ball blue sky'
show_regexp(a, /blue|red/)                # => <<red>> ball blue sky
show_regexp(a, /(blue|red) \w+/)          # => <<red ball>> blue sky
show_regexp(a, /(red|blue) \w+/)          # => <<red ball>> blue sky
show_regexp(a, /red|blue \w+/)            # => <<red>> ball blue sky
show_regexp(a, /red (ball|angry) sky/)    # => no match
a = 'the red angry sky'
show_regexp(a, /red (ball|angry) sky/)    # => the <<red angry sky>>
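Two points from these slides are easy to miss: `.*` takes as much as it can while `.*?` takes as little as it can, and `|` has very low precedence unless parentheses limit it. The sketch below shows both. The `html` string is a made-up example; the colour string is the one from the slides.

```ruby
html = "<b>bold</b> and <i>italic</i>"

greedy = html[/<.*>/]     # .* runs to the last ">" in the string
lazy   = html[/<.*?>/]    # .*? stops at the first ">"

colors = "red ball blue sky"

loose   = colors[/red|blue \w+/]      # alternatives are "red" and "blue \w+"
grouped = colors[/(red|blue) \w+/]    # parentheses limit the alternation to the colour

p greedy, lazy, loose, grouped
```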

Groupings collect matches
"12:50am" =~ /(\d\d):(\d\d)(..)/        # => 0
"Hour is #$1, minute #$2"               # => "Hour is 12, minute 50"

"12:50am" =~ /((\d\d):(\d\d))(..)/      # => 0
"Time is #$1"                           # => "Time is 12:50"
"Hour is #$2, minute #$3"               # => "Hour is 12, minute 50"
"AM/PM is #$4"                          # => "AM/PM is am"

Match inside with \1,…
# match duplicated letter
show_regexp('He said "Hello"', /(\w)\1/)          # => He said "He<<ll>>o"
# match duplicated substrings
show_regexp('Mississippi', /(\w+)\1/)             # => M<<ississ>>ippi

show_regexp('He said "Hello"', /(["']).*?\1/)     # => He said <<"Hello">>
show_regexp("He said 'Hello'", /(["']).*?\1/)     # => He said <<'Hello'>>

Substitute patterns
a = "the quick brown fox"
a.sub(/[aeiou]/, '*')                        # => "th* quick brown fox"
a.gsub(/[aeiou]/, '*')                       # => "th* q**ck br*wn f*x"
a.sub(/\s\S+/, '')                           # => "the brown fox"
a.gsub(/\s\S+/, '')                          # => "the"

a.sub(/^./) {|match| match.upcase }          # => "The quick brown fox"
a.gsub(/[aeiou]/) {|vowel| vowel.upcase }    # => "thE qUIck brOwn fOx"

Upcase every first char
def mixed_case(name)
  name.gsub(/\b\w/) {|word| word.upcase }
end

mixed_case("fats waller")            # => "Fats Waller"
mixed_case("louis armstrong")        # => "Louis Armstrong"
mixed_case("strength in numbers")    # => "Strength In Numbers"
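Captured groups can also be used on the replacement side of `sub` and `gsub`: as `\1`, `\2` inside a single-quoted replacement string, or as `$1`, `$2` inside a block. The following sketch shows both forms; the sample strings and variable names are made up for illustration.

```ruby
name = "Waller, Fats"

# String form: \1 and \2 refer to the first and second group.
swapped = name.sub(/(\w+), (\w+)/, '\2 \1')

# Block form: $1 and $2 are set for each match before the block runs.
capitalized = "fats waller".gsub(/(\w)(\w*)/) { "#{$1.upcase}#{$2}" }

p swapped
p capitalized
```

Single quotes matter in the string form: inside double quotes the backslash would have to be doubled, as in "\\2 \\1".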

Classes behind regexps
re = /(\d+):(\d+)/     # match a time hh:mm
re.class               # => Regexp
md = re.match("Time: 12:34am")
md.class               # => MatchData
md[0]                  # => "12:34"     (same as $&)
md[1]                  # => "12"        (same as $1)
md[2]                  # => "34"        (same as $2)
md.pre_match           # => "Time: "    (same as $`)
md.post_match          # => "am"        (same as $')

optional args for methods
def cool_dude(arg1="Miles", arg2="Coltrane", arg3="Roach")
  "#{arg1}, #{arg2}, #{arg3}."
end

cool_dude                               # => "Miles, Coltrane, Roach."
cool_dude("Bart")                       # => "Bart, Coltrane, Roach."
cool_dude("Bart", "Elwood")             # => "Bart, Elwood, Roach."
cool_dude("Bart", "Elwood", "Linus")    # => "Bart, Elwood, Linus."
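`Regexp#match` returns nil when nothing matches, so code that indexes into the MatchData usually needs a guard. A minimal sketch follows; the helper name `parse_time` is hypothetical and not part of the tutorial.

```ruby
# Returns [hours, minutes] as integers, or nil if no hh:mm time is found.
def parse_time(text)
  md = /(\d+):(\d+)/.match(text)
  return nil unless md            # match returned nil: nothing to extract
  [md[1].to_i, md[2].to_i]
end

hit  = parse_time("Time: 12:34am")
miss = parse_time("no time here")

p hit
p miss
```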

Variable number of args
def varargs(arg1, *rest)
  "Got #{arg1} and #{rest.join(', ')}"
end

varargs("one")                   # => "Got one and "
varargs("one", "two")            # => "Got one and two"
varargs "one", "two", "three"    # => "Got one and two, three"

code blocks again
def take_block(p1)
  if block_given?
    yield(p1)
  else
    p1
  end
end

take_block("no block")                             # => "no block"
take_block("no block") {|s| s.sub(/no /, '') }     # => "block"
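The splat parameter and `block_given?` can be combined in one method. The sketch below uses a hypothetical `greet_all` helper (not from the tutorial) that accepts any number of names and applies an optional block to each result.

```ruby
def greet_all(greeting, *names)
  names.map do |name|
    line = "#{greeting}, #{name}"
    # block_given? and yield refer to the block passed to greet_all itself.
    block_given? ? yield(line) : line
  end
end

none    = greet_all("Hello")                               # no extra args: names is []
plain   = greet_all("Hello", "Miles", "Elwood")
shouted = greet_all("Hello", "Miles") { |s| s.upcase }

p none, plain, shouted
```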

Capture block explicitly
class TaxCalculator
  def initialize(name, &block)
    @name, @block = name, block
  end
  def get_tax(amount)
    "#@name on #{amount} = #{ @block.call(amount) }"
  end
end

tc = TaxCalculator.new("Sales tax") {|amt| amt * 0.075 }
tc.get_tax(100)    # => "Sales tax on 100 = 7.5"
tc.get_tax(250)    # => "Sales tax on 250 = 18.75"

Calling a method
connection.download_MP3("jitterbug") {|p| show_progress(p) }

File.size("testfile")     # => 66
Math.sin(Math::PI/4)      # => 0.707106781186548

self.class                # => Object
self.frozen?              # => false
frozen?                   # => false
self.object_id            # => 969948
object_id                 # => 969948

Multiple return values
def some_method
  100.times do |num|
    square = num*num
    return num, square if square > 1000
  end
end

some_method                  # => [32, 1024]
num, square = some_method
num                          # => 32
square                       # => 1024

Expanding arrays into args
def five(a, b, c, d, e)
  "I was passed #{a} #{b} #{c} #{d} #{e}"
end

five(1, 2, 3, 4, 5 )           # => "I was passed 1 2 3 4 5"
five(1, 2, 3, *['a', 'b'])     # => "I was passed 1 2 3 a b"
five(*(10..14).to_a)           # => "I was passed 10 11 12 13 14"
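Multiple return values and argument expansion fit together: a method that returns several values returns an array, and `*` can spread that array into another method's parameters. A short sketch with hypothetical helpers `min_max` and `spread`:

```ruby
# Returns two values; Ruby packs them into one array.
def min_max(values)
  return values.min, values.max
end

def spread(low, high)
  high - low
end

result    = min_max([3, 1, 7, 0])            # the packed array
low, high = min_max([3, 1, 7, 0])            # parallel assignment unpacks it
width     = spread(*min_max([3, 1, 7, 0]))   # * expands it into two arguments

p result, low, high, width
```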

& for procedure objects
print "(t)imes or (p)lus: "
times = gets
print "number: "
number = Integer(gets)
if times =~ /^t/
  calc = lambda {|n| n*number }
else
  calc = lambda {|n| n+number }
end
puts((1..10).collect(&calc).join(", "))

Sample session:
(t)imes or (p)lus: t
number: 2
2, 4, 6, 8, 10, 12, 14, 16, 18, 20

Keyword args: use Hash
class SongList
  def create_search(name, params)
    # ...
  end
end

list.create_search("short jazz songs", { 'genre' => "jazz",
                                         'duration_less_than' => 270 })

list.create_search('short jazz songs', 'genre' => 'jazz',
                                       'duration_less_than' => 270)
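The calculator example reads from standard input, which makes it awkward to try in a script. The sketch below wraps the same idea in a hypothetical `make_calc` method so the choice and the number are passed as arguments; the `&` prefix then hands the lambda to `collect` as its block.

```ruby
# Builds a lambda that multiplies or adds, depending on the first letter of operation.
def make_calc(operation, number)
  if operation =~ /^t/
    lambda { |n| n * number }
  else
    lambda { |n| n + number }
  end
end

times_two = make_calc("t", 2)
plus_two  = make_calc("p", 2)

times_line = (1..10).collect(&times_two).join(", ")
plus_line  = (1..10).collect(&plus_two).join(", ")

puts times_line
puts plus_line
```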

Expression fun
a=b=c=0                          # => 0
[ 3, 1, 7, 0 ].sort.reverse      # => [7, 3, 1, 0]

song_type = if song.mp3_type == MP3::Jazz
              if song.written < Date.new(1935, 1, 1)
                Song::TradJazz
              else
                Song::Jazz
              end
            else
              Song::Other
            end

More expressions
rating = case votes_cast
         when 0...10 then Rating::SkipThisOne
         when 10...50 then Rating::CouldDoBetter
         else Rating::Rave
         end

command expressions:
`date`    # => "Mon Jan 16 22:32:17 CST 2006\n"
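The `case` expression on the slide depends on constants that are not defined in this handout. A self-contained variant is sketched below, with symbols standing in for the `Rating` constants (an assumption made only so the code runs). It also shows that the three-dot ranges exclude their upper end, so 10 and 50 fall into the next branch.

```ruby
# Symbols replace the slide's Rating::... constants for illustration.
def rating_for(votes_cast)
  case votes_cast
  when 0...10  then :skip_this_one
  when 10...50 then :could_do_better
  else              :rave
  end
end

low  = rating_for(9)     # last value in 0...10
edge = rating_for(10)    # 10 is excluded from 0...10
high = rating_for(50)    # 50 is excluded from 10...50

p low, edge, high
```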