SlideShare ist ein Scribd-Unternehmen logo
1 von 16
Downloaden Sie, um offline zu lesen
hacker 102
 code4lib 2010 preconference
Asheville, NC, USA 2010-02-21
iv. regular expressions

      JavaScript
if all language
      looked like
“aabaaaabbbabaababa”
         it’d be
    easy to parse
parsing
“aabaaaabbbabaababa”
  •   there are two
      elements, “a” and “b”
  •   either may occur in
      any order
  •   /([ab]+)/
• [] denotes “elements” or “class”
• // demarcates regex
• + denotes “one or more of previous thing”
• () denotes “remember this matched group”
• /[ab]/ # an ‘a’ or a ‘b’
• /[ab]+/ # one or more ‘a’s or ‘b’s
• /([ab]+)/ # a group of one or more ‘a’s or ‘b’s
to firebug!
• [a-z] is any lower case char bet. a-z
• [0-9] is any digit
• + is one or more of previous thing
• ? is zero or one of previous thing
• | is or, e.g. [a|b] is ‘a’ or ‘b’
• * is zero to many of previous thing
• . matches any character
• [^a-z] is anything *but* [a-z]
• [a-zA-Z0-9] is any of a-z, A-Z, 0-9
• {5} matches only 5 of the preceding thing
• {2,} matches at least 2 of the preceding thing
• {2,6} matches from 2 to 6 of preceding thing
• [d] is like [0-9] (any digit)
• [S] is any non-whitespace
try this

• visit any web page
• open firebug console
• title = window.document.title
• try regexes to match parts of
  the title
most every language
 has regex support
try unix “grep”
v. glue it together

     Python
problem: Carol’s data
TITLE: ABA journal.
BD. HOLDINGS: Vol. 70 (1984) - Vol. 94 (2008)
CURRENT VOL.: Vol. 95 (2009) -
OTHER LIBRARIES:
      Miami:v. 68 (1982) -
      USDC: v. 88 (2002) -
      Birm.:v. 89 (2003) -
(Formerly: American Bar Association Journal)
(Bound and on Hein)


TITLE: Administrative law review.
BD. HOLDINGS: Vol. 22 (1969/1970) - Vol. 60
(2008)
CURRENT VOL.: Vol. 61 (2009) -
(Bound and on Hein)
starter code
   for you
#!/usr/bin/env python
import re
re_tag = re.compile(r'([A-Z .]+):')
re_title = re.compile('TITLE: (.*)')
for line in open('journals-carol-bean.txt'):
    line = line.strip()
    m1 = re_tag.match(line)
    m2 = re_title.match(line)
    if line == "":
        continue
    print "n->", line, "<-"
    if m1 or m2:
        print "MATCH"
    if m1:
        print 'tag:', m1.groups()
    if m2:
        print 'title:', m2.groups()

Weitere ähnliche Inhalte

Andere mochten auch

web archiving tools and technologies
web archiving tools and technologiesweb archiving tools and technologies
web archiving tools and technologiesDan Chudnov
 
Hacker 102 - regexes w/Javascript, Python
Hacker 102 - regexes w/Javascript, PythonHacker 102 - regexes w/Javascript, Python
Hacker 102 - regexes w/Javascript, PythonDan Chudnov
 
introduction to Django in five slides
introduction to Django in five slides introduction to Django in five slides
introduction to Django in five slides Dan Chudnov
 
collecting twitter data w/social feed manager
collecting twitter data w/social feed managercollecting twitter data w/social feed manager
collecting twitter data w/social feed managerDan Chudnov
 
think locally, code globally - dchud's code4lib japan 2013 talk
think locally, code globally - dchud's code4lib japan 2013 talkthink locally, code globally - dchud's code4lib japan 2013 talk
think locally, code globally - dchud's code4lib japan 2013 talkDan Chudnov
 
Repository Development at LC - Access 2009
Repository Development at LC - Access 2009Repository Development at LC - Access 2009
Repository Development at LC - Access 2009Dan Chudnov
 
TCDL 2009 keynote: Better living through linking
TCDL 2009 keynote: Better living through linkingTCDL 2009 keynote: Better living through linking
TCDL 2009 keynote: Better living through linkingDan Chudnov
 
what i want from linked data
what i want from linked datawhat i want from linked data
what i want from linked dataDan Chudnov
 
CRM: A Business Imperative for Companies during the Global Economic Downturn
CRM: A Business Imperative for Companies during the Global Economic DownturnCRM: A Business Imperative for Companies during the Global Economic Downturn
CRM: A Business Imperative for Companies during the Global Economic DownturnNavik Numsiang
 
WWIC - Library Linked Data as a Customer Service Medium
WWIC - Library Linked Data as a Customer Service MediumWWIC - Library Linked Data as a Customer Service Medium
WWIC - Library Linked Data as a Customer Service MediumDan Chudnov
 
Biodiversity Conservation in the Production Forests of Indonesia
Biodiversity Conservation in the Production Forests of IndonesiaBiodiversity Conservation in the Production Forests of Indonesia
Biodiversity Conservation in the Production Forests of IndonesiaGPFLR
 
Overview of Adaptive Blocking for DDL Research Lab
Overview of Adaptive Blocking for DDL Research LabOverview of Adaptive Blocking for DDL Research Lab
Overview of Adaptive Blocking for DDL Research LabDan Chudnov
 
Capturing the Ephemeral: Collecting Social Media with Social Feed Manager
Capturing the Ephemeral: Collecting Social Media with Social Feed ManagerCapturing the Ephemeral: Collecting Social Media with Social Feed Manager
Capturing the Ephemeral: Collecting Social Media with Social Feed ManagerDan Chudnov
 
Experience Gedepahala Corridor Programme
Experience Gedepahala Corridor ProgrammeExperience Gedepahala Corridor Programme
Experience Gedepahala Corridor ProgrammeGPFLR
 

Andere mochten auch (14)

web archiving tools and technologies
web archiving tools and technologiesweb archiving tools and technologies
web archiving tools and technologies
 
Hacker 102 - regexes w/Javascript, Python
Hacker 102 - regexes w/Javascript, PythonHacker 102 - regexes w/Javascript, Python
Hacker 102 - regexes w/Javascript, Python
 
introduction to Django in five slides
introduction to Django in five slides introduction to Django in five slides
introduction to Django in five slides
 
collecting twitter data w/social feed manager
collecting twitter data w/social feed managercollecting twitter data w/social feed manager
collecting twitter data w/social feed manager
 
think locally, code globally - dchud's code4lib japan 2013 talk
think locally, code globally - dchud's code4lib japan 2013 talkthink locally, code globally - dchud's code4lib japan 2013 talk
think locally, code globally - dchud's code4lib japan 2013 talk
 
Repository Development at LC - Access 2009
Repository Development at LC - Access 2009Repository Development at LC - Access 2009
Repository Development at LC - Access 2009
 
TCDL 2009 keynote: Better living through linking
TCDL 2009 keynote: Better living through linkingTCDL 2009 keynote: Better living through linking
TCDL 2009 keynote: Better living through linking
 
what i want from linked data
what i want from linked datawhat i want from linked data
what i want from linked data
 
CRM: A Business Imperative for Companies during the Global Economic Downturn
CRM: A Business Imperative for Companies during the Global Economic DownturnCRM: A Business Imperative for Companies during the Global Economic Downturn
CRM: A Business Imperative for Companies during the Global Economic Downturn
 
WWIC - Library Linked Data as a Customer Service Medium
WWIC - Library Linked Data as a Customer Service MediumWWIC - Library Linked Data as a Customer Service Medium
WWIC - Library Linked Data as a Customer Service Medium
 
Biodiversity Conservation in the Production Forests of Indonesia
Biodiversity Conservation in the Production Forests of IndonesiaBiodiversity Conservation in the Production Forests of Indonesia
Biodiversity Conservation in the Production Forests of Indonesia
 
Overview of Adaptive Blocking for DDL Research Lab
Overview of Adaptive Blocking for DDL Research LabOverview of Adaptive Blocking for DDL Research Lab
Overview of Adaptive Blocking for DDL Research Lab
 
Capturing the Ephemeral: Collecting Social Media with Social Feed Manager
Capturing the Ephemeral: Collecting Social Media with Social Feed ManagerCapturing the Ephemeral: Collecting Social Media with Social Feed Manager
Capturing the Ephemeral: Collecting Social Media with Social Feed Manager
 
Experience Gedepahala Corridor Programme
Experience Gedepahala Corridor ProgrammeExperience Gedepahala Corridor Programme
Experience Gedepahala Corridor Programme
 

Ähnlich wie Hacker102 - RegExes w/JavaScript and Python

Library Carpentry. Week One: Basics
Library Carpentry. Week One: BasicsLibrary Carpentry. Week One: Basics
Library Carpentry. Week One: BasicsJames Baker
 
Scala in practice - 3 years later
Scala in practice - 3 years laterScala in practice - 3 years later
Scala in practice - 3 years laterpatforna
 
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...Thoughtworks
 
Python advanced 2. regular expression in python
Python advanced 2. regular expression in pythonPython advanced 2. regular expression in python
Python advanced 2. regular expression in pythonJohn(Qiang) Zhang
 
Testing stateful, concurrent, and async systems using test.check
Testing stateful, concurrent, and async systems using test.checkTesting stateful, concurrent, and async systems using test.check
Testing stateful, concurrent, and async systems using test.checkEric Normand
 
Code for Startup MVP (Ruby on Rails) Session 2
Code for Startup MVP (Ruby on Rails) Session 2Code for Startup MVP (Ruby on Rails) Session 2
Code for Startup MVP (Ruby on Rails) Session 2Henry S
 
From Ruby to Scala
From Ruby to ScalaFrom Ruby to Scala
From Ruby to Scalatod esking
 
Perl Intro 3 Datalog Parsing
Perl Intro 3 Datalog ParsingPerl Intro 3 Datalog Parsing
Perl Intro 3 Datalog ParsingShaun Griffith
 
shellScriptAlt.pptx
shellScriptAlt.pptxshellScriptAlt.pptx
shellScriptAlt.pptxNiladriDey18
 
2015 bioinformatics python_strings_wim_vancriekinge
2015 bioinformatics python_strings_wim_vancriekinge2015 bioinformatics python_strings_wim_vancriekinge
2015 bioinformatics python_strings_wim_vancriekingeProf. Wim Van Criekinge
 
Compass, Sass, and the Enlightened CSS Developer
Compass, Sass, and the Enlightened CSS DeveloperCompass, Sass, and the Enlightened CSS Developer
Compass, Sass, and the Enlightened CSS DeveloperWynn Netherland
 
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...Andrea Telatin
 
Things that every JavaScript developer should know by Rachel Appel at FrontCo...
Things that every JavaScript developer should know by Rachel Appel at FrontCo...Things that every JavaScript developer should know by Rachel Appel at FrontCo...
Things that every JavaScript developer should know by Rachel Appel at FrontCo...DevClub_lv
 
FUNDAMENTALS OF REGULAR EXPRESSION (RegEX).pdf
FUNDAMENTALS OF REGULAR EXPRESSION (RegEX).pdfFUNDAMENTALS OF REGULAR EXPRESSION (RegEX).pdf
FUNDAMENTALS OF REGULAR EXPRESSION (RegEX).pdfBryan Alejos
 

Ähnlich wie Hacker102 - RegExes w/JavaScript and Python (20)

Library Carpentry. Week One: Basics
Library Carpentry. Week One: BasicsLibrary Carpentry. Week One: Basics
Library Carpentry. Week One: Basics
 
P3 2018 python_regexes
P3 2018 python_regexesP3 2018 python_regexes
P3 2018 python_regexes
 
PHP - Introduction to PHP
PHP -  Introduction to PHPPHP -  Introduction to PHP
PHP - Introduction to PHP
 
Scala in practice - 3 years later
Scala in practice - 3 years laterScala in practice - 3 years later
Scala in practice - 3 years later
 
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
Scala in-practice-3-years by Patric Fornasier, Springr, presented at Pune Sca...
 
Python advanced 2. regular expression in python
Python advanced 2. regular expression in pythonPython advanced 2. regular expression in python
Python advanced 2. regular expression in python
 
Intro to Perl and Bioperl
Intro to Perl and BioperlIntro to Perl and Bioperl
Intro to Perl and Bioperl
 
Testing stateful, concurrent, and async systems using test.check
Testing stateful, concurrent, and async systems using test.checkTesting stateful, concurrent, and async systems using test.check
Testing stateful, concurrent, and async systems using test.check
 
Code for Startup MVP (Ruby on Rails) Session 2
Code for Startup MVP (Ruby on Rails) Session 2Code for Startup MVP (Ruby on Rails) Session 2
Code for Startup MVP (Ruby on Rails) Session 2
 
From Ruby to Scala
From Ruby to ScalaFrom Ruby to Scala
From Ruby to Scala
 
Perl Intro 3 Datalog Parsing
Perl Intro 3 Datalog ParsingPerl Intro 3 Datalog Parsing
Perl Intro 3 Datalog Parsing
 
shellScriptAlt.pptx
shellScriptAlt.pptxshellScriptAlt.pptx
shellScriptAlt.pptx
 
2015 bioinformatics python_strings_wim_vancriekinge
2015 bioinformatics python_strings_wim_vancriekinge2015 bioinformatics python_strings_wim_vancriekinge
2015 bioinformatics python_strings_wim_vancriekinge
 
Compass, Sass, and the Enlightened CSS Developer
Compass, Sass, and the Enlightened CSS DeveloperCompass, Sass, and the Enlightened CSS Developer
Compass, Sass, and the Enlightened CSS Developer
 
Introduction to Perl and BioPerl
Introduction to Perl and BioPerlIntroduction to Perl and BioPerl
Introduction to Perl and BioPerl
 
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
 
Regular Expressions
Regular ExpressionsRegular Expressions
Regular Expressions
 
Regular expression for everyone
Regular expression for everyoneRegular expression for everyone
Regular expression for everyone
 
Things that every JavaScript developer should know by Rachel Appel at FrontCo...
Things that every JavaScript developer should know by Rachel Appel at FrontCo...Things that every JavaScript developer should know by Rachel Appel at FrontCo...
Things that every JavaScript developer should know by Rachel Appel at FrontCo...
 
FUNDAMENTALS OF REGULAR EXPRESSION (RegEX).pdf
FUNDAMENTALS OF REGULAR EXPRESSION (RegEX).pdfFUNDAMENTALS OF REGULAR EXPRESSION (RegEX).pdf
FUNDAMENTALS OF REGULAR EXPRESSION (RegEX).pdf
 

Kürzlich hochgeladen

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 

Kürzlich hochgeladen (20)

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

Hacker102 - RegExes w/JavaScript and Python

  • 1. hacker 102 code4lib 2010 preconference Asheville, NC, USA 2010-02-21
  • 3. if all language looked like “aabaaaabbbabaababa” it’d be easy to parse
  • 4. parsing “aabaaaabbbabaababa” • there are two elements, “a” and “b” • either may occur in any order • /([ab]+)/
  • 5. • [] denotes “elements” or “class” • // demarcates regex • + denotes “one or more of previous thing” • () denotes “remember this matched group” • /[ab]/ # an ‘a’ or a ‘b’ • /[ab]+/ # one or more ‘a’s or ‘b’s • /([ab]+)/ # a group of one or more ‘a’s or ‘b’s
  • 7. • [a-z] is any lower case char bet. a-z • [0-9] is any digit • + is one or more of previous thing • ? is zero or one of previous thing • | is or, e.g. [a|b] is ‘a’ or ‘b’ • * is zero to many of previous thing • . matches any character
  • 8. • [^a-z] is anything *but* [a-z] • [a-zA-Z0-9] is any of a-z, A-Z, 0-9 • {5} matches only 5 of the preceding thing • {2,} matches at least 2 of the preceding thing • {2,6} matches from 2 to 6 of preceding thing • [d] is like [0-9] (any digit) • [S] is any non-whitespace
  • 9. try this • visit any web page • open firebug console • title = window.document.title • try regexes to match parts of the title
  • 10. most every language has regex support
  • 12. v. glue it together Python
  • 14. TITLE: ABA journal. BD. HOLDINGS: Vol. 70 (1984) - Vol. 94 (2008) CURRENT VOL.: Vol. 95 (2009) - OTHER LIBRARIES: Miami:v. 68 (1982) - USDC: v. 88 (2002) - Birm.:v. 89 (2003) - (Formerly: American Bar Association Journal) (Bound and on Hein) TITLE: Administrative law review. BD. HOLDINGS: Vol. 22 (1969/1970) - Vol. 60 (2008) CURRENT VOL.: Vol. 61 (2009) - (Bound and on Hein)
  • 15. starter code for you
  • 16. #!/usr/bin/env python import re re_tag = re.compile(r'([A-Z .]+):') re_title = re.compile('TITLE: (.*)') for line in open('journals-carol-bean.txt'): line = line.strip() m1 = re_tag.match(line) m2 = re_title.match(line) if line == "": continue print "n->", line, "<-" if m1 or m2: print "MATCH" if m1: print 'tag:', m1.groups() if m2: print 'title:', m2.groups()