Databases, Markup, and Regular Expressions
2 November 2010
Weekly reflection
•What keeps you from “being technical,” or feeling like
you are?
•Alternately, if you know you’re technical, how did you
get to be that way?
•Or both! What keeps you from feeling as technical as
you actually are?
Tool of the week: text editors
•AKA “programmers’ editors”
•Just the text, ma’am! No binary garbage, no WYSIWYG;
that’s not what these are FOR.
•Look for:
• Regular expressions (“grep”)
• Syntax coloring in your favorite language
• Code-folding, code completion... lots of bells and whistles
•Windows: UltraEdit. Mac: BBEdit (TextWrangler is OK).
Cross-platform: jEdit. Emacs and vi are for geeks only.
Tip of the week: Getting the most out of library school
•An MLS does not guarantee you a library job. Anybody
who says it does is lying to you.
•You get out of library school what you put in.
• The “extras” like workshops, talks, committees? NOT EXTRAS.
• Don’t breeze through. Take the classes that mean something.
• Pick your practicum carefully.
• Look for champions. You’ll need those recommendations.
•Get professionally involved NOW.
•Take any chance to have your résumé and sample cover
letter read by a professional librarian.
The relational database
•Proposed by E. F. Codd in 1970.
•RUNS THE WORLD. Almost every non-trivial web
application you’ll find has a relational DB underneath it.
•Interacts with the outside world (i.e. programs) through
SQL: Structured Query Language.
• There is an actual SQL standard...
• ... but no two databases implement it quite the same way.
• The basics, however, are pretty consistent.
•Taught here at SLIS. If you have any thoughts of being a
techie, TAKE THAT CLASS.
Tables
•Most tables represent the “things” you’re describing.
•Some tables relate those things to each other.
BOOK
book_id  book_isbn  book_barcode
1        441009328  12345_67890
2        441478123  01234_56789
3        441012248  23456_78901
PATRON
patron_id  patron_lname  patron_phone
1          Salo          262-5493
2          Gorman        265-5291
3          Tobias        265-6381
Primary key, foreign key
•Every row in a table should have some kind of unique
identifier within the table: PRIMARY KEY.
• It is often named <thing>_id, and often just a number.
•You can use a PK in other tables to refer to a row. In that
other table, it is a FOREIGN KEY.
•For BOOK, could I have chosen a different PK? PATRON?
The magic: relations!
CHECKOUT
checkout_id  book_id  patron_id
1            2        1
2            3        1
3            1        3
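A minimal sketch of the three tables above, using Python's standard-library sqlite3 module and an in-memory database. The column types and the REFERENCES clauses are assumptions; the slides show only column names and sample rows.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE book (
    book_id      INTEGER PRIMARY KEY,
    book_isbn    TEXT,
    book_barcode TEXT
);
CREATE TABLE patron (
    patron_id    INTEGER PRIMARY KEY,
    patron_lname TEXT,
    patron_phone TEXT
);
-- CHECKOUT relates books to patrons through two foreign keys.
CREATE TABLE checkout (
    checkout_id INTEGER PRIMARY KEY,
    book_id     INTEGER REFERENCES book(book_id),
    patron_id   INTEGER REFERENCES patron(patron_id)
);
""")

conn.executemany("INSERT INTO book VALUES (?, ?, ?)", [
    (1, "441009328", "12345_67890"),
    (2, "441478123", "01234_56789"),
    (3, "441012248", "23456_78901"),
])
conn.executemany("INSERT INTO patron VALUES (?, ?, ?)", [
    (1, "Salo", "262-5493"),
    (2, "Gorman", "265-5291"),
    (3, "Tobias", "265-6381"),
])
conn.executemany("INSERT INTO checkout VALUES (?, ?, ?)", [
    (1, 2, 1),
    (2, 3, 1),
    (3, 1, 3),
])

# A checkout that points at a book_id with no matching BOOK row is refused.
try:
    conn.execute("INSERT INTO checkout VALUES (4, 99, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

checkout_count = conn.execute("SELECT COUNT(*) FROM checkout").fetchone()[0]
print(rejected, checkout_count)
```

The last insert is there to show what the foreign key buys you: the database itself refuses a checkout for a book that does not exist.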
My First SQL Query
•Syntax: SELECT <thing(s) you want> FROM <table(s)>
WHERE <how you know which things you want>;
• Often “how you know...” is the information you’re starting with.
•What’s the barcode on the book with the ISBN 441478123?
• What happens if we have two copies of the book with this ISBN?
SELECT book_barcode FROM book
WHERE book_isbn = '441478123';
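To try the query, here is a small sketch using Python's standard-library sqlite3 module. The column types, the "?" placeholder (sqlite3's way of passing the ISBN in safely), and the fourth book row are assumptions added for illustration; the fourth row exists only to explore the "two copies" question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book ("
    "book_id INTEGER PRIMARY KEY, book_isbn TEXT, book_barcode TEXT)"
)
conn.executemany("INSERT INTO book VALUES (?, ?, ?)", [
    (1, "441009328", "12345_67890"),
    (2, "441478123", "01234_56789"),
    (3, "441012248", "23456_78901"),
])

query = "SELECT book_barcode FROM book WHERE book_isbn = ?"

# One copy on the shelf: one row comes back.
one_copy = conn.execute(query, ("441478123",)).fetchall()
print(one_copy)  # [('01234_56789',)]

# Hypothetical second copy with the same ISBN but its own barcode.
conn.execute("INSERT INTO book VALUES (4, '441478123', '34567_89012')")

# Same query, now one row per matching copy.
two_copies = conn.execute(query, ("441478123",)).fetchall()
print(len(two_copies))  # 2
```

A WHERE clause returns every row that matches, so the same ISBN on two copies yields two barcodes.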
A little harder!
•Who has checked out the book with barcode 12345_67890?
•Oh no! Everything’s in different tables!
Subqueries
•You can put whole queries in the WHERE clause!
•So. What do you want, and from which table?
• patron_lname from the PATRON table
• SELECT patron_lname FROM patron WHERE...
• Or “SELECT patron_lname, patron_phone FROM patron WHERE...”
Where what?
•Where the patron_id is associated with the right book_id in
the CHECKOUT table.
• WHERE patron_id = (SELECT patron_id FROM checkout WHERE...)
Where what?
•You now want the book_id from the BOOK table given the
barcode number.
• WHERE book_id = (SELECT book_id FROM book WHERE book_barcode = '12345_67890')
Putting it all together
•SELECT patron_lname FROM patron WHERE patron_id =
  (SELECT patron_id FROM checkout WHERE book_id =
    (SELECT book_id FROM book WHERE book_barcode =
      '12345_67890'));
•Whew!
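A runnable sketch of the full query, again with Python's standard-library sqlite3 module and assumed column types. The JOIN version is not on the slides; it is a common alternative way to ask the same question and is included only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (book_id INTEGER PRIMARY KEY, book_isbn TEXT, book_barcode TEXT);
CREATE TABLE patron (patron_id INTEGER PRIMARY KEY, patron_lname TEXT, patron_phone TEXT);
CREATE TABLE checkout (checkout_id INTEGER PRIMARY KEY, book_id INTEGER, patron_id INTEGER);

INSERT INTO book VALUES
    (1, '441009328', '12345_67890'),
    (2, '441478123', '01234_56789'),
    (3, '441012248', '23456_78901');
INSERT INTO patron VALUES
    (1, 'Salo', '262-5493'),
    (2, 'Gorman', '265-5291'),
    (3, 'Tobias', '265-6381');
INSERT INTO checkout VALUES (1, 2, 1), (2, 3, 1), (3, 1, 3);
""")

# The nested-subquery form from the slide: barcode -> book_id -> patron_id -> name.
nested = """
SELECT patron_lname FROM patron WHERE patron_id =
  (SELECT patron_id FROM checkout WHERE book_id =
    (SELECT book_id FROM book WHERE book_barcode = '12345_67890'));
"""
nested_rows = conn.execute(nested).fetchall()
print(nested_rows)  # [('Tobias',)]

# The same question asked with JOINs instead of subqueries.
joined = """
SELECT patron.patron_lname
FROM patron
JOIN checkout ON checkout.patron_id = patron.patron_id
JOIN book ON book.book_id = checkout.book_id
WHERE book.book_barcode = '12345_67890';
"""
joined_rows = conn.execute(joined).fetchall()
print(joined_rows)  # [('Tobias',)]
```

One caution: "=" expects the subquery to return a single value. If the same book had been checked out more than once, IN (or the JOIN form) would be the safer choice.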
Markup
XML and (X)HTML
Markup
•In the dark ages of typesetting, we told text what to
look like. [ol0[ep[fy120,10,12,1]blah[ep
• Renear: “presentational” markup.
•Lots of drawbacks to this approach!
• If “what it looks like” changes, you have to change EVERY
SINGLE PLACE where that particular kind of text appears.
• You can’t do ANYTHING consistently across documents with
different designs.
Paragraphs and characters
•Most WYSIWYG programs mark text this way.
• Microsoft Word: “paragraph” and “character” styles.
•Most copyeditors still think this way, too.
• “keymarking” = going through a manuscript to decide what
each paragraph of text is and label it
•Notice the difference! Now you can tell text what to BE.
• Heading 1, Body Text, Abstract, Citation
•What does that let you do?
•But there’s a problem with this, too...
Nested structures
•Structures exist in texts that are bigger than paragraphs.
• A list has a beginning and end... but not within the same list item,
most times! And abstracts can be >1 paragraph.
• What about a section? Or a pullout? Or a chapter?
• Need some hierarchy here!
•WYSIWYG programs can’t do this at all, or do it very badly.
Markup does it very well!
•And so (leaving aside decades of development) we have XML.
Extensible Markup Language
•A set of rules for delimiting text structures.
•Also a family of standards designed to work with
marked-up text structures!
• DOM: Document Object Model (for programmers)
• XSLT: transform one text structure to another
• XPath: drill down into a text structure
• ... etc.
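To make two of these concrete, here is a rough sketch using only Python's standard library. The sample document is an assumption based on the greeting example used in this deck. Note that ElementTree supports only a limited subset of XPath, and the standard library has no XSLT processor, so XSLT is left out.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

doc = '<exclamation type="greeting">Hello, <addressee>World</addressee>!</exclamation>'

# DOM: the document becomes a tree of node objects a program can walk.
dom = minidom.parseString(doc)
addressee_node = dom.getElementsByTagName("addressee")[0]
dom_text = addressee_node.firstChild.data

# XPath (ElementTree's limited subset): a path expression that drills down
# from the root element to the child we want.
root = ET.fromstring(doc)
xpath_text = root.find("./addressee").text

print(dom_text, xpath_text)  # World World
```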
The Rules
•Thou shalt use Unicode, or else mark thy preferred encoding.
•Thou shalt put thy markup in angle brackets, clearly marking the
start and end of a text run with “tags.”
• <exclamation>Hello, World!</exclamation>
•To mark a point instead of a text run, thou shalt use empty tags.
• <empty /> OR <empty></empty>
•Thou shalt enclose thine entire document in ONE SET of tags.
•Thou shalt not permit overlapping text runs; thou shalt keep thy
hierarchy clean.
• Allowed (properly nested): <exclamation>Hello, <addressee>World</addressee>!</exclamation>
• Not allowed (overlapping): <exclamation>Hello, <addressee>World!</exclamation></addressee>
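A quick way to see these rules enforced is to hand some text to an XML parser. This sketch uses Python's standard-library ElementTree; the helper name is_well_formed and the two_roots sample are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

nested = "<exclamation>Hello, <addressee>World</addressee>!</exclamation>"
overlapping = "<exclamation>Hello, <addressee>World!</exclamation></addressee>"
two_roots = "<a>one</a><b>two</b>"  # breaks the ONE SET of enclosing tags rule
empty_tag = "<empty />"

print(is_well_formed(nested))       # True
print(is_well_formed(overlapping))  # False
print(is_well_formed(two_roots))    # False
print(is_well_formed(empty_tag))    # True
```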
More rules
•To describe a text run further, thou mayst add “attributes” (key-
value pairs) to thy start tags. Thou shalt put quote marks
around the value!
• <exclamation type="greeting">Hello, World!</exclamation>
•Thou shalt neither use angle brackets nor ampersands in thy
text, lest thou confuse the computer. Thou shalt refer to them
thus: & as &amp;, < as &lt;, and > as &gt;.
•Thou shalt always use the same case in thine tag and attribute
names.
• Not allowed (case mismatch): <exclamation>Hello, World!</EXCLAMATION>
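These rules can be tried out with Python's standard library, as in the sketch below. The sample text in raw is invented for illustration; escape() comes from xml.sax.saxutils and replaces &, <, and > with their entity references.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Text containing characters that would confuse the parser if left alone.
raw = "Salo & Gorman say 1 < 2"
escaped = escape(raw)
print(escaped)  # Salo &amp; Gorman say 1 &lt; 2

# An attribute with a quoted value, plus the escaped text as content.
doc = f'<exclamation type="greeting">{escaped}</exclamation>'
elem = ET.fromstring(doc)
print(elem.get("type"))  # greeting
print(elem.text)         # Salo & Gorman say 1 < 2

# Tag names are case-sensitive, so a mismatched end tag is rejected.
try:
    ET.fromstring("<exclamation>Hello, World!</EXCLAMATION>")
    case_mismatch_accepted = True
except ET.ParseError:
    case_mismatch_accepted = False
print(case_mismatch_accepted)  # False
```

The parser turns the entity references back into the original characters, which is why elem.text matches raw.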
That’s pretty much it.
Those are the rules!
And if your document obeys them, it is
“well-formed.”
But wait!
Don’t different kinds of text have rules of their own?
Markup languages
•The basic rules of XML, plus constraints relating to the
type of text you’re dealing with.
• Tag and attribute name/value constraints
• Hierarchy constraints
• Required/optional constraints
• Constraints on number of occurrences
•These constraints are laid out in a Schema or DTD.
• “Parser” checks that you’ve followed the XML rules and are
“well-formed.”
• “Validator” checks that you’ve followed your constraints. If
you have, you are “well-formed” AND “valid.”
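The difference between the two checks can be sketched in a few lines of Python. This is a toy, not real Schema or DTD validation: the constraints in check() (root must be exclamation, a type attribute is required, at most one addressee child) are invented for illustration, and a real validator would read them from a Schema or DTD instead of hard-coding them.

```python
import xml.etree.ElementTree as ET

def check(text):
    """Return a list of problems; an empty list means well-formed AND valid."""
    # Step 1, the parser's job: is the text well-formed XML at all?
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]

    # Step 2, the validator's job: does it obey our (hypothetical) constraints?
    problems = []
    if root.tag != "exclamation":
        problems.append("root element must be <exclamation>")
    if "type" not in root.attrib:
        problems.append("missing required attribute: type")
    children = list(root)
    if any(child.tag != "addressee" for child in children):
        problems.append("only <addressee> children are allowed")
    if len(children) > 1:
        problems.append("at most one <addressee> is allowed")
    return problems

valid = '<exclamation type="greeting">Hello, <addressee>World</addressee>!</exclamation>'
well_formed_only = "<exclamation>Hello, World!</exclamation>"
broken = "<exclamation>Hello, World!"

print(check(valid))             # []
print(check(well_formed_only))  # ['missing required attribute: type']
print(check(broken)[0].startswith("not well-formed"))  # True
```

The middle case is the interesting one: it obeys every XML rule, so it is well-formed, but it breaks a constraint, so it is not valid.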
Markup languages we use
•XHTML, of course!
• (the “X” is because this version of HTML uses the XML rules)
• (earlier versions of HTML didn’t)
•MODS and METS and MARCXML, oh my!
•TEI
• Text Encoding Initiative
• For marking up books, manuscripts, dictionaries, etc.
•EAD
• Encoded Archival Description
• For marking up finding aids.
Regular expressions
the metadata librarian’s lifesaver!
http://xkcd.com/208

Databases, Markup, and Regular Expressions

  • 1. Databases, Markup, and Regular Expressions 2 November 2010
  • 2. Weekly reflection •What keeps you from “being technical,” or feeling like you are? •Alternately, if you know you’re technical, how did you get to be that way? •Or both! What keeps you from feeling as technical as you actually are?
  • 3. Tool of the week: text editors •AKA “programmers’ editors” •Just the text, ma’am! No binary garbage, no WYSIWYG; that’s not what these are FOR. •Look for: • Regular expressions (“grep”) • Syntax coloring in your favorite language • Code-folding, code completion... lots of bells and whistles •Windows: UltraEdit. Mac: BBEdit (TextWrangler is OK). Cross-platform: jedit. Emacs and vi are for geeks only.
  • 4. Tip of the week: Getting the most out of library school •An MLS does not guarantee you a library job. Anybody who says it does is lying to you. •You get out of library school what you put in. • The “extras” like workshops, talks, committees? NOT EXTRAS. • Don’t breeze through. Take the classes that mean something. • Pick your practicum carefully. • Look for champions. You’ll need those recommendations. •Get professionally involved NOW. •Take any chance to have your résumé and sample cover letter read by a professional librarian.
  • 5. The relational database •Designed by E. F. Codd around 1970. •RUNS THE WORLD. Almost every non-trivial web application you’ll find has a relational DB underneath it. •Interacts with the outside world (i.e. programs) through SQL: Structured Query Language. • There is an actual SQL standard... • ... but no two databases implement it quite the same way. • The basics, however, are pretty consistent. •Taught here at SLIS. If you have any thoughts of being a techie, TAKE THAT CLASS.
  • 6. Tables •Most tables represent the “things” you’re describing. •Some tables relate those things to each other.
      BOOK
      book_id  book_isbn  book_barcode
      1        441009328  12345_67890
      2        441478123  01234_56789
      3        441012248  23456_78901
      PATRON
      patron_id  patron_lname  patron_phone
      1          Salo          262-5493
      2          Gorman        265-5291
      3          Tobias        265-6381
  • 7. Primary key, foreign key •Every row in a table should have some kind of unique identifier within the table: PRIMARY KEY. • It is often named <thing>_id, and often just a number. •You can use a PK in other tables to refer to a row. In that other table, it is a FOREIGN KEY. •For BOOK, could I have chosen a different PK? For PATRON?
  • 8. The magic: relations!
      BOOK
      book_id  book_isbn  book_barcode
      1        441009328  12345_67890
      2        441478123  01234_56789
      3        441012248  23456_78901
      PATRON
      patron_id  patron_lname  patron_phone
      1          Salo          262-5493
      2          Gorman        265-5291
      3          Tobias        265-6381
      CHECKOUT
      checkout_id  book_id  patron_id
      1            2        1
      2            3        1
      3            1        3
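The three tables and their keys can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the slides don't name a database, so SQLite is an assumption, as are the column types, the `NOT NULL` and `UNIQUE` constraints, and the `build_library` helper name.

```python
import sqlite3


def build_library(conn):
    """Create BOOK, PATRON and CHECKOUT and load the rows shown on the slides."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE book (
            book_id      INTEGER PRIMARY KEY,
            book_isbn    TEXT NOT NULL,
            book_barcode TEXT NOT NULL UNIQUE
        );
        CREATE TABLE patron (
            patron_id    INTEGER PRIMARY KEY,
            patron_lname TEXT NOT NULL,
            patron_phone TEXT
        );
        -- CHECKOUT relates books to patrons: its two foreign keys
        -- point at the primary keys of the other two tables.
        CREATE TABLE checkout (
            checkout_id INTEGER PRIMARY KEY,
            book_id     INTEGER NOT NULL REFERENCES book(book_id),
            patron_id   INTEGER NOT NULL REFERENCES patron(patron_id)
        );
    """)
    conn.executemany("INSERT INTO book VALUES (?, ?, ?)", [
        (1, "441009328", "12345_67890"),
        (2, "441478123", "01234_56789"),
        (3, "441012248", "23456_78901"),
    ])
    conn.executemany("INSERT INTO patron VALUES (?, ?, ?)", [
        (1, "Salo", "262-5493"),
        (2, "Gorman", "265-5291"),
        (3, "Tobias", "265-6381"),
    ])
    conn.executemany("INSERT INTO checkout VALUES (?, ?, ?)", [
        (1, 2, 1),
        (2, 3, 1),
        (3, 1, 3),
    ])
    conn.commit()


conn = sqlite3.connect(":memory:")
build_library(conn)

# A checkout that points at a book_id with no matching BOOK row is refused.
try:
    conn.execute("INSERT INTO checkout VALUES (4, 99, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

With the foreign keys declared, the database itself refuses a checkout of a book that isn't in BOOK, which is the point of using keys instead of copying values around.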
  • 9. My First SQL Query •Syntax: SELECT <thing(s) you want> FROM <table(s)> WHERE <how you know which things you want>; • Often “how you know...” is the information you’re starting with. •What’s the barcode on the book with the ISBN 441478123? • SELECT book_barcode FROM book WHERE book_isbn = '441478123'; • What happens if we have two copies of the book with this ISBN?
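A quick way to try this query, and to see what happens with two copies, is an in-memory SQLite database. This is a sketch under assumptions: SQLite stands in for whatever database you use, `barcodes_for_isbn` is a made-up helper name, and row 4 with barcode `34567_89012` is a hypothetical second copy that is not on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (book_id INTEGER PRIMARY KEY, book_isbn TEXT, book_barcode TEXT)"
)
conn.executemany("INSERT INTO book VALUES (?, ?, ?)", [
    (1, "441009328", "12345_67890"),
    (2, "441478123", "01234_56789"),
    (3, "441012248", "23456_78901"),
])


def barcodes_for_isbn(conn, isbn):
    """Return every barcode whose book row carries the given ISBN."""
    rows = conn.execute(
        "SELECT book_barcode FROM book WHERE book_isbn = ? ORDER BY book_id",
        (isbn,),
    ).fetchall()
    return [barcode for (barcode,) in rows]


one_copy = barcodes_for_isbn(conn, "441478123")
print(one_copy)    # ['01234_56789']

# Hypothetical second copy of the same title: same ISBN, different barcode.
conn.execute("INSERT INTO book VALUES (4, '441478123', '34567_89012')")
two_copies = barcodes_for_isbn(conn, "441478123")
print(two_copies)  # ['01234_56789', '34567_89012']
```

The query doesn't break with two copies; it simply returns one row per matching book.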
  • 10. A little harder! •Who has checked out the book with barcode 12345_67890? •Oh no! Everything’s in different tables!
  • 11. Subqueries •You can put whole queries in the WHERE clause! •So. What do you want, and from which table? • patron_lname from the PATRON table • SELECT patron_lname FROM patron WHERE... • Or “SELECT patron_lname, patron_phone FROM patron WHERE...”
  • 12. Where what? •Where the patron_id is associated with the right book_id in the CHECKOUT table. • WHERE patron_id = (SELECT patron_id FROM checkout WHERE...)
  • 13. Where what? •You now want the book_id from the BOOK table given the barcode number. • WHERE book_id = (SELECT book_id FROM book WHERE book_barcode = '12345_67890')
  • 14. Putting it all together •SELECT patron_lname FROM patron WHERE patron_id = (SELECT patron_id FROM checkout WHERE book_id = (SELECT book_id FROM book WHERE book_barcode = '12345_67890')); •Whew!
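The finished query can be run as-is against the slide data. The sketch below assumes SQLite again, and it also shows the same question asked with JOINs; the JOIN version is not on the slides and is included only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (book_id INTEGER PRIMARY KEY, book_isbn TEXT, book_barcode TEXT);
    CREATE TABLE patron (patron_id INTEGER PRIMARY KEY, patron_lname TEXT, patron_phone TEXT);
    CREATE TABLE checkout (checkout_id INTEGER PRIMARY KEY, book_id INTEGER, patron_id INTEGER);
    INSERT INTO book VALUES (1, '441009328', '12345_67890');
    INSERT INTO book VALUES (2, '441478123', '01234_56789');
    INSERT INTO book VALUES (3, '441012248', '23456_78901');
    INSERT INTO patron VALUES (1, 'Salo', '262-5493');
    INSERT INTO patron VALUES (2, 'Gorman', '265-5291');
    INSERT INTO patron VALUES (3, 'Tobias', '265-6381');
    INSERT INTO checkout VALUES (1, 2, 1);
    INSERT INTO checkout VALUES (2, 3, 1);
    INSERT INTO checkout VALUES (3, 1, 3);
""")

# The nested query from the slide, innermost question first:
# barcode -> book_id -> patron_id -> patron_lname.
NESTED = """
    SELECT patron_lname FROM patron WHERE patron_id =
        (SELECT patron_id FROM checkout WHERE book_id =
            (SELECT book_id FROM book WHERE book_barcode = ?))
"""

# The same question with JOINs (an alternative, not from the slides).
JOINED = """
    SELECT patron.patron_lname
    FROM patron
    JOIN checkout ON checkout.patron_id = patron.patron_id
    JOIN book ON book.book_id = checkout.book_id
    WHERE book.book_barcode = ?
"""

nested_result = conn.execute(NESTED, ("12345_67890",)).fetchall()
joined_result = conn.execute(JOINED, ("12345_67890",)).fetchall()
print(nested_result)  # [('Tobias',)]
print(joined_result)  # [('Tobias',)]
```

One caution: `=` expects the inner query to come back with a single value. If a book could appear in CHECKOUT more than once, writing `IN` in place of `=` is the usual way to accept several.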
  • 16. Markup •In the dark ages of typesetting, we told text what to look like. [ol0[ep[fy120,10,12,1]blah[ep • Renear: “presentational” markup. •Lots of drawbacks to this approach! • If “what it looks like” changes, you have to change EVERY SINGLE PLACE where that particular kind of text appears. • You can’t do ANYTHING consistently across documents with different designs.
  • 17. Paragraphs and characters •Most WYSIWYG programs mark text this way. • Microsoft Word: “paragraph” and “character” styles. •Most copyeditors still think this way, too. • “keymarking” = going through a manuscript to decide what each paragraph of text is and label it •Notice the difference! Now you can tell text what to BE. • Heading 1, Body Text, Abstract, Citation •What does that let you do? •But there’s a problem with this, too...
  • 18. Nested structures •Structures exist in texts that are bigger than paragraphs. • A list has a beginning and end... but not within the same list item, most times! And abstracts can be >1 paragraph. • What about a section? Or a pullout? Or a chapter? • Need some hierarchy here! •WYSIWYG programs can’t do this at all, or do it very badly. Markup does it very well! •And so (leaving aside decades of development) we have XML.
  • 19. Extensible Markup Language •A set of rules for delimiting text structures. •Also a family of standards designed to work with marked-up text structures! • DOM: Document Object Model (for programmers) • XSLT: transform one text structure to another • XPath: drill down into a text structure • ... etc.
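To give a feel for "drilling down" into a text structure, here is a small sketch using Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath. The `letter` wrapper element and the second exclamation are invented for the example.

```python
import xml.etree.ElementTree as ET

# A tiny made-up document: one root element holding two exclamations.
doc = ET.fromstring(
    "<letter>"
    "<exclamation>Hello, <addressee>World</addressee>!</exclamation>"
    "<exclamation>Goodbye, <addressee>Moon</addressee>!</exclamation>"
    "</letter>"
)

# Path expression: every <addressee> that sits directly inside an <exclamation>.
addressees = [el.text for el in doc.findall("./exclamation/addressee")]
print(addressees)  # ['World', 'Moon']

# The tree also lets you pull the plain text back out of a marked-up run.
first = doc.find("exclamation")
first_text = "".join(first.itertext())
print(first_text)  # Hello, World!
```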
  • 20. The Rules •Thou shalt use Unicode, or else mark thy preferred encoding. •Thou shalt put thy markup in angle brackets, clearly marking the start and end of a text run with “tags.” • <exclamation>Hello, World!</exclamation> •To mark a point instead of a text run, thou shalt use empty tags. • <empty /> OR <empty></empty> •Thou shalt enclose thine entire document in ONE SET of tags. •Thou shalt not permit overlapping text runs; thou shalt keep thy hierarchy clean. • Right: <exclamation>Hello, <addressee>World</addressee>!</exclamation> • Wrong (overlapping): <exclamation>Hello, <addressee>World!</exclamation></addressee>
  • 21. More rules •To describe a text run further, thou mayst add “attributes” (key-value pairs) to thy start tags. Thou shalt put quote marks around the value! • <exclamation type="greeting">Hello, World!</exclamation> •Thou shalt neither use angle brackets nor ampersands in thy text, lest thou confuse the computer. Thou shalt refer to them thus: & as &amp;, < as &lt;, and > as &gt;. •Thou shalt always use the same case in thine tag and attribute names. • Wrong (mismatched case): <exclamation>Hello, World!</EXCLAMATION>
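You rarely have to swap in `&amp;`, `&lt;` and `&gt;` by hand. As one illustration, Python's standard library has helpers for it; the sample sentence below is made up.

```python
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

raw = "Salo & Gorman say 1 < 2"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)
print(escaped)  # Salo &amp; Gorman say 1 &lt; 2

# quoteattr() escapes an attribute value AND wraps it in quote marks.
attr = quoteattr("greeting")
print(attr)     # "greeting"

xml_text = f"<exclamation type={attr}>{escaped}</exclamation>"

# Parsing turns the entity references back into the original characters.
element = ET.fromstring(xml_text)
round_trip = element.text
```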
  • 22. That’s pretty much it. Those are the rules! And if your document obeys them, it is “well-formed.”
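Any XML parser will tell you whether a document is well-formed. A minimal sketch with Python's standard `xml.etree.ElementTree` follows; `is_well_formed` is a made-up helper name, and the samples are the examples from the rule slides plus two small invented ones.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as a complete XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


samples = {
    "clean hierarchy": "<exclamation>Hello, <addressee>World</addressee>!</exclamation>",
    "empty tag": "<empty />",
    "overlapping runs": "<exclamation>Hello, <addressee>World!</exclamation></addressee>",
    "mismatched case": "<exclamation>Hello, World!</EXCLAMATION>",
    "unquoted attribute": "<exclamation type=greeting>Hello, World!</exclamation>",
    "bare ampersand": "<exclamation>Salo & Gorman</exclamation>",
    "two root elements": "<exclamation>Hello</exclamation><exclamation>World</exclamation>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'NOT well-formed'}")
```

Only the first two samples pass; each of the others breaks exactly one of the rules.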
  • 23. But wait! Don’t different kinds of text have rules of their own?
  • 24. Markup languages •The basic rules of XML, plus constraints relating to the type of text you’re dealing with. • Tag and attribute name/value constraints • Hierarchy constraints • Required/optional constraints • Constraints on number of occurrences •These constraints are laid out in a Schema or DTD. • “Parser” checks that you’ve followed the XML rules and are “well-formed.” • “Validator” checks that you’ve followed your constraints. If you have, you are “well-formed” AND “valid.”
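The difference between "well-formed" and "valid" can be shown with a toy checker. To be clear about what this is: a hand-written stand-in, NOT a real DTD or Schema validator, and the rules it enforces (a `letter` root, a required `type` attribute with two allowed values, at most one `addressee`) are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical constraint: the only values allowed for the type attribute.
ALLOWED_TYPES = {"greeting", "farewell"}


def validation_errors(text):
    """Return a list of problems; an empty list means well-formed AND valid."""
    # Step 1, the parser's job: is the document well-formed at all?
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]

    # Step 2, the validator's job: does it obey this language's constraints?
    errors = []
    if root.tag != "letter":                          # hierarchy constraint
        errors.append("root element must be <letter>")
    exclamations = root.findall("exclamation")
    if not exclamations:                              # required element
        errors.append("<letter> needs at least one <exclamation>")
    for exclamation in exclamations:
        if exclamation.get("type") not in ALLOWED_TYPES:     # attribute value constraint
            errors.append("<exclamation> needs type='greeting' or type='farewell'")
        if len(exclamation.findall("addressee")) > 1:        # occurrence constraint
            errors.append("<exclamation> allows at most one <addressee>")
    return errors


valid_doc = (
    "<letter><exclamation type='greeting'>"
    "Hello, <addressee>World</addressee>!"
    "</exclamation></letter>"
)
invalid_doc = "<letter><exclamation>Hello, World!</exclamation></letter>"
broken_doc = "<letter><exclamation>Hello, World!</letter>"

print(validation_errors(valid_doc))    # []
print(validation_errors(invalid_doc))  # well-formed, but the type attribute is missing
print(validation_errors(broken_doc))   # fails at the well-formedness step
```

In real work the constraints live in a DTD or Schema file and an off-the-shelf validator does this checking for you.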
  • 25. Markup languages we use •XHTML, of course! • (the “X” is because this version of HTML uses the XML rules) • (earlier versions of HTML didn’t) •MODS and METS and MARCXML, oh my! •TEI • Text Encoding Initiative • For marking up books, manuscripts, dictionaries, etc. •EAD • Encoded Archival Description • For marking up finding aids.
  • 26. Regular expressions: the metadata librarian’s lifesaver!