Repository Development at LC
Daniel Chudnov - 2009-10-01 - dchud at loc gov
       Access 2009 - Charlottetown, PEI
who we are
what we do
what’s next
who we are
30ish people
    dev, QA, PM, ops
from libs, uni, industry, etc.
OSI: Office of Strategic Initiatives
“...capture the digital artifact,
 register and/or deposit it for the
 Copyright Office, pass it along to
   those who decide whether to
include it in the Library, and allow
  it to be incorporated digitally in
 the collection, with the optimum
  flow-through of information for
 registration, cataloging, indexing,
          and preservation.”
                              (search for “LC21”)
or, to be precise
“capture the digital artifact”
“register and/or deposit it for the Copyright Office”
“pass it along”
“allow it to be incorporated digitally in the collection”
“registration, cataloging, indexing, and preservation”
what we do
“capture the
digital artifact”
at scale
world scale
  then
web scale
wdl.org
partners
 all over
the world
content from
  all over
 the world
users
 all over
the world
wdl.org/ru/
wdl.org/zh/
wdl.org/ar/
launched
April 2009
lots of press
9,026 req/s
1.25 Gbit/s
on day one
no crash
just bugs
  (yay!)
that was
new for LC
how?
solaris
apache
 nginx
 mysql
  solr
django
jquery
clean URIs

static pages
global edge caching
what we do
capture the artifact

   pass it along

cataloging, indexing
chroniclingamerica.loc.gov
139,582 title records
1,442,462 pages
freely available
               now


download whole issues - tell friends - mash it up
100+ TB
16 of 50+ states/terr.
 and growing quickly
how?
solaris
apache
 mysql
 solr
django
clean URIs

page caching
capture the artifact

   pass it along

cataloging, indexing,
    preservation
preservation
  storage
 “movage”
capture the artifact
BagIt

packing slip
  for data
data in a Bag
.
|-- bag-info.txt                           <- identifies a bag
|-- bagit.txt                              <- identifies a bag
|-- data                                   <- where the data starts
|   |-- batch.xml
|   |-- batch_1.xml
|   |-- batch_ne_dewitt_rework
|   |   |-- 00206538016_batch.xml
|   |   |-- 00206538028_batch.xml
|   |   `-- sn99021999
|   `-- sn99021999
|       |-- 00206538016
|       |   |-- 0000.jp2
|       |   |-- 0000.pdf
|       |   |-- 0000.tif
|       |   |-- 0000.xml
|       |   |-- 0001.jp2
|       |   |-- 0001.pdf
|       |   |-- 0001.tif
|       |   |-- 0001.xml
|       |   ...
|-- manifest-md5.txt                       <- packing slip
`-- tagmanifest-md5.txt                    <- packing slip
71607ad119be88c842268a76f0b6b9e9   data/sn99021999/00206538107/1884091301/0621.pdf
c602d2ac07508059ce5f5597e239b97f   data/sn99021999/00206538120/1885100601/0831.xml
a59795bd1584532d5cbc0b1d82f75cf8   data/sn99021999/00206538016/1880061401/0593.pdf
3c64fac7e2d49671e0d93908ae42a779   data/sn99021999/00206539616/1888101801/0905.xml
03158a560baa7479b3805d2b45ee02cd   data/sn99021999/00206538028/1880111501/0405.tif
fa56ea18580e1446939ed62709e5b2db   data/sn99021999/00206538077/1883061901/1145.pdf
bf4fb83ff8305e8256970a3466c1a12d   data/sn99021999/00206538120/1885061501/0043.pdf
8f3649fc812de74b9d9443ee90a8ac9c   data/sn99021999/00206538120/1885111101/1109.tif
e0b83a7f9ca228271fdaecf6348e1cec   data/sn99021999/00206538120/1885101201/0871.xml
1c2f84e12792c123ba0aabedd0c0bbad   data/sn99021999/00206538107/1884071401/0197.xml
080e557fe9f68037605e5b80df4bc4ac   data/sn99021999/0020653820A/1888050701/0543.tif
532efe32c156459d9d9589caf618f502   data/sn99021999/00206538120/1885071401/0250.tif
ce607af59a96f2656d9448f38ffda072   data/sn99021999/0020653820A/1888052801/0731.pdf
60b626d8fd40aca1b425e86a004bb055   data/sn99021999/00206539628/1888111801/0088.xml
a467cd62350334c7aa83cf1e9056c1c6   data/sn99021999/00206539616/1888091701/0629.jp2
1a434f7a4d843a2c8ffe8d0824fafc3f   data/sn99021999/00206538028/1880120801/0482.jp2
22996d89b4a3334256afaddcaa0238d8   data/sn99021999/00206538016/1874102001/0259.jp2
36f550da273ad4c592fee1761c98322a   data/sn99021999/00206538016/1880052201/0518.jp2
7f7ccec3f2afae896338498372fd476e   data/sn99021999/00206539616/1888080101/0200.pdf
c247a5d74d0e7f857c534d935661adbe   data/sn99021999/00206538107/1884072601/0286.jp2
4d497a18a154adcc8636239378ab340b   data/sn99021999/00206539628/1889021101/0868.pdf
2e8ca2558b54b5c49b2f20a355a60895   data/sn99021999/00206538065/1882092001/0136.xml
fb71493048e5010100f18012f5060d42   data/sn99021999/00206538028/1880123001/0569.xml
40b100432890b055a5defbfbea815d57   data/sn99021999/00206538107/1884090901/0590.xml
46f6d61480dadc1c988b0baa4de8b6c4   data/sn99021999/00206539628/1888122801/0463.pdf
1cb8af0648e8c9df395b63226fe7371f   data/sn99021999/00206538016/1874101501/0244.pdf
9257834023c683b02f354888b2740b8f   data/sn99021999/00206539616/1888102301/0956.xml
0d52b3b2b1c5459b7e8d500a8566b0bf   data/sn99021999/00206538120/1885080801/0425.tif
defines two things
1. what i think i’m sending you
2. whether you received it
just like a packing slip
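the receiving half of the packing slip can be sketched in a few lines. This is a minimal sketch in Python (standard library only), not the BagIt Library's implementation: it assumes a manifest-md5.txt with one "checksum path" pair per line, as in the listing above, and the names check_bag and demo are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_bag(bag_dir: Path) -> dict:
    """Compare each manifest-md5.txt entry with the file on disk.

    Returns a mapping of relative path -> "ok", "missing", or "changed".
    """
    results = {}
    manifest = bag_dir / "manifest-md5.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(None, 1)
        target = bag_dir / rel_path
        if not target.is_file():
            results[rel_path] = "missing"
        elif md5_of(target) != expected.lower():
            results[rel_path] = "changed"
        else:
            results[rel_path] = "ok"
    return results


def demo() -> dict:
    """Build a tiny throwaway bag (hypothetical file names) and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        bag = Path(tmp)
        (bag / "data").mkdir()
        (bag / "data" / "0000.xml").write_bytes(b"<page/>")
        (bag / "data" / "0002.xml").write_bytes(b"<damaged/>")
        good = hashlib.md5(b"<page/>").hexdigest()
        (bag / "manifest-md5.txt").write_text(
            f"{good}  data/0000.xml\n"
            f"{good}  data/0001.xml\n"
            f"{good}  data/0002.xml\n",
            encoding="utf-8",
        )
        return check_bag(bag)


if __name__ == "__main__":
    for path, status in demo().items():
        print(status, path)
```

the sketch only answers "did i receive what the sender listed"; it does not look for files that arrived without being listed, and it ignores tagmanifest-md5.txt.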
works across
   space
works across
  systems
works across
   orgs
works across
   time
easy to make
md5deep
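to show why it is easy: the sending half is a checksum per file plus a list. A minimal sketch in Python (standard library only), assuming the payload already sits under data/; it writes only manifest-md5.txt, not bagit.txt or the tag manifest, so the result is not yet a complete bag. write_manifest and demo are hypothetical names, and this is not how md5deep or the BagIt Library work internally.

```python
import hashlib
import tempfile
from pathlib import Path


def write_manifest(bag_dir: Path) -> list:
    """Write manifest-md5.txt with one 'checksum  path' line per payload file."""
    lines = []
    for path in sorted((bag_dir / "data").rglob("*")):
        if path.is_file():
            # Whole-file read keeps the sketch short; large files
            # would be better hashed in chunks.
            checksum = hashlib.md5(path.read_bytes()).hexdigest()
            rel_path = path.relative_to(bag_dir).as_posix()
            lines.append(f"{checksum}  {rel_path}")
    (bag_dir / "manifest-md5.txt").write_text(
        "\n".join(lines) + "\n", encoding="utf-8"
    )
    return lines


def demo() -> list:
    """Create a throwaway payload (hypothetical file names) and list it."""
    with tempfile.TemporaryDirectory() as tmp:
        bag = Path(tmp)
        (bag / "data" / "sn99021999").mkdir(parents=True)
        (bag / "data" / "batch.xml").write_bytes(b"<batch/>")
        (bag / "data" / "sn99021999" / "0000.xml").write_bytes(b"<page/>")
        return write_manifest(bag)


if __name__ == "__main__":
    print("\n".join(demo()))
```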
BIL: BagIt Library
bvar@sun9 /ingest/bvar/test $ bag create --dest new_bag test_data/*
12:08:47,044 [main] INFO CommandLineBagDriver : Performing operation: create
2.301112941466272:2.3
12:08:47,141 [main] INFO ManifestImpl : Creating manifest for manifest-md5.txt
12:09:09,493 [main] INFO ManifestImpl : Creating manifest for tagmanifest-md5.txt
12:09:09,511 [main] INFO AbstractBagImpl : Writing bag
12:09:41,507 [main] INFO CommandLineBagDriver : Operation completed.
12:09:41,508 [main] INFO CommandLineBagDriver : Returning 0
bvar@sun9 /ingest/bvar/push/test_bag $ bag isvalid .
11:55:45,582 [main] INFO CommandLineBagDriver : Performing operation: isvalid
11:55:46,378 [main] INFO ManifestImpl : Creating manifest for manifest-md5.txt
11:55:46,458 [main] INFO ManifestImpl : Creating manifest for tagmanifest-md5.txt
11:55:46,540 [main] INFO AbstractBagImpl : Completion check: Result is true.
11:56:21,273 [main] INFO AbstractBagImpl : Validity check: Result is true.
11:56:21,273 [main] INFO CommandLineBagDriver : Result is true.
11:56:21,274 [main] INFO CommandLineBagDriver : Returning 0
bvar@sun9 /ingest/bvar/push/test_bag $
Bagger
free/open source
     releases
     from LC
sf.net/projects/loc-xferutils/



 get yours today - tell friends - start trading bags
that was
new for LC
pass it along
transfer
inventory
workflow
transfer UI - inventory - workflow
how?
apache
spring/mvc
 hibernate
   mysql
and other
automation
 strategies
lots of
   work
still to do
lots of
integration
 still to do
register/deposit
       for
   Copyright
not my area,
    but
we hope to support
    eDeposit
 with these tools
“Deposit Demand”

     June 2009
  Federal Register
Proposed Rulemaking
stay tuned
        or
ask my colleagues :)

    (ask me whom to ask)
but, not my area
“allow it to be... incorporated digitally in the collection”
how?
traditional approach:

  catalog records
    exhibit sites
cost of
integrating everything
        is high
cost of
updating everything
      is high
cost of
consistent web strategies
         is low
for example
Linked Data
use URIs as names for things
       use HTTP URIs
 provide useful information
 include links to other URIs
 http://www.w3.org/DesignIssues/LinkedData.html
id.loc.gov
LCSH
on the web
    free
clean URIs
follow
 your
 nose

formats
view source
<link rel="alternate"
  type="application/rdf+xml"
  href="/authorities/sh00009460.rdf" />
<link rel="alternate"
  type="text/plain"
  href="/authorities/sh00009460.nt" />
<link rel="alternate"
  type="application/json"
  href="/authorities/sh00009460.json" />
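a client can follow its nose from the HTML page to the other formats by reading these link elements. A minimal sketch in Python (standard library only), assuming the page markup looks like the three elements above; AlternateLinkParser and find_alternates are hypothetical names, and nothing is fetched over the network.

```python
from html.parser import HTMLParser


class AlternateLinkParser(HTMLParser):
    """Collect (media type, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also arrive here.
        if tag != "link":
            return
        attributes = dict(attrs)
        if attributes.get("rel") == "alternate":
            self.alternates.append(
                (attributes.get("type"), attributes.get("href"))
            )


def find_alternates(html: str) -> list:
    """Return every alternate representation advertised in the markup."""
    parser = AlternateLinkParser()
    parser.feed(html)
    return parser.alternates


SAMPLE = """
<link rel="alternate"
  type="application/rdf+xml"
  href="/authorities/sh00009460.rdf" />
<link rel="alternate"
  type="text/plain"
  href="/authorities/sh00009460.nt" />
<link rel="alternate"
  type="application/json"
  href="/authorities/sh00009460.json" />
"""

if __name__ == "__main__":
    for media_type, href in find_alternates(SAMPLE):
        print(media_type, href)
```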
<rdf:RDF>
 <rdf:Description rdf:about="http://id.loc.gov/authorities/sh00009460#concept">
  <dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2000-11-27T10:39:57-04:00</dcterms:modified>
  <skos:prefLabel xml:lang="en">National parks and reserves--Prince Edward Island</skos:prefLabel>
  <owl:sameAs rdf:resource="info:lc/authorities/sh00009460"/>
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <skos:inScheme rdf:resource="http://id.loc.gov/authorities#conceptScheme"/>
  <skos:inScheme rdf:resource="http://id.loc.gov/authorities#topicalTerms"/>
  <dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2000-10-17T00:00:00-04:00</dcterms:created>
  <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2002010534#concept"/>
  <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2008004743#concept"/>
  <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2003002637#concept"/>
  <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh00009458#concept"/>
 </rdf:Description>
 <rdf:Description rdf:about="http://id.loc.gov/authorities/sh2002010534#concept">
  <skos:prefLabel xml:lang="en">Prince Edward Island National Park (P.E.I.)</skos:prefLabel>
 </rdf:Description>
                          explicit concepts, schema, meaning
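because the concepts are explicit, a few lines of code can read them. A minimal sketch in Python (standard library only): the SAMPLE string is a trimmed copy of the RDF shown on the slide with the rdf and skos namespace declarations added so it parses on its own; read_concepts and narrower_labels are hypothetical names, and a real client would more likely use a full RDF library.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

# Trimmed from the slide; namespace declarations are added here.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
 <rdf:Description rdf:about="http://id.loc.gov/authorities/sh00009460#concept">
  <skos:prefLabel xml:lang="en">National parks and reserves--Prince Edward Island</skos:prefLabel>
  <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2002010534#concept"/>
  <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh00009458#concept"/>
 </rdf:Description>
 <rdf:Description rdf:about="http://id.loc.gov/authorities/sh2002010534#concept">
  <skos:prefLabel xml:lang="en">Prince Edward Island National Park (P.E.I.)</skos:prefLabel>
 </rdf:Description>
</rdf:RDF>"""


def read_concepts(rdf_xml: str) -> dict:
    """Map each described URI to its preferred label and narrower URIs."""
    root = ET.fromstring(rdf_xml)
    about = f"{{{NS['rdf']}}}about"
    resource = f"{{{NS['rdf']}}}resource"
    concepts = {}
    for description in root.findall("rdf:Description", NS):
        label = description.findtext("skos:prefLabel", default="", namespaces=NS)
        narrower = [n.get(resource) for n in description.findall("skos:narrower", NS)]
        concepts[description.get(about)] = {
            "label": label.strip(),
            "narrower": narrower,
        }
    return concepts


def narrower_labels(concepts: dict, uri: str) -> list:
    """Labels of narrower concepts that this document also describes."""
    return [
        concepts[n]["label"]
        for n in concepts[uri]["narrower"]
        if n in concepts
    ]


if __name__ == "__main__":
    parsed = read_concepts(SAMPLE)
    top = "http://id.loc.gov/authorities/sh00009460#concept"
    print(parsed[top]["label"], "->", narrower_labels(parsed, top))
```

narrower URIs that the document does not describe are left out of the labels; following your nose would mean fetching those URIs next.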
a web of data...
...with precise meaning
at this URI
   is this
concept
 with this
meaning
a standard way
  to refer to
   a heading
freely available
                   now


download the whole thing - tell friends - amaze enemies
that was
new for LC
another example
<link rel="resourcemap"
  type="application/rdf+xml" href="/lccn/sn83030214/1905-01-15/ed-1/seq-25.rdf" />
<link rel="alternate"
  type="image/jp2" href="/lccn/sn83030214/1905-01-15/ed-1/seq-25.jp2" />
<link rel="alternate"
  type="application/pdf" href="/lccn/sn83030214/1905-01-15/ed-1/seq-25.pdf" />
<link rel="alternate"
  type="application/xml" href="/lccn/sn83030214/1905-01-15/ed-1/seq-25/ocr.xml" />
<link rel="alternate"
  type="text/plain" href="/lccn/sn83030214/1905-01-15/ed-1/seq-25/ocr.txt" />
<rdf:Description rdf:about="/lccn/sn83030214/1905-01-15/ed-1/seq-25#page">
    <ore:isDescribedBy rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25.rdf"/>
    <foaf:depiction rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25/thumbnail.jpg"/>
    <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25.jp2"/>
    <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25/ocr.txt"/>
    <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25.pdf"/>
    <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25/ocr.xml"/>
    <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25/thumbnail.jpg"/>
    <rdf:type rdf:resource="http://chroniclingamerica.loc.gov/terms#Page"/>
    <ore:isAggregatedBy rdf:resource="/lccn/sn83030214/1905-01-15/ed-1#issue"/>
    <dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1905-01-15</dcterms:issued>
    <ndnp:sequence rdf:datatype="http://www.w3.org/2001/XMLSchema#long">25</ndnp:sequence>
    <dcterms:title>New-York tribune. - 1905-01-15 - 25</dcterms:title>
</rdf:Description>
OAI-ORE
aggregation
this is a page
it has these files in these formats
it is this sequence number
it is part of this issue
it has this issue date
it has this title
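those statements can be read straight out of the page's RDF. A minimal sketch in Python (standard library only): SAMPLE is a trimmed copy of the description shown above, wrapped in an rdf:RDF element with namespace declarations added so it parses on its own (the ndnp:sequence element is left out because its namespace URI is not shown on the slide); page_summary is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ore": "http://www.openarchives.org/ore/terms/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Trimmed from the slide; the wrapper and namespace declarations are added here.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ore="http://www.openarchives.org/ore/terms/"
         xmlns:dcterms="http://purl.org/dc/terms/">
 <rdf:Description rdf:about="/lccn/sn83030214/1905-01-15/ed-1/seq-25#page">
  <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25.jp2"/>
  <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25.pdf"/>
  <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25/ocr.txt"/>
  <ore:isAggregatedBy rdf:resource="/lccn/sn83030214/1905-01-15/ed-1#issue"/>
  <dcterms:issued>1905-01-15</dcterms:issued>
  <dcterms:title>New-York tribune. - 1905-01-15 - 25</dcterms:title>
 </rdf:Description>
</rdf:RDF>"""


def page_summary(rdf_xml: str) -> dict:
    """Summarize the first described resource: title, date, issue, files."""
    resource = f"{{{NS['rdf']}}}resource"
    page = ET.fromstring(rdf_xml).find("rdf:Description", NS)
    files = [a.get(resource) for a in page.findall("ore:aggregates", NS)]
    issue = page.find("ore:isAggregatedBy", NS)
    return {
        "title": page.findtext("dcterms:title", default="", namespaces=NS),
        "issued": page.findtext("dcterms:issued", default="", namespaces=NS),
        "issue": issue.get(resource) if issue is not None else None,
        # Group aggregated files by extension, e.g. ".pdf" -> URI.
        "files": {PurePosixPath(uri).suffix: uri for uri in files},
    }


if __name__ == "__main__":
    for key, value in page_summary(SAMPLE).items():
        print(key, value)
```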
all explicit concepts
all exposed
 in the app
on the web
that was
new for LC
the web is the API
there’s an API doc...
...it’s just a
bunch of links
“...make resources available and useful...”
from the mission of the Library
“allow it to be... incorporated digitally in the collection”
from the LC21 report
“...sustain and preserve a universal collection...”
from the mission of the Library
each app
consistent
  about
 meaning
follow your nose
        to
concept definitions
in our apps
and in yours
distributed
conceptual
integration
the web is a
universal collection
this is a way to
incorporate digitally
our digital artifacts
   on our web
your digital artifacts
   in your web
our digital artifacts
   in your web
your digital artifacts
    in our web
available
   &
 useful
  &c.
summary
content that scales
  on the way in
apps that scale
on the way out
movage
movage
movage
transfer
  inventory
  workflow

all in active development
the BagIt spec



   try it - it works
free/open source
software releases
free data
you can use
web of data
available and useful
view source:

           wdl.org
 chroniclingamerica.loc.gov
          id.loc.gov
sf.net/projects/loc-xferutils/

   dchud at loc gov - @dchud

DevEX - reference for building teams, processes, and platforms
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 

Repository Development at LC Captures Digital Artifacts

  • 1. Repository Development at LC Daniel Chudnov - 2009-10-01 - dchud at loc gov Access 2009 - Charlottetown, PEI
  • 2. who we are what we do what’s next
  • 4. 30ish people dev, QA, PM, ops from libs, uni, industry, etc.
  • 5. OSI Office of Strategic Initiatives
  • 6. “...capture the digital artifact, register and/or deposit it for the Copyright Office, pass it along to those who decide whether to include it in the Library, and allow it to be incorporated digitally in the collection, with the optimum flow-through of information for registration, cataloging, indexing, and preservation.” (search for “LC21”)
  • 7. or, to be precise
  • 8. “capture the digital artifact, register and/or deposit it for the Copyright Office, pass it along to those who decide whether to include it in the Library, and allow it to be incorporated digitally in the collection, with the optimum flow-through of information for registration, cataloging, indexing, and preservation.” (search for “LC21”)
  • 9. “capture the digital artifact, register and/or deposit it for the Copyright Office, pass it along to those who decide whether to include it in the Library, and allow it to be incorporated digitally in the collection, with the optimum flow-through of information for registration, cataloging, indexing, and preservation.” (search for “LC21”)
  • 10. “capture the digital artifact, register and/or deposit it for the Copyright Office, pass it along to those who decide whether to include it in the Library, and allow it to be incorporated digitally in the collection, with the optimum flow-through of information for registration, cataloging, indexing, and preservation.” (search for “LC21”)
  • 11. “capture the digital artifact, register and/or deposit it for the Copyright Office, pass it along to those who decide whether to include it in the Library, and allow it to be incorporated digitally in the collection, with the optimum flow-through of information for registration, cataloging, indexing, and preservation.” (search for “LC21”)
  • 12. “capture the digital artifact, register and/or deposit it for the Copyright Office, pass it along to those who decide whether to include it in the Library, and allow it to be incorporated digitally in the collection, with the optimum flow-through of information for registration, cataloging, indexing, and preservation.” (search for “LC21”)
  • 16. world scale then web scale
  • 19. content from all over the world
  • 29. how?
  • 30. solaris apache nginx mysql solr django jquery
  • 34. capture the artifact pass it along cataloging, indexing
  • 38. freely available now download whole issues - tell friends - mash it up
  • 39. 100+ TB 16 of 50+ states/terr. and growing quickly
  • 40. how?
  • 43. capture the artifact pass it along cataloging, indexing, preservation
  • 44. preservation storage “movage”
  • 46. BagIt packing slip for data
  • 47. data in a Bag
        .
        |-- bag-info.txt
        |-- bagit.txt
        |-- data
        |   |-- batch.xml
        |   |-- batch_1.xml
        |   |-- batch_ne_dewitt_rework
        |   |   |-- 00206538016_batch.xml
        |   |   |-- 00206538028_batch.xml
        |   |   `-- sn99021999
        |   `-- sn99021999
        |       |-- 00206538016
        |       |   |-- 0000.jp2
        |       |   |-- 0000.pdf
        |       |   |-- 0000.tif
        |       |   |-- 0000.xml
        |       |   |-- 0001.jp2
        |       |   |-- 0001.pdf
        |       |   |-- 0001.tif
        |       |   |-- 0001.xml
  • 48. identifies a bag
        .
        |-- bag-info.txt
        |-- bagit.txt                  <-- identifies a bag
        |-- data
        |   |-- batch.xml
        |   |-- batch_1.xml
        |   |-- batch_ne_dewitt_rework
        |   |   |-- 00206538016_batch.xml
        |   |   |-- 00206538028_batch.xml
        |   |   `-- sn99021999
        |   `-- sn99021999
        |       |-- 00206538016
        |       |   |-- 0000.jp2
        |       |   |-- 0000.pdf
        |       |   |-- 0000.tif
        |       |   |-- 0000.xml
        |       |   |-- 0001.jp2
        |       |   |-- 0001.pdf
        |       |   |-- 0001.tif
        |       |   |-- 0001.xml
  • 49. where the data starts
        .
        |-- bag-info.txt
        |-- bagit.txt
        |-- data                       <-- where the data starts
        |   |-- batch.xml
        |   |-- batch_1.xml
        |   |-- batch_ne_dewitt_rework
        |   |   |-- 00206538016_batch.xml
        |   |   |-- 00206538028_batch.xml
        |   |   `-- sn99021999
        |   `-- sn99021999
        |       |-- 00206538016
        |       |   |-- 0000.jp2
        |       |   |-- 0000.pdf
        |       |   |-- 0000.tif
        |       |   |-- 0000.xml
        |       |   |-- 0001.jp2
        |       |   |-- 0001.pdf
        |       |   |-- 0001.tif
        |       |   |-- 0001.xml
  • 50. packing slip
        .
        |-- bag-info.txt
        |-- bagit.txt
        |-- data
        |   |-- batch.xml
        |   |-- batch_1.xml
        |   |-- batch_ne_dewitt_rework
        |   |   |-- 00206538016_batch.xml
        |   |   |-- 00206538028_batch.xml
        |   |   `-- sn99021999
        |   `-- sn99021999
        |       |-- 00206538016
        |       |   |-- 0000.jp2
        |       |   |-- 0000.pdf
        |       |   |-- 0000.tif
        |       |   |-- 0000.xml
        |       |   |-- 0001.jp2
        |       |   |-- 0001.pdf
        |   ...
        |-- manifest-md5.txt           <-- packing slip
        `-- tagmanifest-md5.txt        <-- packing slip
  • 51.
        71607ad119be88c842268a76f0b6b9e9 data/sn99021999/00206538107/1884091301/0621.pdf
        c602d2ac07508059ce5f5597e239b97f data/sn99021999/00206538120/1885100601/0831.xml
        a59795bd1584532d5cbc0b1d82f75cf8 data/sn99021999/00206538016/1880061401/0593.pdf
        3c64fac7e2d49671e0d93908ae42a779 data/sn99021999/00206539616/1888101801/0905.xml
        03158a560baa7479b3805d2b45ee02cd data/sn99021999/00206538028/1880111501/0405.tif
        fa56ea18580e1446939ed62709e5b2db data/sn99021999/00206538077/1883061901/1145.pdf
        bf4fb83ff8305e8256970a3466c1a12d data/sn99021999/00206538120/1885061501/0043.pdf
        8f3649fc812de74b9d9443ee90a8ac9c data/sn99021999/00206538120/1885111101/1109.tif
        e0b83a7f9ca228271fdaecf6348e1cec data/sn99021999/00206538120/1885101201/0871.xml
        1c2f84e12792c123ba0aabedd0c0bbad data/sn99021999/00206538107/1884071401/0197.xml
        080e557fe9f68037605e5b80df4bc4ac data/sn99021999/0020653820A/1888050701/0543.tif
        532efe32c156459d9d9589caf618f502 data/sn99021999/00206538120/1885071401/0250.tif
        ce607af59a96f2656d9448f38ffda072 data/sn99021999/0020653820A/1888052801/0731.pdf
        60b626d8fd40aca1b425e86a004bb055 data/sn99021999/00206539628/1888111801/0088.xml
        a467cd62350334c7aa83cf1e9056c1c6 data/sn99021999/00206539616/1888091701/0629.jp2
        1a434f7a4d843a2c8ffe8d0824fafc3f data/sn99021999/00206538028/1880120801/0482.jp2
        22996d89b4a3334256afaddcaa0238d8 data/sn99021999/00206538016/1874102001/0259.jp2
        36f550da273ad4c592fee1761c98322a data/sn99021999/00206538016/1880052201/0518.jp2
        7f7ccec3f2afae896338498372fd476e data/sn99021999/00206539616/1888080101/0200.pdf
        c247a5d74d0e7f857c534d935661adbe data/sn99021999/00206538107/1884072601/0286.jp2
        4d497a18a154adcc8636239378ab340b data/sn99021999/00206539628/1889021101/0868.pdf
        2e8ca2558b54b5c49b2f20a355a60895 data/sn99021999/00206538065/1882092001/0136.xml
        fb71493048e5010100f18012f5060d42 data/sn99021999/00206538028/1880123001/0569.xml
        40b100432890b055a5defbfbea815d57 data/sn99021999/00206538107/1884090901/0590.xml
        46f6d61480dadc1c988b0baa4de8b6c4 data/sn99021999/00206539628/1888122801/0463.pdf
        1cb8af0648e8c9df395b63226fe7371f data/sn99021999/00206538016/1874101501/0244.pdf
        9257834023c683b02f354888b2740b8f data/sn99021999/00206539616/1888102301/0956.xml
        0d52b3b2b1c5459b7e8d500a8566b0bf data/sn99021999/00206538120/1885080801/0425.tif
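The manifest on slide 51 pairs an MD5 checksum with a path under data/, which is what lets a receiver check the "packing slip" against what actually arrived. A minimal Python sketch of that check follows. It is not the LC transfer tools' code; the function names (md5_of, verify_manifest) are made up for illustration, and it assumes each manifest line is a checksum, whitespace, then a path relative to the bag root.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(bag_dir, manifest_name="manifest-md5.txt"):
    """Compare every manifest entry with the file on disk.

    Returns a list of (path, problem) tuples. An empty list means each
    listed file is present and matches its recorded checksum.
    """
    bag_dir = Path(bag_dir)
    problems = []
    manifest_text = (bag_dir / manifest_name).read_text(encoding="utf-8")
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Assumed line layout: "<checksum> <path relative to the bag root>"
        expected, rel_path = line.split(None, 1)
        target = bag_dir / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif md5_of(target) != expected.lower():
            problems.append((rel_path, "checksum mismatch"))
    return problems
```

Calling verify_manifest("/path/to/bag") on a received bag and getting an empty list back is roughly the "what i think i'm sending you" check from the sender's side, repeated by the receiver.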
  • 53. 1 what i think i’m sending you
  • 55. just like a packing slip
  • 56. works across space
  • 57. works across systems
  • 58. works across orgs
  • 59. works across time
  • 63.
        bvar@sun9 /ingest/bvar/test $ bag create --dest new_bag test_data/*
        12:08:47,044 [main] INFO CommandLineBagDriver : Performing operation: create 2.301112941466272:2.3
        12:08:47,141 [main] INFO ManifestImpl : Creating manifest for manifest-md5.txt
        12:09:09,493 [main] INFO ManifestImpl : Creating manifest for tagmanifest-md5.txt
        12:09:09,511 [main] INFO AbstractBagImpl : Writing bag
        12:09:41,507 [main] INFO CommandLineBagDriver : Operation completed.
        12:09:41,508 [main] INFO CommandLineBagDriver : Returning 0
        bvar@sun9 /ingest/bvar/push/test_bag $ bag isvalid .
        11:55:45,582 [main] INFO CommandLineBagDriver : Performing operation: isvalid
        11:55:46,378 [main] INFO ManifestImpl : Creating manifest for manifest-md5.txt
        11:55:46,458 [main] INFO ManifestImpl : Creating manifest for tagmanifest-md5.txt
        11:55:46,540 [main] INFO AbstractBagImpl : Completion check: Result is true.
        11:56:21,273 [main] INFO AbstractBagImpl : Validity check: Result is true.
        11:56:21,273 [main] INFO CommandLineBagDriver : Result is true.
        11:56:21,274 [main] INFO CommandLineBagDriver : Returning 0
        bvar@sun9 /ingest/bvar/push/test_bag $
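To make the "create" step above less of a black box, here is a rough Python sketch of what building a bag involves: copy the payload under data/, write bagit.txt, and write a manifest of checksums. This is an illustration only, not how the LC "bag" command is implemented. The function name create_bag is hypothetical, the BagIt-Version value is an assumption, and the sketch skips bag-info.txt and tagmanifest-md5.txt.

```python
import hashlib
import shutil
from pathlib import Path


def create_bag(source_dir, dest_dir):
    """Copy source_dir into dest_dir/data, then write bagit.txt and manifest-md5.txt."""
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    data_dir = dest_dir / "data"
    # copytree creates dest_dir and data/ and fails if data/ already exists.
    shutil.copytree(source_dir, data_dir)

    # Bag declaration; the version string here is an assumption for this sketch.
    (dest_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.96\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # One manifest line per payload file: "<md5>  <path relative to the bag root>"
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(dest_dir).as_posix()}")
    (dest_dir / "manifest-md5.txt").write_text(
        "\n".join(lines) + "\n", encoding="utf-8"
    )
    return dest_dir
```

The sketch reads each file fully into memory to hash it, which is fine for a demo and a poor fit for multi-gigabyte TIFFs; a real tool would hash in chunks.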
  • 65. free/open source releases from LC
  • 66. sf.net/projects/loc-xferutils/ get yours today - tell friends - start trading bags
  • 70. transfer UI - inventory - workflow
  • 71. how?
  • 74. lots of work still to do
  • 76. register/deposit for Copyright
  • 78. we hope to support eDeposit with these tools
  • 79. “Deposit Demand” June 2009 Federal Register Proposed Rulemaking
  • 80. stay tuned or ask my colleagues :) (ask me whom to ask)
  • 81. but, not my area
  • 82. “allow it to be... incorporated digitally in the collection”
  • 83. “allow it to be... incorporated digitally in the collection”
  • 84. how?
  • 85. traditional approach: catalog records exhibit sites
  • 88. cost of consistent web strategies is low
  • 91. use URIs as names for things; use HTTP URIs; provide useful information; include links to other URIs (http://www.w3.org/DesignIssues/LinkedData.html)
  • 95. clean URIs follow your nose formats
  • 97.
        <link rel="alternate" type="application/rdf+xml" href="/authorities/sh00009460.rdf" />
        <link rel="alternate" type="text/plain" href="/authorities/sh00009460.nt" />
        <link rel="alternate" type="application/json" href="/authorities/sh00009460.json" />
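A client can "follow its nose" from the HTML page to the machine-readable formats by reading those link elements. The sketch below uses only the Python standard library; the class and function names are invented for illustration, and the sample markup is the slide's three link tags wrapped in a head element.

```python
from html.parser import HTMLParser


class AlternateLinkParser(HTMLParser):
    """Collect (media type, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also arrive here by default.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").split()
        if "alternate" in rel_values and attributes.get("href"):
            self.alternates.append((attributes.get("type"), attributes["href"]))


def find_alternates(html_text):
    """Return the alternate representations advertised in an HTML document."""
    parser = AlternateLinkParser()
    parser.feed(html_text)
    return parser.alternates


SAMPLE_HEAD = """
<head>
<link rel="alternate" type="application/rdf+xml" href="/authorities/sh00009460.rdf" />
<link rel="alternate" type="text/plain" href="/authorities/sh00009460.nt" />
<link rel="alternate" type="application/json" href="/authorities/sh00009460.json" />
</head>
"""

if __name__ == "__main__":
    for media_type, href in find_alternates(SAMPLE_HEAD):
        print(media_type, href)
```

The hrefs are relative, so a real client would resolve them against the page URL (for example with urllib.parse.urljoin) before fetching.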
  • 98.
        <rdf:RDF>
          <rdf:Description rdf:about="http://id.loc.gov/authorities/sh00009460#concept">
            <dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2000-11-27T10:39:57-04:00</dcterms:modified>
            <skos:prefLabel xml:lang="en">National parks and reserves--Prince Edward Island</skos:prefLabel>
            <owl:sameAs rdf:resource="info:lc/authorities/sh00009460"/>
            <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
            <skos:inScheme rdf:resource="http://id.loc.gov/authorities#conceptScheme"/>
            <skos:inScheme rdf:resource="http://id.loc.gov/authorities#topicalTerms"/>
            <dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2000-10-17T00:00:00-04:00</dcterms:created>
            <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2002010534#concept"/>
            <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2008004743#concept"/>
            <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2003002637#concept"/>
            <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh00009458#concept"/>
          </rdf:Description>
          <rdf:Description rdf:about="http://id.loc.gov/authorities/sh2002010534#concept">
            <skos:prefLabel xml:lang="en">Prince Edward Island National Park (P.E.I.)</skos:prefLabel>
          </rdf:Description>
  • 99. explicit concepts, schema, meaning
        <rdf:RDF>
          <rdf:Description rdf:about="http://id.loc.gov/authorities/sh00009460#concept">
            <dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2000-11-27T10:39:57-04:00</dcterms:modified>
            <skos:prefLabel xml:lang="en">National parks and reserves--Prince Edward Island</skos:prefLabel>
            <owl:sameAs rdf:resource="info:lc/authorities/sh00009460"/>
            <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
            <skos:inScheme rdf:resource="http://id.loc.gov/authorities#conceptScheme"/>
            <skos:inScheme rdf:resource="http://id.loc.gov/authorities#topicalTerms"/>
            <dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2000-10-17T00:00:00-04:00</dcterms:created>
            <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2002010534#concept"/>
            <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2008004743#concept"/>
            <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2003002637#concept"/>
            <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh00009458#concept"/>
          </rdf:Description>
          <rdf:Description rdf:about="http://id.loc.gov/authorities/sh2002010534#concept">
            <skos:prefLabel xml:lang="en">Prince Edward Island National Park (P.E.I.)</skos:prefLabel>
          </rdf:Description>
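Because the concepts and relationships are explicit, a small script can pull out a heading's narrower terms and their labels. The sketch below reads a trimmed copy of the slide's RDF/XML with Python's standard xml.etree.ElementTree. The function name is hypothetical, and the xmlns declarations are added by assumption, since the slide excerpt omits them. A general-purpose RDF library would be the sturdier choice for real data, because RDF/XML can serialize the same statements in several shapes.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Trimmed from the slide; the namespace declarations are assumed, not shown there.
SAMPLE_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://id.loc.gov/authorities/sh00009460#concept">
    <skos:prefLabel xml:lang="en">National parks and reserves--Prince Edward Island</skos:prefLabel>
    <skos:narrower rdf:resource="http://id.loc.gov/authorities/sh2002010534#concept"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://id.loc.gov/authorities/sh2002010534#concept">
    <skos:prefLabel xml:lang="en">Prince Edward Island National Park (P.E.I.)</skos:prefLabel>
  </rdf:Description>
</rdf:RDF>"""


def narrower_labels(rdf_xml, concept_uri):
    """Return (uri, label) pairs for the narrower concepts of concept_uri.

    The label is None when the document does not describe that concept.
    """
    root = ET.fromstring(rdf_xml)
    labels = {}
    narrower = []
    for description in root.findall("rdf:Description", NS):
        about = description.get(RDF_ABOUT)
        label = description.find("skos:prefLabel", NS)
        if label is not None:
            labels[about] = (label.text or "").strip()
        if about == concept_uri:
            narrower = [
                node.get(RDF_RESOURCE)
                for node in description.findall("skos:narrower", NS)
            ]
    return [(uri, labels.get(uri)) for uri in narrower]


if __name__ == "__main__":
    concept = "http://id.loc.gov/authorities/sh00009460#concept"
    for uri, label in narrower_labels(SAMPLE_RDF, concept):
        print(uri, "->", label)
```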
  • 100. a web of data...
  • 102. at this URI is this concept with this meaning
  • 103. a standard way to refer to a heading
  • 104. freely available now download the whole thing - tell friends - amaze enemies
  • 108.
        <link rel="resourcemap" type="application/rdf+xml" href="/lccn/sn83030214/1905-01-15/ed-1/seq-25.rdf" />
        <link rel="alternate" type="image/jp2" href="/lccn/sn83030214/1905-01-15/ed-1/seq-25.jp2" />
        <link rel="alternate" type="application/pdf" href="/lccn/sn83030214/1905-01-15/ed-1/seq-25.pdf" />
        <link rel="alternate" type="application/xml" href="/lccn/sn83030214/1905-01-15/ed-1/seq-25/ocr.xml" />
        <link rel="alternate" type="text/plain" href="/lccn/sn83030214/1905-01-15/ed-1/seq-25/ocr.txt" />
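The hrefs on slide 108 follow a regular pattern: LCCN, issue date, edition, page sequence, then a format suffix. A short Python sketch of that pattern follows. The function name page_urls is made up, and the pattern is inferred only from the five links on the slide, so treat it as an illustration and not as a documented API contract.

```python
def page_urls(lccn, issue_date, edition, sequence):
    """Build the per-format paths for one newspaper page.

    The layout mirrors the link hrefs on the slide; it is an inferred
    pattern and could differ from what the site serves today.
    """
    base = f"/lccn/{lccn}/{issue_date}/ed-{edition}/seq-{sequence}"
    return {
        "application/rdf+xml": base + ".rdf",
        "image/jp2": base + ".jp2",
        "application/pdf": base + ".pdf",
        "application/xml": base + "/ocr.xml",
        "text/plain": base + "/ocr.txt",
    }


if __name__ == "__main__":
    for media_type, path in page_urls("sn83030214", "1905-01-15", 1, 25).items():
        print(media_type, path)
```

In practice a client would rather read the links from the page than construct them, which is the point of "clean URIs" plus "follow your nose": the pattern is guessable, but nothing has to guess.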
  • 109.
        <rdf:Description rdf:about="/lccn/sn83030214/1905-01-15/ed-1/seq-25#page">
          <ore:isDescribedBy rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25.rdf"/>
          <foaf:depiction rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25/thumbnail.jpg"/>
          <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25.jp2"/>
          <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25/ocr.txt"/>
          <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25.pdf"/>
          <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25/ocr.xml"/>
          <ore:aggregates rdf:resource="/lccn/sn83030214/1905-01-15/ed-1/seq-25/thumbnail.jpg"/>
          <rdf:type rdf:resource="http://chroniclingamerica.loc.gov/terms#Page"/>
          <ore:isAggregatedBy rdf:resource="/lccn/sn83030214/1905-01-15/ed-1#issue"/>
          <dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1905-01-15</dcterms:issued>
          <ndnp:sequence rdf:datatype="http://www.w3.org/2001/XMLSchema#long">25</ndnp:sequence>
          <dcterms:title>New-York tribune. - 1905-01-15 - 25</dcterms:title>
        </rdf:Description>
  • 111. this is a page
  • 112. it has these files in these formats
  • 114. it is part of this issue
  • 116. it has this title
  • 118. all exposed in the app on the web
  • 122. there’s an API doc...
  • 124. “...make resources available and useful ...” from the mission of the Library
  • 125. “allow it to be... incorporated digitally in the collection” from the LC21 report
  • 126. “...sustain and preserve a universal collection ...” from the mission of the Library
  • 127. each app consistent about meaning
  • 128. follow your nose to concept definitions
  • 129. in our apps and in yours
  • 131. the web is a universal collection
  • 132. this is a way to incorporate digitally
  • 133. our digital artifacts on our web
  • 134. your digital artifacts in your web
  • 135. our digital artifacts in your web
  • 136. your digital artifacts in our web
  • 137. available & useful &c.
  • 139. content that scales on the way in
  • 140. apps that scale on the way out
  • 142. transfer inventory workflow all in active development
  • 143. the BagIt spec try it - it works
  • 146. web of data available and useful
  • 147. view source: wdl.org chroniclingamerica.loc.gov id.loc.gov sf.net/projects/loc-xferutils/ dchud at loc gov - @dchud