Jennifer Bowen, University of Rochester
DC-2010 Conference
October 20, 2010, Pittsburgh, PA
Moving Library Metadata
toward Linked Data: Opportunities
Provided by the eXtensible Catalog
About me…
Currently:
- Librarian
- Technical services administrator
- Software development team co-leader
Formerly:
- Cataloger (MARC)
- Standards developer (RDA)
Maybe someday…Linked Data Expert?
2
My Topics Today
3
Is it feasible to turn legacy library
MARC metadata into Linked Data
in an automated environment,
and,
How can eXtensible Catalog (XC)
software play a role in that
process?
Image source: www.blog.kdl.org
Semantic Web and Linked Data
Semantic Web: a set of technologies that
allow computers to understand the meaning
of information on the web
Linked Data: a mechanism for exposing,
sharing and connecting data on the web,
using identifiers and relationships
4
Linked Data “Expectations of Behavior”
– Use URIs as names for things
– Use HTTP URIs so that people can look up
those names.
– When someone looks up a URI, provide useful
information, using the standards (RDF*,
SPARQL)
– Include links to other URIs so that they can
discover more things.
Tim Berners-Lee, “Design issues”, 2006
http://www.w3.org/DesignIssues/LinkedData.html
5
Linked Data: RDF triple
6
Subject: This presentation
Predicate: has creator
Object: Jennifer Bowen
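The triple on this slide can be written down as data. The following is a minimal sketch, assuming a made-up URI for the presentation (`http://example.org/talks/dc2010-xc` is hypothetical) and the Dublin Core `creator` element as the predicate; it serializes the triple as one N-Triples line.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI naming the relationship
    obj: str        # a URI or a plain literal value


def to_ntriples(t: Triple, object_is_uri: bool = False) -> str:
    """Serialize one triple as a single N-Triples line."""
    if object_is_uri:
        obj = f"<{t.obj}>"
    else:
        # Escape backslashes and double quotes inside the literal.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# The subject URI is an assumption made for illustration only.
talk = Triple(
    subject="http://example.org/talks/dc2010-xc",
    predicate="http://purl.org/dc/elements/1.1/creator",
    obj="Jennifer Bowen",
)
line = to_ntriples(talk)
print(line)
```

Here the object is a plain string. Replacing it with a URI for the person would turn the statement into a link to another resource.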
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
A Reality Check
7
Teaching MARC metadata new tricks?
8
Image source: http://www.englishcafe.com/node/2337
Turning legacy data into Linked Data…
How do we even get started?
9
Getting Started
To create Linked Data, we need:
–Software to transform legacy data
–Analysis: mapping of legacy metadata to
Linked Data properties
10
The software…
11
eXtensible Catalog (XC) is open source,
user-centered, next generation software
for libraries.
XC provides a discovery system and a set
of tools for libraries to manage metadata
and build applications.
XC Software Components
- User Interface: XC User Interface (website on Drupal CMS)
- Metadata Processing: Metadata Services Toolkit
- Connectivity tools: NCIP Toolkit, OAI Toolkit
- Systems shown in the diagram: Integrated Library System, Repository
12
XC’s original metadata goals
- Aggregate MARC and other metadata for
use in new applications
- Define a FRBR-based metadata schema to
support XC’s user-interface functionality
- Create a software application to process
batches of metadata through a set of
services
13
Software development:
a moving target!
14
XC and Linked Data
How can XC help move legacy library
metadata closer to Linked Data?
NOT among XC’s original goals
However, XC software creates an opportunity
to contribute to this effort and provides
important “lessons learned”
15
Converting MARC to Linked Data
What XC software can do:
– Convert MARC codes to vocabulary values
– Remove extraneous data
– Normalize inconsistencies
– Map most MARC fields/subfields and parse to
appropriate FRBR Group 1 entity records
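Two of the steps listed above, converting MARC codes to vocabulary values and normalizing inconsistencies, might look roughly like this. This is a sketch under stated assumptions, not XC's code: the three-entry language table, the `id.loc.gov` URI base, and both function names are chosen here for illustration.

```python
import re

# Small illustrative subset of MARC language codes (assumption: XC's own
# vocabulary values and URIs may differ from the ones used here).
MARC_LANGUAGES = {"eng": "English", "ita": "Italian", "rus": "Russian"}
LANGUAGE_URI_BASE = "http://id.loc.gov/vocabulary/languages/"


def language_from_marc(code: str) -> dict:
    """Convert a MARC language code into a label plus a vocabulary URI."""
    code = code.strip().lower()
    if code not in MARC_LANGUAGES:
        raise KeyError(f"unknown MARC language code: {code}")
    return {
        "code": code,
        "label": MARC_LANGUAGES[code],
        "uri": LANGUAGE_URI_BASE + code,
    }


def normalize_heading(value: str) -> str:
    """Collapse runs of whitespace and drop one trailing ISBD mark (/ : ; ,)."""
    value = re.sub(r"\s+", " ", value).strip()
    return re.sub(r"\s*[/:;,]$", "", value)


print(language_from_marc(" ENG "))
print(normalize_heading("E.E.  Cummings :"))
```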
16
Converting MARC to Linked Data
Problematic areas:
– Some MARC fields/subfields are difficult to
map to appropriate FRBR entities
– Tracking relationships between FRBR entity
records: How many relationships can we
support with XC software?
17
MARC to XC Schema Transformation
- Parses MARCXML records into linked FRBR-based records
- Maps MARCXML data elements to Linked-Data-compatible elements in the XC Schema
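To make the idea of parsing one MARCXML record into linked FRBR-based records concrete, here is a minimal sketch. The tag-to-entity table, the record number `12345`, the id pattern, and the `realizes`/`embodies` link names are all assumptions for illustration; XC's actual transformation maps many more fields and makes finer distinctions (for example, 240 ‡l arguably describes the expression rather than the work).

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Illustrative routing of MARC tags to FRBR Group 1 entities (assumption).
TAG_TO_ENTITY = {"100": "work", "240": "work", "041": "expression", "245": "manifestation"}

# Hypothetical sample record; the 001 value is made up.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">12345</controlfield>
  <datafield tag="041" ind1="1" ind2=" ">
    <subfield code="a">eng</subfield><subfield code="h">ita</subfield>
  </datafield>
  <datafield tag="100" ind1="0" ind2=" ">
    <subfield code="a">Dante Alighieri,</subfield><subfield code="d">1265-1321.</subfield>
  </datafield>
  <datafield tag="240" ind1="1" ind2="0">
    <subfield code="a">Divina commedia.</subfield><subfield code="l">English</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="4">
    <subfield code="a">The divine comedy /</subfield>
  </datafield>
</record>"""


def split_into_entities(marcxml: str) -> dict:
    """Split one MARCXML record into work, expression and manifestation records."""
    root = ET.fromstring(marcxml)
    record_id = root.findtext("marc:controlfield[@tag='001']", namespaces=MARC_NS)
    entities = {
        name: {"id": f"{record_id}/{name}", "fields": []}
        for name in ("work", "expression", "manifestation")
    }
    for field in root.findall("marc:datafield", MARC_NS):
        entity = TAG_TO_ENTITY.get(field.get("tag"))
        if entity is None:
            continue  # unmapped fields are skipped in this sketch
        subfields = {sf.get("code"): sf.text for sf in field.findall("marc:subfield", MARC_NS)}
        entities[entity]["fields"].append((field.get("tag"), subfields))
    # Links between the separate records: these are the relationships to manage.
    entities["expression"]["realizes"] = entities["work"]["id"]
    entities["manifestation"]["embodies"] = entities["expression"]["id"]
    return entities


entities = split_into_entities(SAMPLE)
for name, record in entities.items():
    print(name, record)
```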
Managing Relationships
20
Issue: Managing Multiple Relationships
21
MARC bibliographic records can refer to
multiple FRBR entities of the same type
(analytics that represent multiple
works/expressions, e.g. tracks on a CD)
Issue: Beyond FRBR Group 1 Entities
22
MARC “Alternate Graphic Representation”
(880 fields) can contain data that belong in
records for Group 2 and Group 3 entities
Contributor:
700 1    ‡6 880‐08 ‡a Vasil’ev, Maksim.
880 1    ‡6 700‐08 ‡a Васильев, Максим.
Subject:
600 10 ‡6 880‐06 ‡a Putin, Vladimir Vladimirovich, ‡d 1952‐
880 10 ‡6 600‐06 ‡a Путин, Владимир Владимирович, ‡d 1952‐
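The ‡6 linkage in the example above pairs each regular field with its 880 counterpart: the regular field points to `880` plus an occurrence number, and the 880 field points back to the original tag with the same occurrence number. A rough sketch of that pairing follows; the `Field` type and function name are hypothetical, and the sample uses ASCII hyphens in ‡6.

```python
import re
from typing import NamedTuple


class Field(NamedTuple):
    tag: str
    subfields: dict  # subfield code -> value


# Subfield 6 starts with "<linked tag>-<occurrence number>".
LINK = re.compile(r"^(\d{3})-(\d{2})")


def pair_alternate_scripts(fields: list) -> list:
    """Return (regular field, matching 880 field) pairs using subfield 6."""
    alternates = {}
    for f in fields:
        if f.tag == "880":
            m = LINK.match(f.subfields.get("6", ""))
            if m:
                # Key: (tag the 880 points back to, occurrence number)
                alternates[(m.group(1), m.group(2))] = f
    pairs = []
    for f in fields:
        if f.tag == "880":
            continue
        m = LINK.match(f.subfields.get("6", ""))
        if m and m.group(1) == "880":
            partner = alternates.get((f.tag, m.group(2)))
            if partner is not None:
                pairs.append((f, partner))
    return pairs


FIELDS = [
    Field("700", {"6": "880-08", "a": "Vasil'ev, Maksim."}),
    Field("880", {"6": "700-08", "a": "Васильев, Максим."}),
    Field("600", {"6": "880-06", "a": "Putin, Vladimir Vladimirovich,", "d": "1952-"}),
    Field("880", {"6": "600-06", "a": "Путин, Владимир Владимирович,", "d": "1952-"}),
]
pairs = pair_alternate_scripts(FIELDS)
for regular, alternate in pairs:
    print(regular.tag, regular.subfields["a"], "<->", alternate.subfields["a"])
```

Once paired, the 880 value can travel with the name or subject it belongs to, rather than staying attached to the bibliographic record as a whole.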
If we were to parse this 880 data correctly:
23
[Diagram labels: alternative script of name from 880; alternative script of subject from 880]
Issue: Related Group 1 Entities
Language attribute for a related expression
041  1    ‡a eng ‡h ita
100  0    ‡a Dante Alighieri, ‡d 1265‐1321.
240  10 ‡a Divina commedia. ‡l English
245  14 ‡a The divine comedy / ‡c Dante ; a new verse translation by C.H. Sisson.
500        ‡a Translation of: Divina commedia.
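In the record above, 041 ‡a gives the language of the expression being described and ‡h gives the language of the original, which belongs to a different, related expression. A minimal sketch of separating the two follows; the function name, the id pattern, and the `translation_of` link name are assumptions for illustration.

```python
def expressions_from_041(record_id: str, subfields: list) -> tuple:
    """Build the described expression and, if 041 $h is present, a related one.

    `subfields` is an ordered list of (code, value) pairs from one 041 field.
    """
    described = {
        "id": f"{record_id}/expression",
        "languages": [value for code, value in subfields if code == "a"],
    }
    originals = [value for code, value in subfields if code == "h"]
    related = None
    if originals:
        # The original-language value describes a separate expression record.
        related = {"id": f"{record_id}/expression/original", "languages": originals}
        described["translation_of"] = related["id"]
    return described, related


described, related = expressions_from_041("12345", [("a", "eng"), ("h", "ita")])
print(described)
print(related)
```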
24
If we were to parse 041 ‡h data…
25
[Diagram labels: original language from 041 ‡h; alternative script of name from 880; alternative script of subject from 880]
Managing Relationships Between Entities
26
[Diagram labels: original language from 041 ‡h; alternative script of subject from 880; alternative script of name from 880]
• new records
• changed records
• deleted records
• changed relationships
What we are learning from XC
27
Maintaining links between separate FRBR entity records in a production environment monopolizes system resources and may not be scalable.
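One way to see why link maintenance is costly: every new, changed, or deleted record can force updates to other records that point to it. The sketch below is a toy in-memory index, not XC's implementation; the class and the record ids are hypothetical.

```python
from collections import defaultdict


class LinkIndex:
    """Tracks links between entity records in both directions."""

    def __init__(self):
        self.outgoing = defaultdict(set)  # record id -> ids it links to
        self.incoming = defaultdict(set)  # record id -> ids that link to it

    def link(self, source: str, target: str) -> None:
        self.outgoing[source].add(target)
        self.incoming[target].add(source)

    def delete(self, record_id: str) -> set:
        """Remove a record; return ids of records whose links must be repaired."""
        affected = set(self.incoming.pop(record_id, set()))
        for source in affected:
            self.outgoing[source].discard(record_id)
        for target in self.outgoing.pop(record_id, set()):
            self.incoming[target].discard(record_id)
        return affected


index = LinkIndex()
index.link("manifestation-1", "expression-1")
index.link("manifestation-2", "expression-1")
index.link("expression-1", "work-1")
affected = index.delete("expression-1")
print(sorted(affected))
```

Deleting a single expression record here touches two manifestation records and one work record. With millions of records arriving in batches, that bookkeeping is the scalability concern the slide raises.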
But wait…
28
What does this emphasis on FRBR have to do with Linked Data?
If we can map a MARC data element to a FRBR entity, we can probably convert it to Linked Data.
[Diagram: FRBR Group 1 Entities]
29
But do we have to?
- Do we have to be able to map MARC
elements to a FRBR entity in order to create
Linked Data?
- Would managing RDF triples be more
scalable than managing FRBR-based records
and the relationships between those
records?
Best Practices for Linked Data
- Unique identifiers for XC metadata
records
- Data elements from registered schemas
- Registered vocabularies
30
By attempting to follow best practices in
XC for Linked Data, we hope to facilitate
eventual output of XC metadata in RDF.
RDF Triple
31
Subject: This resource
Predicate: has subject
Object: Poets, American
URIs for each?
RDF Triple – Record identifiers
32
Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: has subject
Object: Poets, American
Identifiers for XC Schema records
33
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:rdvocab="http://rdvocab.info/Elements" xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:rdarole="http://rdvocab.info/roles">
<xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
<dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
<dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
<dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
<rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
<rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
<xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
<xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
</xc:entity>
</xc:frbr>
A persistent, globally unique identifier for each XC Schema record
RDF Triple - Registered Data Elements
34
Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
Object: Poets, American
[Slides 35-37: registered element sets for DCMI, RDA, and XC]
XC Schema “work” record: data elements
38
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:rdvocab="http://rdvocab.info/Elements" xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:rdarole="http://rdvocab.info/roles">
<xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
<dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
<dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
<dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
<rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
<rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
<xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
<xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
</xc:entity>
</xc:frbr>
Data elements from registered namespaces for DC terms, RDA roles and vocab, and XC
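Because each record has an identifier and each element comes from a registered namespace, a record like the one above can be flattened into triples fairly mechanically. The sketch below assumes an abridged copy of the record, and assumes that a predicate URI is formed by joining the namespace and element name with a slash (which matches the `.../Elements/subject` URI shown on the triple slides). It is not XC's RDF output, which the deck lists as a future step.

```python
import xml.etree.ElementTree as ET

XC_NS = "http://www.extensiblecatalog.info/Elements"

# Abridged copy of the XC Schema "work" record, kept short for illustration.
RECORD = """<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:rdvocab="http://rdvocab.info/Elements"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
    <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American</xc:subject>
  </xc:entity>
</xc:frbr>"""


def predicate_uri(tag: str) -> str:
    """Turn ElementTree's '{namespace}local' tag into a single URI."""
    namespace, local = tag[1:].split("}", 1)
    separator = "" if namespace.endswith(("/", "#")) else "/"
    return namespace + separator + local


def record_to_triples(xml_text: str) -> list:
    """Emit (subject, predicate, object) for every element of every entity."""
    root = ET.fromstring(xml_text)
    triples = []
    for entity in root.findall(f"{{{XC_NS}}}entity"):
        subject = entity.get("id")
        for element in entity:
            value = " ".join((element.text or "").split())
            if value:
                triples.append((subject, predicate_uri(element.tag), value))
    return triples


triples = record_to_triples(RECORD)
for triple in triples:
    print(triple)
```

The objects are still plain strings at this point. Swapping them for URIs from registered vocabularies is a separate step.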
RDF Triple - Registered Vocabularies
39
Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
Object: http://id.loc.gov/authorities/sh85103735#concept (Poets, American)
40
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements" …
xmlns:subjid="id.loc.gov/authorities">
<xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
…
<xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
<xc:subject xsi:type="dcterms:LCSH" subjid="sh85103735#concept">Poets, American</xc:subject>
<xc:temporal>20th century</xc:temporal>
<xc:type>Biography</xc:type>
</xc:entity>
</xc:frbr>
XC Work record with embedded URI for LCSH “Poets, American”
RDF Triple
41
Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
Object: http://id.loc.gov/authorities/sh85103735#concept (Poets, American)
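The last step in this progression, replacing a literal heading with a URI from a registered vocabulary, can be sketched as a lookup. The one-entry table below stands in for a real authority lookup, and the function name is hypothetical; only the "Poets, American" URI comes from the slides.

```python
# Stand-in for an authority lookup; only this entry comes from the slides.
LCSH_URIS = {
    "Poets, American": "http://id.loc.gov/authorities/sh85103735#concept",
}


def link_subject(triple: tuple) -> tuple:
    """Swap a literal heading for its registered URI when one is known.

    Returns (subject, predicate, object, object_is_uri).
    """
    subject, predicate, obj = triple
    uri = LCSH_URIS.get(obj)
    if uri is None:
        return (subject, predicate, obj, False)  # keep the literal as-is
    return (subject, predicate, uri, True)


RECORD_ID = "oai:mst.rochester.edu:MST/MARCToXCTransformation/10081"
XC_SUBJECT = "http://www.extensiblecatalog.info/Elements/subject"

linked = link_subject((RECORD_ID, XC_SUBJECT, "Poets, American"))
unlinked = link_subject((RECORD_ID, XC_SUBJECT, "20th century"))
print(linked)
print(unlinked)
```

Headings with no match stay as literals, so a record can mix linked and unlinked values while vocabulary coverage grows.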
Experimenting with Linked Data
42
- Within a MARC or MARCXML environment?
  - Possible to give each record a URI
  - MARC elements themselves don’t have URIs
  - How to embed multiple URIs for registered vocabularies in MARC?
- XC enables experimentation outside of a MARC environment with data that originated as MARC
Making Linked Data a Priority for XC
– Balancing goals
– Time/funding constraints
– What’s our use case?
– Output of Linked Data from XC vs.
– Using Linked Data within XC?
43
XC Linked Data Accomplishments
XC has set the stage for Linked Data by:
- Providing a platform for creating Linked Data
using XC software
- Ensuring that XC Schema records can be
converted to RDF triples as easily as possible
- Enabling others to build upon what we have
accomplished so far.
44
Next Steps
- Monitor RDA implementations
- Develop XC authority control service
- Enable RDF output of XC Schema metadata
- Encourage libraries to use XC software and
contribute to the XC user community
- Seek funding for additional software
development
45
www.eXtensiblecatalog.org
Jennifer Bowen
jbowen@library.rochester.edu
Thank you! Questions?

More Related Content

What's hot

Modèles de données et langages de description ouverts 6 - 2021-2022
Modèles de données et langages de description ouverts   6 - 2021-2022Modèles de données et langages de description ouverts   6 - 2021-2022
Modèles de données et langages de description ouverts 6 - 2021-2022François-Xavier Boffy
 
Deploying PHP applications using Virtuoso as Application Server
Deploying PHP applications using Virtuoso as Application ServerDeploying PHP applications using Virtuoso as Application Server
Deploying PHP applications using Virtuoso as Application Serverwebhostingguy
 
Another RDF Encoding Form
Another RDF Encoding FormAnother RDF Encoding Form
Another RDF Encoding FormJakob .
 
XML.ppt
XML.pptXML.ppt
XML.pptbutest
 
Translation of Relational and Non-Relational Databases into RDF with xR2RML
Translation of Relational and Non-Relational Databases into RDF with xR2RMLTranslation of Relational and Non-Relational Databases into RDF with xR2RML
Translation of Relational and Non-Relational Databases into RDF with xR2RMLFranck Michel
 
Dublin Core In Practice
Dublin Core In PracticeDublin Core In Practice
Dublin Core In PracticeMarcia Zeng
 
call for paper 2012, hard copy of journal, research paper publishing, where t...
call for paper 2012, hard copy of journal, research paper publishing, where t...call for paper 2012, hard copy of journal, research paper publishing, where t...
call for paper 2012, hard copy of journal, research paper publishing, where t...IJERD Editor
 
Applications of Word Vectors in Text Retrieval and Classification
Applications of Word Vectors in Text Retrieval and ClassificationApplications of Word Vectors in Text Retrieval and Classification
Applications of Word Vectors in Text Retrieval and Classificationshakimov
 
The Dublin Core 1:1 Principle in the Age of Linked Data
The Dublin Core 1:1 Principle in the Age of Linked DataThe Dublin Core 1:1 Principle in the Age of Linked Data
The Dublin Core 1:1 Principle in the Age of Linked DataRichard Urban
 
Web ontology language (owl)
Web ontology language (owl)Web ontology language (owl)
Web ontology language (owl)Ameer Sameer
 
Beyond Seamless Access: Meta-data In The Age of Content Integration
Beyond Seamless Access: Meta-data In The Age of Content IntegrationBeyond Seamless Access: Meta-data In The Age of Content Integration
Beyond Seamless Access: Meta-data In The Age of Content IntegrationNew York University
 
Jarrar: RDF Stores -Challenges and Solutions
Jarrar: RDF Stores -Challenges and SolutionsJarrar: RDF Stores -Challenges and Solutions
Jarrar: RDF Stores -Challenges and SolutionsMustafa Jarrar
 
Genre discovery in corpus management systems (2004)
Genre discovery in corpus management systems (2004)Genre discovery in corpus management systems (2004)
Genre discovery in corpus management systems (2004)Joseba Abaitua
 
Metadata Workshop-Maastricht - November 6, 2008
Metadata Workshop-Maastricht - November 6, 2008Metadata Workshop-Maastricht - November 6, 2008
Metadata Workshop-Maastricht - November 6, 2008askamy
 
when the link makes sense
when the link makes sensewhen the link makes sense
when the link makes senseFabien Gandon
 

What's hot (19)

Modèles de données et langages de description ouverts 6 - 2021-2022
Modèles de données et langages de description ouverts   6 - 2021-2022Modèles de données et langages de description ouverts   6 - 2021-2022
Modèles de données et langages de description ouverts 6 - 2021-2022
 
Deploying PHP applications using Virtuoso as Application Server
Deploying PHP applications using Virtuoso as Application ServerDeploying PHP applications using Virtuoso as Application Server
Deploying PHP applications using Virtuoso as Application Server
 
Another RDF Encoding Form
Another RDF Encoding FormAnother RDF Encoding Form
Another RDF Encoding Form
 
XML.ppt
XML.pptXML.ppt
XML.ppt
 
Translation of Relational and Non-Relational Databases into RDF with xR2RML
Translation of Relational and Non-Relational Databases into RDF with xR2RMLTranslation of Relational and Non-Relational Databases into RDF with xR2RML
Translation of Relational and Non-Relational Databases into RDF with xR2RML
 
Dublin Core In Practice
Dublin Core In PracticeDublin Core In Practice
Dublin Core In Practice
 
call for paper 2012, hard copy of journal, research paper publishing, where t...
call for paper 2012, hard copy of journal, research paper publishing, where t...call for paper 2012, hard copy of journal, research paper publishing, where t...
call for paper 2012, hard copy of journal, research paper publishing, where t...
 
Applications of Word Vectors in Text Retrieval and Classification
Applications of Word Vectors in Text Retrieval and ClassificationApplications of Word Vectors in Text Retrieval and Classification
Applications of Word Vectors in Text Retrieval and Classification
 
The Dublin Core 1:1 Principle in the Age of Linked Data
The Dublin Core 1:1 Principle in the Age of Linked DataThe Dublin Core 1:1 Principle in the Age of Linked Data
The Dublin Core 1:1 Principle in the Age of Linked Data
 
Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-
 
RDF data model
RDF data modelRDF data model
RDF data model
 
Web ontology language (owl)
Web ontology language (owl)Web ontology language (owl)
Web ontology language (owl)
 
Beyond Seamless Access: Meta-data In The Age of Content Integration
Beyond Seamless Access: Meta-data In The Age of Content IntegrationBeyond Seamless Access: Meta-data In The Age of Content Integration
Beyond Seamless Access: Meta-data In The Age of Content Integration
 
Jarrar: RDF Stores -Challenges and Solutions
Jarrar: RDF Stores -Challenges and SolutionsJarrar: RDF Stores -Challenges and Solutions
Jarrar: RDF Stores -Challenges and Solutions
 
Efficient RDF Interchange (ERI) Format for RDF Data Streams
Efficient RDF Interchange (ERI) Format for RDF Data StreamsEfficient RDF Interchange (ERI) Format for RDF Data Streams
Efficient RDF Interchange (ERI) Format for RDF Data Streams
 
Genre discovery in corpus management systems (2004)
Genre discovery in corpus management systems (2004)Genre discovery in corpus management systems (2004)
Genre discovery in corpus management systems (2004)
 
Metadata Workshop-Maastricht - November 6, 2008
Metadata Workshop-Maastricht - November 6, 2008Metadata Workshop-Maastricht - November 6, 2008
Metadata Workshop-Maastricht - November 6, 2008
 
Files
FilesFiles
Files
 
when the link makes sense
when the link makes sensewhen the link makes sense
when the link makes sense
 

Similar to Moving Library Metadata Toward Linked Data: Opportunities Provided by the eXtensible Catalog

Linked Data and Locah, UKSG2011
Linked Data and Locah, UKSG2011 Linked Data and Locah, UKSG2011
Linked Data and Locah, UKSG2011 Jane Stevenson
 
Digital Library Applications Of Social Networking Jeju Intl Conference
Digital Library Applications Of Social Networking Jeju Intl ConferenceDigital Library Applications Of Social Networking Jeju Intl Conference
Digital Library Applications Of Social Networking Jeju Intl Conferenceguestbba8ac
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic WebIvan Herman
 
Semantic Web and Web 3.0 - Web Technologies (1019888BNR)
Semantic Web and Web 3.0 - Web Technologies (1019888BNR)Semantic Web and Web 3.0 - Web Technologies (1019888BNR)
Semantic Web and Web 3.0 - Web Technologies (1019888BNR)Beat Signer
 
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)Beat Signer
 
SemWeb Fundamentals - Info Linking & Layering in Practice
SemWeb Fundamentals - Info Linking & Layering in PracticeSemWeb Fundamentals - Info Linking & Layering in Practice
SemWeb Fundamentals - Info Linking & Layering in PracticeDan Brickley
 
Force11 JDDCP workshop presentation, @ Force2015, Oxford
Force11 JDDCP workshop presentation, @ Force2015, OxfordForce11 JDDCP workshop presentation, @ Force2015, Oxford
Force11 JDDCP workshop presentation, @ Force2015, OxfordMark Wilkinson
 
MR^3: Meta-Model Management based on RDFs Revision Reflection
MR^3: Meta-Model Management based on RDFs Revision ReflectionMR^3: Meta-Model Management based on RDFs Revision Reflection
MR^3: Meta-Model Management based on RDFs Revision ReflectionTakeshi Morita
 
THGenius, rdf and open linked data for thesaurus management
THGenius, rdf and open linked data for thesaurus managementTHGenius, rdf and open linked data for thesaurus management
THGenius, rdf and open linked data for thesaurus management@CULT Srl
 
Lee Iverson - How does the web connect content?
Lee Iverson - How does the web connect content?Lee Iverson - How does the web connect content?
Lee Iverson - How does the web connect content?Museums Computer Group
 
Dynamic and repeatable transformation of existing Thesauri and Authority list...
Dynamic and repeatable transformation of existing Thesauri and Authority list...Dynamic and repeatable transformation of existing Thesauri and Authority list...
Dynamic and repeatable transformation of existing Thesauri and Authority list...DESTIN-Informatique.com
 
Spark Community Update - Spark Summit San Francisco 2015
Spark Community Update - Spark Summit San Francisco 2015Spark Community Update - Spark Summit San Francisco 2015
Spark Community Update - Spark Summit San Francisco 2015Databricks
 
Legislative data portals and linked data quality
Legislative data portals and linked data qualityLegislative data portals and linked data quality
Legislative data portals and linked data qualityJose Emilio Labra Gayo
 

Similar to Moving Library Metadata Toward Linked Data: Opportunities Provided by the eXtensible Catalog (20)

Linked Data and Locah, UKSG2011
Linked Data and Locah, UKSG2011 Linked Data and Locah, UKSG2011
Linked Data and Locah, UKSG2011
 
Digital Library Applications Of Social Networking
Digital Library Applications Of Social Networking  Digital Library Applications Of Social Networking
Digital Library Applications Of Social Networking
 
Digital Library Applications Of Social Networking Jeju Intl Conference
Digital Library Applications Of Social Networking Jeju Intl ConferenceDigital Library Applications Of Social Networking Jeju Intl Conference
Digital Library Applications Of Social Networking Jeju Intl Conference
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic Web
 
Semantic Web and Web 3.0 - Web Technologies (1019888BNR)
Semantic Web and Web 3.0 - Web Technologies (1019888BNR)Semantic Web and Web 3.0 - Web Technologies (1019888BNR)
Semantic Web and Web 3.0 - Web Technologies (1019888BNR)
 
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)
Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)
 
mx & dbs
mx & dbsmx & dbs
mx & dbs
 
SemWeb Fundamentals - Info Linking & Layering in Practice
SemWeb Fundamentals - Info Linking & Layering in PracticeSemWeb Fundamentals - Info Linking & Layering in Practice
SemWeb Fundamentals - Info Linking & Layering in Practice
 
Force11 JDDCP workshop presentation, @ Force2015, Oxford
Force11 JDDCP workshop presentation, @ Force2015, OxfordForce11 JDDCP workshop presentation, @ Force2015, Oxford
Force11 JDDCP workshop presentation, @ Force2015, Oxford
 
Semantic web Santhosh N Basavarajappa
Semantic web   Santhosh N BasavarajappaSemantic web   Santhosh N Basavarajappa
Semantic web Santhosh N Basavarajappa
 
MR^3: Meta-Model Management based on RDFs Revision Reflection
MR^3: Meta-Model Management based on RDFs Revision ReflectionMR^3: Meta-Model Management based on RDFs Revision Reflection
MR^3: Meta-Model Management based on RDFs Revision Reflection
 
Biodiversity Informatics on the Semantic Web
Biodiversity Informatics on the Semantic WebBiodiversity Informatics on the Semantic Web
Biodiversity Informatics on the Semantic Web
 
THGenius, rdf and open linked data for thesaurus management
THGenius, rdf and open linked data for thesaurus managementTHGenius, rdf and open linked data for thesaurus management
THGenius, rdf and open linked data for thesaurus management
 
Linked data and voyager
Linked data and voyagerLinked data and voyager
Linked data and voyager
 
Lee Iverson - How does the web connect content?
Lee Iverson - How does the web connect content?Lee Iverson - How does the web connect content?
Lee Iverson - How does the web connect content?
 
Dynamic and repeatable transformation of existing Thesauri and Authority list...
Dynamic and repeatable transformation of existing Thesauri and Authority list...Dynamic and repeatable transformation of existing Thesauri and Authority list...
Dynamic and repeatable transformation of existing Thesauri and Authority list...
 
Semantic Web in Action
Semantic Web in ActionSemantic Web in Action
Semantic Web in Action
 
Spark Community Update - Spark Summit San Francisco 2015
Spark Community Update - Spark Summit San Francisco 2015Spark Community Update - Spark Summit San Francisco 2015
Spark Community Update - Spark Summit San Francisco 2015
 
Semantic Web and Linked Open Data
Semantic Web and Linked Open DataSemantic Web and Linked Open Data
Semantic Web and Linked Open Data
 
Legislative data portals and linked data quality
Legislative data portals and linked data qualityLegislative data portals and linked data quality
Legislative data portals and linked data quality
 

Recently uploaded

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 

Recently uploaded (20)

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 

Moving Library Metadata Toward Linked Data: Opportunities Provided by the eXtensible Catalog

  • 1. Jennifer Bowen, University of Rochester DC-2010 Conference October 20, 2010, Pittsburgh, PA Moving Library Metadata toward Linked Data: Opportunities Provided by the eXtensible Catalog
  • 2. About me… Currently: - Librarian - Technical services administrator - Software development team co-leader Formerly: - Cataloger (MARC) - Standards developer (RDA) Maybe someday… Linked Data Expert?
  • 3. My Topics Today: Is it feasible to turn legacy library MARC metadata into Linked Data in an automated environment, and how can eXtensible Catalog (XC) software play a role in that process? Image source: www.blog.kdl.org
  • 4. Semantic Web and Linked Data. Semantic Web: a set of technologies that allow computers to understand the meaning of information on the web. Linked Data: a mechanism for exposing, sharing and connecting data on the web, using identifiers and relationships.
  • 5. Linked Data “Expectations of Behavior”: – Use URIs as names for things – Use HTTP URIs so that people can look up those names. – When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL) – Include links to other URIs so that they can discover more things. Tim Berners-Lee, “Design issues”, 2006 http://www.w3.org/DesignIssues/LinkedData.html
  • 6. Linked Data: RDF triple. Subject: This presentation. Predicate: has creator. Object: Jennifer Bowen.
  • 7. A Reality Check. “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
  • 8. Teaching MARC metadata new tricks? Image source: http://www.englishcafe.com/node/2337
  • 9. Turning legacy data into Linked Data… How do we even get started?
  • 10. Getting Started. To create Linked Data, we need: – Software to transform legacy data – Analysis: mapping of legacy metadata to Linked Data properties
  • 11. The software… eXtensible Catalog (XC) is open source, user-centered, next generation software for libraries. XC provides a discovery system and a set of tools for libraries to manage metadata and build applications.
  • 12. XC Software Components (diagram labels). User Interface: XC User Interface. Metadata Processing: Metadata Services Toolkit. Connectivity tools: NCIP Toolkit, OAI Toolkit. Connected systems: Website on Drupal CMS, Integrated Library System, Repository.
  • 13. XC’s original metadata goals: - Aggregate MARC and other metadata for use in new applications - Define a FRBR-based metadata schema to support XC’s user-interface functionality - Create a software application to process batches of metadata through a set of services
  • 15. XC and Linked Data. How can XC help move legacy library metadata closer to Linked Data? This was NOT among XC’s original goals. However, XC software creates an opportunity to contribute to this effort and provides important “lessons learned”.
  • 16. Converting MARC to Linked Data. What XC software can do: – Convert MARC codes to vocabulary values – Remove extraneous data – Normalize inconsistencies – Map most MARC fields/subfields and parse to appropriate FRBR Group 1 entity records
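The first two capabilities on slide 16 (turning MARC codes into vocabulary values and removing extraneous data) can be illustrated with a minimal sketch. This is not XC's implementation: the function names are hypothetical, the code table is a small hand-picked subset of the MARC 21 Leader/06 "type of record" values, and the punctuation rule is a simplification of ISBD cleanup.

```python
import re

# A few Leader/06 "type of record" codes from the MARC 21 bibliographic format.
# Illustrative subset only; a real service would load the full code list.
RECORD_TYPE_LABELS = {
    "a": "Language material",
    "c": "Notated music",
    "e": "Cartographic material",
    "j": "Musical sound recording",
}


def record_type_label(leader: str) -> str:
    """Translate the one-character code at Leader position 06 into a readable value."""
    code = leader[6]
    return RECORD_TYPE_LABELS.get(code, "Unknown (" + code + ")")


def strip_isbd_punctuation(value: str) -> str:
    """Drop a trailing ISBD separator (':', '/', ';' or ',') left over from MARC."""
    return re.sub(r"\s*[:/;,]\s*$", "", value).strip()


if __name__ == "__main__":
    print(record_type_label("00000nam a2200000 a 4500"))
    print(strip_isbd_punctuation("E.E. Cummings :"))
```

Trailing periods are deliberately left alone here, because a period may belong to an abbreviation or initial rather than to ISBD punctuation.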
  • 17. Converting MARC to Linked Data. Problematic areas: – Some MARC fields/subfields are difficult to map to appropriate FRBR entities – Tracking relationships between FRBR entity records: How many relationships can we support with XC software?
  • 18. MARC to XC Schema Transformation. Parses MARCXML records into linked FRBR-based records. Maps MARCXML data elements to Linked-Data-compatible elements in the XC Schema.
  • 21. Issue: Managing Multiple Relationships. MARC bibliographic records can refer to multiple FRBR entities of the same type (analytics that represent multiple works/expressions, e.g. tracks on a CD).
  • 22. Issue: Beyond FRBR Group 1 Entities. MARC “Alternate Graphic Representation” (880 fields) can contain data that belong in records for Group 2 and Group 3 entities. Contributor: 700 1 ‡6 880‐08 ‡a Vasil’ev, Maksim. 880 1 ‡6 700‐08 ‡a Васильев, Максим. Subject: 600 10 ‡6 880‐06 ‡a Putin, Vladimir Vladimirovich, ‡d 1952‐ 880 10 ‡6 600‐06 ‡a Путин, Владимир Владимирович, ‡d 1952‐
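The ‡6 linkage shown on slide 22 is what lets software reunite an 880 field with the romanized field it mirrors. The sketch below pairs them by tag and occurrence number. It assumes a hypothetical in-memory representation (each field as a `(tag, subfields)` tuple with one value per subfield code) rather than any particular MARC library, and the function names are illustrative.

```python
def parse_linkage(sub6):
    """Split a subfield 6 value such as '700-08/(N' into (linked tag, occurrence)."""
    # Slides and PDFs often carry U+2010 hyphens; normalize them to ASCII first.
    link = sub6.replace("\u2010", "-").split("/")[0]
    tag, _, occurrence = link.partition("-")
    return tag, occurrence


def pair_880_fields(fields):
    """Return (original field, 880 field) pairs matched through subfield 6."""
    # Index regular fields by (their own tag, occurrence number named in their $6).
    originals = {}
    for tag, subs in fields:
        if tag != "880" and "6" in subs:
            _, occurrence = parse_linkage(subs["6"])
            originals[(tag, occurrence)] = (tag, subs)

    pairs = []
    for tag, subs in fields:
        if tag == "880" and "6" in subs:
            key = parse_linkage(subs["6"])  # (tag of the linked field, occurrence)
            if key in originals:
                pairs.append((originals[key], (tag, subs)))
    return pairs


if __name__ == "__main__":
    record = [
        ("700", {"6": "880-08", "a": "Vasil'ev, Maksim."}),
        ("880", {"6": "700-08", "a": "Васильев, Максим."}),
        ("600", {"6": "880-06", "a": "Putin, Vladimir Vladimirovich,", "d": "1952-"}),
        ("880", {"6": "600-06", "a": "Путин, Владимир Владимирович,", "d": "1952-"}),
    ]
    for original, alternate in pair_880_fields(record):
        print(original[0], original[1]["a"], "<->", alternate[1]["a"])
```

Pairing is the easy part; the issue the slide raises is that the paired value then belongs in a Group 2 or Group 3 entity record, not in the Group 1 records the transformation produces.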
  • 23. If we were to parse this 880 data correctly: Alternative script of name from 880; Alternative script of subject from 880
  • 24. Issue: Related Group 1 Entities. Language attribute for a related expression: 041 1 ‡a eng ‡h ita 100 0 ‡a Dante Alighieri, ‡d 1265‐1321. 240 10 ‡a Divina commedia. ‡l English 245 14 ‡a The divine comedy / ‡c Dante ; a new verse translation by C.H. Sisson. 500 ‡a Translation of: Divina commedia.
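In the 041 field on slide 24, ‡a carries the language of the text in hand and ‡h the language of the original, with first indicator 1 marking a translation. A minimal sketch of separating the two follows. The function name and the returned dictionary keys are hypothetical, and subfields are assumed to arrive as a list of `(code, value)` pairs.

```python
def expression_languages(first_indicator, subfields):
    """Separate the language of this expression from that of the original.

    subfields is a list of (code, value) pairs from one MARC 041 field.
    """
    return {
        # First indicator 1 means the item is or includes a translation.
        "is_translation": first_indicator == "1",
        # $a: language code(s) of the text described by this record.
        "expression_language": [v for c, v in subfields if c == "a"],
        # $h: language code(s) of the original, i.e. a *related* expression.
        "original_language": [v for c, v in subfields if c == "h"],
    }


if __name__ == "__main__":
    print(expression_languages("1", [("a", "eng"), ("h", "ita")]))
```

The sketch only separates the values. Deciding which record should hold `original_language` (it describes a different expression from the one the MARC record represents) is the open problem the slide points to.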
  • 25. If we were to parse 041 ‡h data… Alternative script of name from 880; Original language from 041 ‡h; Alternative script of subject from 880
  • 26. Managing Relationships Between Entities. Original language from 041 ‡h; Alternative script of subject from 880; Alternative script of name from 880
  • 27. What we are learning from XC: Maintaining links between separate FRBR entity records in a production environment (new records, changed records, deleted records, changed relationships) monopolizes system resources and may not be scalable.
  • 28. But wait… What does this emphasis on FRBR have to do with Linked Data? If we can map a MARC data element to a FRBR entity, we can probably convert it to Linked Data. (Diagram: FRBR Group 1 Entities)
  • 29. But do we have to? - Do we have to be able to map MARC elements to a FRBR entity in order to create Linked Data? - Would managing RDF triples be more scalable than managing FRBR-based records and the relationships between those records?
  • 30. Best Practices for Linked Data: - Unique identifiers for XC metadata records - Data elements from registered schemas - Registered vocabularies. By attempting to follow best practices in XC for Linked Data, we hope to facilitate eventual output of XC metadata in RDF.
  • 31. RDF Triple. Subject: This resource. Predicate: has subject. Object: Poets, American. URIs for each?
  • 32. RDF Triple – Record identifiers. Subject: This resource (oai:mst.rochester.edu:MST/MARCToXCTransformation/10081). Predicate: has subject. Object: Poets, American.
  • 33. Identifiers for XC Schema records: a persistent, globally unique identifier for each XC Schema record. <?xml version="1.0" encoding="UTF-8"?> <xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:rdvocab="http://rdvocab.info/Elements" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:rdarole="http://rdvocab.info/roles"> <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081"> <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject> <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject> <dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject> <rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author> <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork> <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject> <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject> </xc:entity> </xc:frbr>
  • 34. RDF Triple - Registered Data Elements. Subject: This resource (oai:mst.rochester.edu:MST/MARCToXCTransformation/10081). Predicate: has subject (http://www.extensiblecatalog.info/Elements/subject). Object: Poets, American.
  • 37. XC
  • 38. XC Schema “work” record: data elements from registered namespaces for DC terms, RDA roles and vocab, and XC. <?xml version="1.0" encoding="UTF-8"?> <xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:rdvocab="http://rdvocab.info/Elements" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:rdarole="http://rdvocab.info/roles"> <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081"> <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject> <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject> <dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject> <rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author> <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork> <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject> <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject> </xc:entity> </xc:frbr>
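Because every element in the work record comes from a registered namespace, each child element can be read as one subject-predicate-object statement about the record's identifier. The sketch below walks a trimmed copy of the record with Python's standard `xml.etree.ElementTree`. It is an illustration, not XC's RDF output: the function names are hypothetical, and joining namespace and element name with "/" is an assumption chosen so that the result matches the predicate URI shown on slide 34.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the work record from the slide (three elements kept).
XC_RECORD = """<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:rdvocab="http://rdvocab.info/Elements"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
    <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American</xc:subject>
  </xc:entity>
</xc:frbr>"""


def predicate_uri(qualified_tag):
    """Turn ElementTree's '{namespace}local' tag into a single URI string."""
    namespace, local = qualified_tag[1:].split("}")
    if not namespace.endswith(("/", "#")):
        namespace += "/"  # assumption: element URIs are namespace + "/" + name
    return namespace + local


def entity_triples(xml_text):
    """Yield one (subject, predicate, object) tuple per element of each entity."""
    root = ET.fromstring(xml_text)
    triples = []
    for entity in root:
        subject = entity.get("id")
        for element in entity:
            value = (element.text or "").strip()
            triples.append((subject, predicate_uri(element.tag), value))
    return triples


if __name__ == "__main__":
    for triple in entity_triples(XC_RECORD):
        print(triple)
```

The objects here are still plain strings; replacing them with URIs from registered vocabularies is a separate step.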
  • 39. RDF Triple - Registered Vocabularies. Subject: This resource (oai:mst.rochester.edu:MST/MARCToXCTransformation/10081). Predicate: has subject (http://www.extensiblecatalog.info/Elements/subject). Object: Poets, American (http://id.loc.gov/authorities/sh85103735#concept).
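With identifiers in all three positions, the statement on slide 39 can be written down in a standard RDF serialization. The sketch below formats it as an N-Triples line, once with the text heading as a literal and once with the id.loc.gov URI as the object. The helper name `n_triple` is hypothetical, only backslashes and double quotes are escaped in literals, and whether XC would emit N-Triples at all is an assumption.

```python
def n_triple(subject, predicate, obj, obj_is_uri=True):
    """Format one statement as an N-Triples line."""
    if obj_is_uri:
        obj_term = "<" + obj + ">"
    else:
        # Minimal literal escaping: backslash and double quote only.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = '"' + escaped + '"'
    return "<" + subject + "> <" + predicate + "> " + obj_term + " ."


RECORD_ID = "oai:mst.rochester.edu:MST/MARCToXCTransformation/10081"
HAS_SUBJECT = "http://www.extensiblecatalog.info/Elements/subject"

if __name__ == "__main__":
    # Object as a text string: readable, but nothing to link to.
    print(n_triple(RECORD_ID, HAS_SUBJECT, "Poets, American", obj_is_uri=False))
    # Object as a registered vocabulary URI: other data can link to the same concept.
    print(n_triple(RECORD_ID, HAS_SUBJECT,
                   "http://id.loc.gov/authorities/sh85103735#concept"))
```

The second form is the one that satisfies the "include links to other URIs" expectation from slide 5.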
  • 40. XC Work record with embedded URI for LCSH “Poets, American”. <?xml version="1.0" encoding="UTF-8"?> <xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements" … xmlns:subjid="id.loc.gov/authorities"> <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081"> … <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject> <xc:subject xsi:type="dcterms:LCSH" subjid="sh85103735#concept">Poets, American</xc:subject> <xc:temporal>20th century</xc:temporal> <xc:type>Biography</xc:type> </xc:entity>
  • 42. Experimenting with Linked Data - Within a MARC or MARCXML environment? - Possible to give each record a URI - MARC elements themselves don’t have URIs - How to embed multiple URIs for registered vocabularies in MARC? - XC enables experimentation outside of a MARC environment with data that originated as MARC
  • 43. Making Linked Data a Priority for XC – Balancing goals – Time/funding constraints – What’s our use case? – Output of Linked Data from XC vs. using Linked Data within XC?
  • 44. XC Linked Data Accomplishments. XC has set the stage for Linked Data by: - Providing a platform for creating Linked Data using XC software - Ensuring that XC Schema records can be converted to RDF triples as easily as possible - Enabling others to build upon what we have accomplished so far.
  • 45. Next Steps - Monitor RDA implementations - Develop XC authority control service - Enable RDF output of XC Schema metadata - Encourage libraries to use XC software and contribute to the XC user community - Seek funding for additional software development