Wrangling Metadata from
HathiTrust and PubMed:
Providing Full-text Linking to The Cornell Veterinarian
Steven Folsom, NASIG Annual Conference 2014
Photo credit: http://www.walls.com/
Cornell Library Digital Consulting
and Production Services
 A single point of service for those wishing to create digital
collections
 A virtual group that spans multiple departments within
the Library (Digital Scholarship and Preservation Services,
Cornell Library IT and Metadata Librarians from Library
Technical Services)
 Approaches digital collection building holistically, and
addresses the entire life cycle management of a project
The Cornell Veterinarian Project
Participants
Client:
 Cornell Flower-Sprecher Veterinary Library
DCAPS Involvement:
 Jaron Porciello, Digital Scholarship Initiatives Coordinator
 Michelle Paolillo, Project Manager/Business Analyst (CUL’s
HathiTrust Liaison)
 John Cline, Cornell Library Programmer
 Steven Folsom, Metadata Librarian
HathiTrust Digital Library
 Digital Library consisting of the Google Books project,
Internet Archive digitization initiatives, and content
digitized locally by libraries
 Committed to preserving content with stable access and
distributed/coordinated cost of storage
 Centralized technical framework that allows for the
creation of tools and services
The Cornell Veterinarian
The Challenge
Hathi Volume Interface
Google Books:
Contributions from Cornell Library
 Participation in the Google Books Library Project since
2008
 Google focuses on materials that they have not already
digitized
 Using OCLC holdings information, they compose a
Cornell candidate list
HathiTrust Data API
Hathi METS File
METS File Continued
Hathifiles
 Tab-delimited full files of the Hathi Digital Library and
incremental updates (Full file is currently over 2.5 GB
uncompressed)
 Light Bibliographic data
 Includes some administrative metadata, e.g. rights
information, the originating institution for the scanned
copy
Select Hathifile Record Elements
Hathi Volume ID: mdp.39015076694507
Access: allow [Notes on mapping for rights attributes where
contextual user data would affect access]
Rights: pd [public domain]
HathiTrust record number: 000529434
Enumeration/Chronology: v.33 no.11 1900
Source: MIU
Title: The Chicago medical times
OCLC number: 1554176
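Because the Hathifile is plain tab-delimited text, records like the one above can be filtered with a few lines of Python. The following is only a minimal sketch, not the project's code: `COLUMNS` is an assumed layout containing just the elements shown on this slide in slide order (the real file has more columns, so the list should be adjusted to the published field list), and `volumes_for_record` is a hypothetical helper name.

```python
import csv
import io

# Assumed column layout: only the elements shown on the slide, in slide order.
# The real Hathifile carries more columns; adjust this list to the published spec.
COLUMNS = ["htid", "access", "rights", "record_number",
           "enum_chron", "source", "title", "oclc"]

def volumes_for_record(lines, record_number, columns=COLUMNS):
    """Yield one dict per Hathifile row that belongs to the given catalog record."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        record = dict(zip(columns, row))
        if record.get("record_number") == record_number:
            yield record

# Sample row built from the record elements on the slide.
sample = io.StringIO(
    "mdp.39015076694507\tallow\tpd\t000529434\t"
    "v.33 no.11 1900\tMIU\tThe Chicago medical times\t1554176\n"
)
matches = list(volumes_for_record(sample, "000529434"))
print(matches[0]["htid"], matches[0]["enum_chron"])
```

Rows are read one at a time, so the same function could be pointed at an open file handle for the full multi-gigabyte file without loading it into memory.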
HathiTrust Bibliographic API
 Meant for retrieving information about small numbers of
items at a time
 Returns bibliographic, rights, and volume information when
given one or more standard identifiers (ISBN, LCCN,
OCLC, etc.); overlaps with the Hathifile data
 Brief example:
http://catalog.hathitrust.org/api/volumes/brief/oclc/424023.json
 Full example:
http://catalog.hathitrust.org/api/volumes/full/oclc/424023.json
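The two example URLs follow the same pattern, so a request for any single identifier can be assembled programmatically. This is a hedged sketch: the URL pattern is taken from the examples above, while `bib_api_url` and `fetch_volumes` are hypothetical helper names, and nothing is assumed here about the shape of the JSON that comes back.

```python
import json
from urllib.request import urlopen

BASE = "http://catalog.hathitrust.org/api/volumes"  # base URL from the slide examples

def bib_api_url(id_type, value, level="brief"):
    """Build a Bibliographic API URL for one identifier (e.g. oclc, isbn, lccn)."""
    if level not in ("brief", "full"):
        raise ValueError("level must be 'brief' or 'full'")
    return f"{BASE}/{level}/{id_type}/{value}.json"

def fetch_volumes(id_type, value, level="brief"):
    """Fetch and decode the JSON response (requires network access)."""
    with urlopen(bib_api_url(id_type, value, level)) as resp:
        return json.load(resp)

print(bib_api_url("oclc", "424023"))
print(bib_api_url("oclc", "424023", "full"))
```

Only the URL builder runs here; `fetch_volumes` is left uncalled so the sketch works offline.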
Hathi Metadata Recap
• Administrative data about scans and corresponding volumes
• Uses Hathi IDs to link to bibliographic data
• Bulk bibliographic data
• Some administrative data, e.g. rights information
• Small requests for bibliographic data retrieved using standard identifiers (ISBN, LCCN, OCLC…)
What we thought was the solution….
 Use Hathi Data API to find Table of Contents for each
Volume
 Gather the related OCR
 Parse out article citation values from the OCR (Hopefully in a
mostly automated way)
 Use the pagination data from TOC to build links by mapping
to pagination in the METS files.
 What couldn’t be automated would be done manually
(with the projected outcome being a citation index with Hathi
URLs that could be used to build an interface or given to an
index like PubMed)
Reality set in…
Photo credit: ehive.com
HathiTrust OCR
The metadata continued to fight back…
Photo credit: http://glpiggy.net/
PubMed Indexing and API
A Path for Automation
For each citation already in PubMed for which HathiTrust has one
volume:
1. Search the Hathifile for the PubMed <Volume> AND the Hathi catalog
ID (000535347) for The Cornell Veterinarian to get the
corresponding Hathi object ID used in the METS
2. Use the METS object ID AND the PubMed start page (the numeric
value before the '-' in each PubMed article citation) to find the
matching <ORDERLABEL> and get the <Order> number from the METS file
3. Create the URL to be added to the PubMed XML. The Hathi METS
object ID and <Order> number are used to create the URL: the id in
the URL equals the METS ID, and the sequence number equals the
<Order> number, e.g.
http://babel.hathitrust.org/cgi/pt?id=coo.31924051143075;view=1up;seq=11
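Steps 2 and 3 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the project's actual script: it assumes ORDER and ORDERLABEL appear as attributes on METS `div` elements, the METS fragment is a made-up sample, and the function names are hypothetical. Only the object ID and the URL pattern come from the slide.

```python
import xml.etree.ElementTree as ET

METS_NS = "{http://www.loc.gov/METS/}"  # standard METS namespace

def start_page(pagination):
    """Return the value before the '-' in a PubMed page range such as '1-12'."""
    return pagination.split("-", 1)[0].strip()

def order_for_page(mets_xml, page_label):
    """Return the ORDER of the first div whose ORDERLABEL matches, else None."""
    root = ET.fromstring(mets_xml)
    for div in root.iter(METS_NS + "div"):
        if div.get("ORDERLABEL") == page_label and div.get("ORDER"):
            return int(div.get("ORDER"))
    return None

def hathi_page_url(object_id, seq):
    """Build the page-level URL using the pattern shown on the slide."""
    return f"http://babel.hathitrust.org/cgi/pt?id={object_id};view=1up;seq={seq}"

# Hypothetical METS fragment: printed page 1 sits at scan sequence 11.
SAMPLE_METS = """<METS:mets xmlns:METS="http://www.loc.gov/METS/">
  <METS:structMap TYPE="physical">
    <METS:div TYPE="volume">
      <METS:div ORDER="10" TYPE="page"/>
      <METS:div ORDER="11" ORDERLABEL="1" TYPE="page"/>
      <METS:div ORDER="12" ORDERLABEL="2" TYPE="page"/>
    </METS:div>
  </METS:structMap>
</METS:mets>"""

seq = order_for_page(SAMPLE_METS, start_page("1-12"))
print(hathi_page_url("coo.31924051143075", seq))
```

Taking the first matching label is a simplification: a volume in which the same printed page number occurs more than once would need an extra check before the link is trusted.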
NCBI’s LinkOut Program
 A service that allows third parties to link specific NCBI
database records to relevant web-accessible resources
 The relevant journal/publication must already have gone
through the Medline selection process
 Document Type Definition (DTD) for contributing links in XML
PubMed Citation Data Requirements
 PubMed DTD specifies how the data should be
formatted
 Data Tags (R = Required, O = Optional, O/R = Optional or
Required). Required tags must be included; optional tags
must be included only if the data requested appears in
the print or electronic article. Optional or Required tags
are dependent on the use of other tags
 Tag names are case sensitive
PubMed Citation Data Elements
File Header (R)
ArticleSet (R)
Article (R)
Journal (R)
PublisherName (R)
JournalTitle (R)
Issn (R)
Volume (O/R)
Issue (O/R)
PubDate (R)
Year (R)
Month (O/R)
Season (O)
Day (O)
Replaces (O)
ArticleTitle (O)
VernacularTitle (O)
FirstPage (O/R)
LastPage (O)
ELocationID (O/R)
Language (O)
AuthorList (O/R)
Author (R)
FirstName (O/R)
MiddleName (O)
LastName (O/R)
Suffix (O)
CollectiveName (O)
Affiliation (O)
Identifier (O)
GroupList (O/R)
Group (R)
GroupName (R)
IndividualName (O)
PublicationType (O)
ArticleIdList (O/R)
ArticleId (R)
History (O)
Abstract (O)
OtherAbstract (O)
CopyrightInformation (O)
ObjectList (O)
Object (O)
Param (O)
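To make the element list more concrete, here is a rough Python sketch that assembles a skeletal citation from some of the tags above. The nesting (Journal and PubDate as containers) is an assumption based on the order of the list and should be checked against the DTD, all values passed in are placeholders, and `article_xml` is a hypothetical helper. The output omits the file header, so it is not a complete submission.

```python
import xml.etree.ElementTree as ET

def article_xml(publisher, journal_title, issn, volume, year, title, first_page):
    """Build a skeletal ArticleSet; tag names are case sensitive."""
    article_set = ET.Element("ArticleSet")
    article = ET.SubElement(article_set, "Article")
    # Assumed nesting: journal-level tags sit inside <Journal>.
    journal = ET.SubElement(article, "Journal")
    ET.SubElement(journal, "PublisherName").text = publisher
    ET.SubElement(journal, "JournalTitle").text = journal_title
    ET.SubElement(journal, "Issn").text = issn
    ET.SubElement(journal, "Volume").text = volume
    pub_date = ET.SubElement(journal, "PubDate")
    ET.SubElement(pub_date, "Year").text = year
    # Article-level tags follow the Journal block.
    ET.SubElement(article, "ArticleTitle").text = title
    ET.SubElement(article, "FirstPage").text = first_page
    return ET.tostring(article_set, encoding="unicode")

# Placeholder values only; none of these describe a real article.
xml_text = article_xml("Example Publisher", "The Cornell Veterinarian",
                       "0000-0000", "1", "1950", "Example article title", "1")
print(xml_text)
```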
In an Ideal World…
Photo credit: http://www.priefert.com/
The metadata that got away…
 Pre-1945 issues not indexed by PubMed
 Supplemental volumes*
What we hope to do about it:
 Manually capture the Hathi URLs for the supplemental
volumes and provide them to PubMed using their linking
format
 Manually capture citation data for pre-1945 articles using the
OCR files, and send to PubMed using their indexing format.
Project Outcomes
Soft:
 Better understanding of what’s possible with Hathi APIs
 Better understanding of PubMed’s metadata/URL contribution
requirements
 Increased desire within the Cornell Library to consider greater return on our
HathiTrust investment
Concrete:
 The Cornell Veterinarian should soon be available via PubMed for the years
already indexed
 Manually capturing the complete backfile for The Cornell Veterinarian to
contribute to PubMed
Future Considerations
 Potential for improved access to other titles currently
lacking full-text linking in PubMed [if in HathiTrust]
 Investigations into other (non-)full-text indexes and full-text
repositories
 New Services for interacting with HathiTrust Digital Library
 Potential improvements to the Hathi workflows.
Questions?
Photo credit: ehive.com

Transaction Management in Database Management SystemChristalin Nelson
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSMae Pangan
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmStan Meyer
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataBabyAnnMotar
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfPatidar M
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 

Kürzlich hochgeladen (20)

Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHS
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and Film
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 

Wrangling Metadata from HathiTrust and PubMed to Provide Full-text Linking to The Cornell Veterinarian

  • 1. Wrangling Metadata from HathiTrust and PubMed: Providing Full-text Linking to The Cornell Veterinarian Photo credit: http://www.walls.com/ Steven Folsom, NASIG Annual Conference 2014
  • 2. Cornell Library Digital Consulting and Production Services (DCAPS)
      - A single point of service for those wishing to create digital collections
      - A virtual group that spans multiple departments within the Library (Digital Scholarship and Preservation Services, Cornell Library IT, and Metadata Librarians from Library Technical Services)
      - Approaches digital collection building holistically and addresses the entire life cycle management of a project
  • 3. The Cornell Veterinarian Project Participants
      Client:
      - Cornell Flower-Sprecher Veterinary Library
      DCAPS Involvement:
      - Jaron Porciello, Digital Scholarship Initiatives Coordinator
      - Michelle Paolillo, Project Manager/Business Analyst (CUL’s HathiTrust Liaison)
      - John Cline, Cornell Library Programmer
      - Steven Folsom, Metadata Librarian
  • 4. HathiTrust Digital Library
      - Digital library consisting of content from the Google Books project, Internet Archive digitization initiatives, and content digitized locally by libraries
      - Committed to preserving content with stable access and distributed/coordinated cost of storage
      - Centralized technical framework that allows for the creation of tools and services
  • 5. The Cornell Veterinarian
  • 6. The Challenge
  • 7. Hathi Volume Interface
  • 8. Google Books: Contributions from Cornell Library
      - Participation in the Google Books Library Project since 2008
      - Google focuses on materials that they have not already digitized
      - Using OCLC holdings information, they compose a Cornell candidate list
  • 9. HathiTrust Data API
  • 10. Hathi METS File
  • 11. METS File Continued
  • 12. Hathifiles
      - Tab-delimited full files of the Hathi Digital Library and incremental updates (the full file is currently over 2.5 GB uncompressed)
      - Light bibliographic data
      - Includes some administrative metadata, e.g. rights information and the originating institution for the scanned copy
  • 13. Select Hathifile Record Elements
      - Hathi Volume ID: mdp.39015076694507
      - Access: allow [Notes on mapping for rights attributes where contextual user data would affect access]
      - Rights: pd [public domain]
      - HathiTrust record number: 000529434
      - Enumeration/Chronology: v.33 no.11 1900
      - Source: MIU
      - Title: The Chicago medical times
      - OCLC number: 1554176
  • 14. HathiTrust Bibliographic API
      - Meant for retrieving information about small numbers of items at a time
      - Returns bibliographic, rights, and volume information when given one or more standard identifiers (ISBN, LCCN, OCLC, etc.); overlaps with the Hathifile data
      - Brief example: http://catalog.hathitrust.org/api/volumes/brief/oclc/424023.json
      - Full example: http://catalog.hathitrust.org/api/volumes/full/oclc/424023.json
  • 15. Hathi Metadata Recap
      - Administrative data about scans and corresponding volumes
      - Uses Hathi IDs to link to bibliographic data
      - Bulk bibliographic data
      - Some administrative data, e.g. rights information
      - Small requests for bibliographic data retrieved using standard identifiers (ISBN, LCCN, OCLC…)
  • 16. What we thought was the solution…
      - Use the Hathi Data API to find the Table of Contents for each volume
      - Gather the related OCR
      - Parse out article citation values from the OCR (hopefully in a mostly automated way)
      - Use the pagination data from the TOC to build links by mapping to pagination in the METS files
      - What couldn’t be automated would be done manually (with the projected outcome being a citation index with Hathi URLs that could be used to build an interface or given to an index like PubMed)
  • 17. Reality set in… Photo credit: ehive.com
  • 18. HathiTrust OCR
  • 19. The metadata continued to fight back… Photo credit: http://glpiggy.net/
  • 20. PubMed Indexing and API
  • 21. A Path for Automation
      For each citation already in PubMed for which HathiTrust has one volume:
      1. Search the PubMed <Volume> AND the Hathi Catalog id (000535347) for The Cornell Veterinarian against the Hathifile to get the corresponding Hathi object id from the METS.
      2. Use the METS object id AND the PubMed start page (the numeric value before the '-' in each PubMed article citation) to find the <ORDERLABEL> and get the <Order> number from the METS file.
      3. Create the URL to be added to the PubMed XML. The Hathi METS object id and <Order> number are used to create the URL: the id in the URL equals the METS id, and the sequence number in the URL equals the <Order> number, e.g. http://babel.hathitrust.org/cgi/pt?id=coo.31924051143075;view=1up;seq=11
  • 22. NCBI’s LinkOut Program
      - A service that allows third parties to link specific NCBI database records to relevant web-accessible resources
      - The relevant journal/publication must already have gone through the Medline selection process
      - Document Type Definition (DTD) for contributing links in XML
  • 23. PubMed Citation Data Requirements
      - The PubMed DTD specifies how the data should be formatted
      - Data tags (R = Required, O = Optional, O/R = Optional or Required): required tags must be included; optional tags must be included only if the data requested appears in the print or electronic article; Optional or Required tags are dependent on the use of other tags
      - Tag names are case sensitive
  • 24. PubMed Citation Data Elements
      File Header (R), ArticleSet (R), Article (R), Journal (R), PublisherName (R), JournalTitle (R), Issn (R), Volume (O/R), Issue (O/R), PubDate (R), Year (R), Month (O/R), Season (O), Day (O), Replaces (O), ArticleTitle (O), VernacularTitle (O), FirstPage (O/R), LastPage (O), ELocationID (O/R), Language (O), AuthorList (O/R), Author (R), FirstName (O/R), MiddleName (O), LastName (O/R), Suffix (O), CollectiveName (O), Affiliation (O), Identifier (O), GroupList (O/R), Group (R), GroupName (R), IndividualName (O), PublicationType (O), ArticleIdList (O/R), ArticleId (R), History (O), Abstract (O), OtherAbstract (O), CopyrightInformation (O), ObjectList (O), Object (O), Param (O)
  • 25. In an Ideal World… Photo credit: http://www.priefert.com/
  • 26. The metadata that got away…
      - Pre-1945 issues not indexed by PubMed
      - Supplemental volumes*
      What we hope to do about it:
      - Manually capture the Hathi URLs for the supplemental volumes and provide them to PubMed using their linking format
      - Manually capture citation data for pre-1945 articles using the OCR files, and send it to PubMed using their indexing format
  • 27. Project Outcomes
      Soft:
      - Better understanding of what’s possible with the Hathi APIs
      - Better understanding of PubMed’s metadata/URL contribution requirements
      - Increased desire within the Cornell Library to consider greater return on our HathiTrust investment
      Concrete:
      - The Cornell Veterinarian should soon be available via PubMed for the years already indexed
      - Manually capturing the complete backfile for The Cornell Veterinarian to contribute to PubMed
  • 28. Future Considerations
      - Potential for improved access to other titles currently lacking full-text linking in PubMed [if in HathiTrust]
      - Investigations into other (non)full-text indexes and full-text repositories
      - New services for interacting with the HathiTrust Digital Library
      - Potential improvements to the Hathi workflows
  • 29. Questions? Photo credit: ehive.com
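
The automation path outlined in slide 21 can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the function names, the reduced Hathifile column layout, the sample row (whose enumeration value is invented), and the trimmed METS fragment are all assumptions. It also assumes that the page label and sequence number appear as ORDERLABEL and ORDER attributes on METS div elements, and the page-to-sequence mapping in the sample is made up for the example.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

METS_NS = {"METS": "http://www.loc.gov/METS/"}

# Assumed column subset and order; real Hathifiles carry more columns.
HATHIFILE_COLUMNS = ["htid", "access", "rights", "record_number", "enum_chron"]

# Illustrative row: both ids come from slide 21, the enumeration is invented.
SAMPLE_HATHIFILE = "coo.31924051143075\tallow\tpd\t000535347\tv.1\n"

# Heavily trimmed, invented METS fragment: the page labelled "1" is scan 11.
SAMPLE_METS = """<METS:mets xmlns:METS="http://www.loc.gov/METS/">
  <METS:structMap TYPE="physical">
    <METS:div TYPE="volume">
      <METS:div ORDER="10" TYPE="page"/>
      <METS:div ORDER="11" ORDERLABEL="1" TYPE="page"/>
      <METS:div ORDER="12" ORDERLABEL="2" TYPE="page"/>
    </METS:div>
  </METS:structMap>
</METS:mets>"""


def find_volume_id(hathifile_text, record_number, volume):
    """Step 1: match the catalog record number and volume against Hathifile rows."""
    reader = csv.DictReader(
        io.StringIO(hathifile_text),
        fieldnames=HATHIFILE_COLUMNS,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
    )
    for row in reader:
        match = re.search(r"v\.(\d+)", row["enum_chron"] or "")
        if row["record_number"] == record_number and match and match.group(1) == volume:
            return row["htid"]
    return None


def start_page(pagination):
    """Return the value before the '-' in a PubMed pagination string."""
    return pagination.split("-", 1)[0].strip()


def find_order(mets_xml, page_label):
    """Step 2: find the ORDER number of the div whose ORDERLABEL is the start page."""
    root = ET.fromstring(mets_xml)
    for div in root.iterfind(".//METS:div[@ORDERLABEL]", METS_NS):
        if div.get("ORDERLABEL") == page_label:
            return int(div.get("ORDER"))
    return None


def hathi_page_url(volume_id, order):
    """Step 3: build the page-level URL in the pattern shown on slide 21."""
    return f"http://babel.hathitrust.org/cgi/pt?id={volume_id};view=1up;seq={order}"


if __name__ == "__main__":
    volume_id = find_volume_id(SAMPLE_HATHIFILE, "000535347", "1")
    order = find_order(SAMPLE_METS, start_page("1-12"))
    print(hathi_page_url(volume_id, order))
```

In a real run the Hathifile text and the METS document would come from the downloaded Hathifile and the HathiTrust Data API rather than from inline samples, and a missing ORDERLABEL (`find_order` returning `None`) would flag a citation for manual handling.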