1. | 1Open Access
Anita de Waard, VP Research Data Collaborations
Elsevier RDM Services
a.dewaard@elsevier.com
Persistent Identifiers Support
Data Publishing - IN41D-06
American Geophysical Union, December 15, 2016
PIDs in Earth Science:
Authors uploaded data to PANGAEA and submitted the article for publication in the journal Marine Geology.
A data visualization tool connects articles and data: it pulls in the data from PANGAEA for this article and shows it to the reader.
PIDs in Chemistry:
• Chemical compound names with PubChem CID codes are provided by the authors
• Chemical structure image
• A short summary overview for each compound
• Links to the PubChem Compound Database
• >100 participating journals
• In collaboration with NCBI
See: http://www.elsevier.com/PubChem
http://dx.doi.org/10.1016/j.jconrel.2013.03.037
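As a rough illustration of the chemistry linking on this slide, the sketch below turns author-supplied compound names and PubChem CIDs into links to PubChem compound pages. The `CompoundMention` class and its field names are hypothetical and are not taken from the journal workflow; only the public PubChem compound URL pattern is assumed.

```python
from dataclasses import dataclass

# Public landing-page pattern for a PubChem compound record.
PUBCHEM_COMPOUND_URL = "https://pubchem.ncbi.nlm.nih.gov/compound/{cid}"


@dataclass(frozen=True)
class CompoundMention:
    """Hypothetical record: a compound name plus the CID the author supplied."""
    name: str
    cid: int

    def url(self) -> str:
        # CIDs are positive integers; reject anything else early.
        if self.cid <= 0:
            raise ValueError(f"invalid PubChem CID: {self.cid}")
        return PUBCHEM_COMPOUND_URL.format(cid=self.cid)


# Two well-known compounds and their PubChem CIDs.
mentions = [CompoundMention("aspirin", 2244), CompoundMention("caffeine", 2519)]
links = {m.name: m.url() for m in mentions}

for name, link in links.items():
    print(f"{name}: {link}")
```

Keeping the CID as the stored value, and deriving the URL from it, means the link can be regenerated if the landing-page pattern ever changes.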
PIDs for Data and Software Citation:
• Implementation of Force11 Data Citation Principles
• Systematic way to link articles and data using persistent IDs
• Mechanism to give credit & attribution for data
• Announcement November 30: implemented in production workflow
http://www.sciencedirect.com/science/article/pii/S0048969713006657
Data set cited in reference list – treated on similar footing with articles
3. Unique Identification: A software citation should include a method for identification that is machine actionable, globally unique, interoperable, and recognized by at least a community of the corresponding domain experts, and preferably by general public researchers.
4. Persistence: Unique identifiers and metadata describing the software and its disposition should persist – even beyond the lifespan of the software they describe.
https://www.force11.org/software-citation-principles
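To make "machine actionable" concrete, here is a minimal sketch that normalizes the different ways a DOI tends to be written and turns it into a resolvable `https://doi.org/` URL. The function names and the syntax check are assumptions for illustration: the regular expression is a loose heuristic, not the official DOI grammar, and nothing here checks that the DOI is actually registered.

```python
import re

# Loose heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is prefixed in reference lists and links.
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Strip known prefixes and return the bare DOI in lower case."""
    value = raw.strip()
    lowered = value.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            value = value[len(prefix):]
            break
    if not DOI_PATTERN.match(value):
        raise ValueError(f"does not look like a DOI: {raw!r}")
    # DOI names are case-insensitive, so lower-casing is safe for comparison.
    return value.lower()


def doi_url(raw: str) -> str:
    """Build a resolver URL from any of the accepted spellings."""
    return "https://doi.org/" + normalize_doi(raw)


print(doi_url("http://dx.doi.org/10.1016/j.jconrel.2013.03.037"))
```

Normalizing before comparison lets the same article or dataset be matched whether it was cited as a bare DOI, a `doi:` string, or an old `dx.doi.org` link.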
Linked Data Networks for PIDs:
• ICSU-WDS/RDA Publishing Data Service Working group for connecting papers,
• Cross-stakeholder – with input from CrossRef, DataCite, OpenAIRE, Europe PubMed Central, ANDS, PANGAEA, Thomson Reuters, Elsevier, and others
• Proposed long-term architecture and interoperability framework: www.scholix.org
• Operational prototype at http://dliservice.research-infrastructures.eu/#/api (including 1.4 million links from various sources)
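The idea of a linked network of PIDs on this slide can be sketched as a small index of article-to-dataset links collected from several providers. The `Link` record, its field names, and the example identifiers below are hypothetical; this is not the Scholix schema or the prototype's API, only an illustration of merging links that share persistent IDs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical link record between two persistent identifiers."""
    source_pid: str  # e.g. the DOI of an article
    target_pid: str  # e.g. the DOI of a dataset
    relation: str    # e.g. "references" or "is_supplemented_by"
    provider: str    # who reported the link


def index_links(links):
    """Group links by source PID, merging duplicates reported by several providers."""
    index = defaultdict(set)
    for link in links:
        index[link.source_pid].add((link.relation, link.target_pid))
    return index


# Made-up identifiers, for illustration only.
links = [
    Link("10.1234/article-1", "10.5678/dataset-a", "references", "publisher"),
    Link("10.1234/article-1", "10.5678/dataset-a", "references", "repository"),
    Link("10.1234/article-1", "10.5678/dataset-b", "is_supplemented_by", "repository"),
]

index = index_links(links)
for relation, target in sorted(index["10.1234/article-1"]):
    print(relation, target)
```

Because both ends of every link are persistent IDs, the same link reported by a publisher and by a repository collapses into one entry.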
In Summary: PIDs Are The Backbone of Publishing:
Every paper we publish has a DOI (= PID).
Every dataset we publish has a DOI (= PID).
ORCID enables every researcher to have a PID.
We are adding PIDs for Funding information, Licensing, Versioning.
Knowledge components are identified through DOIs, e.g.:
• Chemicals
• Physical Samples
• Resource Identifiers
• Organisms
• Materials
We are helping to create Open Linked Data networks of PIDs
BUT: Many issues to Resolve!
• Persistence: for how long? Who maintains? What if CrossRef goes out of business?
• Identifier: Already hard to define: what is ‘a piece’ of data? ‘A piece’ of software? Versioning, provenance, granularity, reuse?
• If these things concern you: join the PID-Party! http://pidapalooza.org/
IUPAC recommends which term to use for a given property, but the vocabulary is not very accessible or usable, and so it is not universally implemented. Each site decides how it wants to label a given property, which hinders indexing and reuse of the data across silos. Structured capture of information in an ELN such as Hivebench lets the researcher report data using a consistent vocabulary without extra effort.
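A minimal sketch of what a consistent vocabulary buys: site-specific property labels are mapped onto one canonical term so that records from different silos can be indexed together. The terms and synonyms below are made up for illustration and are not taken from IUPAC recommendations or from Hivebench.

```python
# Hypothetical controlled vocabulary: canonical term -> labels seen in the wild.
CANONICAL_TERMS = {
    "melting point": {"mp", "m.p.", "melting pt", "melting point"},
    "boiling point": {"bp", "b.p.", "boiling pt", "boiling point"},
}

# Invert the mapping once so each lookup is a single dictionary access.
LOOKUP = {
    synonym: canonical
    for canonical, synonyms in CANONICAL_TERMS.items()
    for synonym in synonyms
}


def normalize_label(label: str):
    """Return the canonical term for a label, or None if it is unknown."""
    key = " ".join(label.lower().split())  # ignore case and extra whitespace
    return LOOKUP.get(key)


for raw in ["M.P.", "Boiling  Pt", "colour"]:
    print(repr(raw), "->", normalize_label(raw))
```

Unknown labels return `None` rather than a guess, so they can be flagged for a curator instead of being silently misfiled.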