7 June 2017
This webinar is the second in a series examining persistent identifiers and their use in research. The webinar:
introduced the IGSN, outlining its structure, use, application and availability for Australian researchers and research institutions
discussed the international symposium Linking Environmental Data and Samples.
Watch full webinar: https://www.youtube.com/watch?v=mOJRaLwOaCs
2. ANDS Webinar IGSN | Linking Data and Samples
Outline
This webinar is the second in a series examining persistent identifiers and their
use in research. This webinar will:
introduce the IGSN identifier for physical samples, outlining its structure, use,
application and availability for Australian researchers and research institutions
discuss the international symposium Linking Environmental Data and Samples.
Speakers:
Lesley Wyborn (Adjunct Fellow, National Computational Infrastructure Facility
and Research School of Earth Sciences at ANU)
Jens Klump (OCE Science Leader Earth Science Informatics, CSIRO)
3. ANDS Webinar IGSN | Linking Data and Samples
Rationale
What is it and how do you use it?
Science use case: a researcher collects samples and publishes a paper with the data
How do I get an IGSN?
4. ANDS Webinar IGSN | Linking Data and Samples
Why do we need a unique identifier for samples? (Part 1)
In the EarthChem global geochemical database all these samples are labeled ‘M1’
5. ANDS Webinar IGSN | Linking Data and Samples
Why do we need a unique identifier for samples? (Part 2)
Different names for dredge sample 3 from the Amphitrite cruise
As national and international data aggregations emerge, we need to be able to uniquely identify samples,
and the analytical data and publications derived from them
6. ANDS Webinar IGSN | Linking Data and Samples
IGSN Overview: what does it do?
Provides identifiers that are guaranteed to be unique via a hierarchical governance
system (like assigning IP addresses)
Facilitates internet-based discovery and access to physical samples:
Web applications and programmatic access to sample metadata catalogues
Networks with sample repositories and data centres
Ensures preservation of, and access to sample data
Aids in the identification of samples in the literature and of data derived from them
Try it out: http://igsn.org/ICDP5054ESYI201 or http://igsn.org/AU1101
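The two example links above suggest a simple resolver pattern. The following is a minimal Python sketch, assuming that an IGSN resolves at http://igsn.org/ followed by the identifier, as in those examples. The helper name `igsn_url` and the alphanumeric check are illustrative assumptions, not part of any IGSN specification.

```python
RESOLVER = "http://igsn.org/"  # base URL taken from the example links on this slide


def igsn_url(igsn: str) -> str:
    """Return the resolver URL for an IGSN (hypothetical helper)."""
    cleaned = igsn.strip()
    # Assumption: IGSNs are plain alphanumeric strings, as in the two examples.
    if not cleaned.isalnum():
        raise ValueError(f"unexpected characters in IGSN: {igsn!r}")
    return RESOLVER + cleaned


print(igsn_url("AU1101"))
print(igsn_url("ICDP5054ESYI201"))
```

The sketch only builds the link; following it in a browser or HTTP client is what leads to the sample's landing page.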
7. ANDS Webinar IGSN | Linking Data and Samples
What can IGSN be used for?
Geological samples and other materials (rocks,
water, biological materials, …)
Collections (groupings of samples)
Sampling features (boreholes, outcrops, …)
Samples can be linked to each other through
the “related identifier” metadata element
(e.g., minerals separated from a parent rock,
legs from a fossil beetle).
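The parent and child linking described above can be sketched as a small data structure. This is only an illustration: the class `SampleRecord`, the relation label `"isPartOf"`, and the IGSN values starting with `XX` are hypothetical placeholders, not taken from the IGSN metadata schema or from any registered sample.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SampleRecord:
    igsn: str
    description: str
    # (relation, related IGSN) pairs; the relation label is a hypothetical choice
    related: List[Tuple[str, str]] = field(default_factory=list)


def children_of(parent_igsn: str, records: List[SampleRecord]) -> List[str]:
    """List the IGSNs of records that declare themselves part of parent_igsn."""
    return [r.igsn for r in records if ("isPartOf", parent_igsn) in r.related]


# Placeholder records: a parent rock and two minerals separated from it.
records = [
    SampleRecord("XXROCK001", "parent rock"),
    SampleRecord("XXMIN001", "mineral separate", [("isPartOf", "XXROCK001")]),
    SampleRecord("XXMIN002", "mineral separate", [("isPartOf", "XXROCK001")]),
]

print(children_of("XXROCK001", records))
```

Storing the link on the child record means a parent sample does not need to be updated each time a new subsample is derived from it.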
8. ANDS Webinar IGSN | Linking Data and Samples
Tracking the sample life cycle
IGSN is used in CSIRO for tracking
samples and to support sample logistics.
In the field: unambiguous
identification, metadata capture
with mobile app.
In the lab: identification and tying data
to samples.
In the Repository: identify collections
and samples in storage, catalogue,
manage sample logistics.
9. ANDS Webinar IGSN | Linking Data and Samples
IGSN enables linking of samples with data and publications
Diagram: Specimen (IGSN), Spectral Results (DOI), Publication (DOI)
10. ANDS Webinar IGSN | Linking Data and Samples
Using IGSN in the Literature
Elsevier and Copernicus earth
science journals use IGSN. The use
of IGSN is also endorsed by the
Coalition for Publishing Data in the
Earth and Space Sciences
(COPDESS).
Example: Dere, A. L., T. S. White, R. H. April, B.
Reynolds, T. E. Miller, E. P. Knapp, L. D. McKay, and S.
L. Brantley (2013), Climate dependence of feldspar
weathering in shale soils along a latitudinal gradient,
Geochimica et Cosmochimica Acta, 122, 101–126,
http://dx.doi.org/10.1016/j.gca.2013.08.001.
11. ANDS Webinar IGSN | Linking Data and Samples
IGSN System Overview
A. Sample registration:
a. Clients register samples with an
allocating agent
b. the allocating agent registers the
samples with IGSN e.V. (IGSN
Implementation Organization).
B. IGSN allocating agents in Australia
a. CSIRO
b. Geoscience Australia
c. Curtin University
12. ANDS Webinar IGSN | Linking Data and Samples
IGSN @ CSIRO
CSIRO became a member of IGSN in 2013.
IGSN are currently used for:
Repository of the ARRC (Rock, Mineral, Soil)
National Collection of Mineral Reference Spectra
(Mineral, Rock, Synthetic Material)
Capricorn Distal Footprints Project (Water,
Vegetation, Soil, Rock, Regolith)
Future use cases:
Soils collection
Insect collection
13. ANDS Webinar IGSN | Linking Data and Samples
IGSN @ Geoscience Australia
GA became a member of IGSN in 2014.
IGSN are currently used in the GA
collection.
1.6 million samples registered:
Rocks
Mineral Separates
Thin sections
Fossils
GA is the IGSN Registration Agent for the
Geological Surveys of the Australian
states and territories.
14. ANDS Webinar IGSN | Linking Data and Samples
IGSN @ Curtin University
Curtin University became a member of IGSN in 2015.
IGSN is currently used in the John de Laeter
Centre for Geochemistry.
IGSN and data management are supported by the
Curtin University Library.
JdLC, CSIRO, Univ. Western Australia and Geol.
Survey of Western Australia work together in
the Natural Resources Research Precinct.
This project was supported by the ANDS Major Open
Data Collections Project.
15. ANDS Webinar IGSN | Linking Data and Samples
Australian IGSN Portal
A grant from the NCRIS Research Data
Services programme made it possible
to develop a demonstrator for a
common Australia Geo Sample Portal.
Metadata are harvested into a common
metadata portal to facilitate the
discovery of samples curated by
Australian IGSN members.
Samples are described in a common
metadata schema.
http://igsn2.csiro.au
16. ANDS Webinar IGSN | Linking Data and Samples
Linking to a Global Perspective
IGSN Implementation Organization
e.V. is a charitable organisation
incorporated under German law in
Potsdam, Germany.
IGSN e.V. currently has 19 members
on 4 continents.
17. ANDS Webinar IGSN | Linking Data and Samples
Governance Model
The governance model of IGSN is based on hierarchical delegation.
IGSN identifiers are registered through IGSN agents.
Each IGSN agent is given namespaces for the registration of IGSN.
Examples:
AU… Geoscience Australia
CS… CSIRO
CSCAP… CDF Project (CSIRO)
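The namespace delegation above can be illustrated with a short lookup. This is a minimal Python sketch using only the three example namespaces from this slide. The longest-prefix rule (so that `CSCAP` wins over `CS`) is an assumption made for the illustration, and `CSCAP0001` is a made-up identifier.

```python
# Example namespaces from the slide; a real registry would hold many more.
NAMESPACES = {
    "AU": "Geoscience Australia",
    "CS": "CSIRO",
    "CSCAP": "CDF Project (CSIRO)",
}


def allocating_agent(igsn: str):
    """Return the holder of the longest matching namespace, or None."""
    matches = [prefix for prefix in NAMESPACES if igsn.startswith(prefix)]
    if not matches:
        return None
    # Assumption: a more specific (longer) namespace takes precedence.
    return NAMESPACES[max(matches, key=len)]


print(allocating_agent("AU1101"))
print(allocating_agent("CSCAP0001"))  # hypothetical identifier
```

Because each agent only mints identifiers inside its own namespaces, uniqueness within a namespace is enough to make the full identifier globally unique.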
18. ANDS Webinar IGSN | Linking Data and Samples
Technical Base
IGSN builds upon an existing technical
base and community.
IGSN are based on the Handle System.
The IGSN technical architecture is
developed in close alignment with
DataCite.
http://igsn.github.io (documentation)
https://github.com/igsn (repository)
19. ANDS Webinar IGSN | Linking Data and Samples
Status
Still work in progress.
Active registration agents:
● CSIRO
● Curtin University
● Geoscience Australia
● GFZ Potsdam
● IEDA (Columbia University)
● MARUM (Univ. Bremen)
● BGR (by proxy)
● USGS (by proxy)
20. ANDS Webinar IGSN | Linking Data and Samples
Samples moving between institutions
What happens when samples move from one
institution to another?
Case 1: Laboratory
IGSN available: The laboratory uses the already
assigned IGSN.
IGSN not available: The laboratory assigns a
new IGSN.
Case 2: Subsampling
A subsample should be identified by its own
IGSN. This case depends on the details of the
setting.
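The two cases above can be written as a small decision sketch in Python. Everything here is illustrative: `mint_igsn` is a stand-in for real registration with an allocating agent, the namespace `XX` is a placeholder, and the returned dictionary keys are hypothetical.

```python
import itertools

_counter = itertools.count(1)


def mint_igsn(namespace: str) -> str:
    """Stand-in for registering a new IGSN with an allocating agent."""
    return f"{namespace}{next(_counter):06d}"


def igsn_for_incoming_sample(existing_igsn, lab_namespace: str) -> str:
    """Case 1: reuse an assigned IGSN, otherwise assign a new one."""
    if existing_igsn:
        return existing_igsn
    return mint_igsn(lab_namespace)


def igsn_for_subsample(parent_igsn: str, lab_namespace: str) -> dict:
    """Case 2: a subsample gets its own IGSN and keeps a link to its parent."""
    return {"igsn": mint_igsn(lab_namespace), "parent": parent_igsn}


print(igsn_for_incoming_sample("AU1101", "XX"))  # already has an IGSN
print(igsn_for_incoming_sample(None, "XX"))      # none assigned yet
print(igsn_for_subsample("AU1101", "XX"))
```

As the slide notes, the subsampling case depends on the details of the setting, so the second function shows only one possible arrangement.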
21. ANDS Webinar IGSN | Linking Data and Samples
Future Outlook
Build a developer community around
IGSN, document best practices, build
reference implementations of services.
Expand identifying and linking to objects
in other domains.
Other domains start reusing IGSN
technology.
22. ANDS Webinar IGSN | Linking Data and Samples
Report from “Linking Environmental Data and Samples”
A “Cutting edge science symposium”
Seed funding from CSIRO Research Plus office
Goals:
Bring international research leaders to Australia
Provide forum for Early Career Researchers to engage
23. ANDS Webinar IGSN | Linking Data and Samples
Web Page and Sponsors
https://confluence.csiro.au/display/LEDS/Linking+Environmental+Data+and+Samples
or https://goo.gl/FaexqL
24. ANDS Webinar IGSN | Linking Data and Samples
Science Drivers
We have a rich resource of samples
supporting scientific investigation.
How can we link these to the data sets
that were derived from these samples?
How can we link samples and data to
the literature where they are
interpreted and put into context?
How can we include machines as
users?
25. ANDS Webinar IGSN | Linking Data and Samples
Solutions
What is the role of infrastructure for building the linked data federation?
How can we support the evolution of linked data?
Heterogeneity is inherent, so we need mediation mechanisms.
How precise do terms need to be? We can suffer some degree of imprecision.
Vocabularies will be adopted if they are useful.
The fabric of science: which elements produce output?
The process of science: which outputs are produced?
The language of science: how do we describe the elements?
26. ANDS Webinar IGSN | Linking Data and Samples
Social and Community Factors in Linked Open Data (LOD)
Greatest organisational effectiveness: when
technology systems fit social systems. (Paul
Box, CSIRO)
Who bears the costs in current systems? What
would be the incentive to contribute?
Fail patterns:
The “anti-Life-of-Brian” pattern (but I'm different!)
The “Too Big to Fail” pattern (should have ended long ago, but too much invested)