This presentation was provided by Andrew McKenna-Foster of Figshare, during the sixth session of our Spring 2023 NISO Training Series "Quality Assurance of Data Sets." The class focused on Data Catalogs and Repositories, and was held June 1, 2023.
2. Quick Introduction
Andrew McKenna-Foster
MSc - Environmental Science and Policy
Research background in invertebrate biodiversity and conservation (e.g., the headless millipede mystery)
MLIS in 2020 with a focus on data curation
Product Specialist with Figshare since 2020
3. Agenda
• Introduction to data catalogues and repositories and their role in data discoverability and data quality assurance
• Examples of data catalogues and repositories, such as Data.gov and Figshare
• Best practices for sharing and publishing data sets in repositories
10. Data Quality Assurance and Repositories
Repositories can provide:
• Credibility
• CoreTrustSeal
• Branding
• Linking to organization's resources
• Provenance in a structured way
• Versioning
• Linking to related objects
• Linking to funding
• Linking to citations
• Long-term access
• Machine-readable metadata
• Citation mechanisms (persistent identifier)
• Re-use license
Photo by Modestas Urbonas on Unsplash
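To make "machine-readable metadata" and "citation mechanisms" a little more concrete, here is a minimal sketch of a dataset record expressed as schema.org JSON-LD, built in Python. The function name, the example title, the DOI, and the funder are all hypothetical placeholders; real repositories generate records like this in their own way and with many more fields.

```python
import json


def build_dataset_jsonld(title, doi, license_url, version, creators,
                         funder=None, related=None):
    """Return a schema.org Dataset description as a plain dict (illustrative only)."""
    record = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": title,
        # Persistent identifier: the DOI expressed as a resolvable URL.
        "identifier": f"https://doi.org/{doi}",
        "license": license_url,
        "version": version,
        "creator": [{"@type": "Person", "name": name} for name in creators],
    }
    if funder:
        # Link to funding.
        record["funder"] = {"@type": "Organization", "name": funder}
    if related:
        # Links to related objects such as papers that use the data.
        record["citation"] = list(related)
    return record


# All values below are made up for illustration.
example = build_dataset_jsonld(
    title="Example invertebrate survey data",
    doi="10.1234/example.5678",
    license_url="https://creativecommons.org/publicdomain/zero/1.0/",
    version="2",
    creators=["A. Researcher"],
    funder="Example Funding Agency",
    related=["https://doi.org/10.1234/example.paper"],
)

print(json.dumps(example, indent=2))
```

A record like this is what lets search tools and citation managers pick up the title, license, version, and identifier without a person reading the landing page.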
12. Short Activity
Search for your favorite topic in Google Dataset Search
Select a dataset result
Focus on the repository:
1. What type of repository is it? Domain, generalist, institutional, publisher?
2. How does the repository help with quality assurance? E.g.:
a. Does it clearly indicate funding, related material, and author information?
b. Can you tell if there was a curation/review process?
c. Are there links to affiliated institutions or organizations?
3. Brief share out
13. Data Quality Assurance and Catalogs
Catalogs provide:
• Discovery
• Metadata-only records that point to file locations
• Possibly enhanced metadata, possibly not
Photo by jesse orrico on Unsplash
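One way to picture the difference between a repository and a catalog is a rough sketch like the one below: the repository record holds the files, while the catalog record holds only metadata and pointers to where the files live. The class names, the `harvest` function, the `keep_fields` parameter, and the URLs are hypothetical and do not describe how any particular catalog works.

```python
from dataclasses import dataclass, field


@dataclass
class RepositoryRecord:
    """A repository holds the files themselves plus their metadata."""
    title: str
    doi: str
    metadata: dict
    files: dict  # filename -> file contents (bytes)


@dataclass
class CatalogRecord:
    """A catalog record is metadata only; it points to where the files live."""
    title: str
    landing_page: str
    file_locations: list
    metadata: dict = field(default_factory=dict)


def harvest(repo_record, download_base_url, keep_fields=("keywords", "license")):
    """Build a catalog record from a repository record without copying any files."""
    # The catalog may carry over only some metadata fields ("possibly not" enhanced).
    kept = {k: v for k, v in repo_record.metadata.items() if k in keep_fields}
    return CatalogRecord(
        title=repo_record.title,
        landing_page=f"https://doi.org/{repo_record.doi}",
        file_locations=[f"{download_base_url}/{name}" for name in repo_record.files],
        metadata=kept,
    )


# Made-up example values.
repo_record = RepositoryRecord(
    title="Example soil survey",
    doi="10.1234/example.0001",
    metadata={"keywords": ["soil"], "license": "CC0", "funding": "Example grant"},
    files={"survey.csv": b"site,ph\nA,6.5\n"},
)
catalog_record = harvest(repo_record, "https://repository.example.org/files")
print(catalog_record)
```

Note that the funding information is dropped in this example, which mirrors how a catalog record can end up with less metadata than the repository record it points to.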
17. Short Group Activity
USDA's Ag Data Commons is both a repository and a catalog
Look at this dataset from Ag Data Commons
1. Note the extensive metadata. Click the "Explore data" button to see the data set. Where does that take you?
2. Now copy the dataset title and search for it at data.gov (searching by the full title may not work ;) You may have to get creative.)
3. What are the differences in metadata between the data.gov and Ag Data Commons records?
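If you wanted to do the comparison in step 3 programmatically rather than by eye, a minimal sketch might look like this. It assumes each record's metadata has already been copied into a flat Python dictionary; the function name, field names, and values are invented for illustration and are not taken from either site.

```python
def compare_metadata(first, second):
    """Summarize how two flat metadata dictionaries differ."""
    keys_first, keys_second = set(first), set(second)
    return {
        "only_in_first": sorted(keys_first - keys_second),
        "only_in_second": sorted(keys_second - keys_first),
        "different": sorted(
            key for key in keys_first & keys_second if first[key] != second[key]
        ),
    }


# Hypothetical records standing in for a repository record and a catalog record.
repository_record = {
    "title": "Example dataset",
    "license": "CC0",
    "funding": "Hypothetical grant 123",
    "keywords": ["soil", "crops"],
}
catalog_record = {
    "title": "Example dataset",
    "license": "Creative Commons CCZero",
    "publisher": "Example agency",
}

differences = compare_metadata(repository_record, catalog_record)
print(differences)
```

Fields that appear in only one record, or that carry different values in each, are the ones worth discussing in the share-out.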
21. Using a data repository as an individual
• Look for repositories in this order: Domain, Institutional, Publisher, Generalist
• Think about how others will look for your data
  ◦ keywords
  ◦ make sure to link from your paper
• Think about how others will need to reuse and cite your data
  ◦ group files or publish separately?
  ◦ what additional resources do you need to link to or include?
• Should you submit the record information to any data catalogs?