Enriching Scholarship
May 6, 2014
Natsuko Nicholls, UM Libraries
Elizabeth Moss, ICPSR
US Federal Funding Mandates
NIH (2003) Data Sharing Policy: all funding
applications requesting $500,000 or more per year are
expected to address data sharing in their
application.
NSF (2011): All funding proposals submitted on
or after January 18, 2011, must include a “Data
Management Plan” describing how the
proposal will conform to NSF policy on the
dissemination and sharing of research results.
International Mandates
Aug 2011… “expectation that all our funded
researchers should maximise access to
their research data with as few
restrictions as possible. … submit a data
management and sharing plan as part
of the application process.”
2007… “Researchers are to retain research
data and primary materials, manage
storage of research data and primary
materials, maintain confidentiality of
research data and primary materials.”
Journal Mandates
Dec 2013 . . .“We ask you to make available the data
underlying the findings in the paper, which would be
needed by someone wishing to understand, validate or
replicate the work. Our policy has not changed in this regard.
What has changed is that we now ask you to say where the
data can be found.
As the PLOS data policy applies to all fields in which we
publish, we recognize that we’ll need to work closely with
authors in some subject areas to ensure adherence to the new
policy. Some fields have very well established standards and
practices around data, while others are still evolving, and we
would like to work with any field that is developing data
standards. We are aiming to ensure transparency about
data availability.”
Questions
 Sharing data—how does it happen?
 What is data publishing?
 Is data archiving the same?
 How can we find data, access it, and reuse it?
 How can we measure the impact of sharing data?
 What’s the common denominator?
Paradigm Shift
The nature of research has become…
 More quantitative/data-intensive
 More funder-driven
 More interdisciplinary/collaborative
 More transparent
 More complicated in terms of cross-linking
 More diverse in terms of citable scholarly
outputs
The focus of scholarly communication
has changed…
From:
 Preserve publications
 Preserve data
 Preserve both (at least separately)
To:
 Preserve publications and data ‘together’
 Preserve the ‘relationships’ among them
Publishing and Archiving
[Diagram: Scholarly Communication (Availability, Citability, Validation) spans Scholarly Publishing and Data Archiving; the two meet in Scholarly Publishing that includes ‘Data Publication’]
Data Dissemination Methods Indicated in
DMPs Written by UM Engineering Faculty
• Journal publication: 42%
• Faculty/project website: 36%
• Conference presentation: 11%
• "Upon request": 11%
Source: NSF Engineering Data Management Plan Analysis, N=156
Data Dissemination Methods
 Submitted with journal article
 Appear in journal article upon publication
 Supplemental materials (including codebooks)
 Websites (prior/post publication)
 Institutional repositories (prior/post publication)
 Data archive per discipline’s culture of sharing
 Data repository (may be assigned by journal
publishers)
 Data papers in data journals (may be independent of
the journal article)
 “Data upon request” via email (some/all)
Repository Directory Lists
• IR
  • OpenDOAR (over 2600 academic open access repositories listed)
  • Deep Blue (University of Michigan Library)
• DR
  • NIH Data Sharing Repositories (57 repositories)
  • Thomson Reuters Data Citation Index (174 repositories)
  • Databib (975 repositories listed)
  • re3data.org (609 repositories listed)
DataCite, re3data.org, and Databib announced collaboration
towards one service under the auspices of DataCite by 2015
Disciplinary Data Repositories:
What to Look for?
 Subject/Discipline focus
 Hosted by…
 Access to data: open vs. restricted
 Deposit of data: open vs. restricted
 Deposit fee
 Persistent identifiers (DOI, hdl)
 Sustainability & preservation policy
 (Non-) Proprietary file formats
 Amount of data description/metadata
(data package level, file level, data item level)
 Associated code/software
More on Persistent IDs
• A DOI is a system for persistently identifying and locating digital objects;
originally designed and developed for “journal articles”; ISO 26324 since 2012
• DOIs can be assigned only by DOI registration agencies, e.g. DataCite, CrossRef
• Assigning a DOI is not free (e.g. costing ~$1 per DOI via CrossRef in 2013)
• DOI: prefix + suffix
  • e.g. DOI for a dataset: http://doi.org/10.3886/ICPSR27282.v1
• A DOI prefix is unique to each publisher/repository
  • ICPSR: 10.3886
  • UK Data Service: 10.5255
  • Figshare: 10.6084
  • PANGAEA: 10.1594
  • Dryad: 10.5061
• Very similar to ‘handles’ in terms of persistence
  • e.g. U of M IR Deep Blue: http://hdl.handle.net/2027.42/106575
• Moving towards “Data with DOI”, just as for scholarly articles
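The prefix + suffix structure described on this slide can be illustrated with a short Python sketch. It is a minimal illustration, not part of the original presentation: the function name `parse_doi` and the `ParsedDoi` type are hypothetical, and the prefix table contains only the five prefixes listed on the slide, so any other prefix is reported as "unknown".

```python
from typing import NamedTuple

# Prefixes copied from the slide; this table is illustrative, not exhaustive.
KNOWN_PREFIXES = {
    "10.3886": "ICPSR",
    "10.5255": "UK Data Service",
    "10.6084": "Figshare",
    "10.1594": "PANGAEA",
    "10.5061": "Dryad",
}


class ParsedDoi(NamedTuple):
    prefix: str      # identifies the publisher/repository
    suffix: str      # chosen by that publisher/repository
    registrant: str  # looked up in KNOWN_PREFIXES, else "unknown"


def parse_doi(text: str) -> ParsedDoi:
    """Split a DOI (bare, 'doi:'-prefixed, or resolver URL) into prefix and suffix."""
    doi = text.strip()
    for lead in ("https://doi.org/", "http://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.lower().startswith(lead):
            doi = doi[len(lead):]
            break
    # Everything before the first slash is the prefix; the rest is the suffix.
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {text!r}")
    return ParsedDoi(prefix, suffix, KNOWN_PREFIXES.get(prefix, "unknown"))


if __name__ == "__main__":
    print(parse_doi("http://doi.org/10.3886/ICPSR27282.v1"))
```

Splitting on the first slash only is deliberate: suffixes are chosen by the repository and may themselves contain slashes or dots.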
Data Repositories
Let’s take a closer look at this example!
Data Papers: Going beyond Appendices and
Supplements
Data Journals
 Number of ‘Data Journals’
As of today, 70+ data journals*
 Journal host
a) Authors
b) Journals
c) Publisher data repositories
d) Data repositories (IR/DR)
 Data journal article structure
a) Intro/Overview
b) Methods
c) Dataset description
d) Reuse potential
Source: K. Akers and J. Green. Data Sharing and Publication,
Presented at the Cyberinfrastructure (CI) Days Event, University
of Michigan, Ann Arbor, MI, November 13-14, 2013.
*Note: To see a full list of data journals that currently exist, see
K. Akers’ blog post at:
http://mlibrarydata.wordpress.com/2014/05/09/data-journals/
Data Journal Example
Geoscience Data Journal by Wiley
 Launched in Fall 2012
 Published on behalf of Royal Meteorological Society
 OA with author-pay model ($1,500 per article)
 Publishes short data papers cross-linked to (and citing)
datasets that have been deposited in approved data
centers/repositories and awarded DOIs.
 A data article describes a dataset, giving details of its
collection, processing, file formats etc., but does not go
into detail of any scientific analysis of the dataset or
draw conclusions from that data.
 The data paper should allow the reader to understand
the when, why and how the data was collected, and what
the data is.
Data Journal Example (continued)
Data centers/repositories approved by Geoscience Data Journal
 3TU.Datacentrum
 British Atmospheric Data Centre (BADC)
 British Oceanographic Data Centre (BODC)
 CISL Research Data Archive
 CSIRO Data Access Portal
 Environmental Information Data Centre (EIDC)
 Figshare
 IEDA:EarthChem
 IEDA:MGDS
 National Center for Atmospheric Research (NCAR), USA
 Earth Observing Lab (EOL), observational and supporting data from atmospheric science field
experiments and arctic research
 Research Data Archive (RDA), reference datasets for weather and climate research
 National Geoscience Data Centre (NGDC)
 NERC Earth Observation Data Centre (NEODC)
 NOAA National Climatic Data Center (NCDC)
 NOAA National Oceanographic Data Center (NODC)
 NOAA National Geophysical Data Center (NGDC)
 PANGAEA
 Polar Data Centre (PDC)
 Zenodo
Data Journal Example (continued)
Data Publisher Examples
 Wiley
 Geoscience Data Journal
 Ubiquity Press
 Journal of Open Archaeology Data
 Journal of Open Psychology Data
 Open Health Data
 Journal of Open Research Software
 Nature
 Scientific Data
Data Journal Examples (to name only a few): Some Feature Comparison
Publisher | Journal | OA? | Publication fee per article | Publisher hosts data? | Approved data centers/repositories recommended for data deposit? | What is the article called? | DOI?
Wiley | Geoscience Data Journal | Yes | $1,500 | No | Yes | ‘Data Paper’ | Yes
Ubiquity Press | Journal of Open Archaeology Data | Yes | $40 | No | Yes | ‘Data Paper’ | Yes
Nature Publishing Group | Scientific Data | Yes | $700 | No | Yes | ‘Data Descriptor’ | Yes
ICPSR: Inter-university Consortium for Political and Social Research
Located on U of M Campus
www.icpsr.umich.edu
Signs of a Trusted Repository
• A unit of ISR, ICPSR is governed by a Council representing
over 700 member institutions, including U of M
• Long-term sustainability: “publishing” data for 52 years
• Largest social science data repository in US with a catalog
of over 8,000 studies containing thousands of files
• Awarded the Data Seal of Approval from DANS
• Federal agencies’ archives are housed at ICPSR and fully
integrated with ICPSR’s collection
• Data preservation standards are followed to keep data usable long-term,
guarding against deterioration, accidental loss, and digital
obsolescence
• Data are screened for confidentiality and privacy concerns.
Stringent protections are in place for securing and
distributing sensitive data.
• Physical and virtual data enclaves for analyzing restricted-use data
Rich Metadata for Better Access,
Discovery, Context, and Reuse
 ICPSR formats, organizes and enhances deposited raw
research data with meaningful metadata and
documentation to make it complete, self-explanatory, and
usable for future researchers
 Study metadata and codebooks are generated according to
the Data Documentation Initiative (DDI) XML standard
 Search and filter online catalog with fielded metadata
records to enhance discovery; side-by-side comparison using
structured variable-level documentation in XML, tagged
according to the DDI standard
 All studies are registered with a unique identifier—DOIs
from DataCite. ICPSR has been providing citations to its
data since 1990 and started assigning DOIs in 2008
Replication Datasets
http://www.icpsr.umich.edu/icpsrweb/deposit/pra/index.jsp
Open Sharing for DMP Proposals
http://openicpsr.org/
Top 10 Data Downloads (last six months)
(non-anonymous, distinct users downloading one or more files)
Title | Archive | # Downloads
National Longitudinal Study of Adolescent Health (Add Health), 1994-2008 | DSDR | 1,188
General Social Survey, 1972-2012 [Cumulative File] | ICPSR | 737
Chinese Household Income Project, 2002 | DSDR | 720
India Human Development Survey (IHDS), 2005 | SAMHDA | 445
Collaborative Psychiatric Epidemiology Surveys (CPES), 2001-2003 [United States] | CPES | 407
National Survey on Drug Use and Health, 2012 | SAMHDA | 314
Children of Immigrants Longitudinal Study (CILS), 1991-2006 | DSDR | 289
National Crime Victimization Survey, 2012 | NACJD | 260
National Prisoner Statistics, 1978-2011 | NACJD | 249
Historical, Demographic, Economic, and Social Data: The United States, 1790-2002 | ICPSR | 245
Who uses these shared data?
How are they used?
With what impact?
The ICPSR Bibliography of
Data-related Literature
 Link research data to the scholarly literature about it
 Aid students, instructors, researchers, and funders to
discover and understand data use
 A searchable database currently containing over 65,000
citations of known published and unpublished works
resulting from analyses of data archived at ICPSR
 It generates study bibliographies linking each study with
the literature about it, and out to the full text
Linking the Data to the Literature
Altmetrics for research data
 Easier to access and analyze much more
research data online
 New focus on sharing that research data
 Increasing use of social media to discuss, via
tweets, likes and blog posts
 More online tools to download, collaborate
and share, like Mendeley, Figshare,
SlideShare, Dryad and ResearchGate,
DeepBlue, openICPSR
 Dependent on good citation practice
Publishers
• Springer
• Elsevier
• Wiley
• Cambridge Journals
• BMJ Journals
• Nature Publishing Group
• PLoS
Altmetrics Aggregators
• Altmetric
• ImpactStory
• Plum Analytics
Funders
• NSF
• Sloan Foundation**
• MacMillan
• EBSCO
**The Alfred P. Sloan Foundation helps fund
ImpactStory, and is now funding the National
Information Standards Organization (NISO) to
develop standards and recommended best
practices for altmetrics.
ImpactStory:
Product-level Metric
 “New ways to measure the research impact . . . of
emerging products like blog posts, datasets, and
software . . . to build a new scholarly reward system
that values and encourages web-native scholarship.”
 Open metrics, with context, using diverse products
to provide researchers with a “comprehensive impact
report” of their research output
Source: https://impactstory.org/about
Artifact-level Metric
Source: http://www.plumanalytics.com/metrics.html
Integration with Web of Science All
Databases: Research data is equal
to research literature
Articles linked to underlying data.
Increased data discovery.
Reward for data citation.
Potential for automated tracking.
Elsevier Connect
 “Elsevier is collaborating with a rapidly growing number of
external data set repositories to optimize interoperability
between their data sets and research articles on
ScienceDirect. As part of the Article of the Future project,
this reciprocal linking aims to expand the availability of
research data and improve the researcher workflow.”
 “Elsevier encourages authors to submit their data sets to
external repositories. . . But not all authors know how or
where to submit their data, and not all authors are aware of
the possibilities that data linking offers. . .The recent
agreement with Dryad Digital Repository marked the 35th
data linking partnership Elsevier has established. . .”
Source: http://www.elsevier.com/connect/bringing-data-to-life-with-data-linking
Source: http://www.slideshare.net/ElsevierConnect/columbia-27feb13v2ext
For Better Metrics on Research
Data Impact
 Need more aggregator and repository data to be
exposed for altmetric harvesters like ImpactStory
 More integrated efforts among libraries, publishers,
archives, and funders. For example:
 The Data Conservancy, IEEE, and Portico receive
Alfred P. Sloan Foundation grant to connect
publications and their linked data
Formal Citation, in the References,
with the DOI
doi:10.3886/ICPSR21240
http://www.flickr.com/photos/papertrix/38028138/
Some Challenges
No Common Practice of Formal
Data Citation
 Abstract?
 Acknowledgements?
 Charts and Tables?
 Appendices?
 Discussion?
 Footnotes?
 Sample?
 Methods?
References!
 Without an explicit
citation, reader must
infer or be out of luck
 No attribution—no credit
 No access—no reuse
 No discernible impact!
Examples of Bad Data Citation
Poorly described and cited data
+
Excessive human search effort, extensive collection knowledge
=
Too costly, too questionable for confident measure of impact
Examples of Good Data Citation
Formal data citing with a DOI
+
Minimal human search effort
=
High hit accuracy for the cost, and better confidence of impact measures
Basic Data Citation Format
Creator (Year) Title. Publisher. Identifier
(For datasets that have DOIs, DataCite and CrossRef provide a citation
formatter to generate a citation in various journal styles.)
Core Elements
Creator(s): Individual(s) or organization responsible for creating
the dataset.
Year: Year the dataset was published, not necessarily created.
Title: Should be as descriptive as possible
Publisher: Organization that provides access to the dataset (e.g.
Dryad, Zenodo)
Identifier: Persistent, unique identifier (e.g. a DOI)
Source: http://datapub.cdlib.org/datacitation/
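As a rough illustration of the "Creator (Year) Title. Publisher. Identifier" pattern, the sketch below assembles the five core elements into one string. The class name `DatasetCitation` and its `format` method are hypothetical names chosen for this example; the sample values are taken from the Dryad citation shown among this deck's citation examples.

```python
from dataclasses import dataclass


@dataclass
class DatasetCitation:
    """The five core elements of a basic data citation."""
    creator: str
    year: int         # year the dataset was published, not necessarily created
    title: str
    publisher: str    # organization that provides access to the dataset
    identifier: str   # persistent, unique identifier, e.g. a DOI

    def format(self) -> str:
        """Render as: Creator (Year) Title. Publisher. Identifier"""
        title = self.title.rstrip(".")  # avoid a doubled period after the title
        return f"{self.creator} ({self.year}) {title}. {self.publisher}. {self.identifier}"


example = DatasetCitation(
    creator="Sidlauskas B",
    year=2007,
    title=(
        "Data from: Testing for unequal rates of morphological diversification "
        "in the absence of a detailed phylogeny: a case study from characiform fishes"
    ),
    publisher="Dryad Digital Repository",
    identifier="doi:10.5061/dryad.20",
)
print(example.format())
```

Journal styles differ in punctuation and element order, so this fixed layout is only one possible rendering of the core elements.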
How to Cite Data
Additional Elements
Location / Availability: The web address of the dataset is essential
when the identifier can’t be used to reach the dataset.
Version / Edition: Version of the dataset used in the present
publication. Needed to reproduce analysis of versioned dynamic
datasets.
Access Date: Date of access for analysis in the present publication.
Needed to reproduce analysis of continuously updated dynamic
datasets.
Format / Material Designator: e.g., database, CD-ROM.
Feature Name: A description of the subset of the dataset used. May be
a formal title or a list of variables (e.g., concentration, optical density).
Verifier: Used to confirm that two datasets are identical. Most
commonly a UNF or MD5 checksum.
Series: Used if the dataset is part of a series of releases (e.g., monthly)
Contributor: e.g., editor, compiler
Source: http://datapub.cdlib.org/datacitation/
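The "Verifier" element can be made concrete with a short sketch that uses Python's standard `hashlib` module to compute an MD5 checksum. The function names `md5_of_file` and `same_dataset` are hypothetical. An MD5 checksum compares files byte for byte, so two copies holding the same values in different file formats would not match; a UNF is a different kind of fingerprint and is not computed here.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large data files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_dataset(a: Path, b: Path) -> bool:
    """True when two files are byte-for-byte identical according to MD5."""
    return md5_of_file(a) == md5_of_file(b)
```

In practice, a reader would compare the digest of a downloaded file against the verifier string published in the citation or the repository record.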
Data Citation Examples
Deschenes, Elizabeth Piper, Susan Turner, and Joan Petersilia.
Intensive Community Supervision in Minnesota, 1990-1992: A Dual
Experiment in Prison Diversion and Enhanced Supervised Release.
ICPSR06849-v1. Ann Arbor, MI: Inter-university Consortium for
Political and Social Research [distributor], 2000.
doi:10.3886/ICPSR06849.v1
Esther Duflo; Rohini Pande, 2006, "Dams, Poverty, Public Goods
and Malaria Incidence in India",
http://hdl.handle.net/1902.1/IOJHHXOOLZ
UNF:5:obNHHq1gtV400a4T+Xrp9g== Murray Research Archive
[Distributor] V2 [Version]
Sidlauskas B (2007) Data from: Testing for unequal rates of
morphological diversification in the absence of a detailed
phylogeny: a case study from characiform fishes. Dryad Digital
Repository. doi:10.5061/dryad.20
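For datasets that have DOIs, a formatted citation like those above can usually be requested from the DOI resolver rather than typed by hand. The sketch below assumes the DOI content negotiation service offered by CrossRef and DataCite, which the slides do not name: the resolver at doi.org is asked for the `text/x-bibliography` media type with a `style` parameter. The function names are hypothetical, and whether a particular style is available depends on the registration agency.

```python
import urllib.request


def citation_request(doi: str, style: str = "apa") -> urllib.request.Request:
    """Build a request asking the DOI resolver for a formatted citation."""
    bare = doi[len("doi:"):] if doi.lower().startswith("doi:") else doi
    return urllib.request.Request(
        f"https://doi.org/{bare}",
        # The Accept header selects a formatted bibliography entry.
        headers={"Accept": f"text/x-bibliography; style={style}"},
    )


def fetch_citation(doi: str, style: str = "apa", timeout: float = 10.0) -> str:
    """Send the request and return the citation text (needs network access)."""
    with urllib.request.urlopen(citation_request(doi, style), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


# Example call, left commented out because it contacts an external service:
# print(fetch_citation("doi:10.3886/ICPSR06849.v1"))
```

Separating request construction from the network call keeps the formatting logic checkable offline.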
Joint Declaration of Data
Citation Principles
1. Future Of Research Communication and
E-Scholarship (FORCE11)
2. Committee on Data for Science and
Technology (CODATA)
3. Digital Curation Centre (DCC)
Source: https://www.force11.org/datacitation
Eight Principles
1. Importance--Data should be considered
legitimate, citable products of research. Data
citations should be accorded the same importance
in the scholarly record as citations of other
research objects, such as publications.
2.Credit and Attribution--Data citations should
facilitate giving scholarly credit and normative and
legal attribution to all contributors to the data,
recognizing that a single style or mechanism of
attribution may not be applicable to all data.
Eight Principles
3. Evidence—In scholarly literature, whenever and
wherever a claim relies upon data, the
corresponding data should be cited.
4. Unique Identification—A data citation should
include a persistent method for identification
that is machine actionable, globally unique, and
widely used by a community.
Eight Principles
5. Access—Data citations should facilitate access to
the data themselves and to such associated
metadata, documentation, code, and other
materials, as are necessary for both humans and
machines to make informed use of the referenced
data.
6.Persistence—Unique identifiers, and metadata
describing the data, and its disposition, should
persist -- even beyond the lifespan of the data they
describe.
Eight Principles
7. Specificity and Verifiability—Data citations
should facilitate identification of, access to, and
verification of the specific data that support a
claim.
Citations or citation metadata should include
information about provenance and fixity
sufficient to facilitate verifying that the specific
timeslice, version and/or granular portion of data
retrieved subsequently is the same as was
originally cited.
Eight Principles
8. Interoperability and flexibility—Data citation
methods should be sufficiently flexible to
accommodate the variant practices among
communities, but should not differ so much that
they compromise interoperability of data citation
practices across communities.
Make Your Data Count
 If it’s not cited, it can’t be counted
 Without counting data use, there is no
accurate way to measure the impact of your
shared data
 Without a well-formed citation, your data
cannot take advantage of the potential of
linked scholarly publishing
 Store your data where citations are unique and
persistent
 Cite your own data and others’ in your
publications
Questions Answered?
 Sharing data—how does it happen?
 What is data publishing?
 Is data archiving the same?
 How can we find data, access it, and reuse it?
 How can we measure the impact of sharing data?
 What’s the common denominator?
Thank you!
Natsuko Nicholls
hayashin@umich.edu
Elizabeth Moss
eammoss@umich.edu

 
Preparing Your Research Data for the Future - 2015-06-08 - Medical Sciences D...
Preparing Your Research Data for the Future - 2015-06-08 - Medical Sciences D...Preparing Your Research Data for the Future - 2015-06-08 - Medical Sciences D...
Preparing Your Research Data for the Future - 2015-06-08 - Medical Sciences D...
 
Services, policy, guidance and training: Improving research data management a...
Services, policy, guidance and training: Improving research data management a...Services, policy, guidance and training: Improving research data management a...
Services, policy, guidance and training: Improving research data management a...
 
Introduction to Data Management and Sharing
Introduction to Data Management and SharingIntroduction to Data Management and Sharing
Introduction to Data Management and Sharing
 
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLANINCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
 
Presentation to the UM Library Emergent Research Series
Presentation to the UM Library Emergent Research SeriesPresentation to the UM Library Emergent Research Series
Presentation to the UM Library Emergent Research Series
 
Researh data management
Researh data managementResearh data management
Researh data management
 
Data Management for Research (New Faculty Orientation)
Data Management for Research (New Faculty Orientation)Data Management for Research (New Faculty Orientation)
Data Management for Research (New Faculty Orientation)
 
Data curation issues for repositories
Data curation issues for repositoriesData curation issues for repositories
Data curation issues for repositories
 
Data Management for Postgraduate students by Lynn Woolfrey
Data Management for Postgraduate students by Lynn WoolfreyData Management for Postgraduate students by Lynn Woolfrey
Data Management for Postgraduate students by Lynn Woolfrey
 
Engaging the Researcher in RDM
Engaging the Researcher in RDMEngaging the Researcher in RDM
Engaging the Researcher in RDM
 
DataCite overview 2014
DataCite overview 2014DataCite overview 2014
DataCite overview 2014
 

Recently uploaded

INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degreeyuu sss
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhYasamin16
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
detection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptxdetection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptxAleenaJamil4
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 

Recently uploaded (20)

INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
detection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptxdetection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptx
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 

Enriching Scholarship 2014 Beyond the Journal Article: Publishing and Citing the Underlying Data

  • 1. Enriching Scholarship May 6, 2014 Natsuko Nicholls, UM Libraries Elizabeth Moss, ICPSR
  • 2. US Federal Funding Mandates
      • NIH (2003): Under the Data Sharing Policy, all funding applications requesting $500,000 or more per year are expected to address data sharing in the application.
      • NSF (2011): All funding proposals submitted on or after January 18, 2011, must include a “Data Management Plan” describing how the proposal will conform to NSF policy on the dissemination and sharing of research results.
  • 3. International Mandates
      • Aug 2011: “expectation that all our funded researchers should maximise access to their research data with as few restrictions as possible. … submit a data management and sharing plan as part of the application process.”
      • 2007: “Researchers are to retain research data and primary materials, manage storage of research data and primary materials, maintain confidentiality of research data and primary materials.”
  • 4. Journal Mandates
      • Dec 2013 (PLOS data policy): “We ask you to make available the data underlying the findings in the paper, which would be needed by someone wishing to understand, validate or replicate the work. Our policy has not changed in this regard. What has changed is that we now ask you to say where the data can be found. As the PLOS data policy applies to all fields in which we publish, we recognize that we’ll need to work closely with authors in some subject areas to ensure adherence to the new policy. Some fields have very well established standards and practices around data, while others are still evolving, and we would like to work with any field that is developing data standards. We are aiming to ensure transparency about data availability.”
  • 5. Questions
      • Sharing data: how does it happen?
      • What is data publishing?
      • Is data archiving the same?
      • How can we find data, access it, and reuse it?
      • How can we measure the impact of sharing data?
      • What’s the common denominator?
  • 7. Paradigm Shift: The nature of research has become…
      • More quantitative/data-intensive
      • More funder-driven
      • More interdisciplinary/collaborative
      • More transparent
      • More complicated in terms of cross-linking
      • More diverse in terms of citable scholarly outputs
  • 8. Paradigm Shift: The focus of scholarly communication has changed…
      • From:
          • Preserve publications
          • Preserve data
          • Preserve both (at least separately)
      • To:
          • Preserve publications and data ‘together’
          • Preserve the ‘relationships’ among them
  • 9. Publishing and Archiving (diagram labels: Scholarly Communication; Availability; Citability; Validation; Scholarly Publishing; Data Archiving; Scholarly Publishing that includes ‘Data Publication’)
  • 11. Data Dissemination Methods Indicated in DMPs Written by UM Engineering Faculty (NSF Engineering Data Management Plan Analysis, N=156)
      • Journal publication: 42%
      • Faculty/project website: 36%
      • Conference presentation: 11%
      • “Upon request”: 11%
  • 12. Data Dissemination Methods
      • Submitted with journal article
      • Appear in journal article upon publication
      • Supplemental materials (including codebooks)
      • Websites (prior/post publication)
      • Institutional repositories (prior/post publication)
      • Data archive per discipline’s culture of sharing
      • Data repository (may be assigned by journal publishers)
      • Data papers in data journals (may be independent of the journal article)
      • “Data upon request” via email (some/all)
  • 13. Repository Directory Lists
      • IR
          • OpenDOAR (over 2600 academic open access repositories listed)
          • Deep Blue (University of Michigan Library)
      • DR
          • NIH Data Sharing Repositories (57 repositories)
          • Thomson Reuters Data Citation Index (174 repositories)
          • Databib (975 repositories listed)
          • re3data.org (609 repositories listed)
      • DataCite, re3data.org, and Databib have announced a collaboration towards one service under the auspices of DataCite by 2015
  • 14. Disciplinary Data Repositories: What to Look for?
      • Subject/Discipline focus
      • Hosted by…
      • Access to data: open vs. restricted
      • Deposit of data: open vs. restricted
      • Deposit fee
      • Persistent identifiers (DOI, hdl)
      • Sustainability & preservation policy
      • (Non-) Proprietary file formats
      • Amount of data description/metadata (data package level, file level, data item level)
      • Associated code/software
  • 15. More on Persistent IDs
      • The DOI (digital object identifier) is a system for persistently identifying and locating digital objects. It was originally designed and developed for journal articles and has been an ISO standard (ISO 26324) since 2012.
      • DOIs can be assigned only by DOI registration agencies, e.g. DataCite, CrossRef
      • Assigning a DOI is not free (e.g. about $1 per DOI via CrossRef in 2013)
      • DOI structure: prefix + suffix, e.g. the DOI for a dataset: http://doi.org/10.3886/ICPSR27282.v1
      • The DOI prefix is unique to each publisher/repository
          • ICPSR: 10.3886
          • UK Data Service: 10.5255
          • Figshare: 10.6084
          • PANGAEA: 10.1594
          • Dryad: 10.5061
      • Very similar to ‘handles’ in terms of persistence, e.g. U of M IR Deep Blue: http://hdl.handle.net/2027.42/106575
      • Moving towards “data with a DOI”, just as for scholarly articles
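The prefix + suffix structure of a DOI can be illustrated with a short Python sketch. This is only a minimal illustration: the function names `split_doi` and `doi_url` are hypothetical, the prefix table contains only the five prefixes listed on the slide, and the check is a simple shape test, not a full validation against ISO 26324.

```python
# Minimal sketch: split a DOI into its prefix and suffix.
# Function names are hypothetical; the prefix table holds only the
# examples listed on the slide.

# Leading forms that are stripped before splitting (assumed common spellings).
_LEADS = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

KNOWN_PREFIXES = {
    "10.3886": "ICPSR",
    "10.5255": "UK Data Service",
    "10.6084": "Figshare",
    "10.1594": "PANGAEA",
    "10.5061": "Dryad",
}


def split_doi(doi: str) -> tuple[str, str]:
    """Return (prefix, suffix) for a bare DOI, a 'doi:' string, or a doi.org URL."""
    cleaned = doi.strip()
    for lead in _LEADS:
        if cleaned.lower().startswith(lead):
            cleaned = cleaned[len(lead):]
            break
    # The prefix ends at the first slash; everything after it is the suffix.
    prefix, slash, suffix = cleaned.partition("/")
    if not slash or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    return prefix, suffix


def doi_url(doi: str) -> str:
    """Build a resolvable doi.org URL from any accepted DOI spelling."""
    prefix, suffix = split_doi(doi)
    return f"https://doi.org/{prefix}/{suffix}"


if __name__ == "__main__":
    prefix, suffix = split_doi("http://doi.org/10.3886/ICPSR27282.v1")
    print(prefix, suffix, KNOWN_PREFIXES.get(prefix, "unknown"))
```

Looking up the prefix in a table like `KNOWN_PREFIXES` is one simple way to see which repository registered a dataset DOI; a real lookup would rely on the registration agency's own services.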
  • 16. Data Repositories Let’s take a closer look at this example!
  • 17. Data Papers: Going beyond Appendices and Supplements
  • 18. Data Journals
      • Number of ‘data journals’: as of today, 70+ data journals*
      • Journal host: a) Authors b) Journals c) Publisher data repositories d) Data repositories (IR/DR)
      • Data journal article structure: a) Intro/Overview b) Methods c) Dataset description d) Reuse potential
      • Source: K. Akers and J. Green. Data Sharing and Publication. Presented at the Cyberinfrastructure (CI) Days Event, University of Michigan, Ann Arbor, MI, November 13-14, 2013.
      • *Note: For a full list of data journals that currently exist, see K. Akers’ blog post at: http://mlibrarydata.wordpress.com/2014/05/09/data-journals/
  • 19. Data Journal Example: Geoscience Data Journal by Wiley
      • Launched in Fall 2012
      • Published on behalf of the Royal Meteorological Society
      • OA with author-pay model ($1,500 per article)
      • Publishes short data papers cross-linked to (and citing) datasets that have been deposited in approved data centers/repositories and awarded DOIs.
      • A data article describes a dataset, giving details of its collection, processing, file formats, etc., but does not go into detail on any scientific analysis of the dataset or draw conclusions from the data.
      • The data paper should allow the reader to understand when, why, and how the data were collected, and what the data are.
  • 20. Data Journal Example (continued): Data centers/repositories approved by Geoscience Data Journal
      • 3TU.Datacentrum
      • British Atmospheric Data Centre (BADC)
      • British Oceanographic Data Centre (BODC)
      • CISL Research Data Archive
      • CSIRO Data Access Portal
      • Environmental Information Data Centre (EIDC)
      • Figshare
      • IEDA:EarthChem
      • IEDA:MGDS
      • National Center for Atmospheric Research (NCAR), USA
          • Earth Observing Lab (EOL), observational and supporting data from atmospheric science field experiments and arctic research
          • Research Data Archive (RDA), reference datasets for weather and climate research
      • National Geoscience Data Centre (NGDC)
      • NERC Earth Observation Data Centre (NEODC)
      • NOAA National Climatic Data Center (NCDC)
      • NOAA National Oceanographic Data Center (NODC)
      • NOAA National Geophysical Data Center (NGDC)
      • PANGAEA
      • Polar Data Centre (PDC)
      • Zenodo
  • 21. Data Journal Example (continued)
  • 22. Data Publisher Examples
      • Wiley
          • Geoscience Data Journal
      • Ubiquity Press
          • Journal of Open Archaeology Data
          • Journal of Open Psychology Data
          • Open Health Data
          • Journal of Open Research Software
      • Nature
          • Scientific Data
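  • 23. Data Journal Examples (to name only a few): Some Feature Comparison
      • Wiley, Geoscience Data Journal: open access: yes; publication fee per article: $1,500; publisher hosts data: no; approved data centers/repositories recommended for data deposit: yes; article is called: ‘Data Paper’; DOI: yes
      • Ubiquity Press, Journal of Open Archaeology Data: open access: yes; publication fee per article: $40; publisher hosts data: no; approved data centers/repositories recommended for data deposit: yes; article is called: ‘Data Paper’; DOI: yes
      • Nature Publishing Group, Scientific Data: open access: yes; publication fee per article: $700; publisher hosts data: no; approved data centers/repositories recommended for data deposit: yes; article is called: ‘Data Descriptor’; DOI: yes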
  • 25. ICPSR: Inter-university Consortium for Political and Social Research. Located on the U of M campus. www.icpsr.umich.edu
  • 27. Signs of a Trusted Repository
      • A unit of ISR, ICPSR is governed by a Council representing over 700 member institutions, including U of M
      • Long-term sustainability: “publishing” data for 52 years
      • Largest social science data repository in the US, with a catalog of over 8,000 studies containing thousands of files
      • Awarded the Data Seal of Approval from DANS
      • Federal agencies’ archives are housed at ICPSR and fully integrated with ICPSR’s collection
      • Data preservation standards are followed to keep data for the long term, guarding against deterioration, accidental loss, and digital obsolescence
      • Data are screened for confidentiality and privacy concerns. Stringent protections are in place for securing and distributing sensitive data.
      • Physical and virtual data enclaves for analyzing restricted-use data
  • 28. Rich Metadata for Better Access, Discovery, Context, and Reuse
      • ICPSR formats, organizes, and enhances deposited raw research data with meaningful metadata and documentation to make it complete, self-explanatory, and usable for future researchers
      • Study metadata and codebooks are generated according to the Data Documentation Initiative (DDI) XML standard
      • The online catalog can be searched and filtered using fielded metadata records to enhance discovery; side-by-side comparison uses structured variable-level documentation in XML, tagged according to the DDI standard
      • All studies are registered with a unique identifier, a DOI from DataCite. ICPSR has been providing citations to its data since 1990 and started assigning DOIs in 2008
  • 30. Open Sharing for DMP Proposals http://openicpsr.org/
  • 34. Top 10 Data Downloads (last six months) (non-anonymous, distinct users downloading one or more files). Columns: title; archive; number of downloads
      • National Longitudinal Study of Adolescent Health (Add Health), 1994-2008; DSDR; 1,188
      • General Social Survey, 1972-2012 [Cumulative File]; ICPSR; 737
      • Chinese Household Income Project, 2002; DSDR; 720
      • India Human Development Survey (IHDS), 2005; SAMHDA; 445
      • Collaborative Psychiatric Epidemiology Surveys (CPES), 2001-2003 [United States]; CPES; 407
      • National Survey on Drug Use and Health, 2012; SAMHDA; 314
      • Children of Immigrants Longitudinal Study (CILS), 1991-2006; DSDR; 289
      • National Crime Victimization Survey, 2012; NACJD; 260
      • National Prisoner Statistics, 1978-2011; NACJD; 249
      • Historical, Demographic, Economic, and Social Data: The United States, 1790-2002; ICPSR; 245
  • 35. Who uses these shared data? How are they used? With what impact?
  • 38. The ICPSR Bibliography of Data-related Literature
      • Links research data to the scholarly literature about it
      • Helps students, instructors, researchers, and funders discover and understand data use
      • A searchable database currently containing over 65,000 citations of known published and unpublished works resulting from analyses of data archived at ICPSR
      • Generates study bibliographies linking each study with the literature about it, and out to the full text
  • 39. Linking the Data to the Literature
  • 46. Altmetrics for research data
      • Easier to access and analyze much more research data online
      • New focus on sharing that research data
      • Increasing use of social media to discuss, via tweets, likes and blog posts
      • More online tools to download, collaborate and share, like Mendeley, Figshare, SlideShare, Dryad, ResearchGate, Deep Blue, and openICPSR
      • Dependent on good citation practice
  • 47. Publishers, altmetrics aggregators, and funders
      • Publishers: Springer; Elsevier; Wiley; Cambridge Journals; BMJ Journals; Nature Publishing Group; PLOS
      • Altmetrics aggregators: Altmetric; ImpactStory; Plum Analytics
      • Funders: NSF; Sloan Foundation**; MacMillan; EBSCO
      • **The Alfred P. Sloan Foundation helps fund ImpactStory, and is now funding the National Information Standards Organization (NISO) to develop standards and recommended best practices for altmetrics.
  • 48. ImpactStory: Product-level Metric
      • “New ways to measure the research impact . . . of emerging products like blog posts, datasets, and software . . . to build a new scholarly reward system that values and encourages web-native scholarship.”
      • Open metrics, with context, using diverse products to provide researchers with a “comprehensive impact report” of their research output
      • Source: https://impactstory.org/about
  • 51. Integration with Web of Science All Databases: Research data is equal to research literature
  • 52. Articles linked to underlying data. Increased data discovery. Reward for data citation. Potential for automated tracking.
  • 53. Elsevier Connect
      • “Elsevier is collaborating with a rapidly growing number of external data set repositories to optimize interoperability between their data sets and research articles on ScienceDirect. As part of the Article of the Future project, this reciprocal linking aims to expand the availability of research data and improve the researcher workflow.”
      • “Elsevier encourages authors to submit their data sets to external repositories. . . But not all authors know how or where to submit their data, and not all authors are aware of the possibilities that data linking offers. . . The recent agreement with Dryad Digital Repository marked the 35th data linking partnership Elsevier has established. . .”
      • Source: http://www.elsevier.com/connect/bringing-data-to-life-with-data-linking
  • 55. For Better Metrics on Research Data Impact
      • More aggregator and repository data need to be exposed for altmetric harvesters like ImpactStory
      • More integrated efforts are needed among libraries, publishers, archives, and funders. For example:
          • The Data Conservancy, IEEE, and Portico received an Alfred P. Sloan Foundation grant to connect publications and their linked data
  • 57. Formal Citation, in the References, with the DOI doi:10.3886/ICPSR21240
  • 59. No Common Practice of Formal Data Citation
      • Where do data get mentioned? Abstract? Acknowledgements? Charts and tables? Appendices? Discussion? Footnotes? Sample? Methods? References!
      • Without an explicit citation, the reader must infer or be out of luck
      • No attribution, no credit
      • No access, no reuse
      • No discernible impact!
  • 60. Examples of Bad Data Citation: Poorly described and cited data + Excessive human search effort, extensive collection knowledge = Too costly, too questionable for confident measure of impact
  • 62. Examples of Good Data Citation: Formal data citation with a DOI + Minimal human search effort = High hit accuracy for the cost, and better confidence in impact measures
  • 64. How to Cite Data: Basic Data Citation Format
      • Format: Creator (Year) Title. Publisher. Identifier
      • For datasets that have DOIs, DataCite and CrossRef provide a citation formatter to generate a citation in various journal styles.
      • Core elements:
          • Creator(s): Individual(s) or organization responsible for creating the dataset.
          • Year: Year the dataset was published, not necessarily created.
          • Title: Should be as descriptive as possible.
          • Publisher: Organization that provides access to the dataset (e.g. Dryad, Zenodo).
          • Identifier: Persistent, unique identifier (e.g. a DOI).
      • Source: http://datapub.cdlib.org/datacitation/
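The basic format above (Creator (Year) Title. Publisher. Identifier) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `DataCitation` class and its field names are hypothetical, it covers only the five core elements, and it does not reproduce the DataCite or CrossRef citation formatters or any particular journal style. The sample values come from the Dryad citation shown in this deck's data citation examples.

```python
# Minimal sketch of the "Creator (Year) Title. Publisher. Identifier" format.
# The class and field names are hypothetical, chosen for illustration only.
from dataclasses import dataclass


@dataclass
class DataCitation:
    creators: list[str]   # individual(s) or organization that created the dataset
    year: int             # year the dataset was published
    title: str            # descriptive title
    publisher: str        # organization providing access (e.g. a repository)
    identifier: str       # persistent identifier, e.g. a DOI

    def format(self) -> str:
        """Render the five core elements in the basic citation order."""
        creator_text = ", ".join(self.creators)
        title = self.title.rstrip(".")  # avoid a doubled full stop
        return (
            f"{creator_text} ({self.year}) {title}. "
            f"{self.publisher}. {self.identifier}"
        )


if __name__ == "__main__":
    citation = DataCitation(
        creators=["Sidlauskas B"],
        year=2007,
        title=(
            "Data from: Testing for unequal rates of morphological "
            "diversification in the absence of a detailed phylogeny: "
            "a case study from characiform fishes"
        ),
        publisher="Dryad Digital Repository",
        identifier="doi:10.5061/dryad.20",
    )
    print(citation.format())
```

Keeping the elements as separate fields, instead of one pre-formatted string, is what would let the same record be rendered in different journal styles.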
  • 65. How to Cite Data: Additional Elements
      • Location / Availability: The web address of the dataset is essential when the identifier can’t be used to reach the dataset.
      • Version / Edition: Version of the dataset used in the present publication. Needed to reproduce analysis of versioned dynamic datasets.
      • Access Date: Date of access for analysis in the present publication. Needed to reproduce analysis of continuously updated dynamic datasets.
      • Format / Material Designator: e.g., database, CD-ROM.
      • Feature Name: A description of the subset of the dataset used. May be a formal title or a list of variables (e.g., concentration, optical density).
      • Verifier: Used to confirm that two datasets are identical. Most commonly a UNF or MD5 checksum.
      • Series: Used if the dataset is part of a series of releases (e.g., monthly).
      • Contributor: e.g., editor, compiler.
      • Source: http://datapub.cdlib.org/datacitation/
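The "Verifier" element can be illustrated with an MD5 checksum computed with Python's standard `hashlib` module. This is a minimal sketch: the function names `md5_checksum` and `same_dataset` are hypothetical, and it covers only MD5. A UNF (Universal Numerical Fingerprint) is a different algorithm that works on the data values and is not computed here.

```python
# Minimal sketch: use an MD5 checksum as a verifier that two dataset files
# are byte-for-byte identical. Function names are hypothetical.
import hashlib


def md5_checksum(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_dataset(path_a: str, path_b: str) -> bool:
    """True when both files have the same MD5 checksum."""
    return md5_checksum(path_a) == md5_checksum(path_b)
```

A checksum like this only confirms that the bytes match; the same data saved in a different file format would produce a different MD5 value.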
  • 66. Data Citation Examples
      • Deschenes, Elizabeth Piper, Susan Turner, and Joan Petersilia. Intensive Community Supervision in Minnesota, 1990-1992: A Dual Experiment in Prison Diversion and Enhanced Supervised Release. ICPSR06849-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2000. doi:10.3886/ICPSR06849.v1
      • Esther Duflo; Rohini Pande, 2006, "Dams, Poverty, Public Goods and Malaria Incidence in India", http://hdl.handle.net/1902.1/IOJHHXOOLZ UNF:5:obNHHq1gtV400a4T+Xrp9g== Murray Research Archive [Distributor] V2 [Version]
      • Sidlauskas B (2007) Data from: Testing for unequal rates of morphological diversification in the absence of a detailed phylogeny: a case study from characiform fishes. Dryad Digital Repository. doi:10.5061/dryad.20
  • 67. Joint Declaration of Data Citation Principles 1. Future Of Research Communication and E-Scholarship (FORCE11) 2. Committee on Data for Science and Technology (CODATA) 3. Digital Curation Centre (DCC) Source: https://www.force11.org/datacitation
  • 68. Eight Principles
      • 1. Importance: Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications.
      • 2. Credit and Attribution: Data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data.
  • 69. Eight Principles
      • 3. Evidence: In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited.
      • 4. Unique Identification: A data citation should include a persistent method for identification that is machine actionable, globally unique, and widely used by a community.
  • 70. Eight Principles
      • 5. Access: Data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data.
      • 6. Persistence: Unique identifiers, and metadata describing the data, and its disposition, should persist, even beyond the lifespan of the data they describe.
  • 71. Eight Principles
      • 7. Specificity and Verifiability: Data citations should facilitate identification of, access to, and verification of the specific data that support a claim. Citations or citation metadata should include information about provenance and fixity sufficient to facilitate verifying that the specific timeslice, version and/or granular portion of data retrieved subsequently is the same as was originally cited.
  • 72. Eight Principles
      • 8. Interoperability and Flexibility: Data citation methods should be sufficiently flexible to accommodate the variant practices among communities, but should not differ so much that they compromise interoperability of data citation practices across communities.
  • 73. Make Your Data Count
      • If it’s not cited, it can’t be counted
      • Without counting data use, there is no accurate way to measure the impact of your shared data
      • Without a well-formed citation, your data cannot take advantage of the potential of linked scholarly publishing
      • Store your data where citations are unique and persistent
      • Cite your own data and others’ in your publications
  • 74. Questions Answered?
      • Sharing data: how does it happen?
      • What is data publishing?
      • Is data archiving the same?
      • How can we find data, access it, and reuse it?
      • How can we measure the impact of sharing data?
      • What’s the common denominator?