1. Adrian Burton & Kate Le May
Data sharing and biomedical journals
Australian National Data Service
6 June 2016
2. The Australian National Data Service (ANDS) makes
Australia’s research data assets more valuable for
researchers, research institutions and the nation.
3. Data is no longer a waste product of research
Image: carterzufeltdesign
5. “To the greatest extent and with the fewest
constraints possible publicly funded
scientific research data should be open,
while at the same time respecting concerns
in relation to privacy, safety, security and
commercial interests, whilst acknowledging
the legitimate concerns of private partners.”
G8 Science Ministers’ Statement 13 June 2013
6. “Australia’s capacity to remain competitive in the
digital economy is contingent upon its ability to
harness the value of data.”
“The Australian Government will…where possible,
ensure non-sensitive publicly funded research data is
made open for use and reuse”
Australian Government Public Data Policy Statement 7-Dec-2015
8. “3.2.2 To promote access to the benefits of research, such data should be
collected, stored and accessible in such a way that they can be used in future
research projects.”
National Statement on Ethical Conduct in Human Research
“A11.5.2 The ARC considers data management planning an important part of
the responsible conduct of research and strongly encourages the depositing
of data arising from a Project in an appropriate publically accessible subject
and/or institutional repository”
ARC Discovery Program
10. “PLOS journals require authors to make all data
underlying the findings described in their
manuscript fully available without restriction,
with rare exception.”
Data Availability Policy
11. “A condition of
publication in a Nature
journal is that authors
are required to make
materials, data, code,
and associated
protocols promptly
available to readers
without undue
qualifications….”
http://www.nature.com/authors/policies/availability.html
12. “there is an ethical obligation to responsibly share
data generated by interventional clinical trials
because participants have put themselves at
risk…”
“….proposes to require authors to share with
others the deidentified individual-patient data
underlying the results presented in the article…”
Proposed changes to the ICMJE Recommendations for the Conduct, Reporting, Editing,
and Publication of Scholarly Work in Medical Journals, January 2016
14. Acceptable Data-Sharing Methods
A compulsory data availability statement
must specify how data will be shared:
• Data deposition in a repository
(strongly recommended)
• Data in supporting information files
• Data made available to all interested
researchers upon request (only by
exception)
• Data available from a third party (only
by exception)
http://journals.plos.org/plosone/s/data-availability
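The four sharing methods above can be treated as a small checklist. The sketch below is illustrative only and is not part of any PLOS tooling: the class and field names (`SharingMethod`, `DataAvailabilityStatement`, `location`, `justification`) are hypothetical, and the rule that an "only by exception" method needs a written justification is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class SharingMethod(Enum):
    # The four methods listed on the slide.
    REPOSITORY = "Data deposition in a repository"
    SUPPORTING_INFORMATION = "Data in supporting information files"
    ON_REQUEST = "Data made available to all interested researchers upon request"
    THIRD_PARTY = "Data available from a third party"


# Methods the slide marks as "only by exception".
EXCEPTION_ONLY = {SharingMethod.ON_REQUEST, SharingMethod.THIRD_PARTY}


@dataclass
class DataAvailabilityStatement:
    method: SharingMethod
    location: str = ""       # hypothetical field: DOI, accession number or contact point
    justification: str = ""  # hypothetical field: why an exception is needed


def check_statement(stmt: DataAvailabilityStatement) -> List[str]:
    """Return a list of problems; an empty list means this sketch found none."""
    problems = []
    if not stmt.location.strip():
        problems.append("statement does not say where or how the data can be obtained")
    # Assumption for this sketch: exception-only methods need a stated reason.
    if stmt.method in EXCEPTION_ONLY and not stmt.justification.strip():
        problems.append(f"'{stmt.method.value}' is only by exception; add a justification")
    return problems
```

For instance, a statement using `SharingMethod.REPOSITORY` with a non-empty `location` returns no problems, while an `ON_REQUEST` statement without a justification is flagged.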
16. Joint Data Archiving Policy (JDAP)
“[Journal] requires, as a condition for
publication, that data supporting the results in
the paper should be archived in an
appropriate public archive, such as [list of
approved archives here]. Data are important
products of the scientific enterprise, and they
should be preserved and usable for decades in
the future.”
http://datadryad.org/pages/jdap
17. Joint Data Archiving Policy (JDAP)
“Authors may elect to have the data publicly
available at time of publication, or, if the
technology of the archive allows, may opt to
embargo access to the data for a period up to
a year after publication. Exceptions may be
granted at the discretion of the editor,
especially for sensitive information such as
human subject data or the location of
endangered species.”
http://datadryad.org/pages/jdap
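The embargo rule quoted above ("a period up to a year after publication") can be sketched as a date check. This is a minimal illustration, assuming "a year" means one calendar year; the function names are hypothetical, and editor-granted exceptions are outside its scope.

```python
from datetime import date


def latest_embargo_end(publication: date) -> date:
    """Latest release date under a one-calendar-year embargo (assumed reading)."""
    try:
        return publication.replace(year=publication.year + 1)
    except ValueError:
        # Published on 29 February: the following year has no such day.
        return publication.replace(year=publication.year + 1, day=28)


def embargo_allowed(publication: date, requested_release: date) -> bool:
    """True if the requested release date falls within the embargo window."""
    return publication <= requested_release <= latest_embargo_end(publication)
```

Under these assumptions, an article published on 6 June 2016 could keep its data embargoed until 6 June 2017 at the latest.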
19. Directions in national data infrastructure
• NCRIS
• ANDS, RDS
• Institutional capacity
• EU Science Cloud
• MRFF?
20. http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories
Unstructured and/or Large Data
• Dryad Digital Repository
• figshare
• GigaDB
• Harvard Dataverse Network
• Open Science Framework
• Zenodo
Sequencing
• dbSNP
• dbVar
• Database of Genomic Variants Archive (DGVa)
• DNA DataBank of Japan (DDBJ)
• EBI Metagenomics
• EMBL Nucleotide Sequence Database (ENA)
• European Variation Archive (EVA)
• GenBank
• miRBase
• NCBI Sequence Read Archive (SRA)
• NCBI Trace Archive
• UniProt
Omics
• ArrayExpress
• Biological General Repository for Interaction Datasets (BioGRID)
• Database of Interacting Proteins (DIP)
• dbGaP
• The European Genome-phenome Archive (EGA)
• IntAct Molecular Interaction Database
• Gene Expression Omnibus (GEO)
• GenomeRNAi
• GPM DB
• MetaboLights
• PeptideAtlas
• Proteomics Identifications (PRIDE)
• ProteomeXchange
Structural Databases
• Biological Magnetic Resonance Data Bank (BMRB)
• Crystallography Open Database (COD)
• Coherent X-ray Imaging Data Bank (CXIDB)
• Electron Microscopy Data Bank (EMDB)
• FlowRepository
• Protein Circular Dichroism Data Bank (PCDDB)
• Worldwide Protein Data Bank (wwPDB)
Neuroscience
21. Citation metrics for research data
http://wokinfo.com/products_tools/multidisciplinary/dci/
22. Citation metrics for data
As for journal articles, data citation metrics are:
• a measure of impact and reuse
• based on citations in reference lists
• facilitated by consistent citation format
e.g. Stueckle, TA; Lu, Y; Davis, ME (2012): Whole genome expression profile of lung epithelial cells following chronic arsenic exposure. ArrayExpress Archive. http://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-33520
ANDS is collaborating with Thomson Reuters to grow
Australian content in the Data Citation Index.
http://www.ands.org.au/online-services/research-data-australia/data-citation-index
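A consistent citation format is easy to generate programmatically. The sketch below reproduces the pattern of the example citation on this slide (authors, year, title, repository, identifier); the function name and parameters are hypothetical and do not come from ANDS or the Data Citation Index.

```python
from typing import Sequence


def format_data_citation(
    authors: Sequence[str],
    year: int,
    title: str,
    publisher: str,
    identifier: str,
) -> str:
    """Build a citation of the form 'Authors (Year): Title. Publisher. Identifier'."""
    author_part = "; ".join(authors)
    # Strip a trailing full stop from the title so it is not doubled.
    return f"{author_part} ({year}): {title.rstrip('.')}. {publisher}. {identifier}"


citation = format_data_citation(
    authors=["Stueckle, TA", "Lu, Y", "Davis, ME"],
    year=2012,
    title="Whole genome expression profile of lung epithelial cells "
          "following chronic arsenic exposure",
    publisher="ArrayExpress Archive",
    identifier="http://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-33520",
)
print(citation)
```

Generating every citation from the same fields keeps the format uniform, which is what makes citations in reference lists countable.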
24. ANDS materials
Guides
• Publishing and sharing sensitive data
• Ethics, consent and data sharing
• De-identification
• Data citation
• Copyright, data and licensing
Medical and Health data page
http://www.ands.org.au/working-with-data/enabling-data-reuse/medical-and-health
25. ANDS Guide to publishing sensitive data
• How to confidentialise data?
• Conditional access
• Ethics and consent
• Licensing
• Repositories
• Ownership
http://ands.org.au/guides/sensitivedata
28. adrian.burton@ands.org.au
kate.lemay@ands.org.au
With the exception of logos, third party images or where otherwise indicated, this
work is licensed under the Creative Commons Australia Attribution 3.0 Licence.
ANDS is supported by the Australian
Government through the National Collaborative
Research Infrastructure Strategy Program.
Monash University leads the partnership with
the Australian National University and CSIRO.