Better Data = Better Science
Susanna-Assunta Sansone, PhD
Consultant, Honorary Academic Editor
Associate Director, Principal Investigator
@biosharing
@isatools
NC3Rs Publication Bias Workshop,
London, 24-25 February, 2015
http://www.slideshare.net/SusannaSansone
Plagued by selective reporting of data and methods
Why? For example:
•  Researchers still have little or no motivation
o  Focus on big discovery and impact; because they “have to”
•  Hypothesis-confirming results get prioritized
o  Difficulties with reviews of other results
•  Agreements, disagreements and timing
o  Unclear or lack of data sharing agreements and timing of disclosure
•  Loose requirements and monitoring by journals and funders
o  Publish and release just enough; keep the rest, move to next grant
Are open data and methods understandable, reusable?
Not always…
•  Outputs are multi-dimensional, diverse, not always well cited / stored
•  Software, code, workflows etc. are hard(er) to get hold of
•  Data often distributed and fragmented to fit (siloed) databases
o  Do not contain enough information for others to understand them
•  Uneven level of details and annotation across different databases
o  Specialized, generalist, public and institutional
•  Data curation activities are perceived as time-consuming
o  Collection and harmonization of detailed methods and experimental steps are done/rushed at the publication stage
Worldwide movement for FAIR data
Role of data papers / data journals
•  Incentive, credit for sharing
•  Data-focused peer review
•  Value of data vs. analysis, results
•  Support of the FAIR concept
Market research (2011)
•  What do researchers want from a data publication?
o  96% - increased visibility and discovery
o  95% - increased usability of their research data
o  93% - credit mechanism for deposit of data
o  80% - peer review of content/datasets
Respondent characteristics
387 respondents (329 active researchers)
Physics (24%)
Earth and environmental science (21%)
Biology (20%)
Chemistry (19%)
Others (16%)
Publishers occupy a leverage point because of the importance of formal publications in the academic incentive structure
Role of publishers as “agents of change”
Helping you publish, discover and reuse research data
Credit for sharing your data
Focused on reuse and reproducibility
Peer reviewed, curated
Promoting community data and code repositories
Open Access
•  Currently covering life, natural and environmental sciences
•  Big and small data
o  the power of small data lies in their aggregation and integration with other datasets
•  New and previously published individual datasets, curated collections and citizen science
o  a fuller, more in-depth look at the data processing steps, additional data files, code etc.
o  tutorial-like information for scientists interested in reusing or integrating the data with their own
Methods and technical analyses supporting the quality of the measurements:
What did I do to generate the data?
How was the data processed?
Where is the data?
Who did what when?
How can the data be used or reused?
Introducing a new content type: Data Descriptor
Designed to make data more FAIR
Focused mainly on:
•  Methods
•  Technical Validation
•  Data Records
•  Usage Notes
Scientific hypotheses:
Synthesis
Analysis
Conclusions
Methods and technical analyses supporting the quality of the measurements:
What did I do to generate the data?
How was the data processed?
Where is the data?
Who did what when?
How can the data be used or reused?
Relation with traditional article - content
AFTER: expand on your research articles, adding further information for reuse of the data
AT THE SAME TIME: publish your Data Descriptor(s) alongside research article(s)
OR BEFORE
Relation with traditional article - time
Publish data
Code in GitHub
Data in OpenfMRI
Share your data, get credited and cited
Evaluation is not based on the perceived impact or novelty of the findings or the size of the data
•  Experimental rigour and technical data quality
o  Methodologically sound
o  Technical validation experiments and statistical analyses
o  Depth, coverage, size, and/or completeness of data sufficient for the types of applications
•  Completeness of the description
o  Sufficient details to allow others to reproduce the results, or reuse or integrate the data with other data
o  Compliance with relevant minimum information or reporting standards
•  Integrity of the data files and repository record
o  Data files match the descriptions in the Data Descriptor
o  Deposited in the most appropriate available databases
Peer review process focused on quality and reuse
Experimental metadata or structured component (in-house curated, machine-readable formats)
Article or narrative component (PDF and HTML)
Data Descriptor: narrative and structure
Sections:
•  Title
•  Abstract
•  Background & Summary
•  Methods
•  Technical Validation
•  Data Records
•  Usage Notes
•  Figures & Tables
•  References
•  Data Citations
Focus on data reuse
Detailed descriptions of the methods and technical analyses supporting the quality of the measurements.
Does not contain tests of new scientific hypotheses
Joint Declaration of Data Citation Principles by the
Data Citation Synthesis Group
Data Descriptor: narrative
In-house editorial curator assists authors via
•  Excel spreadsheet templates
•  internal authoring tool
to create the structured component, also performing value-added semantic annotation
analysis, method, script
Data file or record in a database
Data Descriptor: structure (CC0)
Because we do not want cryptic experimental info, e.g.:
Sample name (?!)    Data file
LS1_C2_LD_TP2_P1    file1-fastq.gz
…how not to report the experimental information
•  LS1: liver sample 1
•  C2: compound 2
•  LD: low dose
•  TP2: time point 2
•  P1: protocol 1
•  file1-fastq.gz: compressed data file for sequence information corresponding to this sample
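As a rough illustration of the point, the cryptic sample name can be expanded into labelled fields. This is only a sketch: the code meanings come from the decoded example above, while the function name `decode_sample_name` and the column names are hypothetical choices made for the example, not part of any Scientific Data tool or format.

```python
import re

# Meaning of each code, as decoded in the example above.
# The column names (first item of each pair) are illustrative assumptions.
CODES = {
    "LS": ("sample", "liver sample"),
    "C": ("compound", "compound"),
    "LD": ("dose", "low dose"),
    "TP": ("time_point", "time point"),
    "P": ("protocol", "protocol"),
}


def decode_sample_name(name):
    """Expand an underscore-separated sample name into labelled fields."""
    record = {}
    for token in name.split("_"):
        # Each token is a run of capital letters optionally followed by a number.
        match = re.fullmatch(r"([A-Z]+)(\d*)", token)
        if match is None or match.group(1) not in CODES:
            raise ValueError(f"unknown code: {token!r}")
        letters, number = match.groups()
        column, label = CODES[letters]
        record[column] = f"{label} {number}".strip()
    return record


row = decode_sample_name("LS1_C2_LD_TP2_P1")
row["data_file"] = "file1-fastq.gz"
print(row)
```

Reporting the expanded fields alongside the data file, rather than the packed name alone, is what lets a reader (or a program) understand the sample without asking the author for the key.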
Structured component: key information from narrative
Seven week old C57BL/6N mice were treated
with low-fat diet.
Liver was dissected out, hepatocytes prepared…
•  Age value: Seven
•  Unit: week
•  Strain name: C57BL/6N
•  Subject of the experiment: mice
•  Type of diet and experimental condition: low-fat diet
•  Anatomy part: Liver
From natural language to ‘computable’ concepts
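One way to picture the step from narrative to 'computable' concepts is to store each tagged span with its concept label and its position in the sentence. The sketch below is an assumption-laden illustration: the pairing of labels with text spans is read from the slide layout, and the `Annotation` class and `locate` helper are hypothetical names, not the curators' actual tooling. A real pipeline would also attach ontology term identifiers, which are omitted here.

```python
from dataclasses import dataclass, asdict


@dataclass
class Annotation:
    concept: str  # kind of information, as labelled on the slide
    text: str     # span of the narrative that carries it


sentence = (
    "Seven week old C57BL/6N mice were treated with low-fat diet. "
    "Liver was dissected out, hepatocytes prepared"
)

# Label-to-span pairing is an assumption based on the slide layout.
annotations = [
    Annotation("age value", "Seven"),
    Annotation("unit", "week"),
    Annotation("strain name", "C57BL/6N"),
    Annotation("subject of the experiment", "mice"),
    Annotation("type of diet and experimental condition", "low-fat diet"),
    Annotation("anatomy part", "Liver"),
]


def locate(sentence, annotations):
    """Return each annotation together with the character offsets of its span."""
    located = []
    for annotation in annotations:
        start = sentence.find(annotation.text)
        if start < 0:
            raise ValueError(f"span not found: {annotation.text!r}")
        located.append(
            {**asdict(annotation), "start": start, "end": start + len(annotation.text)}
        )
    return located


for item in locate(sentence, annotations):
    print(item)
```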
Type of protocol – cell preparation
Type of protocol – sample treatment
Type of protocol – liver preparation
Semantic tagging key information
What does a structured component add?
•  Supplements the scientific discourse
o  natural language has a degree of ambiguity
•  Brings clarity in reporting research methods and procedures
o  no trimming, no cooking
o  clear samples to data files links and relation to methods
•  Provides the basis for search and discovery features
SciData DD structured content
Same tissue
Same organism
Same assay
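A minimal sketch of how structured content could support this kind of discovery: once every Data Descriptor carries the same labelled fields, descriptors sharing a tissue, organism or assay can be found by simple grouping. The records and identifiers below are made up purely for illustration, and `group_by` is a hypothetical helper, not a Scientific Data feature.

```python
from collections import defaultdict

# Made-up records standing in for the structured content of Data Descriptors.
descriptors = [
    {"id": "DD-1", "organism": "Mus musculus", "tissue": "liver", "assay": "RNA-seq"},
    {"id": "DD-2", "organism": "Mus musculus", "tissue": "liver", "assay": "proteomics"},
    {"id": "DD-3", "organism": "Homo sapiens", "tissue": "liver", "assay": "RNA-seq"},
]


def group_by(records, field):
    """Map each value of `field` to the ids of the records that share it."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record["id"])
    return dict(groups)


same_tissue = group_by(descriptors, "tissue")
same_organism = group_by(descriptors, "organism")
same_assay = group_by(descriptors, "assay")
print(same_tissue, same_organism, same_assay, sep="\n")
```

The same grouping would not be reliable on free-text descriptions, which is why the structured component uses consistent, machine-readable values.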
Community data repositories
Citation of and link to data files and databases
Research papers
Data records
Data Descriptors
We currently recognize over 60 public data repositories
Helping the authors to find the right place for the data
Repositories criteria
1.  Broad support and recognition within their scientific community
2.  Ensure long-term persistence and preservation of datasets
3.  Provide expert curation
4.  Implement relevant, community-endorsed reporting requirements
5.  Provide for confidential review of submitted datasets
6.  Provide stable identifiers for submitted datasets
7.  Allow public access to data without unnecessary restrictions
~ 156
~ 70
~ 334
Source: BioPortal
Databases implementing standards
MIAME, MIAPA, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE…, REMARK, CONSORT
MAGE-Tab, GCDML, SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML…, GELML, ISA-Tab, CML, MITAB
AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO…, TEDDY, PRO, XAO, DO, VO
Progressively refine guidance to authors and reviewers
Mapping the landscape of standards and databases
Nature 515, 312 (20 November 2014) doi:10.1038/515312a
http://www.nature.com/news/data-access-practices-strengthened-1.16370
Key part of NPG data access & reproducible research policies
Responsibilities lie across several stakeholder groups
Understand the benefits of sharing FAIR datasets and enact them
Engage and assist researchers to enable them to share FAIR datasets
Release or endorse practices and policies, but also incentive and credit mechanisms for researchers, curators and developers
Acknowledgements
Visit nature.com/scientificdata
Email scientificdata@nature.com
Tweet @ScientificData
Honorary Academic Editor: Susanna-Assunta Sansone, PhD
Managing Editor: Andrew L Hufton, PhD
Editorial Curator: Varsha Khodiyar
Publisher: Iain Hrynaszkiewicz
Advisory Panel and Editorial Board including senior researchers, funders, librarians and curators
and our Advisory Boards and Collaborators
Funds:
Philippe Rocca-Serra, PhD, Senior Research Lecturer
Alejandra Gonzalez-Beltran, PhD, Research Lecturer
Eamonn Maguire, DPhil, Contractor
Milo Thurston, PhD, Senior Bioinformatician
Allyson Lister, PhD, Knowledge Engineer
Alfie Abdul-Rahman, PhD, Research Software Engineer

Weitere ähnliche Inhalte

Was ist angesagt?

Publishing and impact 20141028
Publishing and impact 20141028Publishing and impact 20141028
Publishing and impact 20141028Hugo Besemer
 
Introduction to DATS v2.2 - NIH May 2017
Introduction to DATS v2.2 - NIH May 2017Introduction to DATS v2.2 - NIH May 2017
Introduction to DATS v2.2 - NIH May 2017Susanna-Assunta Sansone
 
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Merce Crosas
 
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Merce Crosas
 
Best practices data collection
Best practices data collectionBest practices data collection
Best practices data collectionSherry Lake
 
No more waiting! Tools that work Today to reveal dataset use
No more waiting!  Tools that work Today to reveal dataset useNo more waiting!  Tools that work Today to reveal dataset use
No more waiting! Tools that work Today to reveal dataset useHeather Piwowar
 
Research data management workshop april12 2016
Research data management workshop april12 2016 Research data management workshop april12 2016
Research data management workshop april12 2016 Rebecca Raworth, MLIS
 
Is that a scientific report or just some cool pictures from the lab? Reproduc...
Is that a scientific report or just some cool pictures from the lab? Reproduc...Is that a scientific report or just some cool pictures from the lab? Reproduc...
Is that a scientific report or just some cool pictures from the lab? Reproduc...Greg Landrum
 
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesApplication of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesPistoia Alliance
 
Introduction to FundRef Webinar
Introduction to FundRef WebinarIntroduction to FundRef Webinar
Introduction to FundRef WebinarCrossref
 
Open Science: Research Data Management
Open Science: Research Data ManagementOpen Science: Research Data Management
Open Science: Research Data ManagementLibrary_Connect
 
Knowledge graph construction for research & medicine
Knowledge graph construction for research & medicineKnowledge graph construction for research & medicine
Knowledge graph construction for research & medicinePaul Groth
 
Data for Science: How Elsevier is using data science to empower researchers
Data for Science: How Elsevier is using data science to empower researchersData for Science: How Elsevier is using data science to empower researchers
Data for Science: How Elsevier is using data science to empower researchersPaul Groth
 
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds
 
Introduction to CrossRef for Affiliates
Introduction to CrossRef for AffiliatesIntroduction to CrossRef for Affiliates
Introduction to CrossRef for AffiliatesCrossref
 

Was ist angesagt? (20)

Publishing and impact 20141028
Publishing and impact 20141028Publishing and impact 20141028
Publishing and impact 20141028
 
Introduction to DATS v2.2 - NIH May 2017
Introduction to DATS v2.2 - NIH May 2017Introduction to DATS v2.2 - NIH May 2017
Introduction to DATS v2.2 - NIH May 2017
 
Why managedata
Why managedataWhy managedata
Why managedata
 
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
 
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
 
Best practices data collection
Best practices data collectionBest practices data collection
Best practices data collection
 
Payton Eliminating Conflicts in Ebook Metadata
Payton Eliminating Conflicts in Ebook MetadataPayton Eliminating Conflicts in Ebook Metadata
Payton Eliminating Conflicts in Ebook Metadata
 
No more waiting! Tools that work Today to reveal dataset use
No more waiting!  Tools that work Today to reveal dataset useNo more waiting!  Tools that work Today to reveal dataset use
No more waiting! Tools that work Today to reveal dataset use
 
Research data management workshop april12 2016
Research data management workshop april12 2016 Research data management workshop april12 2016
Research data management workshop april12 2016
 
Is that a scientific report or just some cool pictures from the lab? Reproduc...
Is that a scientific report or just some cool pictures from the lab? Reproduc...Is that a scientific report or just some cool pictures from the lab? Reproduc...
Is that a scientific report or just some cool pictures from the lab? Reproduc...
 
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesApplication of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
 
Introduction to FundRef Webinar
Introduction to FundRef WebinarIntroduction to FundRef Webinar
Introduction to FundRef Webinar
 
Open Science: Research Data Management
Open Science: Research Data ManagementOpen Science: Research Data Management
Open Science: Research Data Management
 
Knowledge graph construction for research & medicine
Knowledge graph construction for research & medicineKnowledge graph construction for research & medicine
Knowledge graph construction for research & medicine
 
NISO Working Group Connection Live! Research Data Metrics Landscape: An Updat...
NISO Working Group Connection Live! Research Data Metrics Landscape: An Updat...NISO Working Group Connection Live! Research Data Metrics Landscape: An Updat...
NISO Working Group Connection Live! Research Data Metrics Landscape: An Updat...
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
Data for Science: How Elsevier is using data science to empower researchers
Data for Science: How Elsevier is using data science to empower researchersData for Science: How Elsevier is using data science to empower researchers
Data for Science: How Elsevier is using data science to empower researchers
 
Critical infrastructure to promote data synthesis
Critical infrastructure to promote data synthesis Critical infrastructure to promote data synthesis
Critical infrastructure to promote data synthesis
 
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
 
Introduction to CrossRef for Affiliates
Introduction to CrossRef for AffiliatesIntroduction to CrossRef for Affiliates
Introduction to CrossRef for Affiliates
 

Andere mochten auch

Scientific Data overview of Data Descriptors - WT Data-Literature integration...
Scientific Data overview of Data Descriptors - WT Data-Literature integration...Scientific Data overview of Data Descriptors - WT Data-Literature integration...
Scientific Data overview of Data Descriptors - WT Data-Literature integration...Susanna-Assunta Sansone
 
Workflows for Publishing Data; Scientific Data's experience as an early adopter
Workflows for Publishing Data; Scientific Data's experience as an early adopterWorkflows for Publishing Data; Scientific Data's experience as an early adopter
Workflows for Publishing Data; Scientific Data's experience as an early adopterVarsha Khodiyar
 
PUBLICATION BIAS & NEGATIVE RESULTS
PUBLICATION BIAS & NEGATIVE RESULTSPUBLICATION BIAS & NEGATIVE RESULTS
PUBLICATION BIAS & NEGATIVE RESULTSVineetha K
 
Curriculum Mapping & Analysis: Basic Definitions
Curriculum Mapping & Analysis: Basic DefinitionsCurriculum Mapping & Analysis: Basic Definitions
Curriculum Mapping & Analysis: Basic DefinitionsDr. Suad Alazzam
 

Andere mochten auch (8)

Scientific Data overview of Data Descriptors - WT Data-Literature integration...
Scientific Data overview of Data Descriptors - WT Data-Literature integration...Scientific Data overview of Data Descriptors - WT Data-Literature integration...
Scientific Data overview of Data Descriptors - WT Data-Literature integration...
 
Workflows for Publishing Data; Scientific Data's experience as an early adopter
Workflows for Publishing Data; Scientific Data's experience as an early adopterWorkflows for Publishing Data; Scientific Data's experience as an early adopter
Workflows for Publishing Data; Scientific Data's experience as an early adopter
 
PUBLICATION BIAS & NEGATIVE RESULTS
PUBLICATION BIAS & NEGATIVE RESULTSPUBLICATION BIAS & NEGATIVE RESULTS
PUBLICATION BIAS & NEGATIVE RESULTS
 
Difusión y visibilidad de la producción científica en la web def
Difusión y visibilidad de la producción científica en la web defDifusión y visibilidad de la producción científica en la web def
Difusión y visibilidad de la producción científica en la web def
 
Introduction to meta analysis
Introduction to meta analysisIntroduction to meta analysis
Introduction to meta analysis
 
Curriculum Mapping
Curriculum MappingCurriculum Mapping
Curriculum Mapping
 
Curriculum Mapping & Analysis: Basic Definitions
Curriculum Mapping & Analysis: Basic DefinitionsCurriculum Mapping & Analysis: Basic Definitions
Curriculum Mapping & Analysis: Basic Definitions
 
Introduction to Curriculum Mapping
Introduction to Curriculum MappingIntroduction to Curriculum Mapping
Introduction to Curriculum Mapping
 

Ähnlich wie Better Data Journal Articles

SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...Susanna-Assunta Sansone
 
FAIR data and NPG Scientific Data: RIKEN Yokohama, 25 June, 2014
FAIR data and NPG Scientific Data: RIKEN Yokohama, 25 June, 2014FAIR data and NPG Scientific Data: RIKEN Yokohama, 25 June, 2014
FAIR data and NPG Scientific Data: RIKEN Yokohama, 25 June, 2014Susanna-Assunta Sansone
 
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...Susanna-Assunta Sansone
 
Oxford DTP - Sansone curation tools - Dec 2014
Oxford DTP - Sansone curation tools - Dec 2014Oxford DTP - Sansone curation tools - Dec 2014
Oxford DTP - Sansone curation tools - Dec 2014Susanna-Assunta Sansone
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Susanna-Assunta Sansone
 
Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Paul Groth
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataSusanna-Assunta Sansone
 
Big data, small data, data papers - short statement for "BDebate on Biomedici...
Big data, small data, data papers - short statement for "BDebate on Biomedici...Big data, small data, data papers - short statement for "BDebate on Biomedici...
Big data, small data, data papers - short statement for "BDebate on Biomedici...Susanna-Assunta Sansone
 
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...AKSHAY BHAGAT
 
Managing Big Data - Berlin, July 9-10, 201.
Managing Big Data - Berlin, July 9-10, 201.Managing Big Data - Berlin, July 9-10, 201.
Managing Big Data - Berlin, July 9-10, 201.Susanna-Assunta Sansone
 
Scientific Data and peer review session at Dryad event, May 2015
Scientific Data and peer review session at Dryad event, May 2015 Scientific Data and peer review session at Dryad event, May 2015
Scientific Data and peer review session at Dryad event, May 2015 Susanna-Assunta Sansone
 
HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 Scott Edmunds
 
Open Science for sustainability and inclusiveness: the SKA role model
 Open Science for sustainability and inclusiveness: the SKA role model Open Science for sustainability and inclusiveness: the SKA role model
Open Science for sustainability and inclusiveness: the SKA role modelLourdes Verdes-Montenegro
 
Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Susanna-Assunta Sansone
 
How to share useful data
How to share useful dataHow to share useful data
How to share useful dataPeter McQuilton
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
 
RDA - Long Tail Data Interest Group - NPG Scientitic Data oveview
RDA - Long Tail Data Interest Group - NPG Scientitic Data oveviewRDA - Long Tail Data Interest Group - NPG Scientitic Data oveview
RDA - Long Tail Data Interest Group - NPG Scientitic Data oveviewSusanna-Assunta Sansone
 
Guy avoiding-dat apocalypse
Guy avoiding-dat apocalypseGuy avoiding-dat apocalypse
Guy avoiding-dat apocalypseENUG
 
FAIR BioData Management
FAIR BioData ManagementFAIR BioData Management
FAIR BioData ManagementUlrike Wittig
 

Ähnlich wie Better Data Journal Articles (20)

SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...
SciDataCon 2014 Data Papers and their applications workshop - NPG Scientific ...
 
FAIR data and NPG Scientific Data: RIKEN Yokohama, 25 June, 2014
FAIR data and NPG Scientific Data: RIKEN Yokohama, 25 June, 2014FAIR data and NPG Scientific Data: RIKEN Yokohama, 25 June, 2014
FAIR data and NPG Scientific Data: RIKEN Yokohama, 25 June, 2014
 
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu...
 
Oxford DTP - Sansone curation tools - Dec 2014
Oxford DTP - Sansone curation tools - Dec 2014Oxford DTP - Sansone curation tools - Dec 2014
Oxford DTP - Sansone curation tools - Dec 2014
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014
 
Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
 
Big data, small data, data papers - short statement for "BDebate on Biomedici...
Big data, small data, data papers - short statement for "BDebate on Biomedici...Big data, small data, data papers - short statement for "BDebate on Biomedici...
Big data, small data, data papers - short statement for "BDebate on Biomedici...
 
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
 
Managing Big Data - Berlin, July 9-10, 201.
Managing Big Data - Berlin, July 9-10, 201.Managing Big Data - Berlin, July 9-10, 201.
Managing Big Data - Berlin, July 9-10, 201.
 
Scientific Data and peer review session at Dryad event, May 2015
Scientific Data and peer review session at Dryad event, May 2015 Scientific Data and peer review session at Dryad event, May 2015
Scientific Data and peer review session at Dryad event, May 2015
 
HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9 HKU Data Curation MLIM7350 Class 9
HKU Data Curation MLIM7350 Class 9
 
Open Science for sustainability and inclusiveness: the SKA role model
 Open Science for sustainability and inclusiveness: the SKA role model Open Science for sustainability and inclusiveness: the SKA role model
Open Science for sustainability and inclusiveness: the SKA role model
 
Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015
 
How to share useful data
How to share useful dataHow to share useful data
How to share useful data
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
RDA - Long Tail Data Interest Group - NPG Scientitic Data oveview
RDA - Long Tail Data Interest Group - NPG Scientitic Data oveviewRDA - Long Tail Data Interest Group - NPG Scientitic Data oveview
RDA - Long Tail Data Interest Group - NPG Scientitic Data oveview
 
Guy avoiding-dat apocalypse
Guy avoiding-dat apocalypseGuy avoiding-dat apocalypse
Guy avoiding-dat apocalypse
 
FAIR BioData Management
FAIR BioData ManagementFAIR BioData Management
FAIR BioData Management
 
Research-Data-Management-and-your-PhD
Research-Data-Management-and-your-PhDResearch-Data-Management-and-your-PhD
Research-Data-Management-and-your-PhD
 

Better Data Journal Articles

  • 1. Better Data = Better Science. Susanna-Assunta Sansone, PhD (Consultant, Honorary Academic Editor; Associate Director, Principal Investigator). @biosharing, @isatools. NC3Rs Publication Bias Workshop, London, 24-25 February 2015. http://www.slideshare.net/SusannaSansone
  • 2. Plagued by selective reporting of data and methods
  • 3. Plagued by selective reporting of data and methods. Why? For example: •  Researchers still have no or insufficient motivation o  Focus on big discovery and impact; because they “have to” •  Hypothesis-confirming results get prioritized o  Difficulties with reviews of other results •  Agreements, disagreements and timing o  Unclear or missing data sharing agreements and timing of disclosure •  Loose requirements and monitoring by journals and funders o  Publish and release just enough; keep the rest, move to next grant
  • 4. Are open data and methods understandable, reusable?
  • 5. Are open data and methods understandable, reusable? Not always… •  Outputs are multi-dimensional, diverse, not always well cited / stored •  Software, codes, workflows etc. are hard(er) to get hold of •  Data are often distributed and fragmented to fit (siloed) databases o  They do not contain enough information for others to understand them •  Uneven level of detail and annotation across different databases o  Specialized, generalist, public and institutional •  Data curation activities are perceived as time consuming o  Collection and harmonization of detailed methods and experimental steps is done/rushed at publication stage
  • 7. Role of data papers / data journals •  Incentive, credit for sharing •  Data-focused peer review •  Value of data vs. analysis, results •  Support of the FAIR concept
  • 8. market research (2011) •  What do researchers want from a data publication? o  96% - increased visibility and discovery o  95% - increased usability of their research data o  93% - credit mechanism for deposit of data o  80% - peer review of content/datasets. Respondent characteristics: 387 respondents (329 active researchers); Physics (24%), Earth and environmental science (21%), Biology (20%), Chemistry (19%), Others (16%)
  • 9. Because of the importance of formal publications in the academic incentive structure, publishers occupy a leverage point
  • 10. Role of publishers as “agents of change”
  • 11. Helping you publish, discover and reuse research data. Credit for sharing your data. Focused on reuse and reproducibility. Peer reviewed, curated. Promoting community data and code repositories. Open Access. •  Currently covering life, natural and environmental sciences •  Big and small data o  the power of small data lies in their aggregation and integration with other datasets •  New and previously published individual datasets, curated collections and citizen science o  a fuller, more in-depth look at the data processing steps, additional data files, codes etc. o  tutorial-like information for scientists interested in reusing or integrating the data with their own
  • 12. Introducing a new content type: Data Descriptor. Designed to make data more FAIR. Focused mainly on: •  Methods •  Technical Validation •  Data Records •  Usage Notes. Methods and technical analyses supporting the quality of the measurements: What did I do to generate the data? How was the data processed? Where is the data? Who did what when? How can the data be used or reused?
  • 13. Relation with traditional article - content. Scientific hypotheses: Synthesis, Analysis, Conclusions. Methods and technical analyses supporting the quality of the measurements: What did I do to generate the data? How was the data processed? Where is the data? Who did what when? How can the data be used or reused?
  • 14. Relation with traditional article - time. Publish Data! AFTER: expand on your research articles, adding further information for reuse of the data; AT THE SAME TIME: publish your Data Descriptor(s) alongside research article(s); OR BEFORE
  • 15. Share your data, get credited and cited: Code in GitHub, Data in OpenfMRI
  • 16. Peer review process focused on quality and reuse. Evaluation is not based on the perceived impact or novelty of the findings or size of the data •  Experimental rigour and technical data quality o  Methodologically sound o  Technical validation experiments and statistical analyses o  Depth, coverage, size, and/or completeness of data sufficient for the types of applications •  Completeness of the description o  Sufficient details to allow others to reproduce the results, reuse the data or integrate it with other data o  Compliance with relevant minimum information or reporting standards •  Integrity of the data files and repository record o  Data files match the descriptions in the Data Descriptor o  Deposited in the most appropriate available databases
  • 17. Data Descriptor: narrative and structure. Article or narrative component (PDF and HTML). Experimental metadata or structured component (in-house curated, machine-readable formats)
  • 18. Data Descriptor: narrative. Sections: •  Title •  Abstract •  Background & Summary •  Methods •  Technical Validation •  Data Records •  Usage Notes •  Figures & Tables •  References •  Data Citations. Focus on data reuse: detailed descriptions of the methods and technical analyses supporting the quality of the measurements. Does not contain tests of new scientific hypotheses. Joint Declaration of Data Citation Principles by the Data Citation Synthesis Group
  • 19. Data Descriptor: structure (CC0). In-house editorial curator assists authors via •  Excel spreadsheet templates •  internal authoring tool to create the structured component, also performing value-added semantic annotation. Diagram labels: analysis, method, script, data file or record in a database
  • 20. Because we do not want cryptic experimental info, e.g.: sample name LS1_C2_LD_TP2_P1, data file file1-fastq.gz
  • 21. …how not to report the experimental information. Sample name (?!): LS1_C2_LD_TP2_P1. Data file: file1-fastq.gz •  LS1: liver sample 1 •  C2: compound 2 •  LD: low dose •  TP2: time point 2 •  P1: protocol 1 •  file1-fastq.gz: compressed data file for sequence information corresponding to this sample
  • 22. Structured component: key information from narrative Seven week old C57BL/6N mice were treated with low-fat diet. Liver was dissected out, hepatocytes prepared…
  • 23. From natural language to ‘computable’ concepts. Seven week old C57BL/6N mice were treated with low-fat diet. Liver was dissected out, hepatocytes prepared … Annotated concepts: Age value; Unit; Strain name; Subject of the experiment; Type of diet and experimental condition; Anatomy part; Type of protocol (sample treatment); Type of protocol (liver preparation); Type of protocol (cell preparation)
  • 24. Semantic tagging key information
  • 25. What does a structured component add? •  Supplements the scientific discourse o  natural language has a degree of ambiguity •  Brings clarity in reporting research methods and procedures o  no trimming, no cooking o  clear links from samples to data files and their relation to methods •  Provides the basis for search and discovery features. Diagram: SciData DD structured content linked across Community Data Repositories by same tissue, same organism, same assay
  • 26. Citation of and link to data files and databases
  • 27. Helping the authors to find the right place for the data. Research papers, Data records, Data Descriptors. We currently recognize over 60 public data repositories
  • 28. Repositories criteria: 1.  Broad support and recognition within their scientific community 2.  Ensure long-term persistence and preservation of datasets 3.  Provide expert curation 4.  Implement relevant, community-endorsed reporting requirements 5.  Provide for confidential review of submitted datasets 6.  Provide stable identifiers for submitted datasets 7.  Allow public access to data without unnecessary restrictions
  • 29. Progressively refine guidance to authors and reviewers. Databases implementing standards: ~156, ~70, ~334 (Source: BioPortal). MIAME, MIAPA, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE…, REMARK, CONSORT, MAGE-Tab, GCDML, SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML…, GELML, ISA-Tab, CML, MITAB, AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO…, TEDDY, PRO, XAO, DO, VO
  • 30. Mapping the landscape of standards and databases
  • 31. Key part of NPG data access & reproducible research policies. Nature 515, 312 (20 November 2014) doi:10.1038/515312a http://www.nature.com/news/data-access-practices-strengthened-1.16370
  • 32. Responsibilities lie across several stakeholder groups: understand the benefits of sharing FAIR datasets and enact them; engage and assist researchers to enable them to share FAIR datasets; release or endorse practices and policies, but also incentive and credit mechanisms for researchers, curators and developers
  • 33. Acknowledgements. Visit nature.com/scientificdata; Email scientificdata@nature.com; Tweet @ScientificData. Honorary Academic Editor: Susanna-Assunta Sansone, PhD; Managing Editor: Andrew L Hufton, PhD; Editorial Curator: Varsha Khodiyar; Publisher: Iain Hrynaszkiewicz; Advisory Panel and Editorial Board including senior researchers, funders, librarians and curators; and our Advisory Boards and Collaborators. Funds: Philippe Rocca-Serra, PhD, Senior Research Lecturer; Alejandra Gonzalez-Beltran, PhD, Research Lecturer; Eamonn Maguire, DPhil, Contractor; Milo Thurston, PhD, Senior Bioinformatician; Allyson Lister, PhD, Knowledge Engineer; Alfie Abdul-Rahman, PhD, Research Software Engineer
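
Slides 20 to 23 contrast a cryptic sample name with explicit, structured experimental metadata. A minimal sketch of that idea in Python follows. It expands the example name from slide 21 into labelled fields using the legend given on that slide. The field names (`sample`, `compound`, `dose`, `time_point`, `protocol`, `data_file`) and the function `decode_sample_name` are hypothetical and chosen only for illustration; this is not the format used by the journal's structured component.

```python
import re

# Legend copied from slide 21. The field names on the left of each tuple
# are illustrative assumptions, not part of any published schema.
PREFIXES = {
    "LS": ("sample", "liver sample"),
    "C": ("compound", "compound"),
    "TP": ("time_point", "time point"),
    "P": ("protocol", "protocol"),
}
DOSES = {"LD": "low dose"}


def decode_sample_name(name):
    """Expand a cryptic underscore-separated sample name into explicit fields."""
    record = {}
    for token in name.split("_"):
        # Dose codes carry no number, so they are looked up directly.
        if token in DOSES:
            record["dose"] = DOSES[token]
            continue
        # Every other token is a letter prefix followed by a number.
        match = re.fullmatch(r"([A-Z]+)(\d+)", token)
        if match is None or match.group(1) not in PREFIXES:
            raise ValueError(f"cannot interpret token {token!r}")
        field, label = PREFIXES[match.group(1)]
        record[field] = f"{label} {match.group(2)}"
    return record


record = decode_sample_name("LS1_C2_LD_TP2_P1")
# Make the link between the sample and its data file explicit.
record["data_file"] = "file1-fastq.gz"
for field, value in record.items():
    print(f"{field}: {value}")
```

The decoder only works because the legend is known in advance; a reader who receives the name alone cannot recover these fields, which is the argument the slides make for reporting them explicitly alongside the data file.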