Over the past five years, expectations have changed for the management of all the outcomes of research, that is, the “assets” of data, models, codes, SOPs and so forth. Don’t stop reading. Data management isn’t likely to win anyone a Nobel prize, but publications should be supported and accompanied by data, methods, procedures, etc. to assure reproducibility of results. Funding agencies expect data (and increasingly software) management, retention and access plans as part of the proposal process for projects to be funded. Journals are raising their expectations of the availability of data and codes pre- and post-publication. The multi-component, multi-disciplinary nature of Systems Biology demands the interlinking and exchange of assets and the systematic recording of metadata for their interpretation.
The FAIR Guiding Principles for scientific data management and stewardship (http://www.nature.com/articles/sdata201618) have been an effective rallying cry for EU and USA Research Infrastructures. The FAIRDOM (Findable, Accessible, Interoperable, Reusable Data, Operations and Models) Initiative has 8 years of experience of asset sharing and data infrastructure ranging across European programmes (SysMO and EraSysAPP ERANets), national initiatives (de.NBI, German Virtual Liver Network, UK SynBio centres) and PIs' labs. It aims to support Systems and Synthetic Biology researchers with data and model management, with an emphasis on standards smuggled in by stealth and sensitivity to asset sharing and credit anxiety.
This talk will use the FAIRDOM Initiative to discuss the FAIR management of data, SOPs, and models for Systems Biology, highlighting the challenges of and approaches to sharing, credit, citation and asset infrastructures in practice. I'll also highlight recent experiments in influencing sharing through behavioural interventions.
http://www.fair-dom.org
http://www.fairdomhub.org
http://www.seek4science.org
Presented at COMBINE 2016, Newcastle, 19 September.
http://co.mbine.org/events/COMBINE_2016
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthetic Biology
1. FAIRDOM – FAIR Asset management and sharing experiences in Systems and Synthetic Biology
Prof Carole Goble
The FAIRDOM Consortium
carole.goble@manchester.ac.uk
http://fair-dom.org, http://fairdomhub.org
COMBINE 2016, Newcastle, UK 19 September 2016
9. Upstream, downstream assets discovery
Organisation, Communication, Dissemination
Helps navigation
Reuse later
Enable team to reuse/reproduce
Help others find out
Reuse with new partners
Tell more and take credit
Collecting and tracking data/models
Choosing what to keep
Preparing what to share and when
Most data/models won’t be shared
• Wrong experimental method
• Hidden parameter discovered
• Faulty experiment
Promote standardised metadata practices
10. FAIR Projects & Programmes
What methods are being used to determine enzyme activity?
What SOP was used for this sample?
Where is the validation data for this model?
Is there any group generating kinetic data?
Is this data available?
Track versions of my model
What's the relationship between the data and model?
Which data belong to which publications?
11. FAIR Projects and Programmes
Findable
Exchange & find assets and people
Citation
Credit
Accessible
Share, disseminate and publish assets sensitively
Gateway to third party tools, archives
Store assets
Package assets
Track collection of data and metadata
Consistent reporting for interpretation, interop & comparison
Promote and support standardised metadata practices.
Support reproducible publications
Organise and link assets
Maintain the experimental context
Retain results beyond a project
Reuse results, tools, archives
Respect local solutions
Standards! Standards!
16. Surveys
Stanford et al. The evolution of standards and data management practices in systems biology, Molecular Systems Biology (2015) 11: 851, DOI 10.15252/msb.20156053
17. Community Clubs
http://www.fair-dom.org
Samples Club with ELIXIR, BBMRI-ERIC, EBI…
Rework and harmonise sample metadata framework
bioschemas.org
Developers Foundry
Support developers of Systems Biology tools and platforms
3rd Foundry meeting, Dec 1-2 2016, Frankfurt
19. FAIRDOM Platform Installations
*Troup, E.; Clark, I.; Swain, P.; Millar, A.J.; Zielinski, T. (2015) Practical evaluation of SEEK and openBIS for biological data management in SynthSys, http://hdl.handle.net/1842/12236
Local retention and in-flight management, private sharing
Centres, large or national projects
Local skills
One stop showcase
Programmes
Post-project retention
Supplementary materials
20. People and Project Commons
FAIRDOMHub.org
self-managed workspaces
Sharing sensitivity
25.
Programme: Overarching research theme (The Digital Salmon)
Project: Research grant (DigiSal, GenoSysFat)
Investigation: A particular biological process, phenomenon or thing (typically corresponds to [plans for] one or more closely related papers)
Study: Experiment whose design reflects a specific biological research question
Assay: Standardized measurement or diagnostic experiment using a specific protocol (applied to material from a study)
Jon Olav Vik, Norwegian University of Life Science
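The nesting on this slide (Programme, Project, Investigation, Study, Assay) can be pictured as a simple containment hierarchy. The following is a minimal sketch only, not the SEEK or FAIRDOMHub data model: the class and field names are assumptions, and every title below the project level is invented for illustration. Only "The Digital Salmon", "DigiSal" and "GenoSysFat" come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    title: str
    protocol: str  # identifier of the SOP the measurement follows (hypothetical field)


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)


@dataclass
class Project:
    title: str
    investigations: List[Investigation] = field(default_factory=list)


@dataclass
class Programme:
    title: str
    projects: List[Project] = field(default_factory=list)


def assay_paths(programme: Programme) -> List[str]:
    """Return one 'Programme / Project / Investigation / Study / Assay' path per assay."""
    paths = []
    for project in programme.projects:
        for investigation in project.investigations:
            for study in investigation.studies:
                for assay in study.assays:
                    parts = [programme.title, project.title,
                             investigation.title, study.title, assay.title]
                    paths.append(" / ".join(parts))
    return paths


# Programme and project names are from the slide; everything below them is invented.
assay = Assay("Hypothetical fatty acid assay", protocol="hypothetical-sop-001")
study = Study("Hypothetical feeding trial", [assay])
investigation = Investigation("Hypothetical lipid metabolism investigation", [study])
digisal = Project("DigiSal", [investigation])
programme = Programme("The Digital Salmon", [digisal, Project("GenoSysFat")])

for path in assay_paths(programme):
    print(path)
```

Keeping each assay reachable through its study and investigation is what preserves the experimental context: a data file attached to an assay can always be traced back to the research question and the grant it belongs to.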
30. ODE-based model
example = glc-permease
variables: C_glc^ex, C_glc, C_g6p
parameters: Rmax, Kglc, KI-g6p, KII-g6p
[Figure: four simulated time courses (panels glc-ex, glc-in, g6p, f6p); y-axis C_i [mmol/L], x-axis time [sec], 0 to 1500]
[Equations: mass balance for extracellular glucose, dC_glc^ex/dt, and the permease rate r_perm written in terms of r_influx and r_efflux with the parameters above]
[Rizzi et al., 1997]
DATA file
Equations
Simulations
32. FAIRDOM Catalogue, a web of resources
drawing together across resources; reusing tools and repositories
respect local project solutions, tool plugins
Standards
Personal Data
Local Stores
External Databases
Publishing services
Modelling tools
SOPs
BiVes
Data tools
33. Specialist Public Repositories
General archives
Repository Repertoire
Stanford et al. The evolution of standards and data management practices in systems biology, Molecular Systems Biology (2015) 11: 851, DOI 10.15252/msb.20156053
Local Data Stores
35. openBIS
metadata extraction
data relationship/linking
data processing
minimal user input
*Troup, E.; Clark, I.; Swain, P.; Millar, A.J.; Zielinski, T. (2015) Practical evaluation of SEEK and openBIS for biological data management in SynthSys, http://hdl.handle.net/1842/12236
36. Modelling standards based
in browser validation and simulation
Model comparison and versioning
SBML Model simulation
[Stellenbosch, Rostock]
38. Reproducible model simulations in papers using COMBINE Archive & SED-ML
[Jacky Snoep, Dagmar Waltemath, Martin Peters]
Three-tiered service
store DOI citable supplementary files on FAIRDOMHub
model and data curation
reproducible clickable figures in papers using SED-ML
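A COMBINE Archive is, in essence, a ZIP file whose manifest.xml lists every file together with a format identifier, so that a model and its SED-ML simulation description travel as one package. The sketch below illustrates that packaging step under stated assumptions: the manifest layout (an omexManifest root with content entries carrying location and format attributes) and the format identifiers follow the OMEX specification as I understand it, the file names and placeholder contents are hypothetical, and this is not the FAIRDOMHub or JWS Online implementation.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Identifiers assumed from the COMBINE/OMEX specifications.
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
OMEX_FORMAT = "http://identifiers.org/combine.specifications/omex"
FORMATS = {
    "sbml": "http://identifiers.org/combine.specifications/sbml",
    "sed-ml": "http://identifiers.org/combine.specifications/sed-ml",
}


def build_manifest(entries):
    """Build manifest.xml bytes for a list of (file name, format key) pairs."""
    ET.register_namespace("", OMEX_NS)  # serialise with a default namespace
    root = ET.Element(f"{{{OMEX_NS}}}omexManifest")
    # The archive itself and the manifest are listed first, then the payload files.
    listed = [(".", OMEX_FORMAT), ("./manifest.xml", OMEX_NS)]
    listed += [(f"./{name}", FORMATS[kind]) for name, kind in entries]
    for location, fmt in listed:
        ET.SubElement(root, f"{{{OMEX_NS}}}content",
                      {"location": location, "format": fmt})
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def write_archive(files):
    """Zip {file name: (format key, text)} together with a manifest; return bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        entries = [(name, kind) for name, (kind, _) in files.items()]
        archive.writestr("manifest.xml", build_manifest(entries))
        for name, (_, text) in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()


def manifest_locations(omex_bytes):
    """Read the archive back and return the locations listed in its manifest."""
    with zipfile.ZipFile(io.BytesIO(omex_bytes)) as archive:
        root = ET.fromstring(archive.read("manifest.xml"))
    return [content.get("location") for content in root]


# Hypothetical file names; the contents are placeholders, not valid SBML or SED-ML.
files = {
    "model.xml": ("sbml", "<!-- placeholder, not a valid SBML model -->"),
    "simulation.sedml": ("sed-ml", "<!-- placeholder, not a valid SED-ML file -->"),
}
omex_bytes = write_archive(files)
print(manifest_locations(omex_bytes))
```

The archive is built in memory here so the example leaves no files behind; writing `omex_bytes` to a file with an `.omex` extension would give something a COMBINE-aware tool could attempt to open, once real model and simulation files replace the placeholders.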
45. Data annotation with standards
Embed ontologies into Excel templates
Excel spreadsheets enriched with ontology annotations
Upload, extract metadata and register
in browser viewing + annotations
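The upload-and-extract step on this slide can be sketched as a lookup of spreadsheet cell values against a controlled vocabulary. This is an illustrative assumption of how such a check might look, not the FAIRDOM template tooling: a CSV string stands in for the Excel template, the column names are invented, and the term identifiers under the `EX:` prefix are hypothetical rather than real ontology IDs.

```python
import csv
import io

# Hypothetical controlled vocabulary: column -> {label: term identifier}.
# The EX: prefix is invented for illustration; a real template would embed
# terms from an actual ontology.
ALLOWED_TERMS = {
    "organism": {"Saccharomyces cerevisiae": "EX:0001", "Escherichia coli": "EX:0002"},
    "unit": {"mmol/L": "EX:0101", "sec": "EX:0102"},
}

# A CSV string standing in for one sheet of an annotated Excel template.
SHEET = """sample,organism,unit
s1,Saccharomyces cerevisiae,mmol/L
s2,Yeast,mmol/L
"""


def annotate(rows, vocabulary):
    """Attach term identifiers to each row and collect values with no matching term."""
    annotated, problems = [], []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        terms = {}
        for column, mapping in vocabulary.items():
            value = row[column]
            if value in mapping:
                terms[column] = mapping[value]
            else:
                problems.append((line_no, column, value))
        annotated.append({**row, "terms": terms})
    return annotated, problems


rows = list(csv.DictReader(io.StringIO(SHEET)))
annotated, problems = annotate(rows, ALLOWED_TERMS)

for line_no, column, value in problems:
    print(f"line {line_no}: '{value}' is not a known {column} term")
```

Restricting cells to a drop-down of ontology labels in the template prevents most of these mismatches at entry time, which is the point of embedding the ontology in the spreadsheet rather than validating after upload.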
52. Project Support – Understanding Process
Planning, Setups, Curation, Advice, Support
Community support
Special project support
model technical curation with our JWS Online partners
PALs project ambassadors
co-design, tailoring, communication, requirements, review
standard, best practices
54. Processes and People
• 80% process, 20% tech
• Structuring the ISA
• PIs
• Sticking to conventions and policies
• Tension with standards take-up vs laissez faire
• Time and resource
• Local responsibility
• Recognition
• Institutional Repositories
• Automagic
55. FAIR Play
• Licenses
• Negotiated access
• Embargos
• Permission controls
• Staged sharing
• Private walled gardens
"Using FAIRDOM my own lab colleagues saw what I was doing and called to collaborate!"
Jurgen Haanstra, Vrije Universiteit Amsterdam, Netherlands
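The sharing controls listed on this slide (permission controls, embargos, staged sharing) combine into a visibility decision per asset and per viewer. The sketch below is one hypothetical way to express that decision, assuming three sharing stages and a single embargo date; the stage names, fields and user names are invented, and FAIRDOMHub's actual permission model may differ.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import IntEnum
from typing import Optional, Set


class Stage(IntEnum):
    """Hypothetical sharing stages, from most to least restricted."""
    PRIVATE = 0   # owner only
    PROJECT = 1   # owner and project members (a "walled garden")
    PUBLIC = 2    # anyone, once any embargo has passed


@dataclass
class Asset:
    title: str
    owner: str
    stage: Stage = Stage.PRIVATE
    project_members: Set[str] = field(default_factory=set)
    embargo_until: Optional[date] = None


def can_view(asset: Asset, user: Optional[str], today: date) -> bool:
    """Decide whether `user` (None for an anonymous visitor) may view the asset."""
    if user == asset.owner:
        return True
    is_member = user in asset.project_members
    under_embargo = asset.embargo_until is not None and today < asset.embargo_until
    if under_embargo:
        # During an embargo a public asset behaves like a project-only asset.
        return asset.stage >= Stage.PROJECT and is_member
    if asset.stage == Stage.PUBLIC:
        return True
    return asset.stage == Stage.PROJECT and is_member


# Invented example: a public dataset embargoed until the start of 2017.
dataset = Asset(
    "Hypothetical kinetic dataset",
    owner="alice",
    stage=Stage.PUBLIC,
    project_members={"bob"},
    embargo_until=date(2017, 1, 1),
)

during = date(2016, 9, 19)
after = date(2017, 1, 2)
print(can_view(dataset, "bob", during), can_view(dataset, None, during),
      can_view(dataset, None, after))
```

Treating the embargo as a temporary downgrade of the sharing stage lets an owner mark an asset as public at deposit time while it stays inside the project until the date passes, so no one has to remember to flip a switch later.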
56. FAIR Play
• Drivers
– External dominate
– Personal productivity
• Trading behaviours
– Tribal based
– Modellers vs Experimentalists
• “enclave” sharing
– Rather than public donation
• Reciprocity & credit
– Citation
affecting behavioural change through libertarian paternalism*
*Garza et al. Framing the Community Data System Interface, https://dx.doi.org/10.6084/m9.figshare.1300051.v5
Stanford et al. The evolution of standards and data management practices in systems biology, Molecular Systems Biology (2015) 11: 851, DOI 10.15252/msb.20156053
59. Jon Olav Vik, Norwegian University of Life Science
Maksim Zakhartsev, Plant Systems Biology, University Hohenheim, Stuttgart, Germany
Alexey Kolodkin, Siberian Branch, Russian Academy of Sciences
Tomasz Zieliński, SynthSys Centre, University Edinburgh