Slides from the presentation at IDAMO 2016, Rostock. May 2016.
Most scientific discoveries rely on previous or other findings. A lack of transparency and openness has led to what many consider the "reproducibility crisis" in systems biology and systems medicine. The crisis arose from missing standards and inadequate support for standards in software tools. As a consequence, numerous results in low- and high-profile publications cannot be reproduced.
In my presentation, I summarise key challenges of reproducibility in systems biology and systems medicine, and I demonstrate available solutions to the related problems.
Standards and tools for model management in biomedical research
1. Standards and tools for model management in biomedical research
Dagmar Waltemath
University of Rostock, Germany
dagmar.waltemath@uni-rostock.de
Dagmarwaltemath
Clickable slides available online from slideshare.
2. Standards and tools for model management
Junior research group: Management of simulation studies in systems biology
Tool development: SBGN-ED for the graphical representation of networks
Infrastructure project: Data management for systems biology in Germany
3. Standards and tools for model management
Figs: BioModels (top) and DOI: 10.1073/pnas.88.16.7328 (bottom)
6. Most scientific discoveries rely on previous or other findings.
Fig.: Tyson 2001 (BIOM195)
Fig.: Tyson 1991 (BIOM005)
7. Goals of scientific publication
– To announce a result
– To convince readers that the result is correct
Most scientific discoveries rely on
previous or other findings.
Traditional science
● Mathematical, complete proofs
● Result description and protocols in experimental sciences
Computer-driven science
● Data analysis with modular software tools/packages
● Workflows
● Databases rather than direct inquiry from in-house laboratories
Mesirov (2010) Science, doi:10.1126/science.1179653
8. Can we rely on findings that we ourselves cannot evaluate? (Probably not!)
“only in ~20–25% of the projects were the relevant published data completely in line with our in-house findings (Fig. 1c). In almost two-thirds of the projects, there were inconsistencies [..] that either considerably prolonged the duration of the target validation process or, in most cases, resulted in termination of the projects because the evidence [..] was insufficient to justify further investments into these projects.” (Prinz et al., 2011)
9. Reproducibility issues are discussed among key players in science.
Publication: 10.7554/eLife.04333 ; Project progress: https://osf.io/e81xl/wiki/Studies/
11. We identified key challenges of reproducibility in systems biology and systems medicine.
– Lack of data standards
– Lack of data quality and quantity
– Lack of data availability
– Lack of transparency
12. A lack of suitable data standards hinders researchers in providing reproducible results.
53 researchers, 17 countries, various professions
Whole Cell meeting (2015)
– Goal: To identify the needs and shortcomings for today's modeling tasks
– Results:
● New developments initiated (databases, data curation tools, training data, modeling approaches, parameter estimation tools, frameworks, parallel simulators, extensions to standard formats)
● New grant proposals and follow-up projects, new networks, better standards, improved tools
Fig.: Waltemath et al (2016) IEEE TBME, accepted for publication
Project homepage: http://bit.ly/wholecell
13. A lack of data availability makes it impossible for researchers to reproduce results.
Issues
– Simulation studies comprise several files
– Data is heterogeneous, distributed, complex
– Documentation of how the study was performed is often missing
● Model code in BioModels, including a supplement with a how-to for reproducing the figures given in the original paper
● Online tool makes data available and browsable
TriplexRNA
Recon 2
● Publication backed up with a website containing the supplemental material
● Model code in (noncurated) BioModels
● Visualisation of the model can easily be explored
● References to original works
14. The COMBINE initiative works towards reproducibility and tool interoperability in computational biology.
Fig. labels: Simulation, Guidelines, Ontologies
Coordinate annual meetings
- Next HARMONY: Auckland, June 7-11, 2016
- Next COMBINE: Newcastle, Sep 19-23, 2016
Coordinate standards development
- Common procedures
- Interoperable software tools
- Discussion forums, mailing lists...
Represent community
- Funders
- Other communities
Provide standards resources
- Single entry point
- Resolvable URI
- Web infrastructure
15. The COMBINE initiative works towards reproducibility and tool interoperability in computational biology.
● Model description (network, parameters, kinetics)
Fig.: SBGN-PD map, http://sbgn.org
● Visual representation of network (glyphs)
16. The COMBINE initiative works towards reproducibility and tool interoperability in computational biology.
● Simulation setup
● Definition of observed variables (plots, data tables)
● All files that belong to a (reproducible) simulation study
● Description of archive content
● Have a look at a fully featured COMBINE archive on GitHub
Figs: BioModels
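The last bullets on this slide (all files of a study bundled together, plus a description of the archive content) can be illustrated with a minimal sketch. It assumes that a COMBINE archive is a ZIP file carrying a `manifest.xml` that lists each entry with a location and a format identifier; the namespace and format URIs below are given as I understand them from the OMEX specification and should be checked against the current version. The file names and payloads are hypothetical placeholders, and the helper functions are illustrative, not part of any COMBINE tool.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace and format identifiers as assumed from the COMBINE archive (OMEX)
# specification; verify them against the current specification before use.
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
FORMATS = {
    "archive": "http://identifiers.org/combine.specifications/omex",
    "manifest": "http://identifiers.org/combine.specifications/omex-manifest",
    "sbml": "http://identifiers.org/combine.specifications/sbml",
    "sedml": "http://identifiers.org/combine.specifications/sed-ml",
}


def build_manifest(entries):
    """Return manifest.xml as bytes, one <content> element per (location, format)."""
    ET.register_namespace("", OMEX_NS)
    root = ET.Element(f"{{{OMEX_NS}}}omexManifest")
    for location, fmt in entries:
        ET.SubElement(root, f"{{{OMEX_NS}}}content",
                      {"location": location, "format": fmt})
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def write_archive(files):
    """Bundle {name: (format_uri, payload_bytes)} into an in-memory ZIP archive."""
    entries = [(".", FORMATS["archive"]),
               ("./manifest.xml", FORMATS["manifest"])]
    entries += [(f"./{name}", fmt) for name, (fmt, _) in files.items()]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", build_manifest(entries))
        for name, (_, payload) in files.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def list_contents(archive_bytes):
    """Read manifest.xml back from the archive and return {location: format_uri}."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        root = ET.fromstring(archive.read("manifest.xml"))
    return {c.get("location"): c.get("format")
            for c in root.iter(f"{{{OMEX_NS}}}content")}


# Hypothetical placeholder payloads; these are not valid SBML or SED-ML documents.
study = {
    "model.xml": (FORMATS["sbml"], b"<sbml/>"),
    "simulation.sedml": (FORMATS["sedml"], b"<sedML/>"),
}
omex = write_archive(study)
contents = list_contents(omex)
```

Here `contents` maps every archive entry to its declared format, which is the kind of information a simulation tool would need to decide which file is the model and which is the simulation setup.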
17. Use of standard formats leads to interoperable software.
Fig.: workflow (search "ubiquitin", results, export)
Query database for annotations, persons, simulation descriptions
Retrieve information about models, simulations, figures, documentation
Export simulation study as COMBINE archive
Download archive and open the study with your favourite simulation tool
Open archive in CAT to modify its contents and to share it with others
Cardiac Electrophysiology Web Lab, Oxford
M2CAT, SEMS
WebCAT, SEMS
JWS Online, Stellenbosch, SA
SED-ML Web Tools, BIOQUANT
18. We develop tools that help researchers manage standardised data efficiently.
Storage
Search, retrieval & ranking
Using graph databases to integrate standardised model-based data.
doi: 10.1093/database/bau130
doi: 10.1186/s13326-015-0014-4
Search across heterogeneous data, ontologies, and structures.
https://dx.doi.org/10.6084/m9.figshare.3382993.v1
SED-ML DB in JWS Online
Our methods are tested & used in major model repositories.
BioModels, Physiome Model Repository
19. We develop tools that help researchers manage standardised data efficiently.
Transfer of results
Version control & Provenance
Bundling files necessary to reproduce a modeling result.
doi: 10.1093/bioinformatics/btv484
Figure courtesy Martin Scharm, slideshare
Tracking the development of simulation studies over time.
https://dx.doi.org/10.6084/m9.figshare.2543059.v5
Our methods are tested & used in major model repositories.
BioModels, Physiome Model Repository
20. How can we bridge the gap between standards for systems biology and systems medicine?
Fig. courtesy Atalag et al (2015) http://hdl.handle.net/2292/27911
21. Research results must be well documented, comprehensible and reproducible to be trustworthy and reusable.
Current status
Many scientific studies in the life sciences are not reproducible.
Ways out
Blogs and databases
Detailed documentation
Open data
Standards
Reproducibility initiative
Sustainable Software
Infrastructure
Desired status
Comprehensible, findable, available, correct models and simulation studies.
Waltemath and Wolkenhauer (2016) How modeling standards, software, and initiatives support reproducibility in systems biology and systems medicine. Accepted for publication, IEEE Transactions on Biomedical Engineering
22. Thank you for your attention.
http://www.denbi.de/
Gary Bader Mike Hucka Chris Myers
David Nickerson Dagmar Waltemath Nicolas Le Novère
Martin Golebiewski
Falk Schreiber
@SemsProject
http://co.mbine.org