Bioinformatics in the Era of
Open Science and Big Data

Philip E. Bourne
University of California San Diego
pbourne@ucsd.edu
1/28/14

SIB Biel/Bienne

1
My Bias
• RCSB PDB/IEDB Database Developer – Views on
community, quality, sustainability …
• PLOS Journal Co-founder – Open Science Advocate
• Associate Vice Chancellor for Innovation – Business
models, interaction with the private
sector, sustainability
• Professor – Mentoring, reward system, value (or not) of
research
• Associate Director of NIH for Data Science - ??

The History of Bioinformatics
According to Bourne
Searls (ed.), The Roots in Bioinformatics Series, PLOS Comp Biol
Timeline: 1980s, 1990s, 2000s, 2010s, 2020
Discipline: Unknown, Expt. Driven, Emergent, Over-sold, A Service, A Partner, A Driver
The Raw Material: Non-existent, Limited/Poor, More/Ontologies, Big Data/Siloed, Open/Integrated
The People: No name, Technicians, Industry recognition, data scientists, Academics

We Need to Start By Asking How Are
We Using the Data Now!

Only Then Can We Make Rational
Decisions About Data – Large or Small

Web Logs etc. Are
Not Enough
[Figure: Structure Summary page activity for H1N1 Influenza related structures, Jan. 2008 to Jul. 2010]
3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir
1RUZ: 1918 H1 Hemagglutinin
* http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm
[Andreas Prlic]

We Need to Learn from Industries
Whose Livelihood Addresses the
Question of Use

Next Consider What We Do Every Day

We take actions on digital data
increasingly across boundaries

Actions on Data Imply:
• Ensuring data quality and hence trust
• Making data sustainable
• Making data open and accessible
• Making data findable
• Providing suitable metadata and annotation
• Making data queryable
• Making data analyzable
• Presenting data so as to maximize its value
• Rewarding good data practices

Boundaries on Data Imply:
• Working across biological scales
• Working across biomedical disciplines
• Working across basic and clinical research and
practice
• Working across institutional boundaries
• Working across public and private sectors
• Working across national and international
borders
• Working across funding agencies
These Issues Have Been Around
Almost As Long As Bioinformatics
The Good News is That “Big Data” Has
Brought More Attention to the Problem

What Are Big Data?
• Large datasets from high throughput
experiments
• Large numbers of small datasets
• Data which are “ill-formed”
• The why (causality) is replaced by the what
• A signal that a fundamental change is taking
place – a tipping point?

That Change is Embodied in:
The Digital Enterprise
• Consists of digital assets
• E.g. datasets, papers, software, lab notes
• Each asset is uniquely identified and has
provenance, including access control
• E.g. publishing simply involves changing the
access control
• Digital assets are interoperable across the
enterprise
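
The asset model on this slide can be sketched in code. The following is a minimal illustration, not a description of any NIH or institutional system: the `DigitalAsset` class, the `Access` levels, and all field names are hypothetical. It shows the idea that each asset carries a unique identifier and a provenance trail, and that publishing amounts to a change of access control.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
import uuid


class Access(Enum):
    """Hypothetical access levels for an asset."""
    PRIVATE = "private"
    ENTERPRISE = "enterprise"
    PUBLIC = "public"


@dataclass
class DigitalAsset:
    kind: str    # e.g. "dataset", "paper", "software", "lab notes"
    title: str
    owner: str
    access: Access = Access.PRIVATE
    # Each asset is uniquely identified.
    asset_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Provenance is kept as a list of (timestamp, actor, action) tuples.
    provenance: list = field(default_factory=list)

    def __post_init__(self):
        self._log(self.owner, "created")

    def _log(self, actor, action):
        stamp = datetime.now(timezone.utc).isoformat()
        self.provenance.append((stamp, actor, action))

    def publish(self, actor):
        """Publishing only changes the access control and records who did it."""
        self.access = Access.PUBLIC
        self._log(actor, "access changed to public")


notes = DigitalAsset(kind="lab notes", title="Rotation notebook", owner="jane")
notes.publish(actor="jane")
print(notes.asset_id, notes.access.value, len(notes.provenance))
```

Keeping the identifier and provenance on the asset itself, rather than in the system that happens to store it, is one way to read the slide's claim that assets are interoperable across the enterprise.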
The Enterprise Is Almost Anything..
Your Lab, your Institution, the SIB,
the NIH….

Consider an Academic Institution As A
Digital Enterprise
Jane scores extremely well in parts of her graduate on-line neurology class. Neurology professors,
whose research profiles are on-line and well described, are automatically notified of Jane’s
potential based on a computer analysis of her scores against the background interests of the
neuroscience professors. Consequently, professor Smith interviews Jane and offers her a research
rotation. During the rotation she enters details of her experiments related to understanding a
widespread neurodegenerative disease in an on-line laboratory notebook kept in a shared on-line
research space – an institutional resource where stakeholders provide metadata, including access
rights and provenance beyond that available in a commercial offering. According to Jane’s
preferences, the underlying computer system may automatically bring to Jane’s attention Jack, a
graduate student in the chemistry department whose notebook reveals he is working on using
bacteria for purposes of toxic waste cleanup. Why the connection? They reference the same gene a
number of times in their notes, which is of interest to two very different disciplines – neurology and
environmental sciences. In the analog academic health center they would never have discovered
each other, but thanks to the Digital Enterprise, pooled knowledge can lead to a distinct advantage.
The collaboration results in the discovery of a homologous human gene product as a putative target
in treating the neurodegenerative disorder. A new chemical entity is developed and patented.
Accordingly, by automatically matching details of the innovation with biotech companies worldwide
that might have potential interest, a licensee is found. The licensee hires Jack to continue working
on the project. Jane joins Joe’s laboratory, and he hires another student using the revenue from the
license. The research continues and leads to a federal grant award. The students are employed,
further research is supported and in time societal benefit arises from the technology.

From “What Big Data Means to Me”, JAMIA 2014
The NIH is Starting to Think About the
Digital Enterprise, Witness…

bd2k.nih.gov
What Defines the Digital Enterprise
• Trans-NIH collaboration – change culture
• Long-term NIH strategic planning
• The BD2K Initiative
• A “hub” of data science activities
• International cooperation
• Interagency cooperation
• Data sharing policies

Consider One NIH Scenario
• NIH-Drive
– Investigator A from the NCI makes frequent reference to the overexpression of genes x and y.
– Investigator B from the NHLBI makes frequent reference to the underexpression of genes x and y.
– Automatic notification of a potential common interest before publication or database deposition.
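
A notification like the one in this scenario could, in principle, come from comparing which genes investigators mention repeatedly in their notes. The sketch below is a toy illustration under stated assumptions: the function names, the `min_mentions` threshold, the investigator labels, and the placeholder gene symbols `GENEX` and `GENEY` are invented here and do not describe how NIH-Drive works.

```python
from collections import Counter
from itertools import combinations
import re


def gene_mentions(text, known_genes):
    """Count how often each known gene symbol appears in free text."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    return Counter(token for token in tokens if token in known_genes)


def common_interests(notes_by_investigator, known_genes, min_mentions=2):
    """Return (investigator, investigator, shared_genes) for every pair that
    both mention the same gene at least `min_mentions` times."""
    counts = {
        who: gene_mentions(text, known_genes)
        for who, text in notes_by_investigator.items()
    }
    matches = []
    for a, b in combinations(sorted(counts), 2):
        shared = sorted(
            gene for gene in counts[a]
            if counts[a][gene] >= min_mentions and counts[b][gene] >= min_mentions
        )
        if shared:
            matches.append((a, b, shared))
    return matches


# Hypothetical notes; the gene symbols are placeholders for "genes x and y".
notes = {
    "investigator_A": "GENEX is overexpressed. GENEY too. GENEX and GENEY again.",
    "investigator_B": "GENEX underexpressed; GENEY underexpressed. GENEX, GENEY.",
    "investigator_C": "Unrelated work on GENEZ.",
}
matches = common_interests(notes, known_genes={"GENEX", "GENEY", "GENEZ"})
for a, b, genes in matches:
    print(f"Notify {a} and {b}: shared interest in {', '.join(genes)}")
```

Note that the match ignores the direction of expression: A reports overexpression and B underexpression, and it is the shared gene, not agreement about it, that triggers the notification.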

The NIH Process
An external advisory group provided a
valuable blueprint for what should be
done
http://acd.od.nih.gov/Data%20and%20Informatics%20Working%20Group%20Report.pdf
Blueprint Recommendations
• Promote central and federated catalogs
– Establish minimal metadata framework
– Tools to facilitate data sharing
– Elaborate on existing data sharing policies

• Support methods and applications
– Fund all phases of software development
– Leverage lessons from National Centers

• Training
– More funding
– Enhance review of training apps
– Quantitative component to all awards

• On campus IT strategic plan
– Catalog of existing tools
– Informatics laboratory
– Ditto big data

• Sustainable funding commitment
acd.od.nih.gov/diwg.htm
General Features of NIH Data Science
• Lightweight metadata standards
• Data & software registries
• Expanded policies on data sharing, open
source software
• Training programs & reward systems
• Institutional incentives
• Private sector incentives
• Data centers serving community needs
What is Under Way?
• Now:
– Data centers (under review)
– Data science training grants (call Q1 14)
– Pilot data catalog consortium (call out)
– Genomic Research Data Alliance (being finalized)
– Piloting “NIH-Drive”

• What Is Planned:
– Extended public-private programs specifically for data science activities
– Interagency activities
– International exchange programs
– Cold Spring Harbor-like training facilities – bicoastal?
– Programs for better data descriptions
– Reward institutions/communities
– Policies to get clinical trial data into the public domain
The History of Bioinformatics
According to PEB
The Roots in Bioinformatics Series, PLOS Comp Biol
Timeline: 1980s, 1990s, 2000s, 2010s, 2020
Discipline: Unknown, Expt. Driven, Emergent, Over-sold, A Service, A Partner, Driver
The Raw Material: Non-existent, Limited/Poor, More/Ontologies, Big Data/Siloed, Open/Integrated
The People: No name, Technicians, Industry recognition, data scientists, Academics

Why Will Science Become More Open?
• The public (and hence the politicians) demand it
• It’s the right thing to do
• It’s part of the modern psyche
• The scholarly enterprise is broken and more
stakeholders are acknowledging it

Personal Evidence
• I have a paper with 16,000 citations that no
one has ever read
• I have papers in PLOS ONE that have more
citations than ones in PNAS
• I have data sets I am proud of but no place to
put them
• I “can’t” reproduce work from my own lab

Politicians Demand It:
G8 open data charter

http://opensource.com/government/13/7/open-data-charter-g8

What Are Some of the Ramifications of
Open Science?

Open Science Has The Potential to
Deinstitutionalize

Daniel Hulshizer/Associated Press

An Example of That Potential:
The Story of Meredith

http://fora.tv/2012/04/20/Congress_Unplugged_Phil_Bourne
There Still Needs to be a Reward System
The Wikipedia Experiment – Topic Pages

• Identify areas of Wikipedia that relate to the journal that are missing or stubs
• Develop a Wikipedia page in the sandbox
• Have a Topic Page Editor review the page
• Publish the copy of record with associated rewards
• Release the living version into Wikipedia
One Possible End Product of Open
Science

0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB which is analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature text and back to the PDB

1. User clicks on thumbnail
2. Metadata and a webservices call provide a renderable image that can be annotated
3. Selecting a feature provides a database/literature mashup
4. That leads to new papers
PLoS Comp. Biol. 2005 1(3) e34

Change in the Way we Support the
Research Lifecycle
Authoring Tools, Data Capture, Lab Notebooks, Software Repositories, Analysis Tools, Scholarly Communication, Visualization

IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION

Commercial & Public Tools, Discipline-Based Metadata Standards, Community Portals, Git-like Resources By Discipline, Data Journals, New Reward Systems, Training, Institutional Repositories, Commercial Repositories

automate: workflows, pipeline & service integrative frameworks
CS, SE
pool, share & collaborate web systems
scientific software engineering
semantics & ontologies
machine readable documentation
nanopub
[Carole Goble]

Why is This Important to Me
Personally?
• My wife is being treated for stage 1 breast
cancer
• This highlights for me the disparity
between what is happening in the lab and
what is happening in the clinic
– In the lab cancer is a personalized and
treatable condition
– In the clinic we are still equally “poisoning”
patients with drugs first introduced 10-20
years ago
http://sagecongress.org/Presentations/Sommer.pdf

[Josh Sommer]
Most Laboratories
• We are the long tail
• Goodbye to the student is goodbye to the data
• Very few of us have complied (or will comply) with the data management plans we write into grants
• Too much software is unusable
S. Veretnik, J.L. Fink, and P.E. Bourne 2008 Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 4(7): e1000136

Today’s Research Lifecycle is Digitally
Fragmented at Best
• Proof:
– I can’t immediately reproduce the research in my own laboratory
• It took an estimated 280 hours for an average user to approximately reproduce the paper
– Workflows are maturing and becoming helpful
– Data and software versions and accessibility prevent exact reproducibility
Daniel Garijo et al. 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome. PLOS ONE 8(11): e80278.
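
One modest way to reduce the version problem noted on this slide is to record, alongside each result, the software versions and input checksums that produced it. The sketch below is only an illustration of that idea and is not the method used in the cited study: the helper names `run_manifest` and `differences`, the package name `toolX`, and the file name `data.csv` are hypothetical, and package versions are supplied by the caller rather than discovered automatically.

```python
import hashlib
import platform
import sys


def run_manifest(inputs, packages=()):
    """Describe one analysis run.

    inputs:   mapping of input name -> raw bytes of that input
    packages: mapping of package name -> version string (caller-supplied)
    """
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": dict(packages),
        "inputs": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in inputs.items()
        },
    }


def differences(old, new):
    """List which parts of two manifests differ."""
    keys = ("python", "platform", "packages", "inputs")
    return [key for key in keys if old[key] != new[key]]


original = run_manifest({"data.csv": b"a,b\n1,2\n"}, {"toolX": "1.0"})
rerun = run_manifest({"data.csv": b"a,b\n1,3\n"}, {"toolX": "1.0"})
print(differences(original, rerun))
```

A manifest like this does not make a result reproducible by itself, but it makes it possible to say what changed between the original run and a later attempt.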
We Have Some Really Big Problems to
Solve – The Commons Can Help

What Really Happens When You Take a
Drug?

• Can we predict drug efficacy and toxicity?
• Can we reuse old drugs?
• Can we design personalized medicines?
One Drug, One Gene, One Disease

Bernard M. Nat Rev Drug Disc 8(2009), 959-968
Polypharmacology
• Tykerb – Breast cancer
• Gleevec – Leukemia, GI cancers
• Nexavar – Kidney and liver cancer
• Staurosporine – natural product – alkaloid – many uses, e.g., antifungal, antihypertensive

Collins and Workman 2006 Nature Chemical Biology 2 689-700

Polypharmacology is Not Rare but Common

• Single gene knockouts only
affect phenotype in 10-20% of
cases
A.L. Hopkins Nat. Chem. Biol. 2008 4:682-690

• 35% of biologically active
compounds bind to two or
more targets that do not have
similar sequences or global
shapes
Paolini et al. Nat. Biotechnol. 2006 24:805–815

• Predict side effects
• Repurpose drugs
Kaiser et al. Nature 462 (2009) 175-81
Drug Binding is Dynamic

• Drug effect depends not only on how strong the binding is (binding affinity) but also on how long the drug is “stuck” in the protein (residence time).
• Molecular Dynamics (MD) simulation is powerful but computationally intensive.
~ns: 1 day simulation
~ms – hours: >10^6 days
D. Huang et al. (2011), PLoS Comp Biol 7(2):e1002002
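
The scale gap on this slide follows from simple arithmetic. Assuming, as the slide's figures suggest, a throughput of roughly 1 ns of simulated time per day of computation (real throughput varies widely with system size and hardware), the sketch below estimates how many days of computation longer timescales would need. The function name and the chosen target timescales are illustrative assumptions.

```python
NS_PER_MS = 10**6   # nanoseconds in one millisecond
NS_PER_S = 10**9    # nanoseconds in one second


def days_needed(target_ns, ns_per_day=1.0):
    """Days of computation needed to simulate `target_ns` nanoseconds,
    given an assumed throughput in nanoseconds per day."""
    return target_ns / ns_per_day


targets = [
    ("1 ms", NS_PER_MS),
    ("1 s", NS_PER_S),
    ("1 hour", 3600 * NS_PER_S),
]
for label, target_ns in targets:
    print(f"{label:>7}: {days_needed(target_ns):.1e} days")
```

Under that assumption, even the millisecond end of the residence-time range needs on the order of 10^6 days, which matches the slide's figure; faster hardware changes `ns_per_day` but not the shape of the problem.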
Systems Pharmacology
[Diagram labels: Drug molecules, Uptake, Target binding, Catalytic site, Enzyme inhibition, Affect protein function, Metabolic network, Secretion (or biomass components), Systemic response]
Slide from Roger Chang

Multiscale Modeling of Drug Actions
[Diagram labels: Traditional Approach; Systems-based Approach; Understanding of dynamics and kinetics of protein-ligand interactions; Prediction of molecular interaction network on a genome scale; Reconstruction, analysis and simulation of biological networks; Knowledge representation and discovery & model integration; physiological process]
Slide from Lei Xie

More Generally Any Translational-based Research That Involves
Modeling at Multiple Scales

http://sagebase.org/
The History of Bioinformatics
According to Bourne
The Roots in Bioinformatics Series, PLOS Comp Biol
Timeline: 1980s, 1990s, 2000s, 2010s, 2020
Discipline: Unknown, Expt. Driven, Emergent, Over-sold, A Service, A Partner, A Driver
The Raw Material: Non-existent, Limited/Poor, More/Ontologies, Big Data/Siloed, Open/Integrated
The People: No name, Technicians, Industry recognition, data scientists, Academics

In Summary:
By the End of the Decade Biomedical
Research will Be a Truly Digital
Enterprise and Computational
Scientists Will Be At the Forefront
You Have Much to Look Forward To

Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 

Bioinformatics in the Era of Open Science and Big Data

  • 1. Bioinformatics in the Era of Open Science and Big Data Philip E. Bourne University of California San Diego pbourne@ucsd.edu 1/28/14 SIB Biel/Bienne 1
  • 2. My Bias • RCSB PDB/IEDB Database Developer – Views on community, quality, sustainability … • PLOS Journal Co-founder – Open Science Advocate • Associate Vice Chancellor for Innovation – Business models, interaction with the private sector, sustainability • Professor – Mentoring, reward system, value (or not) of research • Associate Director of NIH for Data Science – ??
  • 3. The History of Bioinformatics According to Bourne. Searls (ed), The Roots in Bioinformatics Series, PLOS Comp Biol. Timeline: 1980s; 1990s; 2000s; 2010s; 2020. Discipline: Unknown; Expt. Driven; Emergent; Over-sold; A Service; A Partner; A Driver. The Raw Material: Non-existent; Limited/Poor; More/Ontologies; Big Data/Siloed; Open/Integrated. The People: No name; Technicians; Industry recognition; data scientists; Academics.
  • 4. We Need to Start By Asking How Are We Using the Data Now! Only Then Can We Make Rational Decisions About Data – Large or Small
  • 5. Web Logs etc. Are Not Enough. [Figure: Structure Summary page activity for H1N1 Influenza related structures, Jan. 2008 to Jul. 2010. Structures shown: 3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir; 1RUZ: 1918 H1 Hemagglutinin.] * http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm [Andreas Prlic]
  • 6. We Need to Learn from Industries Whose Livelihood Addresses the Question of Use
  • 7. Next Consider What We Do Every Day. We take actions on digital data increasingly across boundaries
  • 8. Actions on Data Imply: • Ensuring data quality and hence trust • Making data sustainable • Making data open and accessible • Making data findable • Providing suitable metadata and annotation • Making data queryable • Making data analyzable • Presenting data so as to maximize its value • Rewarding good data practices
  • 10. Boundaries on Data Imply: • Working across biological scales • Working across biomedical disciplines • Working across basic and clinical research and practice • Working across institutional boundaries • Working across public and private sectors • Working across national and international borders • Working across funding agencies
  • 12. These Issues Have Been Around Almost As Long As Bioinformatics. The Good News is That “Big Data” Has Brought More Attention to the Problem
  • 13. What Are Big Data? • Large datasets from high throughput experiments • Large numbers of small datasets • Data which are “ill-formed” • The why (causality) is replaced by the what • A signal that a fundamental change is taking place – a tipping point?
  • 14. That Change is Embodied in: The Digital Enterprise • Consists of digital assets • E.g. datasets, papers, software, lab notes • Each asset is uniquely identified and has provenance, including access control • E.g. publishing simply involves changing the access control • Digital assets are interoperable across the enterprise
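A minimal sketch of the asset idea on this slide, under stated assumptions: the class name `DigitalAsset`, its fields, and the access levels are hypothetical and do not come from the talk. It only illustrates an asset that carries a unique identifier, a provenance trail, and an access-control setting, with publishing reduced to a change of that setting.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4


@dataclass
class DigitalAsset:
    """Hypothetical record for one asset (dataset, paper, software, lab notes)."""
    kind: str
    title: str
    owner: str
    # Unique identifier generated when the asset is created.
    asset_id: str = field(default_factory=lambda: uuid4().hex)
    # Access control, assumed here to be "private", "group", or "public".
    access: str = "private"
    # Ordered provenance trail of (timestamp, actor, action) entries.
    provenance: list = field(default_factory=list)

    def record(self, actor: str, action: str) -> None:
        """Append a timestamped provenance entry."""
        stamp = datetime.now(timezone.utc).isoformat()
        self.provenance.append((stamp, actor, action))

    def publish(self, actor: str) -> None:
        """Publishing only changes the access control and logs the change."""
        self.access = "public"
        self.record(actor, "access changed to public")


notes = DigitalAsset(kind="lab notes", title="Rotation notebook", owner="jane")
notes.record("jane", "created")
notes.publish("jane")
```

Keeping the provenance trail on the asset itself is one possible design; an enterprise-wide log keyed by `asset_id` would serve the same purpose.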
  • 15. The Enterprise Is Almost Anything: Your Lab, your Institution, the SIB, the NIH…
  • 16. Consider an Academic Institution As A Digital Enterprise • Jane scores extremely well in parts of her graduate on-line neurology class. Neurology professors, whose research profiles are on-line and well described, are automatically notified of Jane’s potential based on a computer analysis of her scores against the background interests of the neuroscience professors. Consequently, Professor Smith interviews Jane and offers her a research rotation. During the rotation she enters details of her experiments related to understanding a widespread neurodegenerative disease in an on-line laboratory notebook kept in a shared on-line research space – an institutional resource where stakeholders provide metadata, including access rights and provenance beyond that available in a commercial offering. According to Jane’s preferences, the underlying computer system may automatically bring to Jane’s attention Jack, a graduate student in the chemistry department whose notebook reveals he is working on using bacteria for purposes of toxic waste cleanup. Why the connection? They reference the same gene a number of times in their notes, which is of interest to two very different disciplines – neurology and environmental sciences. In the analog academic health center they would never have discovered each other, but thanks to the Digital Enterprise, pooled knowledge can lead to a distinct advantage. The collaboration results in the discovery of a homologous human gene product as a putative target in treating the neurodegenerative disorder. A new chemical entity is developed and patented. Accordingly, by automatically matching details of the innovation with biotech companies worldwide that might have potential interest, a licensee is found. The licensee hires Jack to continue working on the project. Jane joins Joe’s laboratory, and he hires another student using the revenue from the license. The research continues and leads to a federal grant award. The students are employed, further research is supported and in time societal benefit arises from the technology. From “What Big Data Means to Me”, JAMIA 2014
  • 17. The NIH is Starting to Think About the Digital Enterprise, Witness… bd2k.nih.gov
  • 18. What Defines the Digital Enterprise • Trans-NIH collaboration – change culture • Long-term NIH strategic planning • The BD2K Initiative • A “hub” of data science activities • International cooperation • Interagency cooperation • Data sharing policies
  • 19. Consider One NIH Scenario • NIH-Drive – Investigator A from the NCI makes frequent reference to the over expression of genes x and y – Investigator B from the NHLBI makes frequent reference to the under expression of genes x and y – Automatic notification of a potential common interest before publication or database deposition
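The notification step in this scenario could be sketched roughly as follows. This is an illustration only, not how NIH-Drive works: the function names, the `min_mentions` threshold, the gene vocabulary, and the sample notes are all assumptions. The idea is simply to flag pairs of investigators whose notes both mention the same gene frequently.

```python
from collections import Counter
from itertools import combinations


def gene_mentions(notes, vocabulary):
    """Count how often each gene in the vocabulary appears in a list of notes."""
    counts = Counter()
    for note in notes:
        for token in note.lower().split():
            word = token.strip(".,;:()")
            if word in vocabulary:
                counts[word] += 1
    return counts


def common_interests(notebooks, vocabulary, min_mentions=2):
    """Return (investigator_a, investigator_b, shared_genes) for each pair
    whose notes each mention the same gene at least min_mentions times."""
    frequent = {}
    for who, notes in notebooks.items():
        counts = gene_mentions(notes, vocabulary)
        frequent[who] = {g for g, n in counts.items() if n >= min_mentions}
    matches = []
    for a, b in combinations(sorted(frequent), 2):
        shared = frequent[a] & frequent[b]
        if shared:
            matches.append((a, b, sorted(shared)))
    return matches


# Hypothetical notes; "genex" and "geney" stand in for genes x and y.
notebooks = {
    "investigator_a_nci": [
        "Over expression of genex and geney in tumor samples.",
        "genex and geney remain over expressed after treatment.",
    ],
    "investigator_b_nhlbi": [
        "Under expression of genex and geney in heart tissue.",
        "Knockdown confirms genex, geney are under expressed.",
    ],
}
matches = common_interests(notebooks, {"genex", "geney"})
```

A real system would need gene-name normalization and respect for each asset's access control before notifying anyone; both are left out of this sketch.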
  • 20. The NIH Process. An external advisory group provided a valuable blueprint for what should be done http://acd.od.nih.gov/Data%20and%20Informatics%20Working%20Group%20Report.pdf
  • 21. Blueprint Recommendations • Promote central and federated catalogs – Establish minimal metadata framework – Tools to facilitate data sharing – Elaborate on existing data sharing policies • Support methods and applications – Fund all phases of software development – Leverage lessons from National Centers • Training – More funding – Enhance review of training apps – Quantitative component to all awards • On campus IT strategic plan – Catalog of existing tools – Informatics laboratory – Ditto big data • Sustainable funding commitment acd.od.nih.gov/diwg.htm
  • 22. General Features of NIH Data Science • Lightweight metadata standards • Data & software registries • Expanded policies on data sharing, open source software • Training programs & reward systems • Institutional incentives • Private sector incentives • Data centers serving community needs
  • 23. What is Under Way? • Now: – Data centers (under review) – Data science training grants (call Q1 14) – Pilot data catalog consortium (call out) – Genomic Research Data Alliance (being finalized) – Piloting “NIH-drive” • What Is Planned: – Extended public-private programs specifically for data science activities – Interagency activities – International exchange programs – Cold Spring Harbor-like training facilities – bi-coastal? – Programs for better data descriptions – Reward institutions/communities – Policies to get clinical trial data into the public domain
  • 24. The History of Bioinformatics According to PEB. The Roots in Bioinformatics Series, PLOS Comp Biol. Timeline: 1980s; 1990s; 2000s; 2010s; 2020. Discipline: Unknown; Expt. Driven; Emergent; Over-sold; A Service; A Partner; Driver. The Raw Material: Non-existent; Limited/Poor; More/Ontologies; Big Data/Siloed; Open/Integrated. The People: No name; Technicians; Industry recognition; data scientists; Academics.
  • 25. Why Will Science Become More Open? • The public (and hence the politicians) demand it • It’s the right thing to do • It’s part of the modern psyche • The scholarly enterprise is broken and more stakeholders are acknowledging it
  • 26. Personal Evidence • I have a paper with 16,000 citations that no one has ever read • I have papers in PLOS ONE that have more citations than ones in PNAS • I have data sets I am proud of but no place to put them • I “can’t” reproduce work from my own lab
  • 27. Politicians Demand It: G8 open data charter http://opensource.com/government/13/7/open-data-charter-g8
  • 28. What Are Some of the Ramifications of Open Science?
  • 29. Open Science Has The Potential to Deinstitutionalize [Photo: Daniel Hulshizer/Associated Press]
  • 31. An Example of That Potential: The Story of Meredith http://fora.tv/2012/04/20/Congress_Unplugged_Phil_Bourne
  • 32. Open Science Has The Potential to Deinstitutionalize [Photo: Daniel Hulshizer/Associated Press]
  • 34. There Still Needs to be a Reward System. The Wikipedia Experiment – Topic Pages • Identify areas of Wikipedia that relate to the journal and that are missing or stubs • Develop a Wikipedia page in the sandbox • Have a Topic Page Editor review the page • Publish the copy of record with associated rewards • Release the living version into Wikipedia
  • 35. One Possible End Product of Open Science. 0. Full text of PLoS papers stored in a database 1. A link brings up figures from the paper 2. Clicking the paper figure retrieves data from the PDB which is analyzed 3. A composite view of journal and database content results 4. The composite view has links to pertinent blocks of literature text and back to the PDB. In use: 1. User clicks on thumbnail 2. Metadata and a webservices call provide a renderable image that can be annotated 3. Selecting a feature provides a database/literature mashup 4. That leads to new papers. PLoS Comp. Biol. 2005 1(3) e34
  • 36. Change in the Way we Support the Research Lifecycle. [Diagram labels: Authoring Tools; Data Capture; Lab Notebooks; Software Repositories; Analysis Tools; Scholarly Communication; Visualization] IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION [Diagram labels: Commercial & Public Tools; Discipline-Based Metadata Standards; Community Portals; Git-like Resources By Discipline; Data Journals; New Reward Systems; Training; Institutional Repositories; Commercial Repositories]
  • 38. [Diagram labels: automate: workflows, pipeline & service integrative frameworks; CS; SE; pool, share & collaborate; web systems; scientific software engineering; semantics & ontologies; machine readable documentation; nanopub] [Carole Goble]
  • 39. Why is This Important to Me Personally? • My wife is being treated for stage 1 breast cancer • This highlights for me the disparity between what is happening in the lab and what is happening in the clinic – In the lab cancer is a personalized and treatable condition – In the clinic we are still equally “poisoning” patients with drugs first introduced 10-20 years ago
  • 42. Most Laboratories • We are the long tail • Goodbye to the student is goodbye to the data • Very few of us have complied (or will comply) with the data management plans we write into grants • Too much software is unusable. S. Veretnik, J.L. Fink, and P.E. Bourne 2008 Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 4(7): e1000136
  • 43. Today’s Research Lifecycle is Digitally Fragmented at Best • Proof: – I can’t immediately reproduce the research in my own laboratory • It took an estimated 280 hours for an average user to approximately reproduce the paper – Workflows are maturing and becoming helpful – Data and software versions and accessibility prevent exact reproducibility. Daniel Garijo et al. 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome. PLOS ONE 8(11) e80278
  • 44. We Have Some Really Big Problems to Solve – The Commons Can Help
  • 45. What Really Happens When You Take a Drug? • Can we predict drug efficacy and toxicity? • Can we reuse old drugs? • Can we design personalized medicines?
  • 46. One Drug, One Gene, One Disease. Bernard M. Nat Rev Drug Disc 8(2009), 959-968
  • 47. Polypharmacology • Tykerb – Breast cancer • Gleevec – Leukemia, GI cancers • Nexavar – Kidney and liver cancer • Staurosporine – natural product – alkaloid – many uses, e.g., antifungal, antihypertensive. Collins and Workman 2006 Nature Chemical Biology 2 689-700
  • 48. Polypharmacology is Not Rare but Common • Single gene knockouts only affect phenotype in 10-20% of cases (A.L. Hopkins Nat. Chem. Biol. 2008 4:682-690) • 35% of biologically active compounds bind to two or more targets that do not have similar sequences or global shapes (Paolini et al. Nat. Biotechnol. 2006 24:805–815) • Predict side effects • Repurpose drugs. Kaiser et al. Nature 462 (2009) 175-81
  • 49. Drug Binding is Dynamic • Drug effect depends not only on how strongly the drug binds (binding affinity) but also on how long the drug is “stuck” in the protein (residence time). • Molecular Dynamics (MD) simulation is powerful but computationally intensive: ~ns timescales take about 1 day of simulation, so ~ms to hours would take >10^6 days. D. Huang et al. (2011), PLoS Comp Biol 7(2):e1002002
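The simulation-cost figures on this slide follow from simple scaling, sketched below as a back-of-the-envelope calculation. The throughput of roughly 1 ns of simulated time per day of computing is read off the slide; the function name `days_needed` and the assumption of constant throughput are illustrative.

```python
# Assumed throughput from the slide: about 1 ns of simulated time per day.
NS_PER_DAY = 1

NS_PER_MS = 10**6              # nanoseconds in one millisecond
NS_PER_HOUR = 3600 * 10**9     # nanoseconds in one hour


def days_needed(target_ns, ns_per_day=NS_PER_DAY):
    """Days of computing needed to simulate target_ns nanoseconds,
    assuming constant throughput."""
    return target_ns / ns_per_day


days_for_ms = days_needed(NS_PER_MS)      # 10**6 days for one millisecond
years_for_ms = days_for_ms / 365.25       # on the order of thousands of years
days_for_hour = days_needed(NS_PER_HOUR)  # far beyond 10**6 days
```

Under these assumptions a single millisecond already costs about a million days of computing, which is the gap between binding affinity estimates and residence-time estimates that the slide points to.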
  • 50. Systems Pharmacology. [Diagram labels: Drug molecules; Uptake; Target binding; Catalytic site; Enzyme inhibition; Affect protein function; Metabolic network; Secretion (or biomass components); Systemic response] Slide from Roger Chang
  • 51. Multiscale Modeling of Drug Actions. [Diagram labels: Traditional Approach; Systems-based Approach; Understanding of dynamics and kinetics of protein-ligand interactions; Knowledge representation and discovery & model integration; Prediction of molecular interaction network on a genome scale; physiological process; Reconstruction, analysis and simulation of biological networks] Slide from Lei Xie
  • 52. More Generally Any Translational-based Research That Involves Modeling at Multiple Scales http://sagebase.org/
  • 53. The History of Bioinformatics According to Bourne. The Roots in Bioinformatics Series, PLOS Comp Biol. Timeline: 1980s; 1990s; 2000s; 2010s; 2020. Discipline: Unknown; Expt. Driven; Emergent; Over-sold; A Service; A Partner; A Driver. The Raw Material: Non-existent; Limited/Poor; More/Ontologies; Big Data/Siloed; Open/Integrated. The People: No name; Technicians; Industry recognition; data scientists; Academics.
  • 54. In Summary: By the End of the Decade Biomedical Research Will Be a Truly Digital Enterprise and Computational Scientists Will Be At the Forefront. You Have Much to Look Forward To