TRD 3: MULTI-SCALE NETWORKS – PROJECT SUMMARY
Although networks have been extremely useful for representing molecular interactions and
mechanisms, network diagrams do not visually resemble the contents of cells. Rather, the cell
involves a multi-scale hierarchy of components – proteins are subunits of protein complexes which, in
turn, are parts of pathways, biological processes, organelles, cells, tissues, and so on. In this
Technology Research and Development Project (TRD), we will pursue methods that move Network
Biology towards such hierarchical, multi-scale views of the structure and function of biological
systems. Biological ontologies are one very successful framework for capturing hierarchical multi-
scale organization, but they have so far been only indirectly connected to biological networks and
other types of ‘omics data. Recently, we introduced methods for inferring the terms and term relations
of a gene ontology directly from the hierarchical structure contained in molecular networks, and we
prototyped a web resource to distribute network-based ontologies (NeXO, nexontology.org). This
recent progress motivates and lays groundwork for our present focus on hierarchical multi-scale
representations. Specific aims are to develop tools that: (1) Iteratively and flexibly incorporate new
network experimental results into a ‘working’ NeXO ontology, (2) Use a gene ontology structure, either
inferred or literature curated, to guide an engine for generalized functional predictions, and (3) Explore
multi-scale analysis above the cellular level, by bridging ligand-receptor networks to networks of cell-
cell communication. These aims are stimulated by a range of Driving Biomedical Projects involving
the Gene Ontology project, the Saccharomyces Genome Database, a Cancer Gene Ontology, and
multi-scale analysis of viral-host, cell-cell communication and social networks. Ultimately, all research
aims synergize to use network data to propel hierarchical models of biological structure and function.
TRD 3: MULTI-SCALE NETWORKS – PROJECT NARRATIVE
Although networks have been extremely useful for representing interactions and group formation,
network diagrams fail to capture important aspects of biological structure and function. We will pursue
methods that move Network Biology towards more accurate hierarchical, multi-scale views of
biological systems. The hierarchical models developed here will enable integration of both basic and
clinical data to predict disease outcomes in response to specific therapies.
TRD 3: MULTI-SCALE NETWORKS – SPECIFIC AIMS
Although networks have been very useful for representing molecular interactions and mechanisms,
network diagrams do not visually resemble the contents of cells. Rather, the cell involves a multi-scale
hierarchy of components – proteins are subunits of protein complexes which, in turn, are parts of
pathways, biological processes, organelles, cells, tissues, and so on. In this technology research
project, we will pursue methods that move Network Biology towards such hierarchical, multi-scale
views of biological structure and function.
Aim 1. Assembly and refinement of gene ontology structure from biological network data.
Ontologies have been very successful at capturing hierarchical, multi-scale cellular organization. In
the prior period of support we introduced methods for assembling a gene ontology directly from the
hierarchical structure evidenced by molecular networks and other ‘omics data. We prototyped a web
resource to distribute network-based ontologies (NeXO, nexontology.org), but it is still at an early
stage. In the next support period, we will research methods to iteratively and flexibly incorporate new
experimental results and data into a ‘working’ NeXO ontology, highlighting new terms and term
relations that are created, alongside existing terms/relations that are further supported or weakened.
We will transform nexontology.org into an interactive community resource that enables investigators not
only to browse an existing ontology but to create, share, and iteratively update, revise, correct, and
expand these ontologies. The potential of this aim is to effectively systematize and crowd-source an
important type of biological model – the ontology.
Aim 2. Functionalized gene ontologies as a hierarchy of phenotypic prediction. Hierarchy and
scale are important not only for capturing the physical architecture of a system (Aim 1) but also its
function. Recent progress in artificial intelligence (AI), embodied by agents such as Siri and Watson,
inspires an approach for moving from networks and gene ontologies, which are currently descriptive in
nature, towards models that can predict a range of cellular phenotypes and answer
biological questions. Using these AIs as a rough inspirational guideline, we will develop gene
ontologies as a major platform for the functional translation of genotype to phenotype, with a particular
focus on personalized cancer therapeutics. This aim intersects with the separate TRD project on
Predictive Networks and serves as a bridge between the two TRDs.
Aim 3. Bridging ligand-receptor networks to cell-cell communication networks. We will also
explore multi-scale network analysis above the cellular level, in the context of an emerging class of
biological networks called cell-cell interaction networks. In these networks nodes are cells, and edges
represent physical or chemical (e.g. hormonal) interactions. Inter-cellular signaling and regulation
networks could in the future be controlled to grow artificial organs, heal tissues and develop novel
therapies. We will infer, analyze and visualize multi-scale models of inter-cellular communication
networks and their corresponding intracellular signaling networks and pathways, which link to
traditional molecular interaction network analysis methods. For instance, we will use network analysis
methods to identify potential control points in the cell-cell and intracellular interaction network with
applications to regenerative medicine (growing blood from stem cells). Growth of data and analysis
methods in this area will enable network science to contribute to the wider understanding of
physiological systems.
These aims are stimulated by a range of Driving Biomedical Projects involving the Gene Ontology
project, the Saccharomyces Genome Database, a Cancer Gene Ontology, and multi-scale analysis of
viral-host, cell-cell communication and social networks. Ultimately, all research aims synergize to use
network data to propel hierarchical models of biological structure and function.
TRD 3: MULTI-SCALE NETWORKS – RESEARCH STRATEGY
SIGNIFICANCE
Why it is time to move beyond flat models of biological networks. Like any model of the world,
our view of the cell is inescapably bound by the time and place in which we live. Over the years
different schools have fashioned the cell in a variety of forms, from bags of enzymes [1], to metabolic
channels [2], to feedback circuits [3], to complex systems [4], to gels [5], to self-modifying programs in
software [6]. A model that has pervaded cell biology for the past fifteen years is the so-called “network”
view (Figure 1A), which has bloomed in parallel with the emergence of human-made networks such
as the Internet and Facebook. This view treats cells as containers for vast networks of “nodes”
(genes, gene products, metabolites, or other biomolecules) connected by “links” (physical interactions
or functional associations) [7]. Network representations of the cell flow directly from the ability to
characterize not only genes and proteins in isolation, but also their functional similarities and physical
binding partners, a major outcome of transcriptomics and proteomics approaches. Analysis of
network information, whether biological or human-made, is an active field leading to algorithms that
detect nodes with strategic positions within a network [7] or that analyze networks to identify modular
structures [8] (a topic of earlier progress during the past period of support for the NRNB).
While incredibly influential, the network is likely not the ultimate representation of a cell, for two
reasons. First, network diagrams do not visually resemble the contents of cells. Nowhere in the cell do
we observe actual wires running between genes and proteins, unlike in the Internet, which is truly a
network of wires among processing units. Rather, the cell involves a multi-scale hierarchy of
components that is not readily captured by basic network representations. For example, the
proteasome has been mapped extensively to identify its key genes and interactions, but the network
visualization of these data (Figure 1A) is very different from the proteasome’s spatial appearance
(Figure 1B). The interactions making up the proteasome factor into a regulatory particle and a core,
which, in turn, factor into a base and a lid, and an alpha and beta subunit, respectively. This
hierarchical structure is obscured by the network visualization of pairwise relationships between gene
products. Aim 1 will address this shortcoming, by using molecular networks and other ‘omics data to
build hierarchical models of the cell parallel to the Gene Ontology [9].
Figure 1. From networks to ontologies. (A) Network representation of three types of interactions that form the
proteasome structure, displayed using a force directed layout. (B) Cartoon representation of the structure of the
proteasome (PDB entry 4b4t), created by integrating partial crystallographic structures obtained by analysis of
2.4 million images from electron microscopy. (C) Hierarchical factorization of the proteasome sub-components
as described by our data-driven gene ontology NeXO. Across all panels, colors indicate membership to the core
complex beta subunit (red), core complex alpha subunit (orange), regulatory particle lid complex (blue) and
regulatory particle base complex (purple) according to the GO (A), the Protein Data Bank (B) and NeXO (C).
From description to prediction. Second, many of the molecular networks published to date,
including many from the NRNB or earlier research by our labs [10-21], are descriptive maps of physical or
functional connectivity rather than predictive models. For example, technologies such as yeast two
hybrid, protein affinity purification, and chromatin immunoprecipitation are often used to define and
draw large networks of protein-protein and protein-DNA interactions [22], but these static maps do not,
by themselves, predict cell behavior. Although we and many others in the field of network biology
have inferred networks capable of predicting gene function or phenotypic responses [reviewed in
refs. 23 and 24; network inference was the focus of Aim 4 in the past funding period], these efforts
have tended to focus on a specific class of predictions, e.g. gene expression level or cell growth rate.
Assembling a model that would predict a range of phenotypes, rather than only one type of outcome,
requires understanding how phenotypes are interrelated. Here again a hierarchy is important, since
cellular organization involves a multi-scale hierarchy not only in structure but also in function. For
example, the proteasome is a central component of ubiquitin-mediated protein degradation, which,
depending on an intricate set of inputs and rules, can result in cellular homeostasis, differentiation,
death, and other fates. This multi-scale hierarchy of processes is, again, simply not exposed by a
standard pairwise network representation. Aim 2 will address this shortcoming by developing methods
to ‘functionalize’ the Gene Ontology, so that it is not merely a static description of the contents of cells,
but an active framework for predicting phenotype from genotype.
From networks to ontologies: Building better models of cell structure from omics data. To
capture hierarchical organization, a particularly promising direction in computer science has been the
development of the ontology, a model that divides its subject domain into a set of fundamental
concepts or entities and relationships among those entities [25]. Ontologies arise from the metaphysics
branch of philosophy, which is concerned with the nature of what exists and the categories into which
the world’s objects naturally fall. Ontologies build upon and extend network models in two key ways:
‘entities’ refer not only to elemental objects but also to any meaningful grouping of objects, and
‘relationships’ refer not only to direct connections but also to nested structures, such as one entity
being a part or type of another. Thus, ontologies explicitly allow for a higher order organization of
knowledge, missing from raw networks. They have been key for building powerful knowledge
representation and reasoning systems in many domains [26], including biomedicine [27].
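To make the contrast with a flat network concrete, the sketch below encodes the proteasome hierarchy described earlier as "part of" relations and collects the genes that fall under any term. It is a minimal illustration only: the gene identifiers are placeholders, and the names `PART_OF`, `DIRECT_GENES`, `build_children`, and `genes_under` are assumptions introduced here, not part of GO or NeXO.

```python
from collections import defaultdict

# (child, parent) "part of" relations, following the proteasome example in the text.
PART_OF = [
    ("regulatory particle", "proteasome"),
    ("core", "proteasome"),
    ("base", "regulatory particle"),
    ("lid", "regulatory particle"),
    ("alpha subunit", "core"),
    ("beta subunit", "core"),
]

# Placeholder gene identifiers (hypothetical, for illustration only).
DIRECT_GENES = {
    "base": {"gene1", "gene2"},
    "lid": {"gene3"},
    "alpha subunit": {"gene4", "gene5"},
    "beta subunit": {"gene6"},
}


def build_children(relations):
    """Index the relations so each parent term lists its child terms."""
    children = defaultdict(list)
    for child, parent in relations:
        children[parent].append(child)
    return children


def genes_under(term, children, direct_genes):
    """Return genes annotated to a term or to any of its descendants."""
    genes = set(direct_genes.get(term, ()))
    for child in children.get(term, ()):
        genes |= genes_under(child, children, direct_genes)
    return genes


CHILDREN = build_children(PART_OF)
core_genes = genes_under("core", CHILDREN, DIRECT_GENES)
all_genes = genes_under("proteasome", CHILDREN, DIRECT_GENES)
```

Here a higher-level entity such as the core is itself a grouping of lower-level entities, which is the nesting that a list of pairwise interactions does not express directly.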
Ontologies became very influential in cell biology through the development of the Gene Ontology
(GO) [9]. GO is a major resource of knowledge about genes, gene products, and the hierarchy of cellular
components, molecular functions and biological processes in which they participate. Entities in GO
(GO terms) are hierarchical groupings of other entities. The GO resource is presently very large, with
nearly 35,000 GO terms connected by ~65,000 hierarchical term-term relations, describing more than
80 different species. The impact of GO is hard to overstate – just try to think of a single modern ‘omics
analysis that does not use GO to validate a novel data set or approach, or to generate new
mechanistic hypotheses. In a sense GO is the most universal, and universally accepted, model of a
cell that we currently have.
One limitation of GO lies in the fact that the ontology structure is constructed by a diverse team of
scientists according to their best abilities to curate the published scientific literature. Thus, GO
inevitably misses the large proportion of cell biology that is not yet known or has not yet been curated,
and it contains biases that are hard to control. To address these challenges, in the prior period of
support we investigated whether gene ontologies could be inferred computationally directly from
systematic molecular interaction networks [28]. In this study, a large fraction of the GO hierarchy was
recapitulated de novo, directly from network data gathered in budding yeast. For example, the
pairwise interaction network among the genes and gene products of the proteasome (Figure 1A) was
transformed to infer the hierarchical structure of proteasomal components to a high degree of
accuracy (Figure 1C). In addition, several hundred cellular entities were identified from the data that
had not yet been catalogued in GO, pointing to potentially novel or uncurated molecular machinery
which we are pursuing in collaboration with the Gene Ontology Consortium (formerly a CSP, now a
DBP).
Over the next few years, we will expand on this preliminary work to introduce a system for organizing
molecular interactions and cancer ‘omics data as a genomics-driven, crowd-sourced Gene Ontology.
This will address several parallel challenges in the ‘omics sciences:
(1) The need to move beyond clustering to recognize the multi-scale structure embedded in data
(2) The need to improve ontologies of gene function in their scalability, consistency and coverage
(3) The continued need to provide biomedicine with an accurate map of hallmark pathways and
processes that drive disease progression.
Taking clues from Siri: ‘Active’ networks and ontologies. Whether based on expert knowledge or
inferred from data, current gene ontologies are static descriptions of cellular organization. They
support representation of, and reasoning over, the structural relationships among biological entities [27,29] but
lack any native capacity to capture dynamic biological states or make phenotypic predictions.
However, since gene ontologies inherently represent multi-scale hierarchy in cellular organization,
they provide in theory an ideal substrate for building models that would also be predictive of a range
of cellular responses and phenotypes.
In this respect, intelligent agents developed in the field of knowledge representation and reasoning [26],
such as Apple’s Siri and IBM’s Watson, provide an excellent example of what a predictive, or
‘executable’, ontology looks like. At Siri’s core is a series of ontologies containing knowledge that
concerns Siri – answers to questions one would normally ask an iPhone [30]. For instance, Siri uses an
ontology for event planning which treats both meals and movies as types of events, where meals
involve a restaurant and a restaurant consists of components such as a name, address, and style of
food. In many ways, such ontologies are similar in structure to bio-ontologies such as GO (Figure 2).
Figure 2. From ontologies to active ontologies. A subset of the Gene Ontology [9], left, alongside a subset of
an active ontology for event planning [30], right. Red relationships and entities indicate dynamic computation.
Unlike gene ontologies, however, which are essentially descriptive, Siri’s ontologies are coupled with
dynamic reasoning systems that render them active: “Whereas a conventional ontology is a formal
representation of domain knowledge with distinct concepts and relations among concepts, an Active
Ontology is a processing formalism where distinct processing elements are arranged according to
ontology notions; it is an execution environment” [30]. These active ontologies not only encode entities
and relations, but entities are associated with states and relations are associated with rule sets that
perform actions within and among entities. Through a bottom up execution, input states are
incrementally propagated up the hierarchy to impact higher-level entities, whose states are output as
the answer to the initial question – the best prediction based on the inputs.
For example, try asking Siri to “Find a good sushi restaurant for two tonight”. This query is translated
by setting the states of several entities: style is set to ‘sushi’, address to the user’s current location,
party size to the value ‘2’, and event date to today’s date (Figure 2). These values are propagated
through the ontology to generate a list of restaurants, which becomes the state of the event entity.
This event result can then be provided to the user or included in further computations. In Aim 2, we
will explore whether such systems can teach us how to develop question-and-answer, or
genotype-to-phenotype prediction, systems for cell biology [31].
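The bottom-up execution described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not Siri's implementation: the `Entity` class, the rule functions, and the restaurant records are all hypothetical and introduced only to show how input states set on leaf entities propagate upward through rules attached to higher-level entities.

```python
class Entity:
    """An ontology entity with a state and an optional rule over its children."""

    def __init__(self, name, children=(), rule=None):
        self.name = name
        self.children = list(children)
        self.rule = rule      # combines child states into this entity's state
        self.state = None

    def evaluate(self, inputs):
        """Set leaf states from the inputs, then propagate states upward."""
        if not self.children:
            self.state = inputs.get(self.name)
        else:
            child_states = {c.name: c.evaluate(inputs) for c in self.children}
            self.state = self.rule(child_states)
        return self.state


# Hypothetical toy database; names and cities are made up for illustration.
RESTAURANTS = [
    {"name": "Sushi Ko", "style": "sushi", "city": "San Diego"},
    {"name": "Taco Sol", "style": "mexican", "city": "San Diego"},
    {"name": "Maki Bar", "style": "sushi", "city": "Toronto"},
]


def restaurant_rule(states):
    """Keep restaurants matching the requested style and location."""
    return [r["name"] for r in RESTAURANTS
            if r["style"] == states["style"] and r["city"] == states["address"]]


def event_rule(states):
    """Assemble the event from its component states."""
    return {"options": states["restaurant"],
            "party_size": states["party size"],
            "date": states["event date"]}


restaurant = Entity("restaurant", [Entity("style"), Entity("address")], restaurant_rule)
event = Entity("event", [restaurant, Entity("party size"), Entity("event date")], event_rule)

answer = event.evaluate({"style": "sushi", "address": "San Diego",
                         "party size": 2, "event date": "today"})
```

By analogy, a functionalized gene ontology would set the states of genes from a genotype and propagate them up to terms whose states represent phenotypes; the rule sets for that setting are the open research question of Aim 2.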
Cell-cell interaction networks. We will also develop technology for understanding network structure
above the cellular level. In so-called cell-cell interaction networks, nodes are cells and edges
represent physical or chemical (e.g. hormonal) interactions. Chemical interactions are of greatest
interest as they describe inter-cellular signaling and regulation pathways, which could in the future be
controlled to grow artificial organs, heal tissues and develop novel therapies. Increasing information in
this area will enable network science to contribute to the wider understanding of physiological
systems. We have gained experience and interest in this area via analysis of two novel experimentally
mapped cell-cell interaction networks of the developing human hematopoietic system [32,33] in
collaboration with Peter Zandstra at the University of Toronto (Zandstra is now a DBP). The Zandstra
lab is interested in mapping inter-cellular networks and feedback in regulating stem and progenitor cell
fate for the purposes of growing blood from stem cells, which would be safer than blood donations.
Cell-cell interaction networks demand new analysis tools that consider their autocrine and paracrine
structure and how they are controlled by intra-cellular molecular networks. Despite the recognized
importance of inter-cellular networks and feedback in regulating multicellular organism development,
the specific cell populations involved and underlying molecular mechanisms are largely undefined. For
example, blood cells are known to secrete and respond to a large number of regulatory proteins in
lineage- and differentiation stage-specific patterns [34,35]. Dynamic mathematical models of cells
patterning into tissues during development have been built [36-38], but they function at the cell
population/tissue level and treat cells as a compartment or spatial gradient and do not consider actual
cell-cell interactions. Perhaps the best-studied cell-cell interaction network is that of the worm,
Caenorhabditis elegans, which has been completely mapped over organism development by
microscopy. Network analysis by clustering found that interneurons are more densely connected in
the nervous system compared to sensory or motor neurons, leading to the interpretation that these
cells act as central processing units [39]. More recent work predicted cell-cell networks involved in
cancer therapy resistance [40], and found that specific network motifs are enriched in inter-cellular
cytokine-mediated communication networks [41] and that specific components are more important than
others [42]. However, this work has thus far studied small cell-cell network models that were never
experimentally validated. As technology for single cell and stem cell measurement improves, we
expect a growth in the amount of cell-cell network information. We are already observing this growth
in projects such as a new CSP from Laurie Ailles at the University Health Network in Toronto, who is
studying how cancer-associated fibroblasts provide a supportive microenvironment for cancer stem
cells within high-grade serous ovarian cancer and other cancers. New technology she has developed
quantifies the protein levels of 363 cell surface antigens in single cell populations [43].
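One simple way to bridge a ligand-receptor network to a cell-cell network is to draw an edge from a sender cell type to a receiver cell type whenever the sender produces a ligand whose receptor the receiver expresses. The sketch below illustrates that idea only; the function name `cell_cell_edges`, the cell types, and the ligand and receptor identifiers are placeholders assumed for illustration, not data or methods from the cited studies.

```python
def cell_cell_edges(ligands_by_cell, receptors_by_cell, ligand_receptor_pairs):
    """Derive directed cell-cell edges from a molecular ligand-receptor network.

    Returns tuples (sender, receiver, kind, supporting_pairs), where kind is
    "autocrine" for self-signaling and "paracrine" otherwise.
    """
    edges = []
    for sender, ligands in ligands_by_cell.items():
        for receiver, receptors in receptors_by_cell.items():
            support = sorted(
                (lig, rec) for lig, rec in ligand_receptor_pairs
                if lig in ligands and rec in receptors
            )
            if support:
                kind = "autocrine" if sender == receiver else "paracrine"
                edges.append((sender, receiver, kind, support))
    return edges


# Placeholder expression data (hypothetical).
ligands_by_cell = {"stem": {"LIG_A"}, "progenitor": {"LIG_B"}}
receptors_by_cell = {"stem": {"REC_A", "REC_B"}, "progenitor": {"REC_A"}}
ligand_receptor_pairs = {("LIG_A", "REC_A"), ("LIG_B", "REC_B")}

edges = cell_cell_edges(ligands_by_cell, receptors_by_cell, ligand_receptor_pairs)
```

Keeping the supporting ligand-receptor pairs on each edge preserves the link between the inter-cellular scale and the intracellular signaling networks beneath it.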
INNOVATION
Central innovation and hypothesis. The central innovation of this TRD project is a set of ideas and
approaches for transitioning Network Biology from the current status-quo of flat, pairwise, and
descriptive representations of biological interactions, to a future in which the same interaction data
lead to the construction of hierarchical models of biological structure and function. We will explore the
hypothesis that current network representations, which view a dataset of pairwise interactions as a
mathematical graph of nodes and edges, may be “too close” to the raw data to allow for complete or
even accurate biological insight. Models derived from the same interactions, such as gene ontologies
and biological process diagrams, may form a more intuitive result, provided these multi-scale
formulations can avoid the tendency towards over-fitting or over-interpretation.
The most direct representations of data are not always the most desirable for meaningful
interpretation of those data. In x-ray crystallography, the most direct representations of x-ray
diffraction patterns are two-dimensional images [44]. However, when many such images are integrated
and analyzed, exquisite 3D structural models of proteins emerge which, in turn, enable accurate
predictions of protein dynamics and function. Similarly, from many molecular measurements and
interaction data sets the higher order structure and function of the cell might emerge, if only we could
figure out how to assemble these images properly.
Turning networks into ontologies: towards a Network-eXtracted Ontology. Recently we and
others have shown very promising results in the hierarchical analysis of physical and genetic
networks—i.e., that networks harbor rich structure which is not only modular but also hierarchical and
multi-scale [45-50] (Aim 1 Progress Report). In particular, we have been able to recover ~60% of the
GO Cellular Component hierarchy de novo, directly from physical and genetic network
data gathered in S. cerevisiae and in a manner that is completely independent from the known
structure of GO or from the literature. The resulting Network-eXtracted Ontology, which we call NeXO,
provides a structured hierarchical interpretation of network data which will in most cases be vastly
preferable to flat lists of interactions (a.k.a. interaction ‘hairballs’) or flat lists of network
clusters/complexes. The focus of Aim 1, and an innovative aspect of this proposal, is to explore how
these ontologies can be iteratively updated by a community of biomedical investigators.
3.1 ASSEMBLY AND REFINEMENT OF ONTOLOGY STRUCTURE FROM BIOLOGICAL NETWORK DATA
Project Leader: Trey Ideker (UCSD)
Overview. Ontologies have been very successful at capturing hierarchical, multi-scale cellular
organization. In the prior period of support we introduced methods for assembling a gene ontology
directly from the hierarchical structure contained in molecular networks and other ‘omics data. We
prototyped a web resource to distribute network-based ontologies (NeXO, nexontology.org), but it is
still at an early stage. In the next support period, we will research methods to iteratively and flexibly
incorporate new experimental results and data into a ‘working’ NeXO ontology, highlighting new terms
and term relations that are created, alongside existing terms/relations that are further supported or
weakened. We will transform nexontology.org into an interactive community resource that enables
investigators not only to browse an existing ontology but to create, share, and iteratively update,
revise, correct, and expand these ontologies. These tools will be built and explored alongside Driving
Biomedical Projects including a Yeast Gene Ontology, a Cancer Gene Ontology, a Viral-Host Gene
Ontology and a hierarchical exploration of social networks. The goal is a means of systematically
incorporating ‘omics data into whole-cell ontological models, with the potential to systematize and
crowd-source an important type of model construction.
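As a rough sketch of what highlighting created, supported, and weakened relations could look like, the example below compares two versions of an ontology in which each parent-child relation carries a numeric support score. The scoring scheme, the function name `compare_versions`, and the term identifiers are assumptions made for illustration; the proposal does not specify how support is quantified.

```python
def compare_versions(old_support, new_support, tolerance=0.0):
    """Classify term relations by how their support changed between versions.

    Both arguments map (child_term, parent_term) to a hypothetical support score.
    """
    report = {"created": [], "supported": [], "weakened": [],
              "unchanged": [], "dropped": []}
    for relation, score in new_support.items():
        if relation not in old_support:
            report["created"].append(relation)
        elif score > old_support[relation] + tolerance:
            report["supported"].append(relation)
        elif score < old_support[relation] - tolerance:
            report["weakened"].append(relation)
        else:
            report["unchanged"].append(relation)
    # Relations present before the update but absent afterwards.
    report["dropped"] = [r for r in old_support if r not in new_support]
    return report


# Placeholder term identifiers and scores (hypothetical).
old = {("term2", "term1"): 0.6, ("term3", "term1"): 0.8, ("term4", "term2"): 0.5}
new = {("term2", "term1"): 0.9, ("term3", "term1"): 0.4, ("term5", "term2"): 0.7}
report = compare_versions(old, new)
```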
Preliminary Results and Progress Report: Proof-of-concept and maturation of a Network-
eXtracted Ontology (NeXO). The previous award supported research by NRNB investigators that led
to creation and prototyping of the first gene ontology inferred from ‘omics data, the NeXO
Resource [28,51] (http://nexontology.org). This work fell naturally under previous TRD-C: Visualization
and Representation of Biological Networks. NeXO provides a methodology whereby physical and
genetic network data can be transformed to assemble a structured ontology of protein complexes.
Using this system, we assembled an ontology based on four large yeast networks capturing current
knowledge of physical protein-protein interactions, genetic interactions (synthetic-lethality and
epistasis), co-expressed genes, as well as an integrated functional network known as YeastNet [52]. The
resulting Network-eXtracted Ontology (NeXO) contains a total of 4,123 terms and 7,804 term-term
relationships (Figure 3). Based on alignment of the systematic NeXO to the literature-curated Gene
Ontology (GO), it appears that NeXO captures ~60% of terms in the Cellular Component branch of
GO. To further validate NeXO vs. GO, we have used both ontologies to perform functional enrichment
of gene sets, the task to which GO is most often applied. In this regard, NeXO performs at least as
well as GO for functional enrichment in several different genome-scale data sets. Thus, the computed
ontology provides functionally-relevant terms which cover a wide spectrum of yeast biology to an
extent comparable to manually-curated efforts. Since the original proof-of-concept work was published
in early 2013 [28], we have released a visually integrated website for browsing NeXO and GO ontologies
in the style of Google Maps [51]. This summer we published a major improvement to the ontology
inference algorithm [53], which was presented and well-received at the Intelligent Systems for Molecular
Biology (ISMB 2014) conference. Progress Report Publications 1-11.
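Functional enrichment, the task used above to compare NeXO with GO, is commonly computed with a one-sided hypergeometric test on the overlap between a query gene set and each ontology term. The sketch below shows that standard calculation; the text does not state which test was used in the comparison, so this is an assumption, and the function names and gene identifiers are placeholders.

```python
from math import comb


def hypergeom_tail(population, annotated, drawn, overlap):
    """P(X >= overlap) when drawing `drawn` genes from `population`,
    of which `annotated` belong to the term."""
    upper = min(annotated, drawn)
    favorable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / comb(population, drawn)


def enrichment(query, terms, background):
    """Rank ontology terms by hypergeometric p-value for a query gene set."""
    background = set(background)
    query = set(query) & background
    results = []
    for name, genes in terms.items():
        genes = set(genes) & background
        p = hypergeom_tail(len(background), len(genes), len(query), len(query & genes))
        results.append((name, p))
    return sorted(results, key=lambda item: item[1])


# Placeholder data (hypothetical): ten background genes, two terms.
background = [f"gene{i}" for i in range(10)]
terms = {"termA": background[:5], "termB": background[5:]}
ranked = enrichment(background[:4], terms, background)
```

The same routine can be run against either ontology by swapping the `terms` mapping. In practice the p-values would also need correction for multiple testing across terms.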
Methods
Basic inference of ontologies and alignment to a reference. To construct a data-driven ontology, a set
of input features is first gathered for each gene, representing information collected from ‘omics studies
such as its interaction partners in molecular networks, its expression levels over time or conditions, or
other data depending on the DBP. These features are analyzed to generate a pairwise gene-gene
similarity matrix, in which the similarity between two genes reflects their closeness in input features.
Many methods have been proposed for this purpose [54-56]; at present we have been successful with the
technique of random forest regression [57]. The pairwise similarity matrix is then clustered (Figure 4)
using any of several algorithms we have published in prior work [28,53]. For example, our original
method uses a hierarchical probabilistic model for community detection [50,58], which constructs a
binary tree, or dendrogram, seeking to maximize the overall probability of the network data by
iteratively joining sets of genes with similar patterns of interaction. Gene sets, represented by nodes in
the tree, are suggestive of biological entities or ‘terms’ in an ontology. Joining of two sets, represented
by connecting two nodes beneath a third, suggests specialized terms that are part of a more general
one. The tree is then expanded to allow for creation of terms with multiple (>2) children and/or parents,
which is important for identifying complexes that have many subunits or that participate in multiple parent
processes [transforming the hierarchical tree into a directed acyclic graph; we do not detail this
method here, but it involves evaluating the probability of the network under the new vs. old structure].
This method yields a novel structure that we call the Network-eXtracted Ontology, or NeXO, in which
genes are organized under a hierarchy of terms and parent-child term relations strongly supported by
the input datasets. At this stage terms simply represent structures detected in data and are given
systematic IDs, much like ORFs detected in a newly-sequenced genome. To annotate these terms
with information from known biology, the NeXO structure is aligned against a reference ontology,
much like ORFs are annotated by alignment against a reference genome whose genes are
well-annotated. As in past work, our default reference ontology for this step will be the literature-curated
Gene Ontology. The desired result of aligning NeXO and GO is to identify NeXO terms that
correspond to well-known versus novel structures, as well as GO terms that are well-supported by the
available data. For high confidence matches, the GO annotations are transferred to the NeXO term,
including the term name and description. Terms that are novel (similar to ‘ORFaned’ genes) may
become extremely interesting for further biological exploration and experimental follow-up.
Figure 3. Building the NeXO ontology. The ontology is reduced to a tree, with nodes indicating terms and
edges indicating hierarchical relations between terms, i.e. that one term contains another. Node sizes
indicate the number of genes assigned to a term. Node colors represent the degree of correspondence to a
term in GO as determined by ontology alignment, with high-level alignments labeled. Insets show the
hierarchy identified for the ribosome and actin cytoskeleton.
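The first two inference steps described above (a pairwise gene-gene similarity matrix, then clustering into a binary tree whose internal nodes suggest terms) can be sketched as follows. This is a deliberately simplified stand-in: it swaps in Jaccard similarity of interaction partners for random forest regression, and average-linkage agglomerative clustering for the probabilistic community detection model, and it omits the expansion of the tree into a directed acyclic graph. All function names and the toy data are assumptions for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two sets as a fraction of their union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(partners):
    """Pairwise gene-gene similarity from each gene's set of interaction partners."""
    return {frozenset((g, h)): jaccard(partners[g], partners[h])
            for g, h in combinations(sorted(partners), 2)}


def leaves(tree):
    """Genes at the leaves of a nested-tuple binary tree."""
    return [tree] if isinstance(tree, str) else leaves(tree[0]) + leaves(tree[1])


def build_tree(partners):
    """Repeatedly join the two most similar clusters (average linkage)."""
    sim = similarity_matrix(partners)
    clusters = sorted(partners)

    def linkage(a, b):
        pairs = [(x, y) for x in leaves(a) for y in leaves(b)]
        return sum(sim[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = (clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


def candidate_terms(tree):
    """Gene set of every internal node; each one is a candidate ontology term."""
    if isinstance(tree, str):
        return []
    return ([frozenset(leaves(tree))]
            + candidate_terms(tree[0]) + candidate_terms(tree[1]))


# Toy interaction partners (hypothetical identifiers).
partners = {"A1": {"x", "y"}, "A2": {"x", "y"},
            "B1": {"z", "w"}, "B2": {"z", "w", "y"}}
tree = build_tree(partners)
terms = candidate_terms(tree)
```

In this toy input the two A genes and the two B genes are joined first, and the root then groups both pairs under a more general term, mirroring how specialized terms nest within general ones.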
Although methods for ontology alignment have not received much attention in molecular biology or
bioinformatics, they are under active research in the computer science and semantic web
communities. We will implement an ontology alignment algorithm based on a previously-proposed
method called ASMOV [59], which was the winning ontology alignment algorithm in the 2010 Ontology
Alignment Evaluation Initiative (om2010.ontologymatching.org/). The method was designed to align
semantic ontologies, and it is based on a score function that measures the lexical similarity of text
labels and comments associated with terms. Hence, we will modify and expand this approach to align
ontologies in which the terms refer to sets of genes (technically, the set of genes assigned to a term
defines the ‘label’ of that term).
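To make the idea of gene sets as term 'labels' concrete, the following is a minimal sketch, not the ASMOV method and not the NeXO implementation. It assumes the Jaccard index as the similarity score and a greedy one-to-one assignment; the function names, the threshold, the `NeXO:n` identifiers, and the toy gene sets are all hypothetical. It enforces unique mappings only and does not check the ancestor-descendant constraint shown in Figure 4B.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two gene sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def align_terms(
    inferred: Dict[str, Set[str]],
    reference: Dict[str, Set[str]],
    min_score: float = 0.5,  # assumed cutoff for a "high confidence" match
) -> List[Tuple[str, str, float]]:
    """Greedy one-to-one mapping of inferred terms onto reference terms."""
    scored = sorted(
        (
            (jaccard(genes_i, genes_r), term_i, term_r)
            for term_i, genes_i in inferred.items()
            for term_r, genes_r in reference.items()
        ),
        reverse=True,
    )
    used_inferred, used_reference = set(), set()
    mapping = []
    for score, term_i, term_r in scored:
        if score < min_score:
            break  # remaining pairs score even lower
        if term_i in used_inferred or term_r in used_reference:
            continue  # keep the mapping unique on both sides
        mapping.append((term_i, term_r, score))
        used_inferred.add(term_i)
        used_reference.add(term_r)
    return mapping


# Toy data: term identifiers and memberships are illustrative only.
inferred = {
    "NeXO:1": {"RPL3", "RPL5", "RPS2"},
    "NeXO:2": {"ACT1", "ARP2", "ARP3"},
    "NeXO:3": {"YFG1", "YFG2"},
}
reference = {
    "ribosome": {"RPL3", "RPL5", "RPS2", "RPS3"},
    "actin cytoskeleton": {"ACT1", "ARP2", "ARP3"},
}
mapping = align_terms(inferred, reference)
novel_terms = set(inferred) - {term for term, _, _ in mapping}
```

In this toy case the unmatched term is left in `novel_terms`, which corresponds to the 'ORFaned' terms that would be candidates for experimental follow-up.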
Application of current and new procedures for data-driven ontologies to Driving Projects. We will begin
work immediately to construct and/or revise data-driven ontologies with each of our Driving
Biomedical Projects, an activity that is expected to continue for most of the next five-year performance
period. The projects are:
1. Creating new terms and term relations in the Gene Ontology. Our previous efforts to infer gene
ontologies from network data were initially carried out as a Collaboration and Service Project (CSP)
with Mike Cherry, Professor of Genetics at Stanford and head of the Gene Ontology Consortium for
the Saccharomyces model organism. Together with Cherry, we will continually apply tools developed
in this TRD to revise and expand the yeast NeXO based on new data, and to communicate the most
promising new terms and term relations it identifies to the Saccharomyces GO.
2. Elucidating the hierarchy of modules in the virus-human protein interaction network. Dr. Nevan
Krogan at UCSF is a world leader in generating large-scale maps of protein complexes based on
affinity purification mass spectrometry as well as in systems for synthetic lethal genetic interaction
screening. NRNB and Krogan have a long-standing relationship in developing physical and genetic
interaction maps of biological systems of interest [11,12,18,60-62], including the original NeXO paper [28]. We
expect this productive relationship to continue as we develop tools for data-driven assembly and
refinement of gene ontologies within this TRD, initially as applied to physical and genetic interactions
of viral protein subunits with proteins encoded by the human host.
Figure 4. Automated assembly and alignment of gene ontologies. (A) Probabilistic community detection
within the input networks yields a binary tree in which nodes correspond to ontology terms and links
correspond to parent-child term relations. Unsupported terms are replaced by multi-way joins, and additional
parent-child relations are added based on network data. The resulting ontology is aligned against the Gene
Ontology, in a way that (B) prohibits non-unique mappings and ancestor-descendant criss-crossing.
3. Gene ontology inference based on binding-site-resolved ‘edgetic’ protein networks. Drs. Marc Vidal
and David Hill are pioneers in protein interaction mapping via the yeast-two-hybrid system. Recently
they developed the capability to map interactions at binding site resolution, by using modular protein
domains as baits combined with phage display knowledge of the preferred binding motif of each
domain. We will together explore whether this binding interface information can be used to inform the
inferred gene ontology structures we are building in this TRD.
4. Hierarchical analysis of cancer subtypes with TCGA / ICGC and Sage Bionetworks. Cancer
genomics projects are generating large cancer-specific ‘omics data sets. Therefore, natural DBPs for
this project are provided by The Cancer Genome Atlas (TCGA), the International Cancer Genome
Consortium (ICGC), and Sage Bionetworks, all of which are associated with major cancer genomics projects
nationally and internationally. Our focus will be to construct a Cancer Gene Ontology based on a pan-
cancer analysis of data from all ~20 major TCGA tissue types. Such a Cancer GO would provide
insight into the hierarchy of biological processes and cellular components that is somatically mutated
or differentially activated during cancer progression.
5. Understanding the multi-scale hierarchy of social interactions. We will work with UCSD Professor
James Fowler, a renowned social networks researcher, to apply the hierarchical methods developed
in this aim to analyze the structure of a large social network generated from the Framingham Heart
Study. This study has surveyed health behaviors, disease outcomes, and social relationships among
>12,000 people for over 37 years [25-27].
During these collaborations, we will experiment with ontologies constructed with different sources and
types of data, e.g. using genetic interactions only versus those that also include physical interactions
and other types. Such exploration is needed to evaluate which interaction types are most revealing of
cellular componentry such as protein complexes and larger macro-molecular structures, and how to
weight genetic versus physical interactions for this purpose. We will seek to determine how much
interaction data one needs to construct a robust ontology for each of the DBP datasets, e.g., one
which is able to faithfully recover a substantial fraction of knowledge in the manually-curated GO. At
present, what we know is that this is possible using an integrated network including all genetic and
physical interactions that have been mapped to-date for budding yeast.
Development of iterative procedures for incorporating new data into a data-driven ontology. We will
conduct a major program of exploratory research and development on approaches by which data-
driven gene ontologies such as NeXO can evolve over time, by incorporating new datasets as they
are generated and published. We will begin by evaluating a relatively straightforward approach, which
is to integrate the new dataset(s) into the pairwise gene similarity matrix which forms the input to the
ontology inference method (see above). Once the similarities have been adjusted, an ‘updated’
ontology is constructed based on the old+new data and aligned against the ‘previous’ ontology based
on old data only. Similar to alignment against GO (see above), the desired result is to identify terms
and term relations in the updated ontology that are newly-created as well as previous terms / relations
that are reinforced by the new data. Ultimately one might also imagine downgrading or retiring terms
that have remained unsupported over many diverse dataset updates, but this is admittedly a more
delicate proposition than adding new terms. A limitation of this simple update approach is that the
complete ontology must be rebuilt each time a new data set is evaluated. A potentially more
efficient alternative is to modify the previous ontology directly using information from the
new data set. We will explore both the simple approach and these more advanced approaches in the
course of research.
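The simple update approach can be sketched as follows. This is a hedged illustration under stated assumptions: the pairwise similarities are stored as a sparse map keyed by gene pair, old and new evidence are combined by a weighted average (`weight_new` is an assumed parameter), and terms are compared by Jaccard overlap of their gene sets. The ontology inference step that turns the merged similarities into terms is not shown, and all names and thresholds are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

Pair = FrozenSet[str]


def merge_similarity(
    old: Dict[Pair, float], new: Dict[Pair, float], weight_new: float = 0.5
) -> Dict[Pair, float]:
    """Combine old and new pairwise gene similarities into one map."""
    merged = {}
    for pair in old.keys() | new.keys():
        if pair in old and pair in new:
            # Both data sources score this pair: take a weighted average.
            merged[pair] = (1 - weight_new) * old[pair] + weight_new * new[pair]
        else:
            # Only one source scores this pair: carry its value over.
            merged[pair] = old.get(pair, new.get(pair))
    return merged


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two gene sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compare_ontologies(
    previous: Dict[str, Set[str]],
    updated: Dict[str, Set[str]],
    min_score: float = 0.8,  # assumed cutoff for "same term"
) -> Dict[str, List[str]]:
    """Label updated terms as reinforced or new; list unmatched previous terms."""
    report = {"reinforced": [], "new": [], "unsupported": []}
    matched_previous = set()
    for name, genes in updated.items():
        best = max(previous, key=lambda p: jaccard(genes, previous[p]), default=None)
        if best is not None and jaccard(genes, previous[best]) >= min_score:
            report["reinforced"].append(name)
            matched_previous.add(best)
        else:
            report["new"].append(name)
    report["unsupported"] = sorted(set(previous) - matched_previous)
    return report


# Toy example with hypothetical genes and terms.
old_sim = {frozenset({"A", "B"}): 0.2}
new_sim = {frozenset({"A", "B"}): 0.6, frozenset({"F", "G"}): 0.9}
merged = merge_similarity(old_sim, new_sim)

previous_terms = {"T1": {"A", "B", "C"}, "T2": {"D", "E"}}
updated_terms = {"U1": {"A", "B", "C"}, "U2": {"F", "G"}}
report = compare_ontologies(previous_terms, updated_terms)
```

Terms listed under `unsupported` are only flagged here; whether and when to retire them is the more delicate question raised above.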
Given an update procedure, the experimentalist may wish to design further studies aimed at the new
terms. These specially directed new data could then spawn another ontology update, enabling the
exciting possibility of continued iteration between improving the ontology (aka the biological model)
and the experimental data generation phases of a study.
An online system for distribution and community construction of data-driven ontologies. Ontology
models developed with our DBPs will be made available to the scientific community via query from the
stand-alone NeXO website, nexontology.org, as well as through a specialized App for Cytoscape. We
will also prototype a web-based system whereby a unified and common ‘Crowd-Sourced NeXO
Ontology’ can be iteratively updated from biological data sets uploaded by investigators from the
biomedical research community at large. Achieving this vision will require the addition of major
features to nexontology.org, including user accounts, data upload, and a cloud-based implementation
of ontology inference. If successful, we will seek to transition the new website to independent funding
to support what could ultimately become a large community of users. The allure of such a system is
that the wealth of ‘omics data being generated every year could be analyzed to assemble different
types of gene ontology systematically, with less and less reliance on back curation of the literature.
Ultimately, the desired outcome is to enable a shift from using ontologies to evaluate data to using
data to construct and evaluate ontologies—that is, from a regime in which the ontology is viewed as
gold standard to one in which it is the major result.
3.2 FUNCTIONALIZED GENE ONTOLOGIES AS A HIERARCHY OF
PHENOTYPIC PREDICTION
Project Leader: Trey Ideker (UCSD)
Overview. Whether based on expert knowledge (GO) or inferred from data (NeXO in Aim 1), current
gene ontologies are static descriptions of cellular structure and organization. They enable
representing and reasoning on the structural relationships among biological entities [27,29] but lack any
native capacity to capture dynamic functional states or make phenotypic predictions. However, since
gene ontologies inherently represent multi-scale hierarchy in cellular organization, they provide in
theory an ideal substrate for building models that would also be predictive of a range of cellular
functions and phenotypes. In this respect, question and answer systems developed in the field of
knowledge representation and reasoning [26], such as Apple’s Siri and IBM’s Watson, provide an
excellent example of what a predictive, or ‘executable’, ontology looks like. In this aim, we will explore
whether such systems can teach us how to develop predictive systems for cell biology [31]. This aim
intersects with the separate TRD project on Predictive Networks and serves as a bridge between the
two TRDs.
Preliminary Results and Progress Report: Activating static networks as predictive models. The
Ideker laboratory has over the years introduced a progression of approaches that seek to use
molecular network information to guide the prediction of phenotypic outcomes such as disease state
or drug response. Relevant works include ActiveModules [63], Network-Based Classification [64],
Network-Guided Forests [65], Network-Based Stratification [66], and several influential reviews on
using networks predictively [67,68]. The more recent works (2011 to present) were supported by the past period of NRNB
funding. Generally, our methodology has been to identify subnetworks of genes whose expression
levels (molecular profile) or mutation states (genotype) can be functionally combined to predict
disease outcome (phenotype or class). For example, Network-Guided Forests is a classification
method that associates subnetworks of genes with decision trees that evaluate the expression levels
of those genes to predict sample class. Such approaches have shown success in classification of
metastatic vs. non-metastatic breast cancer [64], aggressive vs. indolent leukemia [69], as well as
classification of cell fate decisions during development [16,65]. We have found repeatedly that, unlike the
gene sets identified by regular classifiers, the subnetworks identified by network-based methods are
highly enriched for causal factors of disease, and they show very consistent performance across
different sample datasets. Progress Report Publications 12-17.
Methods
Taking clues from Siri: propagation of state on predictive ontologies. We will explore use of the
structure of ontologies, rather than the structure of networks, in making phenotypic predictions. The
key distinction is that networks are concerned mainly with pairwise associations between genes,
whereas ontologies represent hierarchical relations across a range of biological modules at various
scales including genes and proteins, protein complexes, pathways and processes, and organelles.
Question-answering systems such as Apple’s Siri provide a useful model of how hierarchical
relations in an ontology can propagate state information. Unlike current gene ontologies which are
descriptive, Siri’s ontologies are coupled with dynamic reasoning systems that render them active:
“Whereas a conventional ontology is a formal representation of domain knowledge with distinct
concepts and relations among concepts, an Active Ontology is a processing formalism where distinct
processing elements are arranged according to ontology notions; it is an execution environment” [30].
These active ontologies not only encode entities and relations, but entities are associated with states
and relations are associated with rule sets that perform actions within and among entities. During
execution, input states are incrementally propagated up and down the hierarchy to impact other
entities, whose states provide the answer to the initial question – the best prediction based on the
inputs. How the ontologies within Siri are used to answer questions, however, is very different from
how GO is used today in bioinformatics. Typically, GO terms are associated with a set of genes
(annotations), but not with dynamic states; the relationships between GO terms are not associated
with rule sets that perform actions, at least beyond propagation of gene set annotations. Motivated by
this contrast, we will explore construction of such an ‘active’ gene ontology as a general engine for
genotype-phenotype translation.
Genotype-to-phenotype prediction challenges from Driving Biomedical Projects. We will base our
methods development on data and prediction challenges motivated by DBPs in yeast (Cherry DBP)
and cancer (TCGA DBP). Yeast has by far the largest number of genotype-phenotype measurements
of any organism: most single and double gene knockout strains have been constructed and assayed
for growth, yielding over 10 million ‘simple’ genotypes systematically tested for the same phenotype [70-72].
In addition, hundreds of natural yeast genetic isolates have been fully sequenced and extensively
phenotyped, providing examples of complex genotype backgrounds [73]. In cancer, TCGA currently has
tumor exomes available for over 8000 cancer patients (genotypes), along with clinical information
such as survival time, tumor grade, and in some cases drug response (phenotypes). In both yeast and
cancer, the goal is to predict the phenotype of growth, survival, etc. given the genotype of a strain or
patient.
Transformation of genotype to ‘ontotype’. The genotype indicates the set of mutation states of all
genes, which for each gene might be represented simply as {mutated, wildtype} or {loss-of-function,
wild-type, gain-of-function} before considering more precise values. We will prototype propagation
approaches by which these states on genes can be integrated with a gene ontology to infer
corresponding states on terms. For example, since the gene SWI4 encodes a subunit of the SBF
complex, the yeast swi4Δ genotype {Swi4 <= loss-of-function} might propagate upwards in the
ontology to set the state of the parent term {SBF transcription complex <= loss-of-function}, and
continue to propagate upwards to affect ancestor terms at higher scales such as ‘RNA pol II
transcription factor complex’ and ultimately ‘nucleus’ and ‘cell’. We call the set of mutation states of all
terms the ‘ontotype.’
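The SWI4 example can be sketched in a few lines. This is one possible propagation rule among many, chosen only for illustration: any non-wild-type state is copied unchanged to every ancestor, and the first state to reach a term wins. The `PARENTS` table is a toy fragment built from the terms named above, and the function name is hypothetical.

```python
from collections import deque
from typing import Dict, List

# child -> parents; a toy fragment following the SWI4 example in the text
PARENTS: Dict[str, List[str]] = {
    "SWI4": ["SBF transcription complex"],
    "SBF transcription complex": ["RNA pol II transcription factor complex"],
    "RNA pol II transcription factor complex": ["nucleus"],
    "nucleus": ["cell"],
}


def ontotype(genotype: Dict[str, str], parents: Dict[str, List[str]]) -> Dict[str, str]:
    """Propagate non-wild-type gene states upward to all ancestor terms.

    Assumed rule: a term inherits the state of the first perturbed
    descendant that reaches it (breadth-first from the mutated genes).
    """
    states = dict(genotype)
    queue = deque(gene for gene, state in genotype.items() if state != "wild-type")
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in states:
                states[parent] = states[node]
                queue.append(parent)
    # Report term states only; gene states are the genotype itself.
    return {term: state for term, state in states.items() if term not in genotype}


swi4_deletion = ontotype({"SWI4": "loss-of-function"}, PARENTS)
wild_type = ontotype({"SWI4": "wild-type"}, PARENTS)
```

Copying a state unchanged all the way up to 'cell' is deliberately crude; in practice the state would presumably be attenuated or combined with sibling states as it moves to higher scales.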
For prediction problems, the ontotype and genotype can then be used together or separately as a set
of features for classification of a phenotypic class, e.g. {alive, dead}, or regression against a
quantitative phenotype, e.g. numerical growth rate or progression-free time interval. Alternatively, the
state of any particular term, representing a cellular component or process, can itself be considered as
the phenotype of interest. Predictions will be benchmarked using metrics such as ROC and PR curves
along with standard statistical techniques such as cross-validation or bootstrapping.
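As a small illustration of the benchmarking step, the sketch below computes the area under the ROC curve by pairwise comparison of positive and negative samples and generates contiguous cross-validation folds. It uses only the standard library; the labels and scores are made-up numbers, not results, and the function names are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: fraction of (positive, negative) pairs
    ranked correctly, with ties counted as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


# Made-up phenotype labels (1 = dead, 0 = alive) and classifier scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.35, 0.1]
auc = roc_auc(labels, scores)
folds = list(k_fold_indices(5, 2))
```

In a real benchmark the samples would be shuffled or stratified before splitting, and the classifier would be refit on each training fold before scoring its test fold.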
Open questions and milestones. A major research question will be to determine how to dynamically
compute the states of ontology terms based on the states of their children, parents, descendants, and
ancestors. The underlying mathematical function could take many forms, including logic gates such as
AND / OR, linear or additive functions, probabilistic functions, or polynomial or logistic equations. How
to determine the specific forms and parameters of these functions, regardless of what form they take,
is also unclear. This step could happen by statistical association from many input-output examples
using machine learning methods, by including externally generated biological knowledge specific to
each entity, or by manual curation from literature. As this aim is quite exploratory, we do not include
specific algorithmic plans or mathematical details here. Some important milestones for success,
however, will be (1) a proof-of-principle bioinformatic method for propagating molecular profiles on a
gene ontology to predict a phenotypic outcome, and (2) implementation of this method in a robust
software tool as a Cytoscape App.
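To illustrate the candidate forms listed above, the following sketch writes a term's state as a function of its children's states under four of the options: OR, AND, a weighted additive score, and a logistic function. Child states are encoded as 1 (disrupted) or 0 (intact); this encoding, the weights, and the bias are assumptions for illustration, not fitted values.

```python
import math
from typing import Sequence


def or_gate(children: Sequence[int]) -> bool:
    """Term is disrupted if any child is disrupted."""
    return any(children)


def and_gate(children: Sequence[int]) -> bool:
    """Term is disrupted only if every child is disrupted."""
    return all(children)


def additive(children: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of child states."""
    return sum(w * c for w, c in zip(weights, children))


def logistic(children: Sequence[int], weights: Sequence[float], bias: float) -> float:
    """Probability-like term state from a logistic of the weighted sum."""
    return 1.0 / (1.0 + math.exp(-(bias + additive(children, weights))))


children = [1, 0, 1]          # first and third child disrupted
weights = [0.5, 0.2, 0.3]     # illustrative, not learned
states = {
    "or": or_gate(children),
    "and": and_gate(children),
    "additive": additive(children, weights),
    "logistic": logistic(children, weights, bias=-0.8),
}
```

The logic gates need no parameters, whereas the additive and logistic forms would require weights per parent-child relation, which is where statistical association from input-output examples or curated knowledge would enter.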
3.3 BRIDGING LIGAND-RECEPTOR NETWORKS TO CELL-CELL
COMMUNICATION NETWORKS
Project Leader: Gary Bader (University of Toronto)
Overview. Cell-cell interaction networks are an emerging area of network science. In collaboration
with the Zandstra DBP, which is mapping cell-cell interaction networks in the hematopoietic system to
help engineer blood tissue, we will develop novel technology for cell network analysis. We will develop
methods to infer cell-cell interaction networks from molecular profiling data of purified cell populations,
cell-cell interaction network topology analysis software, methods to identify intracellular pathways that
control cell-cell interactions and methods to visualize multi-scale models of inter-cellular
communication networks and their intracellular signaling systems.
Preliminary Results and Progress Report. In the past funding period, we worked with the Zandstra
lab to prototype cell-cell interaction network inference methods and their analysis. Two papers were
published in Molecular Systems Biology that experimentally mapped novel cell-cell interaction
networks for the purpose of identifying growth and inhibitory factors that modulate self-renewal, which
is useful for blood stem cell control. The second paper included network topology analysis and
discovered that ligand production is cell type dependent, whereas ligand binding is promiscuous.
Consequently, additional control strategies such as cell frequency modulation and
compartmentalization were needed to achieve specificity in HSC fate regulation. These proof-of-
concept methods now need to be further developed to extend and streamline their use, as described
below. Progress Report Publications 20,118.
Methods
Cell-cell interaction network inference from single cell population molecular profiles. Cell-cell
interaction networks are currently mapped by inferring regulatory relationships based on the
expression of transmitters and receptors at the cell surface. For instance, if cell type A expresses the
epidermal growth factor peptide hormone and cell type B expresses the epidermal growth factor
receptor protein, and there is a means to transmit the hormone to the target receptor (e.g. by diffusion
within a tissue or in the blood stream), then a directional edge is inferred from cell type A to cell type
B. This process depends on the availability of relatively pure cell populations and the ability to measure
the expression of their secreted and surface proteins, both of which are practical with current
technology [43,74,75]. We will develop technology to automatically process mRNA and protein expression
profiles from cell populations into cell-cell interaction networks using the following steps:
1. Identify all known ligands and receptors based on known gene function annotation. For
instance, using gene ontology terms “cytokine activity,” “growth factor activity,” “hormone
activity,” and “receptor activity,” genes with ligand or receptor activity will be compiled from the
Ensembl BioMart web service [76].
2. Collect all known protein interactions between ligands and receptors (e.g. from iRefIndex [77],
GeneMANIA [78], Pathway Commons [79] and related comprehensive interaction resources). We
have previously literature-curated ~270 ligand-receptor pairs not currently in standard
databases, and these will also be included [32,33].
3. Compile a list of expressed ligands and receptors from each available cell type population,
based on available gene or protein expression data [43,74,75]. We will prefer protein expression
information, but will use mRNA expression levels as a proxy when protein levels are not available
(with appropriate caveats).
4. Infer directed regulatory edges between expressed ligand and receptor pairs.
5. Visualize the resulting cell-cell interaction network.
Preliminary work successfully used this approach, but we will develop it into a generally applicable
technology that can be conveniently and automatically updated. Our initial focus will be on available
human data, but the technology will be applicable to any organism with enough information available.
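Steps 3 and 4 above can be sketched as a simple join between ligand-receptor pairs and per-cell-type expression. This is a minimal illustration, not the prototype used in the preliminary work: the function name, the cell type labels, and the expression sets are hypothetical, and the condition that the ligand can physically reach the receptor (e.g. by diffusion) is assumed to hold for every pair.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str, str, str]  # (sender, receiver, ligand, receptor)


def infer_cell_cell_edges(
    ligand_receptor_pairs: Iterable[Tuple[str, str]],
    expressed: Dict[str, Set[str]],
) -> Set[Edge]:
    """Directed edge from each cell type expressing a ligand to each cell
    type expressing its receptor (self-edges model autocrine signaling)."""
    edges: Set[Edge] = set()
    for ligand, receptor in ligand_receptor_pairs:
        senders = [cell for cell, genes in expressed.items() if ligand in genes]
        receivers = [cell for cell, genes in expressed.items() if receptor in genes]
        for sender in senders:
            for receiver in receivers:
                edges.add((sender, receiver, ligand, receptor))
    return edges


# Toy input: the pairs are known ligand-receptor pairs, the expression is made up.
pairs = [("EGF", "EGFR"), ("KITLG", "KIT")]
expressed = {
    "cell type A": {"EGF"},
    "cell type B": {"EGFR", "KIT"},
    "cell type C": {"KITLG", "EGFR"},
}
edges = infer_cell_cell_edges(pairs, expressed)
```

Each edge keeps the ligand and receptor that justify it, so the resulting network can later be filtered or grouped by ligand family before visualization.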
Discovery of key players and rules of cell-cell interaction networks. We will develop technology to
make it easy for biologists to computationally analyze the topological properties of cell-cell interaction
networks to help identify key control points and general organizational principles. We will use multiple
established measures of node importance in networks (centrality measures), including hub detection
(find highly connected nodes that, when removed, cause the network to split into parts [80]) and
betweenness centrality (find important connection points between different network regions [81]). This
analysis will be accomplished using the CytoHubba, CentiScaPe and/or NetMatch network analyzer
Cytoscape apps, which we will tailor to function on directed cell-cell interaction networks. In particular,
selected network analysis functions in these apps will be published as Cytoscape commands so they
can be made available in a cell-cell interaction network analysis app that we will develop.
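For reference, betweenness centrality on a directed, unweighted network can be computed with Brandes' algorithm, sketched below in plain Python. This is not the code of CytoHubba, CentiScaPe, or NetMatch; it is a self-contained illustration, and the five-node graph with nodes A to E is a hypothetical stand-in for a cell-cell interaction network.

```python
from collections import deque
from typing import Dict, List


def betweenness(graph: Dict[str, List[str]]) -> Dict[str, float]:
    """Brandes' algorithm for directed, unweighted graphs.

    graph maps each node to the list of its successors.
    """
    nodes = set(graph) | {v for successors in graph.values() for v in successors}
    score = dict.fromkeys(nodes, 0.0)
    for source in nodes:
        # Breadth-first search counting shortest paths from the source.
        order: List[str] = []
        preds: Dict[str, List[str]] = {v: [] for v in nodes}
        sigma = dict.fromkeys(nodes, 0)
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph.get(v, []):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = dict.fromkeys(nodes, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


# Hypothetical directed network: A signals to B, which reaches E via C or D.
graph = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": []}
centrality = betweenness(graph)
out_degree = {node: len(successors) for node, successors in graph.items()}
```

Here node B lies on every shortest path leaving A, so it receives the highest score, while C and D split the paths to E between them.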
Identify intracellular pathways that control and are controlled by cell-cell interactions. We will develop
novel computational methods to explain how signals observed to occur between cells are controlled
by and control internal molecular networks and pathways. First, we will gather an intracellular network
of physical molecular and control interactions between all identified receptors and secreted chemical
signal genes from available molecular interaction and pathway databases (e.g. iRefIndex,
GeneMANIA, Pathway Commons). We will then use established path finding algorithms (e.g. as
implemented in Cytoscape apps such as PathExplorer and in the Pathway Commons web service
system) to identify potential signaling pathways that control chemical signal secretion, and links from
activated receptors to activation of pathways in target cells. Paths will be limited to genes expressed
in the given cell population. To identify pathways that are controlled by a given cell-cell
communication path, we will apply pathway enrichment analysis to downstream molecules in target
cells. Thus, we will predict how inter-cellular signaling impinges on intracellular systems, which in turn
could impinge on additional cell-cell signaling paths. We will also use the Pathway Extraction and
Reduction Algorithm (PERA) method described in TRD1 to identify signaling systems involving cell-
cell communication factors.
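The expression-restricted path search can be illustrated with a breadth-first search that only traverses genes expressed in the given cell population. This is a generic sketch, not the PathExplorer, Pathway Commons, or PERA implementation; the function name is hypothetical and the small network is a heavily simplified fragment of EGFR signaling with made-up expression.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def shortest_expressed_path(
    network: Dict[str, List[str]],
    source: str,
    targets: Set[str],
    expressed: Set[str],
) -> Optional[List[str]]:
    """Shortest directed path from source to any target, visiting only
    genes in `expressed`; returns None when no such path exists."""
    if source not in expressed:
        return None
    previous: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in network.get(node, []):
            if neighbor in expressed and neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Simplified toy network; expression set is made up for illustration.
network = {
    "EGFR": ["GRB2", "PIK3CA"],
    "GRB2": ["SOS1"],
    "SOS1": ["KRAS"],
    "KRAS": ["RAF1"],
    "PIK3CA": ["AKT1"],
}
expressed = {"EGFR", "GRB2", "SOS1", "KRAS", "RAF1"}
path = shortest_expressed_path(network, "EGFR", {"RAF1", "AKT1"}, expressed)
blocked = shortest_expressed_path(network, "EGFR", {"AKT1"}, expressed)
```

Because PIK3CA is absent from the expression set, the branch toward AKT1 is not reported for this population, which is the intended effect of limiting paths to expressed genes.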
Multi-scale visualization of cell-cell interaction networks in the context of internal molecular networks.
We will develop novel multi-scale network visualization methods to help interpret networks generated
in this aim. In particular, we will group ligand and receptor families (using Cytoscape’s grouping
function) to reduce complexity of the resulting network, based on family information in the Gene
Ontology. We will also develop methods to display intracellular molecular paths, where nodes
represent genes, within nodes representing cells. These paths will also connect to intracellular nodes
representing pathways to visualize which pathways are activated by specific cell-cell communication
signals.
Links with other TRDs. As the active collection of molecular profiles for secreted and receptor protein
expression grows, we expect data sets to become available that cover multiple time points and
samples (e.g. disease patients and healthy controls). Thus, we will develop multi-scale cell-cell
interaction networks across conditions and use technology developed in the Differential Networks
TRD to compare them. We will also explore how patient specific versions of these networks can be
used as predictive features in work described in the Predictive Networks TRD.
TRD 3: MULTI-SCALE NETWORKS –
BIBLIOGRAPHY AND REFERENCES CITED
1. Mathews, C.K. The Cell-Bag of Enzymes or Network of Channels? J Bacteriol 175, 6377-81
(1993).
2. Reddy, G.P., Singh, A., Stafford, M.E. & Mathews, C.K. Enzyme Associations in T4 Phage
DNA Precursor Synthesis. Proc Natl Acad Sci U S A 74, 3152-6 (1977).
3. Monod, J., Changeux, J.P. & Jacob, F. Allosteric Proteins and Cellular Control Systems.
Journal of Molecular Biology 6, 306-& (1963).
4. Kauffman, S.A. The Origins of Order : Self-Organization and Selection in Evolution, xviii, 709
p. (Oxford University Press, New York, 1993).
5. Pollack, G.H. Cells, Gels and the Engines of Life : A New, Unifying Approach to Cell Function,
xiv, 305 p. (Ebner & Sons, Seattle, WA, 2001).
6. Bray, D. Wetware : A Computer in Every Living Cell, xii, 267 p. (Yale University Press, New
Haven ; London, 2009).
7. Barabasi, A.L. & Oltvai, Z.N. Network Biology: Understanding the Cell's Functional
Organization. Nat Rev Genet 5, 101-13 (2004).
8. Mitra, K., Carvunis, A.R., Ramesh, S.K. & Ideker, T. Integrative Approaches for Finding
Modular Structure in Biological Networks. Nat Rev Genet 14, 719-32 (2013).
9. Ashburner, M. et al. Gene Ontology: Tool for the Unification of Biology. The Gene Ontology
Consortium. Nat Genet 25, 25-9 (2000).
10. Novarino, G. et al. Exome Sequencing Links Corticospinal Motor Neuron Disease to Common
Neurodegenerative Disorders. Science 343, 506-11 (2014).
11. Bandyopadhyay, S. et al. Rewiring of Genetic Networks in Response to DNA Damage.
Science 330, 1385-9 (2010).
12. Roguev, A. et al. Conservation and Rewiring of Functional Modules Revealed by an Epistasis
Map in Fission Yeast. Science 322, 405-10 (2008).
13. Workman, C.T. et al. A Systems Approach to Mapping DNA Damage Response Pathways.
Science 312, 1054-9 (2006).
14. Konig, R. et al. Human Host Factors Required for Influenza Virus Replication. Nature 463,
813-7 (2010).
15. Suthram, S., Sittler, T. & Ideker, T. The Plasmodium Protein Network Diverges from Those of
Other Eukaryotes. Nature 438, 108-12 (2005).
16. Ravasi, T. et al. An Atlas of Combinatorial Transcriptional Regulation in Mouse and Man. Cell
140, 744-52 (2010).
17. Bandyopadhyay, S. et al. A Human MAP Kinase Interactome. Nat Methods 7, 801-5 (2010).
18. Guenole, A. et al. Dissection of DNA Damage Responses Using Multiconditional Genetic
Interaction Maps. Mol Cell 49, 346-58 (2013).
19. Begley, T.J., Rosenbach, A.S., Ideker, T. & Samson, L.D. Hot Spots for Modulating Toxicity
Identified by Genomic Phenotyping and Localization Mapping. Mol Cell 16, 117-25 (2004).
20. Srivas, R. et al. A UV-Induced Genetic Network Links the RSC Complex to Nucleotide Excision
Repair and Shows Dose-Dependent Rewiring. Cell Rep 5, 1714-24 (2013).
21. Jaehnig, E.J., Kuo, D., Hombauer, H., Ideker, T.G. & Kolodner, R.D. Checkpoint Kinases
Regulate a Global Network of Transcription Factors in Response to DNA Damage. Cell Rep 4,
174-88 (2013).
22. Chuang, H.Y., Hofree, M. & Ideker, T. A Decade of Systems Biology. Annu Rev Cell Dev Biol
26, 721-44 (2010).
23. Walhout, A.J.M., Vidal, M. & Dekker, J. Handbook of Systems Biology : Concepts and Insights,
xiii, 538 p. (Waltham Academic Press, London ;, 2013).
24. Koller, D. & Friedman, N. Probabilistic Graphical Models : Principles and Techniques, xxi,
1231 p. (MIT Press, Cambridge, MA, 2009).
25. Gruber, T.R. Toward Principles for the Design of Ontologies Used for Knowledge Sharing.
International Journal of Human-Computer Studies 43, 907-928 (1995).
26. Brachman, R.J. & Levesque, H.J. Knowledge Representation and Reasoning, xxix, 381 p.
(Morgan Kaufmann, Amsterdam ; Boston, 2004).
27. Robinson, P.N. & Bauer, S. Introduction to Bio-Ontologies, xxvii, 488 p. (Taylor & Francis,
Boca Raton, 2011).
28. Dutkowski, J. et al. A Gene Ontology Inferred from Molecular Networks. Nat Biotechnol 31, 38-
45 (2013).
29. Myhre, S., Tveit, H., Mollestad, T. & Laegreid, A. Additional Gene Ontology Structure for
Improved Biological Reasoning. Bioinformatics 22, 2020-7 (2006).
30. Guzzoni, D., Baur, C. & Cheyer, A. Active: A Unified Platform for Building Intelligent Web
Interaction Assistants. 2006 IEEE/WIC/ACM International Conference on Web Intelligence and
Intelligent Agent Technology, Workshops Proceedings, 417-420 (2006).
31. Wren, J.D. Question Answering Systems in Biology and Medicine--the Time Is Now.
Bioinformatics 27, 2025-6 (2011).
32. Qiao, W. et al. Intercellular Network Structure and Regulatory Motifs in the Human
Hematopoietic System. Mol Syst Biol 10, 741 (2014).
33. Kirouac, D.C. et al. Dynamic Interaction Networks in a Hierarchically Organized Tissue. Mol
Syst Biol 6, 417 (2010).
34. Billia, F., Barbara, M., McEwen, J., Trevisan, M. & Iscove, N.N. Resolution of Pluripotential
Intermediates in Murine Hematopoietic Differentiation by Global Complementary DNA
Amplification from Single Cells: Confirmation of Assignments by Expression Profiling of
Cytokine Receptor Transcripts. Blood 97, 2257-68 (2001).
35. Majka, M. et al. Numerous Growth Factors, Cytokines, and Chemokines Are Secreted by
Human CD34(+) Cells, Myeloblasts, Erythroblasts, and Megakaryoblasts and Regulate Normal
Hematopoiesis in an Autocrine/Paracrine Manner. Blood 97, 3075-85 (2001).
36. von Dassow, G., Meir, E., Munro, E.M. & Odell, G.M. The Segment Polarity Network Is a
Robust Developmental Module. Nature 406, 188-92 (2000).
37. Kondo, S. Cell-Cell Interaction Network That Generates the Skin Pattern of Animal. Genome
Inform 16, 287-91 (2005).
38. De Matteis, G., Graudenzi, A. & Antoniotti, M. A Review of Spatial Computational Models for
Multi-Cellular Systems, with Regard to Intestinal Crypts and Colorectal Cancer Development.
Journal of mathematical biology 66, 1409-62 (2013).
39. Eckmann, J.P. & Moses, E. Curvature of Co-Links Uncovers Hidden Thematic Layers in the
World Wide Web. Proc Natl Acad Sci U S A 99, 5825-9 (2002).
40. Komurov, K. Modeling Community-Wide Molecular Networks of Multicellular Systems.
Bioinformatics 28, 694-700 (2012).
41. Frankenstein, Z., Alon, U. & Cohen, I.R. The Immune-Body Cytokine Network Defines a Social
Architecture of Cell Interactions. Biol Direct 1, 32 (2006).
42. Tieri, P. et al. Quantifying the Relevance of Different Mediators in the Human Immune Cell
Network. Bioinformatics 21, 1639-43 (2005).
43. Gedye, C.A. et al. Cell Surface Profiling Using High-Throughput Flow Cytometry: A Platform
for Biomarker Discovery and Analysis of Cellular Heterogeneity. PLoS One 9, e105602
(2014).
44. McPherson, A. Introduction to Macromolecular Crystallography, x, 267 p. (Wiley-Blackwell,
Hoboken, N.J., 2009).
45. Ravasz, E., Somera, A.L., Mongru, D.A., Oltvai, Z.N. & Barabasi, A.L. Hierarchical
Organization of Modularity in Metabolic Networks. Science 297, 1551-5 (2002).
46. Dotan-Cohen, D., Letovsky, S., Melkman, A.A. & Kasif, S. Biological Process Linkage
Networks. PLoS One 4, e5313 (2009).
47. Tanay, A., Sharan, R., Kupiec, M. & Shamir, R. Revealing Modularity and Organization in the
Yeast Molecular Network by Integrated Analysis of Highly Heterogeneous Genomewide Data.
Proc Natl Acad Sci U S A 101, 2981-6 (2004).
48. Kelley, R. & Ideker, T. Systematic Interpretation of Genetic Interactions Using Protein
Networks. Nat Biotechnol 23, 561-6 (2005).
49. Jaimovich, A., Rinott, R., Schuldiner, M., Margalit, H. & Friedman, N. Modularity and
Directionality in Genetic Interaction Maps. Bioinformatics 26, i228-36 (2010).
50. Park, Y. & Bader, J.S. Resolving the Structure of Interactomes with Hierarchical Agglomerative
Clustering. BMC Bioinformatics 12 Suppl 1, S44 (2011).
51. Dutkowski, J. et al. NeXO Web: The NeXO Ontology Database and Visualization Platform.
Nucleic Acids Res 42, D1269-74 (2014).
52. Lee, I., Li, Z. & Marcotte, E.M. An Improved, Bias-Reduced Probabilistic Functional Gene
Network of Baker's Yeast, Saccharomyces cerevisiae. PLoS One 2, e988 (2007).
53. Kramer, M., Dutkowski, J., Yu, M., Bafna, V. & Ideker, T. Inferring Gene Ontologies from
Pairwise Similarity Data. Bioinformatics 30, i34-42 (2014).
54. Jensen, L.J. et al. STRING 8--a Global View on Proteins and Their Functional Interactions in 630
Organisms. Nucleic Acids Res 37, D412-6 (2009).
55. Lee, I., Date, S.V., Adai, A.T. & Marcotte, E.M. A Probabilistic Functional Network of Yeast
Genes. Science 306, 1555-8 (2004).
56. Jansen, R. et al. A Bayesian Networks Approach for Predicting Protein-Protein Interactions
from Genomic Data. Science 302, 449-53 (2003).
57. Breiman, L. Random Forests. Machine Learning 45, 5-32 (2001).
58. Clauset, A., Moore, C. & Newman, M.E. Hierarchical Structure and the Prediction of Missing
Links in Networks. Nature 453, 98-101 (2008).
59. Jean-Mary, Y.R., Shironoshita, E.P. & Kabuka, M.R. Ontology Matching with Semantic
Verification. Web Semant 7, 235-251 (2009).
60. Ryan, C.J. et al. Hierarchical Modularity and the Evolution of Genetic Interactomes across
Species. Mol Cell 46, 691-704 (2012).
61. Wilmes, G.M. et al. A Genetic Interaction Map of RNA-Processing Factors Reveals Links
between Sem1/Dss1-Containing Complexes and mRNA Export and Splicing. Mol Cell 32, 735-46 (2008).
62. Hannum, G. et al. Genome-Wide Association Data Reveal a Global Map of Genetic
Interactions among Protein Complexes. PLoS Genet 5, e1000782 (2009).
63. Ideker, T., Ozier, O., Schwikowski, B. & Siegel, A.F. Discovering Regulatory and Signalling
Circuits in Molecular Interaction Networks. Bioinformatics 18 Suppl 1, S233-40 (2002).
64. Chuang, H.Y., Lee, E., Liu, Y.T., Lee, D. & Ideker, T. Network-Based Classification of Breast
Cancer Metastasis. Mol Syst Biol 3, 140 (2007).
65. Dutkowski, J. & Ideker, T. Protein Networks as Logic Functions in Development and Cancer.
PLoS Comput Biol 7, e1002180 (2011).
66. Hofree, M., Shen, J.P., Carter, H., Gross, A. & Ideker, T. Network-Based Stratification of
Tumor Mutations. Nat Methods 10, 1108-15 (2013).
67. Ideker, T., Dutkowski, J. & Hood, L. Boosting Signal-to-Noise in Complex Biology: Prior
Knowledge Is Power. Cell 144, 860-3 (2011).
68. Carvunis, A.R. & Ideker, T. Siri of the Cell: What Biology Could Learn from the iPhone. Cell
157, 534-8 (2014).
69. Chuang, H.Y. et al. Subnetwork-Based Analysis of Chronic Lymphocytic Leukemia Identifies
Pathways That Associate with Disease Progression. Blood 120, 2639-49 (2012).
70. Costanzo, M. et al. The Genetic Landscape of a Cell. Science 327, 425-31 (2010).
71. Winzeler, E.A. et al. Functional Characterization of the S. cerevisiae Genome by Gene
Deletion and Parallel Analysis. Science 285, 901-6 (1999).
72. Hillenmeyer, M.E. et al. The Chemical Genomic Portrait of Yeast: Uncovering a Phenotype for
All Genes. Science 320, 362-5 (2008).
73. Bloom, J.S., Ehrenreich, I.M., Loo, W.T., Lite, T.L. & Kruglyak, L. Finding the Sources of
Missing Heritability in a Yeast Cross. Nature 494, 234-7 (2013).
74. Novershtern, N. et al. Densely Interconnected Transcriptional Circuits Control Cell States in
Human Hematopoiesis. Cell 144, 296-309 (2011).
75. Laurenti, E. et al. The Transcriptional Architecture of Early Human Hematopoiesis Identifies
Multilevel Control of Lymphoid Commitment. Nat Immunol 14, 756-63 (2013).
76. Kinsella, R.J. et al. Ensembl BioMarts: A Hub for Data Retrieval across Taxonomic Space.
Database (Oxford) 2011, bar030 (2011).
77. Turner, B. et al. iRefWeb: Interactive Analysis of Consolidated Protein Interaction Data and
Their Supporting Evidence. Database (Oxford) 2010, baq023 (2010).
78. Zuberi, K. et al. GeneMANIA Prediction Server 2013 Update. Nucleic Acids Res 41, W115-22 (2013).
79. Cerami, E.G. et al. Pathway Commons, a Web Resource for Biological Pathway Data. Nucleic
Acids Res (2010).
80. Jeong, H., Mason, S.P., Barabasi, A.L. & Oltvai, Z.N. Lethality and Centrality in Protein
Networks. Nature 411, 41-2 (2001).
81. Yu, H., Kim, P.M., Sprecher, E., Trifonov, V. & Gerstein, M. The Importance of Bottlenecks in
Protein Networks: Correlation with Gene Essentiality and Expression Dynamics. PLoS Comput
Biol 3, e59 (2007).

Technology R&D Theme 3: Multi-scale Network Representations

  • 1. TRD 3: MULTI-SCALE NETWORKS – PROJECT SUMMARY Although networks have been extremely useful for representing molecular interactions and mechanisms, network diagrams do not visually resemble the contents of cells. Rather, the cell involves a multi-scale hierarchy of components – proteins are subunits of protein complexes which, in turn, are parts of pathways, biological processes, organelles, cells, tissues, and so on. In this Technology Research and Development Project (TRD), we will pursue methods that move Network Biology towards such hierarchical, multi-scale views of the structure and function of biological systems. Biological ontologies are one very successful framework for capturing hierarchical multi-scale organization, but they have so far been only indirectly connected to biological networks and other types of ‘omics data. Recently, we introduced methods for inferring the terms and term relations of a gene ontology directly from the hierarchical structure contained in molecular networks, and we prototyped a web resource to distribute network-based ontologies (NeXO, nexontology.org). This recent progress motivates and lays groundwork for our present focus on hierarchical multi-scale representations. Specific aims are to develop tools that: (1) Iteratively and flexibly incorporate new network experimental results into a ‘working’ NeXO ontology, (2) Use a gene ontology structure, either inferred or literature curated, to guide an engine for generalized functional predictions, and (3) Explore multi-scale analysis above the cellular level, by bridging ligand-receptor networks to networks of cell-cell communication. These aims are stimulated by a range of Driving Biomedical Projects involving the Gene Ontology project, the Saccharomyces Genome Database, a Cancer Gene Ontology, and multi-scale analysis of viral-host, cell-cell communication and social networks.
Ultimately, all research aims synergize to use network data to propel hierarchical models of biological structure and function.
  • 2. TRD 3: MULTI-SCALE NETWORKS – PROJECT NARRATIVE Although networks have been extremely useful for representing interactions and group formation, network diagrams fail to capture important aspects of biological structure and function. We will pursue methods that move Network Biology towards more accurate hierarchical, multi-scale views of biological systems. The hierarchical models developed here will enable integration of both basic and clinical data to predict disease outcomes in response to specific therapies.
  • 3. TRD 3: MULTI-SCALE NETWORKS – SPECIFIC AIMS Although networks have been very useful for representing molecular interactions and mechanisms, network diagrams do not visually resemble the contents of cells. Rather, the cell involves a multi-scale hierarchy of components – proteins are subunits of protein complexes which, in turn, are parts of pathways, biological processes, organelles, cells, tissues, and so on. In this technology research project, we will pursue methods that move Network Biology towards such hierarchical, multi-scale views of biological structure and function. Aim 1. Assembly and refinement of gene ontology structure from biological network data. Ontologies have been very successful at capturing hierarchical, multi-scale cellular organization. In the prior period of support we introduced methods for assembling a gene ontology directly from the hierarchical structure evidenced by molecular networks and other ‘omics data. We prototyped a web resource to distribute network-based ontologies (NeXO, nexontology.org), but it is still at an early stage. In the next support period, we will research methods to iteratively and flexibly incorporate new experimental results and data into a ‘working’ NeXO ontology, highlighting new terms and term relations that are created, alongside existing terms/relations that are further supported or weakened. We will transform nexontology.org to an interactive community resource that enables investigators not only to browse an existing ontology but to create, share, and iteratively update, revise, correct, and expand these ontologies. The potential of this aim is to effectively systematize and crowd-source an important type of biological model – the ontology. Aim 2. Functionalized gene ontologies as a hierarchy of phenotypic prediction. Hierarchy and scale are important not only for capturing the physical architecture of a system (Aim 1) but also its function. 
Recent progress in artificial intelligence (AI), embodied by agents such as Siri and Watson, inspires an approach for moving from networks and gene ontologies, which are currently descriptive in nature, towards predictive models that are able to predict a range of cellular phenotypes and answer biological questions. Using these AIs as a rough inspirational guideline, we will develop gene ontologies as a major platform for the functional translation of genotype to phenotype, with a particular focus on personalized cancer therapeutics. This aim intersects with the separate TRD project on Predictive Networks and serves as a bridge between the two TRDs. Aim 3. Bridging ligand-receptor networks to cell-cell communication networks. We will also explore multi-scale network analysis above the cellular level, in the context of an emerging class of biological networks called cell-cell interaction networks. In these networks nodes are cells, and edges represent physical or chemical (e.g. hormonal) interactions. Inter-cellular signaling and regulation networks could in the future be controlled to grow artificial organs, heal tissues and develop novel therapies. We will infer, analyze and visualize multi-scale models of inter-cellular communication networks and their corresponding intracellular signaling networks and pathways, which link to traditional molecular interaction network analysis methods. For instance, we will use network analysis methods to identify potential control points in the cell-cell and intracellular interaction network with applications to regenerative medicine (growing blood from stem cells). Growth of data and analysis methods in this area will enable network science to contribute to the wider understanding of physiological systems. 
These aims are stimulated by a range of Driving Biomedical Projects involving the Gene Ontology project, the Saccharomyces Genome Database, a Cancer Gene Ontology, and multi-scale analysis of viral-host, cell-cell communication and social networks. Ultimately, all research aims synergize to use network data to propel hierarchical models of biological structure and function.
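The cell-cell networks of Aim 3, in which nodes are cells and edges are chemical interactions, can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that an edge is drawn from a sender to a receiver whenever the sender secretes a ligand whose receptor the receiver expresses. The cell names, ligand and receptor labels, and function names below are all hypothetical and are not taken from the proposal's data.

```python
from collections import defaultdict

# Hypothetical inputs (assumptions for illustration only): ligands each cell
# type secretes, receptors it expresses, and known ligand-receptor pairs.
SECRETES = {
    "stem_cell": {"L1"},
    "progenitor": {"L2"},
    "monocyte": {"L1", "L3"},
}
EXPRESSES = {
    "stem_cell": {"R1", "R2"},
    "progenitor": {"R3"},
    "monocyte": set(),
}
LIGAND_RECEPTOR = {("L1", "R1"), ("L2", "R2"), ("L3", "R3")}


def cell_cell_network(secretes, expresses, pairs):
    """Map each (sender, receiver) edge to the ligand-receptor pairs behind it."""
    edges = defaultdict(set)
    for sender, ligands in secretes.items():
        for receiver, receptors in expresses.items():
            for ligand, receptor in pairs:
                if ligand in ligands and receptor in receptors:
                    edges[(sender, receiver)].add((ligand, receptor))
    return dict(edges)


def split_autocrine_paracrine(edges):
    """Separate self-signalling edges from edges between different cells."""
    autocrine = {edge for edge in edges if edge[0] == edge[1]}
    paracrine = set(edges) - autocrine
    return autocrine, paracrine


network = cell_cell_network(SECRETES, EXPRESSES, LIGAND_RECEPTOR)
autocrine, paracrine = split_autocrine_paracrine(network)
```

Each cell-level edge keeps the molecular pairs that support it, which is one simple way to keep the inter-cellular network linked to the intracellular one.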
  • 4. TRD 3: MULTI-SCALE NETWORKS – RESEARCH STRATEGY SIGNIFICANCE Why it is time to move beyond flat models of biological networks. Like any model of the world, our view of the cell is inescapably bound by the time and place in which we live. Over the years different schools have fashioned the cell in a variety of forms, from bags of enzymes1 , to metabolic channels2 , to feedback circuits3 , to complex systems4 , to gels5 , to self-modifying programs in software6 . A model that has pervaded cell biology for the past fifteen years is the so-called “network” view (Figure 1A), which has bloomed in parallel with the emergence of human-made networks such as the Internet and Facebook. This view treats cells as containers for vast networks of “nodes” (genes, gene products, metabolites, or other biomolecules) connected by “links” (physical interactions or functional associations)7 . Network representations of the cell flow directly from the ability to characterize not only genes and proteins in isolation, but also their functional similarities and physical binding partners— a major outcome of transcriptomics and proteomics approaches. Analysis of network information, whether biological or human-made, is an active field leading to algorithms that detect nodes with strategic positions within a network7 or that analyze networks to identify modular structures8 (a topic of earlier progress during the past period of support for the NRNB). While incredibly influential, the network is likely not the ultimate representation of a cell, for two reasons. First, network diagrams do not visually resemble the contents of cells. Nowhere in the cell do we observe actual wires running between genes and proteins– unlike for the Internet, which is truly a network of wires among processing units. Rather, the cell involves a multi-scale hierarchy of components that is not readily captured by basic network representations. 
For example, the proteasome has been mapped extensively to identify its key genes and interactions, but the network visualization of these data (Figure 1A) is very different from the proteasome’s spatial appearance (Figure 1B). The interactions making up the proteasome factor into a regulatory particle and a core, which, in turn, factor into a base and a lid, and an alpha and beta subunit, respectively. This hierarchical structure is obscured by the network visualization of pairwise relationships between gene products. Aim 1 will address this shortcoming, by using molecular networks and other ‘omics data to build hierarchical models of the cell parallel to the Gene Ontology9 . Figure 1. From networks to ontologies. (A) Network representation of three types of interactions that form the proteasome structure, displayed using a force directed layout. (B) Cartoon representation of the structure of the proteasome (PDB entry 4b4t), created by integrating partial crystallographic structures obtained by analysis of 2.4 million images from electron microscopy. (C) Hierarchical factorization of the proteasome sub-components as described by our data-driven gene ontology NeXO. Across all panels, colors indicate membership to the core complex beta subunit (red), core complex alpha subunit (orange), regulatory particle lid complex (blue) and regulatory particle base complex (purple) according to the GO (A), the Protein Data Bank (B) and NeXO (C).
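The contrast between the flat network view and the hierarchical factorization of the proteasome can be made concrete with a small data-structure sketch. The nesting follows the text (regulatory particle into base and lid, core into alpha and beta subunits). The two genes listed per leaf are a small illustrative subset chosen from general knowledge of the yeast proteasome, not from the NeXO data, and the function names are hypothetical.

```python
from itertools import combinations

# Nested terms as described in the text; leaf gene lists are an
# illustrative subset (an assumption, not the NeXO content).
PROTEASOME = {
    "proteasome": {
        "regulatory particle": {
            "base": ["RPT1", "RPT2"],
            "lid": ["RPN3", "RPN11"],
        },
        "core": {
            "alpha subunit": ["SCL1", "PRE8"],
            "beta subunit": ["PRE1", "PUP1"],
        },
    }
}


def genes_under(node):
    """Collect every gene below a node (a leaf gene list or a dict of sub-terms)."""
    if isinstance(node, list):
        return set(node)
    genes = set()
    for child in node.values():
        genes |= genes_under(child)
    return genes


def terms(tree):
    """Yield (term name, gene set) for every term, at every scale."""
    for name, child in tree.items():
        yield name, genes_under(child)
        if isinstance(child, dict):
            yield from terms(child)


TERMS = dict(terms(PROTEASOME))

# The flat view: every pair of proteasome genes joined by an edge.
flat_edges = list(combinations(sorted(TERMS["proteasome"]), 2))
```

In this toy case the flat view has 28 pairwise edges that say nothing about sub-structure, while the hierarchy describes the same 8 genes with 7 nested terms.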
  • 5. From description to prediction. Second, many of the molecular networks published to date, including many from the NRNB or earlier research by our labs10-21 , are descriptive maps of physical or functional connectivity rather than predictive models. For example, technologies such as yeast two hybrid, protein affinity purification, and chromatin immunoprecipitation are often used to define and draw large networks of protein-protein and protein-DNA interactions22 , but these static maps do not, by themselves, predict cell behavior. Although we and many others in the field of network biology have inferred networks capable of predicting gene function or phenotypic responses [reviewed here23,24 ; network inference was the focus of previous Aim 4 of the past funding period], these efforts have tended to focus on a specific class of predictions, i.e. gene expression level or cell growth rate. Assembling a model that would predict a range of phenotypes, rather than only one type of outcome, requires understanding how phenotypes are interrelated. Here again a hierarchy is important, since cellular organization involves a multi-scale hierarchy not only in structure but also in function. For example, the proteasome is a central component of ubiquitin-mediated protein degradation, which, depending on an intricate set of inputs and rules, can result in cellular homeostasis, differentiation, death, and other fates. This multi-scale hierarchy of processes is, again, simply not exposed by a standard pairwise network representation. Aim 2 will address this shortcoming by developing methods to ‘functionalize’ the Gene Ontology, so that it is not merely a static description of the contents of cells, but an active framework for predicting phenotype from genotype. From networks to ontologies: Building better models of cell structure from omics data. 
To capture hierarchical organization, a particularly promising direction in computer science has been the development of the ontology, a model that divides its subject domain into a set of fundamental concepts or entities and relationships among those entities25 . Ontologies arise from the metaphysics branch of philosophy, which is concerned with the nature of what exists and the categories into which the world’s objects naturally fall. Ontologies build upon and extend network models in two key ways: ‘entities’ refer not only to elemental objects but also to any meaningful grouping of objects, and ‘relationships’ refer not only to direct connections but also to nested structures, such as one entity being a part or type of another. Thus, ontologies explicitly allow for a higher order organization of knowledge, missing from raw networks. They have been key for building powerful knowledge representation and reasoning systems in many domains26 including biomedicine27 . Ontologies became very influential in cell biology through the development of the Gene Ontology (GO)9 . GO is a major resource of knowledge about genes, gene products, and the hierarchy of cellular components, molecular functions and biological processes in which they participate. Entities in GO (GO terms) are hierarchical groupings of other entities. The GO resource is presently very large, with nearly 35,000 GO terms connected by ~65,000 hierarchical term-term relations, describing more than 80 different species. The impact of GO is hard to overstate – just try to think of a single modern ‘omics analysis that does not use GO to validate a novel data set or approach, or to generate new mechanistic hypotheses. In a sense GO is the most universal, and universally accepted, model of a cell that we currently have. One limitation of GO lies in the fact that the ontology structure is constructed by a diverse team of scientists according to their best abilities to curate the published scientific literature. 
Thus, GO inevitably misses the large proportion of cell biology that is not yet known or has not yet been curated, and it contains biases that are hard to control. To address these challenges, in the prior period of support we investigated whether gene ontologies could be inferred computationally directly from systematic molecular interaction networks28 . In this study, a large fraction of the GO hierarchy was recapitulated de novo, directly from network data gathered in budding yeast. For example, the pairwise interaction network for genes and gene products encoding the proteasome (Figure 2A) was transformed to infer the hierarchical structure of proteasomal components to a high degree of accuracy (Figure 2C). In addition, several hundred cellular entities were identified from the data that had not yet been catalogued in GO, pointing to potentially novel or uncurated molecular machinery which we are pursuing in collaboration with the Gene Ontology Consortium (formerly a CSP, now a DBP).
  • 6. Over the next few years, we will expand on this preliminary work to introduce a system for organizing molecular interactions and cancer ‘omics data as a genomics-driven, crowd-sourced Gene Ontology. This will address several parallel challenges in the ‘omics sciences: (1) The need to move beyond clustering to recognize the multi-scale structure embedded in data (2) The need to improve ontologies of gene function in their scalability, consistency and coverage (3) The continued need to provide biomedicine with an accurate map of hallmark pathways and processes that drive disease progression. Taking clues from Siri: ‘Active’ networks and ontologies. Whether based on expert knowledge or inferred from data, current gene ontologies are static descriptions of cellular organization. They enable representing and reasoning on the structural relationships among biological entities27,29 but lack any native capacity to capture dynamic biological states or make phenotypic predictions. However, since gene ontologies inherently represent multi-scale hierarchy in cellular organization, they provide in theory an ideal substrate for building models that would also be predictive of a range of cellular responses and phenotypes. In this respect, intelligent agents developed in the field of knowledge representation and reasoning26 , such as Apple’s Siri and IBM’s Watson, provide an excellent example of what a predictive, or ‘executable’, ontology looks like. At Siri’s core is a series of ontologies containing knowledge that concerns Siri – answers to questions one would normally ask an iPhone30 . For instance, Siri uses an ontology for event planning which treats both meals and movies as types of events, where meals involve a restaurant and a restaurant consists of components such as a name, address, and style of food. In many ways, such ontologies are similar in structure to bio-ontologies such as GO (Figure 2). Figure 2. From ontologies to active ontologies. 
A subset of the Gene Ontology 9 , left, alongside a subset of an active ontology for event planning 30 , right. Red relationships and entities indicate dynamic computation. Unlike gene ontologies, however, which are essentially descriptive, Siri’s ontologies are coupled with dynamic reasoning systems that render them active: “Whereas a conventional ontology is a formal representation of domain knowledge with distinct concepts and relations among concepts, an Active Ontology is a processing formalism where distinct processing elements are arranged according to ontology notions; it is an execution environment”30 . These active ontologies not only encode entities and relations, but entities are associated with states and relations are associated with rule sets that perform actions within and among entities. Through a bottom up execution, input states are incrementally propagated up the hierarchy to impact higher-level entities, whose states are output as the answer to the initial question – the best prediction based on the inputs. For example, try asking Siri to “Find a good sushi restaurant for two tonight”. This query is translated by setting the states of several entities: style is set to ‘sushi’, address to the user’s current location,
  • 7. party size to the value ‘2’, and event date to today’s date (Figure 2). These values are propagated through the ontology to generate a list of restaurants, which becomes the state of the event entity. This event result can then be provided to the user or included in further computations. In Aim 2, we will explore whether such systems can teach us how to develop question-and-answer, or genotype-to-phenotype prediction, systems for cell biology31 . Cell-cell interaction networks. We will also develop technology for understanding network structure above the cellular level. In so-called cell-cell interaction networks, nodes are cells and edges represent physical or chemical (e.g. hormonal) interactions. Chemical interactions are of greatest interest as they describe inter-cellular signaling and regulation pathways, which could in the future be controlled to grow artificial organs, heal tissues and develop novel therapies. Increasing information in this area will enable network science to contribute to the wider understanding of physiological systems. We have gained experience and interest in this area via analysis of two novel experimentally mapped cell-cell interaction networks of the developing human hematopoietic system32,33 in collaboration with Peter Zandstra at the University of Toronto (Zandstra is now a DBP). The Zandstra lab is interested in mapping inter-cellular networks and feedback in regulating stem and progenitor cell fate for the purposes of growing blood from stem cells, which would be safer than blood donations. Cell-cell interaction networks demand new analysis tools that consider their autocrine and paracrine structure and how they are controlled by intra-cellular molecular networks. Despite the recognized importance of inter-cellular networks and feedback in regulating multicellular organism development, the specific cell populations involved and underlying molecular mechanisms are largely undefined.
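The bottom-up execution described for the sushi-restaurant query at the start of this slide can be sketched in a few lines of Python. This is only a toy illustration of the idea that entities hold states and relations carry rules; it is not Siri's implementation, and the `Entity` class, the rule functions, and the restaurant catalogue are all hypothetical.

```python
# Hypothetical catalogue standing in for an external knowledge source.
CATALOGUE = [
    {"name": "Sushi Ko", "style": "sushi", "city": "La Jolla", "max_party": 4},
    {"name": "Taco Stop", "style": "mexican", "city": "La Jolla", "max_party": 6},
    {"name": "Nigiri Bar", "style": "sushi", "city": "San Diego", "max_party": 2},
]


class Entity:
    """An ontology entity with a state; internal entities compute it by rule."""

    def __init__(self, name, children=(), rule=None):
        self.name = name
        self.children = list(children)
        self.rule = rule
        self.state = None

    def evaluate(self):
        # Leaves keep the state set by the query; internal entities derive
        # their state from the states of their children (bottom-up).
        if self.children:
            inputs = {child.name: child.evaluate() for child in self.children}
            self.state = self.rule(inputs)
        return self.state


def restaurant_rule(inputs):
    return [r for r in CATALOGUE
            if r["style"] == inputs["style"] and r["city"] == inputs["address"]]


def event_rule(inputs):
    return [{"restaurant": r["name"],
             "date": inputs["event date"],
             "party size": inputs["party size"]}
            for r in inputs["restaurant"]
            if r["max_party"] >= inputs["party size"]]


style, address = Entity("style"), Entity("address")
party, date = Entity("party size"), Entity("event date")
restaurant = Entity("restaurant", [style, address], restaurant_rule)
event = Entity("event", [restaurant, party, date], event_rule)

# "Find a good sushi restaurant for two tonight" sets the leaf states.
style.state, address.state = "sushi", "La Jolla"
party.state, date.state = 2, "tonight"
result = event.evaluate()
```

By analogy, a functionalized gene ontology would set leaf states from genotype and read a phenotype prediction from higher-level terms, though the rules involved would have to be learned or specified rather than hand-written as here.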
For example, blood cells are known to secrete and respond to a large number of regulatory proteins in lineage- and differentiation stage-specific patterns34,35 . Dynamic mathematical models of cells patterning into tissues during development have been built36-38 , but they function at the cell population/tissue level and treat cells as a compartment or spatial gradient and do not consider actual cell-cell interactions. Perhaps the best-studied cell-cell interaction network is that of the worm, Caenorhabditis elegans, which has been completely mapped over organism development by microscopy. Network analysis by clustering found that interneurons are more densely connected in the nervous system compared to sensory or motor neurons, leading to the interpretation that these cells act as central processing units39 . More recent work predicted cell-cell networks involved in cancer therapy resistance40 , and found that specific network motifs are enriched in inter-cellular cytokine mediated communication networks41 and that specific components are more important than others42 , however this work has thus far studied small cell-cell network models that were never experimentally validated. As technology for single cell and stem cell measurement improves, we expect a growth in the amount of cell-cell network information. We are already observing this growth in projects such as a new CSP from Laurie Ailles at the University Health Network in Toronto, who is studying how cancer-associated fibroblasts provide a supportive microenvironment for cancer stem cells within high-grade serous ovarian cancer and other cancers. New technology she has developed quantifies the protein levels of 363 cell surface antigens in single cell populations43 . INNOVATION Central innovation and hypothesis. 
The central innovation of this TRD project is a set of ideas and approaches for transitioning Network Biology from the current status-quo of flat, pairwise, and descriptive representations of biological interactions, to a future in which the same interaction data lead to the construction of hierarchical models of biological structure and function. We will explore the hypothesis that current network representations, which view a dataset of pairwise interactions as a mathematical graph of nodes and edges, may be “too close” to the raw data to allow for complete or even accurate biological insight. Models derived from the same interactions, such as gene ontologies and biological process diagrams, may form a more intuitive result, provided these multi-scale formulations can avoid the tendency towards over-fitting or -interpretation. The most direct representations of data are not always the most desirable for meaningful interpretation of those data. In x-ray crystallography, the most direct representations of x-ray
  • 8. diffraction patterns are two-dimensional images44 . However, when many such images are integrated and analyzed, exquisite 3D structural models of proteins emerge which, in turn, enable accurate predictions of protein dynamics and function. Similarly, from many molecular measurements and interaction data sets the higher order structure and function of the cell might emerge, if only we could figure out how to assemble these images properly. Turning networks into ontologies: towards a Network-eXtracted Ontology. Recently we and others have shown very promising results in the hierarchical analysis of physical and genetic networks—i.e., that networks harbor rich structure which is not only modular but also hierarchical and multi-scale45-50 (Aim 1 Progress Report). In particular, we have been able to recover ~60% of the hierarchical GO Cellular Component hierarchy de novo, directly from physical and genetic network data gathered in S. cerevisiae and in a manner that is completely independent from the known structure of GO or from the literature. The resulting Network-eXtracted Ontology, which we call NeXO, provides a structured hierarchical interpretation of network data which will in most cases be vastly preferable to flat lists of interaction (a.k.a. interaction ‘hairballs’) or flat lists of network clusters/complexes. The focus of Aim 1, and an innovative aspect of this proposal, is to explore how these ontologies can be iteratively updated by a community of biomedical investigators. 3.1 ASSEMBLY AND REFINEMENT OF ONTOLOGY STRUCTURE FROM BIOLOGICAL NETWORK DATA Project Leader: Trey Ideker (UCSD) Overview. Ontologies have been very successful at capturing hierarchical, multi-scale cellular organization. In the prior period of support we introduced methods for assembling a gene ontology directly from the hierarchical structure contained in molecular networks and other ‘omics data. 
We prototyped a web resource to distribute network-based ontologies (NeXO, nexontology.org), but it is still at an early stage. In the next support period, we will research methods to iteratively and flexibly incorporate new experimental results and data into a ‘working’ NeXO ontology, highlighting new terms and term relations that are created, alongside existing terms/relations that are further supported or weakened. We will transform nexontology.org to an interactive community resource that enables investigators not only to browse an existing ontology but to create, share, and iteratively update, revise, correct, and expand these ontologies. These tools will be built and explored alongside Driving Biomedical Projects including a Yeast Gene Ontology, a Cancer Gene Ontology, a Viral-Host Gene Ontology and a hierarchical exploration of social networks. The goal is a means of systematically incorporating ‘omics data into whole-cell ontological models, with the potential to systematize and crowd-source an important type of model construction. Preliminary Results and Progress Report: Proof-of-concept and maturation of a Network-eXtracted Ontology (NeXO). The previous award supported research by NRNB investigators that led to creation and prototyping of the first gene ontology inferred from ‘omics data, the NeXO Resource28,51 (http://nexontology.org). This work fell naturally under previous TRD-C: Visualization and Representation of Biological Networks. NeXO provides a methodology whereby physical and genetic network data can be transformed to assemble a structured ontology of protein complexes. Using this system, we assembled an ontology based on four large yeast networks capturing current knowledge of physical protein-protein interactions, genetic interactions (synthetic-lethality and epistasis), co-expressed genes, as well as an integrated functional network known as YeastNet52 .
The resulting Network-eXtracted Ontology (NeXO) contains a total of 4,123 terms and 7,804 term-term relationships (Figure 3). Based on alignment of the systematic NeXO to the literature-curated Gene Ontology (GO), it appears that NeXO captures ~60% of terms in the Cellular Component branch of GO. To further validate NeXO vs. GO, we have used both ontologies to perform functional enrichment of gene sets, the task to which GO is most often applied. In this regard, NeXO performs at least as well as GO for functional enrichment in several different genome-scale data sets. Thus, the computed ontology provides functionally-relevant terms which cover a wide spectrum of yeast biology to an
  • 9. extent comparable to manually-curated efforts. Since the original proof-of-concept work was published in early 201328 , we have released a visually integrated website for browsing NeXO and GO ontologies in the style of Google Maps51 . This summer we published a major improvement to the ontology inference algorithm53 which was presented and well-received at the Intelligent Systems in Molecular Biology (ISMB 2014) conference. Progress Report Publications 1-11. Methods Basic inference of ontologies and alignment to a reference. To construct a data-driven ontology, a set of input features is first gathered for each gene, representing information collected from ‘omics studies such as its interaction partners in molecular networks, its expression levels over time or conditions, or other data depending on the DBP. These features are analyzed to generate a pairwise gene-gene similarity matrix, in which the similarity between two genes reflects their closeness in input features. Many methods have been proposed for this purpose54-56 ; at present we have been successful with the technique of random forest regression57 . The pairwise similarity matrix is then clustered (Figure 4) using any of several algorithms we have published in prior work28,53 . For example, our original method is to use a hierarchical probabilistic model for community detection50,58 which constructs a binary tree, or dendrogram, seeking to maximize the overall probability of the network data by iteratively joining sets of genes with similar patterns of interaction. Gene sets, represented by nodes in the tree, are suggestive of biological entities or ‘terms’ in an ontology. Joining of two sets, represented by connecting two nodes beneath a third, suggests specialized terms that are part of a more general one.
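The pipeline just described (per-gene features, a pairwise similarity matrix, then iterative joining into a binary tree) can be sketched as follows. To stay short, this sketch swaps in simpler techniques than those named in the text: Jaccard similarity of interaction-partner sets in place of random forest regression, and average-linkage agglomerative clustering in place of the hierarchical probabilistic community-detection model. Gene names, partner sets, and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical input features: interaction partners of each gene.
PARTNERS = {
    "g1": {"a", "b", "c"},
    "g2": {"a", "b", "c", "d"},
    "g3": {"x", "y"},
    "g4": {"x", "y", "z"},
    "g5": {"c", "x"},
}


def jaccard(s, t):
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def similarity_matrix(features):
    """Pairwise gene-gene similarity, keyed by the unordered gene pair."""
    return {frozenset((a, b)): jaccard(features[a], features[b])
            for a, b in combinations(sorted(features), 2)}


def average_linkage(c1, c2, sim):
    """Mean similarity over all gene pairs spanning two clusters."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(sim[frozenset(p)] for p in pairs) / len(pairs)


def build_tree(features):
    """Return merges as (parent, left, right) gene sets, most similar first."""
    sim = similarity_matrix(features)
    clusters = [frozenset([g]) for g in sorted(features)]
    merges = []
    while len(clusters) > 1:
        left, right = max(combinations(clusters, 2),
                          key=lambda p: average_linkage(p[0], p[1], sim))
        parent = left | right
        merges.append((parent, left, right))
        clusters = [c for c in clusters if c not in (left, right)] + [parent]
    return merges


merges = build_tree(PARTNERS)
```

Each `parent` gene set plays the role of a candidate term, and each merge records that two more specific sets sit beneath a more general one.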
The tree is then expanded to allow for creation of terms with multiple (>2) children and/or parents which is important for identifying complexes with many subunits or which participate in multiple parent processes [transforming the hierarchical tree into a directed acyclic graph— we do not detail this method here but it involves evaluating the probability of the network under the new vs. old structure]. This method yields a novel structure that we call the Network-Extracted Ontology, or NeXO, in which genes are organized under a hierarchy of terms and parent-child term relations strongly supported by the input datasets. At this stage terms simply represent structures detected in data and are given systematic IDs, much like ORFs detected in a newly-sequenced genome. To annotate these terms with information from known biology, the NeXO structure is aligned against a reference ontology, Figure 3. Building the NeXO ontology. The ontology is reduced to a tree, with nodes indicating terms and edges indicating hierarchical relations between terms, i.e. that one term contains another. Node sizes indicate the number of genes assigned to a term. Node colors represent the degree of correspondence to a term in GO as determined by ontology alignment, with high-level alignments labeled. Insets show the hierarchy identified for the ribosome and actin cytoskeleton.
  • 10. much like ORFs are annotated by alignment against a reference genome whose genes are well-annotated. As in past work, our default reference ontology for this step will be the literature-curated Gene Ontology. The desired result of aligning NeXO and GO is to identify NeXO terms that correspond to well-known versus novel structures, as well as GO terms that are well-supported by the available data. For high confidence matches, the GO annotations are transferred to the NeXO term, including the term name and description. Terms that are novel (similar to ‘ORFaned’ genes) may become extremely interesting for further biological exploration and experimental follow-up. Although methods for ontology alignment have not received much attention in molecular biology or bioinformatics, they are under active research in the computer science and semantic web communities. We will implement an ontology alignment algorithm based on a previously-proposed method called ASMOV59 , which was the winning ontology alignment algorithm in the 2010 Ontology Alignment Evaluation Initiative (om2010.ontologymatching.org/). The method was designed to align semantic ontologies, and it is based on a score function that measures the lexical similarity of text labels and comments associated with terms. Hence, we will modify and expand this approach to align ontologies in which the terms refer to sets of genes (technically, the set of genes assigned to a term defines the ‘label’ of that term). Application of current and new procedures for data-driven ontologies to Driving Projects. We will begin work immediately to construct and/or revise data-driven ontologies with each of our Driving Biomedical Projects, an activity that is expected to continue for most of the next five-year performance period. The projects are: 1. Creating new terms and term relations in the Gene Ontology.
Our previous efforts to infer gene ontologies from network data were initially carried out as a Collaboration and Service Project (CSP) with Mike Cherry, Professor of Genetics at Stanford and head of the Gene Ontology Consortium for the Saccharomyces model organism. Together with Cherry, we will continually apply tools developed in this TRD to revise and expand the yeast NeXO based on new data, and to communicate the most promising new terms and term relations it identifies to the Saccharomyces GO.

2. Elucidating the hierarchy of modules in the virus-human protein interaction network. Dr. Nevan Krogan at UCSF is a world leader in generating large-scale maps of protein complexes based on affinity purification mass spectrometry as well as in systems for synthetic lethal genetic interaction screening. NRNB and Krogan have a long-standing relationship in developing physical and genetic interaction maps of biological systems of interest11,12,18,60-62, including the original NeXO paper28. We expect this productive relationship to continue as we develop tools for data-driven assembly and refinement of gene ontologies within this TRD, initially as applied to physical and genetic interactions of viral protein subunits with proteins encoded by the human host.

Figure 4. Automated assembly and alignment of gene ontologies. (A) Probabilistic community detection within the input networks yields a binary tree in which nodes correspond to ontology terms and links correspond to parent-child term relations. Unsupported terms are replaced by multi-way joins, and additional parent-child relations are added based on network data. The resulting ontology is aligned against the Gene Ontology, in a way that (B) prohibits non-unique mappings and ancestor-descendant criss-crossing.
3. Gene ontology inference based on binding-site-resolved ‘edgetic’ protein networks. Drs. Marc Vidal and David Hill are pioneers in protein interaction mapping via the yeast two-hybrid system. Recently they developed the capability to map interactions at binding-site resolution, by using modular protein domains as baits combined with phage-display knowledge of the preferred binding motif of each domain. Together we will explore whether this binding interface information can be used to inform the inferred gene ontology structures we are building in this TRD.

4. Hierarchical analysis of cancer subtypes with TCGA / ICGC and Sage Bionetworks. Cancer genomics projects are generating large cancer-specific ‘omics data sets. Natural DBPs for this project are therefore provided by The Cancer Genome Atlas, the International Cancer Genome Consortium, and Sage Bionetworks, all of which are associated with major cancer genomics projects nationally and internationally. Our focus will be to construct a Cancer Gene Ontology based on a pan-cancer analysis of data from all ~20 major TCGA tissue types. Such a Cancer GO would provide insight into the hierarchy of biological processes and cellular components that is somatically mutated or differentially activated during cancer progression.

5. Understanding the multi-scale hierarchy of social interactions. We will work with UCSD Professor James Fowler, a renowned social networks researcher, to apply the hierarchical methods developed in this aim to analyze the structure of a large social network generated from the Framingham Heart Study. This study has surveyed health behaviors, disease outcomes, and social relationships among >12,000 people for over 37 years25-27.

During these collaborations, we will experiment with ontologies constructed with different sources and types of data, e.g. using genetic interactions only versus those that also include physical interactions and other types.
Such exploration is needed to evaluate which interaction types are most revealing of cellular componentry such as protein complexes and larger macro-molecular structures, and how to weight genetic versus physical interactions for this purpose. We will seek to determine how much interaction data one needs to construct a robust ontology for each of the DBP datasets, e.g., one which is able to faithfully recover a substantial fraction of knowledge in the manually-curated GO. At present, what we know is that this is possible using an integrated network including all genetic and physical interactions that have been mapped to-date for budding yeast. Development of iterative procedures for incorporating new data into a data-driven ontology. We will conduct a major program of exploratory research and development on approaches by which data- driven gene ontologies such as NeXO can evolve over time, by incorporating new datasets as they are generated and published. We will begin by evaluating a relatively straightforward approach, which is to integrate the new dataset(s) into the pairwise gene similarity matrix which forms the input to the ontology inference method (see above). Once the similarities have been adjusted, an ‘updated’ ontology is constructed based on the old+new data and aligned against the ‘previous’ ontology based on old data only. Similar to alignment against GO (see above), the desired result is to identify terms and term relations in the updated ontology that are newly-created as well as previous terms / relations that are reinforced by the new data. Ultimately one might also imagine downgrading or retiring terms that have remained unsupported over many diverse dataset updates, but this is admittedly a more delicate proposition than adding new terms. A limitation of this simple update approach is that the complete ontology must be reconstituted each time a new data set is evaluated. 
An alternative and more optimal approach may be to directly modify the previous ontology using information from the new data set. We will explore both simple and these more advanced approaches in the course of research. Given an update procedure, the experimentalist may wish to design further studies aimed at the new terms. These specially directed new data could then spawn another ontology update, enabling the exciting possibility of continued iteration between improving the ontology (aka the biological model) and the experimental data generation phases of a study.
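As an illustration only, the comparison between an ‘updated’ and a ‘previous’ ontology might be sketched as follows. This is a minimal Python sketch under stated assumptions, not the alignment method proposed above: each ontology is reduced to a mapping from term ID to gene set, terms are matched by Jaccard overlap of their gene sets, and the threshold of 0.5, the function names, the term IDs and the toy gene sets are all hypothetical. Unlike the alignment in Figure 4, it ignores term relations and does not enforce unique mappings.

```python
from typing import Dict, FrozenSet, List, Tuple

# An ontology reduced to: term ID -> set of genes assigned to that term.
Ontology = Dict[str, FrozenSet[str]]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Gene-set overlap |a & b| / |a | b| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compare_update(previous: Ontology, updated: Ontology,
                   threshold: float = 0.5) -> Tuple[Dict[str, str], List[str]]:
    """Match each updated term to its best-overlapping previous term.

    Returns (reinforced, new_terms): `reinforced` maps an updated term to the
    previous term it matches at or above `threshold` (an assumed cutoff);
    `new_terms` lists updated terms with no such match.
    """
    reinforced: Dict[str, str] = {}
    new_terms: List[str] = []
    for term, genes in updated.items():
        best_id, best_score = None, 0.0
        for prev_id, prev_genes in previous.items():
            score = jaccard(genes, prev_genes)
            if score > best_score:
                best_id, best_score = prev_id, score
        if best_id is not None and best_score >= threshold:
            reinforced[term] = best_id
        else:
            new_terms.append(term)
    return reinforced, new_terms


# Toy data (hypothetical term IDs and gene assignments).
previous = {
    "T1": frozenset({"SWI4", "SWI6"}),
    "T2": frozenset({"ACT1", "ARP2", "ARP3"}),
}
updated = {
    "U1": frozenset({"SWI4", "SWI6", "MBP1"}),  # T1 plus one gene
    "U2": frozenset({"RPL3", "RPL5"}),          # no counterpart in `previous`
}
reinforced, new_terms = compare_update(previous, updated)
print(reinforced, new_terms)
```

In this toy case U1 is reported as a reinforced version of T1 and U2 as a newly created term. Previous terms that never appear as a match (here T2) would be the candidates for the downgrading or retirement discussed above.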
An online system for distribution and community construction of data-driven ontologies. Ontology models developed with our DBPs will be made available to the scientific community via query from the stand-alone NeXO website, nexontology.org, as well as through a specialized App for Cytoscape. We will also prototype a web-based system whereby a unified and common ‘Crowd-Sourced NeXO Ontology’ can be iteratively updated from biological data sets uploaded by investigators from the biomedical research community at large. Achieving this vision will require the addition of major features to nexontology.org, including user accounts, data upload, and a cloud-based implementation of ontology inference. If successful, we will seek to transition the new website to independent funding to support what could ultimately become a large community of users. The allure of such a system is that the wealth of ‘omics data being generated every year could be analyzed to assemble different types of gene ontology systematically, with less and less reliance on back curation of the literature. Ultimately, the desired outcome is to enable a shift from using ontologies to evaluate data to using data to construct and evaluate ontologies, that is, from a regime in which the ontology is viewed as gold standard to one in which it is the major result.

3.2 FUNCTIONALIZED GENE ONTOLOGIES AS A HIERARCHY OF PHENOTYPIC PREDICTION
Project Leader: Trey Ideker (UCSD)

Overview. Whether based on expert knowledge (GO) or inferred from data (NeXO in Aim 1), current gene ontologies are static descriptions of cellular structure and organization. They enable representing and reasoning on the structural relationships among biological entities27,29 but lack any native capacity to capture dynamic functional states or make phenotypic predictions.
However, since gene ontologies inherently represent multi-scale hierarchy in cellular organization, they provide in theory an ideal substrate for building models that would also be predictive of a range of cellular functions and phenotypes. In this respect, question and answer systems developed in the field of knowledge representation and reasoning26 , such as Apple’s Siri and IBM’s Watson, provide an excellent example of what a predictive, or ‘executable’, ontology looks like. In this aim, we will explore whether such systems can teach us how to develop predictive systems for cell biology31 . This aim intersects with the separate TRD project on Predictive Networks and serves as a bridge between the two TRDs. Preliminary Results and Progress Report: Activating static networks as predictive models. The Ideker laboratory has over the years introduced a progression of approaches that seek to use molecular network information to guide the prediction of phenotypic outcomes such as disease state or drug response. Relevant works include ActiveModules63 , Network-Based Classification64 , Network- Guided Forests65 , Network-Based Stratification66 and several influential reviews on using networks predictively67,68 . The more recent works (2011 to present) were supported by the past period of NRNB funding. Generally, our methodology has been to identify subnetworks of genes whose expression levels (molecular profile) or mutation states (genotype) can be functionally combined to predict disease outcome (phenotype or class). For example, Network-Guided Forests is a classification method that associates subnetworks of genes with decision trees that evaluate the expression levels of those genes to predict sample class. Such approaches have shown success in classification of metastatic vs. non-metastatic breast cancer64 , aggressive vs. indolent leukemia69 , as well as classification of cell fate decisions during development16,65 . 
We have found repeatedly that, unlike the gene sets identified by regular classifiers, the subnetworks identified by network-based methods are highly enriched for causal factors of disease, and they show very consistent performance across different sample datasets. Progress Report Publications 12-17.

Methods

Taking clues from Siri: propagation of state on predictive ontologies. We will explore use of the structure of ontologies, rather than the structure of networks, in making phenotypic predictions. The key distinction is that networks are concerned mainly with pairwise associations between genes,
whereas ontologies represent hierarchical relations across a range of biological modules at various scales including genes and proteins, protein complexes, pathways and processes, and organelles. Question-answering systems such as Apple’s Siri provide a useful model of how hierarchical relations in an ontology can propagate state information. Unlike current gene ontologies, which are descriptive, Siri’s ontologies are coupled with dynamic reasoning systems that render them active: “Whereas a conventional ontology is a formal representation of domain knowledge with distinct concepts and relations among concepts, an Active Ontology is a processing formalism where distinct processing elements are arranged according to ontology notions; it is an execution environment”30. These active ontologies not only encode entities and relations; entities are also associated with states, and relations are associated with rule sets that perform actions within and among entities. During execution, input states are incrementally propagated up and down the hierarchy to impact other entities, whose states provide the answer to the initial question, i.e. the best prediction based on the inputs. How the ontologies within Siri are used to answer questions, however, is very different from how GO is used today in bioinformatics. Typically, GO terms are associated with a set of genes (annotations), but not with dynamic states; the relationships between GO terms are not associated with rule sets that perform actions, at least beyond propagation of gene set annotations. Given this contrast, we will explore construction of such an ‘active’ gene ontology as a general engine for genotype-phenotype translation.

Genotype-to-phenotype prediction challenges from Driving Biomedical Projects. We will base our methods development on data and prediction challenges motivated by DBPs in yeast (Cherry DBP) and cancer (TCGA DBP).
Yeast has by far the largest number of genotype-phenotype measurements of any organism: most single and double gene knockout strains have been constructed and assayed for growth, yielding over 10 million ‘simple’ genotypes systematically tested for the same phenotype70- 72 . In addition, hundreds of natural yeast genetic isolates have been fully sequenced and extensively phenotyped, providing examples of complex genotype backgrounds73 . In cancer, TCGA currently has tumor exomes available for over 8000 cancer patients (genotypes), along with clinical information such as survival time, tumor grade, and in some cases drug response (phenotypes). In both yeast and cancer, the goal is to predict the phenotype of growth, survival, etc. given the genotype of a strain or patient. Transformation of genotype to ‘ontotype’. The genotype indicates the set of mutation states of all genes, which for each gene might be represented simply as {mutated, wildtype} or {loss-of-function, wild-type, gain-of-function} before considering more precise values. We will prototype propagation approaches by which these states on genes can be integrated with a gene ontology to infer corresponding states on terms. For example, since the gene SWI4 encodes a subunit of the SBF complex, the yeast swi4Δ genotype {Swi4 <= loss-of-function} might propagate upwards in the ontology to set the state of the parent term {SBF transcription complex <= loss-of-function}, and continue to propagate upwards to affect ancestor terms at higher scales such as ‘RNA pol II transcription factor complex’ and ultimately ‘nucleus’ and ‘cell’. We call the set of mutation states of all terms the ‘ontotype.’ For prediction problems, the ontotype and genotype can then be used together or separately as a set of features for classification of a phenotypic class, e.g. {alive, dead}, or regression against a quantitative phenotype, e.g. numerical growth rate or progression-free time interval. 
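The swi4Δ example above can be made concrete with a small sketch. This is a hedged Python illustration, not the proposed method: it assumes a two-valued state per gene, a hand-written toy hierarchy, and a simple OR rule (a term is set to loss-of-function if any gene beneath it is), which is only one of the candidate propagation functions. The names `ontotype`, `parents`, `LOF` and `WT` are introduced here for illustration.

```python
from typing import Dict, List, Set

LOF, WT = "loss-of-function", "wild-type"


def ontotype(parents: Dict[str, List[str]], genotype: Dict[str, str]) -> Dict[str, str]:
    """Propagate gene states upward through a term hierarchy (a DAG).

    `parents` maps each gene or term to its parent terms. Assumed rule: a term
    becomes loss-of-function if any descendant gene is loss-of-function (OR).
    """
    # Every term that appears as a parent starts as wild-type.
    state = {term: WT for ps in parents.values() for term in ps}
    stack = [gene for gene, s in genotype.items() if s == LOF]
    seen: Set[str] = set()
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if parent not in seen:  # visit each ancestor once
                seen.add(parent)
                state[parent] = LOF
                stack.append(parent)
    return state


# Toy hierarchy following the SWI4 example in the text (ACT1 branch added for contrast).
parents = {
    "SWI4": ["SBF transcription complex"],
    "SBF transcription complex": ["RNA pol II transcription factor complex"],
    "RNA pol II transcription factor complex": ["nucleus"],
    "nucleus": ["cell"],
    "ACT1": ["actin cytoskeleton"],
    "actin cytoskeleton": ["cell"],
}
genotype = {"SWI4": LOF, "ACT1": WT}  # the swi4-delta strain
result = ontotype(parents, genotype)
for term, s in result.items():
    print(f"{term}: {s}")
```

Under the OR rule every ancestor of SWI4 up to ‘cell’ is set to loss-of-function while ‘actin cytoskeleton’ stays wild-type, which shows why a plain OR is likely too coarse at higher scales. The resulting term states can nonetheless serve as features for the classification or regression tasks described above.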
Alternatively, the state of any particular term, representing a cellular component or process, can itself be considered as the phenotype of interest. Predictions will be benchmarked using metrics such as ROC and PR curves along with standard statistical techniques such as cross-validation or bootstrapping. Open questions and milestones. A major research question will be to determine how to dynamically compute the states of ontology terms based on the states of their children, parents, descendants, and ancestors. The underlying mathematical function could take many forms, including logic gates such as AND / OR, linear or additive functions, probabilistic functions, or polynomial or logistic equations. How
to determine the specific forms and parameters of these functions, regardless of what form they take, is also unclear. This step could happen by statistical association from many input-output examples using machine learning methods, by including externally generated biological knowledge specific to each entity, or by manual curation from literature. As this aim is quite exploratory, we do not include specific algorithmic plans or mathematical details here. Some important milestones for success, however, will be (1) a proof-of-principle bioinformatic method for propagating molecular profiles on a gene ontology to predict a phenotypic outcome, and (2) implementation of this method in a robust software tool as a Cytoscape App.

3.3 BRIDGING LIGAND-RECEPTOR NETWORKS TO CELL-CELL COMMUNICATION NETWORKS
Project Leader: Gary Bader (University of Toronto)

Overview. Cell-cell interaction networks are an emerging area of network science. In collaboration with the Zandstra DBP, which is mapping cell-cell interaction networks in the hematopoietic system to help engineer blood tissue, we will develop novel technology for cell network analysis. We will develop methods to infer cell-cell interaction networks from molecular profiling data of purified cell populations, cell-cell interaction network topology analysis software, methods to identify intracellular pathways that control cell-cell interactions, and methods to visualize multi-scale models of inter-cellular communication networks and their intracellular signaling systems.

Preliminary Results and Progress Report. In the past funding period, we worked with the Zandstra lab to prototype cell-cell interaction network inference methods and their analysis. Two papers were published in Molecular Systems Biology that experimentally mapped novel cell-cell interaction networks for the purpose of identifying growth and inhibitory factors that modulate self-renewal, which is useful for blood stem cell control.
The second paper included network topology analysis and discovered that ligand production is cell type dependent, whereas ligand binding is promiscuous. Consequently, additional control strategies such as cell frequency modulation and compartmentalization were needed to achieve specificity in HSC fate regulation. These proof-of-concept methods now need to be further developed to extend and streamline their use, as described below. Progress Report Publications 20,118.

Methods

Cell-cell interaction network inference from single cell population molecular profiles. Cell-cell interaction networks are currently mapped by inferring regulatory relationships based on the expression of transmitters and receptors at the cell surface. For instance, if cell type A expresses the epidermal growth factor peptide hormone and cell type B expresses the epidermal growth factor receptor protein, and there is a means to transmit the hormone to the target receptor (e.g. by diffusion within a tissue or in the blood stream), then a directional edge is inferred from cell type A to cell type B. This process depends on the availability of relatively pure cell populations and the ability to measure the expression of their secreted and surface proteins, both of which are practical with current technology43,74,75. We will develop technology to automatically process mRNA and protein expression profiles from cell populations into cell-cell interaction networks using the following steps:

1. Identify all known ligands and receptors based on known gene function annotation. For instance, using gene ontology terms “cytokine activity,” “growth factor activity,” “hormone activity,” and “receptor activity,” genes with ligand or receptor activity will be compiled from the Ensembl BioMart web service76.
2. Collect all known protein interactions between ligands and receptors (e.g. from iRefIndex77, GeneMANIA78, Pathway Commons79 and related comprehensive interaction resources). We have previously literature-curated ~270 ligand-receptor pairs not currently in standard databases, and these will also be included32,33.
3. Compile a list of expressed ligands and receptors from each available cell type population, based on available gene or protein expression data43,74,75. We will prefer protein expression information, but will use mRNA expression levels as a proxy when protein levels are not available (with appropriate caveats).
4. Infer directed regulatory edges between expressed ligand and receptor pairs.
5. Visualize the resulting cell-cell interaction network.

Preliminary work successfully used this approach, but we will develop it into a generally applicable technology that can be conveniently and automatically updated. Our initial focus will be on available human data, but the technology will be applicable to any organism with enough information available.

Discovery of key players and rules of cell-cell interaction networks. We will develop technology to make it easy for biologists to computationally analyze the topological properties of cell-cell interaction networks to help identify key control points and general organizational principles. We will use multiple established measures of node importance in networks (centrality measures), including hub detection (find highly connected nodes that when removed cause the network to split into parts80) and betweenness centrality (find important connection points between different network regions81). This analysis will be accomplished using the CytoHubba, CentiScaPe and/or NetMatch network analyzer Cytoscape apps, which we will tailor to function on directed cell-cell interaction networks. In particular, selected network analysis functions in these apps will be published as Cytoscape commands so they can be made available in a cell-cell interaction network analysis app that we will develop.

Identify intracellular pathways that control and are controlled by cell-cell interactions.
We will develop novel computational methods to explain how signals observed to occur between cells are controlled by and control internal molecular networks and pathways. First, we will gather an intracellular network of physical molecular and control interactions between all identified receptors and secreted chemical signal genes from available molecular interaction and pathway databases (e.g. iRefIndex, GeneMANIA, Pathway Commons). We will then use established path finding algorithms (e.g. as implemented in Cytoscape apps such as PathExplorer and in the Pathway Commons web service system) to identify potential signaling pathways that control chemical signal secretion, and links from activated receptors to activation of pathways in target cells. Paths will be limited to genes expressed in the given cell population. To identify pathways that are controlled by a given cell-cell communication path, we will apply pathway enrichment analysis to downstream molecules in target cells. Thus, we will predict how inter-cellular signaling impinges on intracellular systems, which in turn could impinge on additional cell-cell signaling paths. We will also use the Pathway Extraction and Reduction Algorithm (PERA) method described in TRD1 to identify signaling systems involving cell- cell communication factors. Multi-scale visualization of cell-cell interaction networks in the context of internal molecular networks. We will develop novel multi-scale network visualization methods to help interpret networks generated in this aim. In particular, we will group ligand and receptor families (using Cytoscape’s grouping function) to reduce complexity of the resulting network, based on family information in the Gene Ontology. We will also develop methods to display intracellular molecular paths, where nodes represent genes, within nodes representing cells. 
These paths will also connect to intracellular nodes representing pathways to visualize which pathways are activated by specific cell-cell communication signals. Links with other TRDs. As the active collection of molecular profiles for secreted and receptor protein expression grows, we expect data sets to become available that cover multiple time points and samples (e.g. disease patients and healthy controls). Thus, we will develop multi-scale cell-cell interaction networks across conditions and use technology developed in the Differential Networks TRD to compare them. We will also explore how patient specific versions of these networks can be used as predictive features in work described in the Predictive Networks TRD.
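To make the network inference procedure of Section 3.3 concrete (steps 3 and 4 in particular), a minimal Python sketch is given here. It is an illustration under stated assumptions rather than the planned implementation: ligand-receptor pairs and per-cell-type expression calls are supplied as small in-memory toy collections instead of being compiled from Ensembl BioMart or the interaction databases, the condition that the ligand can physically reach the receptor is not modeled, and the function names `infer_cell_network` and `out_degree` as well as the cell type labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# (sender cell type, receiver cell type, ligand, receptor)
Edge = Tuple[str, str, str, str]


def infer_cell_network(pairs: List[Tuple[str, str]],
                       expressed: Dict[str, Set[str]]) -> List[Edge]:
    """Add a directed edge sender -> receiver for every known ligand-receptor
    pair whose ligand is expressed in the sender and receptor in the receiver."""
    edges: List[Edge] = []
    for ligand, receptor in pairs:
        senders = [c for c, genes in expressed.items() if ligand in genes]
        receivers = [c for c, genes in expressed.items() if receptor in genes]
        for sender in senders:
            for receiver in receivers:
                edges.append((sender, receiver, ligand, receptor))
    return edges


def out_degree(edges: List[Edge]) -> Dict[str, int]:
    """Number of distinct cell types each sender signals to (a crude hub measure)."""
    targets: Dict[str, Set[str]] = {}
    for sender, receiver, _ligand, _receptor in edges:
        targets.setdefault(sender, set()).add(receiver)
    return {cell: len(cells) for cell, cells in targets.items()}


# Toy inputs (hypothetical expression calls for three cell populations).
pairs = [("EGF", "EGFR"), ("KITLG", "KIT")]
expressed = {
    "cell_type_A": {"EGF"},
    "cell_type_B": {"EGFR", "KIT"},
    "cell_type_C": {"KITLG", "EGFR"},
}
edges = infer_cell_network(pairs, expressed)
hubs = out_degree(edges)
for edge in edges:
    print(edge)
print(hubs)
```

In this toy case cell type A signals to B and C through EGF, and C signals to B through KITLG. A cell type that expresses both members of a pair would receive a self-edge, which corresponds to autocrine signaling. The resulting edge list is the kind of directed network on which the centrality analyses described above would operate.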
  • 16. TRD 3: MULTI-SCALE NETWORKS – BIBLIOGRAPHY AND REFERENCES CITED 1. Mathews, C.K. The Cell-Bag of Enzymes or Network of Channels? J Bacteriol 175, 6377-81 (1993). 2. Reddy, G.P., Singh, A., Stafford, M.E. & Mathews, C.K. Enzyme Associations in T4 Phage DNA Precursor Synthesis. Proc Natl Acad Sci U S A 74, 3152-6 (1977). 3. Monod, J., Changeux, J.P. & Jacob, F. Allosteric Proteins and Cellular Control Systems. Journal of Molecular Biology 6, 306-& (1963). 4. Kauffman, S.A. The Origins of Order : Self-Organization and Selection in Evolution, xviii, 709 p. (Oxford University Press, New York, 1993). 5. Pollack, G.H. Cells, Gels and the Engines of Life : A New, Unifying Approach to Cell Function, xiv, 305 p. (Ebner & Sons, Seattle, WA, 2001). 6. Bray, D. Wetware : A Computer in Every Living Cell, xii, 267 p. (Yale University Press, New Haven ; London, 2009). 7. Barabasi, A.L. & Oltvai, Z.N. Network Biology: Understanding the Cell's Functional Organization. Nat Rev Genet 5, 101-13 (2004). 8. Mitra, K., Carvunis, A.R., Ramesh, S.K. & Ideker, T. Integrative Approaches for Finding Modular Structure in Biological Networks. Nat Rev Genet 14, 719-32 (2013). 9. Ashburner, M. et al. Gene Ontology: Tool for the Unification of Biology. The Gene Ontology Consortium. Nat Genet 25, 25-9 (2000). 10. Novarino, G. et al. Exome Sequencing Links Corticospinal Motor Neuron Disease to Common Neurodegenerative Disorders. Science 343, 506-11 (2014). 11. Bandyopadhyay, S. et al. Rewiring of Genetic Networks in Response to DNA Damage. Science 330, 1385-9 (2010). 12. Roguev, A. et al. Conservation and Rewiring of Functional Modules Revealed by an Epistasis Map in Fission Yeast. Science 322, 405-10 (2008). 13. Workman, C.T. et al. A Systems Approach to Mapping DNA Damage Response Pathways. Science 312, 1054-9 (2006). 14. Konig, R. et al. Human Host Factors Required for Influenza Virus Replication. Nature 463, 813-7 (2010). 15. Suthram, S., Sittler, T. & Ideker, T. 
The Plasmodium Protein Network Diverges from Those of Other Eukaryotes. Nature 438, 108-12 (2005). 16. Ravasi, T. et al. An Atlas of Combinatorial Transcriptional Regulation in Mouse and Man. Cell 140, 744-52 (2010). 17. Bandyopadhyay, S. et al. A Human Map Kinase Interactome. Nat Methods 7, 801-5 (2010). 18. Guenole, A. et al. Dissection of DNA Damage Responses Using Multiconditional Genetic Interaction Maps. Mol Cell 49, 346-58 (2013). 19. Begley, T.J., Rosenbach, A.S., Ideker, T. & Samson, L.D. Hot Spots for Modulating Toxicity Identified by Genomic Phenotyping and Localization Mapping. Mol Cell 16, 117-25 (2004). 20. Srivas, R. et al. A Uv-Induced Genetic Network Links the Rsc Complex to Nucleotide Excision Repair and Shows Dose-Dependent Rewiring. Cell Rep 5, 1714-24 (2013). 21. Jaehnig, E.J., Kuo, D., Hombauer, H., Ideker, T.G. & Kolodner, R.D. Checkpoint Kinases Regulate a Global Network of Transcription Factors in Response to DNA Damage. Cell Rep 4, 174-88 (2013). 22. Chuang, H.Y., Hofree, M. & Ideker, T. A Decade of Systems Biology. Annu Rev Cell Dev Biol 26, 721-44 (2010). 23. Walhout, A.J.M., Vidal, M. & Dekker, J. Handbook of Systems Biology : Concepts and Insights, xiii, 538 p. (Waltham Academic Press, London ;, 2013). 24. Koller, D. & Friedman, N. Probabilistic Graphical Models : Principles and Techniques, xxi, 1231 p. (MIT Press, Cambridge, MA, 2009). 25. Gruber, T.R. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. International Journal of Human-Computer Studies 43, 907-928 (1995).
  • 17. 26. Brachman, R.J. & Levesque, H.J. Knowledge Representation and Reasoning, xxix, 381 p. (Morgan Kaufmann, Amsterdam ; Boston, 2004). 27. Robinson, P.N. & Bauer, S. Introduction to Bio-Ontologies, xxvii, 488 p. (Taylor & Francis, Boca Raton, 2011). 28. Dutkowski, J. et al. A Gene Ontology Inferred from Molecular Networks. Nat Biotechnol 31, 38- 45 (2013). 29. Myhre, S., Tveit, H., Mollestad, T. & Laegreid, A. Additional Gene Ontology Structure for Improved Biological Reasoning. Bioinformatics 22, 2020-7 (2006). 30. Guzzoni, D., Baur, C. & Cheyer, A. Active: A Unified Platform for Building Intelligent Web Interaction Assistants. 2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Workshops Proceedings, 417-420 (2006). 31. Wren, J.D. Question Answering Systems in Biology and Medicine--the Time Is Now. Bioinformatics 27, 2025-6 (2011). 32. Qiao, W. et al. Intercellular Network Structure and Regulatory Motifs in the Human Hematopoietic System. Molecular systems biology 10, 741 (2014). 33. Kirouac, D.C. et al. Dynamic Interaction Networks in a Hierarchically Organized Tissue. Mol Syst Biol 6, 417 (2010). 34. Billia, F., Barbara, M., McEwen, J., Trevisan, M. & Iscove, N.N. Resolution of Pluripotential Intermediates in Murine Hematopoietic Differentiation by Global Complementary DNA Amplification from Single Cells: Confirmation of Assignments by Expression Profiling of Cytokine Receptor Transcripts. Blood 97, 2257-68 (2001). 35. Majka, M. et al. Numerous Growth Factors, Cytokines, and Chemokines Are Secreted by Human Cd34(+) Cells, Myeloblasts, Erythroblasts, and Megakaryoblasts and Regulate Normal Hematopoiesis in an Autocrine/Paracrine Manner. Blood 97, 3075-85 (2001). 36. von Dassow, G., Meir, E., Munro, E.M. & Odell, G.M. The Segment Polarity Network Is a Robust Developmental Module. Nature 406, 188-92 (2000). 37. Kondo, S. Cell-Cell Interaction Network That Generates the Skin Pattern of Animal. 
Genome Inform 16, 287-91 (2005). 38. De Matteis, G., Graudenzi, A. & Antoniotti, M. A Review of Spatial Computational Models for Multi-Cellular Systems, with Regard to Intestinal Crypts and Colorectal Cancer Development. Journal of mathematical biology 66, 1409-62 (2013). 39. Eckmann, J.P. & Moses, E. Curvature of Co-Links Uncovers Hidden Thematic Layers in the World Wide Web. Proc Natl Acad Sci U S A 99, 5825-9 (2002). 40. Komurov, K. Modeling Community-Wide Molecular Networks of Multicellular Systems. Bioinformatics 28, 694-700 (2012). 41. Frankenstein, Z., Alon, U. & Cohen, I.R. The Immune-Body Cytokine Network Defines a Social Architecture of Cell Interactions. Biol Direct 1, 32 (2006). 42. Tieri, P. et al. Quantifying the Relevance of Different Mediators in the Human Immune Cell Network. Bioinformatics 21, 1639-43 (2005). 43. Gedye, C.A. et al. Cell Surface Profiling Using High-Throughput Flow Cytometry: A Platform for Biomarker Discovery and Analysis of Cellular Heterogeneity. PLoS ONE 9, e105602 (2014). 44. McPherson, A. Introduction to Macromolecular Crystallography, x, 267 p. (Wiley-Blackwell, Hoboken, N.J., 2009). 45. Ravasz, E., Somera, A.L., Mongru, D.A., Oltvai, Z.N. & Barabasi, A.L. Hierarchical Organization of Modularity in Metabolic Networks. Science 297, 1551-5 (2002). 46. Dotan-Cohen, D., Letovsky, S., Melkman, A.A. & Kasif, S. Biological Process Linkage Networks. PLoS One 4, e5313 (2009). 47. Tanay, A., Sharan, R., Kupiec, M. & Shamir, R. Revealing Modularity and Organization in the Yeast Molecular Network by Integrated Analysis of Highly Heterogeneous Genomewide Data. Proc Natl Acad Sci U S A 101, 2981-6 (2004). 48. Kelley, R. & Ideker, T. Systematic Interpretation of Genetic Interactions Using Protein Networks. Nat Biotechnol 23, 561-6 (2005).
49. Jaimovich, A., Rinott, R., Schuldiner, M., Margalit, H. & Friedman, N. Modularity and Directionality in Genetic Interaction Maps. Bioinformatics 26, i228-36 (2010).
50. Park, Y. & Bader, J.S. Resolving the Structure of Interactomes with Hierarchical Agglomerative Clustering. BMC Bioinformatics 12 Suppl 1, S44 (2011).
51. Dutkowski, J. et al. NeXO Web: The NeXO Ontology Database and Visualization Platform. Nucleic Acids Res 42, D1269-74 (2014).
52. Lee, I., Li, Z. & Marcotte, E.M. An Improved, Bias-Reduced Probabilistic Functional Gene Network of Baker's Yeast, Saccharomyces cerevisiae. PLoS One 2, e988 (2007).
53. Kramer, M., Dutkowski, J., Yu, M., Bafna, V. & Ideker, T. Inferring Gene Ontologies from Pairwise Similarity Data. Bioinformatics 30, i34-42 (2014).
54. Jensen, L.J. et al. STRING 8--a Global View on Proteins and Their Functional Interactions in 630 Organisms. Nucleic Acids Res 37, D412-6 (2009).
55. Lee, I., Date, S.V., Adai, A.T. & Marcotte, E.M. A Probabilistic Functional Network of Yeast Genes. Science 306, 1555-8 (2004).
56. Jansen, R. et al. A Bayesian Networks Approach for Predicting Protein-Protein Interactions from Genomic Data. Science 302, 449-53 (2003).
57. Breiman, L. Random Forests. Machine Learning 45, 5-32 (2001).
58. Clauset, A., Moore, C. & Newman, M.E. Hierarchical Structure and the Prediction of Missing Links in Networks. Nature 453, 98-101 (2008).
59. Jean-Mary, Y.R., Shironoshita, E.P. & Kabuka, M.R. Ontology Matching with Semantic Verification. Web Semant 7, 235-251 (2009).
60. Ryan, C.J. et al. Hierarchical Modularity and the Evolution of Genetic Interactomes across Species. Mol Cell 46, 691-704 (2012).
61. Wilmes, G.M. et al. A Genetic Interaction Map of RNA-Processing Factors Reveals Links between Sem1/Dss1-Containing Complexes and mRNA Export and Splicing. Mol Cell 32, 735-46 (2008).
62. Hannum, G. et al. Genome-Wide Association Data Reveal a Global Map of Genetic Interactions among Protein Complexes. PLoS Genet 5, e1000782 (2009).
63. Ideker, T., Ozier, O., Schwikowski, B. & Siegel, A.F. Discovering Regulatory and Signalling Circuits in Molecular Interaction Networks. Bioinformatics 18 Suppl 1, S233-40 (2002).
64. Chuang, H.Y., Lee, E., Liu, Y.T., Lee, D. & Ideker, T. Network-Based Classification of Breast Cancer Metastasis. Mol Syst Biol 3, 140 (2007).
65. Dutkowski, J. & Ideker, T. Protein Networks as Logic Functions in Development and Cancer. PLoS Comput Biol 7, e1002180 (2011).
66. Hofree, M., Shen, J.P., Carter, H., Gross, A. & Ideker, T. Network-Based Stratification of Tumor Mutations. Nat Methods 10, 1108-15 (2013).
67. Ideker, T., Dutkowski, J. & Hood, L. Boosting Signal-to-Noise in Complex Biology: Prior Knowledge Is Power. Cell 144, 860-3 (2011).
68. Carvunis, A.R. & Ideker, T. Siri of the Cell: What Biology Could Learn from the iPhone. Cell 157, 534-8 (2014).
69. Chuang, H.Y. et al. Subnetwork-Based Analysis of Chronic Lymphocytic Leukemia Identifies Pathways That Associate with Disease Progression. Blood 120, 2639-49 (2012).
70. Costanzo, M. et al. The Genetic Landscape of a Cell. Science 327, 425-31 (2010).
71. Winzeler, E.A. et al. Functional Characterization of the S. cerevisiae Genome by Gene Deletion and Parallel Analysis. Science 285, 901-6 (1999).
72. Hillenmeyer, M.E. et al. The Chemical Genomic Portrait of Yeast: Uncovering a Phenotype for All Genes. Science 320, 362-5 (2008).
73. Bloom, J.S., Ehrenreich, I.M., Loo, W.T., Lite, T.L. & Kruglyak, L. Finding the Sources of Missing Heritability in a Yeast Cross. Nature 494, 234-7 (2013).
74. Novershtern, N. et al. Densely Interconnected Transcriptional Circuits Control Cell States in Human Hematopoiesis. Cell 144, 296-309 (2011).
75. Laurenti, E. et al. The Transcriptional Architecture of Early Human Hematopoiesis Identifies Multilevel Control of Lymphoid Commitment. Nat Immunol 14, 756-63 (2013).
76. Kinsella, R.J. et al. Ensembl BioMarts: A Hub for Data Retrieval across Taxonomic Space. Database (Oxford) 2011, bar030 (2011).
77. Turner, B. et al. iRefWeb: Interactive Analysis of Consolidated Protein Interaction Data and Their Supporting Evidence. Database (Oxford) 2010, baq023 (2010).
78. Zuberi, K. et al. GeneMANIA Prediction Server 2013 Update. Nucleic Acids Res 41, W115-22 (2013).
79. Cerami, E.G. et al. Pathway Commons, a Web Resource for Biological Pathway Data. Nucleic Acids Res (2010).
80. Jeong, H., Mason, S.P., Barabasi, A.L. & Oltvai, Z.N. Lethality and Centrality in Protein Networks. Nature 411, 41-2 (2001).
81. Yu, H., Kim, P.M., Sprecher, E., Trifonov, V. & Gerstein, M. The Importance of Bottlenecks in Protein Networks: Correlation with Gene Essentiality and Expression Dynamics. PLoS Comput Biol 3, e59 (2007).