INTRODUCTION TO NETWORK MEDICINE
Marc Santolini

Center for Complex Network Research (CCNR)
Reductionism, which has dominated biological research for over a century, has provided a wealth of
knowledge about individual cellular components and their functions. Despite its enormous success, it is
increasingly clear that a discrete biological function can only rarely be attributed to an individual
molecule. Instead, most biological characteristics arise from complex interactions between the cell’s
numerous constituents, such as proteins, DNA, RNA and small molecules (REFS 1–8). Therefore, a key
challenge for biology in the twenty-first century is to understand the structure and the dynamics of
the complex intercellular web of interactions that contribute to the structure and function of a
living cell.
The development of high-throughput data-collection techniques, as epitomized by the widespread use of
microarrays, allows for the simultaneous interrogation of the status of a cell’s components at any given
time. In turn, new technology platforms, such as protein chips or semi-automated yeast two-hybrid
screens, help to determine how and when these molecules interact with each other. Various types of
interaction webs, or networks (including protein–protein interaction, metabolic, signalling and
transcription-regulatory networks), emerge from the sum of these interactions. None of these networks
are independent; instead, they form a ‘network of networks’ that is responsible for the behaviour of
the cell. A major challenge of contemporary biology is to embark on a programme to map out, understand
and model in quantifiable terms the topological and dynamic properties of the various networks that
control the behaviour of the cell.
Help along the way is provided by the rapidly developing theory of complex networks that, in the past
few years, has made advances towards uncovering the organizing principles that govern the formation and
evolution of various complex technological and social networks (REFS 9–12). This research is already
making an impact on cell biology. It has led to the realization that the architectural features of
molecular interaction networks within a cell are shared to a large degree by other complex systems,
such as the Internet, computer chips and society. This unexpected universality indicates that similar
laws may govern most complex networks in nature, which allows the expertise from large and well-mapped
non-biological systems to be used to characterize the intricate interwoven relationships that govern
cellular functions.
In this review, we show that the quantifiable tools of network theory offer unforeseen possibilities to
understand the cell’s internal organization and evolution, fundamentally altering our view of cell
biology. The emerging results are forcing the realization that, notwithstanding the importance of
individual molecules, cellular function is a contextual attribute of strict and quantifiable patterns
of interactions between the myriad of cellular constituents. Although uncovering
NETWORK BIOLOGY:
UNDERSTANDING THE CELL’S
FUNCTIONAL ORGANIZATION
Albert-László Barabási* & Zoltán N. Oltvai‡
A key aim of postgenomic biomedical research is to systematically catalogue all molecules and
their interactions within a living cell. There is a clear need to understand how these molecules and
the interactions between them determine the function of this enormously complex machinery, both
in isolation and when surrounded by other cells. Rapid advances in network biology indicate that
cellular networks are governed by universal laws and offer a new conceptual framework that could
potentially revolutionize our view of biology and disease pathologies in the twenty-first century.
R E V I E W S
Barabasi et al., Nat Rev Genet 2004
NATURE REVIEWS | GENETICS VOLUME 5 | FEBRUARY 2004 | 105
Ba; blue nodes). In the Barabási–Albert model of a scale-free network, at each time point a node with M links is added to the network, which
connects to an already existing node I with probability $\Pi_I = k_I / \sum_J k_J$, where $k_I$ is the degree of node I (FIG. 3) and J is the
index denoting the sum over network nodes. The network that is generated by this growth process has a power-law degree distribution that is
characterized by the degree exponent γ = 3. Such distributions are seen as a straight line on a log–log plot (see figure, part Bb). The network
that is created by the Barabási–Albert model does not have an inherent modularity, so C(k) is independent of k (see figure, part Bc). Scale-free
networks with degree exponents 2<γ<3, a range that is observed in most biological and non-biological networks, are ultra-small (REFS 34,35),
with the average path length following $\ell \sim \log \log N$, which is significantly shorter than the $\log N$ that characterizes random
small-world networks.
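The growth rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names, the fully connected seed network of M + 1 nodes, and the choice to redraw duplicate targets so that each new node gets M distinct neighbours are all assumptions made here. The list `stubs` holds every node once per link it has, so a uniform draw from it selects node I with probability $k_I / \sum_J k_J$.

```python
import random
from collections import Counter

def barabasi_albert(n_nodes, m, seed=0):
    """Grow a network by preferential attachment; returns a list of edges."""
    rng = random.Random(seed)
    # Seed network (an assumption): m + 1 fully connected nodes, so that
    # every starting node already has a nonzero degree.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `stubs` once per link it holds, so a uniform
    # draw from `stubs` picks node I with probability k_I / sum_J k_J.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:          # redraw duplicates (an assumption)
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

def degrees(edges):
    """Count the number of links attached to each node."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

edges = barabasi_albert(n_nodes=2000, m=2)
deg = degrees(edges)
```

Sorting the values of `deg` and plotting their frequencies on log–log axes should give a roughly straight line, with a handful of early nodes holding far more links than the average node.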
Hierarchical networks
To account for the coexistence of modularity, local clustering and scale-free topology in many real systems it has to be assumed that clusters
combine in an iterative manner, generating a hierarchical network (REFS 47,53) (see figure, part C). The starting point of this construction is a
small cluster of four densely linked nodes (see the four central nodes in figure, part Ca). Next, three replicas of this module are generated and
the three external nodes of the replicated clusters are connected to the central node of the old cluster, which produces a large 16-node module.
Three replicas of this 16-node module are then generated and the 16 peripheral nodes are connected to the central node of the old module, which
produces a new module of 64 nodes. The hierarchical network model seamlessly integrates a scale-free topology with an inherent modular structure
by generating a network that has a power-law degree distribution with degree exponent $\gamma = 1 + \ln 4/\ln 3 = 2.26$ (see figure, part Cb)
and a large, system-size independent average clustering coefficient <C> ~ 0.6. The most important signature of hierarchical modularity is the
scaling of the clustering coefficient, which follows $C(k) \sim k^{-1}$, a straight line of slope –1 on a log–log plot (see figure, part Cc).
A hierarchical architecture implies that sparsely connected nodes are part of highly clustered areas, with communication between the different
highly clustered neighbourhoods being maintained by a few hubs (see figure, part Ca).
[Figure: three network models. A Random network (parts Aa, Ab, Ac); B Scale-free network (parts Ba, Bb, Bc); C Hierarchical network (parts Ca, Cb, Cc). Plot axes: P(k) versus k, C(k) versus k, and log C(k) versus log k.]
SCALE-FREE NETWORKS
mathematical properties of random networks (REF. 14). Their much-investigated random network model assumes
that a fixed number of nodes are connected randomly to each other (BOX 2). The most remarkable property
of the model is its ‘democratic’ or uniform character, characterizing the degree, or connectivity
(k; BOX 1), of the individual nodes. Because, in the model, the links are placed randomly among the
nodes, it is expected that some nodes collect only a few links whereas others collect many more. In a
random network, the node degrees follow a Poisson distribution, which indicates that most nodes have
roughly the same number of links, approximately equal to the network’s average degree, <k> (where <>
denotes the average); nodes that have significantly more or fewer links than <k> are absent or very
rare (BOX 2).
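As a rough illustration of this ‘democratic’ character, the sketch below builds a random network and summarizes its degrees. It assumes one common variant of the model, in which each pair of nodes is linked independently with probability `p`; the function name and the values of `n` and `p` are chosen here purely for illustration. For a Poisson distribution the variance equals the mean, so `var_k` should come out close to `mean_k`, and no node should have a degree far above the average.

```python
import random

def random_network_degrees(n_nodes, p, seed=0):
    """Link every pair of nodes independently with probability p and
    return the resulting list of node degrees."""
    rng = random.Random(seed)
    deg = [0] * n_nodes
    for i in range(n_nodes):
        for j in range(i):
            if rng.random() < p:
                deg[i] += 1
                deg[j] += 1
    return deg

n, p = 1000, 0.01                      # illustrative values (assumptions)
deg = random_network_degrees(n, p)
mean_k = sum(deg) / n                  # expected to be close to p * (n - 1)
var_k = sum((k - mean_k) ** 2 for k in deg) / n
print(f"<k> = {mean_k:.2f}, variance = {var_k:.2f}, max degree = {max(deg)}")
```

Running the same summary on a network grown by preferential attachment would instead show a maximum degree many times larger than the mean, which is the signature of hubs.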
Despite its elegance, a series of recent findings indicate that the random network model cannot explain
the topological properties of real networks. The deviations from the random model have several key
signatures, the most striking being the finding that, in contrast to the Poisson degree distribution,
for many social and technological networks the number of nodes with a given degree follows a power law.
That is, the probability that a chosen node has exactly k links follows $P(k) \sim k^{-\gamma}$, where
γ is the degree exponent, with its value for most networks being between 2 and 3 (REF. 15). Networks
that are characterized by a power-law degree distribution are highly non-uniform: most of the nodes
have only a few links. A few nodes with a very large number of links, which are often called hubs, hold
these nodes together. Networks with a power degree
Figure 2 | Yeast protein interaction network. A map of protein–protein interactions (REF. 18) in
Saccharomyces cerevisiae, which is based on early yeast two-hybrid measurements (REF. 23), illustrates
that a few highly connected nodes (which are also known as hubs) hold the network together.
The largest cluster, which contains ~78% of all proteins, is shown. The colour of a node indicates
the phenotypic effect of removing the corresponding protein (red = lethal, green = non-lethal,
orange = slow growth, yellow = unknown). Reproduced with permission from REF. 18 ©
Jeong et al., “Lethality and centrality in protein networks“ Nature 2001
THE YEAST INTERACTOME
FABRICATING HUBS
major engineer of the genomic landscape, it is likely to be a key mechanism for generating the
scale-free topology.
Two further results offer direct evidence that network growth is responsible for the observed
topological features. The scale-free model (BOX 2) predicts that the nodes that appeared early in the
history of the network are the most connected ones (REF. 15). Indeed, an inspection of the metabolic
hubs indicates that the remnants of the RNA world, such as coenzyme A, NAD and GTP, are among the most
connected substrates of the metabolic network, as are elements of some of the most ancient metabolic
pathways, such as glycolysis and the tricarboxylic acid cycle (REF. 17). In the context of the protein
interaction networks, cross-genome comparisons have found that, on average, the evolutionarily older
proteins have more links to other proteins than their younger counterparts (REFS 45,46). This offers
direct empirical evidence for preferential attachment.
Motifs, modules and hierarchical networks
Cellular functions are likely to be carried out in a highly modular manner (REF. 1). In general,
modularity refers to a group of physically or functionally linked molecules (nodes) that work together
to achieve a (relatively) distinct function (REFS 1,6,8,47). Modules are seen in many systems, for
example, circles of friends in social networks or websites that are devoted to similar topics on the
World Wide Web. Similarly, in many complex engineered systems, from a modern aircraft to a computer
chip, a highly modular structure is a fundamental design
[Figure 3 labels: panels a and b; Proteins (1, 2); Genes; Before duplication; After duplication.]
Figure 3 | The origin of the scale-free topology and hubs
in biological networks. The origin of the scale-free topology
NETWORK MOTIFS
Superfamilies of Evolved and
Designed Networks
Ron Milo, Shalev Itzkovitz, Nadav Kashtan, Reuven Levitt,
Shai Shen-Orr, Inbal Ayzenshtat, Michal Sheffer, Uri Alon*
Complex biological, technological, and sociological networks can be of very different sizes and
connectivities, making it difficult to compare their structures. Here we present an approach to
systematically study similarity in the local structure of networks, based on the significance profile
(SP) of small subgraphs in the network compared to randomized networks. We find several superfamilies
of previously unrelated networks with very similar SPs. One superfamily, including transcription
networks of microorganisms, represents “rate-limited” information-processing networks strongly
constrained by the response time of their components. A distinct superfamily includes protein
signaling, developmental genetic networks, and neuronal wiring. Additional superfamilies include power
grids, protein-structure networks and geometric networks, World Wide Web links and social networks,
and word-adjacency networks from different languages.
Many networks in nature share global properties (1, 2). Their degree sequences (the number of edges per
node) often follow a long-tailed distribution, in which some nodes are much more connected than the
average (3). In addition, natural networks often show the small-world property of short paths between
nodes and highly clustered connections (1, 2, 4). Despite these global similarities, networks from
different fields can have very different local structure (5). It was recently found that networks
display certain patterns, termed “network motifs,” at much higher frequency than expected in
randomized networks (6, 7). In biological networks, these motifs were suggested to be recurring
circuit elements that carry out key information-processing tasks (6, 8–10).
Departments of Molecular Cell Biology, Physics of Complex Systems, and Computer Science, Weizmann
Institute of Science, Rehovot 76100, Israel.
*To whom correspondence should be addressed at Department of Molecular Cell Biology, Weizmann
Institute of Science, Rehovot 76100, Israel. E-mail: urialon@weizmann.ac.il
5 MARCH 2004 VOL 303 SCIENCE www.sciencemag.org
To understand the design principles of complex networks, it is important to compare the local structure
of networks from different fields. The main difficulty is that these networks can be of vastly
different sizes [for example, World Wide Web (WWW) hyperlink networks with millions of nodes and social
networks with tens of nodes] and degree sequences. Here, we present an approach for comparing network
local structure, based on the significance profile (SP). To calculate the SP of a network, the network
is compared to an ensemble of randomized networks with the same degree sequence. The comparison to
randomized networks compensates for effects due to network size and degree sequence. For each subgraph
i, the statistical significance is described by the Z score (11):
$Z_i = (N\mathrm{real}_i - \langle N\mathrm{rand}_i \rangle) / \mathrm{std}(N\mathrm{rand}_i)$
where $N\mathrm{real}_i$ is the number of times the subgraph appears in the network, and
$\langle N\mathrm{rand}_i \rangle$ and $\mathrm{std}(N\mathrm{rand}_i)$ are the mean and standard
deviation of its appearances in the randomized network ensemble. The SP is the vector of Z scores
normalized to length 1:
$SP_i = Z_i / (\sum Z_i^2)^{1/2}$
The normalization emphasizes the relative significance of subgraphs, rather than the absolute
significance. This is important for comparison of networks of different sizes, because motifs
(subgraphs that occur much more often than expected at random) in large networks tend to display higher
Z scores than motifs in small networks (7).
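A minimal sketch of the Z score and SP calculation defined above follows. The subgraph counts are made-up numbers used only to exercise the arithmetic, the function names are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not say which estimator was used. Counting subgraphs and generating the randomized ensemble are outside the scope of this sketch.

```python
import math
from statistics import mean, pstdev

def z_scores(n_real, n_rand_samples):
    """Z_i = (Nreal_i - <Nrand_i>) / std(Nrand_i) for each subgraph i.

    n_real[i] is the count of subgraph i in the real network;
    n_rand_samples[i] lists its counts across the randomized ensemble.
    """
    return [
        (real - mean(rand)) / pstdev(rand)   # fails if the ensemble has zero spread
        for real, rand in zip(n_real, n_rand_samples)
    ]

def significance_profile(z):
    """Normalize the vector of Z scores to length 1."""
    norm = math.sqrt(sum(v * v for v in z))
    return [v / norm for v in z]

# Hypothetical counts for three subgraph types (illustrative numbers only).
n_real = [40, 10, 25]
n_rand = [[20, 22, 18, 20], [10, 12, 8, 10], [30, 28, 32, 30]]

z = z_scores(n_real, n_rand)
sp = significance_profile(z)
```

In this toy input the first subgraph is over-represented, the second matches the randomized ensemble exactly (Z = 0), and the third is under-represented. Because only the direction of the Z vector survives the normalization, two networks of very different sizes can still yield comparable profiles.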
We present in Fig. 1 the SP of the 13 possible directed connected triads (triad significance profile,
TSP) for networks from different fields (12). The TSP of these networks is almost always insensitive to
removal of 30% of the edges or to addition of 50% new edges at random, demonstrating that it is robust
to missing data or random data errors (SOM Text). Several superfamilies of networks with similar TSPs
emerge from this analysis. One superfamily includes sensory transcription networks that control gene
expression in bacteria and yeast in response to external stimuli. In these transcription networks, the
nodes represent genes or operons and the edges represent direct transcriptional regulation (6, 13–15).
Networks from three microorganisms, the bacteria Escherichia coli (6) and Bacillus subtilis (14) and
the yeast Saccharomyces cerevisiae (7, 15), were analyzed. The networks have very similar TSPs
(correlation coefficient c > 0.99). They show one strong motif, triad 7, termed “feedforward loop.”
The feedforward loop has been theoretically and experimentally shown
Fig. 1. The triad significance profile (TSP) of networks from various
disciplines. The TSP shows the normalized significance level (Z score) for
each of the 13 triads. Networks with similar characteristic profiles are
URCHIN N = 45, E = 83), and synaptic connections between neurons in
C. elegans (NEURONS N = 280, E = 2170). (iii) WWW hyperlinks
between Web pages in the www.nd.edu site (3) (WWW-1 N = 325729,
The human disease network
Kwang-Il Goh*†‡§, Michael E. Cusick†‡¶, David Valle‖, Barton Childs‖, Marc Vidal†‡¶**, and Albert-László Barabási*†‡**
*Center for Complex Network Research and Department of Physics, University of Notre Dame, Notre Dame, IN 46556; †Center for Cancer Systems Biology
(CCSB) and ¶Department of Cancer Biology, Dana–Farber Cancer Institute, 44 Binney Street, Boston, MA 02115; ‡Department of Genetics, Harvard Medical
School, 77 Avenue Louis Pasteur, Boston, MA 02115; §Department of Physics, Korea University, Seoul 136-713, Korea; and ‖Department of Pediatrics and the
McKusick–Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205
Edited by H. Eugene Stanley, Boston University, Boston, MA, and approved April 3, 2007 (received for review February 14, 2007)
A network of disorders and disease genes linked by known disorder–gene associations offers a platform
to explore in a single graph-theoretic framework all known phenotype and disease gene associations,
indicating the common genetic origin of many diseases. Genes associated with similar disorders show
both higher likelihood of physical interactions between their products and higher expression profiling
similarity for their transcripts, supporting the existence of distinct disease-specific functional
modules. We find that essential human genes are likely to encode hub proteins and are expressed widely
in most tissues. This suggests that disease genes also would play a central role in the human
interactome. In contrast, we find that the vast majority of disease genes are nonessential and show no
tendency to encode hub proteins, and their expression pattern indicates that they are localized in the
functional periphery of the network. A selection-based model explains the observed difference between
essential and disease genes and also suggests that diseases caused by somatic mutations should not be
peripheral, a prediction we confirm for cancer genes.
biological networks | complex networks | human genetics | systems biology | diseasome
Decades-long efforts to map human disease loci, at first genetically and later physically (1), followed
by recent positional cloning of many disease genes (2) and genome-wide association studies (3), have
generated an impressive list of disorder–gene association pairs (4, 5). In addition, recent efforts to
map the protein–protein interactions in humans (6, 7), together with efforts to curate an extensive map
of human metabolism (8) and regulatory networks offer increasingly detailed maps of the relationships
between different disease genes. Most of the successful studies building on these new approaches have
focused, however, on a single disease, using network-based tools to gain a better understanding of the
relationship between the genes implicated in a selected disorder (9).
Here we take a conceptually different approach, exploring whether human genetic disorders and the
corresponding disease genes might be related to each other at a higher level of cellular and organismal
organization. Support for the validity of this approach is provided by examples of genetic disorders
that arise from mutations in more than a single gene (locus heterogeneity). For example, Zellweger
syndrome is caused by mutations in any of at least 11 genes, all associated with peroxisome biogenesis
(10). Similarly, there are many examples of different mutations in the same gene (allelic
heterogeneity) giving rise to phenotypes currently classified as different disorders. For example,
mutations in TP53 have been linked to 11 clinically distinguishable cancer-related disorders (11).
Given the highly interlinked internal organization of the cell (12–17), it should be possible to
improve the single gene–single disorder approach by developing a conceptual framework to link
systematically all genetic disorders (the human “disease phenome”) with the complete list of disease
genes (the “disease genome”), resulting in a global view of the “diseasome,” the combined set of all
known disorder/disease gene associations.
Results
Construction of the Diseasome. We constructed a bipartite graph consisting of two disjoint sets of
nodes. One set corresponds to all known genetic disorders, whereas the other set corresponds to all
known disease genes in the human genome (Fig. 1). A disorder and a gene are then connected by a link if
mutations in that gene are implicated in that disorder. The list of disorders, disease genes, and
associations between them was obtained from the Online Mendelian Inheritance in Man (OMIM; ref. 18), a
compendium of human disease genes and phenotypes. As of December 2005, this list contained 1,284
disorders and 1,777 disease genes. OMIM initially focused on monogenic disorders but in recent years
has expanded to include complex traits and the associated genetic mutations that confer susceptibility
to these common disorders (18). Although this history introduces some biases, and the disease gene
record is far from complete, OMIM represents the most complete and up-to-date repository of all known
disease genes and the disorders they confer. We manually classified each disorder into one of 22
disorder classes based on the physiological system affected [see supporting information (SI) Text,
SI Fig. 5, and SI Table 1 for details].
Starting from the diseasome bipartite graph we generated two biologically relevant network projections
(Fig. 1). In the “human disease network” (HDN) nodes represent disorders, and two disorders are
connected to each other if they share at least one gene in which mutations are associated with both
disorders (Figs. 1 and 2a). In the “disease gene network” (DGN) nodes represent disease genes, and two
genes are connected if they are associated with the same disorder (Figs. 1 and 2b). Next, we discuss
the potential of these networks to help us understand and represent in a single framework all known
disease gene and phenotype associations.
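The two projections described above can be sketched as follows. The disorder and gene labels (`D1`, `G1`, and so on) are hypothetical placeholders rather than OMIM records, and the function name `project` is an assumption made for this illustration; the real input would be the full list of disorder–gene association pairs.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical disorder-gene association pairs (placeholders, not OMIM data).
associations = [
    ("D1", "G1"), ("D1", "G2"),
    ("D2", "G2"), ("D2", "G3"),
    ("D3", "G4"),
]

def project(pairs):
    """Return the HDN and DGN edge sets of a disorder-gene bipartite graph."""
    genes_of = defaultdict(set)       # disorder -> associated genes
    disorders_of = defaultdict(set)   # gene -> associated disorders
    for disorder, gene in pairs:
        genes_of[disorder].add(gene)
        disorders_of[gene].add(disorder)
    # HDN: two disorders are linked if they share at least one gene.
    hdn = {frozenset(pair)
           for ds in disorders_of.values()
           for pair in combinations(sorted(ds), 2)}
    # DGN: two genes are linked if they are associated with the same disorder.
    dgn = {frozenset(pair)
           for gs in genes_of.values()
           for pair in combinations(sorted(gs), 2)}
    return hdn, dgn

hdn, dgn = project(associations)
```

Here `D1` and `D2` become linked in the HDN because both involve `G2`, while `D3` and `G4` stay isolated in their respective projections, which mirrors the single-node case that the text contrasts with connected disorders.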
Properties of the HDN. If each human disorder tends to have a distinct and unique genetic origin, then
the HDN would be disconnected into many single nodes corresponding to specific disorders or grouped
into small clusters of a few closely related disorders. In contrast, the obtained HDN displays many
connections between both individual disorders and disorder classes (Fig. 2a). Of 1,284 disorders, 867
have at least one link to other disorders, and 516 disorders form a giant component, suggesting that
the genetic origins of most diseases, to some extent, are shared with other diseases. The number of
genes associated with a disorder, s, has a broad distribution (see SI Fig. 6a), indicating that most
disorders relate to a few disease genes, whereas a handful of phenotypes, such as deafness (s = 41),
leukemia (s = 37), and colon cancer (s = 34), relate to dozens of genes (Fig. 2a). The degree (k)
distribution of HDN (SI Fig. 6b) indicates that most disorders are linked to only
Author contributions: D.V., B.C., M.V., and A.-L.B. designed research; K.-I.G. and M.E.C.
performed research; K.-I.G. and M.E.C. analyzed data; and K.-I.G., M.E.C., D.V., M.V., and
A.-L.B. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Abbreviations: DGN, disease gene network; HDN, human disease network; GO, Gene Ontology; OMIM, Online
Mendelian Inheritance in Man; PCC, Pearson correlation coefficient.
**To whom correspondence may be addressed. E-mail: alb@nd.edu or marc_vidal@dfci.harvard.edu.
This article contains supporting information online at www.pnas.org/cgi/content/full/0701361104/DC1.
© 2007 by The National Academy of Sciences of the USA
www.pnas.org/cgi/doi/10.1073/pnas.0701361104 PNAS | May 22, 2007 | vol. 104 | no. 21 | 8685–8690
APPLIED PHYSICAL SCIENCES
[Figure 1 labels. Panels: Human Disease Network (HDN); DISEASOME (disease phenome, disease genome); Disease Gene Network (DGN).
Genes shown: AR, ATM, BRCA1, BRCA2, CDH1, GARS, HEXB, KRAS, LMNA, MSH2, PIK3CA, TP53, MAD1L1, RAD54L, VAPB, CHEK2, BSCL2, ALS2, BRIP1.
Disorders shown: Androgen insensitivity, Breast cancer, Perineal hypospadias, Prostate cancer, Spinal muscular atrophy, Ataxia-telangiectasia, Lymphoma, T-cell lymphoblastic leukemia, Ovarian cancer, Papillary serous carcinoma, Fanconi anemia, Pancreatic cancer, Wilms tumor, Charcot-Marie-Tooth disease, Sandhoff disease, Lipodystrophy, Amyotrophic lateral sclerosis, Silver spastic paraplegia syndrome, Spastic ataxia/paraplegia.]
Fig. 1. Construction of the diseasome bipartite network. (Center) A small subset of OMIM-based disorder–disease gene associations (18), where circles and rectangles
Goh et al., PNAS 2007
GENES AND DISEASES
[Figure 2 labels. Panels: a Human Disease Network; b Disease Gene Network.
Disorders labelled: Asthma, Atherosclerosis, Blood group, Breast cancer, Complement component deficiency, Cardiomyopathy, Cataract, Charcot-Marie-Tooth disease, Colon cancer, Deafness, Diabetes mellitus, Epidermolysis bullosa, Epilepsy, Fanconi anemia, Gastric cancer, Hypertension, Leigh syndrome, Leukemia, Lymphoma, Mental retardation, Muscular dystrophy, Myocardial infarction, Myopathy, Obesity, Parkinson disease, Prostate cancer, Retinitis pigmentosa, Spherocytosis, Spinocerebellar ataxia, Stroke, Thyroid carcinoma, Zellweger syndrome, Hirschsprung disease, Trichothiodystrophy, Alzheimer disease, Heinz body anemia, Bethlem myopathy, Hemolytic anemia, Ataxia-telangiectasia, Pseudohypoaldosteronism.
Genes labelled: APC, COL2A1, ACE, PAX6, ERBB2, FBN1, FGFR3, FGFR2, GJB2, GNAS, KIT, KRAS, LRP5, MSH2, MEN1, NF1, PTEN, SCN4A, TP53, ARX.
Disorder class legend: Bone, Cancer, Cardiovascular, Connective tissue, Dermatological, Developmental, Ear/Nose/Throat, Endocrine, Gastrointestinal, Hematological, Immunological, Metabolic, Muscular, Neurological, Nutritional, Ophthalmological, Psychiatric, Renal, Respiratory, Skeletal, multiple, Unclassified.
Node size legend: 1, 5, 10, 15, 21, 25, 30, 34, 41.]
Leading Edge
Review
Interactome Networks and Human Disease
Marc Vidal,1,2,* Michael E. Cusick,1,2 and Albert-László Barabási1,3,4,*
1Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
2Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
3Center for Complex Network Research (CCNR) and Departments of Physics, Biology and Computer Science, Northeastern University,
Boston, MA 02115, USA
4Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
*Correspondence: marc_vidal@dfci.harvard.edu (M.V.), alb@neu.edu (A.-L.B.)
DOI 10.1016/j.cell.2011.02.016
Complex biological systems and cellular networks may underlie most genotype to phenotype
relationships. Here, we review basic concepts in network biology, discussing different types of
interactome networks and the insights that can come from analyzing them. We elaborate on why
interactome networks are important to consider in biology, how they can be mapped and integrated
with each other, what global properties are starting to emerge from interactome network models,
and how these properties may relate to human disease.
Introduction
Since the advent of molecular biology, considerable progress has been made in the quest to understand
the mechanisms that underlie human disease, particularly for genetically inherited disorders.
Genotype-phenotype relationships, as summarized in the Online Mendelian Inheritance in Man (OMIM)
database (Amberger et al., 2009), include mutations in more than 3000 human genes known to be
associated with one or more of over 2000 human disorders. This is a truly astounding number of
genotype-phenotype relationships considering that a mere three decades have passed since the initial
description of Restriction Fragment Length Polymorphisms (RFLPs) as molecular markers to map genetic
loci of interest (Botstein et al., 1980), only two decades since the announcement of the first
positional cloning experiments of disease-associated genes using RFLPs (Amberger et al., 2009), and
just one decade since the release of the first reference sequences of the human genome (Lander et al.,
2001; Venter et al., 2001). For complex traits, the information gathered by recent genome-wide
association studies suggests high-confidence genotype-phenotype associations between close to 1000
genomic loci and one or more of over
phenotypic associations, there would still be major problems
to fully understand and model human genetic variations and their
impact on diseases.
To understand why, consider the ‘‘one-gene/one-enzyme/
one-function’’ concept originally framed by Beadle and Tatum
(Beadle and Tatum, 1941), which holds that simple, linear
connections are expected between the genotype of an organism
and its phenotype. But the reality is that most genotype-pheno-
type relationships arise from a much higher underlying com-
plexity. Combinations of identical genotypes and nearly identical
environments do not always give rise to identical phenotypes.
The very coining of the words ‘‘genotype’’ and ‘‘phenotype’’ by
Johannsen more than a century ago derived from observations
that inbred isogenic lines of bean plants grown in well-controlled
environments give rise to pods of different size (Johannsen,
1909). Identical twins, although strikingly similar, nevertheless
often exhibit many differences (Raser and O’Shea, 2005). Like-
wise, genotypically indistinguishable bacterial or yeast cells
grown side by side can express different subsets of transcripts
and gene products at any given moment (Elowitz et al., 2002;
Blake et al., 2003; Taniguchi et al., 2010). Even straightforward
Mapping Interactome Networks
Network science deals with complexity by ‘‘simplifying’’ com-
plex systems, summarizing them merely as components (nodes)
and interactions (edges) between them. In this simplified
approach, the functional richness of each node is lost. Despite
or even perhaps because of such simplifications, useful discov-
eries can be made. As regards cellular systems, the nodes are
metabolites and macromolecules such as proteins, RNA mole-
cules and gene sequences, while the edges are physical,
biochemical and functional interactions that can be identified
with a plethora of technologies. One challenge of network
biology is to provide maps of such interactions using systematic
and standardized approaches and assays that are as unbiased
as possible. The resulting ‘‘interactome’’ networks, the networks
of interactions between cellular components, can serve as scaf-
fold information to extract global or local graph theory proper-
et al., 2010). Computational prediction maps are fast and effi-
cient to implement, and usually include satisfyingly large
numbers of nodes and edges, but are necessarily imperfect
because they use indirect information (Plewczynski and Ginalski,
2009). While high-throughput maps attempt to describe unbi-
ased, systematic, and well-controlled data, they were initially
more difficult to establish, although recent technological
advances suggest that near completion can be reached within
a few years for highly reliable, comprehensive protein-protein
were discovered in and are being applied genome-wide for these
model organisms (Mohr et al., 2010).
Metabolic Networks
Metabolic network maps attempt to comprehensively describe
all possible biochemical reactions for a particular cell or
organism (Schuster et al., 2000; Edwards et al., 2001). In many
representations of metabolic networks, nodes are biochemical
metabolites and edges are either the reactions that convert
Figure 2. Networks in Cellular Systems
To date, cellular networks are most available for the ‘‘super-model’’ organisms (Davis, 2004) yeast, worm, fly, and plant. High-throughput interactome mapping
relies upon genome-scale resources such as ORFeome resources. Several types of interactome networks discussed are depicted. In a protein interaction
network, nodes represent proteins and edges represent physical interactions. In a transcriptional regulatory network, nodes represent transcription factors
(circular nodes) or putative DNA regulatory elements (diamond nodes); and edges represent physical binding between the two. In a disease network, nodes
represent diseases, and edges represent gene mutations that are associated with both linked diseases. In a virus-host network, nodes represent viral proteins
(square nodes) or host proteins (round nodes), and edges represent physical interactions between the two. In a metabolic network, nodes represent enzymes,
and edges represent metabolites that are products or substrates of the enzymes. The network depictions seem dense, but they represent only small portions of
available interactome network maps, which themselves constitute only a few percent of the complete interactomes within cells.
Cell 2011
DISEASES AS NETWORK
PERTURBATIONS
Most cellular components exert their functions through interactions with other cellular components, which can be located either in the same cell or across cells, and even across organs. In humans, the potential complexity of the resulting network (the human interactome) is daunting: with ~25,000 protein-coding genes, ~1,000 metabolites and an undefined number of distinct proteins1 and functional RNA molecules, the number of cellular components that serve as the nodes of the interactome easily exceeds 100,000. The number of functionally relevant interactions between the components of
Network-based approaches to human disease have multiple potential biological and clinical applications. A better understanding of the effects of cellular interconnectedness on disease progression may lead to the identification of disease genes and disease pathways, which, in turn, may offer better targets for drug development. These advances may also lead to better and more accurate biomarkers to monitor the functional integrity of networks that are perturbed by diseases as well as to better disease classification. Here we present an overview of the organizing principles that govern cellular networks
Network medicine: a network-based
approach to human disease
Albert-László Barabási*‡§, Natali Gulbahce*‡|| and Joseph Loscalzo§
Abstract | Given the functional interdependencies between the molecular components in a
human cell, a disease is rarely a consequence of an abnormality in a single gene, but reflects
the perturbations of the complex intracellular and intercellular network that links tissue
and organ systems. The emerging tools of network medicine offer a platform to explore
systematically not only the molecular complexity of a particular disease, leading to the
identification of disease modules and pathways, but also the molecular relationships among
apparently distinct (patho)phenotypes. Advances in this direction are essential for identifying
new disease genes, for uncovering the biological significance of disease-associated
mutations identified by genome-wide association studies and full-genome sequencing, and
for identifying drug targets and biomarkers for complex diseases.
Predicting disease genes
Disease-associated genes have generally been identified using linkage mapping or, more recently, genome-wide association (GWA) studies53. Both methodologies can suggest large numbers of disease-gene candidates, but identifying the particular gene and the causal mutation remains difficult. Recently, a series of increasingly sophisticated network-based tools have been developed to predict potential disease genes; these tools can be loosely grouped into three categories, as discussed below (FIG. 4).
Linkage methods. These methods assume that the direct interaction partners of a disease protein are likely to be associated with the same disease phenotype45,54–56. Indeed, for one disease locus, the set of genes within the locus whose products interacted with a known
nes in the interactome. a | Of the approximately
iated with specific diseases. The figure shows the
ssociated genes that were known42
in 2007 and
ntial, that is, their absence is associated with
agram of the differences between essential and
ential disease genes (shown as blue nodes) are
eriphery, whereas in utero essential genes (shown
REVIEWS
Figure 2 | Disease modules. Schematic diagram of the three modularity concepts that are discussed in this Review.
a | Topological modules correspond to locally dense neighbourhoods of the interactome, such that the nodes of
this network, representing the links of the interactome, is expected to be much larger2.
This inter- and intracellular interconnectivity implies that the impact of a specific genetic abnormality is not restricted to the activity of the gene product that carries it, but can spread along the links of the network and alter the activity of gene products that otherwise carry no defects. Therefore, an understanding of a gene’s network context is essential in determining the phenotypic impact of defects that affect it3,4. Following on from this principle, a key hypothesis underlying this Review is that a disease phenotype is rarely a consequence of an abnormality in a single effector gene product, but reflects various pathobiological processes that interact in a complex network. A corollary of this widely held hypothesis is that the interdependencies among a cell’s molecular components lead to deep functional, molecular and causal relationships among apparently distinct phenotypes.
and the implications of these principles for understanding disease. These principles and the tools and methodologies that are derived from them are facilitating the emergence of a body of knowledge that is increasingly referred to as network medicine5–7.
The human interactome
Although much of our understanding of cellular networks is derived from model organisms, the past decade has seen an exceptional growth in human-specific molecular interaction data8. Most attention has been directed towards molecular networks, including protein interaction networks, whose nodes are proteins that are linked to each other by physical (binding) interactions9,10; metabolic networks, whose nodes are metabolites that are linked if they participate in the same biochemical reactions11–13; regulatory networks, whose directed links represent either regulatory relationships between a transcription factor and a gene14, or post-translational
111 Dana Research Center,
Boston, Massachusetts
02115, USA.
‡
Center for Cancer Systems
Biology, Dana-Farber Cancer
Institute, 44 Binney Street,
Boston, Massachusetts
02115, USA.
§
Department of Medicine,
Brigham and Women’s
Hospital, Harvard Medical
School, 75 Francis Street,
Boston, Massachusetts
02115, USA.
||
Department of Cellular and
Molecular Pharmacology,
University of California, 1700
4th Street, Byers Hall 309,
Box 2530, San Francisco,
California 94158, USA.
Correspondence to A.-L.B.
e-mail: alb@neu.edu
doi:10.1038/nrg2918
56 | JANUARY 2011 | VOLUME 12 www.nature.com/reviews/genetics
© 2011 Macmillan Publishers Limited. All rights reserved
NETWORK MEDICINE
RESEARCH ARTICLE SUMMARY
DISEASE NETWORKS
Uncovering disease-disease
relationships through the
incomplete interactome
Jörg Menche, Amitabh Sharma, Maksim Kitsak, Susan Dina Ghiassian, Marc Vidal,
Joseph Loscalzo, Albert-László Barabási*
INTRODUCTION: A disease is rarely a straight-
forward consequence of an abnormality in a
single gene, but rather reflects the interplay
of multiple molecular processes. The rela-
tionships among these processes are encoded
in the interactome, a network that integrates
all physical interactions within a cell, from
protein-protein to regulatory protein–DNA
and metabolic interactions. The documented
propensity of disease-associated proteins to
interact with each other suggests that they
tend to cluster in the same neighborhood of
the interactome, forming a disease module, a
connected subgraph that contains all molecu-
lar determinants of a disease. The accurate
identification of the corresponding disease
module represents the first step toward a sys-
tematic understanding of the molecular mech-
anisms underlying a complex disease. Here,
we present a network-based framework to iden-
tify the location of disease modules within the
interactome and use the overlap between the
modules to predict disease-disease relationships.
RATIONALE: Despite impressive advances
in high-throughput interactome mapping and
disease gene identification, both the interac-
tome and our knowledge of disease-associated
genes remain incomplete. This incomplete-
ness prompts us to ask to what extent the
current data are sufficient to map out the
disease modules, the first step toward an in-
tegrated approach toward human disease.
To make progress, we must formulate mathematically the impact of network incompleteness on the identifiability of disease modules, quantifying the predictive power and limitations of the current interactome.
RESULTS: Using the tools of network science, we show that we can only uncover disease modules for diseases whose number of associated genes exceeds a critical threshold determined by the network incompleteness. We find that disease proteins associated with 226 diseases are clustered in the same network neighborhood, displaying a statistically significant tendency to form identifiable disease modules. The higher the degree of agglomeration of the disease proteins within the interactome, the higher the biological and functional similarity of the corresponding genes. These findings indicate that many local neighborhoods of the interactome represent the observable part of the true, larger and denser disease modules.
If two disease modules overlap, local perturbations causing one disease can disrupt pathways of the other disease module as well, resulting in shared clinical and pathobiological characteristics. To test this hypothesis, we measure the network-based separation of each disease pair, observing a direct relation between the pathobiological similarity of diseases and their relative distance in the interactome.
Read the full article at http://dx.doi.org/10.1126/science.1257601
Menche et al., Science 2015
DISEASES AS NETWORK
NEIGHBORHOODS
The Interactome as a Map
Diseases As Local Neighborhoods
Asthma
Parkinson’s
Leukemia
MS
Hypertension
Rheumatoid arthritis
Crohn’s disease
Type 2 diabetes
Glioblastoma
Ulcerative colitis
Heart failure
Network Clustering Means Explainable Biology
AIF1
ZBTB12
NFKBIZ
MERTK
HHEX
CFB
Diseases As Local Neighborhoods
Figure: Interactome and disease genes. Disease genes come from OMIM and GWAS, with multiple sclerosis genes highlighted and genes with multiple disease associations marked. Molecular interaction types: signalling, complexes, kinase-substrate, metabolic, literature, regulatory, yeast two-hybrid; interactions with multiple lines of evidence are marked. Inset: observable module for multiple sclerosis.
• The interactome contains 141,296 physical interactions between 13,460 proteins
• We study 299 diseases with at least 20 gene associations
Menche et al., Science 2015
Measures of network localization
Figure: multiple sclerosis proteins in the interactome. One panel shows the frequency of the size of the largest connected component (data: S = 11, compared with random expectation); the other shows the frequency of the shortest distance d (data compared with random expectation).
• We use two measures to quantify the interactome-based localization of a disease
• 226 out of 299 diseases are significantly localized according to both measures
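The two localization measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the helper names, the toy chain network and the uniform random sampling used for the null model are all assumptions made for this example.

```python
from collections import deque
import random


def build_adjacency(edges):
    """Undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def largest_component_size(adj, genes):
    """Size S of the largest connected component of the subgraph induced by genes."""
    genes = set(genes)
    seen, best = set(), 0
    for g in genes:
        if g in seen:
            continue
        seen.add(g)
        queue, size = deque([g]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v in genes and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


def mean_shortest_distance(adj, genes):
    """Mean over disease genes of the distance to the closest other disease gene."""
    genes = list(genes)
    closest = []
    for g in genes:
        dist = bfs_distances(adj, g)
        reachable = [dist[h] for h in genes if h != g and h in dist]
        if reachable:
            closest.append(min(reachable))
    return sum(closest) / len(closest) if closest else float("inf")


def random_expectation(adj, n_genes, measure, n_samples=200, seed=0):
    """Null distribution of a measure for uniformly sampled gene sets (assumed null model)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    return [measure(adj, rng.sample(nodes, n_genes)) for _ in range(n_samples)]


# Toy network (hypothetical, not interactome data): the chain A-B-C-D-E-F-G.
toy = build_adjacency([("A", "B"), ("B", "C"), ("C", "D"),
                       ("D", "E"), ("E", "F"), ("F", "G")])
disease = ["A", "B", "C", "F"]
S = largest_component_size(toy, disease)
d = mean_shortest_distance(toy, disease)
S_random = random_expectation(toy, len(disease), largest_component_size)
print(S, d, sum(S_random) / len(S_random))
```

A disease would count as localized when its S is larger, and its mean d smaller, than most values in the corresponding random distributions.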
Relations between Diseases
Figure: distribution P(d) of the shortest distances d (d_AA, d_BB or d_AB) for two disease pairs. Separated modules (s_AB ≥ 0): multiple sclerosis (MS) and peroxisomal disorders (PD), s_AB = 1.3. Overlapping modules (s_AB < 0): multiple sclerosis (MS) and rheumatoid arthritis (RA), s_AB = -0.2. The accompanying network shows the genes of the three diseases.
• We introduce a network-based measure to quantify the overlap/separation of two diseases
• Most disease pairs are well separated on the Interactome
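A sketch of how such a separation score could be computed, assuming the form s_AB = <d_AB> - (<d_AA> + <d_BB>)/2, where each average runs over the distance from a protein to the closest protein of the relevant set. The formula is our reading of the measure in Menche et al.; the function names and the toy chain network are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closest_distances(adj, sources, targets, exclude_self=False):
    """For each source, the distance to its closest target (optionally not itself)."""
    targets = set(targets)
    out = []
    for s in sources:
        dist = bfs_distances(adj, s)
        cands = [dist[t] for t in targets
                 if t in dist and not (exclude_self and t == s)]
        if cands:
            out.append(min(cands))
    return out


def mean(values):
    return sum(values) / len(values)


def separation(adj, A, B):
    """s_AB = <d_AB> - (<d_AA> + <d_BB>) / 2; negative values indicate overlap."""
    d_aa = closest_distances(adj, A, A, exclude_self=True)
    d_bb = closest_distances(adj, B, B, exclude_self=True)
    # <d_AB> averages over proteins of both diseases; shared proteins contribute 0.
    d_ab = closest_distances(adj, A, B) + closest_distances(adj, B, A)
    return mean(d_ab) - (mean(d_aa) + mean(d_bb)) / 2


# Toy network (hypothetical): the chain 0-1-2-3-4-5-6-7.
chain = {i: set() for i in range(8)}
for i in range(7):
    chain[i].add(i + 1)
    chain[i + 1].add(i)

s_far = separation(chain, [0, 1], [6, 7])            # modules at opposite ends
s_overlap = separation(chain, [2, 3, 4], [3, 4, 5])  # modules sharing two nodes
print(s_far, s_overlap)
```

In this toy case the distant pair gives a positive score and the pair sharing nodes gives a negative one, matching the sign convention in the figure (s_AB ≥ 0 separated, s_AB < 0 overlapping).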
Menche et al., Science 2015
Network Distance vs. Biomedical Similarity
Figure: six panels plotting, against the mean separation s_AB of disease pairs, the GO term similarity (biological process, molecular function, cellular component), co-expression, symptom similarity and comorbidity (relative risk RR), each compared with the random expectation.
• Diseases that are close in the Interactome have similar biomedical properties
The Disease Space
Figure: the disease space, with numbered example diseases. Disease classes shown: Immune System Diseases, Eye Diseases, Respiratory Tract Diseases, Cardiovascular Diseases. Labeled diseases: type 1 diabetes, rheumatoid arthritis, autoimmune diseases of the nervous system, demyelinating autoimmune diseases, retinitis pigmentosa, retinal degeneration, Graves disease, macular degeneration, asthma, respiratory hypersensitivity, cerebrovascular disorders, myocardial infarction, coronary artery disease, myocardial ischemia.
• Diseases and their network-based relationships can be represented in a 3D Diseases-Space
• Diseases belonging to the same class agglomerate
Menche et al., Science 2015
Overlapping diseases
• Examples of unexpected disease relationships uncovered using the disease space
Figure: gene-level networks of overlapping disease modules. Disease labels: celiac disease, asthma, atherosclerosis, coronary artery disease, biliary tract diseases, hepatic cirrhosis. Highlighted pathway: intestinal immune network for IgA production.
ORIGINAL ARTICLE
A disease module in the interactome explains disease
heterogeneity, drug response and captures novel
pathways and genes in asthma
Amitabh Sharma1,2,3,†, Jörg Menche1,2,4,8,†, C. Chris Huang5, Tatiana Ort5,
Xiaobo Zhou3, Maksim Kitsak1,2, Nidhi Sahni2, Derek Thibault3, Linh Voung3,
Feng Guo3, Susan Dina Ghiassian1,2, Natali Gulbahce6, Frédéric Baribaud5, Joel
Tocker5, Radu Dobrin5, Elliot Barnathan5, Hao Liu5, Reynold A. Panettieri Jr7,
Kelan G. Tantisira3, Weiliang Qiu3, Benjamin A. Raby3, Edwin K. Silverman3,
Marc Vidal2,9, Scott T. Weiss3 and Albert-László Barabási1,2,3,4,8,*
1Center for Complex Networks Research, Department of Physics, Northeastern University, Boston, MA 02115, USA
2Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
3Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
4Department of Theoretical Physics, Budapest University of Technology and Economics, H1111, Budapest, Hungary
5Janssen Research & Development, Inc., 1400 McKean Road, Spring House, PA 19477, USA
6Department of Cellular and Molecular Pharmacology, University of California 1700, 4th Street, Byers Hall 308D, San Francisco, CA 94158, USA
7Pulmonary Allergy and Critical Care Division, Department of Medicine, University of Pennsylvania, 125 South 31st Street, TRL Suite 1200, Philadelphia, PA 19104, USA
8Center for Network Science, Central European University, Nador u. 9, 1051 Budapest, Hungary
9Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
*To whom correspondence should be addressed at: Center for Complex Networks Research, Department of Physics, Northeastern University, Boston,
MA 02115, USA. Email: barabasi@gmail.com
Abstract
Recent advances in genetics have spurred rapid progress towards the systematic identification of genes involved in complex
diseases. Still, the detailed understanding of the molecular and physiological mechanisms through which these genes affect
disease phenotypes remains a major challenge. Here, we identify the asthma disease module, i.e. the local neighborhood of the
interactome whose perturbation is associated with asthma, and validate it for functional and pathophysiological relevance,
using both computational and experimental approaches. We find that the asthma disease module is enriched with modest
GWAS P-values against the background of random variation, and with differentially expressed genes from normal and
asthmatic fibroblast cells treated with an asthma-specific drug. The asthma module also contains immune response
mechanisms that are shared with other immune-related disease modules. Further, using diverse omics (genomics,
†These authors contributed equally to this work.
Received: September 1, 2014. Revised: November 19, 2014. Accepted: January 5, 2015
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Human Molecular Genetics, 2015, Vol. 24, No. 11 3005–3020
doi: 10.1093/hmg/ddv001
Advance Access Publication Date: 12 January 2015
RESEARCH ARTICLE
A DIseAse MOdule Detection (DIAMOnD)
Algorithm Derived from a Systematic
Analysis of Connectivity Patterns of Disease
Proteins in the Human Interactome
Susan Dina Ghiassian1,2☯, Jörg Menche1,2,3☯, Albert-László Barabási1,2,3,4*
1 Center for Complex Networks Research and Department of Physics, Northeastern University, Boston,
Massachusetts, United States of America, 2 Center for Cancer Systems Biology (CCSB) and Department of
Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America, 3 Center
for Network Science, Central European University, Budapest, Hungary, 4 Channing Division of Network
Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston,
Massachusetts, United States of America
☯ These authors contributed equally to this work.
* barabasi@gmail.com
Abstract
The observation that disease associated proteins often interact with each other has fueled
the development of network-based approaches to elucidate the molecular mechanisms of
human disease. Such approaches build on the assumption that protein interaction networks
can be viewed as maps in which diseases can be identified with localized perturbation with-
in a certain neighborhood. The identification of these neighborhoods, or disease modules,
is therefore a prerequisite of a detailed investigation of a particular pathophenotype. While
numerous heuristic methods exist that successfully pinpoint disease associated modules,
OPEN ACCESS
Citation: Ghiassian SD, Menche J, Barabási A-L
(2015) A DIseAse MOdule Detection (DIAMOnD)
Algorithm Derived from a Systematic Analysis of
Connectivity Patterns of Disease Proteins in the
Human Interactome. PLoS Comput Biol 11(4):
e1004120. doi:10.1371/journal.pcbi.1004120
Editor: Andrey Rzhetsky, University of Chicago,
UNITED STATES
Received: August 25, 2014
Accepted: January 9, 2015
Sharma et al., HMG 2015
Ghiassian et al., PLoS Comp Biol 2015
BUILDING DISEASE MODULES
Disease Module Detection and Analysis
The general workflow of a detailed analysis for a disease of interest:
I Interactome construction
- Binary interactions, metabolic couplings, regulatory interactions ...
II Disease Module Identification
- Seed gene selection: OMIM, GWAS, literature
- DIAMOnD: Disease Module Detection Algorithm
III Validation
- Gene expression data
- Gene Ontologies
- Pathways
- Comorbidity
IV Biological interpretation
- Pathway prioritization
- Molecular mechanism
DIAM
DISEASE MODULES VS COMMUNITIES
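Figure (panels A to C): a proto-module of original seed genes together with the genes connected to a seed gene; DIAMOnD iterations 1 to 3 starting from the initial seeds, showing the connectivity p-value of each candidate gene (values between 0.07 and 0.53) and the gene selected at iteration i; and the resulting disease module, made of the original seed genes and the DIAMOnD genes.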
DIAMOnD: Disease Module Detection Algorithm
• purely topological method
• all genes in the network are prioritized according to their potential relevance for the disease
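The prioritization described above can be illustrated with a simplified sketch: at each iteration, every candidate gene gets a hypergeometric p-value for having at least its observed number of links to the current module, and the gene with the lowest p-value is added. This is one reading of the iteration shown on the slides, not the published implementation; refinements of the published algorithm are omitted, and the function names and toy network are hypothetical.

```python
from math import comb


def connectivity_pvalue(n_nodes, n_seeds, degree, seed_links):
    """P(at least seed_links of degree links hit a seed) if links were placed at random."""
    total = comb(n_nodes, degree)
    tail = sum(comb(n_seeds, i) * comb(n_nodes - n_seeds, degree - i)
               for i in range(seed_links, min(degree, n_seeds) + 1))
    return tail / total


def diamond_sketch(adj, seeds, n_iterations):
    """Iteratively add the gene whose links to the module are most significant."""
    module = set(seeds)
    added = []
    n_nodes = len(adj)
    for _ in range(n_iterations):
        best = None
        for node in sorted(adj):  # sorted for deterministic tie-breaking
            if node in module:
                continue
            seed_links = len(adj[node] & module)
            if seed_links == 0:
                continue
            p = connectivity_pvalue(n_nodes, len(module), len(adj[node]), seed_links)
            if best is None or p < best[0]:
                best = (p, node)
        if best is None:
            break
        module.add(best[1])
        added.append((best[1], best[0]))
    return added


# Toy network (hypothetical): X links only to the three seeds, while hub H
# links to one seed and to five unrelated nodes.
edges = [("X", "S1"), ("X", "S2"), ("X", "S3"), ("H", "S1")]
edges += [("H", "O%d" % i) for i in range(1, 6)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

ranking = diamond_sketch(adj, ["S1", "S2", "S3"], n_iterations=2)
print(ranking)
```

The toy case shows why a p-value is used rather than a raw link count: the hub H touches a seed but has many unrelated links, so X, whose links all go to seeds, is ranked first.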
DIAMOnD and Disease Modules within the Human
BUILDING A DISEASE MODULE
ARTICLE
Received 7 May 2015 | Accepted 29 Nov 2015 | Published 1 Feb 2016
Network-based in silico drug efficacy screening
Emre Guney1,2, Jörg Menche1,3, Marc Vidal2,4 & Albert-László Barabási1,2,3,5
The increasing cost of drug development together with a significant drop in the number of
new drug approvals raises the need for innovative approaches for target identification
and efficacy prediction. Here, we take advantage of our increasing understanding of the
network-based origins of diseases to introduce a drug-disease proximity measure that
quantifies the interplay between drug targets and diseases. By correcting for the known
biases of the interactome, proximity helps us uncover the therapeutic effect of drugs, as well
as to distinguish palliative from effective treatments. Our analysis of 238 drugs used in 78
diseases indicates that the therapeutic effect of drugs is localized in a small network
neighborhood of the disease genes and highlights efficacy issues for drugs used in Parkinson
and several inflammatory disorders. Finally, network-based proximity allows us to predict
novel drug-disease associations that offer unprecedented opportunities for drug repurposing
and the detection of adverse effects.
DOI: 10.1038/ncomms10331 OPEN
Guney et al., Nature Comm 2016
DRUGS
Figure: drug-disease proximity in the interactome. Panel a marks disease genes, drug targets and the shortest path from each drug target to the closest disease gene, with the worked example d = (2+3)/2; the observed distance is compared with random gene sets with the same degrees to obtain a z-score z. Panels b and c show the drug targets of gliclazide and daunorubicin relative to the disease genes of type 2 diabetes and acute myeloid leukaemia, with closest distances dc between 1.0 and 2.5 and z-scores zc between -3.3 and 1.3.
Figure 1 | Network-based drug-disease proximity. (a) Illustration of the closest distance (dc) of a drug T with targets t1 and t2 to the proteins s1, s2 and s3
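The closest-distance proximity and its z-score can be sketched as follows. The closest distance averages, over drug targets, the shortest path length to the nearest disease gene; the z-score compares it with random gene sets of matching degree. Exact degree matching, the function names and the toy network are assumptions made for this illustration, and a connected network is assumed.

```python
from collections import defaultdict, deque
import random
import statistics


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closest_distance(adj, targets, disease_genes):
    """d_c: mean over targets of the distance to the closest disease gene."""
    total = 0
    for t in targets:
        dist = bfs_distances(adj, t)
        total += min(dist[s] for s in disease_genes if s in dist)
    return total / len(targets)


def degree_matched_sample(adj, nodes, rng):
    """Random node set in which each node has the same degree as its original."""
    by_degree = defaultdict(list)
    for n in sorted(adj):
        by_degree[len(adj[n])].append(n)
    chosen = set()
    for n in nodes:
        pool = [m for m in by_degree[len(adj[n])] if m not in chosen]
        chosen.add(rng.choice(pool))
    return chosen


def proximity(adj, targets, disease_genes, n_random=200, seed=0):
    """Return (d_c, z), with z = (d_c - mean) / std over degree-matched random sets."""
    rng = random.Random(seed)
    d = closest_distance(adj, targets, disease_genes)
    reference = [closest_distance(adj,
                                  degree_matched_sample(adj, targets, rng),
                                  degree_matched_sample(adj, disease_genes, rng))
                 for _ in range(n_random)]
    mu = statistics.mean(reference)
    sigma = statistics.pstdev(reference)
    z = (d - mu) / sigma if sigma > 0 else float("nan")
    return d, z


# Toy network (hypothetical): target t1 is 2 steps and t2 is 3 steps from the
# nearest disease gene, mirroring the worked example d = (2 + 3) / 2.
edges = [("s1", "s2"), ("s1", "a"), ("a", "t1"),
         ("s2", "b"), ("b", "c"), ("c", "t2")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

d_c, z_c = proximity(adj, ["t1", "t2"], ["s1", "s2"])
print(d_c, z_c)
```

On a real interactome, degree bins would likely be needed instead of exact matching, since high-degree proteins rarely have many peers of identical degree.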
PROXIMITY TO DISEASE MODULES
Figure: comparison of drug-to-disease-module distance measures: closest (dc), shortest (ds), kernel (dk), center (dcc) and separation (dss), evaluated by AUC (%).
GOING FURTHER
Full text: http://barabasi.com/networksciencebook/

Introduction to Network Medicine

  • 1. INTRODUCTION TO NETWORK MEDICINE Marc Santolini Center for Complex Network Research (CCNR)
  • 2. Reductionism, which has dominated biological research for over a century, has provided a wealth of knowledge about individual cellular components and their functions. Despite its enormous success, it is increasingly clear that a discrete biological function can only rarely be attributed to an individual molecule. Instead, most biological characteristics arise from complex interactions between the cell's numerous constituents, such as proteins, DNA, RNA and small molecules (REFS 1–8). Therefore, a key challenge for biology in the twenty-first century is to understand the structure and the dynamics of the complex intercellular web of interactions that contribute to the structure and function of a living cell. The development of high-throughput data-collection techniques, as epitomized by the widespread use of microarrays, allows for the simultaneous interrogation of the status of a cell's components at any given time. In turn, new technology platforms, such as protein chips or semi-automated yeast two-hybrid screens, help to determine how and when these molecules interact with each other. Various types of interaction webs, or networks (including protein–protein interaction, metabolic, signalling and transcription-regulatory networks), emerge from the sum of these interactions. None of these networks are independent; instead they form a 'network of networks' that is responsible for the behaviour of the cell. A major challenge of contemporary biology is to […] programme to map out, understand and model in quantifiable terms the topological and dynamic properties of the various networks that control the behaviour of the cell. Help along the way is provided by the rapidly developing theory of complex networks that, in the past few years, has made advances towards uncovering the organizing principles that govern the formation and evolution of various complex technological and social networks (REFS 9–12). This research is already making an impact on cell biology.
It has led to the realization that the architectural features of molecular interaction networks within a cell are shared to a large degree by other complex systems, such as the Internet, computer chips and society. This unexpected universality indicates that similar laws may govern most complex networks in nature, which allows the expertise from large and well-mapped non-biological systems to be used to characterize the intricate interwoven relationships that govern cellular functions. In this review, we show that the quantifiable tools of network theory offer unforeseen possibilities to understand the cell's internal organization and evolution, fundamentally altering our view of cell biology. The emerging results are forcing the realization that, notwithstanding the importance of individual molecules, cellular function is a contextual attribute of strict and quantifiable patterns of interactions between the myriad of cellular constituents. Although uncovering […]
NETWORK BIOLOGY: UNDERSTANDING THE CELL'S FUNCTIONAL ORGANIZATION. Albert-László Barabási* & Zoltán N. Oltvai‡
A key aim of postgenomic biomedical research is to systematically catalogue all molecules and their interactions within a living cell. There is a clear need to understand how these molecules and the interactions between them determine the function of this enormously complex machinery, both in isolation and when surrounded by other cells. Rapid advances in network biology indicate that cellular networks are governed by universal laws and offer a new conceptual framework that could potentially revolutionize our view of biology and disease pathologies in the twenty-first century.
REVIEWS. Barabási et al., Nat Rev Genet 2004. NATURE REVIEWS | GENETICS, VOLUME 5, FEBRUARY 2004, 105
[…] Ba; blue nodes).
In the Barabási–Albert model of a scale-free network, at each time point a node with M links is added to the network, which connects to an already existing node I with probability $\Pi_I = k_I / \sum_J k_J$, where $k_I$ is the degree of node I (FIG. 3) and J is the index denoting the sum over network nodes. The network that is generated by this growth process has a power-law degree distribution that is characterized by the degree exponent $\gamma = 3$. Such distributions are seen as a straight line on a log–log plot (see figure, part Bb). The network that is created by the Barabási–Albert model does not have an inherent modularity, so C(k) is independent of k (see figure, part Bc). Scale-free networks with degree exponents $2 < \gamma < 3$, a range that is observed in most biological and non-biological networks, are ultra-small (REFS 34,35), with the average path length following $\ell \sim \log \log N$, which is significantly shorter than the $\log N$ that characterizes random small-world networks.
Hierarchical networks
To account for the coexistence of modularity, local clustering and scale-free topology in many real systems it has to be assumed that clusters combine in an iterative manner, generating a hierarchical network (REFS 47,53) (see figure, part C). The starting point of this construction is a small cluster of four densely linked nodes (see the four central nodes in figure, part Ca). Next, three replicas of this module are generated and the three external nodes of the replicated clusters connected to the central node of the old cluster, which produces a large 16-node module. Three replicas of this 16-node module are then generated and the 16 peripheral nodes connected to the central node of the old module, which produces a new module of 64 nodes.
The hierarchical network model seamlessly integrates a scale-free topology with an inherent modular structure by generating a network that has a power-law degree distribution with degree exponent $\gamma = 1 + \ln 4/\ln 3 = 2.26$ (see figure, part Cb) and a large, system-size independent average clustering coefficient $\langle C \rangle \sim 0.6$. The most important signature of hierarchical modularity is the scaling of the clustering coefficient, which follows $C(k) \sim k^{-1}$, a straight line of slope –1 on a log–log plot (see figure, part Cc). A hierarchical architecture implies that sparsely connected nodes are part of highly clustered areas, with communication between the different highly clustered neighbourhoods being maintained by a few hubs (see figure, part Ca).
[Figure: A, random network (panels Aa, Ab, Ac); B, scale-free network (panels Ba, Bb, Bc); C, hierarchical network (panels Ca, Cb, Cc). The plots show P(k) and C(k) against k.]
SCALE-FREE NETWORKS
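The growth rule of the Barabási–Albert model described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the seed graph (a complete graph on m + 1 nodes), the function name `barabasi_albert` and the parameter values are assumptions. Each new node arrives with m links and attaches to existing nodes with probability proportional to their degree; drawing repeatedly until m distinct targets are found is a simplification that deviates slightly from the exact probability $\Pi_I = k_I / \sum_J k_J$.

```python
import random
from collections import Counter


def barabasi_albert(n, m, seed=None):
    """Grow a network to n nodes; every new node brings m links (preferential attachment)."""
    rng = random.Random(seed)
    # Assumed starting point: a complete graph on m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every node appears in `stubs` once per link it holds, so a uniform draw
    # from `stubs` selects node I with probability k_I / sum_J k_J.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # redraw until m distinct existing nodes are chosen
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


edges = barabasi_albert(1000, 2, seed=0)
degree = Counter(node for edge in edges for node in edge)
mean_degree = sum(degree.values()) / len(degree)
print("mean degree:", mean_degree, "largest degree:", max(degree.values()))
```

The largest degree comes out many times larger than the mean, which is the hub structure that the slide attributes to scale-free networks.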
  • 3. […] mathematical properties of random networks (REF. 14). Their much-investigated random network model assumes that a fixed number of nodes are connected randomly to each other (BOX 2). The most remarkable property of the model is its 'democratic' or uniform character, characterizing the degree, or connectivity (k; BOX 1), of the individual nodes. Because, in the model, the links are placed randomly among the nodes, it is expected that some nodes collect only a few links whereas others collect many more. In a random network, the node degrees follow a Poisson distribution, which indicates that most nodes have roughly the same number of links, approximately equal to the network's average degree, <k> (where <> denotes the average); nodes that have significantly more or fewer links than <k> are absent or very rare (BOX 2). Despite its elegance, a series of recent findings indicate that the random network model cannot explain the topological properties of real networks. The deviations from the random model have several key signatures, the most striking being the finding that, in contrast to the Poisson degree distribution, for many social and technological networks the number of nodes with a given degree follows a power law. That is, the probability that a chosen node has exactly k links follows $P(k) \sim k^{-\gamma}$, where $\gamma$ is the degree exponent, with its value for most networks being between 2 and 3 (REF. 15). Networks that are characterized by a power-law degree distribution are highly non-uniform: most of the nodes have only a few links. A few nodes with a very large number of links, which are often called hubs, hold these nodes together. Networks with a power degree […]
Figure 2 | Yeast protein interaction network. A map of protein–protein interactions (REF. 18) in Saccharomyces cerevisiae, which is based on early yeast two-hybrid measurements (REF. 23), illustrates that a few highly connected nodes (which are also known as hubs) hold the network together.
The largest cluster, which contains ~78% of all proteins, is shown. The colour of a node indicates the phenotypic effect of removing the corresponding protein (red = lethal, green = non-lethal, orange = slow growth, yellow = unknown). Reproduced with permission from REF. 18. Jeong et al., “Lethality and centrality in protein networks”, Nature 2001.
THE YEAST INTERACTOME
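The contrast between the two degree distributions on this slide can be made concrete with a small calculation. The sketch below compares a Poisson distribution with a power law $P(k) \sim k^{-\gamma}$; the mean degree of 4, the exponent 2.5 and the cutoff `k_max` used to normalize the power law are illustrative assumptions, not values taken from the slide.

```python
import math


def poisson_pmf(k, mean_degree):
    """P(k) for a random network: exp(-<k>) <k>^k / k!, evaluated in log space."""
    return math.exp(-mean_degree + k * math.log(mean_degree) - math.lgamma(k + 1))


def power_law_pmf(k, gamma, k_max=10_000):
    """P(k) proportional to k^(-gamma), normalized over k = 1..k_max (assumed cutoff)."""
    norm = sum(j ** -gamma for j in range(1, k_max + 1))
    return k ** -gamma / norm


# Probability of finding a node with k links under each model.
for k in (1, 10, 100):
    print(k, poisson_pmf(k, 4.0), power_law_pmf(k, 2.5))
```

Under these assumptions a node with 100 links is so improbable in the random network that it would never be observed, while the power law still gives it a small but non-negligible probability. That is why hubs appear in scale-free networks and not in random ones.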
  • 4. FABRICATING HUBS
[…] major engineer of the genomic landscape, it is likely to be a key mechanism for generating the scale-free topology. Two further results offer direct evidence that network growth is responsible for the observed topological features. The scale-free model (BOX 2) predicts that the nodes that appeared early in the history of the network are the most connected ones (REF. 15). Indeed, an inspection of the metabolic hubs indicates that the remnants of the RNA world, such as coenzyme A, NAD and GTP, are among the most connected substrates of the metabolic network, as are elements of some of the most ancient metabolic pathways, such as glycolysis and the tricarboxylic acid cycle (REF. 17). In the context of the protein interaction networks, cross-genome comparisons have found that, on average, the evolutionarily older proteins have more links to other proteins than their younger counterparts (REFS 45,46). This offers direct empirical evidence for preferential attachment.
Motifs, modules and hierarchical networks
Cellular functions are likely to be carried out in a highly modular manner (REF. 1). In general, modularity refers to a group of physically or functionally linked molecules (nodes) that work together to achieve a (relatively) distinct function (REFS 1,6,8,47). Modules are seen in many systems, for example, circles of friends in social networks or websites that are devoted to similar topics on the World Wide Web. Similarly, in many complex engineered systems, from a modern aircraft to a computer chip, a highly modular structure is a fundamental design […]
[Figure 3, panels a and b: proteins and genes before and after duplication.]
Figure 3 | The origin of the scale-free topology and hubs in biological networks. The origin of the scale-free topology
  • 5. NETWORK MOTIFS
Superfamilies of Evolved and Designed Networks
Ron Milo, Shalev Itzkovitz, Nadav Kashtan, Reuven Levitt, Shai Shen-Orr, Inbal Ayzenshtat, Michal Sheffer, Uri Alon*
Complex biological, technological, and sociological networks can be of very different sizes and connectivities, making it difficult to compare their structures.
Here we present an approach to systematically study similarity in the local structure of networks, based on the significance profile (SP) of small subgraphs in the network compared to randomized networks. We find several superfamilies of previously unrelated networks with very similar SPs. One superfamily, including transcription networks of microorganisms, represents “rate-limited” information-processing networks strongly constrained by the response time of their components. A distinct superfamily includes protein signaling, developmental genetic networks, and neuronal wiring. Additional superfamilies include power grids, protein-structure networks and geometric networks, World Wide Web links and social networks, and word-adjacency networks from different languages.
Many networks in nature share global properties (1, 2). Their degree sequences (the number of edges per node) often follow a long-tailed distribution, in which some nodes are much more connected than the average (3). In addition, natural networks often show the small-world property of short paths between nodes and highly clustered connections (1, 2, 4). Despite these global similarities, networks from different fields can have very different local structure (5). It was recently found that networks display certain patterns, termed “network motifs,” at much higher frequency than expected in randomized networks (6, 7). In biological networks, these motifs were suggested to be recurring circuit elements that carry out key information-processing tasks (6, 8–10).
Departments of Molecular Cell Biology, Physics of Complex Systems, and Computer Science, Weizmann Institute of Science, Rehovot 76100, Israel. *To whom correspondence should be addressed at Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel.
E-mail: urialon@weizmann.ac.il
It was recently found that networks display certain patterns, termed “network motifs,” at much higher fre- quency than expected in randomized net- works (6, 7). In biological networks, these motifs were suggested to be recurring circuit elements that carry out key information- processing tasks (6, 8–10). Departments of Molecular Cell Biology, Physics of Complex Systems, and Computer Science, Weizmann Institute of Science, Rehovot 76100, Israel. *To whom correspondence should be addressed at Department of Molecular Cell Biology, Weizmann In- stitute of Science, Rehovot 76100, Israel. E-mail: urialon@weizmann.ac.il 5 MARCH 2004 VOL 303 SCIENCE www.sciencemag.org To understand the design principles of com- plex networks, it is important to compare the local structure of networks from different fields. The main difficulty is that these networks can be of vastly different sizes [for example, World Wide Web (WWW) hyperlink networks with millions of nodes and social networks with tens of nodes] and degree sequences. Here, we present an ap- proach for comparing network local structure, based on the significance profile (SP). To calcu- late the SP of a network, the network is compared to an ensemble of randomized networks with the same degree sequence. The comparison to ran- domized networks compensates for effects due to network size and degree sequence. For each sub- graph i, the statistical significance is described by the Z score (11): Zi ϭ ͑Nreali Ϫ <Nrandi>)/std(Nrandi) where Nreali is the number of times the sub- graph appears in the network, and ϽNrandiϾ and std(Nrandi) are the mean and standard deviation of its appearances in the random- ized network ensemble. The SP is the vector of Z scores normalized to length 1: SPiϭZi/(⌺Zi 2 )1/2 The normalization emphasizes the relative significance of subgraphs, rather than the ab- solute significance. 
This is important for comparison of networks of different sizes, because motifs (subgraphs that occur much more often than expected at random) in large networks tend to display higher Z scores than motifs in small networks (7). We present in Fig. 1 the SP of the 13 possible directed connected triads (triad significance profile, TSP) for networks from different fields (12). The TSP of these networks is almost always insensitive to removal of 30% of the edges or to addition of 50% new edges at random, demonstrating that it is robust to missing data or random data errors (SOM Text). Several superfamilies of networks with similar TSPs emerge from this analysis. One superfamily includes sensory transcription networks that control gene expression in bacteria and yeast in response to external stimuli. In these transcription networks, the nodes represent genes or operons and the edges represent direct transcriptional regulation (6, 13–15). Networks from three microorganisms, the bacteria Escherichia coli (6) and Bacillus subtilis (14) and the yeast Saccharomyces cerevisiae (7, 15), were analyzed. The networks have very similar TSPs (correlation coefficient c > 0.99). They show one strong motif, triad 7, termed “feedforward loop.” The feedforward loop has been theoretically and experimentally shown [...]
Fig. 1. The triad significance profile (TSP) of networks from various disciplines. The TSP shows the normalized significance level (Z score) for each of the 13 triads. Networks with similar characteristic profiles are [...] URCHIN N = 45, E = 83), and synaptic connections between neurons in C. elegans (NEURONS N = 280, E = 2170). (iii) WWW hyperlinks between Web pages in the www.nd.edu site (3) (WWW-1 N = 325729, [...]
  • 6. The human disease network
Kwang-Il Goh*†‡§, Michael E. Cusick†‡¶, David Valle‖, Barton Childs‖, Marc Vidal†‡¶**, and Albert-László Barabási*†‡**
*Center for Complex Network Research and Department of Physics, University of Notre Dame, Notre Dame, IN 46556; †Center for Cancer Systems Biology (CCSB) and ¶Department of Cancer Biology, Dana–Farber Cancer Institute, 44 Binney Street, Boston, MA 02115; ‡Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115; §Department of Physics, Korea University, Seoul 136-713, Korea; and ‖Department of Pediatrics and the McKusick–Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205
Edited by H. Eugene Stanley, Boston University, Boston, MA, and approved April 3, 2007 (received for review February 14, 2007)
A network of disorders and disease genes linked by known disorder–gene associations offers a platform to explore in a single graph-theoretic framework all known phenotype and disease gene associations, indicating the common genetic origin of many diseases. Genes associated with similar disorders show both higher likelihood of physical interactions between their products and higher expression profiling similarity for their transcripts, supporting the existence of distinct disease-specific functional modules. We find that essential human genes are likely to encode hub proteins and are expressed widely in most tissues. This suggests that disease genes also would play a central role in the human interactome. In contrast, we find that the vast majority of disease genes are nonessential and show no tendency to encode hub proteins, and their expression pattern indicates that they are localized in the functional periphery of the network.
A selection-based model explains the observed difference between essential and disease genes and also suggests that diseases caused by somatic mutations should not be peripheral, a prediction we confirm for cancer genes.
biological networks | complex networks | human genetics | systems biology | diseasome
Decades-long efforts to map human disease loci, at first genetically and later physically (1), followed by recent positional cloning of many disease genes (2) and genome-wide association studies (3), have generated an impressive list of disorder–gene association pairs (4, 5). In addition, recent efforts to map the protein–protein interactions in humans (6, 7), together with efforts to curate an extensive map of human metabolism (8) and regulatory networks offer increasingly detailed maps of the relationships between different disease genes. Most of the successful studies building on these new approaches have focused, however, on a single disease, using network-based tools to gain a better understanding of the relationship between the genes implicated in a selected disorder (9).
Here we take a conceptually different approach, exploring whether human genetic disorders and the corresponding disease genes might be related to each other at a higher level of cellular and organismal organization. Support for the validity of this approach is provided by examples of genetic disorders that arise from mutations in more than a single gene (locus heterogeneity). For example, Zellweger syndrome is caused by mutations in any of at least 11 genes, all associated with peroxisome biogenesis (10). Similarly, there are many examples of different mutations in the same gene (allelic heterogeneity) giving rise to phenotypes currently classified as different disorders. For example, mutations in TP53 have been linked to 11 clinically distinguishable cancer-related disorders (11).
Given the highly interlinked internal organization of the cell (12–17), it should be possible to improve the single gene–single disorder approach by developing a conceptual framework to link systematically all genetic disorders (the human “disease phenome”) with the complete list of disease genes (the “disease genome”), resulting in a global view of the “diseasome,” the combined set of all known disorder/disease gene associations.
Results
Construction of the Diseasome. We constructed a bipartite graph consisting of two disjoint sets of nodes. One set corresponds to all known genetic disorders, whereas the other set corresponds to all known disease genes in the human genome (Fig. 1). A disorder and a gene are then connected by a link if mutations in that gene are implicated in that disorder. The list of disorders, disease genes, and associations between them was obtained from the Online Mendelian Inheritance in Man (OMIM; ref. 18), a compendium of human disease genes and phenotypes. As of December 2005, this list contained 1,284 disorders and 1,777 disease genes. OMIM initially focused on monogenic disorders but in recent years has expanded to include complex traits and the associated genetic mutations that confer susceptibility to these common disorders (18). Although this history introduces some biases, and the disease gene record is far from complete, OMIM represents the most complete and up-to-date repository of all known disease genes and the disorders they confer. We manually classified each disorder into one of 22 disorder classes based on the physiological system affected [see supporting information (SI) Text, SI Fig. 5, and SI Table 1 for details]. Starting from the diseasome bipartite graph we generated two biologically relevant network projections (Fig. 1).
In the “human disease network” (HDN) nodes represent disorders, and two disorders are connected to each other if they share at least one gene in which mutations are associated with both disorders (Figs. 1 and 2a). In the “disease gene network” (DGN) nodes represent disease genes, and two genes are connected if they are associated with the same disorder (Figs. 1 and 2b). Next, we discuss the potential of these networks to help us understand and represent in a single framework all known disease gene and phenotype associations.
Properties of the HDN. If each human disorder tends to have a distinct and unique genetic origin, then the HDN would be disconnected into many single nodes corresponding to specific disorders or grouped into small clusters of a few closely related disorders. In contrast, the obtained HDN displays many connections between both individual disorders and disorder classes (Fig. 2a). Of 1,284 disorders, 867 have at least one link to other disorders, and 516 disorders form a giant component, suggesting that the genetic origins of most diseases, to some extent, are shared with other diseases. The number of genes associated with a disorder, s, has a broad distribution (see SI Fig. 6a), indicating that most disorders relate to a few disease genes, whereas a handful of phenotypes, such as deafness (s = 41), leukemia (s = 37), and colon cancer (s = 34), relate to dozens of genes (Fig. 2a). The degree (k) distribution of HDN (SI Fig. 6b) indicates that most disorders are linked to only [...]
Author contributions: D.V., B.C., M.V., and A.-L.B. designed research; K.-I.G. and M.E.C. performed research; K.-I.G. and M.E.C. analyzed data; and K.-I.G., M.E.C., D.V., M.V., and A.-L.B. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission.
Abbreviations: DGN, disease gene network; HDN, human disease network; GO, Gene Ontology; OMIM, Online Mendelian Inheritance in Man; PCC, Pearson correlation coefficient.
**To whom correspondence may be addressed. E-mail: alb@nd.edu or marc_vidal@dfci.harvard.edu. This article contains supporting information online at www.pnas.org/cgi/content/full/0701361104/DC1. © 2007 by The National Academy of Sciences of the USA
www.pnas.org/cgi/doi/10.1073/pnas.0701361104 PNAS | May 22, 2007 | vol. 104 | no. 21 | 8685–8690
[Figure 1 panels: Human Disease Network (HDN); DISEASOME (disease phenome, disease genome); Disease Gene Network (DGN). Gene nodes: AR, ATM, BRCA1, BRCA2, CDH1, GARS, HEXB, KRAS, LMNA, MSH2, PIK3CA, TP53, MAD1L1, RAD54L, VAPB, CHEK2, BSCL2, ALS2, BRIP1. Disorder nodes: Androgen insensitivity, Breast cancer, Perineal hypospadias, Prostate cancer, Spinal muscular atrophy, Ataxia-telangiectasia, Lymphoma, T-cell lymphoblastic leukemia, Ovarian cancer, Papillary serous carcinoma, Fanconi anemia, Pancreatic cancer, Wilms tumor, Charcot-Marie-Tooth disease, Sandhoff disease, Lipodystrophy, Amyotrophic lateral sclerosis, Silver spastic paraplegia syndrome, Spastic ataxia/paraplegia.]
Fig. 1. Construction of the diseasome bipartite network. (Center) A small subset of OMIM-based disorder–disease gene associations (18), where circles and rectangles [...]
Goh et al., PNAS 2007
GENES AND DISEASES
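The two projections of the diseasome described on this slide (HDN: disorders linked when they share a gene; DGN: genes linked when they share a disorder) can be sketched as follows. This is a minimal illustration with the standard library only; the function name `project_diseasome` and the placeholder labels `D1`, `g1`, etc. are hypothetical and do not correspond to OMIM entries.

```python
from collections import defaultdict
from itertools import combinations


def project_diseasome(pairs):
    """Project (disorder, gene) association pairs onto the HDN and the DGN.

    Returns two sets of undirected edges, each edge a frozenset of two labels.
    """
    genes_of = defaultdict(set)      # disorder -> associated genes
    disorders_of = defaultdict(set)  # gene -> associated disorders
    for disorder, gene in pairs:
        genes_of[disorder].add(gene)
        disorders_of[gene].add(disorder)
    # HDN: two disorders are linked if they share at least one gene.
    hdn = {frozenset(p)
           for ds in disorders_of.values()
           for p in combinations(sorted(ds), 2)}
    # DGN: two genes are linked if they are associated with the same disorder.
    dgn = {frozenset(p)
           for gs in genes_of.values()
           for p in combinations(sorted(gs), 2)}
    return hdn, dgn


# Placeholder associations (not real data): D1 and D2 share gene g2.
pairs = [("D1", "g1"), ("D1", "g2"), ("D2", "g2"), ("D3", "g3")]
hdn, dgn = project_diseasome(pairs)
print(sorted(sorted(e) for e in hdn))
print(sorted(sorted(e) for e in dgn))
```

Storing edges as sets of `frozenset` pairs keeps the projections unweighted and undirected, which matches the description on the slide; a weighted variant would count shared genes or disorders instead.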
  • 7. [Figure 2: (a) Human Disease Network; (b) Disease Gene Network. Legend: Disorder Class (Bone, Cancer, Cardiovascular, Connective tissue, Dermatological, Developmental, Ear, Nose, Throat, Endocrine, Gastrointestinal, Hematological, Immunological, Metabolic, Muscular, Neurological, Nutritional, Ophthalmological, Psychiatric, Renal, Respiratory, Skeletal, multiple, Unclassified); Node size (1, 5, 10, 15, 21, 25, 30, 34, 41). Labeled disorders: Asthma, Atherosclerosis, Blood group, Breast cancer, Complement component deficiency, Cardiomyopathy, Cataract, Charcot-Marie-Tooth disease, Colon cancer, Deafness, Diabetes mellitus, Epidermolysis bullosa, Epilepsy, Fanconi anemia, Gastric cancer, Hypertension, Leigh syndrome, Leukemia, Lymphoma, Mental retardation, Muscular dystrophy, Myocardial infarction, Myopathy, Obesity, Parkinson disease, Prostate cancer, Retinitis pigmentosa, Spherocytosis, Spinocerebellar ataxia, Stroke, Thyroid carcinoma, Zellweger syndrome, Hirschsprung disease, Trichothiodystrophy, Alzheimer disease, Heinz body anemia, Bethlem myopathy, Hemolytic anemia, Ataxia-telangiectasia, Pseudohypoaldosteronism. Labeled genes: APC, COL2A1, ACE, PAX6, ERBB2, FBN1, FGFR3, FGFR2, GJB2, GNAS, KIT, KRAS, LRP5, MSH2, MEN1, NF1, PTEN, SCN4A, TP53, ARX.]
  • 8. Leading Edge Review
Interactome Networks and Human Disease
Marc Vidal,1,2,* Michael E. Cusick,1,2 and Albert-László Barabási1,3,4,*
1Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA 2Department of Genetics, Harvard Medical School, Boston, MA 02115, USA 3Center for Complex Network Research (CCNR) and Departments of Physics, Biology and Computer Science, Northeastern University, Boston, MA 02115, USA 4Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
*Correspondence: marc_vidal@dfci.harvard.edu (M.V.), alb@neu.edu (A.-L.B.) DOI 10.1016/j.cell.2011.02.016
Complex biological systems and cellular networks may underlie most genotype to phenotype relationships. Here, we review basic concepts in network biology, discussing different types of interactome networks and the insights that can come from analyzing them. We elaborate on why interactome networks are important to consider in biology, how they can be mapped and integrated with each other, what global properties are starting to emerge from interactome network models, and how these properties may relate to human disease.
Introduction
Since the advent of molecular biology, considerable progress has been made in the quest to understand the mechanisms that underlie human disease, particularly for genetically inherited disorders. Genotype-phenotype relationships, as summarized in the Online Mendelian Inheritance in Man (OMIM) database (Amberger et al., 2009), include mutations in more than 3000 human genes known to be associated with one or more of over 2000 human disorders.
This is a truly astounding number of genotype-phenotype relationships considering that a mere three decades have passed since the initial description of Restriction Fragment Length Polymorphisms (RFLPs) as molecular markers to map genetic loci of interest (Botstein et al., 1980), only two decades since the announcement of the first positional cloning experiments of disease-associated genes using RFLPs (Amberger et al., 2009), and just one decade since the release of the first reference sequences of the human genome (Lander et al., 2001; Venter et al., 2001). For complex traits, the information gathered by recent genome-wide association studies suggests high-confidence genotype-phenotype associations between close to 1000 genomic loci and one or more of over [...] phenotypic associations, there would still be major problems to fully understand and model human genetic variations and their impact on diseases.
To understand why, consider the “one-gene/one-enzyme/one-function” concept originally framed by Beadle and Tatum (Beadle and Tatum, 1941), which holds that simple, linear connections are expected between the genotype of an organism and its phenotype. But the reality is that most genotype-phenotype relationships arise from a much higher underlying complexity. Combinations of identical genotypes and nearly identical environments do not always give rise to identical phenotypes. The very coining of the words “genotype” and “phenotype” by Johannsen more than a century ago derived from observations that inbred isogenic lines of bean plants grown in well-controlled environments give rise to pods of different size (Johannsen, 1909). Identical twins, although strikingly similar, nevertheless often exhibit many differences (Raser and O’Shea, 2005).
Likewise, genotypically indistinguishable bacterial or yeast cells grown side by side can express different subsets of transcripts and gene products at any given moment (Elowitz et al., 2002; Blake et al., 2003; Taniguchi et al., 2010). Even straightforward [...]
Mapping Interactome Networks
Network science deals with complexity by “simplifying” complex systems, summarizing them merely as components (nodes) and interactions (edges) between them. In this simplified approach, the functional richness of each node is lost. Despite or even perhaps because of such simplifications, useful discoveries can be made. As regards cellular systems, the nodes are metabolites and macromolecules such as proteins, RNA molecules and gene sequences, while the edges are physical, biochemical and functional interactions that can be identified with a plethora of technologies. One challenge of network biology is to provide maps of such interactions using systematic and standardized approaches and assays that are as unbiased as possible. The resulting “interactome” networks, the networks of interactions between cellular components, can serve as scaffold information to extract global or local graph theory properties [...]
[...] et al., 2010). Computational prediction maps are fast and efficient to implement, and usually include satisfyingly large numbers of nodes and edges, but are necessarily imperfect because they use indirect information (Plewczynski and Ginalski, 2009). While high-throughput maps attempt to describe unbiased, systematic, and well-controlled data, they were initially more difficult to establish, although recent technological advances suggest that near completion can be reached within a few years for highly reliable, comprehensive protein-protein [...]
[...] were discovered in and are being applied genome-wide for these model organisms (Mohr et al., 2010).
Metabolic Networks
Metabolic network maps attempt to comprehensively describe all possible biochemical reactions for a particular cell or organism (Schuster et al., 2000; Edwards et al., 2001). In many representations of metabolic networks, nodes are biochemical metabolites and edges are either the reactions that convert [...]
Figure 2. Networks in Cellular Systems
To date, cellular networks are most available for the “super-model” organisms (Davis, 2004) yeast, worm, fly, and plant. High-throughput interactome mapping relies upon genome-scale resources such as ORFeome resources. Several types of interactome networks discussed are depicted. In a protein interaction network, nodes represent proteins and edges represent physical interactions. In a transcriptional regulatory network, nodes represent transcription factors (circular nodes) or putative DNA regulatory elements (diamond nodes), and edges represent physical binding between the two. In a disease network, nodes represent diseases, and edges represent genes whose mutations are associated with both linked diseases. In a virus-host network, nodes represent viral proteins (square nodes) or host proteins (round nodes), and edges represent physical interactions between the two. In a metabolic network, nodes represent enzymes, and edges represent metabolites that are products or substrates of the enzymes. The network depictions seem dense, but they represent only small portions of available interactome network maps, which themselves constitute only a few percent of the complete interactomes within cells.
Cell 2011
DISEASES AS NETWORK PERTURBATIONS
  • 9. Network medicine: a network-based approach to human disease
Albert-László Barabási*‡§, Natali Gulbahce*‡|| and Joseph Loscalzo§
Abstract | Given the functional interdependencies between the molecular components in a human cell, a disease is rarely a consequence of an abnormality in a single gene, but reflects the perturbations of the complex intracellular and intercellular network that links tissue and organ systems. The emerging tools of network medicine offer a platform to explore systematically not only the molecular complexity of a particular disease, leading to the identification of disease modules and pathways, but also the molecular relationships among apparently distinct (patho)phenotypes. Advances in this direction are essential for identifying new disease genes, for uncovering the biological significance of disease-associated mutations identified by genome-wide association studies and full-genome sequencing, and for identifying drug targets and biomarkers for complex diseases.
Most cellular components exert their functions through interactions with other cellular components, which can be located either in the same cell or across cells, and even across organs. In humans, the potential complexity of the resulting network (the human interactome) is daunting: with ~25,000 protein-coding genes, ~1,000 metabolites and an undefined number of distinct proteins1 and functional RNA molecules, the number of cellular components that serve as the nodes of the interactome easily exceeds 100,000. The number of functionally relevant interactions between the components of this network, representing the links of the interactome, is expected to be much larger2.
This inter- and intracellular interconnectivity implies that the impact of a specific genetic abnormality is not restricted to the activity of the gene product that carries it, but can spread along the links of the network and alter the activity of gene products that otherwise carry no defects. Therefore, an understanding of a gene’s network context is essential in determining the phenotypic impact of defects that affect it3,4. Following on from this principle, a key hypothesis underlying this Review is that a disease phenotype is rarely a consequence of an abnormality in a single effector gene product, but reflects various pathobiological processes that interact in a complex network. A corollary of this widely held hypothesis is that the interdependencies among a cell’s molecular components lead to deep functional, molecular and causal relationships among apparently distinct phenotypes.
Network-based approaches to human disease have multiple potential biological and clinical applications. A better understanding of the effects of cellular interconnectedness on disease progression may lead to the identification of disease genes and disease pathways, which, in turn, may offer better targets for drug development. These advances may also lead to better and more accurate biomarkers to monitor the functional integrity of networks that are perturbed by diseases as well as to better disease classification. Here we present an overview of the organizing principles that govern cellular networks and the implications of these principles for understanding disease. These principles and the tools and methodologies that are derived from them are facilitating the emergence of a body of knowledge that is increasingly referred to as network medicine5–7.
The human interactome
Although much of our understanding of cellular networks is derived from model organisms, the past decade has seen an exceptional growth in human-specific molecular interaction data8. Most attention has been directed towards molecular networks, including protein interaction networks, whose nodes are proteins that are linked to each other by physical (binding) interactions9,10; metabolic networks, whose nodes are metabolites that are linked if they participate in the same biochemical reactions11–13; regulatory networks, whose directed links represent either regulatory relationships between a transcription factor and a gene14, or post-translational [...]
Predicting disease genes
Disease-associated genes have generally been identified using linkage mapping or, more recently, genome-wide association (GWA) studies53. Both methodologies can suggest large numbers of disease-gene candidates, but identifying the particular gene and the causal mutation remains difficult. Recently, a series of increasingly sophisticated network-based tools have been developed to predict potential disease genes; these tools can be loosely grouped into three categories, as discussed below (FIG. 4).
Linkage methods. These methods assume that the direct interaction partners of a disease protein are likely to be associated with the same disease phenotype45,54–56. Indeed, for one disease locus, the set of genes within the locus whose products interacted with a known [...]
[Figure caption, cropped in extraction: [...] nes in the interactome. a | Of the approximately [...] iated with specific diseases. The figure shows the [...] ssociated genes that were known42 in 2007 and [...] ntial, that is, their absence is associated with [...] agram of the differences between essential and [...] ential disease genes (shown as blue nodes) are [...] eriphery, whereas in utero essential genes (shown [...]]
Figure 2 | Disease modules. Schematic diagram of the three modularity concepts that are discussed in this Review. a | Topological modules correspond to locally dense neighbourhoods of the interactome, such that the nodes of [...]
[...] 111 Dana Research Center, Boston, Massachusetts 02115, USA. ‡Center for Cancer Systems Biology, Dana-Farber Cancer Institute, 44 Binney Street, Boston, Massachusetts 02115, USA. §Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, 75 Francis Street, Boston, Massachusetts 02115, USA. ||Department of Cellular and Molecular Pharmacology, University of California, 1700 4th Street, Byers Hall 309, Box 2530, San Francisco, California 94158, USA. Correspondence to A.-L.B. e-mail: alb@neu.edu doi:10.1038/nrg2918
56 | JANUARY 2011 | VOLUME 12 www.nature.com/reviews/genetics © 2011 Macmillan Publishers Limited. All rights reserved
NETWORK MEDICINE
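The assumption behind the linkage methods mentioned on this slide (direct interaction partners of a known disease protein are candidates for the same disease) can be illustrated with a small filter over a disease locus. This is only a sketch of the idea, not any published tool: the function name `linkage_candidates`, the edge-list input format, and the placeholder gene labels are assumptions made for the example.

```python
def linkage_candidates(interactions, known_disease_genes, locus_genes):
    """Return locus genes whose products directly interact with a known disease gene.

    interactions: iterable of (gene_a, gene_b) undirected interactome links.
    known_disease_genes: genes already associated with the disease.
    locus_genes: genes inside the disease locus under investigation.
    """
    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    known = set(known_disease_genes)
    # Keep a locus gene if at least one of its direct partners is a known disease gene.
    return sorted(
        gene for gene in locus_genes
        if gene not in known and neighbors.get(gene, set()) & known
    )


# Placeholder interactome (not real data): g2 is the only direct partner of g1.
interactions = [("g1", "g2"), ("g2", "g3"), ("g4", "g5")]
print(linkage_candidates(interactions, {"g1"}, ["g2", "g3", "g4"]))
```

The sketch only looks at first neighbors; longer-range or module-based approaches would need distances or clustering on the same adjacency structure.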
  • 10. RESEARCH ARTICLE SUMMARY
DISEASE NETWORKS
Uncovering disease-disease relationships through the incomplete interactome
Jörg Menche, Amitabh Sharma, Maksim Kitsak, Susan Dina Ghiassian, Marc Vidal, Joseph Loscalzo, Albert-László Barabási*
INTRODUCTION: A disease is rarely a straightforward consequence of an abnormality in a single gene, but rather reflects the interplay of multiple molecular processes. The relationships among these processes are encoded in the interactome, a network that integrates all physical interactions within a cell, from protein-protein to regulatory protein–DNA and metabolic interactions. The documented propensity of disease-associated proteins to interact with each other suggests that they tend to cluster in the same neighborhood of the interactome, forming a disease module, a connected subgraph that contains all molecular determinants of a disease. The accurate identification of the corresponding disease module represents the first step toward a systematic understanding of the molecular mechanisms underlying a complex disease. Here, we present a network-based framework to identify the location of disease modules within the interactome and use the overlap between the modules to predict disease-disease relationships.
RATIONALE: Despite impressive advances in high-throughput interactome mapping and disease gene identification, both the interactome and our knowledge of disease-associated genes remain incomplete. This incompleteness prompts us to ask to what extent the current data are sufficient to map out the disease modules, the first step toward an integrated approach toward human disease. To make progress, we must formulate mathematically the impact of network incompleteness on the identifiability of disease [...] quantifying the predictive power and [...] limitations of the current interactome.
RESULTS: Using the tools of network [...] we show that we can only uncover [...] modules for diseases whose number of associated genes exceeds a critical threshold determined by the network incompleteness. We find that [...] proteins associated with 226 diseases are [...] in the same network neighborhood, displaying a statistically significant tendency to form identifiable disease modules. The higher the degree of agglomeration [...] disease proteins within the interactome, [...] higher the biological and functional [...]ity of the corresponding genes. The [...] findings indicate that many local neighborhoods of the interactome represent the observable part of the true, larger and denser [...] modules.
If two disease modules overlap, local perturbations causing one disease can [...] pathways of the other disease module [...] resulting in shared clinical and pathobiological characteristics. To test this hypothesis, we measure the network-based separation [...] each disease pair, observing a direct [...] between the pathobiological similarity [...] diseases and their relative distance [...]
Read the full article at http://dx.doi.org/10.1126/science.1257601
Menche et al., Science 2015
DISEASES AS NETWORK NEIGHBORHOODS
  • 12. Diseases As Local Neighborhoods
  • 13. Diseases As Local Neighborhoods: Network Clustering Means Explainable Biology [Figure labels: diseases Asthma, Parkinson’s, Leukemia, MS, Hypertension, Rheumatoid arthritis, Crohn’s disease, Type 2 diabetes, Glioblastoma, Ulcerative colitis, Heart failure; genes AIF1, ZBTB12, NFKBIZ, MERTK, HHEX, CFB.]
  • 14. Interactome and disease genes
[Figure legend. Multiple sclerosis genes: GWAS, OMIM, GWAS & OMIM. Other disease genes. Molecular interactions: Signalling, Complexes, Kinase-Substrate, Metabolic, Literature, Regulatory, Yeast two-hybrid. Gene with multiple disease associations: OMIM (Immunologic deficiency syndromes, Hematologic diseases, Blood protein disorders); GWAS (Connective tissue diseases, Autoimmune diseases, Joint diseases, Musculoskeletal diseases, Rheumatoid arthritis). Interaction with multiple lines of evidence: Signaling, Complexes, Literature, Regulatory. Labeled genes: AKT1, HLA-B, HLA-C, STAT3, TAP2, NFKBIZ, IL2RA, TNFRSF1A, EHMT2, PTK2, IL7R, MAPK1. Panel title: Observable module for Multiple sclerosis.]
    • The interactome contains 141,296 physical interactions between 13,460 proteins
    • We study 299 diseases with at least 20 gene associations
Menche et al., Science 2015
  • 15. Measures of network localization
    • We use two measures to quantify the interactome-based localization of a disease
    • 226 out of 299 diseases are significantly localized according to both measures
    Figure (multiple sclerosis proteins): (1) size of the largest connected component, S = 11 for multiple sclerosis, compared with random gene sets (frequency vs. size of largest component, data vs. random); (2) distribution of the shortest distance d between disease proteins (frequency vs. shortest distance d, data vs. random). The network panel marks example distances d = 2 and d = 3.
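The two localization measures on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the toy graph, the function names (`largest_component_size`, `mean_shortest_distance`, `z_score`) and the choice of uniformly random gene sets as the null model are all assumptions made for the example.

```python
import random
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def largest_component_size(adj, genes):
    """Size S of the largest connected component of the subgraph induced by genes."""
    genes = set(genes)
    seen, best = set(), 0
    for g in genes:
        if g in seen:
            continue
        seen.add(g)
        queue, size = deque([g]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v in genes and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

def mean_shortest_distance(adj, genes):
    """Mean, over genes, of the distance to the closest other gene of the set."""
    genes = set(genes)
    closest = []
    for g in genes:
        dist = bfs_distances(adj, g)
        others = [dist[h] for h in genes if h != g and h in dist]
        if others:
            closest.append(min(others))
    return sum(closest) / len(closest) if closest else float("inf")

def z_score(adj, genes, measure, n_random=200, seed=0):
    """Compare the observed measure with random gene sets of the same size (assumed null model)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    observed = measure(adj, genes)
    rand = [measure(adj, rng.sample(nodes, len(genes))) for _ in range(n_random)]
    mu = sum(rand) / len(rand)
    sigma = (sum((x - mu) ** 2 for x in rand) / len(rand)) ** 0.5
    return (observed - mu) / sigma if sigma > 0 else 0.0

# Hypothetical toy interactome: a path a-b-c-d-e-f-g-h.
adj = {n: set() for n in "abcdefgh"}
for u, v in zip("abcdefg", "bcdefgh"):
    adj[u].add(v)
    adj[v].add(u)

disease_genes = ["a", "b", "c", "f"]
S = largest_component_size(adj, disease_genes)       # a, b, c form the largest component
d_mean = mean_shortest_distance(adj, disease_genes)  # (1 + 1 + 1 + 3) / 4
z_S = z_score(adj, disease_genes, largest_component_size)
```

A disease would count as localized when S is significantly larger, and the shortest distances significantly smaller, than for the random sets.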
  • 16. Relations between Diseases
    • We introduce a network-based measure to quantify the overlap/separation of two diseases
    • Most disease pairs are well separated on the Interactome
    Figure: distributions P(d) of the shortest distance d (d_AA, d_BB or d_AB) for two disease pairs. Pairwise separated modules (s_AB ≥ 0): peroxisomal disorders (PD) and multiple sclerosis (MS), s_AB = 1.3. Pairwise overlapping modules (s_AB < 0): multiple sclerosis (MS) and rheumatoid arthritis (RA), s_AB = −0.2.
    Menche et al., Science 2015
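The slide does not spell out the separation formula. The sketch below assumes the definition used in Menche et al., s_AB = ⟨d_AB⟩ − (⟨d_AA⟩ + ⟨d_BB⟩)/2, where each d is the distance from a protein to the closest protein of the relevant set. The toy graph and function names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closest_distances(adj, from_set, to_set, exclude_self):
    """For each node in from_set, distance to the closest node in to_set."""
    out = []
    for g in from_set:
        dist = bfs_distances(adj, g)
        cands = [dist[h] for h in to_set
                 if h in dist and not (exclude_self and h == g)]
        if cands:
            out.append(min(cands))
    return out

def separation(adj, A, B):
    """s_AB = <d_AB> - (<d_AA> + <d_BB>) / 2 (assumed definition, see lead-in)."""
    mean = lambda xs: sum(xs) / len(xs)
    d_AA = closest_distances(adj, A, A, exclude_self=True)
    d_BB = closest_distances(adj, B, B, exclude_self=True)
    # A shared protein contributes a distance of 0 to d_AB.
    d_AB = (closest_distances(adj, A, B, exclude_self=False)
            + closest_distances(adj, B, A, exclude_self=False))
    return mean(d_AB) - (mean(d_AA) + mean(d_BB)) / 2

# Hypothetical toy interactome: a path 1-2-3-4-5-6-7-8.
adj = {n: set() for n in range(1, 9)}
for u in range(1, 8):
    adj[u].add(u + 1)
    adj[u + 1].add(u)

s_separated = separation(adj, {1, 2}, {7, 8})          # modules at opposite ends
s_overlapping = separation(adj, {1, 2, 3}, {2, 3, 4})  # modules sharing two nodes
```

With this definition the two ends of the path give a positive s_AB (separated modules) and the two sets sharing nodes give a negative s_AB (overlapping modules), matching the sign convention on the slide.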
  • 17. Network Distance vs. Biomedical Similarity
    • Diseases that are close in the Interactome have similar biomedical properties
    Figure: six panels plotting a similarity measure against the network separation s_AB (legend entries: mean, expectation): GO term similarity for biological process, molecular function and cellular component; co-expression; symptom similarity; comorbidity (relative risk RR).
  • 18. The Disease Space
    • Diseases and their network-based relationships can be represented in a 3D disease space
    • Diseases belonging to the same class agglomerate
    Figure labels, grouped by class: Immune System Diseases (type 1 diabetes, rheumatoid arthritis, autoimmune diseases of the nervous system, demyelinating autoimmune diseases); Eye Diseases (retinitis pigmentosa, retinal degeneration, Graves disease, macular degeneration); Respiratory Tract Diseases (asthma, respiratory hypersensitivity); Cardiovascular Diseases (cerebrovascular disorders, myocardial infarction, coronary artery disease, myocardial ischemia).
    Menche et al., Science 2015
  • 19. Overlapping diseases
    • Examples of unexpected disease relationships uncovered using the disease space
    Figure: gene network of celiac disease and asthma, annotated with the pathway "Intestinal immune network for IgA production". Other diseases labeled: atherosclerosis, coronary artery disease, biliary tract diseases, hepatic cirrhosis.
  • 20. BUILDING DISEASE MODULES
    Sharma et al., HMG 2015
    ORIGINAL ARTICLE: A disease module in the interactome explains disease heterogeneity, drug response and captures novel pathways and genes in asthma
    Amitabh Sharma1,2,3,†, Jörg Menche1,2,4,8,†, C. Chris Huang5, Tatiana Ort5, Xiaobo Zhou3, Maksim Kitsak1,2, Nidhi Sahni2, Derek Thibault3, Linh Voung3, Feng Guo3, Susan Dina Ghiassian1,2, Natali Gulbahce6, Frédéric Baribaud5, Joel Tocker5, Radu Dobrin5, Elliot Barnathan5, Hao Liu5, Reynold A. Panettieri Jr7, Kelan G. Tantisira3, Weiliang Qiu3, Benjamin A. Raby3, Edwin K. Silverman3, Marc Vidal2,9, Scott T. Weiss3 and Albert-László Barabási1,2,3,4,8,*
    1 Center for Complex Networks Research, Department of Physics, Northeastern University, Boston, MA 02115, USA
    2 Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
    3 Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
    4 Department of Theoretical Physics, Budapest University of Technology and Economics, H1111, Budapest, Hungary
    5 Janssen Research & Development, Inc., 1400 McKean Road, Spring House, PA 19477, USA
    6 Department of Cellular and Molecular Pharmacology, University of California, 1700 4th Street, Byers Hall 308D, San Francisco, CA 94158, USA
    7 Pulmonary Allergy and Critical Care Division, Department of Medicine, University of Pennsylvania, 125 South 31st Street, TRL Suite 1200, Philadelphia, PA 19104, USA
    8 Center for Network Science, Central European University, Nador u. 9, 1051 Budapest, Hungary
    9 Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
    † These authors contributed equally to this work.
    * To whom correspondence should be addressed at: Center for Complex Networks Research, Department of Physics, Northeastern University, Boston, MA 02115, USA. Email: barabasi@gmail.com
    Abstract: Recent advances in genetics have spurred rapid progress towards the systematic identification of genes involved in complex diseases. Still, the detailed understanding of the molecular and physiological mechanisms through which these genes affect disease phenotypes remains a major challenge. Here, we identify the asthma disease module, i.e. the local neighborhood of the interactome whose perturbation is associated with asthma, and validate it for functional and pathophysiological relevance, using both computational and experimental approaches. We find that the asthma disease module is enriched with modest GWAS P-values against the background of random variation, and with differentially expressed genes from normal and asthmatic fibroblast cells treated with an asthma-specific drug. The asthma module also contains immune response mechanisms that are shared with other immune-related disease modules. Further, using diverse omics (genomics, ...
    Received: September 1, 2014. Revised: November 19, 2014. Accepted: January 5, 2015. Advance Access Publication Date: 12 January 2015.
    Human Molecular Genetics, 2015, Vol. 24, No. 11, 3005–3020. doi: 10.1093/hmg/ddv001. © The Author 2015. Published by Oxford University Press.

    Ghiassian et al., PLoS Comp Biol 2015
    RESEARCH ARTICLE: A DIseAse MOdule Detection (DIAMOnD) Algorithm Derived from a Systematic Analysis of Connectivity Patterns of Disease Proteins in the Human Interactome
    Susan Dina Ghiassian1,2,☯, Jörg Menche1,2,3,☯, Albert-László Barabási1,2,3,4,*
    1 Center for Complex Networks Research and Department of Physics, Northeastern University, Boston, Massachusetts, United States of America
    2 Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
    3 Center for Network Science, Central European University, Budapest, Hungary
    4 Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
    ☯ These authors contributed equally to this work.
    * barabasi@gmail.com
    Abstract: The observation that disease associated proteins often interact with each other has fueled the development of network-based approaches to elucidate the molecular mechanisms of human disease. Such approaches build on the assumption that protein interaction networks can be viewed as maps in which diseases can be identified with localized perturbation within a certain neighborhood. The identification of these neighborhoods, or disease modules, is therefore a prerequisite of a detailed investigation of a particular pathophenotype. While numerous heuristic methods exist that successfully pinpoint disease associated modules, ...
    Citation: Ghiassian SD, Menche J, Barabási A-L (2015) A DIseAse MOdule Detection (DIAMOnD) Algorithm Derived from a Systematic Analysis of Connectivity Patterns of Disease Proteins in the Human Interactome. PLoS Comput Biol 11(4): e1004120. doi:10.1371/journal.pcbi.1004120 (open access)
    Editor: Andrey Rzhetsky, University of Chicago, UNITED STATES. Received: August 25, 2014. Accepted: January 9, 2015.
  • 21. Disease Module Detection and Analysis
    The general workflow of a detailed analysis for a disease of interest:
    I. Interactome construction: binary interactions, metabolic couplings, regulatory interactions, ...
    II. Disease module identification: seed gene selection (OMIM, GWAS, literature); DIAMOnD (Disease Module Detection Algorithm)
    III. Validation: gene expression data, Gene Ontologies, pathways, comorbidity
    IV. Biological interpretation: pathway prioritization, molecular mechanism
  • 22. DIAM DISEASE MODULES VS COMMUNITIES
  • 23. DIAMOnD: Disease Module Detection Algorithm
    • Purely topological method
    • All genes in the network are prioritized according to their potential relevance for the disease
    Figure (panels A to C): original seed genes; genes connected to a seed gene, annotated with p-values (0.18, 0.46, 0.46, 0.07, 0.53, 0.46, 0.21, 0.46, 0.29); the gene selected at iteration i (initial seeds, then iterations 1, 2 and 3); the proto-module; and the resulting disease module made of seed genes and DIAMOnD genes.
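The iterative selection shown on this slide can be sketched as follows. This is a simplified illustration, not the published implementation. It assumes that each candidate gene is scored by a hypergeometric p-value for having at least k_s of its k links pointing into the current module, and that the gene with the lowest p-value is added at each iteration. The names `connectivity_pvalue` and `diamond_sketch` and the toy network are hypothetical.

```python
from math import comb

def connectivity_pvalue(n_nodes, n_seeds, k, k_s):
    """P(X >= k_s) when k links are drawn among n_nodes nodes, n_seeds of which are seeds."""
    total = comb(n_nodes, k)
    tail = sum(comb(n_seeds, i) * comb(n_nodes - n_seeds, k - i)
               for i in range(k_s, min(k, n_seeds) + 1))
    return tail / total

def diamond_sketch(adj, seeds, n_add):
    """Greedily add the n_add genes most significantly connected to the growing module."""
    module = set(seeds)
    added = []
    n_nodes = len(adj)
    for _ in range(n_add):
        best = None
        for node in sorted(adj):           # sorted for deterministic tie-breaking
            if node in module:
                continue
            k = len(adj[node])
            k_s = sum(1 for v in adj[node] if v in module)
            if k_s == 0:                    # only genes connected to the module are candidates
                continue
            p = connectivity_pvalue(n_nodes, len(module), k, k_s)
            if best is None or p < best[0]:
                best = (p, node)
        if best is None:
            break
        module.add(best[1])                 # the selected gene acts as a seed from now on
        added.append(best[1])
    return added

# Hypothetical toy network: x touches both seeds; hub h touches one seed and five other genes.
edges = [("s1", "x"), ("s2", "x"), ("s1", "h")] + [("h", f"p{i}") for i in range(1, 6)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

ranking = diamond_sketch(adj, {"s1", "s2"}, n_add=2)
```

In the toy network the low-degree gene x, with both of its links into the seed set, is picked before the hub h, which has only one of six links to a seed. Ranking by connectivity significance rather than by the raw number of seed links is the design choice being illustrated.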
  • 24. DIAMOnD and Disease Modules within the Human BUILDING A DISEASE MODULE
  • 25. DRUGS
    Guney et al., Nature Comm 2015
    ARTICLE: Network-based in silico drug efficacy screening
    Emre Guney1,2, Jörg Menche1,3, Marc Vidal2,4 & Albert-László Barabási1,2,3,5
    Received 7 May 2015 | Accepted 29 Nov 2015 | Published 1 Feb 2016 | DOI: 10.1038/ncomms10331 | OPEN
    The increasing cost of drug development together with a significant drop in the number of new drug approvals raises the need for innovative approaches for target identification and efficacy prediction. Here, we take advantage of our increasing understanding of the network-based origins of diseases to introduce a drug-disease proximity measure that quantifies the interplay between drug targets and diseases. By correcting for the known biases of the interactome, proximity helps us uncover the therapeutic effect of drugs, as well as to distinguish palliative from effective treatments. Our analysis of 238 drugs used in 78 diseases indicates that the therapeutic effect of drugs is localized in a small network neighborhood of the disease genes and highlights efficacy issues for drugs used in Parkinson and several inflammatory disorders. Finally, network-based proximity allows us to predict novel drug-disease associations that offer unprecedented opportunities for drug repurposing and the detection of adverse effects.
  • 26. DRUGS
    Figure 1 | Network-based drug-disease proximity. (a) Illustration of the closest distance (dc) of a drug T with targets t1 and t2 to the proteins s1, s2 and s3 ...
    Panel a: disease genes, drug targets, and the shortest path from each target to the closest disease gene; worked example d = (2 + 3)/2; the z-score is obtained by comparison with random gene sets with the same degrees.
    Panels b, c: gliclazide and daunorubicin drug targets against the disease genes of type 2 diabetes and acute myeloid leukaemia. Values appearing in the panels: dc = 2.5, 1.0, 2.0, 1.0; zc = 1.3, −1.6, 1.0, −3.3. Labeled genes: ABCC8, VEGFA, RUNX1, INS, KAT6A, TOP2A, IRS1, TOP2B, CAPN10, NPM1.
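The closest-distance proximity in panel a can be reproduced on a toy network. The sketch below is an illustration under stated assumptions: d_c is taken as the average, over drug targets, of the distance to the nearest disease gene; the z-score compares it with random target and disease-gene sets whose nodes have exactly the same degrees (the paper works with degree-matched sets, and exact matching is a simplification made here). The graph and all function names are hypothetical.

```python
import random
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closest_distance(adj, targets, disease_genes):
    """d_c: mean over targets of the distance to the nearest disease gene."""
    per_target = []
    for t in targets:
        dist = bfs_distances(adj, t)
        reachable = [dist[s] for s in disease_genes if s in dist]
        if reachable:
            per_target.append(min(reachable))
    return sum(per_target) / len(per_target) if per_target else float("inf")

def degree_matched_sample(adj, genes, rng):
    """Random node set with the same degree sequence as genes (exact-degree simplification)."""
    by_degree = {}
    for n in sorted(adj):
        by_degree.setdefault(len(adj[n]), []).append(n)
    need = {}
    for g in genes:
        need[len(adj[g])] = need.get(len(adj[g]), 0) + 1
    sample = []
    for deg, count in sorted(need.items()):
        sample += rng.sample(by_degree[deg], count)
    return sample

def proximity_z(adj, targets, disease_genes, n_random=500, seed=0):
    """z_c = (d_c - mean of random d_c) / standard deviation of random d_c."""
    rng = random.Random(seed)
    d = closest_distance(adj, targets, disease_genes)
    rand = [closest_distance(adj,
                             degree_matched_sample(adj, targets, rng),
                             degree_matched_sample(adj, disease_genes, rng))
            for _ in range(n_random)]
    mu = sum(rand) / len(rand)
    sigma = (sum((x - mu) ** 2 for x in rand) / len(rand)) ** 0.5
    return (d - mu) / sigma if sigma > 0 else 0.0

# Hypothetical toy network: t1 is 2 steps from s1, t2 is 3 steps from s1 or s2.
edges = [("s1", "a"), ("a", "t1"), ("s2", "b"), ("b", "c"), ("c", "t2"), ("t1", "t2")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

d_c = closest_distance(adj, ["t1", "t2"], ["s1", "s2"])  # (2 + 3) / 2, as in the slide example
z_c = proximity_z(adj, ["t1", "t2"], ["s1", "s2"])
```

A negative z_c means the targets sit closer to the disease genes than expected by chance. In this toy case the targets are the farthest possible pair, so z_c comes out positive.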
  • 27. PROXIMITY TO DISEASE MODULES
    Figure (panels a to c): five distance measures between a drug and a disease module, compared by AUC (%): closest (dc), shortest (ds), kernel (dk), center (dcc) and separation (dss). Panel annotations include R2 = 0.175 and R2 = 0.003.
    Source: Nature Communications, DOI: 10.1038/ncomms10331
  • 28. GOING FURTHER Full text: http://barabasi.com/networksciencebook/