Introduction to Network Medicine

Marc Santolini

Center for Complex Network Research (CCNR)

Reductionism, which has dominated biological research for over a century, has provided a wealth of knowledge about individual cellular components and their functions. Despite its enormous success, it is increasingly clear that a discrete biological function can only rarely be attributed to an individual molecule. Instead, most biological characteristics arise from complex interactions between the cell’s numerous constituents, such as proteins, DNA, RNA and small molecules (REFS 1–8). Therefore, a key challenge for biology in the twenty-first century is to understand the structure and the dynamics of the complex intercellular web of interactions that contribute to the structure and function of a living cell.
The development of high-throughput data-collection techniques, as epitomized by the widespread use of microarrays, allows for the simultaneous interrogation of the status of a cell’s components at any given time. In turn, new technology platforms, such as PROTEIN CHIPS or semi-automated YEAST TWO-HYBRID SCREENS, help to determine how and when these molecules interact with each other. Various types of interaction webs, or networks (including protein–protein interaction, metabolic, signalling and transcription-regulatory networks), emerge from the sum of these interactions. None of these networks are independent; instead they form a ‘network of networks’ that is responsible for the behaviour of the cell. A major challenge of contemporary biology is to [...] programme to map out, understand and model in quantifiable terms the topological and dynamic properties of the various networks that control the behaviour of the cell.
Help along the way is provided by the rapidly developing theory of complex networks that, in the past few years, has made advances towards uncovering the organizing principles that govern the formation and evolution of various complex technological and social networks (REFS 9–12). This research is already making an impact on cell biology. It has led to the realization that the architectural features of molecular interaction networks within a cell are shared to a large degree by other complex systems, such as the Internet, computer chips and society. This unexpected universality indicates that similar laws may govern most complex networks in nature, which allows the expertise from large and well-mapped non-biological systems to be used to characterize the intricate interwoven relationships that govern cellular functions.
In this review, we show that the quantifiable tools of network theory offer unforeseen possibilities to understand the cell’s internal organization and evolution, fundamentally altering our view of cell biology. The emerging results are forcing the realization that, notwithstanding the importance of individual molecules, cellular function is a contextual attribute of strict and quantifiable patterns of interactions between the myriad of cellular constituents. Although uncovering
NETWORK BIOLOGY:
UNDERSTANDING THE CELL’S
FUNCTIONAL ORGANIZATION
Albert-László Barabási* & Zoltán N. Oltvai‡
A key aim of postgenomic biomedical research is to systematically catalogue all molecules and
their interactions within a living cell. There is a clear need to understand how these molecules and
the interactions between them determine the function of this enormously complex machinery, both
in isolation and when surrounded by other cells. Rapid advances in network biology indicate that
cellular networks are governed by universal laws and offer a new conceptual framework that could
potentially revolutionize our view of biology and disease pathologies in the twenty-first century.
R E V I E W S
Barabasi et al., Nat Rev Genet 2004
NATURE REVIEWS | GENETICS VOLUME 5 | FEBRUARY 2004 | 105
Ba; blue nodes). In the Barabási–Albert model of a scale-free network, at each time point a node with M links is added to the network, which connects to an already existing node I with probability $\Pi_I = k_I / \sum_J k_J$, where $k_I$ is the degree of node I (FIG. 3) and J is the index denoting the sum over network nodes. The network that is generated by this growth process has a power-law degree distribution that is characterized by the degree exponent γ = 3. Such distributions are seen as a straight line on a log–log plot (see figure, part Bb). The network that is created by the Barabási–Albert model does not have an inherent modularity, so C(k) is independent of k (see figure, part Bc). Scale-free networks with degree exponents 2 < γ < 3, a range that is observed in most biological and non-biological networks, are ultra-small (REFS 34,35), with the average path length following $\ell \sim \log \log N$, which is significantly shorter than the log N that characterizes random small-world networks.
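The growth rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' procedure: the function names, the seed graph (a small complete graph), the network size and the random seed are all assumptions, and forcing each new node to pick distinct targets departs slightly from sampling strictly in proportion to degree.

```python
import random
from collections import Counter

def barabasi_albert(n_nodes, m_links, seed=0):
    """Grow a network by preferential attachment; return a list of edges."""
    rng = random.Random(seed)
    # Seed graph (an assumption): a complete graph on m_links + 1 nodes.
    edges = [(i, j) for i in range(m_links + 1) for j in range(i)]
    # Each node appears in `stubs` once per link it holds, so a uniform draw
    # from `stubs` selects node I with probability k_I / sum_J k_J.
    stubs = [node for edge in edges for node in edge]
    for new in range(m_links + 1, n_nodes):
        targets = set()
        while len(targets) < m_links:
            targets.add(rng.choice(stubs))
        for old in targets:
            edges.append((new, old))
            stubs.extend((new, old))
    return edges

def degrees(edges):
    """Map each node to its degree k."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

edges = barabasi_albert(n_nodes=2000, m_links=2)
deg = degrees(edges)
mean_k = sum(deg.values()) / len(deg)
max_k = max(deg.values())
print(f"mean degree {mean_k:.2f}, largest hub degree {max_k}")
```

With these settings the largest degree should come out far above the mean, which is the hub signature the text attributes to scale-free networks; plotting the degree counts on log–log axes would be the natural next step.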
Hierarchical networks
To account for the coexistence of modularity, local clustering and scale-free topology in many real systems it has to be assumed that clusters combine in an iterative manner, generating a hierarchical network (REFS 47,53) (see figure, part C). The starting point of this construction is a small cluster of four densely linked nodes (see the four central nodes in figure, part Ca). Next, three replicas of this module are generated and the three external nodes of the replicated clusters connected to the central node of the old cluster, which produces a large 16-node module. Three replicas of this 16-node module are then generated and the 16 peripheral nodes connected to the central node of the old module, which produces a new module of 64 nodes. The hierarchical network model seamlessly integrates a scale-free topology with an inherent modular structure by generating a network that has a power-law degree distribution with degree exponent $\gamma = 1 + \ln 4 / \ln 3 = 2.26$ (see figure, part Cb) and a large, system-size independent average clustering coefficient <C> ~ 0.6. The most important signature of hierarchical modularity is the scaling of the clustering coefficient, which follows $C(k) \sim k^{-1}$, a straight line of slope –1 on a log–log plot (see figure, part Cc). A hierarchical architecture implies that sparsely connected nodes are part of highly clustered areas, with communication between the different highly clustered neighbourhoods being maintained by a few hubs (see figure, part Ca).
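The replication steps described above can be sketched as follows. This is a hedged illustration, not the published construction: the function name is hypothetical, node 0 is taken as the central node, and which nodes count as "external" or "peripheral" at each step is an assumption (here, the outer nodes of the replicas), so the number of links drawn to the centre need not match the description above. The module sizes of 4, 16 and 64 nodes do follow from the replication rule, and the last line simply evaluates the quoted exponent.

```python
import math
from itertools import combinations

def hierarchical_network(levels):
    """Return (edges, n_nodes) after `levels` replication steps.

    Node 0 is the central node throughout. `peripheral` tracks the nodes
    that get wired to the centre at the next step (an assumption: the outer
    nodes of the replicas).
    """
    n = 4
    edges = set(combinations(range(4), 2))   # four densely linked nodes
    peripheral = [1, 2, 3]                   # the three external nodes
    for _ in range(levels):
        new_edges = set(edges)
        new_peripheral = []
        for r in (1, 2, 3):                  # three replicas of the module
            shift = r * n
            new_edges |= {(a + shift, b + shift) for a, b in edges}
            for p in peripheral:
                new_edges.add((0, p + shift))  # link replica to old centre
                new_peripheral.append(p + shift)
        edges, peripheral, n = new_edges, new_peripheral, 4 * n
    return edges, n

for level in range(3):
    edges, n = hierarchical_network(level)
    hub_degree = sum(1 for a, b in edges if 0 in (a, b))
    print(f"step {level}: {n} nodes, {len(edges)} links, hub degree {hub_degree}")

gamma = 1 + math.log(4) / math.log(3)
print(f"degree exponent 1 + ln4/ln3 = {gamma:.2f}")
```

The central node accumulates links at every step while the replicas keep their dense local wiring, which is the qualitative picture of a few hubs bridging highly clustered neighbourhoods.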
[Figure: A, random network; B, scale-free network; C, hierarchical network. For each type, panel a shows the network, panel b the degree distribution P(k) against k, and panel c the clustering coefficient C(k) against k.]
SCALE-FREE NETWORKS

mathematical properties of random networks (REF. 14). Their much-investigated random network model assumes that a fixed number of nodes are connected randomly to each other (BOX 2). The most remarkable property of the model is its ‘democratic’ or uniform character, characterizing the degree, or connectivity (k; BOX 1), of the individual nodes. Because, in the model, the links are placed randomly among the nodes, it is expected that some nodes collect only a few links whereas others collect many more. In a random network, the nodes’ degrees follow a Poisson distribution, which indicates that most nodes have roughly the same number of links, approximately equal to the network’s average degree, <k> (where <> denotes the average); nodes that have significantly more or fewer links than <k> are absent or very rare (BOX 2).
Despite its elegance, a series of recent findings indicate that the random network model cannot explain the topological properties of real networks. The deviations from the random model have several key signatures, the most striking being the finding that, in contrast to the Poisson degree distribution, for many social and technological networks the number of nodes with a given degree follows a power law. That is, the probability that a chosen node has exactly k links follows $P(k) \sim k^{-\gamma}$, where γ is the degree exponent, with its value for most networks being between 2 and 3 (REF. 15). Networks that are characterized by a power-law degree distribution are highly non-uniform: most of the nodes have only a few links. A few nodes with a very large number of links, which are often called hubs, hold these nodes together. Networks with a power degree
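As a rough check on the Poisson claim, the sketch below builds a random network by linking each pair of nodes independently with a fixed probability and compares the observed degree frequencies with a Poisson distribution of the same mean. Linking pairs independently is one common reading of "connected randomly" and is an assumption here, as are the function names, the network size, the link probability and the random seed.

```python
import math
import random
from collections import Counter

def random_network_degrees(n_nodes, p_link, seed=0):
    """Link every pair of nodes independently with probability p_link."""
    rng = random.Random(seed)
    deg = [0] * n_nodes
    for i in range(n_nodes):
        for j in range(i):
            if rng.random() < p_link:
                deg[i] += 1
                deg[j] += 1
    return deg

def poisson_pmf(k, mean):
    """Poisson probability of observing exactly k links."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

n_nodes, p_link = 1000, 0.01
deg = random_network_degrees(n_nodes, p_link)
mean_k = sum(deg) / n_nodes
histogram = Counter(deg)
for k in range(0, 25, 4):
    observed = histogram[k] / n_nodes
    print(f"k={k:2d}  observed {observed:.3f}  Poisson {poisson_pmf(k, mean_k):.3f}")
print(f"<k> = {mean_k:.2f}, largest degree = {max(deg)}")
```

The largest degree in such a network stays within a small multiple of <k>, in contrast to the hubs that appear when degrees follow a power law.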
Figure 2 | Yeast protein interaction network. A map of protein–protein interactions (REF. 18) in Saccharomyces cerevisiae, which is based on early yeast two-hybrid measurements (REF. 23), illustrates that a few highly connected nodes (which are also known as hubs) hold the network together. The largest cluster, which contains ~78% of all proteins, is shown. The colour of a node indicates the phenotypic effect of removing the corresponding protein (red = lethal, green = non-lethal, orange = slow growth, yellow = unknown). Reproduced with permission from REF. 18 ©
Jeong et al., “Lethality and centrality in protein networks”, Nature 2001
THE YEAST INTERACTOME

FABRICATING HUBS
major engineer of the genomic landscape, it is likely to be a key mechanism for generating the scale-free topology.
Two further results offer direct evidence that network growth is responsible for the observed topological features. The scale-free model (BOX 2) predicts that the nodes that appeared early in the history of the network are the most connected ones (REF. 15). Indeed, an inspection of the metabolic hubs indicates that the remnants of the RNA world, such as coenzyme A, NAD and GTP, are among the most connected substrates of the metabolic network, as are elements of some of the most ancient metabolic pathways, such as glycolysis and the tricarboxylic acid cycle (REF. 17). In the context of the protein interaction networks, cross-genome comparisons have found that, on average, the evolutionarily older proteins have more links to other proteins than their younger counterparts (REFS 45,46). This offers direct empirical evidence for preferential attachment.
Motifs, modules and hierarchical networks
Cellular functions are likely to be carried out in a highly modular manner (REF. 1). In general, modularity refers to a group of physically or functionally linked molecules (nodes) that work together to achieve a (relatively) distinct function (REFS 1,6,8,47). Modules are seen in many systems, for example, circles of friends in social networks or websites that are devoted to similar topics on the World Wide Web. Similarly, in many complex engineered systems, from a modern aircraft to a computer chip, a highly modular structure is a fundamental design
[Figure 3, panels a and b: genes and proteins before duplication and after duplication.]
Figure 3 | The origin of the scale-free topology and hubs in biological networks. The origin of the scale-free topology

NETWORK MOTIFS
Superfamilies of Evolved and Designed Networks
Ron Milo, Shalev Itzkovitz, Nadav Kashtan, Reuven Levitt, Shai Shen-Orr, Inbal Ayzenshtat, Michal Sheffer, Uri Alon*
Complex biological, technological, and sociological networks can be of very different sizes and connectivities, making it difficult to compare their structures. Here we present an approach to systematically study similarity in the local structure of networks, based on the significance profile (SP) of small subgraphs in the network compared to randomized networks. We find several superfamilies of previously unrelated networks with very similar SPs. One superfamily, including transcription networks of microorganisms, represents “rate-limited” information-processing networks strongly constrained by the response time of their components. A distinct superfamily includes protein signaling, developmental genetic networks, and neuronal wiring. Additional superfamilies include power grids, protein-structure networks and geometric networks, World Wide Web links and social networks, and word-adjacency networks from different languages.
Many networks in nature share global properties (1, 2). Their degree sequences (the number of edges per node) often follow a long-tailed distribution, in which some nodes are much more connected than the average (3). In addition, natural networks often show the small-world property of short paths between nodes and highly clustered connections (1, 2, 4). Despite these global similarities, networks from different fields can have very different local structure (5). It was recently found that networks display certain patterns, termed “network motifs,” at much higher frequency than expected in randomized networks (6, 7). In biological networks, these motifs were suggested to be recurring circuit elements that carry out key information-processing tasks (6, 8–10).
Departments of Molecular Cell Biology, Physics of Complex Systems, and Computer Science, Weizmann Institute of Science, Rehovot 76100, Israel.
*To whom correspondence should be addressed at Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel. E-mail: urialon@weizmann.ac.il
5 MARCH 2004 VOL 303 SCIENCE www.sciencemag.org
To understand the design principles of complex networks, it is important to compare the local structure of networks from different fields. The main difficulty is that these networks can be of vastly different sizes [for example, World Wide Web (WWW) hyperlink networks with millions of nodes and social networks with tens of nodes] and degree sequences. Here, we present an approach for comparing network local structure, based on the significance profile (SP). To calculate the SP of a network, the network is compared to an ensemble of randomized networks with the same degree sequence. The comparison to randomized networks compensates for effects due to network size and degree sequence. For each subgraph i, the statistical significance is described by the Z score (11):
$Z_i = (N\mathrm{real}_i - \langle N\mathrm{rand}_i \rangle) / \mathrm{std}(N\mathrm{rand}_i)$
where $N\mathrm{real}_i$ is the number of times the subgraph appears in the network, and $\langle N\mathrm{rand}_i \rangle$ and $\mathrm{std}(N\mathrm{rand}_i)$ are the mean and standard deviation of its appearances in the randomized network ensemble. The SP is the vector of Z scores normalized to length 1:
$\mathrm{SP}_i = Z_i / (\sum_i Z_i^2)^{1/2}$
The normalization emphasizes the relative significance of subgraphs, rather than the absolute significance. This is important for comparison of networks of different sizes, because motifs (subgraphs that occur much more often than expected at random) in large networks tend to display higher Z scores than motifs in small networks (7).
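The two formulas above translate directly into code. The sketch below is illustrative only: the function names are hypothetical, the subgraph counts are made-up numbers rather than data from the paper, and using the population standard deviation (`pstdev`) rather than the sample one is an assumption, since the text does not say which is meant.

```python
import math
from statistics import mean, pstdev

def z_score(n_real, n_rand):
    """Z_i = (Nreal_i - <Nrand_i>) / std(Nrand_i) for one subgraph type."""
    return (n_real - mean(n_rand)) / pstdev(n_rand)

def significance_profile(z_scores):
    """Normalize the vector of Z scores to length 1."""
    norm = math.sqrt(sum(z * z for z in z_scores))
    return [z / norm for z in z_scores]

# Made-up counts for three subgraph types (illustrative only).
real_counts = [40, 12, 5]
random_counts = [
    [10, 12, 8, 11, 9],    # subgraph 1 in five randomized networks
    [11, 13, 12, 10, 14],  # subgraph 2
    [9, 10, 11, 10, 10],   # subgraph 3
]
z = [z_score(r, rand) for r, rand in zip(real_counts, random_counts)]
sp = significance_profile(z)
print([round(v, 2) for v in z])
print([round(v, 2) for v in sp])
```

Because the profile is rescaled to unit length, multiplying every Z score by the same factor leaves it unchanged, which is why profiles from large and small networks can be compared on the same footing.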
We present in Fig. 1 the SP of the 13 possible directed connected triads (triad significance profile, TSP) for networks from different fields (12). The TSP of these networks is almost always insensitive to removal of 30% of the edges or to addition of 50% new edges at random, demonstrating that it is robust to missing data or random data errors (SOM Text). Several superfamilies of networks with similar TSPs emerge from this analysis. One superfamily includes sensory transcription networks that control gene expression in bacteria and yeast in response to external stimuli. In these transcription networks, the nodes represent genes or operons and the edges represent direct transcriptional regulation (6, 13–15). Networks from three microorganisms, the bacteria Escherichia coli (6) and Bacillus subtilis (14) and the yeast Saccharomyces cerevisiae (7, 15), were analyzed. The networks have very similar TSPs (correlation coefficient c > 0.99). They show one strong motif, triad 7, termed “feedforward loop.” The feedforward loop has been theoretically and experimentally shown
Fig. 1. The triad significance profile (TSP) of networks from various disciplines. The TSP shows the normalized significance level (Z score) for each of the 13 triads. Networks with similar characteristic profiles are
URCHIN N = 45, E = 83), and synaptic connections between neurons in C. elegans (NEURONS N = 280, E = 2170). (iii) WWW hyperlinks between Web pages in the www.nd.edu site (3) (WWW-1 N = 325729,

The human disease network
Kwang-Il Goh*†‡§
, Michael E. Cusick†‡¶
, David Valleʈ
, Barton Childsʈ
, Marc Vidal†‡¶
**, and Albert-La´szlo´ Baraba´si*†‡
**
*Center for Complex Network Research and Department of Physics, University of Notre Dame, Notre Dame, IN 46556; †Center for Cancer Systems Biology
(CCSB) and ¶Department of Cancer Biology, Dana–Farber Cancer Institute, 44 Binney Street, Boston, MA 02115; ‡Department of Genetics, Harvard Medical
School, 77 Avenue Louis Pasteur, Boston, MA 02115; §Department of Physics, Korea University, Seoul 136-713, Korea; and ʈDepartment of Pediatrics and the
McKusick–Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205
Edited by H. Eugene Stanley, Boston University, Boston, MA, and approved April 3, 2007 (received for review February 14, 2007)
A network of disorders and disease genes linked by known disorder–gene associations offers a platform to explore in a single graph-theoretic framework all known phenotype and disease gene associations, indicating the common genetic origin of many diseases. Genes associated with similar disorders show both higher likelihood of physical interactions between their products and higher expression profiling similarity for their transcripts, supporting the existence of distinct disease-specific functional modules. We find that essential human genes are likely to encode hub proteins and are expressed widely in most tissues. This suggests that disease genes also would play a central role in the human interactome. In contrast, we find that the vast majority of disease genes are nonessential and show no tendency to encode hub proteins, and their expression pattern indicates that they are localized in the functional periphery of the network. A selection-based model explains the observed difference between essential and disease genes and also suggests that diseases caused by somatic mutations should not be peripheral, a prediction we confirm for cancer genes.
biological networks | complex networks | human genetics | systems biology | diseasome
Decades-long efforts to map human disease loci, at first genetically and later physically (1), followed by recent positional cloning of many disease genes (2) and genome-wide association studies (3), have generated an impressive list of disorder–gene association pairs (4, 5). In addition, recent efforts to map the protein–protein interactions in humans (6, 7), together with efforts to curate an extensive map of human metabolism (8) and regulatory networks offer increasingly detailed maps of the relationships between different disease genes. Most of the successful studies building on these new approaches have focused, however, on a single disease, using network-based tools to gain a better understanding of the relationship between the genes implicated in a selected disorder (9).
Here we take a conceptually different approach, exploring whether human genetic disorders and the corresponding disease genes might be related to each other at a higher level of cellular and organismal organization. Support for the validity of this approach is provided by examples of genetic disorders that arise from mutations in more than a single gene (locus heterogeneity). For example, Zellweger syndrome is caused by mutations in any of at least 11 genes, all associated with peroxisome biogenesis (10). Similarly, there are many examples of different mutations in the same gene (allelic heterogeneity) giving rise to phenotypes currently classified as different disorders. For example, mutations in TP53 have been linked to 11 clinically distinguishable cancer-related disorders (11). Given the highly interlinked internal organization of the cell (12–17), it should be possible to improve the single gene–single disorder approach by developing a conceptual framework to link systematically all genetic disorders (the human “disease phenome”) with the complete list of disease genes (the “disease genome”), resulting in a global view of the “diseasome,” the combined set of all known disorder/disease gene associations.
Results
Construction of the Diseasome. We constructed a bipartite graph consisting of two disjoint sets of nodes. One set corresponds to all known genetic disorders, whereas the other set corresponds to all known disease genes in the human genome (Fig. 1). A disorder and a gene are then connected by a link if mutations in that gene are implicated in that disorder. The list of disorders, disease genes, and associations between them was obtained from the Online Mendelian Inheritance in Man (OMIM; ref. 18), a compendium of human disease genes and phenotypes. As of December 2005, this list contained 1,284 disorders and 1,777 disease genes. OMIM initially focused on monogenic disorders but in recent years has expanded to include complex traits and the associated genetic mutations that confer susceptibility to these common disorders (18). Although this history introduces some biases, and the disease gene record is far from complete, OMIM represents the most complete and up-to-date repository of all known disease genes and the disorders they confer. We manually classified each disorder into one of 22 disorder classes based on the physiological system affected [see supporting information (SI) Text, SI Fig. 5, and SI Table 1 for details].
Starting from the diseasome bipartite graph we generated two biologically relevant network projections (Fig. 1). In the “human disease network” (HDN) nodes represent disorders, and two disorders are connected to each other if they share at least one gene in which mutations are associated with both disorders (Figs. 1 and 2a). In the “disease gene network” (DGN) nodes represent disease genes, and two genes are connected if they are associated with the same disorder (Figs. 1 and 2b). Next, we discuss the potential of these networks to help us understand and represent in a single framework all known disease gene and phenotype associations.
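The two projections described above can be sketched on a toy bipartite graph. Everything in this example is an assumption made for illustration: the disorder and gene names are hypothetical placeholders rather than OMIM entries, and the `project` helper is not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations

# Toy disorder-gene associations; all names are hypothetical placeholders.
associations = [
    ("disorder_A", "gene_1"),
    ("disorder_A", "gene_2"),
    ("disorder_B", "gene_2"),
    ("disorder_C", "gene_3"),
]

def project(pairs, side):
    """Project the bipartite graph onto one node set.

    side=0 links disorders that share a gene (HDN-like projection);
    side=1 links genes that share a disorder (DGN-like projection).
    """
    groups = defaultdict(set)
    for pair in pairs:
        groups[pair[1 - side]].add(pair[side])
    links = set()
    for members in groups.values():
        links.update(combinations(sorted(members), 2))
    return links

hdn = project(associations, side=0)
dgn = project(associations, side=1)
print("HDN links:", sorted(hdn))
print("DGN links:", sorted(dgn))
```

In this toy case the two disorders that share a gene become linked in the disorder projection, the two genes implicated in the same disorder become linked in the gene projection, and the remaining disorder and gene stay isolated.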
Properties of the HDN. If each human disorder tends to have a distinct and unique genetic origin, then the HDN would be disconnected into many single nodes corresponding to specific disorders or grouped into small clusters of a few closely related disorders. In contrast, the obtained HDN displays many connections between both individual disorders and disorder classes (Fig. 2a). Of 1,284 disorders, 867 have at least one link to other disorders, and 516 disorders form a giant component, suggesting that the genetic origins of most diseases, to some extent, are shared with other diseases. The number of genes associated with a disorder, s, has a broad distribution (see SI Fig. 6a), indicating that most disorders relate to a few disease genes, whereas a handful of phenotypes, such as deafness (s = 41), leukemia (s = 37), and colon cancer (s = 34), relate to dozens of genes (Fig. 2a). The degree (k) distribution of HDN (SI Fig. 6b) indicates that most disorders are linked to only
Author contributions: D.V., B.C., M.V., and A.-L.B. designed research; K.-I.G. and M.E.C.
performed research; K.-I.G. and M.E.C. analyzed data; and K.-I.G., M.E.C., D.V., M.V., and
A.-L.B. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Abbreviations: DGN, disease gene network; HDN, human disease network; GO, Gene Ontology; OMIM, Online Mendelian Inheritance in Man; PCC, Pearson correlation coefficient.
**To whom correspondence may be addressed. E-mail: alb@nd.edu or marc_vidal@dfci.harvard.edu.
This article contains supporting information online at www.pnas.org/cgi/content/full/0701361104/DC1.
© 2007 by The National Academy of Sciences of the USA
www.pnas.org/cgi/doi/10.1073/pnas.0701361104 PNAS | May 22, 2007 | vol. 104 | no. 21 | 8685–8690
APPLIED PHYSICAL SCIENCES
[Figure: the DISEASOME bipartite network, linking the disease phenome (disorders) to the disease genome (genes), shown with its two projections, the Human Disease Network (HDN) and the Disease Gene Network (DGN).
Genes shown: AR, ATM, BRCA1, BRCA2, CDH1, GARS, HEXB, KRAS, LMNA, MSH2, PIK3CA, TP53, MAD1L1, RAD54L, VAPB, CHEK2, BSCL2, ALS2, BRIP1.
Disorders shown: Androgen insensitivity, Breast cancer, Perineal hypospadias, Prostate cancer, Spinal muscular atrophy, Ataxia-telangiectasia, Lymphoma, T-cell lymphoblastic leukemia, Ovarian cancer, Papillary serous carcinoma, Fanconi anemia, Pancreatic cancer, Wilms tumor, Charcot-Marie-Tooth disease, Sandhoff disease, Lipodystrophy, Amyotrophic lateral sclerosis, Silver spastic paraplegia syndrome, Spastic ataxia/paraplegia.]
Fig. 1. Construction of the diseasome bipartite network. (Center) A small subset of OMIM-based disorder–disease gene associations (18), where circles and rectangles
Goh et al., PNAS 2007
GENES AND DISEASES

[Figure: a, Human Disease Network; b, Disease Gene Network.
Labelled disorders: Asthma, Atherosclerosis, Blood group, Breast cancer, Complement component deficiency, Cardiomyopathy, Cataract, Charcot-Marie-Tooth disease, Colon cancer, Deafness, Diabetes mellitus, Epidermolysis bullosa, Epilepsy, Fanconi anemia, Gastric cancer, Hypertension, Leigh syndrome, Leukemia, Lymphoma, Mental retardation, Muscular dystrophy, Myocardial infarction, Myopathy, Obesity, Parkinson disease, Prostate cancer, Retinitis pigmentosa, Spherocytosis, Spinocerebellar ataxia, Stroke, Thyroid carcinoma, Zellweger syndrome, Hirschsprung disease, Trichothiodystrophy, Alzheimer disease, Heinz body anemia, Bethlem myopathy, Hemolytic anemia, Ataxia-telangiectasia, Pseudohypoaldosteronism.
Labelled genes: APC, COL2A1, ACE, PAX6, ERBB2, FBN1, FGFR3, FGFR2, GJB2, GNAS, KIT, KRAS, LRP5, MSH2, MEN1, NF1, PTEN, SCN4A, TP53, ARX.
Disorder class legend: Bone, Cancer, Cardiovascular, Connective tissue, Dermatological, Developmental, Ear, Nose, Throat, Endocrine, Gastrointestinal, Hematological, Immunological, Metabolic, Muscular, Neurological, Nutritional, Ophthalmological, Psychiatric, Renal, Respiratory, Skeletal, multiple, Unclassified.
Node size legend: 1, 5, 10, 15, 21, 25, 30, 34, 41.]

Leading Edge
Review
Interactome Networks and Human Disease
Marc Vidal,1,2,* Michael E. Cusick,1,2 and Albert-László Barabási1,3,4,*
1Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
2Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
3Center for Complex Network Research (CCNR) and Departments of Physics, Biology and Computer Science, Northeastern University,
Boston, MA 02115, USA
4Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
*Correspondence: marc_vidal@dfci.harvard.edu (M.V.), alb@neu.edu (A.-L.B.)
DOI 10.1016/j.cell.2011.02.016
Complex biological systems and cellular networks may underlie most genotype to phenotype
relationships. Here, we review basic concepts in network biology, discussing different types of
interactome networks and the insights that can come from analyzing them. We elaborate on why
interactome networks are important to consider in biology, how they can be mapped and integrated
with each other, what global properties are starting to emerge from interactome network models,
and how these properties may relate to human disease.
Introduction
Since the advent of molecular biology, considerable progress has been made in the quest to understand the mechanisms that underlie human disease, particularly for genetically inherited disorders. Genotype-phenotype relationships, as summarized in the Online Mendelian Inheritance in Man (OMIM) database (Amberger et al., 2009), include mutations in more than 3000 human genes known to be associated with one or more of over 2000 human disorders. This is a truly astounding number of genotype-phenotype relationships considering that a mere three decades have passed since the initial description of Restriction Fragment Length Polymorphisms (RFLPs) as molecular markers to map genetic loci of interest (Botstein et al., 1980), only two decades since the announcement of the first positional cloning experiments of disease-associated genes using RFLPs (Amberger et al., 2009), and just one decade since the release of the first reference sequences of the human genome (Lander et al., 2001; Venter et al., 2001). For complex traits, the information gathered by recent genome-wide association studies suggests high-confidence genotype-phenotype associations between close to 1000 genomic loci and one or more of over
phenotypic associations, there would still be major problems to fully understand and model human genetic variations and their impact on diseases.
To understand why, consider the ‘‘one-gene/one-enzyme/
one-function’’ concept originally framed by Beadle and Tatum
(Beadle and Tatum, 1941), which holds that simple, linear
connections are expected between the genotype of an organism
and its phenotype. But the reality is that most genotype-pheno-
type relationships arise from a much higher underlying com-
plexity. Combinations of identical genotypes and nearly identical
environments do not always give rise to identical phenotypes.
The very coining of the words ‘‘genotype’’ and ‘‘phenotype’’ by
Johannsen more than a century ago derived from observations
that inbred isogenic lines of bean plants grown in well-controlled
environments give rise to pods of different size (Johannsen,
1909). Identical twins, although strikingly similar, nevertheless
often exhibit many differences (Raser and O’Shea, 2005). Like-
wise, genotypically indistinguishable bacterial or yeast cells
grown side by side can express different subsets of transcripts
and gene products at any given moment (Elowitz et al., 2002;
Blake et al., 2003; Taniguchi et al., 2010). Even straightforward
Mapping Interactome Networks
Network science deals with complexity by ‘‘simplifying’’ com-
plex systems, summarizing them merely as components (nodes)
and interactions (edges) between them. In this simplified
approach, the functional richness of each node is lost. Despite
or even perhaps because of such simplifications, useful discov-
eries can be made. As regards cellular systems, the nodes are
metabolites and macromolecules such as proteins, RNA mole-
cules and gene sequences, while the edges are physical,
biochemical and functional interactions that can be identified
with a plethora of technologies. One challenge of network
biology is to provide maps of such interactions using systematic
and standardized approaches and assays that are as unbiased
as possible. The resulting ‘‘interactome’’ networks, the networks
of interactions between cellular components, can serve as scaf-
fold information to extract global or local graph theory proper-
et al., 2010). Computational prediction maps are fast and effi-
cient to implement, and usually include satisfyingly large
numbers of nodes and edges, but are necessarily imperfect
because they use indirect information (Plewczynski and Ginalski,
2009). While high-throughput maps attempt to describe unbi-
ased, systematic, and well-controlled data, they were initially
more difficult to establish, although recent technological
advances suggest that near completion can be reached within
a few years for highly reliable, comprehensive protein-protein
were discovered in and are being applied genome-wide for these
model organisms (Mohr et al., 2010).
Metabolic Networks
Metabolic network maps attempt to comprehensively describe
all possible biochemical reactions for a particular cell or
organism (Schuster et al., 2000; Edwards et al., 2001). In many
representations of metabolic networks, nodes are biochemical
metabolites and edges are either the reactions that convert
Figure 2. Networks in Cellular Systems
To date, cellular networks are most available for the ‘‘super-model’’ organisms (Davis, 2004) yeast, worm, fly, and plant. High-throughput interactome mapping
relies upon genome-scale resources such as ORFeome resources. Several types of interactome networks discussed are depicted. In a protein interaction
network, nodes represent proteins and edges represent physical interactions. In a transcriptional regulatory network, nodes represent transcription factors
(circular nodes) or putative DNA regulatory elements (diamond nodes); and edges represent physical binding between the two. In a disease network, nodes
represent diseases, and edges represent gene mutations that are associated with both linked diseases. In a virus-host network, nodes represent viral proteins
(square nodes) or host proteins (round nodes), and edges represent physical interactions between the two. In a metabolic network, nodes represent enzymes,
and edges represent metabolites that are products or substrates of the enzymes. The network depictions seem dense, but they represent only small portions of
available interactome network maps, which themselves constitute only a few percent of the complete interactomes within cells.
Vidal et al., Cell 2011
DISEASES AS NETWORK
PERTURBATIONS

Most cellular components exert their functions through
interactions with other cellular components, which can
be located either in the same cell or across cells, and
even across organs. In humans, the potential complexity
of the resulting network — the human interactome — is
daunting: with ~25,000 protein-coding genes, ~1,000
metabolites and an undefined number of distinct
proteins1
and functional RNA molecules, the number of
cellular components that serve as the nodes of the inter-
actome easily exceeds 100,000. The number of function-
ally relevant interactions between the components of
Network-based approaches to human disease have
multiple potential biological and clinical applications. A
better understanding of the effects of cellular intercon-
nectedness on disease progression may lead to the iden-
tification of disease genes and disease pathways, which,
in turn, may offer better targets for drug development.
These advances may also lead to better and more accurate
biomarkers to monitor the functional integrity of net-
works that are perturbed by diseases as well as to better
disease classification. Here we present an overview of
the organizing principles that govern cellular networks
Network medicine: a network-based
approach to human disease
Albert-László Barabási*‡§, Natali Gulbahce*‡|| and Joseph Loscalzo§
Abstract | Given the functional interdependencies between the molecular components in a
human cell, a disease is rarely a consequence of an abnormality in a single gene, but reflects
the perturbations of the complex intracellular and intercellular network that links tissue
and organ systems. The emerging tools of network medicine offer a platform to explore
systematically not only the molecular complexity of a particular disease, leading to the
identification of disease modules and pathways, but also the molecular relationships among
apparently distinct (patho)phenotypes. Advances in this direction are essential for identifying
new disease genes, for uncovering the biological significance of disease-associated
mutations identified by genome-wide association studies and full-genome sequencing, and
for identifying drug targets and biomarkers for complex diseases.
Predicting disease genes
Disease-associated genes have generally been identified
using linkage mapping or, more recently, genome-wide
association (GWA) studies53
. Both methodologies can
suggest large numbers of disease-gene candidates, but
identifying the particular gene and the causal muta-
tion remains difficult. Recently, a series of increasingly
sophisticated network-based tools have been devel-
oped to predict potential disease genes; these tools can
be loosely grouped into three categories, as discussed
below (FIG. 4).
Linkage methods. These methods assume that the direct
interaction partners of a disease protein are likely to
be associated with the same disease phenotype45,54–56
.
Indeed, for one disease locus, the set of genes within
the locus whose products interacted with a known
nes in the interactome. a | Of the approximately
iated with specific diseases. The figure shows the
ssociated genes that were known42
in 2007 and
ntial, that is, their absence is associated with
agram of the differences between essential and
ential disease genes (shown as blue nodes) are
eriphery, whereas in utero essential genes (shown
REVIEWS
Figure 2 | Disease modules. Schematic diagram of the three modularity concepts that are discussed in this Review.
a | Topological modules correspond to locally dense neighbourhoods of the interactome, such that the nodes of
this network, representing the links of the interactome,
is expected to be much larger2
.
This inter- and intracellular interconnectivity implies
that the impact of a specific genetic abnormality is not
restricted to the activity of the gene product that carries
it, but can spread along the links of the network and
alter the activity of gene products that otherwise carry
no defects. Therefore, an understanding of a gene’s net-
work context is essential in determining the phenotypic
impact of defects that affect it3,4
. Following on from this
principle, a key hypothesis underlying this Review is
that a disease phenotype is rarely a consequence of
an abnormality in a single effector gene product, but
reflects various pathobiological processes that inter-
act in a complex network. A corollary of this widely
held hypothesis is that the interdependencies among
a cell’s molecular components lead to deep functional,
molecular and causal relationships among apparently
distinct phenotypes.
and the implications of these principles for understand-
ing disease. These principles and the tools and method-
ologies that are derived from them are facilitating the
emergence of a body of knowledge that is increasingly
referred to as network medicine5–7
.
The human interactome
Although much of our understanding of cellular net-
works is derived from model organisms, the past dec-
ade has seen an exceptional growth in human-specific
molecular interaction data8
. Most attention has been
directed towards molecular networks, including protein
interaction networks, whose nodes are proteins that are
linked to each other by physical (binding) interactions9,10
;
metabolic networks, whose nodes are metabolites that
are linked if they participate in the same biochemi-
cal reactions11–13
; regulatory networks, whose directed
links represent either regulatory relationships between
a transcription factor and a gene14
, or post-translational
111 Dana Research Center,
Boston, Massachusetts
02115, USA.
‡
Center for Cancer Systems
Biology, Dana-Farber Cancer
Institute, 44 Binney Street,
02115, USA.
§
Department of Medicine,
Brigham and Women’s
Hospital, Harvard Medical
School, 75 Francis Street,
02115, USA.
||
Department of Cellular and
Molecular Pharmacology,
University of California, 1700
4th Street, Byers Hall 309,
Box 2530, San Francisco,
California 94158, USA.
Correspondence to A.-L.B.
e-mail: alb@neu.edu
doi:10.1038/nrg2918
56 | JANUARY 2011 | VOLUME 12 www.nature.com/reviews/genetics
© 2011 Macmillan Publishers Limited. All rights reserved
NETWORK MEDICINE

RESEARCH ARTICLE SUMMARY
DISEASE NETWORKS
Uncovering disease-disease
relationships through the
incomplete interactome
Jörg Menche, Amitabh Sharma, Maksim Kitsak, Susan Dina Ghiassian, Marc Vidal,
Joseph Loscalzo, Albert-László Barabási*
INTRODUCTION: A disease is rarely a straight-
forward consequence of an abnormality in a
single gene, but rather reflects the interplay
of multiple molecular processes. The rela-
tionships among these processes are encoded
in the interactome, a network that integrates
all physical interactions within a cell, from
protein-protein to regulatory protein–DNA
and metabolic interactions. The documented
propensity of disease-associated proteins to
interact with each other suggests that they
tend to cluster in the same neighborhood of
the interactome, forming a disease module, a
connected subgraph that contains all molecu-
lar determinants of a disease. The accurate
identification of the corresponding disease
module represents the first step toward a sys-
tematic understanding of the molecular mech-
anisms underlying a complex disease. Here,
we present a network-based framework to iden-
tify the location of disease modules within the
interactome and use the overlap between the
modules to predict disease-disease relationships.
RATIONALE: Despite impressive advances
in high-throughput interactome mapping and
disease gene identification, both the interac-
tome and our knowledge of disease-associated
genes remain incomplete. This incomplete-
ness prompts us to ask to what extent the
current data are sufficient to map out the
disease modules, the first step toward an in-
tegrated approach toward human disease.
To make progress, we must formulate math-
ematically the impact of network inc
ness on the identifiability of disease
quantifying the predictive power and
itations of the current interactome.
RESULTS: Using the tools of network
we show that we can only uncover
modules for diseases whose number
ciated genes excee
ical threshold det
by the network inc
ness. We find tha
proteins associa
226 diseases are
in the same netwo
borhood, displaying a statistically sig
tendency to form identifiable disease m
The higher the degree of agglomerati
disease proteins within the interact
higher the biological and functional
ity of the corresponding genes. The
ings indicate that many local neighb
of the interactome represent the ob
part of the true, larger and denser
modules.
If two disease modules overlap, lo
turbations causing one disease can
pathways of the other disease module
resulting in shared clinical and path
ical characteristics. To test this hyp
we measure the network-based sepa
each disease pair, observing a direct
between the pathobiological simi
diseases and their relative distanc
RES
Read the full article at http://dx.doi.org/10.1126/science.1257601
Menche et al., Science 2015
DISEASES AS NETWORK
NEIGHBORHOODS

Diseases As Local Neighborhoods

Asthma
Parkinson’s
Leukemia
MS
Hypertension
Rheumatoid arthritis
Crohn’s disease
Type 2 diabetes
Glioblastoma
Ulcerative colitis
Heart failure
Network Clustering Means Explainable Biology
AIF1
ZBTB12
NFKBIZ
MERTK
HHEX
CFB
Diseases As Local Neighborhoods

Interactome and disease genes
[Figure: the interactome with disease genes highlighted. Disease gene sources: OMIM, GWAS, GWAS & OMIM; other disease genes and genes with multiple disease associations are marked. Interaction types: signalling, complexes, kinase-substrate, metabolic, literature, regulatory, yeast two-hybrid; interactions with multiple lines of evidence are marked. Disease labels: immunologic deficiency syndromes, hematologic diseases, blood protein disorders, connective tissue diseases, autoimmune diseases, joint diseases, musculoskeletal diseases, rheumatoid arthritis, multiple sclerosis. Observable module for multiple sclerosis, with labeled genes AKT1, HLA-B, HLA-C, STAT3, TAP2, NFKBIZ, IL2RA, TNFRSF1A, EHMT2, PTK2, IL7R and MAPK1.]
• The interactome contains 141,296 physical interactions between 13,460 proteins
• We study 299 diseases with at least 20 gene associations

Measures of network localization
[Figure: network localization of the multiple sclerosis proteins. Panels: frequency distribution of the size of the largest connected component, data (S = 11) versus random; frequency distribution of the shortest distance d, data versus random; network view of the multiple sclerosis proteins with example shortest distances d = 2 and d = 3.]
• We use two measures to quantify the interactome-based localization of a disease
• 226 out of 299 diseases are significantly localized according to both measures
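
The two localization measures named in the bullets above can be sketched on a toy network. This is a minimal illustration, not the published pipeline: the graph `toy`, the gene list and the function names are hypothetical, and the random expectation is assumed here to come from uniformly sampled gene sets of the same size.

```python
from collections import deque
import random


def bfs_distances(graph, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def largest_component_size(graph, genes):
    """Size S of the largest connected component induced by the gene set."""
    genes = set(genes)
    seen, best = set(), 0
    for start in genes:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in graph[node]:
                if nbr in genes and nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        best = max(best, size)
    return best


def mean_shortest_distance(graph, genes):
    """Mean over genes of the distance to the closest other gene in the set."""
    genes = set(genes)
    values = []
    for g in genes:
        dist = bfs_distances(graph, g)
        others = [dist[o] for o in genes if o != g and o in dist]
        if others:
            values.append(min(others))
    return sum(values) / len(values) if values else float("inf")


def random_expectation(graph, n_genes, measure, n_trials=1000, seed=0):
    """Measure values for uniformly sampled gene sets (assumed null model)."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    return [measure(graph, rng.sample(nodes, n_genes)) for _ in range(n_trials)]


# Hypothetical toy interactome: a chain A-B-C-D-E-F.
toy = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"},
       "D": {"C", "E"}, "E": {"D", "F"}, "F": {"E"}}
disease_genes = ["A", "B", "C", "F"]

S = largest_component_size(toy, disease_genes)   # 3: the component {A, B, C}
d = mean_shortest_distance(toy, disease_genes)   # 1.5: distances 1, 1, 1 and 3
null_S = random_expectation(toy, len(disease_genes), largest_component_size, n_trials=200)
print(S, d, sum(null_S) / len(null_S))
```

Comparing `S` and `d` with the values obtained from random gene sets is what turns the raw numbers into a statement about significant localization.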

Relations between Diseases
[Figure: distributions P(d) of the shortest distance d (d_AA, d_BB or d_AB) for two disease pairs. Separated modules (s_AB ≥ 0): MS and PD, s_AB = 1.3. Overlapping modules (s_AB < 0): MS and RA, s_AB = −0.2. The network panels show the proteins of the three diseases.]
Multiple sclerosis (MS)
Peroxisomal disorders (PD)
Rheumatoid arthritis (RA)
• We introduce a network-based measure to quantify the overlap/separation of two diseases
• Most disease pairs are well separated on the Interactome
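
As a sketch of the overlap/separation measure, the code below assumes the definition s_AB = ⟨d_AB⟩ − (⟨d_AA⟩ + ⟨d_BB⟩)/2 used in Menche et al., where ⟨d_AA⟩ averages each protein's distance to the closest other protein of the same disease and ⟨d_AB⟩ averages, over the proteins of both diseases, the distance to the closest protein of the other disease. The chain graph and the function names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def nearest_distances(graph, sources, targets, exclude_self):
    """For each source, the distance to its closest target."""
    out = []
    for s in sources:
        dist = bfs_distances(graph, s)
        cands = [dist[t] for t in targets
                 if t in dist and not (exclude_self and t == s)]
        if cands:
            out.append(min(cands))
    return out


def _mean(values):
    return sum(values) / len(values)


def separation(graph, genes_a, genes_b):
    """s_AB < 0 suggests overlapping modules, s_AB >= 0 separated ones."""
    a, b = set(genes_a), set(genes_b)
    d_aa = nearest_distances(graph, a, a, exclude_self=True)
    d_bb = nearest_distances(graph, b, b, exclude_self=True)
    # A protein shared by both diseases contributes a distance of 0.
    d_ab = (nearest_distances(graph, a, b, exclude_self=False)
            + nearest_distances(graph, b, a, exclude_self=False))
    return _mean(d_ab) - (_mean(d_aa) + _mean(d_bb)) / 2


# Hypothetical toy interactome: a chain 1-2-...-8.
chain = {i: set() for i in range(1, 9)}
for i in range(1, 8):
    chain[i].add(i + 1)
    chain[i + 1].add(i)

s_far = separation(chain, {1, 2}, {7, 8})          # 4.5: modules at opposite ends
s_near = separation(chain, {3, 4, 5}, {4, 5, 6})   # about -0.67: shared proteins
print(s_far, s_near)
```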

Network Distance vs. Biomedical Similarity
[Figure: biomedical similarity of disease pairs as a function of their mean network separation s_AB, compared with the random expectation. Panels: GO term similarity (biological process, molecular function, cellular component), co-expression, symptom similarity, and comorbidity (relative risk RR).]
• Diseases that are close in the Interactome have similar biomedical properties

The Disease Space
[Figure: the disease space, with numbered diseases (1 to 14). Labeled disease classes: Immune System Diseases, Eye Diseases, Respiratory Tract Diseases, Cardiovascular Diseases. Labeled diseases: type 1 diabetes, rheumatoid arthritis, autoimmune diseases of the nervous system, demyelinating autoimmune diseases, retinitis pigmentosa, retinal degeneration, Graves disease, macular degeneration, asthma, respiratory hypersensitivity, cerebrovascular disorders, myocardial infarction, coronary artery disease, myocardial ischemia.]
• Diseases and their network-based relationships can be represented in a 3D Diseases-Space
• Diseases belonging to the same class agglomerate

Overlapping diseases
• Examples of unexpected disease relationships uncovered using the disease space
[Figure: network views of overlapping disease modules. Disease labels: celiac disease, asthma, atherosclerosis, coronary artery disease, biliary tract diseases, hepatic cirrhosis. Pathway label: intestinal immune network for IgA production. Gene labels include IL1RL1, IL18R1, IL33, HLA-DRA, HLA-DQA1, HLA-DQB1, SMAD3, CTLA4, IL2, IL21, ICOS, ORMDL3, ADAM33 and TSLP.]

ORIGINAL ARTICLE
A disease module in the interactome explains disease
heterogeneity, drug response and captures novel
pathways and genes in asthma
Amitabh Sharma1,2,3,†, Jörg Menche1,2,4,8,†, C. Chris Huang5, Tatiana Ort5,
Xiaobo Zhou3, Maksim Kitsak1,2, Nidhi Sahni2, Derek Thibault3, Linh Voung3,
Feng Guo3, Susan Dina Ghiassian1,2, Natali Gulbahce6, Frédéric Baribaud5, Joel
Tocker5, Radu Dobrin5, Elliot Barnathan5, Hao Liu5, Reynold A. Panettieri Jr7,
Kelan G. Tantisira3, Weiliang Qiu3, Benjamin A. Raby3, Edwin K. Silverman3,
Marc Vidal2,9, Scott T. Weiss3 and Albert-László Barabási1,2,3,4,8,*
1 Center for Complex Networks Research, Department of Physics, Northeastern University, Boston, MA 02115, USA
2 Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
3 Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
4 Department of Theoretical Physics, Budapest University of Technology and Economics, H1111, Budapest, Hungary
5 Janssen Research & Development, Inc., 1400 McKean Road, Spring House, PA 19477, USA
6 Department of Cellular and Molecular Pharmacology, University of California 1700, 4th Street, Byers Hall 308D, San Francisco, CA 94158, USA
7 Pulmonary Allergy and Critical Care Division, Department of Medicine, University of Pennsylvania, 125 South 31st Street, TRL Suite 1200, Philadelphia, PA 19104, USA
8 Center for Network Science, Central European University, Nador u. 9, 1051 Budapest, Hungary
9 Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
*To whom correspondence should be addressed at: Center for Complex Networks Research, Department of Physics, Northeastern University, Boston,
MA 02115, USA. Email: barabasi@gmail.com
Abstract
Recent advances in genetics have spurred rapid progress towards the systematic identification of genes involved in complex
diseases. Still, the detailed understanding of the molecular and physiological mechanisms through which these genes affect
disease phenotypes remains a major challenge. Here, we identify the asthma disease module, i.e. the local neighborhood of the
interactome whose perturbation is associated with asthma, and validate it for functional and pathophysiological relevance,
using both computational and experimental approaches. We find that the asthma disease module is enriched with modest
GWAS P-values against the background of random variation, and with differentially expressed genes from normal and
asthmatic fibroblast cells treated with an asthma-specific drug. The asthma module also contains immune response
mechanisms that are shared with other immune-related disease modules. Further, using diverse omics (genomics,
† These authors contributed equally to this work.
Received: September 1, 2014. Revised: November 19, 2014. Accepted: January 5, 2015
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Human Molecular Genetics, 2015, Vol. 24, No. 11 3005–3020
doi: 10.1093/hmg/ddv001
Advance Access Publication Date: 12 January 2015
Original Article
RESEARCH ARTICLE
A DIseAse MOdule Detection (DIAMOnD)
Algorithm Derived from a Systematic
Analysis of Connectivity Patterns of Disease
Proteins in the Human Interactome
Susan Dina Ghiassian1,2☯, Jörg Menche1,2,3☯, Albert-László Barabási1,2,3,4*
1 Center for Complex Networks Research and Department of Physics, Northeastern University, Boston,
Massachusetts, United States of America, 2 Center for Cancer Systems Biology (CCSB) and Department of
Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America, 3 Center
for Network Science, Central European University, Budapest, Hungary, 4 Channing Division of Network
Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston,
Massachusetts, United States of America
☯ These authors contributed equally to this work.
* barabasi@gmail.com
Abstract
The observation that disease associated proteins often interact with each other has fueled
the development of network-based approaches to elucidate the molecular mechanisms of
human disease. Such approaches build on the assumption that protein interaction networks
can be viewed as maps in which diseases can be identified with localized perturbation with-
in a certain neighborhood. The identification of these neighborhoods, or disease modules,
is therefore a prerequisite of a detailed investigation of a particular pathophenotype. While
numerous heuristic methods exist that successfully pinpoint disease associated modules,
OPEN ACCESS
Citation: Ghiassian SD, Menche J, Barabási A-L
(2015) A DIseAse MOdule Detection (DIAMOnD)
Algorithm Derived from a Systematic Analysis of
Connectivity Patterns of Disease Proteins in the
Human Interactome. PLoS Comput Biol 11(4):
e1004120. doi:10.1371/journal.pcbi.1004120
Editor: Andrey Rzhetsky, University of Chicago,
UNITED STATES
Received: August 25, 2014
Accepted: January 9, 2015
Sharma et al., HMG 2015
Ghiassian et al., PLoS Comp Biol 2015
BUILDING DISEASE MODULES

Disease Module Detection and Analysis
The general workflow of a detailed analysis for a disease of interest:
I Interactome construction
- Binary interactions, metabolic couplings, regulatory interactions ...
II Disease Module Identification
- Seed gene selection: OMIM, GWAS, literature
- DIAMOnD: Disease Module Detection Algorithm
III Validation
- Gene expression data
- Gene Ontologies
- Pathways
- Comorbidity
IV Biological interpretation
- Pathway prioritization
- Molecular mechanism

DIAM
DISEASE MODULES VS COMMUNITIES

[Figure: DIAMOnD illustration. Panel A: initial seeds, iteration 1, iteration 2, iteration 3, with candidate genes annotated by p-value (values shown range from 0.07 to 0.53). Legend: original seed genes; gene selected at iteration i; DIAMOnD genes. Panel B. Panel C, disease module: genes connected to a seed gene, proto-module, DIAMOnD genes.]
DIAMOnD: Disease Module Detection Algorithm
• Purely topological method
• All genes in the network are prioritized according to their potential relevance for the disease
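
The iterative prioritization can be sketched as follows. This is a simplified illustration under stated assumptions, not the reference implementation: it assumes the connectivity significance is the hypergeometric tail probability that a gene with k links has at least k_s links to the current module, and that the most significant gene is added at each iteration. The toy graph and all function names are hypothetical.

```python
from math import comb


def build_graph(edges):
    """Undirected adjacency sets from an edge list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def connectivity_pvalue(n_nodes, n_seeds, k, k_s):
    """P(at least k_s of a node's k links hit the n_seeds module genes)."""
    total = comb(n_nodes, k)
    tail = sum(comb(n_seeds, i) * comb(n_nodes - n_seeds, k - i)
               for i in range(k_s, min(k, n_seeds) + 1))
    return tail / total


def diamond_sketch(graph, seeds, n_iterations):
    """Return [(gene, p_value), ...] in the order genes join the module."""
    module = set(seeds)
    added = []
    n_nodes = len(graph)
    for _ in range(n_iterations):
        # Only genes adjacent to the current module are candidates.
        candidates = {nbr for node in module for nbr in graph[node]} - module
        best = None
        for node in sorted(candidates):
            k = len(graph[node])
            k_s = len(graph[node] & module)
            p = connectivity_pvalue(n_nodes, len(module), k, k_s)
            if best is None or p < best[0]:
                best = (p, node)
        if best is None:
            break
        module.add(best[1])
        added.append((best[1], best[0]))
    return added


# Hypothetical toy network: X touches all three seeds, the hub H touches one.
edges = [("X", "S1"), ("X", "S2"), ("X", "S3"), ("X", "Y"), ("H", "S1")]
edges += [("H", f"N{i}") for i in range(1, 6)]
toy = build_graph(edges)

ranking = diamond_sketch(toy, ["S1", "S2", "S3"], n_iterations=2)
print(ranking)  # X is added first, then Y; the hub H scores poorly
```

The hub H has a link to a seed but is not added early, because a gene with many links is expected to reach a seed by chance. That is the point of ranking by significance rather than by the raw number of seed links.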

DIAMOnD and Disease Modules within the Human
BUILDING A DISEASE MODULE

ARTICLE
Received 7 May 2015 | Accepted 29 Nov 2015 | Published 1 Feb 2016
Network-based in silico drug efficacy screening
Emre Guney1,2, Jörg Menche1,3, Marc Vidal2,4 & Albert-László Barabási1,2,3,5
The increasing cost of drug development together with a significant drop in the number of
new drug approvals raises the need for innovative approaches for target identification
and efficacy prediction. Here, we take advantage of our increasing understanding of the
network-based origins of diseases to introduce a drug-disease proximity measure that
quantifies the interplay between drug targets and diseases. By correcting for the known
biases of the interactome, proximity helps us uncover the therapeutic effect of drugs, as well
as to distinguish palliative from effective treatments. Our analysis of 238 drugs used in 78
diseases indicates that the therapeutic effect of drugs is localized in a small network
neighborhood of the disease genes and highlights efficacy issues for drugs used in Parkinson
and several inflammatory disorders. Finally, network-based proximity allows us to predict
novel drug-disease associations that offer unprecedented opportunities for drug repurposing
and the detection of adverse effects.
DOI: 10.1038/ncomms10331 OPEN
Guney et al., Nature Comm 2016
DRUGS

DRUGS
[Figure 1 panels. (a) Schematic with drug targets t1 and t2 and disease genes s1, s2 and s3: the shortest path from each drug target to its closest disease gene gives d = (2+3)/2, and random gene sets with the same degrees are used to compute z. (b, c) Drug targets of gliclazide and daunorubicin shown against the disease genes of type 2 diabetes and acute myeloid leukaemia, with annotated dc values of 2.5, 1.0, 2.0 and 1.0 and zc values of 1.3, –1.6, 1.0 and –3.3. Node labels: ABCC8, VEGFA, RUNX1, INS, KAT6A, TOP2A, IRS1, TOP2B, CAPN10, NPM1.]
Figure 1 | Network-based drug-disease proximity. (a) Illustration of the closest distance (dc) of a drug T with targets t1 and t2 to the proteins s1, s2 and s3
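
A minimal sketch of the closest-distance proximity and its z-score follows. It assumes d_c is the average, over drug targets, of the shortest path length to the nearest disease gene, and that z compares the observed d_c with randomly drawn gene sets of matching degree. The exact-degree matching with replacement used here is a simplification, and the toy graph and function names are hypothetical.

```python
from collections import deque
import random
from statistics import mean, pstdev


def bfs_distances(graph, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def closest_distance(graph, targets, disease_genes):
    """d_c: mean over targets of the distance to the closest disease gene."""
    per_target = []
    for t in targets:
        dist = bfs_distances(graph, t)
        per_target.append(min(dist[s] for s in disease_genes if s in dist))
    return mean(per_target)


def degree_matched_sample(graph, genes, rng):
    """One random node of the same degree for each gene (with replacement)."""
    by_degree = {}
    for node in sorted(graph):
        by_degree.setdefault(len(graph[node]), []).append(node)
    return [rng.choice(by_degree[len(graph[g])]) for g in genes]


def proximity_z(graph, targets, disease_genes, n_random=1000, seed=0):
    """Observed d_c and its z-score against degree-matched random sets."""
    rng = random.Random(seed)
    observed = closest_distance(graph, targets, disease_genes)
    null = [closest_distance(graph,
                             degree_matched_sample(graph, targets, rng),
                             degree_matched_sample(graph, disease_genes, rng))
            for _ in range(n_random)]
    sigma = pstdev(null)
    z = (observed - mean(null)) / sigma if sigma > 0 else float("nan")
    return observed, z


# Hypothetical toy chain t1 - a - s1 - s2 - c - b - t2 (connected graph).
order = ["t1", "a", "s1", "s2", "c", "b", "t2"]
toy = {n: set() for n in order}
for u, v in zip(order, order[1:]):
    toy[u].add(v)
    toy[v].add(u)

d_c, z_c = proximity_z(toy, ["t1", "t2"], ["s1", "s2"], n_random=200, seed=1)
print(d_c, z_c)  # d_c is (2 + 3) / 2 = 2.5
```

A negative z would indicate that the targets sit closer to the disease genes than expected for gene sets of the same degrees. On a graph this small the z-score is only illustrative.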

PROXIMITY TO DISEASE MODULES
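[Figure, panels a to c: five drug-disease distance measures defined relative to the disease module: closest (dc), shortest (ds), kernel (dk), center (dcc) and separation (dss). The measures are compared by AUC (%); plot annotations include R2 = 0.175 and R2 = 0.003. Source: Nature Communications, DOI: 10.1038/ncomms10331.]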

GOING FURTHER
Full text: http://barabasi.com/networksciencebook/
