2. A large number of
“interactomes” are currently
being accumulated.
These interaction networks
combine measurements from
a large number of sources to
produce a network of
interactions.
We here assume that the
network is only characterized
by the graph of interactions
and nothing is known about
the content of the nodes.
3. Interactomes occur
In biology:
• Protein networks
• Genetic networks
• Neural networks
In social sciences:
• Social networks
In information:
• Wikipediae
• Content networks
Most such networks are
not validated and contain a
large amount of
superfluous data.
4. We are looking for important features in
the networks.
These features can be:
1. Important nodes.
2. Information flow.
We propose algorithms to extract those
from the network topology and methods
to validate the results.
5.
6. We compare the ratio of the scores of two
neighboring nodes, and define a node as
higher in the hierarchy than its neighbor if its
score is higher and the score ratio is between the
lower and upper thresholds.
For many neighboring nodes there is no
hierarchical relation. Their score ratio can be too
close to one and thus above the upper threshold
(e.g. cheese and meat).
Their ratio can also be too far from one and
thus below the lower threshold (e.g. Obama and
myself).
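The threshold rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes non-negative scores and takes the ratio as the lower score divided by the higher one, so that "close to one" means "above the upper threshold" as on the slide. The function name and the threshold values in the usage note are hypothetical.

```python
def hierarchy_relation(score_a, score_b, lower, upper):
    """Return "a" or "b" for the node ranked higher, or None if no relation.

    Assumption: ratio = lower score / higher score, so it lies in (0, 1].
    """
    high, low = max(score_a, score_b), min(score_a, score_b)
    if high <= 0 or high == low:
        return None  # no usable ratio, or identical scores
    ratio = low / high
    if lower <= ratio <= upper:
        return "a" if score_a > score_b else "b"
    return None  # too close to one, or too far from one
```

With hypothetical thresholds `lower=0.1` and `upper=0.9`, scores of 10 and 5 give a relation, while 10 and 9.8 (too similar) or 1000 and 1 (too far apart) give none.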
7. • The hierarchy score is the centrality normalized by the
indegree and the outdegree.
• We checked whether a node's participation in
information flow on the network (betweenness) is higher
or lower than what is expected merely from its
connectivity.
$$H(i) = \frac{C_B(i)}{(k_{in}(i) + 1)(k_{out}(i) + 1)}$$
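A minimal sketch of a betweenness score normalized by degree, for a small directed, unweighted graph. Betweenness is computed with Brandes' algorithm. The `+ 1` added to each degree is an assumption made here so the denominator never vanishes, and the function names and the adjacency-dict input format are illustrative choices.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm; adj maps each node to a list of successors."""
    nodes = set(adj) | {w for ws in adj.values() for w in ws}
    cb = {v: 0.0 for v in nodes}
    for s in nodes:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in nodes}   # shortest-path predecessors
        sigma = {v: 0 for v in nodes}    # number of shortest paths from s
        dist = {v: -1 for v in nodes}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj.get(v, ()):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):        # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    return cb

def hierarchy_scores(adj):
    """Betweenness divided by (k_in + 1)(k_out + 1); the +1 is an assumption."""
    cb = betweenness(adj)
    k_out = {v: len(set(adj.get(v, ()))) for v in cb}
    k_in = {v: 0 for v in cb}
    for ws in adj.values():
        for w in set(ws):
            k_in[w] += 1
    return {v: cb[v] / ((k_in[v] + 1) * (k_out[v] + 1)) for v in cb}
```

For the chain `{"a": ["b"], "b": ["c"]}`, only `b` lies on a shortest path between two other nodes, so it is the only node with a non-zero score.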
8. • The score depends only on the local neighborhood
• Very fast: low CPU and memory cost
• Average 82%
[Equation: local hierarchy score H(i), expressed in terms of k_in and k_out only]
Problem: the algorithm is not sensitive to the network structure.
For example, for a binary tree the algorithm is inefficient.
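9. $$H(i) = \frac{\sum_m \alpha^{-m} N_{in}^{m}(i) / \langle N_{in}^{m} \rangle}{\sum_m \alpha^{-m} N_{out}^{m}(i) / \langle N_{out}^{m} \rangle}$$
$N_{in}^{m}(i)$: number of incoming neighbours of level m of node i
$\langle N_{in}^{m} \rangle$: average number of neighbours of level m
$\alpha$: weighting base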
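A rough sketch of a level-weighted score of this kind. Several points are assumptions rather than the authors' definition: "level m" is read as shortest-path distance m, each level is weighted by `alpha ** -m`, the level counts are divided by their network-wide average, and the sum is truncated at `max_level`. All names and default values are hypothetical.

```python
from collections import deque

def level_counts(adj, start, max_level):
    """counts[m] = number of nodes first reached at distance m from start."""
    dist = {start: 0}
    counts = [0] * (max_level + 1)
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if dist[v] == max_level:
            continue  # do not explore beyond the last level
        for w in adj.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                counts[dist[w]] += 1
                queue.append(w)
    return counts

def attraction_basin_scores(edges, alpha=2.0, max_level=3):
    """Ratio of weighted incoming to weighted outgoing level counts (a sketch)."""
    nodes = sorted({n for edge in edges for n in edge})
    out_adj = {n: [] for n in nodes}
    in_adj = {n: [] for n in nodes}
    for src, dst in edges:
        out_adj[src].append(dst)
        in_adj[dst].append(src)
    n_in = {n: level_counts(in_adj, n, max_level) for n in nodes}
    n_out = {n: level_counts(out_adj, n, max_level) for n in nodes}
    levels = range(1, max_level + 1)
    # Network-wide average count per level (index 0 is unused).
    avg_in = [sum(n_in[n][m] for n in nodes) / len(nodes) for m in range(max_level + 1)]
    avg_out = [sum(n_out[n][m] for n in nodes) / len(nodes) for m in range(max_level + 1)]
    scores = {}
    for n in nodes:
        num = sum(alpha ** -m * n_in[n][m] / avg_in[m] for m in levels if avg_in[m] > 0)
        den = sum(alpha ** -m * n_out[n][m] / avg_out[m] for m in levels if avg_out[m] > 0)
        scores[n] = num / den if den > 0 else float("inf")
    return scores
```

On the chain `[("a", "b"), ("b", "c")]`, the source `a` has no incoming basin and scores 0, the sink `c` has no outgoing basin and scores infinity in this sketch, and `b` sits in between.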
10.
11. As we decrease the upper
threshold, we reduce the
fraction of node couples for
which a hierarchical position
can be obtained.
On the other hand, we
increase the success rate for
the remaining node couples.
A low upper cutoff leads to a
tight definition of the
hierarchy, with practically all
edges in the proper
direction, but with a low
number of categorized edges.
A high upper cutoff leads to a
hierarchy that is often
unnatural.
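The trade-off described above can be illustrated on a toy example. The sketch below is not the authors' evaluation code: it assumes each edge comes with a known true direction (parent, child), takes the ratio as lower score over higher score, and uses made-up scores and thresholds.

```python
def coverage_and_success(edges, scores, lower, upper):
    """edges: list of (parent, child) pairs giving the true direction.

    Returns (fraction of edges categorized, fraction of those that are correct).
    """
    categorized = correct = 0
    for parent, child in edges:
        high = max(scores[parent], scores[child])
        low = min(scores[parent], scores[child])
        if high <= 0 or high == low:
            continue  # no usable ratio
        if lower <= low / high <= upper:
            categorized += 1
            if scores[parent] > scores[child]:
                correct += 1
    coverage = categorized / len(edges)
    success = correct / categorized if categorized else float("nan")
    return coverage, success

# Hypothetical scores; the edge ("c", "b") is scored in the wrong order.
toy_scores = {"a": 10.0, "b": 5.0, "c": 4.5, "d": 1.0}
toy_edges = [("a", "b"), ("c", "b"), ("b", "d")]
loose = coverage_and_success(toy_edges, toy_scores, lower=0.1, upper=0.95)
tight = coverage_and_success(toy_edges, toy_scores, lower=0.1, upper=0.6)
```

In this toy data the loose cutoff categorizes all three edges but gets one wrong, while the tight cutoff drops the near-tie edge and gets the remaining two right.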
12.
13. Microsoft Windows XP Pro operating system.
Links are directional and the obtained network is
practically acyclic.
Links marked in the proper direction:
• Attraction basin hierarchy: 6869 links out of
6899 (99.57%)
• Local hierarchy: 98.57%
• PageRank: 96.13%
• HITS: 90%
• Centrality-based hierarchy: 63%
14.
15.
16. Is there a
meaningful
information flow
between nodes in
networks?
Specifically, can we
extract from the
network the
meaningful paths of
information?
17.
18. If information flows, it should be sensitive
to the precise direction of each edge.
We thus checked what happens when the
direction of edges is flipped.
Strangely, nothing changes in the network
except for two things:
• The distance distribution gets slightly
shorter.
• The distribution of circle (cycle) lengths gets
drastically shorter.
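An experiment of this kind can be sketched as below. The slide does not say which edges are flipped or how cycles are measured, so the sketch makes two assumptions: each edge is flipped independently with a given probability, and cycle length is summarized by the shortest cycle through each node. Function names are hypothetical.

```python
import random
from collections import deque

def to_adj(edges):
    """Build a successor list for every node that appears in an edge."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, []).append(dst)
        adj.setdefault(dst, [])
    return adj

def bfs_distances(adj, start):
    """Shortest directed distances from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def path_and_cycle_lengths(edges):
    """All pairwise shortest distances, and the shortest cycle through each node."""
    adj = to_adj(edges)
    distances, cycles = [], []
    for v in adj:
        dist = bfs_distances(adj, v)
        distances.extend(d for d in dist.values() if d > 0)
        # A cycle through v closes when a reachable node u links back to v.
        back = [dist[u] + 1 for u in dist if v in adj[u]]
        if back:
            cycles.append(min(back))
    return distances, cycles

def flip_edges(edges, fraction, seed=0):
    """Reverse each edge independently with probability `fraction`."""
    rng = random.Random(seed)
    return [(dst, src) if rng.random() < fraction else (src, dst)
            for src, dst in edges]
```

Comparing `path_and_cycle_lengths(edges)` with `path_and_cycle_lengths(flip_edges(edges, fraction))` for a real network gives the two distributions discussed on the slide.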
19.
20.
21. The long circles include a well-defined,
limited number of essential genes/neurons.
In the neural network, these neurons map
to the main trajectories from the sensor to
the interneuron circles to the motor
neurons.
In the genetic network, these trajectories relate
the most essential genes (genes whose
deletion leads to the death of the organism).
22. A simple toy network explains the
observed results.
23. The spaghetti ball of networks can be
replaced by clear hierarchies or organized
information pathways.
The vast majority of edges can be removed
while maintaining the important information
flow.