2. Networks
• a collection of individuals or
entities, each called a vertex or node
• a list of pairs of vertices that are
neighbors, representing edges or links
• Example: vertices are Facebook users, edges represent Facebook friendships
• Example: vertices are drugs and targets, edges represent drug-target relations
3. Network Basic definitions
• N denotes the number of vertices in a network
• Number of possible edges: N(N-1)/2 ≈ N^2/2
• The degree of a vertex is its number of neighbors
• The distance between two vertices is the length of the shortest path connecting
them.
• If two vertices are in different components, their distance is undefined or infinite.
• The diameter of a network is usually defined as the largest distance between any
pair of vertices. The average distance between pairs measures how near or far
typical individuals are from each other.
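The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the five-vertex edge list and the helper names (`possible_edges`, `degree`, `distance`) are assumptions made for the example.

```python
import math
from collections import deque
from itertools import combinations

# Illustrative undirected network; D has no neighbors on purpose.
vertices = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "E")]

# Adjacency sets: each undirected edge is stored in both directions.
neighbors = {v: set() for v in vertices}
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)

def possible_edges(n):
    """Number of distinct vertex pairs, N(N-1)/2."""
    return n * (n - 1) // 2

def degree(v):
    """Degree = number of neighbors."""
    return len(neighbors[v])

def distance(source, target):
    """Shortest-path length by breadth-first search; inf if unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for w in neighbors[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return math.inf

print(possible_edges(len(vertices)))  # 10
print(degree("C"))                    # 3
print(distance("A", "E"))             # 2
print(distance("A", "D"))             # inf (different components)

# Largest and average distance over the pairs that are connected at all.
pair_distances = [distance(u, v) for u, v in combinations(vertices, 2)]
finite = [d for d in pair_distances if d != math.inf]
largest_distance = max(finite)
average_distance = sum(finite) / len(finite)
print(largest_distance)               # 2
```

Pairs in different components are left out of the last two values, since their distance is undefined or infinite.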
4. Network Basic definitions
Directed and Undirected Networks
• If the pairs are unordered, then the graph is undirected:
vertices = {A, B, C, D, E}
edges = ({A, B}, {A, C}, {B, C}, {C, E}).
• Otherwise it is directed:
vertices = {A, B, C, D, E}
edges = ((A, B),(A, C),(B, C),(C, E)).
Examples
• I can follow you on Twitter without you following me.
• Web page A may link to page B, but not vice versa.
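One way to see the difference in code is to store undirected edges as unordered pairs and directed edges as ordered pairs. The sketch below reuses the vertex and edge lists from this slide; the helper names are hypothetical.

```python
vertices = {"A", "B", "C", "D", "E"}
pairs = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "E")]

# Undirected: each edge is an unordered pair, so {A, B} equals {B, A}.
undirected_edges = {frozenset(p) for p in pairs}

# Directed: each edge is an ordered pair (source, target).
directed_edges = set(pairs)

def has_undirected_edge(u, v):
    return frozenset((u, v)) in undirected_edges

def has_directed_edge(u, v):
    return (u, v) in directed_edges

print(has_undirected_edge("B", "A"))  # True: order does not matter
print(has_directed_edge("A", "B"))    # True
print(has_directed_edge("B", "A"))    # False: A links to B, not vice versa
```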
6. Degree Properties
• Indegree: how many directed edges (arcs) are incident on a node (2 in the slide's example)
• Outdegree: how many directed edges (arcs) originate at a node (2 in the slide's example)
• Degree (in or out): number of edges incident on a node (3 in the slide's example)
• Degree distribution: a frequency count of the occurrence of each
degree, listed here as (degree, number of nodes) pairs
• In-degree distribution:
[(2,3), (1,4), (0,1)]
• Out-degree distribution:
[(2,4), (1,3), (0,1)]
• (undirected) distribution:
[(3,3), (2,2), (1,3)]
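A degree distribution can be computed by counting degrees with `collections.Counter`. The sketch below uses a small hypothetical directed edge list, not the eight-node network from the slide's figure, so its output differs from the distributions listed above.

```python
from collections import Counter

# Hypothetical directed network; D has no edges at all.
nodes = ["A", "B", "C", "D", "E"]
arcs = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "E")]

indegree = {v: 0 for v in nodes}
outdegree = {v: 0 for v in nodes}
for source, target in arcs:
    outdegree[source] += 1
    indegree[target] += 1

# Degree ignoring direction (no pair is linked both ways in this example).
total_degree = {v: indegree[v] + outdegree[v] for v in nodes}

def degree_distribution(degrees):
    """Return (degree, number of nodes) pairs, highest degree first."""
    counts = Counter(degrees.values())
    return sorted(counts.items(), reverse=True)

print(degree_distribution(indegree))      # [(2, 1), (1, 2), (0, 2)]
print(degree_distribution(outdegree))     # [(2, 1), (1, 2), (0, 2)]
print(degree_distribution(total_degree))  # [(3, 1), (2, 2), (1, 1), (0, 1)]
```

In any directed network the in-degrees and the out-degrees both sum to the number of arcs, which is a quick consistency check on a distribution.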
7. Connected Components
• Strongly connected components
• Each node within the component can be reached from every other node in
the component by following directed links
[Figure: strongly connected components {B, C, D, E}, {A}, {G, H}, {F}]
• Weakly connected components:
every node can be reached from every other node by following links in either
direction
[Figure: weakly connected components {A, B, C, D, E}, {G, H, F}]
• In undirected networks one talks simply about
“connected components”
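Both kinds of component can be found with plain graph search. The sketch below is one simple way to do it, assuming a hypothetical arc list on nodes A to H: a strongly connected component is taken as the set of nodes reachable from a node that can also reach it back, and a weakly connected component as the set reachable when direction is ignored. Faster algorithms exist (for example Tarjan's); this version favors readability.

```python
# Hypothetical directed network on nodes A..H.
nodes = "ABCDEFGH"
arcs = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "B"),
        ("F", "G"), ("G", "H"), ("H", "G")]

out_adj = {v: set() for v in nodes}   # follow arcs forward
in_adj = {v: set() for v in nodes}    # follow arcs backward
for u, v in arcs:
    out_adj[u].add(v)
    in_adj[v].add(u)
either_adj = {v: out_adj[v] | in_adj[v] for v in nodes}  # ignore direction

def reachable(start, adj):
    """All nodes that can be reached from start (including start)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def strongly_connected_components():
    remaining, result = set(nodes), []
    for v in nodes:
        if v in remaining:
            # Nodes v can reach AND that can reach v.
            comp = reachable(v, out_adj) & reachable(v, in_adj)
            result.append(sorted(comp))
            remaining -= comp
    return result

def weakly_connected_components():
    remaining, result = set(nodes), []
    for v in nodes:
        if v in remaining:
            comp = reachable(v, either_adj)
            result.append(sorted(comp))
            remaining -= comp
    return result

print(strongly_connected_components())
# [['A'], ['B', 'C', 'D', 'E'], ['F'], ['G', 'H']]
print(weakly_connected_components())
# [['A', 'B', 'C', 'D', 'E'], ['F', 'G', 'H']]
```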
8. Giant Component
In a network, a "component" is a group of nodes (people) that are all connected to
each other, directly or indirectly. A "giant component" is a single component that
contains a large fraction of the network's nodes, so almost every node is reachable
from almost every other.
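A rough way to check for a giant component is to compare the size of the largest component with the size of the whole network. The ten-node edge list and the function name below are assumptions made for illustration.

```python
def component_sizes(nodes, edges):
    """Sizes of the connected components of an undirected network, largest first."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, sizes = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search collects one whole component.
        stack, size = [start], 0
        seen.add(start)
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Hypothetical network: a chain of 8 nodes plus one separate pair.
nodes = list(range(10))
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (8, 9)]

sizes = component_sizes(nodes, edges)
giant_fraction = sizes[0] / len(nodes)
print(sizes)           # [8, 2]
print(giant_fraction)  # 0.8
```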
9. Centrality Measures
• We often want to identify the important nodes in a network.
• Centrality is one way to measure a node's importance.
There are many centrality measures; in this course we look at:
Degree - connectedness
Closeness - ease of reaching other nodes
Betweenness - role as an intermediary node
Eigenvector - not what you know, but who you know
Things to Remember
• Centrality is a measure of a node; centralization is a measure of
the network.
• It matters whether you are considering a directed or an undirected
network.
• Most centrality measures work on binary/unweighted networks
10. Centrality Measures
Node Degree Centrality : divide degree by the max. possible, i.e. (N-1)
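Degree Centralization:
$$C_D = \frac{\sum_{i=1}^{g} \left[ C_D(n^*) - C_D(i) \right]}{(N-1)(N-2)}$$
where C_D(n^*) is the maximum value in the network and the sum runs over all g = N nodes.
[Figure: two example networks, one with C_D = 0.167 and one with C_D = 1.0]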
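The two quantities can be sketched as follows. The code assumes that C_D(i) inside the centralization sum is the raw degree of node i, and it uses a five-node line and a five-node star as test networks; both choices are assumptions, made because they reproduce the values 0.167 and 1.0 quoted above.

```python
def adjacency(edges):
    """Build undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centrality(adj):
    """Degree of each node divided by the maximum possible, N - 1."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def degree_centralization(adj):
    """Sum of (max degree - degree) over all nodes, divided by (N-1)(N-2)."""
    n = len(adj)
    degrees = [len(nbrs) for nbrs in adj.values()]
    max_degree = max(degrees)
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))

# Hypothetical test networks with five nodes each.
star = adjacency([("hub", leaf) for leaf in ["a", "b", "c", "d"]])
line = adjacency([("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")])

print(degree_centrality(star)["hub"])          # 1.0
print(degree_centrality(star)["a"])            # 0.25
print(degree_centralization(star))             # 1.0
print(round(degree_centralization(line), 3))   # 0.167
```

The star is the most centralized network possible under this measure, because one node holds every edge; the line spreads degree almost evenly.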
11. Centrality Measures
Closeness Centrality
• One still wants to be in the “middle” of the network, not
too far from the center.
• Closeness is based on the length of the average shortest
path between a node and all other nodes in the network.
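Closeness Centrality:
$$C_C(i) = \left[ \sum_{j=1}^{N} d(i,j) \right]^{-1}$$
Normalized Closeness Centrality:
$$C'_C(i) = C_C(i) \, (N-1)$$
Example, for a node A whose distances to the other four nodes are 1, 2, 3, and 4:
$$C'_C(A) = \left[ \frac{\sum_{j=1}^{N} d(A,j)}{N-1} \right]^{-1} = \left[ \frac{1+2+3+4}{4} \right]^{-1} = \left[ \frac{10}{4} \right]^{-1} = 0.4$$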
An actor who has very low closeness centrality takes many more steps to get to everyone.
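The worked value 0.4 can be reproduced with a breadth-first search. The sketch below assumes a connected, undirected five-node path A - B - C - D - E, which is the only network in which A has distances 1, 2, 3, and 4 to the other nodes; the function names are hypothetical.

```python
from collections import deque

def distances_from(adj, source):
    """Shortest-path length from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def normalized_closeness(adj, node):
    """Inverse of the average distance to the other N - 1 nodes.

    Assumes the network is connected, so every distance is finite.
    """
    total = sum(distances_from(adj, node).values())
    return (len(adj) - 1) / total

path = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

print(normalized_closeness(path, "A"))            # 0.4
print(round(normalized_closeness(path, "C"), 3))  # 0.667
```

The middle node C scores higher than the end node A because its distances sum to 6 instead of 10.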
12. Centrality Measures
Betweenness Centrality
• A node has a high betweenness centrality when it occupies a
position on the geodesics (shortest paths) connecting many pairs
of other actors in the network.
• It counts the shortest paths between all pairs of other vertices
that pass through that node, as a fraction of all shortest paths
for each pair.
$$C_B(i) = \sum_{j<k} \frac{g_{jk}(i)}{g_{jk}}$$
where g_{jk} = the number of shortest paths connecting j and k, and
g_{jk}(i) = the number of those paths that actor i is on.
Usually normalized by the number of pairs of vertices excluding the vertex itself:
$$C'_B(i) = \frac{C_B(i)}{(n-1)(n-2)/2}$$
13. Betweenness Centrality
• non-normalized version, for the path A - B - C - D - E:
A lies between no two other vertices
B lies between A and 3 other vertices: C, D, and E
C lies between 4 pairs of vertices: (A,D), (A,E), (B,D), (B,E)
Note that there are no alternate paths for these pairs to
take, so C gets full credit.
14. Betweenness Example
Why do C and D each have betweenness 1?
They are both on shortest paths for the pairs
(A,E) and (B,E), and so must share credit:
• ½ + ½ = 1
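Both betweenness examples can be checked with a short script that counts shortest paths by breadth-first search. The five-node path comes from the previous slide. The second edge list (A-B, B-C, B-D, C-E, D-E) is an assumption: the figure for this slide is not available, and this is simply one network in which C and D share credit for the pairs (A,E) and (B,E). The approach is a direct reading of the formula, not an optimized algorithm such as Brandes'.

```python
from collections import deque
from itertools import combinations

def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_counts(adj, source):
    """Return (dist, sigma): distance and number of shortest paths from source."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """Non-normalized betweenness: sum over pairs j<k of g_jk(i) / g_jk."""
    info = {v: bfs_counts(adj, v) for v in adj}
    scores = {v: 0.0 for v in adj}
    for j, k in combinations(sorted(adj), 2):
        dist_j, sigma_j = info[j]
        if k not in dist_j:              # j and k are not connected
            continue
        for i in adj:
            if i == j or i == k or i not in dist_j:
                continue
            dist_i, sigma_i = info[i]
            # i is on a shortest j-k path iff d(j,i) + d(i,k) = d(j,k).
            if k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                scores[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return scores

path = adjacency([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
shared = adjacency([("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E")])

path_scores = betweenness(path)
shared_scores = betweenness(shared)
print(path_scores["A"], path_scores["B"], path_scores["C"])  # 0.0 3.0 4.0
print(shared_scores["C"], shared_scores["D"])                # 1.0 1.0

# Normalized value for C on the path: divide by (n-1)(n-2)/2 = 6.
n = len(path)
print(round(path_scores["C"] / ((n - 1) * (n - 2) / 2), 3))  # 0.667
```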
15. Eigenvector Centrality
Eigenvector centrality is calculated by assessing how well connected an individual is
to the parts of the network with the greatest connectivity.
In the figure, the yellow node has higher eigenvector centrality because it is
connected to the nodes with the greatest connectivity.
Applications: individuals with high eigenvector centrality are often leaders of the
network. They are often public figures with many connections to other high-profile
individuals. Thus, they often play the role of key opinion leaders and shape public
perception. A related example is Google's PageRank algorithm, which is closely
related to eigenvector centrality calculated on websites based on links to them.
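A common way to compute eigenvector centrality is power iteration: start with equal scores, repeatedly replace each node's score by the sum of its neighbors' scores, and rescale. The sketch below is a minimal version for a connected, undirected network. It adds each node's own score in every step (iterating with A + I instead of A), an assumption made here so the iteration also settles on networks such as stars and lines; this does not change the resulting ranking. The edge list is hypothetical.

```python
def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def eigenvector_centrality(adj, iterations=500, tol=1e-12):
    """Power iteration on (A + I); scores are scaled so the maximum is 1."""
    scores = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Own score plus the scores of all neighbors.
        new = {v: scores[v] + sum(scores[w] for w in adj[v]) for v in adj}
        top = max(new.values())
        new = {v: s / top for v, s in new.items()}
        if max(abs(new[v] - scores[v]) for v in adj) < tol:
            return new
        scores = new
    return scores

# Hypothetical network: triangle A-B-C with a tail C-D-E.
graph = adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")])
centrality = eigenvector_centrality(graph)

for node in sorted(centrality, key=centrality.get, reverse=True):
    print(node, round(centrality[node], 3))
```

In this network A and D both have degree 2, but A scores higher because its neighbors (B and C) are better connected than D's neighbors (C and E), which is the "who you know" idea in practice.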