Network Measures
SOCIAL MEDIA MINING
Social Media Mining: Network Measures (http://socialmediamining.info/)
Dear instructors/users of these slides:
Please feel free to include these slides in your own
material, or modify them as you see fit. If you decide
to incorporate these slides into your presentations,
please include the following note:
R. Zafarani, M. A. Abbasi, and H. Liu, Social Media Mining:
An Introduction, Cambridge University Press, 2014.
Free book and slides at http://socialmediamining.info/
or include a link to the website:
http://socialmediamining.info/
Klout
It is difficult to measure influence!
Why Do We Need Measures?
• Who are the central figures (influential individuals) in
the network?
– Centrality
• What interaction patterns are common in friends?
– Reciprocity and Transitivity
– Balance and Status
• Who are the like-minded users and how can we find
these similar individuals?
– Similarity
• To answer these and similar questions, one first
needs to define measures for quantifying centrality,
level of interactions, and similarity, among others.
Centrality defines how important a node is within a network
Centrality
Centrality in terms of those
who you are connected to
Degree Centrality
• Degree centrality ranks nodes with more connections higher in terms of centrality:
  $C_d(v_i) = d_i$
• $d_i$ is the degree (number of friends) of node $v_i$
– i.e., the number of length-1 paths (can be generalized)
In this graph, degree centrality for node $v_1$ is $d_1 = 8$, and for all others it is $d_j = 1$, $j \neq 1$
Degree Centrality in Directed Graphs
• In directed graphs, we can use the in-degree, the out-degree, or their combination as the degree centrality value:
  $C_d(v_i) = d_i^{in}$
  $C_d(v_i) = d_i^{out}$
  $C_d(v_i) = d_i^{in} + d_i^{out}$
• In practice, mostly in-degree is used.
$d_i^{out}$ is the number of outgoing links for node $v_i$
Normalized Degree Centrality
• Normalized by the maximum possible degree:
  $C_d^{norm}(v_i) = \frac{d_i}{n - 1}$
• Normalized by the maximum degree:
  $C_d^{max}(v_i) = \frac{d_i}{\max_j d_j}$
• Normalized by the degree sum:
  $C_d^{sum}(v_i) = \frac{d_i}{\sum_j d_j}$
Degree Centrality (Directed Graph) Example
Normalized by the maximum possible degree (here: out-degree divided by n − 1 = 6)
[Figure: directed graph on nodes A to G]
Node  In-Degree  Out-Degree  Centrality  Rank
A     1          3           1/2         1
B     1          2           1/3         3
C     2          3           1/2         1
D     3          1           1/6         5
E     2          1           1/6         5
F     2          2           1/3         3
G     2          1           1/6         5
Degree Centrality (Undirected Graph) Example
[Figure: undirected graph on nodes A to G]
Node  Degree  Centrality  Rank
A     4       2/3         2
B     3       1/2         5
C     5       5/6         1
D     4       2/3         2
E     3       1/2         5
F     4       2/3         2
G     3       1/2         5
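The three normalizations of degree centrality can be checked numerically. The following is a minimal sketch that takes the degrees from the undirected example table as given; the function name `normalized_degree` and the dictionary keys are illustrative and not part of the slides.

```python
from fractions import Fraction

# Degrees copied from the undirected example table (nodes A to G).
degrees = {"A": 4, "B": 3, "C": 5, "D": 4, "E": 3, "F": 4, "G": 3}

def normalized_degree(degrees):
    """Return the three normalized degree centralities for every node."""
    n = len(degrees)
    max_degree = max(degrees.values())
    degree_sum = sum(degrees.values())
    return {
        node: {
            "max_possible": Fraction(d, n - 1),  # divide by n - 1
            "max": Fraction(d, max_degree),      # divide by the largest degree
            "sum": Fraction(d, degree_sum),      # divide by the sum of all degrees
        }
        for node, d in degrees.items()
    }

centrality = normalized_degree(degrees)
for node in sorted(degrees):
    print(node, centrality[node]["max_possible"])
```

The "max_possible" column reproduces the Centrality column of the table (for example 2/3 for A and 5/6 for C). Exact fractions are used only to make the comparison with the table easy; floats would work equally well.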
Eigenvector Centrality
• Having more friends does not by itself guarantee that someone is more important
– Having more important friends provides a stronger signal
Phillip Bonacich
• Eigenvector centrality generalizes degree
centrality by incorporating the importance of
the neighbors (undirected)
• For directed graphs, we can use incoming or
outgoing edges
Formulation
• Let's assume the eigenvector centrality of a node is $c_e(v_i)$ (unknown)
• We would like $c_e(v_i)$ to be higher when important neighbors (nodes $v_j$ with higher $c_e(v_j)$) point to us
– Incoming or outgoing neighbors?
– For incoming neighbors, $A_{j,i} = 1$
• We can assume that $v_i$'s centrality is the summation of its neighbors' centralities:
  $c_e(v_i) = \sum_{j=1}^{n} A_{j,i} \, c_e(v_j)$
• Is this summation bounded?
• We have to normalize!
  $c_e(v_i) = \frac{1}{\lambda} \sum_{j=1}^{n} A_{j,i} \, c_e(v_j)$
  $\lambda$: some fixed constant
Eigenvector Centrality (Matrix Formulation)
• Let $\mathbf{C}_e = (c_e(v_1), c_e(v_2), \dots, c_e(v_n))^T$; then
  $\lambda \mathbf{C}_e = A^T \mathbf{C}_e$
• This means that $\mathbf{C}_e$ is an eigenvector of the adjacency matrix $A^T$ (or $A$ when undirected) and $\lambda$ is the corresponding eigenvalue
• Which eigenvalue-eigenvector pair should we choose?
Finding the eigenvalue by finding a fixed point…
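• Start from an initial guess $\mathbf{C}_e(0)$ (e.g., all centralities are 1) and iterate $t$ times:
  $\mathbf{C}_e(t) = (A^T)^t \, \mathbf{C}_e(0)$
• We can write $\mathbf{C}_e(0)$ as a linear combination of the eigenvectors $\mathbf{v}_i$ of $A^T$:
  $\mathbf{C}_e(0) = \sum_i c_i \mathbf{v}_i$
• Substituting this, we get
  $\mathbf{C}_e(t) = (A^T)^t \sum_i c_i \mathbf{v}_i = \sum_i c_i \lambda_i^t \mathbf{v}_i = \lambda_1^t \sum_i c_i \left(\frac{\lambda_i}{\lambda_1}\right)^t \mathbf{v}_i$
  where $\lambda_1$ is the largest eigenvalue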
Finding the eigenvalue by finding a fixed point…
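• As $t$ grows, we will have in the limit
  $\frac{\mathbf{C}_e(t)}{\lambda_1^t} \to c_1 \mathbf{v}_1$
  where $c_1 \mathbf{v}_1$ is the component of $\mathbf{C}_e(0)$ along the eigenvector of the largest eigenvalue $\lambda_1$
• Or equivalently
  $A^T \mathbf{C}_e = \lambda_1 \mathbf{C}_e$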
• If we start with an all positive 𝐶𝑒(0) all 𝐶𝑒(𝑡)’s
will be positive (why?)
– All the centrality values would be positive
– We need an eigenvalue-eigenvector pair that
guarantees all centralities have the same sign
• E.g., for comparison purposes
Eigenvector Centrality, cont.
So, to compute the eigenvector centrality of $A$:
1. We compute the eigenvalues of $A$
2. Select the largest eigenvalue $\lambda$
3. The eigenvector corresponding to $\lambda$ is $\mathbf{C}_e$
4. Based on the Perron-Frobenius theorem, all the components of $\mathbf{C}_e$ will be positive
5. The components of $\mathbf{C}_e$ are the eigenvector centralities for the graph.
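The fixed-point view of eigenvector centrality can be sketched with plain power iteration. This is a minimal illustration, not a definitive implementation: the function name, the iteration count, and the small example graph (a triangle with one pendant node) are assumptions made here and do not come from the slides.

```python
import math

def eigenvector_centrality(adj, iterations=200):
    """Approximate eigenvector centrality by power iteration on A^T.

    adj is an adjacency matrix as a list of rows; adj[j][i] == 1 means j -> i.
    """
    n = len(adj)
    c = [1.0] * n  # all-positive initial guess C_e(0)
    for _ in range(iterations):
        # c_new[i] = sum_j A[j][i] * c[j]: centrality arrives over incoming edges
        c_new = [sum(adj[j][i] * c[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in c_new))
        c = [x / norm for x in c_new]  # rescale so the vector keeps norm 1
    # With c at unit length, c . (A^T c) estimates the largest eigenvalue
    at_c = [sum(adj[j][i] * c[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(ci * yi for ci, yi in zip(c, at_c))
    return c, eigenvalue

# Hypothetical undirected graph: triangle 0-1-2 plus a pendant node 3 attached to 0.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
centrality, largest_eigenvalue = eigenvector_centrality(A)
print([round(x, 3) for x in centrality], round(largest_eigenvalue, 3))
```

Node 0 receives the highest value and the pendant node 3 the lowest. Note that plain power iteration can oscillate on bipartite graphs (such as a simple path), where the largest and smallest eigenvalues have the same magnitude; the example graph contains a triangle to avoid that case.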
Eigenvector Centrality: Example 1
Eigenvalues are
Largest Eigenvalue
Corresponding eigenvector (assuming 𝐂 𝐞 has norm 1)
Eigenvector Centrality: Example 2
Eigenvalues vector: $\lambda = (2.68, -1.74, -1.27, 0.33, 0.00)$
$\lambda_{max} = 2.68$
Katz Centrality
• A major problem with eigenvector centrality arises when it deals with directed graphs
• Centrality is only passed over outgoing edges, and in special cases, such as when a node is in a directed acyclic graph, centrality becomes zero
– This happens even though the node can have many edges connected to it
Elihu Katz
• To resolve this problem, we add a bias term $\beta$ to the centrality values for all nodes
Katz Centrality, cont.
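$C_{Katz}(v_i) = \alpha \sum_{j=1}^{n} A_{j,i} \, C_{Katz}(v_j) + \beta$
$\alpha$: controlling term; $\beta$: bias term
Rewriting the equation in vector form:
$\mathbf{C}_{Katz} = \alpha A^T \mathbf{C}_{Katz} + \beta \mathbf{1}$, where $\mathbf{1}$ is the vector of all 1's
Katz centrality:
$\mathbf{C}_{Katz} = \beta (I - \alpha A^T)^{-1} \cdot \mathbf{1}$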
Katz Centrality, cont.
• When $\alpha = 0$, the eigenvector centrality part is removed and all nodes get the same centrality value $\beta$
– As $\alpha$ gets larger, the effect of $\beta$ is reduced
• For the matrix $(I - \alpha A^T)$ to be invertible, we must have
– $\det(I - \alpha A^T) \neq 0$
– By rearranging, the matrix is singular exactly when $\det(A^T - \alpha^{-1} I) = 0$
– This is basically the characteristic equation of $A^T$
– As $\alpha$ increases from zero, the characteristic equation first becomes zero when $\alpha^{-1}$ equals the largest eigenvalue
The largest eigenvalue is easier to compute (power method)
In practice we select $\alpha < 1/\lambda$, where $\lambda$ is the largest eigenvalue of $A^T$
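The role of the bias term can be illustrated on a directed acyclic graph, where eigenvector centrality would be zero for every node. The sketch below iterates the vector form of Katz centrality instead of inverting the matrix; the function name, the iteration count, and the three-node chain are assumptions made for illustration. For an acyclic graph all eigenvalues of the adjacency matrix are zero, so the iteration converges for any $\alpha$; in general it requires $\alpha < 1/\lambda$.

```python
def katz_centrality(adj, alpha, beta, iterations=100):
    """Iterate C = alpha * A^T C + beta * 1 (adj[j][i] == 1 means j -> i)."""
    n = len(adj)
    c = [0.0] * n
    for _ in range(iterations):
        c = [alpha * sum(adj[j][i] * c[j] for j in range(n)) + beta
             for i in range(n)]
    return c

# Hypothetical directed acyclic chain 0 -> 1 -> 2.
chain = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
print(katz_centrality(chain, alpha=0.5, beta=1.0))  # [1.0, 1.5, 1.75]
print(katz_centrality(chain, alpha=0.0, beta=1.0))  # [1.0, 1.0, 1.0]
```

Node 0 has no incoming edges and keeps only the bias $\beta$, while nodes further down the chain accumulate attenuated centrality from their predecessors. With $\alpha = 0$ every node gets exactly $\beta$.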
Katz Centrality Example
• The eigenvalues are -1.68, -1.0, -1.0, 0.35, 3.32
• We assume α = 0.25 < 1/3.32 and β = 0.2
[Figure callout: Most important nodes!]
PageRank
• Problem with Katz Centrality:
– In directed graphs, once a node becomes an authority
(high centrality), it passes all its centrality along all of its
out-links
• This is less desirable since not everyone known by
a well-known person is well-known
• Solution?
– We can divide the value of passed centrality by the
number of outgoing links, i.e., out-degree of that node
– Each connected neighbor gets a fraction of the source
node’s centrality
PageRank, cont.
$C_p(v_i) = \alpha \sum_{j=1}^{n} A_{j,i} \frac{C_p(v_j)}{d_j^{out}} + \beta$
What if the degree is zero?
In vector form: $\mathbf{C}_p = \beta (I - \alpha A^T D^{-1})^{-1} \cdot \mathbf{1}$, where $D = \mathrm{diag}(d_1^{out}, \dots, d_n^{out})$
Similar to Katz centrality, in practice $\alpha < 1/\lambda$, where $\lambda$ is the largest eigenvalue of $A^T D^{-1}$. In undirected graphs, the largest eigenvalue of $A^T D^{-1}$ is $\lambda = 1$; therefore, $\alpha < 1$.
PageRank Example
• We assume α = 0.95 < 1 and β = 0.1
PageRank Example – Alternative Approach [Markov Chains]
Using the power method ($\alpha = 1$ and $\beta = 0$?)
Step  A      B      C        D              E        F          G
0     1/7    1/7    1/7      1/7            1/7      1/7        1/7
1     B/2    C/3    A/3 + G  A/3+C/3+F/2    A/3 + D  C/3 + B/2  F/2 + E
      0.071  0.048  0.190    0.167          0.190    0.119      0.214
"You don't understand anything until you learn it more than one way"
Marvin Minsky (1927-2016)
PageRank: Example
Step A B C D E F G Sum
1 0.143 0.143 0.143 0.143 0.143 0.143 0.143 1.000
2 0.071 0.048 0.190 0.167 0.190 0.119 0.214 1.000
3 0.024 0.063 0.238 0.147 0.190 0.087 0.250 1.000
4 0.032 0.079 0.258 0.131 0.155 0.111 0.234 1.000
5 0.040 0.086 0.245 0.152 0.142 0.126 0.210 1.000
6 0.043 0.082 0.224 0.158 0.165 0.125 0.204 1.000
7 0.041 0.075 0.219 0.151 0.172 0.115 0.228 1.000
8 0.037 0.073 0.241 0.144 0.165 0.110 0.230 1.000
9 0.036 0.080 0.242 0.148 0.157 0.117 0.220 1.000
10 0.040 0.081 0.232 0.151 0.160 0.121 0.215 1.000
11 0.040 0.077 0.228 0.151 0.165 0.118 0.220 1.000
12 0.039 0.076 0.234 0.148 0.165 0.115 0.223 1.000
13 0.038 0.078 0.236 0.148 0.161 0.116 0.222 1.000
14 0.039 0.079 0.235 0.149 0.161 0.118 0.219 1.000
15 0.039 0.078 0.232 0.150 0.162 0.118 0.220 1.000
Rank 7 6 1 4 3 5 2
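The table can be reproduced with a short power-method sketch. The edge list below is inferred from the update rules in the Markov-chain table (for example, "A/3 + G" for C means A has three out-links, one of them to C, and G has a single out-link to C); the function name and parameters are illustrative. The sketch assumes every node has at least one outgoing link, which holds for this graph, and uses $\alpha = 1$, $\beta = 0$ by default.

```python
def pagerank(out_links, alpha=1.0, beta=0.0, steps=14):
    """Return the list of rank vectors, starting from the uniform distribution."""
    nodes = sorted(out_links)
    rank = {v: 1.0 / len(nodes) for v in nodes}
    history = [rank]
    for _ in range(steps):
        new_rank = {v: beta for v in nodes}
        for source, targets in out_links.items():
            # Each out-neighbor receives an equal share of the source's centrality
            share = alpha * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
        history.append(rank)
    return history

# Out-links inferred from the update rules in the table above.
out_links = {
    "A": ["C", "D", "E"],
    "B": ["A", "F"],
    "C": ["B", "D", "F"],
    "D": ["E"],
    "E": ["G"],
    "F": ["D", "G"],
    "G": ["C"],
}
history = pagerank(out_links)
print({v: round(x, 3) for v, x in history[1].items()})   # second row of the table
print(sorted(history[-1], key=history[-1].get, reverse=True))
```

Here `history[0]` corresponds to Step 1 of the table (all values 1/7) and `history[-1]` to Step 15. Sorting the final values in decreasing order gives C, G, E, D, F, B, A, which matches the Rank row.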
Effect of PageRank
PageRank
Node Rank
A 7
B 6
C 1
D 4
E 3
F 5
G 2
Centrality in terms of how
you connect others
(information broker)
Betweenness Centrality
Another way of looking at centrality is by considering how important nodes are in connecting other nodes
$C_b(v_i) = \sum_{s \neq t \neq v_i} \frac{\sigma_{st}(v_i)}{\sigma_{st}}$
$\sigma_{st}(v_i)$: the number of shortest paths from $s$ to $t$ that pass through $v_i$
$\sigma_{st}$: the number of shortest paths from vertex $s$ to $t$ (a.k.a. information pathways)
Linton Freeman
Normalizing Betweenness Centrality
• In the best case, node $v_i$ is on all shortest paths from $s$ to $t$, hence $\frac{\sigma_{st}(v_i)}{\sigma_{st}} = 1$ and
  $C_b(v_i) = \sum_{s \neq t \neq v_i} 1 = 2\binom{n-1}{2} = (n-1)(n-2)$
Therefore, the maximum value is $(n-1)(n-2)$
Betweenness centrality (normalized):
  $C_b^{norm}(v_i) = \frac{C_b(v_i)}{2\binom{n-1}{2}}$
Betweenness Centrality: Example 1
Betweenness Centrality: Example 2
Node  Betweenness Centrality  Rank
A     16 + 1/2 + 1/2          1
B     7 + 5/2                 3
C     0                       7
D     5/2                     5
E     1/2 + 1/2               6
F     15 + 2                  1
G     0                       7
H     0                       7
I     7                       4
Computing Betweenness
• In betweenness centrality, we compute
shortest paths between all pairs of nodes to
compute the betweenness value.
• Trivial Solution:
– Use Dijkstra and run it $O(n)$ times
– We get an $O(n^3)$ solution
• Better Solution:
– Brandes Algorithm:
• $O(nm)$ for unweighted graphs
• $O(nm + n^2 \log n)$ for weighted graphs
Brandes Algorithm [2001]
$C_b(v_i) = \sum_{s \neq v_i} \delta_s(v_i)$, where $\delta_s(v_i) = \sum_{t \neq v_i} \frac{\sigma_{st}(v_i)}{\sigma_{st}}$
$pred(s, w)$ is the set of predecessors of $w$ in the shortest paths from $s$ to $w$.
– In the most basic scenario, $w$ is the immediate child of $v_i$
There exists a recurrence equation that can help us determine $\delta_s(v_i)$:
$\delta_s(v_i) = \sum_{w : v_i \in pred(s, w)} \frac{\sigma_{s v_i}}{\sigma_{s w}} \left(1 + \delta_s(w)\right)$
How to compute $\sigma_{st}$
• Run a BFS on the original network starting at A (i.e., $s$)
• The number of shortest paths to a node is the sum of its parents' values
Source: Networks, Crowds, and Markets: Reasoning about a Highly Connected World, by David Easley and Jon Kleinberg
How do you compute 𝛿𝑠(𝑣𝑖)
No shortest path starting from node 1 passes through node 9
Worked values from the figure:
2/2 (1 + 0)
1/1 (3/2 + 1) + 1/1 (3/2 + 1)
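The two phases of the Brandes algorithm (a BFS that counts shortest paths, then a backward pass that accumulates dependencies) can be sketched as follows for unweighted, undirected graphs. The edge list is an assumption: it is inferred from the entries equal to 1 in the undirected distance table of the closeness example on nodes A to I. Each unordered pair is counted once here (the final division by 2), which is the convention that reproduces the values in the betweenness example table; counting ordered pairs doubles every value.

```python
from collections import deque

def brandes_betweenness(adj):
    """Betweenness centrality for an unweighted, undirected graph (adjacency lists)."""
    betweenness = {v: 0.0 for v in adj}
    for s in adj:
        # Phase 1: BFS from s, counting shortest paths and recording predecessors
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:              # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # v precedes w on a shortest path
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: accumulate dependencies from the farthest nodes back toward s
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    # Every unordered pair {s, t} was visited from both endpoints
    return {v: b / 2 for v, b in betweenness.items()}

edges = [("A", "B"), ("A", "D"), ("A", "F"), ("B", "C"), ("B", "E"),
         ("D", "E"), ("F", "G"), ("F", "I"), ("H", "I")]
adj = {v: [] for edge in edges for v in edge}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

for node, value in sorted(brandes_betweenness(adj).items()):
    print(node, value)
```

On this graph the sketch gives 17 for A and F, 9.5 for B, 7 for I, 2.5 for D, 1 for E, and 0 for C, G, and H.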
Centrality in terms of how
fast you can reach others
Closeness Centrality
• The intuition is that influential/central nodes can quickly reach other nodes
• These nodes should have a smaller average shortest path length to others
Closeness centrality:
  $C_c(v_i) = \frac{1}{\bar{l}_{v_i}}$, where $\bar{l}_{v_i} = \frac{1}{n-1} \sum_{v_j \neq v_i} l_{i,j}$ is the average shortest path length from $v_i$ to the other nodes
Linton Freeman
Closeness Centrality: Example 1
Closeness Centrality: Example 2 (Undirected)
Node  A  B  C  D  E  F  G  H  I  D_Avg  Closeness Centrality  Rank
A     0  1  2  1  2  1  2  3  2  1.750  0.571                 1
B     1  0  1  2  1  2  3  4  3  2.125  0.471                 3
C     2  1  0  3  2  3  4  5  4  3.000  0.333                 8
D     1  2  3  0  1  2  3  4  3  2.375  0.421                 4
E     2  1  2  1  0  3  4  5  4  2.750  0.364                 6
F     1  2  3  2  3  0  1  2  1  1.875  0.533                 2
G     2  3  4  3  4  1  0  3  2  2.750  0.364                 6
H     3  4  5  4  5  2  3  0  1  3.375  0.296                 9
I     2  3  4  3  4  1  2  1  0  2.500  0.400                 5
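The closeness values of the undirected example can be recomputed with one BFS per node. This is a sketch under two assumptions: the edge list is inferred from the entries equal to 1 in the distance table, and the graph is connected (otherwise some distances would be undefined). The function name is illustrative.

```python
from collections import deque

def closeness_centrality(adj):
    """Closeness = 1 / (average shortest-path length to all other nodes)."""
    n = len(adj)
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # plain BFS, unweighted graph
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        average = sum(dist.values()) / (n - 1)   # assumes every node was reached
        result[source] = 1 / average
    return result

edges = [("A", "B"), ("A", "D"), ("A", "F"), ("B", "C"), ("B", "E"),
         ("D", "E"), ("F", "G"), ("F", "I"), ("H", "I")]
adj = {v: [] for edge in edges for v in edge}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

closeness = closeness_centrality(adj)
for node in sorted(closeness):
    print(node, round(closeness[node], 3))
```

For node A the average distance is 14/8 = 1.75, so its closeness is 0.571; for H it is 27/8 = 3.375, giving 0.296.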
Closeness Centrality: Example 3 (Directed)
Node  A  B  C  D  E  F  G  H  I  D_Avg  Closeness Centrality  Rank
A     0  1  2  3  2  2  1  3  3  2.125  0.471                 1
B     3  0  1  2  1  4  4  2  3  2.500  0.400                 2
C     4  5  0  7  6  3  5  1  2  4.125  0.242                 9
D     1  2  3  0  3  3  2  4  5  2.875  0.348                 3
E     2  3  4  1  0  4  3  5  5  3.375  0.296                 6
F     1  2  3  4  3  0  2  4  4  2.875  0.348                 3
G     2  3  4  5  4  1  0  5  2  3.250  0.308                 5
H     4  4  5  6  5  2  4  0  1  3.875  0.258                 8
I     2  3  4  5  4  1  4  5  0  3.500  0.286                 7
An Interesting Comparison!
Comparing three centrality values
• Generally, the three centrality types will be positively correlated
• When they are not (or the correlation is low), it usually reveals interesting information
High Degree, Low Closeness: Node is embedded in a community that is far from the rest of the network
High Degree, Low Betweenness: Ego's connections are redundant; communication bypasses the node
High Closeness, Low Degree: Key node connected to important/active alters
High Closeness, Low Betweenness: Probably multiple paths in the network; ego is near many people, but so are many others
High Betweenness, Low Degree: Ego's few ties are crucial for network flow
High Betweenness, Low Closeness: Very rare! Ego monopolizes the ties from a small number of people to many others.
This slide is modified from a slide developed by James Moody
Centrality for a
group of nodes
Group Centrality
• All centrality measures defined so far measure
centrality for a single node. These measures
can be generalized for a group of nodes.
• A simple approach is to replace all nodes in a
group with a super node
– The group structure is disregarded.
• Let 𝑆 denote the set of nodes in the group and
𝑉 − 𝑆 the set of outsiders
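Group Centrality
I. Group Degree Centrality
  $C_d^{group}(S) = |\{v_i \in V - S \mid v_i \text{ is connected to some } v_j \in S\}|$
– Normalization: divide by $|V - S|$
II. Group Betweenness Centrality
  $C_b^{group}(S) = \sum_{s \neq t,\ s \notin S,\ t \notin S} \frac{\sigma_{st}(S)}{\sigma_{st}}$, where $\sigma_{st}(S)$ is the number of shortest paths from $s$ to $t$ that pass through members of $S$
– Normalization: divide by $2\binom{|V - S|}{2}$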
Group Centrality
III. Group Closeness Centrality
– It is based on the average distance from non-members to the group
  $C_c^{group}(S) = \frac{1}{\bar{l}_S^{group}}$, where $\bar{l}_S^{group} = \frac{1}{|V - S|} \sum_{v_j \notin S} l_{S, v_j}$ and $l_{S, v_j} = \min_{v_i \in S} l_{v_i, v_j}$
• One can also utilize the maximum distance or the average distance to the group's members instead of the minimum
Group Centrality Example
• Consider $S = \{v_2, v_3\}$
• Group degree centrality = 3
• Group betweenness centrality = 3
• Group closeness centrality = 1
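Group degree and group closeness centrality can be sketched with a few lines of code. The figure for this example did not survive, so the five-node graph below is hypothetical: it was chosen only because it is consistent with the stated values for S = {v2, v3}. The function names are illustrative, and the closeness function uses the minimum distance from an outsider to any group member and assumes a connected graph.

```python
from collections import deque

def group_degree(adj, group):
    """Number of outsiders adjacent to at least one group member."""
    return len({w for v in group for w in adj[v]} - set(group))

def group_closeness(adj, group):
    """Inverse of the average distance from outsiders to the nearest member."""
    dist = {v: 0 for v in group}      # multi-source BFS started from all members
    queue = deque(group)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    outsiders = [v for v in adj if v not in group]
    return len(outsiders) / sum(dist[v] for v in outsiders)

# Hypothetical graph: v1 - v2 - v3, with v4 and v5 both attached to v3.
adj = {
    "v1": ["v2"],
    "v2": ["v1", "v3"],
    "v3": ["v2", "v4", "v5"],
    "v4": ["v3"],
    "v5": ["v3"],
}
S = {"v2", "v3"}
print(group_degree(adj, S))      # 3
print(group_closeness(adj, S))   # 1.0
```

All three outsiders (v1, v4, v5) are adjacent to the group, so the group degree is 3 and every outsider is at distance 1, which gives a group closeness of 1.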
Friendship Patterns
• Transitivity/Reciprocity
• Status/Balance
I. Transitivity and Reciprocity
Transitivity
• Mathematical representation:
– For a transitive relation $R$: $aRb \wedge bRc \rightarrow aRc$
• In a social network:
– Transitivity is when a friend of my friend is my friend
– Transitivity in a social network leads to a denser graph, which in turn is closer to a complete graph
– We can determine how close graphs are to the complete graph by measuring transitivity
$cRa$ or $aRc$?
[Global] Clustering Coefficient
• Clustering coefficient measures transitivity in undirected graphs
– Count paths of length two and check whether the third edge exists
  $C = \frac{|\text{closed paths of length 2}|}{|\text{paths of length 2}|}$
When counting triangles, since every triangle has 6 closed paths of length 2:
  $C = \frac{(\text{number of triangles}) \times 6}{|\text{paths of length 2}|}$
Clustering Coefficient and Triples
Or we can rewrite it as
  $C = \frac{(\text{number of triangles}) \times 3}{\text{number of connected triples of nodes}}$
• Triple: an ordered set of three nodes,
– connected by two edges (open triple) or
– three edges (closed triple)
• A triangle can miss any of its three edges
– A triangle has 3 triples
• $v_i v_j v_k$ and $v_j v_k v_i$ are different triples
– They have the same members
– The first is missing edge $e(v_k, v_i)$ and the second is missing edge $e(v_i, v_j)$
• $v_i v_j v_k$ and $v_k v_j v_i$ are the same triple
[Global] Clustering Coefficient: Example
Local Clustering Coefficient
• Local clustering coefficient measures transitivity at the node level
– Commonly employed for undirected graphs
– Computes how strongly neighbors of a node $v$ (nodes adjacent to $v$) are themselves connected
  $C(v_i) = \frac{\text{number of pairs of neighbors of } v_i \text{ that are connected}}{\text{number of pairs of neighbors of } v_i}$
In an undirected graph, the denominator can be rewritten as:
  $\binom{d_i}{2} = \frac{d_i (d_i - 1)}{2}$
Provides a way to determine structural holes
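Both the global and the local clustering coefficient come down to counting pairs of neighbors and checking whether the closing edge exists. The sketch below assumes an undirected graph stored as adjacency sets; the function names and the small example graph (a triangle with one pendant node) are illustrative and not taken from the slides.

```python
from itertools import combinations

def global_clustering(adj):
    """(number of triangles x 3) / (number of connected triples)."""
    triples = 0   # one triple per center node and unordered pair of its neighbors
    closed = 0    # triples whose two end nodes are also connected
    for center, neighbors in adj.items():
        for u, w in combinations(sorted(neighbors), 2):
            triples += 1
            if w in adj[u]:
                closed += 1   # each triangle is found once per corner, so 3 times
    return closed / triples if triples else 0.0

def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    neighbors = sorted(adj[v])
    d = len(neighbors)
    if d < 2:
        return 0.0            # no pair of neighbors exists
    links = sum(1 for u, w in combinations(neighbors, 2) if w in adj[u])
    return links / (d * (d - 1) / 2)

# Hypothetical graph: triangle 1-2-3 plus node 4 attached to node 1.
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
print(global_clustering(adj))     # 0.6
print(local_clustering(adj, 2))   # 1.0
```

The graph has one triangle and five connected triples, so the global coefficient is 3/5. Node 1 has three neighbors of which only one pair is connected, giving a local coefficient of 1/3; the missing links among its neighbors are the structural holes.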
Local Clustering Coefficient: Example
• Thin lines depict connections to neighbors
• Dashed lines are the missing link among neighbors
• Solid lines indicate connected neighbors
– When none of the neighbors are connected, $C = 0$
– When all neighbors are connected, $C = 1$
Reciprocity
If you become my friend, I'll be yours
• Reciprocity is a simplified version of transitivity
– It considers closed loops of length 2
• If node $v$ is connected to node $u$,
– $u$, by connecting to $v$, exhibits reciprocity
  $R = \frac{\sum_{i,j,\ i<j} A_{i,j} A_{j,i}}{|E|/2} = \frac{2}{|E|} \sum_{i,j,\ i<j} A_{i,j} A_{j,i} = \frac{1}{|E|} \mathrm{Tr}(A^2)$
What about $i = j$?
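Reciprocity counts the closed loops of length 2 in a directed graph, which is the trace of the squared adjacency matrix divided by the number of edges. The following sketch assumes a graph without self-loops; the function name and the three-node example are illustrative.

```python
def reciprocity(adj):
    """R = Tr(A^2) / |E| for a directed adjacency matrix without self-loops."""
    n = len(adj)
    edge_count = sum(sum(row) for row in adj)
    # Tr(A^2) = sum over i, j of A[i][j] * A[j][i]
    trace_a2 = sum(adj[i][j] * adj[j][i] for i in range(n) for j in range(n))
    return trace_a2 / edge_count

# Hypothetical graph with edges 0->1, 1->0, 1->2, 2->0.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
]
print(reciprocity(A))  # 0.5
```

Only the pair (0, 1) is reciprocal, so two of the four directed edges are reciprocated and R = 0.5.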
Reciprocity: Example
Reciprocal nodes: 𝑣1, 𝑣2
II. Balance and Status
• Measuring consistency in friendships
Social Balance Theory
– Consistency in friend/foe relationships among individuals
– Informally, friend/foe relationships are consistent when
• The friend of my friend is my friend
• The friend of my enemy is my enemy
• The enemy of my enemy is my friend
• The enemy of my friend is my enemy
• In the network
– Positive edges demonstrate friendships ($w_{ij} = 1$)
– Negative edges demonstrate being enemies ($w_{ij} = -1$)
• A triangle of nodes $i$, $j$, and $k$ is balanced if and only if
  $w_{ij} w_{jk} w_{ki} \geq 0$
– $w_{ij}$ denotes the value of the edge between nodes $i$ and $j$
Social Balance Theory: Possible Combinations
For any cycle, if the product of its edge values is positive, then the cycle is socially balanced
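The possible sign combinations of a triangle can be enumerated directly. This small sketch applies the product rule to edge values of +1 (friend) and -1 (enemy); the function name is illustrative.

```python
from itertools import product

def is_balanced(w_ij, w_jk, w_ki):
    """A triangle is balanced when the product of its edge values is non-negative."""
    return w_ij * w_jk * w_ki >= 0

# Enumerate all sign assignments for the three edges of a triangle.
for signs in product([1, -1], repeat=3):
    negatives = signs.count(-1)
    print(signs, "negative edges:", negatives, "balanced:", is_balanced(*signs))
```

Triangles with zero or two negative edges are balanced; triangles with one or three negative edges are not.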
Social Status Theory
• Status: how prestigious an individual is ranked within a society
• Social status theory:
– How consistent individuals are in assigning status to their neighbors
– Informally, if $X$ has a higher status than $Y$ and $Y$ has a higher status than $Z$, then $X$ should have a higher status than $Z$
Social Status Theory: Example
• A directed ‘+’ edge from node 𝑋 to node 𝑌
shows that 𝑌 has a higher status than 𝑋 and a
‘-’ one shows vice versa
Unstable configuration Stable configuration
Similarity
How similar are two nodes in a network?
• Structural Equivalence
• Regular Equivalence
Structural Equivalence
• Structural Equivalence:
– We look at the neighborhood shared by two nodes;
– The size of this shared neighborhood defines how
similar two nodes are.
• Example:
– Two brothers have in common
• sisters, mother, father, grandparents, etc.
– This shows that they are similar,
– Two random male or female individuals do not have
much in common and are dissimilar.
Structural Equivalence: Definitions
• Vertex similarity:
  $\sigma(v_i, v_j) = |N(v_i) \cap N(v_j)|$
Normalize?
Jaccard Similarity:
  $\sigma_{Jaccard}(v_i, v_j) = \frac{|N(v_i) \cap N(v_j)|}{|N(v_i) \cup N(v_j)|}$
Cosine Similarity:
  $\sigma_{Cosine}(v_i, v_j) = \frac{|N(v_i) \cap N(v_j)|}{\sqrt{|N(v_i)| \, |N(v_j)|}}$
• The neighborhood $N(v)$ often excludes the node $v$ itself.
– What can go wrong?
• Connected nodes not sharing a neighbor will be assigned zero similarity
– Solution:
• We can assume nodes are included in their neighborhoods
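Jaccard and cosine similarity operate on neighborhood sets, so they are straightforward to sketch. The four-node graph and the `closed` flag (which adds each node to its own neighborhood) are assumptions made for illustration.

```python
import math

def neighborhood(adj, v, closed=False):
    """N(v), optionally including v itself."""
    return adj[v] | {v} if closed else set(adj[v])

def jaccard(adj, u, v, closed=False):
    n_u, n_v = neighborhood(adj, u, closed), neighborhood(adj, v, closed)
    return len(n_u & n_v) / len(n_u | n_v)

def cosine(adj, u, v, closed=False):
    n_u, n_v = neighborhood(adj, u, closed), neighborhood(adj, v, closed)
    return len(n_u & n_v) / math.sqrt(len(n_u) * len(n_v))

# Hypothetical undirected graph with edges 1-2, 1-3, 2-3, 2-4.
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2}, 4: {2}}

print(jaccard(adj, 1, 4))               # 0.5
print(cosine(adj, 1, 4))                # 1 / sqrt(2)
print(jaccard(adj, 2, 4))               # 0.0 although 2 and 4 are connected
print(jaccard(adj, 2, 4, closed=True))  # 0.5 once nodes join their own neighborhoods
```

Nodes 2 and 4 are adjacent but share no neighbor, so their open-neighborhood similarity is zero; including each node in its own neighborhood removes that artifact.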
Similarity: Example
Similarity Significance
Measuring similarity significance: compare the calculated similarity value with its expected value when vertices pick their neighbors at random
• For vertices $v_i$ and $v_j$ with degrees $d_i$ and $d_j$, this expectation is $d_i d_j / n$
– There is a $d_i / n$ chance of becoming $v_i$'s neighbor
– $v_j$ selects $d_j$ neighbors
• We can rewrite neighborhood overlap as
  $\sigma(v_i, v_j) = |N(v_i) \cap N(v_j)| = \sum_k A_{i,k} A_{j,k}$
Normalized Similarity, cont.
$\sigma_{significance}(v_i, v_j) = \sum_k A_{i,k} A_{j,k} - \frac{d_i d_j}{n} = \sum_k (A_{i,k} - \bar{A}_i)(A_{j,k} - \bar{A}_j)$, where $\bar{A}_i = \frac{1}{n} \sum_k A_{i,k}$
What is this? It is $n$ times the covariance between $A_i$ and $A_j$.
Normalizing the covariance by the product of the standard deviations, we get the Pearson correlation coefficient:
$\sigma_{pearson}(v_i, v_j) = \frac{\sum_k (A_{i,k} - \bar{A}_i)(A_{j,k} - \bar{A}_j)}{\sqrt{\sum_k (A_{i,k} - \bar{A}_i)^2} \sqrt{\sum_k (A_{j,k} - \bar{A}_j)^2}}$
(range: $\sigma_{pearson} \in [-1, 1]$)
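The link between neighborhood overlap, its expected value, and the Pearson correlation can be verified on a small adjacency matrix. This is a sketch with illustrative function names and a hypothetical four-node graph; it assumes that no row of the matrix is constant (all zeros or all ones), since the Pearson denominator would then be zero.

```python
import math

def significance(adj, i, j):
    """Observed overlap minus the overlap expected by chance, d_i * d_j / n."""
    n = len(adj)
    overlap = sum(adj[i][k] * adj[j][k] for k in range(n))
    return overlap - sum(adj[i]) * sum(adj[j]) / n

def pearson(adj, i, j):
    """Pearson correlation between rows i and j of the adjacency matrix."""
    n = len(adj)
    mean_i, mean_j = sum(adj[i]) / n, sum(adj[j]) / n
    cov = sum((adj[i][k] - mean_i) * (adj[j][k] - mean_j) for k in range(n))
    var_i = sum((adj[i][k] - mean_i) ** 2 for k in range(n))
    var_j = sum((adj[j][k] - mean_j) ** 2 for k in range(n))
    return cov / math.sqrt(var_i * var_j)

# Hypothetical undirected graph with edges 0-1, 0-2, 1-2, 1-3.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
print(significance(A, 0, 3))       # 0.5 = 1 shared neighbor minus 2 * 1 / 4
print(round(pearson(A, 0, 3), 4))
```

For nodes 0 and 3 the overlap is 1 and the chance expectation is 2 · 1 / 4 = 0.5, so the significance is 0.5, which is also the unnormalized covariance sum used in the numerator of the Pearson coefficient.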
Regular Equivalence
• In regular equivalence,
– We do not look at
neighborhoods shared
between individuals, but
– How neighborhoods
themselves are similar
• Example:
– Athletes are similar not
because they know each
other in person, but since
they know similar
individuals, such as
coaches, trainers, other
players, etc.
Regular Equivalence
• $v_i$ and $v_j$ are similar when their neighbors $v_k$ and $v_l$ are similar:
  $\sigma_{regular}(v_i, v_j) = \alpha \sum_{k,l} A_{i,k} A_{j,l} \, \sigma_{regular}(v_k, v_l)$
• This equation (left figure) is hard to solve since it is self-referential, so we relax our definition using the right figure
Regular Equivalence
• $v_i$ and $v_j$ are similar when $v_j$ is similar to $v_i$'s neighbors $v_k$:
  $\sigma_{regular}(v_i, v_j) = \alpha \sum_k A_{i,k} \, \sigma_{regular}(v_k, v_j)$
• In vector format:
  $\sigma_{regular} = \alpha A \sigma_{regular}$
A vertex is highly similar to itself; we guarantee this by adding an identity matrix to the equation:
  $\sigma_{regular} = \alpha A \sigma_{regular} + I$
  $\sigma_{regular} = (I - \alpha A)^{-1}$
When $\alpha < 1/\lambda_{max}$, the matrix is invertible
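The regular equivalence matrix can be approximated without an explicit matrix inverse by iterating the relaxed equation until it stops changing. This is a sketch, not the method used for the example on the slides: the function name, the iteration count, and the three-node path graph are assumptions. The iteration converges when $\alpha$ is smaller than the reciprocal of the largest eigenvalue of $A$ (about 1.414 for this path graph).

```python
def regular_equivalence(adj, alpha, iterations=200):
    """Iterate sigma = alpha * A * sigma + I, which converges to (I - alpha*A)^-1."""
    n = len(adj)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    sigma = [row[:] for row in identity]
    for _ in range(iterations):
        sigma = [
            [alpha * sum(adj[i][k] * sigma[k][j] for k in range(n)) + identity[i][j]
             for j in range(n)]
            for i in range(n)
        ]
    return sigma

# Hypothetical path graph 0 - 1 - 2.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
for row in regular_equivalence(A, alpha=0.5):
    print([round(x, 3) for x in row])
```

For this graph the exact inverse of $(I - 0.5A)$ has rows (1.5, 1, 0.5), (1, 2, 1), and (0.5, 1, 1.5), and the iteration reproduces it. Each vertex is most similar to itself, and the two end nodes are more similar to the middle node than to each other.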
Regular Equivalence: Example
• Any row/column of this matrix shows the similarity to other vertices
• Vertex 1 is most similar (other than itself) to vertices 2 and 3
• Nodes 2 and 3 have the highest similarity (regular equivalence)
The largest eigenvalue of 𝐴 is 2.43
Set 𝛼 = 0.3 < 1/2.43

Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining Sushil Kulkarni
 
Web Scraping Technologies
Web Scraping TechnologiesWeb Scraping Technologies
Web Scraping TechnologiesKrishna Sunuwar
 
JamNeo news aggregator
JamNeo news aggregatorJamNeo news aggregator
JamNeo news aggregatorJamNeo
 

Andere mochten auch (20)

Web scraping in python
Web scraping in pythonWeb scraping in python
Web scraping in python
 
Almost Scraping: Web Scraping without Programming
Almost Scraping: Web Scraping without ProgrammingAlmost Scraping: Web Scraping without Programming
Almost Scraping: Web Scraping without Programming
 
Scraping data from the web and documents
Scraping data from the web and documentsScraping data from the web and documents
Scraping data from the web and documents
 
Web Scraping and Data Extraction Service
Web Scraping and Data Extraction ServiceWeb Scraping and Data Extraction Service
Web Scraping and Data Extraction Service
 
Web Scraping With Python
Web Scraping With PythonWeb Scraping With Python
Web Scraping With Python
 
Web Scraping with Python
Web Scraping with PythonWeb Scraping with Python
Web Scraping with Python
 
Scraping the web with python
Scraping the web with pythonScraping the web with python
Scraping the web with python
 
Social media mining PPT
Social media mining PPTSocial media mining PPT
Social media mining PPT
 
Data mining for social media
Data mining for social mediaData mining for social media
Data mining for social media
 
DIY basic Facebook data mining
DIY basic Facebook data miningDIY basic Facebook data mining
DIY basic Facebook data mining
 
Data mining in social network
Data mining in social networkData mining in social network
Data mining in social network
 
Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...
Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...
Curiosity Bits Python Tutorial: Mining Facebook Fan Page - getting posts and ...
 
Web scraping com python
Web scraping com pythonWeb scraping com python
Web scraping com python
 
Php Rss
Php RssPhp Rss
Php Rss
 
SimpleXML In PHP 5
SimpleXML In PHP 5SimpleXML In PHP 5
SimpleXML In PHP 5
 
Journalists and the Social Web 1
Journalists and the Social Web 1Journalists and the Social Web 1
Journalists and the Social Web 1
 
When RSS Fails: Web Scraping with HTTP
When RSS Fails: Web Scraping with HTTPWhen RSS Fails: Web Scraping with HTTP
When RSS Fails: Web Scraping with HTTP
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining
 
Web Scraping Technologies
Web Scraping TechnologiesWeb Scraping Technologies
Web Scraping Technologies
 
JamNeo news aggregator
JamNeo news aggregatorJamNeo news aggregator
JamNeo news aggregator
 

Ähnlich wie Social Media Mining - Chapter 3 (Network Measures)

Jeffrey xu yu large graph processing
Jeffrey xu yu large graph processingJeffrey xu yu large graph processing
Jeffrey xu yu large graph processingjins0618
 
Anomaly detection Meetup Slides
Anomaly detection Meetup SlidesAnomaly detection Meetup Slides
Anomaly detection Meetup SlidesQuantUniversity
 
Anomaly detection: Core Techniques and Advances in Big Data and Deep Learning
Anomaly detection: Core Techniques and Advances in Big Data and Deep LearningAnomaly detection: Core Techniques and Advances in Big Data and Deep Learning
Anomaly detection: Core Techniques and Advances in Big Data and Deep LearningQuantUniversity
 
Using PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationUsing PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationDatabricks
 
PPT_ADML_PGM_KnowledgeSharing_9JULY2015_v1
PPT_ADML_PGM_KnowledgeSharing_9JULY2015_v1PPT_ADML_PGM_KnowledgeSharing_9JULY2015_v1
PPT_ADML_PGM_KnowledgeSharing_9JULY2015_v1Shweta Sood
 
Dowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceDowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceAmit Sharma
 
Six sigma
Six sigmaSix sigma
Six sigmakmsonam
 
Recomendation system: Community Detection Based Recomendation System using Hy...
Recomendation system: Community Detection Based Recomendation System using Hy...Recomendation system: Community Detection Based Recomendation System using Hy...
Recomendation system: Community Detection Based Recomendation System using Hy...Rajul Kukreja
 
Social Network Analysis (SNA) 2018
Social Network Analysis  (SNA) 2018Social Network Analysis  (SNA) 2018
Social Network Analysis (SNA) 2018Arsalan Khan
 
Algorithm in Social network of graph and social network analysis
Algorithm in Social network of graph and social network analysisAlgorithm in Social network of graph and social network analysis
Algorithm in Social network of graph and social network analysisoliviaclark2905
 
Internship project report,Predictive Modelling
Internship project report,Predictive ModellingInternship project report,Predictive Modelling
Internship project report,Predictive ModellingAmit Kumar
 
How can we rely upon Social Network Measures? Agent-base modelling as the nex...
How can we rely upon Social Network Measures? Agent-base modelling as the nex...How can we rely upon Social Network Measures? Agent-base modelling as the nex...
How can we rely upon Social Network Measures? Agent-base modelling as the nex...Bruce Edmonds
 
Finding Co-solvers on Twitter, with the Little Help from Linked Data
Finding Co-solvers on Twitter, with the Little Help from Linked DataFinding Co-solvers on Twitter, with the Little Help from Linked Data
Finding Co-solvers on Twitter, with the Little Help from Linked DataMilan Stankovic
 

Ähnlich wie Social Media Mining - Chapter 3 (Network Measures) (20)

Jeffrey xu yu large graph processing
Jeffrey xu yu large graph processingJeffrey xu yu large graph processing
Jeffrey xu yu large graph processing
 
Anomaly detection Meetup Slides
Anomaly detection Meetup SlidesAnomaly detection Meetup Slides
Anomaly detection Meetup Slides
 
Anomaly detection: Core Techniques and Advances in Big Data and Deep Learning
Anomaly detection: Core Techniques and Advances in Big Data and Deep LearningAnomaly detection: Core Techniques and Advances in Big Data and Deep Learning
Anomaly detection: Core Techniques and Advances in Big Data and Deep Learning
 
Using PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationUsing PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy Exploration
 
PPT_ADML_PGM_KnowledgeSharing_9JULY2015_v1
PPT_ADML_PGM_KnowledgeSharing_9JULY2015_v1PPT_ADML_PGM_KnowledgeSharing_9JULY2015_v1
PPT_ADML_PGM_KnowledgeSharing_9JULY2015_v1
 
Dowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceDowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inference
 
Six sigma
Six sigmaSix sigma
Six sigma
 
Recomendation system: Community Detection Based Recomendation System using Hy...
Recomendation system: Community Detection Based Recomendation System using Hy...Recomendation system: Community Detection Based Recomendation System using Hy...
Recomendation system: Community Detection Based Recomendation System using Hy...
 
Content-based link prediction
Content-based link predictionContent-based link prediction
Content-based link prediction
 
Social Network Analysis (SNA) 2018
Social Network Analysis  (SNA) 2018Social Network Analysis  (SNA) 2018
Social Network Analysis (SNA) 2018
 
Algorithm in Social network of graph and social network analysis
Algorithm in Social network of graph and social network analysisAlgorithm in Social network of graph and social network analysis
Algorithm in Social network of graph and social network analysis
 
Conjoint.pdf
Conjoint.pdfConjoint.pdf
Conjoint.pdf
 
Internship project report,Predictive Modelling
Internship project report,Predictive ModellingInternship project report,Predictive Modelling
Internship project report,Predictive Modelling
 
Week_2_Lecture.pdf
Week_2_Lecture.pdfWeek_2_Lecture.pdf
Week_2_Lecture.pdf
 
Krupa rm
Krupa rmKrupa rm
Krupa rm
 
Anomaly detection
Anomaly detectionAnomaly detection
Anomaly detection
 
How can we rely upon Social Network Measures? Agent-base modelling as the nex...
How can we rely upon Social Network Measures? Agent-base modelling as the nex...How can we rely upon Social Network Measures? Agent-base modelling as the nex...
How can we rely upon Social Network Measures? Agent-base modelling as the nex...
 
Data Science 1.pdf
Data Science 1.pdfData Science 1.pdf
Data Science 1.pdf
 
Finding Co-solvers on Twitter, with the Little Help from Linked Data
Finding Co-solvers on Twitter, with the Little Help from Linked DataFinding Co-solvers on Twitter, with the Little Help from Linked Data
Finding Co-solvers on Twitter, with the Little Help from Linked Data
 
Web Mining .ppt
Web Mining .pptWeb Mining .ppt
Web Mining .ppt
 

Kürzlich hochgeladen

Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024Janet Corral
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 

Kürzlich hochgeladen (20)

Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 

Social Media Mining - Chapter 3 (Network Measures)

  • 2. Dear instructors/users of these slides: Please feel free to include these slides in your own material, or modify them as you see fit. If you decide to incorporate these slides into your presentations, please include the following note: R. Zafarani, M. A. Abbasi, and H. Liu, Social Media Mining: An Introduction, Cambridge University Press, 2014. Free book and slides at http://socialmediamining.info/ or include a link to the website: http://socialmediamining.info/
  • 3. Klout: It is difficult to measure influence!
  • 4. Why Do We Need Measures?
      • Who are the central figures (influential individuals) in the network?
          – Centrality
      • What interaction patterns are common among friends?
          – Reciprocity and Transitivity
          – Balance and Status
      • Who are the like-minded users and how can we find these similar individuals?
          – Similarity
      • To answer these and similar questions, one first needs to define measures for quantifying centrality, level of interactions, and similarity, among others.
  • 5. Centrality: Centrality defines how important a node is within a network
  • 6. Centrality in terms of those you are connected to
  • 7. Degree Centrality
      • Degree centrality ranks nodes with more connections higher in terms of centrality
      • $d_i$ is the degree (number of friends) of node $v_i$
          – i.e., the number of length-1 paths (can be generalized)
      • In this graph, the degree centrality of node $v_1$ is $d_1 = 8$, and for all other nodes it is $d_j = 1$, $j \neq 1$
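As a minimal sketch of this definition, the snippet below computes degree centrality for a star graph like the one described (one hub with eight neighbors). The adjacency-set representation and the function name `degree_centrality` are illustrative assumptions, not part of the slides.

```python
# Degree centrality: the centrality of a node is simply its degree.
# The graph representation (dict of neighbor sets) is an assumption for illustration.

def degree_centrality(adj):
    """Return {node: degree} for an undirected graph given as adjacency sets."""
    return {node: len(neighbors) for node, neighbors in adj.items()}

# Star graph: v1 is connected to v2..v9, mirroring the example on the slide.
star = {"v1": {f"v{k}" for k in range(2, 10)}}
for k in range(2, 10):
    star[f"v{k}"] = {"v1"}

centrality = degree_centrality(star)
print(centrality["v1"])  # 8
print(centrality["v2"])  # 1
```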
  • 8. Degree Centrality in Directed Graphs
      • In directed graphs, we can use the in-degree, the out-degree, or their combination as the degree centrality value: $d_i^{in}$, $d_i^{out}$, or $d_i^{in} + d_i^{out}$
          – $d_i^{out}$ is the number of outgoing links for node $v_i$
      • In practice, mostly in-degree is used.
  • 9. Normalized Degree Centrality
      • Normalized by the maximum possible degree
      • Normalized by the maximum degree
      • Normalized by the degree sum
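Written out, the three normalizations can be sketched as follows, assuming a graph with $n$ nodes. The symbol $C_d$ and its superscripts are notation assumed here for illustration.

```latex
C_d^{\text{norm}}(v_i) = \frac{d_i}{n-1}, \qquad
C_d^{\max}(v_i) = \frac{d_i}{\max_j d_j}, \qquad
C_d^{\text{sum}}(v_i) = \frac{d_i}{\sum_j d_j}
```

In a simple undirected graph the degree sum equals twice the number of edges, so the last form divides by $2|E|$.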
  • 10. Degree Centrality (Directed Graph) Example
      Out-degree normalized by the maximum possible degree ($n - 1 = 6$ for the seven nodes A to G)
      Node  In-Degree  Out-Degree  Centrality  Rank
      A     1          3           1/2         1
      B     1          2           1/3         3
      C     2          3           1/2         1
      D     3          1           1/6         5
      E     2          1           1/6         5
      F     2          2           1/3         3
      G     2          1           1/6         5
  • 11. Degree Centrality (Undirected Graph) Example
      Node  Degree  Centrality  Rank
      A     4       2/3         2
      B     3       1/2         5
      C     5       5/6         1
      D     4       2/3         2
      E     3       1/2         5
      F     4       2/3         2
      G     3       1/2         5
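The centrality column of the undirected example can be reproduced with a few lines. This is a sketch that takes the degrees from the table as given; the variable names are assumptions made for illustration. It also shows the two other normalizations for comparison.

```python
from fractions import Fraction

# Degrees taken from the undirected example table (nodes A to G).
degrees = {"A": 4, "B": 3, "C": 5, "D": 4, "E": 3, "F": 4, "G": 3}
n = len(degrees)

# Three normalizations of degree centrality.
by_max_possible = {v: Fraction(d, n - 1) for v, d in degrees.items()}
by_max_degree = {v: Fraction(d, max(degrees.values())) for v, d in degrees.items()}
by_degree_sum = {v: Fraction(d, sum(degrees.values())) for v, d in degrees.items()}

print(by_max_possible["C"])  # 5/6
print(by_max_degree["C"])    # 1
print(by_degree_sum["C"])    # 5/26
```

Only the first normalization (division by $n - 1$) matches the values shown in the table.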
  • 12. Eigenvector Centrality (Phillip Bonacich)
      • Having more friends does not by itself guarantee that someone is more important
          – Having more important friends provides a stronger signal
      • Eigenvector centrality generalizes degree centrality by incorporating the importance of the neighbors (undirected)
      • For directed graphs, we can use incoming or outgoing edges
  • 13. Formulation
      • Let's assume the eigenvector centrality of a node is $c_e(v_i)$ (unknown)
      • We would like $c_e(v_i)$ to be higher when important neighbors (nodes $v_j$ with higher $c_e(v_j)$) point to us
          – Incoming or outgoing neighbors?
          – For incoming neighbors, $A_{j,i} = 1$
      • We can assume that $v_i$'s centrality is the summation of its neighbors' centralities
      • Is this summation bounded?
      • We have to normalize!
          – $\lambda$: some fixed constant
  • 14. Eigenvector Centrality (Matrix Formulation)
      • Let $\mathbf{C}_e$ be the vector of all centrality values, so the normalized summation can be written in matrix form
      • This means that $\mathbf{C}_e$ is an eigenvector of the adjacency matrix $A^T$ (or $A$ when undirected) and $\lambda$ is the corresponding eigenvalue
      • Which eigenvalue-eigenvector pair should we choose?
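In symbols, the normalized summation and its matrix form can be sketched as below. The sum runs over all $n$ nodes, and $A_{j,i} = 1$ when $v_j$ points to $v_i$, as stated in the formulation.

```latex
c_e(v_i) = \frac{1}{\lambda} \sum_{j=1}^{n} A_{j,i}\, c_e(v_j)
\quad\Longleftrightarrow\quad
\lambda\, \mathbf{C}_e = A^T \mathbf{C}_e
```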
  • 15. Finding the eigenvalue by finding a fixed point…
      • Start from an initial guess $C_e(0)$ (e.g., all centralities are 1) and iterate $t$ times
      • We can write $C_e(0)$ as a linear combination of the eigenvectors $v_i$ of $A^T$
      • Substituting this, we get an expansion of $C_e(t)$ in powers of the eigenvalues, where $\lambda_1$ is the largest eigenvalue
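The substitution can be sketched as follows, assuming $A^T$ has eigenvectors $v_1, \dots, v_n$ with eigenvalues $\lambda_1, \dots, \lambda_n$. The coefficients $c_i$ are a name introduced here for the expansion of the initial guess.

```latex
\begin{aligned}
C_e(t) &= (A^T)^t\, C_e(0), \qquad C_e(0) = \sum_{i} c_i v_i \\
C_e(t) &= \sum_{i} c_i \lambda_i^t v_i
        = \lambda_1^t \sum_{i} c_i \left(\frac{\lambda_i}{\lambda_1}\right)^t v_i
\end{aligned}
```

Whenever $|\lambda_i| < \lambda_1$ for every $i > 1$, all terms except the first shrink as $t$ grows, leaving a multiple of $v_1$.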
  • 16. Finding the eigenvalue by finding a fixed point…
      • As $t$ grows, in the limit $C_e(t)$ becomes proportional to the eigenvector of the largest eigenvalue $\lambda_1$
      • If we start with an all-positive $C_e(0)$, all $C_e(t)$'s will be positive (why?)
          – All the centrality values would be positive
          – We need an eigenvalue-eigenvector pair that guarantees all centralities have the same sign
              • E.g., for comparison purposes
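The fixed-point iteration can be sketched with NumPy as follows. The graph is a hypothetical one (a triangle with one pendant node), and the per-step rescaling to unit norm is a design choice made here so the values stay bounded; neither comes from the slides.

```python
import numpy as np

def power_iteration(A, steps=200):
    """Approximate the dominant eigenvector of A^T by repeated multiplication."""
    c = np.ones(A.shape[0])          # initial guess: all centralities are 1
    for _ in range(steps):
        c = A.T @ c                  # one update step
        c = c / np.linalg.norm(c)    # rescale so the values stay bounded
    return c

# Hypothetical undirected graph: triangle 0-1-2 plus a pendant node 3 attached to 2.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

c = power_iteration(A)
print(np.round(c, 3))
```

Starting from an all-positive vector, every iterate stays positive because each update only adds non-negative multiples of positive values. A non-bipartite graph was chosen on purpose: on a connected bipartite graph the negative of the largest eigenvalue is also an eigenvalue, and this plain iteration can oscillate instead of converging.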
  • 17. Eigenvector Centrality, cont.
      So, to compute the eigenvector centrality of $A$:
      1. We compute the eigenvalues of $A$
      2. Select the largest eigenvalue $\lambda$
      3. The corresponding eigenvector of $\lambda$ is $\mathbf{C}_e$
      4. Based on the Perron-Frobenius theorem, all the components of $\mathbf{C}_e$ will be positive
      5. The components of $\mathbf{C}_e$ are the eigenvector centralities for the graph
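The five steps can be sketched with NumPy's symmetric eigensolver. The three-node path graph below is a hypothetical example chosen for illustration and is not claimed to be the graph used on the slides.

```python
import numpy as np

# Hypothetical undirected path graph 0 - 1 - 2 (an assumption for illustration).
A = np.array([
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
], dtype=float)

# Steps 1-2: eigenvalues of A (symmetric, so eigh applies); eigh sorts them ascending.
eigenvalues, eigenvectors = np.linalg.eigh(A)
largest = eigenvalues[-1]

# Step 3: the unit-norm eigenvector belonging to the largest eigenvalue.
c_e = eigenvectors[:, -1]

# Step 4: an eigenvector is defined up to sign; pick the all-positive orientation.
if c_e.sum() < 0:
    c_e = -c_e

# Step 5: the components are the eigenvector centralities.
print(largest)   # approximately 1.414, i.e. sqrt(2)
print(c_e)       # approximately 0.5, 0.707, 0.5
```

The middle node receives the highest value, as expected for the node connected to both others.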
  • 18. Eigenvector Centrality: Example 1 [worked example: eigenvalues, largest eigenvalue, and corresponding eigenvector, assuming $\mathbf{C}_e$ has norm 1]
  • 19. Eigenvector Centrality: Example 2
      • Eigenvalues vector: $\lambda = (2.68, -1.74, -1.27, 0.33, 0.00)$
      • $\lambda_{max} = 2.68$
  • 20. Katz Centrality (Elihu Katz)
      • A major problem with eigenvector centrality arises when it deals with directed graphs
      • Centrality is only passed along outgoing edges, and in special cases, such as when a node is in a directed acyclic graph, its centrality becomes zero
          – even though the node can have many edges connected to it
      • To resolve this problem, we add a bias term $\beta$ to the centrality values of all nodes
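  • 21. Katz Centrality, cont.
      • $C_{Katz}(v_i) = \alpha \sum_{j=1}^{n} A_{j,i}\, C_{Katz}(v_j) + \beta$, where $\alpha$ is the controlling term and $\beta$ is the bias term
      • Rewriting the equation in vector form: $\mathbf{C}_{Katz} = \alpha A^T \mathbf{C}_{Katz} + \beta \mathbf{1}$, where $\mathbf{1}$ is the vector of all 1's
      • Katz centrality: $\mathbf{C}_{Katz} = \beta (I - \alpha A^T)^{-1} \cdot \mathbf{1}$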
  • 22. Katz Centrality, cont.
      • When $\alpha = 0$, the eigenvector centrality term is removed and all nodes get the same centrality value $\beta$
          – As $\alpha$ gets larger, the effect of $\beta$ is reduced
      • For the matrix $(I - \alpha A^T)$ to be invertible, we must have
          – $\det(I - \alpha A^T) \neq 0$
          – Rearranging, invertibility fails exactly when $\det(A^T - \alpha^{-1} I) = 0$
          – This is basically the characteristic equation of $A^T$
          – The characteristic equation first becomes zero when the largest eigenvalue equals $\alpha^{-1}$
      • The largest eigenvalue is easier to compute (power method)
      • In practice we select $\alpha < 1/\lambda$, where $\lambda$ is the largest eigenvalue of $A^T$
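A minimal sketch of this computation solves the linear system $(I - \alpha A^T)\,\mathbf{C} = \beta \mathbf{1}$ directly instead of forming the inverse, and rejects any $\alpha$ that violates the bound. The function name `katz_centrality` and the small directed graph are assumptions made for illustration.

```python
import numpy as np

def katz_centrality(A, alpha, beta):
    """Solve (I - alpha * A^T) C = beta * 1 for the Katz centrality vector C."""
    n = A.shape[0]
    lam_max = max(abs(np.linalg.eigvals(A.T)))   # largest eigenvalue magnitude
    if alpha >= 1.0 / lam_max:
        raise ValueError("alpha must be smaller than 1 / (largest eigenvalue)")
    return np.linalg.solve(np.eye(n) - alpha * A.T, beta * np.ones(n))

# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
A = np.array([
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
], dtype=float)

c = katz_centrality(A, alpha=0.25, beta=0.2)
print(np.round(c, 3))
```

Calling `katz_centrality(A, alpha=0.0, beta=0.2)` returns 0.2 for every node, which illustrates the first bullet: with $\alpha = 0$ only the bias term remains.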
  • 23. Katz Centrality Example
      • The eigenvalues are -1.68, -1.0, -1.0, 0.35, 3.32
      • We assume $\alpha = 0.25 < 1/3.32$ and $\beta = 0.2$
      [Figure annotation: Most important nodes!]
  • 24. PageRank
      • Problem with Katz centrality:
          – In directed graphs, once a node becomes an authority (high centrality), it passes all its centrality along all of its out-links
          – This is less desirable, since not everyone known by a well-known person is well-known
      • Solution?
          – We can divide the value of passed centrality by the number of outgoing links, i.e., the out-degree of that node
          – Each connected neighbor gets a fraction of the source node's centrality
  • 25. PageRank, cont.
      • What if the degree is zero?
      • Similar to Katz centrality, in practice $\alpha < 1/\lambda$, where $\lambda$ is the largest eigenvalue of $A^T D^{-1}$
      • In undirected graphs, the largest eigenvalue of $A^T D^{-1}$ is $\lambda = 1$; therefore, $\alpha < 1$
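The idea of dividing the passed centrality by the out-degree can be sketched as a linear solve of $(I - \alpha A^T D^{-1})\,\mathbf{C} = \beta \mathbf{1}$. Here $D$ is assumed to be the diagonal matrix of out-degrees, and a node without out-links is given out-degree 1 as one common way to avoid division by zero. The function name, the example graph, and that zero-degree convention are assumptions made for this sketch.

```python
import numpy as np

def pagerank(A, alpha, beta):
    """Solve (I - alpha * A^T D^{-1}) C = beta * 1, with D = diag(out-degrees)."""
    n = A.shape[0]
    out_deg = A.sum(axis=1)
    # Assumption: a node with no out-links gets out-degree 1 to avoid dividing by
    # zero; its row of A is all zeros, so it passes on no centrality either way.
    out_deg[out_deg == 0] = 1.0
    M = A.T @ np.diag(1.0 / out_deg)
    return np.linalg.solve(np.eye(n) - alpha * M, beta * np.ones(n))

# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 3 -> 2 (node 2 has no out-links).
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
], dtype=float)

c = pagerank(A, alpha=0.95, beta=0.1)
print(c)  # approximately 0.1, 0.1475, 0.382625, 0.1
```

Nodes 0 and 3 have no incoming links, so they keep only the bias value; node 2 collects half of node 0's value plus all of node 1's and node 3's.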
  • 26. PageRank Example
      • We assume $\alpha = 0.95 < 1$ and $\beta = 0.1$
  • 27. PageRank Example – Alternative Approach [Markov Chains]
      Using the power method ($\alpha = 1$ and $\beta = 0$?)
      Step  A      B      C        D                E        F          G
      0     1/7    1/7    1/7      1/7              1/7      1/7        1/7
      1     B/2    C/3    A/3 + G  A/3 + C/3 + F/2  A/3 + D  C/3 + B/2  F/2 + E
            0.071  0.048  0.190    0.167            0.190    0.119      0.214
      "You don't understand anything until you learn it more than one way" (Marvin Minsky, 1927-2016)
  • 28. PageRank: Example
      Step  A      B      C      D      E      F      G      Sum
      1     0.143  0.143  0.143  0.143  0.143  0.143  0.143  1.000
      2     0.071  0.048  0.190  0.167  0.190  0.119  0.214  1.000
      3     0.024  0.063  0.238  0.147  0.190  0.087  0.250  1.000
      4     0.032  0.079  0.258  0.131  0.155  0.111  0.234  1.000
      5     0.040  0.086  0.245  0.152  0.142  0.126  0.210  1.000
      6     0.043  0.082  0.224  0.158  0.165  0.125  0.204  1.000
      7     0.041  0.075  0.219  0.151  0.172  0.115  0.228  1.000
      8     0.037  0.073  0.241  0.144  0.165  0.110  0.230  1.000
      9     0.036  0.080  0.242  0.148  0.157  0.117  0.220  1.000
      10    0.040  0.081  0.232  0.151  0.160  0.121  0.215  1.000
      11    0.040  0.077  0.228  0.151  0.165  0.118  0.220  1.000
      12    0.039  0.076  0.234  0.148  0.165  0.115  0.223  1.000
      13    0.038  0.078  0.236  0.148  0.161  0.116  0.222  1.000
      14    0.039  0.079  0.235  0.149  0.161  0.118  0.219  1.000
      15    0.039  0.078  0.232  0.150  0.162  0.118  0.220  1.000
      Rank  7      6      1      4      3      5      2
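The iteration behind this table can be sketched as repeated multiplication by a column-stochastic transition matrix. The out-links below are read off the update rules of the Markov-chain slide (for instance, "B/2" under A means that B splits its value over two out-links, one of which goes to A); the variable names are assumptions made for illustration.

```python
import numpy as np

nodes = ["A", "B", "C", "D", "E", "F", "G"]
# Out-links inferred from the update rules (A/3 appears under C, D and E, and so on).
out_links = {
    "A": ["C", "D", "E"],
    "B": ["A", "F"],
    "C": ["B", "D", "F"],
    "D": ["E"],
    "E": ["G"],
    "F": ["D", "G"],
    "G": ["C"],
}

index = {v: k for k, v in enumerate(nodes)}
P = np.zeros((7, 7))                      # P[i, j] = share of j's value passed to i
for src, targets in out_links.items():
    for dst in targets:
        P[index[dst], index[src]] = 1.0 / len(targets)

x = np.full(7, 1.0 / 7)                   # step 1 in the table: uniform start
history = [x]
for _ in range(14):                       # steps 2 to 15
    x = P @ x
    history.append(x)

print(np.round(history[1], 3))            # compare with the row for step 2
print(np.round(history[-1], 3))           # compare with the row for step 15
```

Because every column of `P` sums to 1, the total stays at 1 in every step, which matches the Sum column.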
  • 29. Effect of PageRank
      Node  Rank
      A     7
      B     6
      C     1
      D     4
      E     3
      F     5
      G     2
  • 30. Centrality in terms of how you connect others (information broker)
  • 31. Betweenness Centrality (Linton Freeman)
      • Another way of looking at centrality is by considering how important nodes are in connecting other nodes
      • $\sigma_{st}(v_i)$: the number of shortest paths from $s$ to $t$ that pass through $v_i$
      • $\sigma_{st}$: the number of shortest paths from vertex $s$ to $t$
          – a.k.a. information pathways
  • 32. Normalizing Betweenness Centrality
      • In the best case, node $v_i$ is on all shortest paths from $s$ to $t$; hence, $\sigma_{st}(v_i) / \sigma_{st} = 1$
      • Therefore, the maximum value is $(n - 1)(n - 2)$
      • Normalized betweenness centrality divides the raw value by this maximum
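Written out, the definition and its normalization can be sketched as below. The symbol $C_b$ is notation assumed here; the sum runs over ordered pairs $(s, t)$ of distinct nodes other than $v_i$, which is what yields the maximum of $(n-1)(n-2)$.

```latex
C_b(v_i) = \sum_{s \neq t \neq v_i} \frac{\sigma_{st}(v_i)}{\sigma_{st}}, \qquad
C_b^{\text{norm}}(v_i) = \frac{C_b(v_i)}{(n-1)(n-2)}
```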
  • 33. Betweenness Centrality: Example 1
  • 34. Betweenness Centrality: Example 2
      Node  Betweenness Centrality  Rank
      A     16 + 1/2 + 1/2          1
      B     7 + 5/2                 3
      C     0                       7
      D     5/2                     5
      E     1/2 + 1/2               6
      F     15 + 2                  1
      G     0                       7
      H     0                       7
      I     7                       4
  • 35. Computing Betweenness
      • In betweenness centrality, we compute shortest paths between all pairs of nodes to compute the betweenness value.
      • Trivial solution:
          – Use Dijkstra and run it $O(n)$ times
          – We get an $O(n^3)$ solution
      • Better solution:
          – Brandes algorithm:
              • $O(nm)$ for unweighted graphs
              • $O(nm + n^2 \log n)$ for weighted graphs
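  • 36. Brandes Algorithm [2001]
      • Define the dependency of source $s$ on $v_i$ as $\delta_s(v_i) = \sum_{t \neq v_i} \frac{\sigma_{st}(v_i)}{\sigma_{st}}$, so that $C_b(v_i) = \sum_{s \neq v_i} \delta_s(v_i)$
      • There exists a recurrence equation that can help us determine $\delta_s(v_i)$:
        $\delta_s(v_i) = \sum_{w : v_i \in pred(s, w)} \frac{\sigma_{s v_i}}{\sigma_{s w}} \left(1 + \delta_s(w)\right)$
      • $pred(s, w)$ is the set of predecessors of $w$ in the shortest paths from $s$ to $w$
        – In the most basic scenario, $w$ is the immediate child of $v_i$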
  • 37. How to compute $\sigma_{st}$
      [Figure: original network; BFS starting at A (i.e., $s$); each node's value is the sum of its parents' values]
      Source: Networks, Crowds, and Markets: Reasoning about a Highly Connected World, by David Easley and Jon Kleinberg
  • 38. How do you compute $\delta_s(v_i)$?
      [Figure annotations]
      • No shortest path starting from 1 passes through 9
      • 2/2 (1 + 0)
      • 1/1 (3/2 + 1) + 1/1 (3/2 + 1)
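The two steps described on the Brandes slides (a BFS that counts shortest paths, then a backward pass that accumulates $\delta_s$) can be sketched for unweighted graphs as follows. This is an illustrative sketch, not the authors' code: the function name `brandes_betweenness` and the two small test graphs are hypothetical. It sums over every source $s$, so each ordered pair $(s, t)$ is counted, which matches the maximum value $(n-1)(n-2)$ used for normalization.

```python
from collections import deque


def brandes_betweenness(adj):
    """Betweenness for an unweighted graph given as {node: iterable of neighbors}."""
    cb = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}   # sigma[v]: number of shortest paths from s to v
        dist = {v: -1 for v in adj}   # -1 marks "not reached yet"
        pred = {v: [] for v in adj}   # predecessors of v on shortest paths from s
        sigma[s], dist[s] = 1, 0
        order = []                    # nodes in non-decreasing distance from s
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]   # sum of the parents' values
                    pred[w].append(v)
        # Backward pass: delta[v] += sigma[v] / sigma[w] * (1 + delta[w]).
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    return cb


# Hypothetical undirected graphs, for illustration only.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
diamond = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}

path_cb = brandes_betweenness(path)
diamond_cb = brandes_betweenness(diamond)

# Normalize by the maximum value (n - 1)(n - 2).
n = len(path)
path_norm = {v: c / ((n - 1) * (n - 2)) for v, c in path_cb.items()}
```

Each source costs one BFS plus one backward sweep, which is where the $O(nm)$ bound for unweighted graphs comes from.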
  • 39. Centrality in terms of how fast you can reach others
  • 40. Closeness Centrality
      • The intuition is that influential/central nodes can quickly reach other nodes
      • These nodes should have a smaller average shortest path length to others
      • Closeness centrality (Linton Freeman):
        $C_c(v_i) = \frac{1}{\bar{l}_{v_i}}$, where $\bar{l}_{v_i} = \frac{1}{n-1} \sum_{v_j \neq v_i} l_{i,j}$ is the average shortest path length from $v_i$ to all other nodes
  • 41. Closeness Centrality: Example 1
  • 42. Closeness Centrality: Example 2 (Undirected)
      Node  A  B  C  D  E  F  G  H  I  D_Avg  Closeness centrality  Rank
      A     0  1  2  1  2  1  2  3  2  1.750  0.571                 1
      B     1  0  1  2  1  2  3  4  3  2.125  0.471                 3
      C     2  1  0  3  2  3  4  5  4  3.000  0.333                 8
      D     1  2  3  0  1  2  3  4  3  2.375  0.421                 4
      E     2  1  2  1  0  3  4  5  4  2.750  0.364                 7
      F     1  2  3  2  3  0  1  2  1  1.875  0.533                 2
      G     2  3  4  3  4  1  0  3  2  2.750  0.364                 7
      H     3  4  5  4  5  2  3  0  1  3.375  0.296                 9
      I     2  3  4  3  4  1  2  1  0  2.500  0.400                 5
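The closeness values in Example 2 can be checked with a breadth-first search from every node. The sketch below assumes the undirected graph whose edges are exactly the node pairs at distance 1 in the table (A-B, A-D, A-F, B-C, B-E, D-E, F-G, F-I, H-I); that edge list is inferred from the table, since the figure itself is not part of this transcript. The function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(adj):
    """Closeness = 1 / (average distance to the other n - 1 nodes).

    Assumption: the graph is connected, so every distance is finite.
    """
    n = len(adj)
    result = {}
    for v in adj:
        avg = sum(bfs_distances(adj, v).values()) / (n - 1)
        result[v] = 1.0 / avg
    return result


# Edge list inferred from the distance-1 entries of the Example 2 table.
edges = [("A", "B"), ("A", "D"), ("A", "F"), ("B", "C"), ("B", "E"),
         ("D", "E"), ("F", "G"), ("F", "I"), ("H", "I")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

scores = closeness(adj)
```

Rounded to three decimals, `scores` agrees with the Closeness centrality column, for instance 8/14 for A and 8/27 for H.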
  • 43. Closeness Centrality: Example 3 (Directed)
      Node  A  B  C  D  E  F  G  H  I  D_Avg  Closeness centrality  Rank
      A     0  1  2  3  2  2  1  3  3  2.125  0.471                 1
      B     3  0  1  2  1  4  4  2  3  2.500  0.400                 2
      C     4  5  0  7  6  3  5  1  2  4.125  0.242                 9
      D     1  2  3  0  3  3  2  4  5  2.875  0.348                 3
      E     2  3  4  1  0  4  3  5  5  3.375  0.296                 6
      F     1  2  3  4  3  0  2  4  4  2.875  0.348                 4
      G     2  3  4  5  4  1  0  5  2  3.250  0.308                 5
      H     4  4  5  6  5  2  4  0  1  3.875  0.258                 8
      I     2  3  4  5  4  1  4  5  0  3.500  0.286                 7
  • 44. An Interesting Comparison! Comparing three centrality values
      • Generally, the three centrality types are positively correlated
      • When they are not (or the correlation is low), the mismatch usually reveals interesting information:
        – High degree, low closeness: the node is embedded in a community that is far from the rest of the network
        – High degree, low betweenness: ego's connections are redundant; communication bypasses the node
        – High closeness, low degree: key node connected to important/active alters
        – High closeness, low betweenness: probably multiple paths in the network; ego is near many people, but so are many others
        – High betweenness, low degree: ego's few ties are crucial for network flow
        – High betweenness, low closeness: very rare! Ego monopolizes the ties from a small number of people to many others
      This slide is modified from a slide developed by James Moody
  • 45. Centrality for a group of nodes
  • 46. Group Centrality
      • All centrality measures defined so far measure centrality for a single node. These measures can be generalized to a group of nodes.
      • A simple approach is to replace all nodes in a group with a super node
        – The group structure is disregarded.
      • Let $S$ denote the set of nodes in the group and $V - S$ the set of outsiders
  • 47. Group Centrality
      I. Group degree centrality: the number of outsiders connected to at least one group member
        $C_d^{group}(S) = |\{v_i \in V - S \mid v_i \text{ is connected to some } v_j \in S\}|$
        – Normalization: divide by $|V - S|$
      II. Group betweenness centrality: shortest paths between outsiders that pass through the group
        $C_b^{group}(S) = \sum_{s \neq t,\; s, t \notin S} \frac{\sigma_{st}(S)}{\sigma_{st}}$
        – Normalization: divide by $2\binom{|V - S|}{2}$
  • 48. Group Centrality
      III. Group closeness centrality
        $C_c^{group}(S) = \frac{1}{\bar{l}_S^{group}}$, where $\bar{l}_S^{group} = \frac{1}{|V - S|} \sum_{v_j \in V - S} l_{S, v_j}$ and $l_{S, v_j} = \min_{v_i \in S} l_{v_i, v_j}$
        – It is the inverse of the average distance from non-members to the group
          • Instead of the minimum, one can also use the maximum distance or the average distance to the group members
  • 49. Group Centrality Example
      • Consider $S = \{v_2, v_3\}$
      • Group degree centrality = 3
      • Group betweenness centrality = 3
      • Group closeness centrality = 1
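Group degree and group closeness can be sketched directly from their verbal definitions: count the outsiders adjacent to the group, and invert the average distance from outsiders to their nearest group member. The code below is a sketch under those assumptions; the function names and the five-node graph are hypothetical and do not correspond to the figure in the example slide.

```python
from collections import deque


def group_degree(adj, group):
    """Number of outsiders that have at least one neighbor inside the group."""
    outsiders = set(adj) - group
    return sum(1 for v in outsiders if adj[v] & group)


def group_closeness(adj, group):
    """1 / (average distance from each outsider to its nearest group member).

    A multi-source BFS started from all group members yields the minimum
    distance to the group. Assumption: the graph is connected.
    """
    dist = {v: 0 for v in group}
    queue = deque(group)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    outsiders = set(adj) - group
    avg = sum(dist[v] for v in outsiders) / len(outsiders)
    return 1.0 / avg


# Hypothetical path graph v1 - v2 - v3 - v4 - v5, for illustration only.
adj = {
    "v1": {"v2"},
    "v2": {"v1", "v3"},
    "v3": {"v2", "v4"},
    "v4": {"v3", "v5"},
    "v5": {"v4"},
}
S = {"v2", "v3"}

degree = group_degree(adj, S)                      # v1 and v4 touch the group
degree_norm = degree / len(set(adj) - S)           # divide by |V - S|
close = group_closeness(adj, S)                    # distances 1, 1, 2
```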
  • 50. Friendship Patterns
      • Transitivity/Reciprocity
      • Status/Balance
  • 51. I. Transitivity and Reciprocity
  • 52. Transitivity
      • Mathematical representation:
        – For a transitive relation $R$: $aRb \wedge bRc \rightarrow aRc$
      • In a social network:
        – Transitivity is when a friend of my friend is my friend
        – Transitivity in a social network leads to a denser graph, which in turn is closer to a complete graph
        – We can determine how close graphs are to the complete graph by measuring transitivity
      • [Figure question] $cRa$ or $aRc$?
  • 53. [Global] Clustering Coefficient
      • The clustering coefficient measures transitivity in undirected graphs
        – Count paths of length two and check whether the third edge exists
          $C = \frac{|\text{closed paths of length 2}|}{|\text{paths of length 2}|}$
      • When counting triangles, since every triangle has 6 closed paths of length 2:
          $C = \frac{(\text{number of triangles}) \times 6}{|\text{paths of length 2}|}$
  • 54. Clustering Coefficient and Triples
      Or we can rewrite it as
        $C = \frac{(\text{number of triangles}) \times 3}{\text{number of connected triples of nodes}}$
      • Triple: an ordered set of three nodes,
        – connected by two edges (open triple) or
        – three edges (closed triple)
      • A triangle can miss any of its three edges
        – A triangle has 3 triples
      • $v_i v_j v_k$ and $v_j v_k v_i$ are different triples
        – They have the same members
        – The first is missing edge $e(v_k, v_i)$ and the second is missing edge $e(v_i, v_j)$
      • $v_i v_j v_k$ and $v_k v_j v_i$ are the same triple
  • 55. [Global] Clustering Coefficient: Example
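The triple-based form of the global clustering coefficient translates into a short counting routine: enumerate every connected triple by its center node and check whether the two end nodes are also linked. This is a sketch with a hypothetical function name and a hypothetical four-node graph, not the graph of the example slide.

```python
from itertools import combinations


def global_clustering(adj):
    """(3 x number of triangles) / (number of connected triples)."""
    closed = 0    # closed triples; each triangle contributes one per corner, i.e. 3
    triples = 0   # all connected triples a - v - b, identified by their center v
    for v in adj:
        for a, b in combinations(sorted(adj[v]), 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


# Hypothetical graph: triangle a-b-c with a pendant node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
coefficient = global_clustering(adj)
```

Here there are five connected triples (one centered at a, one at b, three at c) and one triangle, so the coefficient is 3/5.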
  • 56. Local Clustering Coefficient
      • The local clustering coefficient measures transitivity at the node level
        – Commonly employed for undirected graphs
        – Computes how strongly the neighbors of a node $v$ (nodes adjacent to $v$) are themselves connected
          $C(v_i) = \frac{\text{number of pairs of neighbors of } v_i \text{ that are connected}}{\text{number of pairs of neighbors of } v_i}$
      • In an undirected graph, the denominator can be rewritten as $\binom{d_i}{2} = \frac{d_i (d_i - 1)}{2}$
      • Provides a way to determine structural holes
  • 57. Local Clustering Coefficient: Example
      • Thin lines depict connections to neighbors
      • Dashed lines are the missing links among neighbors
      • Solid lines indicate connected neighbors
        – When none of the neighbors are connected, $C = 0$
        – When all neighbors are connected, $C = 1$
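The node-level definition can be sketched as a ratio of connected neighbor pairs to all neighbor pairs. The function name and the small graph are hypothetical; nodes with fewer than two neighbors are given a coefficient of 0 here, which is a convention chosen for this sketch rather than something the slides specify.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    d = len(adj[v])
    if d < 2:
        return 0.0  # convention assumed for this sketch: no pairs, so report 0
    connected = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    return connected / (d * (d - 1) / 2)


# Hypothetical graph: triangle a-b-c with a pendant node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
coefficients = {v: local_clustering(adj, v) for v in adj}
```

Node c has three neighbor pairs but only a-b is connected, so its coefficient is 1/3; the missing pairs a-d and b-d are the kind of gap the slides call a structural hole.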
  • 58. Reciprocity
      If you become my friend, I'll be yours
      • Reciprocity is a simplified version of transitivity
        – It considers closed loops of length 2
      • If node $v$ is connected to node $u$,
        – $u$, by connecting to $v$, exhibits reciprocity
          $R = \frac{\sum_{i,j,\; i<j} A_{i,j} A_{j,i}}{|E|/2} = \frac{2}{|E|} \sum_{i,j,\; i<j} A_{i,j} A_{j,i} = \frac{1}{|E|} \mathrm{Tr}(A^2)$
      • What about $i = j$?
  • 59. Reciprocity: Example
      Reciprocal nodes: $v_1$, $v_2$
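Reciprocity can be sketched as the fraction of directed edges whose reverse edge is also present. The function name and the edge list are hypothetical. Self-loops (the $i = j$ case) are ignored when counting reciprocated edges, which is an assumption of this sketch.

```python
def reciprocity(edges):
    """Fraction of directed edges (u, v) for which (v, u) also exists."""
    edge_set = set(edges)
    mutual = sum(1 for (u, v) in edge_set if u != v and (v, u) in edge_set)
    return mutual / len(edge_set)


# Hypothetical directed graph in which only v1 and v2 link to each other.
edges = [("v1", "v2"), ("v2", "v1"), ("v2", "v3"), ("v3", "v1")]
r = reciprocity(edges)
```

Two of the four edges are reciprocated, so `r` is 0.5.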
  • 60. II. Balance and Status
      • Measuring consistency in friendships
  • 61. Social Balance Theory
      • Social balance theory
        – Consistency in friend/foe relationships among individuals
        – Informally, friend/foe relationships are consistent when
          • The friend of my friend is my friend
          • The friend of my enemy is my enemy
          • The enemy of my enemy is my friend
          • The enemy of my friend is my enemy
      • In the network
        – Positive edges demonstrate friendships ($w_{ij} = 1$)
        – Negative edges demonstrate being enemies ($w_{ij} = -1$)
      • A triangle of nodes $i$, $j$, and $k$ is balanced if and only if $w_{ij} w_{jk} w_{ki} \geq 0$
        – $w_{ij}$ denotes the value of the edge between nodes $i$ and $j$
  • 62. Social Balance Theory: Possible Combinations
      For any cycle, if the product of its edge values is positive, then the cycle is socially balanced
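The balance test is a sign check on the product of the edge values around a cycle. The sketch below assumes edge weights of +1 (friend) and -1 (enemy); the function name is hypothetical.

```python
from math import prod


def is_balanced(cycle_weights):
    """A cycle is socially balanced when the product of its edge values is positive."""
    return prod(cycle_weights) > 0


# The four possible sign combinations for a triangle.
all_friends = is_balanced([1, 1, 1])          # three mutual friends
common_enemy = is_balanced([1, -1, -1])       # two friends share an enemy
split_loyalty = is_balanced([1, 1, -1])       # my two friends are enemies
all_enemies = is_balanced([-1, -1, -1])       # three mutual enemies
```

The first two configurations are balanced and the last two are not, matching the rule that the number of negative edges in a balanced triangle must be even.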
  • 63. Social Status Theory
      • Status: how prestigious an individual is ranked within a society
      • Social status theory:
        – How consistent individuals are in assigning status to their neighbors
        – Informally, if $X$ has a higher status than $Y$ and $Y$ has a higher status than $Z$, then $X$ should have a higher status than $Z$
  • 64. Social Status Theory: Example
      • A directed '+' edge from node $X$ to node $Y$ shows that $Y$ has a higher status than $X$, and a '-' edge shows the opposite
      • [Figure: an unstable configuration and a stable configuration]
  • 65. Similarity
      How similar are two nodes in a network?
      • Structural Equivalence
      • Regular Equivalence
  • 66. Structural Equivalence
      • Structural equivalence:
        – We look at the neighborhood shared by two nodes;
        – The size of this shared neighborhood defines how similar two nodes are.
      • Example:
        – Two brothers have in common
          • sisters, mother, father, grandparents, etc.
        – This shows that they are similar
        – Two random male or female individuals do not have much in common and are dissimilar
  • 67. Structural Equivalence: Definitions
      • Vertex similarity: $\sigma(v_i, v_j) = |N(v_i) \cap N(v_j)|$
      • Normalize?
        – Jaccard similarity: $\sigma_{Jaccard}(v_i, v_j) = \frac{|N(v_i) \cap N(v_j)|}{|N(v_i) \cup N(v_j)|}$
        – Cosine similarity: $\sigma_{Cosine}(v_i, v_j) = \frac{|N(v_i) \cap N(v_j)|}{\sqrt{|N(v_i)| \, |N(v_j)|}}$
      • The neighborhood $N(v)$ often excludes the node $v$ itself.
        – What can go wrong?
          • Connected nodes that do not share a neighbor will be assigned zero similarity
        – Solution:
          • We can assume nodes are included in their own neighborhoods
  • 68. Similarity: Example
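Both normalized similarities follow from set operations on the neighborhoods. The sketch below assumes the standard Jaccard and cosine forms (intersection over union, and intersection over the geometric mean of the neighborhood sizes); the function names and the five-node graph are hypothetical and are not the graph of the example slide.

```python
from math import sqrt


def jaccard(adj, u, v):
    """|N(u) & N(v)| / |N(u) | N(v)|, or 0 when both neighborhoods are empty."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def cosine(adj, u, v):
    """|N(u) & N(v)| / sqrt(|N(u)| * |N(v)|), or 0 when a neighborhood is empty."""
    denom = sqrt(len(adj[u]) * len(adj[v]))
    return len(adj[u] & adj[v]) / denom if denom else 0.0


# Hypothetical undirected graph; neighborhoods exclude the node itself.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"b"},
}
jac_ab = jaccard(adj, "a", "b")   # shared {c, d}, union {c, d, e}
cos_ab = cosine(adj, "a", "b")    # 2 / sqrt(2 * 3)
jac_ac = jaccard(adj, "a", "c")   # a and c are connected but share no neighbor
```

The pair a, c illustrates the pitfall noted above: the two nodes are adjacent yet receive zero similarity when neighborhoods exclude the node itself.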
  • 69. Similarity Significance
      Measuring similarity significance: compare the calculated similarity value with its expected value when vertices pick their neighbors at random
      • For vertices $v_i$ and $v_j$ with degrees $d_i$ and $d_j$, this expectation is $d_i d_j / n$
        – There is a $d_i / n$ chance of becoming $v_i$'s neighbor
        – $v_j$ selects $d_j$ neighbors
      • We can rewrite neighborhood overlap as
        $\sigma(v_i, v_j) = |N(v_i) \cap N(v_j)| = \sum_k A_{i,k} A_{j,k}$
  • 70. Normalized Similarity, cont.
        $\sigma_{significance}(v_i, v_j) = \sum_k A_{i,k} A_{j,k} - \frac{d_i d_j}{n} = \sum_k (A_{i,k} - \bar{A}_i)(A_{j,k} - \bar{A}_j)$, where $\bar{A}_i = \frac{1}{n} \sum_k A_{i,k}$
      What is this?
  • 71. Normalized Similarity, cont.
      • It is $n$ times the covariance between $A_i$ and $A_j$
      • Normalize the covariance by the product of the standard deviations, and we get the Pearson correlation coefficient (range $[-1, 1]$):
        $\sigma_{pearson}(v_i, v_j) = \frac{\sum_k (A_{i,k} - \bar{A}_i)(A_{j,k} - \bar{A}_j)}{\sqrt{\sum_k (A_{i,k} - \bar{A}_i)^2} \sqrt{\sum_k (A_{j,k} - \bar{A}_j)^2}}$
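The significance score and its normalized form can be computed from two rows of the adjacency matrix. This sketch assumes the overlap-minus-expectation form $\sum_k A_{i,k} A_{j,k} - d_i d_j / n$ and the usual Pearson correlation between rows; the function names and the four-node path graph are hypothetical.

```python
from math import sqrt


def significance(A, i, j):
    """Observed neighborhood overlap minus its expectation d_i * d_j / n."""
    n = len(A)
    overlap = sum(A[i][k] * A[j][k] for k in range(n))
    return overlap - sum(A[i]) * sum(A[j]) / n


def pearson_similarity(A, i, j):
    """Pearson correlation between rows i and j of the adjacency matrix."""
    n = len(A)
    mean_i = sum(A[i]) / n
    mean_j = sum(A[j]) / n
    cov = sum((A[i][k] - mean_i) * (A[j][k] - mean_j) for k in range(n))
    var_i = sum((A[i][k] - mean_i) ** 2 for k in range(n))
    var_j = sum((A[j][k] - mean_j) ** 2 for k in range(n))
    if var_i == 0 or var_j == 0:
        return 0.0  # constant row: correlation undefined, reported as 0 in this sketch
    return cov / sqrt(var_i * var_j)


# Hypothetical path graph 0 - 1 - 2 - 3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
sig_02 = significance(A, 0, 2)           # overlap 1, expectation 1 * 2 / 4
pear_02 = pearson_similarity(A, 0, 2)
pear_01 = pearson_similarity(A, 0, 1)    # no shared neighbor: negative correlation
```

The numerator of `pearson_similarity` equals `significance`, which is the identity the "What is this?" slide points to.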
  • 72. Regular Equivalence
      • In regular equivalence,
        – We do not look at neighborhoods shared between individuals, but
        – At how similar the neighborhoods themselves are
      • Example:
        – Athletes are similar not because they know each other in person, but because they know similar individuals, such as coaches, trainers, other players, etc.
  • 73. Regular Equivalence
      • $v_i$ and $v_j$ are similar when their neighbors $v_k$ and $v_l$ are similar:
        $\sigma_{regular}(v_i, v_j) = \alpha \sum_{k,l} A_{i,k} A_{j,l} \, \sigma_{regular}(v_k, v_l)$
      • This equation (left figure) is hard to solve since it is self-referential, so we relax our definition using the right figure
  • 74. Regular Equivalence
      • $v_i$ and $v_j$ are similar when $v_j$ is similar to $v_i$'s neighbors $v_k$:
        $\sigma_{regular}(v_i, v_j) = \alpha \sum_k A_{i,k} \, \sigma_{regular}(v_k, v_j)$
      • In vector format: $\sigma_{regular} = \alpha A \sigma_{regular}$
      • A vertex is highly similar to itself; we guarantee this by adding an identity matrix to the equation:
        $\sigma_{regular} = \alpha A \sigma_{regular} + I$, hence $\sigma_{regular} = (I - \alpha A)^{-1}$
      • When $\alpha < 1/\lambda_{max}$, the matrix is invertible
  • 75. Regular Equivalence: Example
      • The largest eigenvalue of $A$ is 2.43
      • Set $\alpha = 0.3 < 1/2.43$
      • Any row/column of the resulting matrix shows the similarity to other vertices
      • Vertex 1 is most similar (other than itself) to vertices 2 and 3
      • Nodes 2 and 3 have the highest similarity (regular equivalence)
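The relaxed definition with the added identity matrix can be solved by repeating the update $\sigma \leftarrow \alpha A \sigma + I$, which converges to $(I - \alpha A)^{-1}$ when $\alpha < 1/\lambda_{max}$. The sketch below uses plain Python lists instead of a linear algebra library; the function name, the iteration count, and the three-node path graph are hypothetical and differ from the example slide's graph.

```python
def regular_equivalence(A, alpha, iterations=200):
    """Fixed-point iteration sigma <- alpha * A * sigma + I.

    Assumption: alpha < 1 / (largest eigenvalue of A), so the iteration converges.
    """
    n = len(A)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    sigma = [row[:] for row in identity]
    for _ in range(iterations):
        sigma = [
            [alpha * sum(A[i][k] * sigma[k][j] for k in range(n)) + identity[i][j]
             for j in range(n)]
            for i in range(n)
        ]
    return sigma


# Hypothetical path graph 0 - 1 - 2; its largest eigenvalue is sqrt(2), about 1.414,
# so alpha = 0.5 satisfies alpha < 1 / lambda_max.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
sigma = regular_equivalence(A, alpha=0.5)

# Closed form (I - alpha * A)^{-1} for this graph, worked out by hand.
expected = [
    [1.5, 1.0, 0.5],
    [1.0, 2.0, 1.0],
    [0.5, 1.0, 1.5],
]
```

As on the example slide, each row of `sigma` is read as the similarity of one vertex to all others, with the diagonal holding self-similarity.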

Editor's notes

  1. n – 1 is the maximum degree a node can have; 2|E| is the sum of all node degrees.
  2. Out-degree is used.
  3. Initial assignments are not important. In this example, \alpha = 1 and \beta = 0. Normally, \alpha = 0.85 and \beta = 0.15.