We analyse the corpus of user relationships of the Slashdot technology news site. The data was collected from the Slashdot Zoo feature, where users of the website can tag other users as friends and foes, providing positive and negative endorsements. We adapt social network analysis techniques to the problem of negative edge weights. In particular, we consider signed variants of global network characteristics such as the clustering coefficient, node-level characteristics such as centrality and popularity measures, and link-level characteristics such as distances and similarity measures. We evaluate these measures on the task of identifying unpopular users, as well as on the task of predicting the sign of links, and show that the network exhibits multiplicative transitivity, which allows algebraic methods based on matrix multiplication to be used. We compare our methods to traditional methods, which are only suitable for positively weighted edges.
The Slashdot Zoo: Mining a Social Network with Negative Edges
1. The Slashdot Zoo
Mining a Social Network with Negative Edges
Jérôme Kunegis
DAI-Labor, Technische Universität Berlin, Germany
2. Outline
Introduction: The Slashdot Zoo
1. Balance and the Signed Clustering Coefficient
2. Popularity, Trust and Trolls
3. Visualization, Clustering and the Signed Laplacian
4. Link Sign Prediction
Discussion
Kunegis et al. The Slashdot Zoo: Mining a Social Network with Negative Edges 2
3. Slashdot (http://slashdot.org/)
Technology news website founded in 1997
Powered by Slash (slashcode.org)
Features: user accounts, threads, moderation, tags, journals and the zoo (and more)
4. The Slashdot Zoo (Kunegis 2009)
• Users can tag other users as friends and foes
• Nomenclature: You are the fan of your friends and the freak of your foes
[Diagram: the four relationships around "me": Friend of, Foe of, Fan of, Freak of]
5. Statistics about the Slashdot Zoo
Statistics about the giant connected component:
77,985 users
510,157 links (388,190 friends / 122,967 foes)
75.9% of all links are positive
Sparsity: 0.00839% of all possible edges exist
Mean links per user: 6.54 (4.98 friends / 1.56 foes)
Median number of links per user: 3
Diameter = 6, Radius = 3
6. Famous (and Popular?) Slashdotters
From left to right:
CmdrTaco (Rob Malda, founder of Slashdot)
John Carmack (Quake, Doom, etc.)
Bruce Perens (Debian, Open Source Definition)
CleverNickName (Wil Wheaton, Star Trek)
7. The Slashdot Zoo
GREEN: friend link
RED: foe link
Centered at CmdrTaco
8. Degree Distributions and Power Laws
[Plots: degree distributions for Friends, Foes, Fans, Freaks and Total]
• Observation: power laws for all four relationship types
• Cutoff at 200 friends/foes (400 for registered users)
• The Slashdot Zoo is scale-free
9. 1. Balance and the Multiplication Rule
Assumption: The enemy of my enemy is my friend
– See e.g. (Hage & Harary 1983)
• Mathematical formulation:
friend = +1, foe = −1
friend × friend = foe × foe = +1
friend × foe = foe × friend = −1
[Diagram: triad with edge labels +1 and −1 and one unknown edge marked "?"]
• A.k.a. ‘multiplicative transitivity’
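The multiplication rule can be written as a one-line function. The following is a minimal sketch; the names `FRIEND`, `FOE` and `implied_sign` are illustrative and do not come from the slides.

```python
FRIEND, FOE = +1, -1

def implied_sign(sign_ik: int, sign_kj: int) -> int:
    """Sign implied for the pair (i, j) by the known edges i-k and k-j."""
    return sign_ik * sign_kj

# "The enemy of my enemy is my friend"
print(implied_sign(FOE, FOE))      # 1
# "The friend of my enemy is my enemy"
print(implied_sign(FOE, FRIEND))   # -1
```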
10. Network Balance (Harary 1953)
Look at triads of users:
● In balanced triangles, the multiplication rule holds
● If it doesn't, there is conflict
[Diagrams: example triads labelled "Balance" and "Conflict"]
11. The Clustering Coefficient
Def.: Percentage of incident edge pairs completed by an edge to form a triangle
C = |A ∘ A²|+ / |A²|+
(∘ is the entrywise product, |·|+ the sum of all matrix entries)
● Characteristic number of a network, 0 ≤ C ≤ 1 (Watts & Strogatz, 1998)
● High clustering coefficient: clustered graph with many cliques. (A graph is clustered when the value is higher than that predicted by random graph models.)
● Slashdot Zoo has C = 3.22% (vs. 0.0095% random)
● The Slashdot Zoo is a small-world network
[Diagram: two incident edges and the question "Edge present?"]
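As a rough illustration, the formula for C can be evaluated directly with NumPy. This sketch reads the formula literally on a hypothetical four-node toy graph (not Slashdot data); note that the denominator then also counts two-step paths that return to their start node, so the value may differ from other common definitions of the clustering coefficient.

```python
import numpy as np

def clustering_coefficient(A: np.ndarray) -> float:
    """C = |A o A^2|+ / |A^2|+, where |.|+ sums all matrix entries."""
    A2 = A @ A                    # (A^2)ij = number of two-step paths i -> k -> j
    closed = (A * A2).sum()       # two-step paths whose endpoints are also linked
    return closed / A2.sum()

# Hypothetical toy graph: triangle 0-1-2 plus node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

C = clustering_coefficient(A)
print(C)  # 6 closed two-step paths out of 18, i.e. one third
```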
12. Signed Clustering Coefficient
• Denote the amount to which the network is balanced by counting “wrongly” signed edges negatively
CS = |A ∘ A²|+ / |abs(A)²|+
• Range: −1 ≤ CS ≤ +1
• Slashdot Zoo has CS = +2.46% (vs. 0% for random)
• Relative signed clustering coefficient: CS / C = +76.4%
• The Slashdot Zoo is balanced
[Diagram: nodes u and v with the question whether edge uv is present and with which sign (±)]
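The signed variant can be sketched the same way. The helper `triangle` and the two three-node graphs below are hypothetical examples chosen so that the result is easy to check by hand: a balanced triangle gives a relative signed clustering coefficient of +1, a triangle of three mutual foes gives −1.

```python
import numpy as np

def clustering_coefficient(A: np.ndarray) -> float:
    """Unsigned C, computed on abs(A)."""
    A_abs = np.abs(A)
    A2 = A_abs @ A_abs
    return (A_abs * A2).sum() / A2.sum()

def signed_clustering_coefficient(A: np.ndarray) -> float:
    """CS = |A o A^2|+ / |abs(A)^2|+ ; conflicting triangles count negatively."""
    A_abs = np.abs(A)
    return (A * (A @ A)).sum() / (A_abs @ A_abs).sum()

def triangle(s01: int, s02: int, s12: int) -> np.ndarray:
    """Symmetric signed adjacency matrix of a triangle with the given edge signs."""
    A = np.zeros((3, 3), dtype=int)
    A[0, 1] = A[1, 0] = s01
    A[0, 2] = A[2, 0] = s02
    A[1, 2] = A[2, 1] = s12
    return A

balanced = triangle(+1, -1, -1)   # two friends sharing a common foe
conflict = triangle(-1, -1, -1)   # three mutual foes

for name, A in [("balanced", balanced), ("conflict", conflict)]:
    ratio = signed_clustering_coefficient(A) / clustering_coefficient(A)
    print(name, ratio)
```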
13. 2. Popularity, Trust and Trolls
Central (close to other nodes)
Important (connects nodes)
Unpopular (many freaks)
Popular (many fans)
Distrusted (many trusted freaks)
Trusted (many trusted fans)
14. Node Characteristics
Node characteristics that apply to individual nodes:
● Centrality: How central is the node in the network?
● Importance: How ‘important’ is a user in the network?
● Popularity: How popular is a user?
● Trust: Can a user be trusted?
Node characteristics allow opposites:
● Popularity → Unpopularity
● Trust → Distrust
Can negative edges be used to predict unpopularity and distrust?
15. Computing Node Characteristics
Popularity and trust measures:
● Fan count minus freak count
● PageRank (Brin & Page 1998)
● EigenTrust (Kamvar 2003)
16. PageRank (Brin & Page 1998)
PageRank is an algebraic measure; it is defined using matrices:
The adjacency matrix A:
Aij = 1 when (i, j) is an edge, Aij = 0 otherwise
The normalized adjacency matrix:
N = D⁻¹ A with Dii = Σj Aij
The ‘Google matrix’:
Gij = (1 − α) Nij + α / n with α = 0.15 (can be varied)
• Compute PageRank by iterated multiplication of any vector with G
v' = G v
• Result: Upper eigenvector of matrix G
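The iteration can be sketched with NumPy as below. Two points are assumptions of this sketch rather than statements from the slide: the weight on N is taken as 1 − α (the usual PageRank form), and because N is row-normalized, the vector is multiplied by the transpose of G so that score flows along the edge direction. The four-node graph is a hypothetical example in which every node has at least one outgoing edge.

```python
import numpy as np

def pagerank(A: np.ndarray, alpha: float = 0.15, iterations: int = 100) -> np.ndarray:
    """Power iteration on G = (1 - alpha) N + alpha / n with N = D^-1 A."""
    n = A.shape[0]
    out_degree = A.sum(axis=1)             # Dii = sum_j Aij, assumed non-zero
    N = A / out_degree[:, None]            # row-normalized adjacency matrix
    G = (1 - alpha) * N + alpha / n        # teleportation added to every entry
    v = np.full(n, 1.0 / n)                # any positive start vector works
    for _ in range(iterations):
        v = G.T @ v                        # transpose: rows of G sum to one
    return v

# Hypothetical toy graph: nodes 0, 1 and 3 link to node 2; node 2 links to node 0.
A = np.array([[0, 0, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)

scores = pagerank(A)
print(scores)   # node 2 receives the highest score
```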
17. EigenTrust (Kamvar 2003)
Exploit negative edges in the calculation of PageRank:
Aij = ±1 when (i, j) is an edge, Aij = 0 otherwise
N = D⁻¹ A with Dii = Σj |Aij|
Gij = (1 − α) Nij + α / n
Implicit assumption: The multiplication rule holds
v'' = G G v
(A A)ij = Σk Aik Akj
Observation: Matrix multiplication relies on edge weight products
Thus: Algebraic methods assume the validity of the multiplication rule.
18. Popular and Trusted Users
Top six users (#1 to #6) by measure:
Fans minus Freaks: #1 CleverNickName, #2 Bruce Perens, #3 CmdrTaco, #4 John Carmack, #5 NewYorkCountryLawyer, #6 $$$$$exyGal
PageRank: #1 FortKnox, #2 SamTheButcher, #3 Ethelred Unraed, #4 turg, #5 Some Woman, #6 gmhowell
EigenTrust: #1 FortKnox, #2 SamTheButcher, #3 turg, #4 Some Woman, #5 Ethelred Unraed, #6 gmhowell
Key: Famous persons – Trolls – Active users
Observation:
Fans minus Freaks denotes prominence,
PageRank and EigenTrust denote community.
19. Detecting Trolls
● Slashdot is known for its trolls
trolling, n. posting disruptive, false or offensive information to fool and provoke readers
• Task: Predict foes of blacklist “No More Trolls” (162 names [1])
PhysicsGenius
Profane Motherfucker
ObviousGuy CmderTaco
Klerck
YourMissionForToday
$$$$$exyGal IN SOVIET RUSSIA
SexyKellyOsbourneBankofAmerica_ATM strat
j0nkatz spinlocked
jakt
CmdrTaco (editor)
CmdrTaco (troll)
TrollBurger Twirlip of the Mists
[1] See http://slashdot.org/~No+More+Trolls/foes/
20. PageRank and EigenTrust of Trolls
[Plot: PageRank and EigenTrust values of trolls and non-trolls]
21. Negative Rank
• Observations:
PageRank and EigenTrust are almost equal for most users
For trolls, EigenTrust is less than PageRank
• Conclusion:
Define NegativeRank = EigenTrust − PageRank
How does Negative Rank perform at troll prediction?
22. Performance at Prediction
• Mean average precision (MAP) at troll prediction
• Negative Rank works best!
• Thus: trolling is a community phenomenon
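To make the evaluation concrete, the sketch below ranks users by Negative Rank = EigenTrust − PageRank (most negative first, since trolls have EigenTrust below PageRank) and scores the ranking with average precision. All user names and score values are made up for illustration; they are not results from the Slashdot data. With a single ranked list the measure is the average precision; MAP would be its mean over several such lists.

```python
def average_precision(ranked_users, trolls):
    """Mean of the precision values at each position where a troll is found."""
    hits = 0
    precision_sum = 0.0
    for position, user in enumerate(ranked_users, start=1):
        if user in trolls:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(trolls) if trolls else 0.0

# Hypothetical scores (illustration only, not taken from the Slashdot corpus).
eigentrust = {"user_a": 0.30, "user_b": 0.25, "user_c": 0.05, "user_d": 0.20, "user_e": 0.02}
pagerank = {"user_a": 0.29, "user_b": 0.26, "user_c": 0.20, "user_d": 0.19, "user_e": 0.06}
known_trolls = {"user_b", "user_c"}   # hypothetical blacklist

negative_rank = {u: eigentrust[u] - pagerank[u] for u in eigentrust}
ranking = sorted(negative_rank, key=negative_rank.get)   # most negative first

print(ranking[:3])
print(average_precision(ranking, known_trolls))
```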
23. 3. Visualization, Clustering and the Signed Laplacian
● Graph drawing: Place each node at the center of its neighbors
v0 = (1/3) (v1 + v2 + v3)
[Diagram: node v0 at the center of its neighbors v1, v2 and v3]
Algebraically: D v = A v
Solution 1: Upper eigenvectors of D⁻¹ A using Dii = Σj Aij
Solution 2: Lower eigenvectors of D − A
We look at solution 2: L = D − A is the Laplacian matrix
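A minimal sketch of solution 2 with NumPy follows. The function name `laplacian_layout` and the toy graph (two triangles joined by one edge) are assumptions made for illustration. For a connected unsigned graph the eigenvector with eigenvalue 0 is constant and carries no layout information, so the sketch skips it and uses the next eigenvectors as coordinates.

```python
import numpy as np

def laplacian_layout(A: np.ndarray, dim: int = 2):
    """Coordinates from the eigenvectors of L = D - A with smallest eigenvalues."""
    D = np.diag(A.sum(axis=1))
    L = D - A
    eigenvalues, eigenvectors = np.linalg.eigh(L)   # ascending eigenvalues
    # Column 0 is the constant vector (eigenvalue 0); skip it.
    return eigenvalues, eigenvectors[:, 1:1 + dim]

# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1

eigenvalues, coords = laplacian_layout(A)
print(coords[:, 0])   # first coordinate has opposite signs for the two triangles
```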
24. Drawing Signed Graphs (Kunegis & Lerner 2010)
• Replace ‘negative’ neighbors by their antipodal points
v0 = (1/3) (−v1 + v2 + v3)
[Diagram: node v0 placed using −v1, the antipodal point of its negative neighbor v1, together with v2 and v3]
Solution: lower eigenvectors of L = D − A
Note: Dii = Σj |Aij|
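The signed case can be sketched in the same way. The toy graph is hypothetical: two triangles of friends joined by a single foe edge. Because this graph is balanced, the signed Laplacian has an eigenvalue 0 whose eigenvector already separates the two groups, and that eigenvector satisfies D v = A v, which is the antipodal placement rule above.

```python
import numpy as np

def signed_laplacian(A: np.ndarray):
    """L = D - A with Dii = sum_j |Aij|; returns (L, D)."""
    D = np.diag(np.abs(A).sum(axis=1))
    return D - A, D

# Hypothetical toy graph: friend triangles {0, 1, 2} and {3, 4, 5}, foe edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = +1
A[2, 3] = A[3, 2] = -1

L, D = signed_laplacian(A)
eigenvalues, eigenvectors = np.linalg.eigh(L)   # ascending eigenvalues
v = eigenvectors[:, 0]                          # lowest eigenvector

print(eigenvalues[0])   # close to zero because the graph is balanced
print(np.sign(v))       # one sign per group
```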
25. Example: Synthetic Graph
Unsigned Graph Drawing → Signed Graph Drawing
26. Example: Wikipedia Reverts
• Wikipedia users editing an article revert each other
• All edges are negative
• Distance to center normalized to unit
• Four groups are apparent
27. Example: Tribal Groups (Hage 1983)
The tribal groups of the Eastern Central Highlands of New Guinea can
be friends (‘rova’) or enemies (‘hina’)
28. Clustering: Finding Communities
The Laplacian matrix finds communities:
• Communities are connected by many positive edges
• Communities are separated by many negative edges
29. Signed Spectral Clustering (Kunegis 2010)
• Compute the d eigenvectors of L having smallest eigenvalue
• Use k-means to cluster nodes in this d-dimensional space
• Minimizes signed normalized cuts between communities X and Y
SNC(X, Y) = (|X|⁻¹ + |Y|⁻¹) · (2 pos(X, Y) + neg(X, X) + neg(Y, Y))
pos/neg is the number of positive/negative edges between two communities
• Plot: Clustering the Slashdot Zoo
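The two steps can be sketched as follows, under several assumptions: the toy graph is hypothetical, the small `kmeans` helper is a plain Lloyd iteration with a simple deterministic initialisation (points spread along the first coordinate) rather than a production implementation, and d = k = 2 is chosen only for the example.

```python
import numpy as np

def spectral_embedding(A: np.ndarray, d: int) -> np.ndarray:
    """Rows are node coordinates: the d eigenvectors of the signed Laplacian
    L = D - A (Dii = sum_j |Aij|) with smallest eigenvalues."""
    L = np.diag(np.abs(A).sum(axis=1)) - A
    _, eigenvectors = np.linalg.eigh(L)          # ascending eigenvalues
    return eigenvectors[:, :d]

def kmeans(points: np.ndarray, k: int, iterations: int = 20) -> np.ndarray:
    """Plain Lloyd iteration; initial centroids are spread along coordinate 0."""
    order = np.argsort(points[:, 0])
    picks = np.linspace(0, len(points) - 1, k).astype(int)
    centroids = points[order[picks]].copy()
    for _ in range(iterations):
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):              # keep old centroid if cluster is empty
                centroids[c] = points[labels == c].mean(axis=0)
    return labels

# Hypothetical toy graph: friend triangles {0, 1, 2} and {3, 4, 5}, foe edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = +1
A[2, 3] = A[3, 2] = -1

labels = kmeans(spectral_embedding(A, d=2), k=2)
print(labels)   # the two triangles end up in different clusters
```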
30. 4. Link Sign Prediction
Task: Predict the sign of links
Aᵀ – Mutual friendship: Exploit asymmetry
A² – Triangle closing: Exploit multiplication rule
(A)k – Rank reduction: Exploit latent structure
(A + Aᵀ)k – Symmetric rank reduction: Exploit asymmetry and latent structure
exp{α (A + Aᵀ)} – Matrix exponential: Exploit multiplication rule, clustering and asymmetry
{I − α (A + Aᵀ)}⁻¹ – von Neumann kernel: Exploit multiplication rule, diffusion and asymmetry
(D − A)⁺ – Signed Laplacian kernel: Exploit topology and multiplication rule
31. Matrix Powers
• The power of A contains weighted path counts:
(A^n)ij = Σ_{|p| = n} sgn(p)
sgn(p) = Π_{(u, v) ∈ p} Auv
where the sum is over all paths of length n from i to j and the product over all edges in the path p.
• sgn(p) defines positive and negative paths:
(A^n)ij = pos(i, j) − neg(i, j)
where pos(i, j) and neg(i, j) count positive and negative paths between two nodes
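The identity can be checked by brute force on a small example. In this sketch the signed graph is hypothetical, and "paths" are taken to be node sequences that may revisit nodes, which is what the matrix power counts.

```python
import itertools
import numpy as np

def signed_path_counts(A: np.ndarray, i: int, j: int, n: int):
    """Count positive and negative length-n paths from i to j by enumeration."""
    pos = neg = 0
    nodes = range(A.shape[0])
    for middle in itertools.product(nodes, repeat=n - 1):
        path = (i, *middle, j)
        sign = 1
        for u, v in zip(path, path[1:]):
            sign *= A[u, v]          # becomes 0 as soon as an edge is missing
        if sign > 0:
            pos += 1
        elif sign < 0:
            neg += 1
    return pos, neg

# Hypothetical signed graph: friends 0-1 and 2-3, foes 0-2 and 1-2.
A = np.array([[ 0,  1, -1, 0],
              [ 1,  0, -1, 0],
              [-1, -1,  0, 1],
              [ 0,  0,  1, 0]])

n = 3
A_n = np.linalg.matrix_power(A, n)
pos, neg = signed_path_counts(A, 0, 3, n)
print(pos, neg, A_n[0, 3])   # the matrix entry equals pos - neg
```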
32. Matrix Exponential
The exponential function for matrices:
exp(A) = I + A + 1/2 A² + 1/6 A³ + …
• The matrix exponential is a sum over all paths
– Counting negative paths negatively
– Weighting each path with the inverse factorial of its length
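A truncated version of the series is enough to illustrate link sign prediction. In this sketch the number of terms, the value α = 0.5 and the three-node graph are assumptions chosen for the example; a library routine for the matrix exponential would normally replace the hand-written series.

```python
import numpy as np

def matrix_exponential(A: np.ndarray, terms: int = 30) -> np.ndarray:
    """Truncated series exp(A) = I + A + A^2/2! + A^3/3! + ..."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k          # now equals A^k / k!
        result += term
    return result

# Hypothetical directed graph: user 0 tags 1 as foe, user 1 tags 2 as foe.
A = np.zeros((3, 3))
A[0, 1] = -1
A[1, 2] = -1

alpha = 0.5                           # assumed value, would be tuned in practice
K = matrix_exponential(alpha * (A + A.T))

# Nodes 0 and 2 are not linked; the sign of K[0, 2] is the prediction.
print(np.sign(K[0, 2]))               # positive: the foe of my foe is my friend
```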
33. Evaluation Results
Accuracy is measured on a scale from −1 to +1.
[Chart: link sign prediction accuracy per method; recoverable values: 1 = 0.517, Aᵀ = 0.536, A² = 0.552]
Best link sign prediction: matrix exponential, confirms the multiplication rule
34. Summary
● The Slashdot Zoo is a signed, scale-free and small-world network
● Multiplication rule ‘the enemy of my enemy is my friend’ confirmed at global, nodal and relational scale
● The multiplication rule is implicit in algebraic approaches
● New concepts that exploit the multiplication rule:
Signed clustering coefficient – To characterize balance
Negative Rank – For troll prediction
Signed Laplacian matrix – For clustering, prediction and visualization
35. Ongoing Work
• More signed network datasets
– Essembly.org, Epinions.com (distrust), LibimSeTi.cz (dating site ratings), Wikipedia adminship votes, all rating graphs
• Other networks that can be extended to negative values
– Folksonomies with negative tags (e.g. !funny)
• Social networks with more than two relationship types
37. References
S. Brin, L. Page. The anatomy of a large-scale hypertextual Web search engine, Proc. Int. Conf. on World Wide Web, pages 107–117, 1998.
P. Hage, F. Harary. Structural models in anthropology, Cambridge University Press, 1983.
F. Harary. On the notion of balance of a signed graph, Michigan Math. J., 2:143–146, 1953.
S. D. Kamvar, M. T. Schlosser, H. Garcia-Molina. The EigenTrust algorithm for reputation management in P2P networks, Proc. Int. Conf. on World Wide Web, pages 640–651, 2003.
J. Kunegis, A. Lommatzsch, C. Bauckhage. The Slashdot Zoo: Mining a social network with negative edges, Proc. Int. World Wide Web Conf., pages 741–750, 2009.
J. Kunegis, S. Schmidt, A. Lommatzsch, J. Lerner, E. De Luca, S. Albayrak. Spectral analysis of signed graphs for clustering, prediction and visualization, Proc. SIAM Int. Conf. on Data Mining, 2010. [presentation on April 30]
J. Kunegis, J. Lerner, A. Lommatzsch, S. Schmidt. Advances in spectral drawing of signed conflict networks, unpublished, 2010.
J. Leskovec, D. Huttenlocher, J. Kleinberg. Predicting positive and negative links in online social networks, Proc. Int. World Wide Web Conf., 2010. [presentation on April 28]
D. J. Watts, S. H. Strogatz. Collective dynamics of ‘small-world’ networks, Nature 393(6684):440–442, 1998.
38. Appendix – Screenshots