1. Parallel Community Detection for Massive Graphs
E. Jason Riedy, Henning Meyerhenke, David Ediger, and
David A. Bader
13 September 2011
2. Main contributions
• First massively parallel algorithm for community detection.
• A very general algorithm: uses a weighted matching and
applies to many optimization criteria.
• Partition a 122 million vertex, 1.99 billion edge graph into
communities in 2 hours.
PPAM11—Parallel Community Detection—Jason Riedy 2/25
3. Exascale data analysis
Health care: finding outbreaks, population epidemiology
Social networks: advertising, searching, grouping
Intelligence: decisions at scale, regulating algorithms
Systems biology: understanding interactions, drug design
Power grid: disruptions, conservation
Simulation: discrete events, cracking meshes
• Graph clustering is common in all application areas.
4. These are not easy graphs.
Yifan Hu’s (AT&T) visualization of the LiveJournal data set
5. But no shortage of structure...
Protein interactions: Giot et al., “A Protein Interaction Map of
Drosophila melanogaster”, Science 302, 1722–1736, 2003.
Jason’s network via LinkedIn Labs
• Locally, there are clusters or communities.
• First pass over a massive social graph:
• Find smaller communities of interest.
• Analyze / visualize top-ranked communities.
• Our part: First massively parallel community detection method.
6. Outline
Motivation
Defining community detection and metrics
Our parallel method
Results
Implementation and platform details
Performance
Conclusions and plans
7. Community detection
What do we mean?
• Partition a graph’s
vertices into disjoint
communities.
• A community locally
maximizes some metric.
• Modularity,
conductance, ...
• Trying to capture that vertices are more similar within one
community than between communities.
Jason’s network via LinkedIn Labs
8. Community detection
Assumptions
• Disjoint partitioning of
vertices.
• There is no one unique
answer.
• Some metrics are NP-complete
to optimize [Brandes, et al.].
• Graph is lossy
representation.
• Want an adaptable
detection method.
Jason’s network via LinkedIn Labs
9. Common community metric: Modularity
• Modularity: Deviation of connectivity in the community
induced by a vertex set S from some expected background
model of connectivity.
• We take [Newman]’s basic uniform model.
• Let m count all edges in graph G, m_S count the edges with
both endpoints in S, and x_S count the edge endpoints in S.
Modularity Q_S:
Q_S = (m_S − x_S² / (4m)) / m
• Total modularity: sum of Q_S over all communities in the partition.
• A sufficiently positive modularity implies some structure.
• Known issues: Resolution limit, NP-complete opt. prob.
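As a sketch (not part of the talk), the metric above can be computed directly; here x_S is interpreted as the sum of degrees over S (edge endpoints in S, so internal edges count twice), which makes Q_S match Newman's form Q_S = m_S/m − (x_S/2m)²:

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Total modularity: sum of Q_S over all communities, with
    Q_S = (m_S - x_S^2 / (4m)) / m, where m = #edges, m_S = edges
    inside community S, and x_S = sum of degrees of vertices in S."""
    m = len(edges)
    m_S = defaultdict(int)   # internal edge count per community
    x_S = defaultdict(int)   # degree sum per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        x_S[cu] += 1
        x_S[cv] += 1
        if cu == cv:
            m_S[cu] += 1
    return sum((m_S[c] - x_S[c] ** 2 / (4 * m)) / m for c in x_S)
```

For two triangles joined by one bridge edge and partitioned into the two triangles, this yields 2.5/7 ≈ 0.357, a clearly positive value reflecting the obvious structure.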
10. Sequential agglomerative method
[Figure: example graph with vertices A–G]
• A common method (e.g. [Clauset, et al.]) agglomerates
vertices into communities.
• Each vertex begins in its own community.
• An edge is chosen to contract.
• Merging maximally increases modularity.
• Priority queue.
• Known often to fall into an O(n²) performance trap with
modularity [Wakita & Tsurumi].
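A minimal sequential sketch of this agglomeration loop (illustrative Python, not the talk's implementation; for clarity it rescans the edge list each round where CNM drives the merges with a priority queue of scores):

```python
from collections import defaultdict

def agglomerate(n, edges):
    """Greedy agglomeration: start from singleton communities and
    repeatedly apply the merge that maximally increases modularity.
    Returns a community label per vertex."""
    m = len(edges)
    comm = list(range(n))              # community label per vertex
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    while True:
        between = defaultdict(int)     # inter-community edge counts
        x = defaultdict(int)           # degree sum per community
        for u, v in edges:
            cu, cv = comm[u], comm[v]
            if cu != cv:
                between[min(cu, cv), max(cu, cv)] += 1
        for v in range(n):
            x[comm[v]] += deg[v]
        # Merging a and b changes modularity by (e_ab - x_a*x_b/(2m))/m.
        best, best_dq = None, 0.0
        for (a, b), e_ab in between.items():
            dq = (e_ab - x[a] * x[b] / (2 * m)) / m
            if dq > best_dq:
                best, best_dq = (a, b), dq
        if best is None:               # no merge improves modularity
            return comm
        a, b = best
        comm = [a if c == b else c for c in comm]
```

The O(n²) trap mentioned above comes from unbalanced growth: one giant community keeps winning the merges, so each contraction step touches most of the graph.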
14. New parallel method
[Figure: example graph with vertices A–G]
• We use a matching to avoid the queue.
• Compute a heavy-weight, large matching.
  • Greedy algorithm.
  • Maximal matching.
  • Within a factor of 2 of the maximum weight.
• Merge all matched communities at once.
  • Maintains some balance.
  • Produces different results.
• Agnostic to weighting, matching, ...
  • Can maximize modularity, minimize conductance.
  • Modifying the matching permits easy exploration.
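One round of the matching-and-merge step can be sketched sequentially (illustrative Python; `score(cu, cv)` stands for any metric-specific gain such as delta-modularity, and the sorted greedy pass here is a sequential stand-in for the parallel greedy matching):

```python
def one_matching_round(comm, edges, score):
    """One round of the matching-based method: greedily build a
    maximal matching on inter-community edges, preferring high
    scores, then merge all matched pairs at once."""
    candidates = []
    for u, v in edges:
        cu, cv = comm[u], comm[v]
        if cu != cv:
            candidates.append((score(cu, cv), cu, cv))
    candidates.sort(reverse=True)        # greedy: heaviest first
    matched = {}
    for s, cu, cv in candidates:
        if s > 0 and cu not in matched and cv not in matched:
            matched[cu] = cv
            matched[cv] = cu
    # Merge every matched pair simultaneously.
    merge_to = {c: min(c, matched[c]) for c in matched}
    return [merge_to.get(c, c) for c in comm]
```

Because every community is merged with at most one partner per round, no single community can swallow the graph in one step, which is the balance property noted above.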
17. Implementation: Cray XMT
Tolerates latency by massive multithreading.
• Hardware: 128 threads per processor
• Every-cycle context switch
• Many outstanding memory requests (180/proc)
• Flexibly supports dynamic load balancing
• Globally hashed address space, no data cache
• Support for fine-grained, word-level synchronization
• Full/empty bit with every memory word
• 128 processor XMT at
Pacific Northwest Nat’l Lab
• 500 MHz processors, 16384
threads, 1 TB of shared
memory
Image: cray.com
18. Implementation: Data structures
Extremely basic for graph G = (V, E)
• An array of (i, j; w) weighted edge pairs, i < j (size |E|)
• An array to store self-edges, d(i) = w (size |V|)
• A temporary floating-point array for scores (size |E|)
• Two additional temporary arrays (sizes |V|, |E|) to store degrees,
matching choices, a linked list for compression, ...
• Weights count number of agglomerated vertices or edges.
• Scoring methods need only vertex-local counts.
• Greedy matching parallelizes trivially over the edge array.
• Contraction compacts the edge list in place through a
temporary, hashed linked list of duplicates.
• Full/empty bits or compare-and-swap maintain correctness.
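The contraction step can be sketched as follows (illustrative Python; a dictionary plays the role of the hashed linked list that the in-place XMT implementation uses to combine duplicate edges):

```python
from collections import defaultdict

def contract(edges, comm):
    """Contract a weighted edge list: relabel each (i, j; w) by its
    endpoints' community labels, hash together duplicate edges by
    summing weights, and accumulate intra-community weight into the
    self-edge array d."""
    merged = {}
    d = defaultdict(int)
    for i, j, w in edges:
        ci, cj = comm[i], comm[j]
        if ci == cj:
            d[ci] += w                    # self-edge weight
        else:
            key = (min(ci, cj), max(ci, cj))
            merged[key] = merged.get(key, 0) + w
    new_edges = [(i, j, w) for (i, j), w in merged.items()]
    return new_edges, dict(d)
```

The summed weights are exactly the agglomerated vertex and edge counts the scoring methods need, so each round works only with vertex-local data.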
19. Data and experiment
• Generated R-MAT graphs [Chakrabarti, et al.] (a = 0.55,
b = c = 0.1, and d = 0.25) and extracted the largest
component.
Scale  Fact.  |V|         |E|          Avg. degree  Edge group
  18      8    236 605     2 009 752       8.5        2M
  18     16    252 427     3 936 239      15.6        4M
  18     32    259 372     7 605 572      29.3        8M
  19      8    467 993     3 480 977       7.4        4M
  19     16    502 152     7 369 885      14.7        8M
  19     32    517 452    14 853 837      28.7       16M
• Three runs with each parameter setting.
• Tested both modularity (CNM for Clauset-Newman-Moore)
and McCloskey-Bader’s filtered modularity (MB).
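R-MAT sampling with the parameters above can be sketched as (illustrative Python; one edge per call, recursively descending into a quadrant of the 2^scale by 2^scale adjacency matrix):

```python
import random

def rmat_edge(scale, a=0.55, b=0.10, c=0.10, d=0.25, rng=random):
    """Sample one R-MAT edge [Chakrabarti, et al.]: at each of `scale`
    levels, pick one of the four quadrants with probabilities
    (a, b, c, d) and append the corresponding bit to each endpoint."""
    i = j = 0
    for _ in range(scale):
        r = rng.random()
        i <<= 1
        j <<= 1
        if r < a:
            pass                  # top-left quadrant
        elif r < a + b:
            j |= 1                # top-right
        elif r < a + b + c:
            i |= 1                # bottom-left
        else:
            i |= 1                # bottom-right
            j |= 1
    return i, j
```

Repeating this for the desired number of edges and extracting the largest connected component reproduces the experimental setup described above.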
23. Performance: Large and real-world
Large R-MAT Scale 27, edge factor 16 generated a component with
122 million vertices and 1.99 billion edges.
7258 seconds ≈ 2 hours on 128P
R-MAT known not to have much community
structure...
LiveJournal 4.8 million vertices, 68 million edges
2779 seconds ≈ 47 minutes on 128P
Long tail of around 1 000 matched pairs per iteration.
The tail matters for modularity and community count,
but does it matter to users?
24. Conclusions and plans
• First massively parallel algorithm based on matching for
agglomerative community detection.
• Non-deterministic algorithm, still studying impact.
• Now can easily experiment with agglomerative community
detection on real-world graphs.
• How volatile are modularity and conductance to perturbations?
• What matching schemes work well?
• How do different metrics compare in applications?
• Experimenting on OpenMP platforms.
• CSR format is more cache-friendly, but uses more memory.
• Extending to streaming graph data!
• Code available on request, to be integrated into GT packages
like GraphCT.
25. And if you can do better...
Then please do!
10th DIMACS Implementation Challenge — Graph Partitioning
and Graph Clustering
• 12-13 Feb, 2012 in Atlanta,
http://www.cc.gatech.edu/dimacs10/
• Paper deadline: 21 October, 2011
http://www.cc.gatech.edu/dimacs10/data/call-for-papers.pdf
• Challenge problem statement and evaluation rules:
http://www.cc.gatech.edu/dimacs10/data/dimacs10-rules.pdf
• Co-sponsored by DIMACS, by the Command, Control, and Interoperability
Center for Advanced Data Analysis (CCICADA); Pacific Northwest
National Lab.; Sandia National Lab.; and Deutsche
Forschungsgemeinschaft (DFG).
27. Bibliography I
D. Bader and J. McCloskey.
Modularity and graph algorithms.
Presented at UMBC, Sept. 2009.
J. Berry, B. Hendrickson, R. LaViolette, and C. Phillips.
Tolerating the community detection resolution limit with edge
weighting.
CoRR, abs/0903.1072, 2009.
U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer,
Z. Nikoloski, and D. Wagner.
On modularity clustering.
IEEE Trans. Knowledge and Data Engineering, 20(2):172–188,
2008.
28. Bibliography II
D. Chakrabarti, Y. Zhan, and C. Faloutsos.
R-MAT: A recursive model for graph mining.
In Proc. 4th SIAM Intl. Conf. on Data Mining (SDM), Orlando,
FL, Apr. 2004. SIAM.
A. Clauset, M. Newman, and C. Moore.
Finding community structure in very large networks.
Physical Review E, 70(6):66111, 2004.
M. Newman.
Modularity and community structure in networks.
Proc. of the National Academy of Sciences, 103(23):8577–8582,
2006.
K. Wakita and T. Tsurumi.
Finding community structure in mega-scale social networks.
CoRR, abs/cs/0702048, 2007.