Parallel Community Detection for Massive
Graphs
E. Jason Riedy, Henning Meyerhenke, David Ediger, and
David A. Bader

                                      13 September 2011
Main contributions




          • First massively parallel algorithm for community detection.
          • Very general algorithm using a weighted matching and applying
             to many optimization criteria.
          • Partition a 122 million vertex, 1.99 billion edge graph into
             communities in 2 hours.




PPAM11—Parallel Community Detection—Jason Riedy                             2/25
Exascale data analysis



          Health care: Finding outbreaks, population epidemiology
          Social networks: Advertising, searching, grouping
          Intelligence: Decisions at scale, regulating algorithms
          Systems biology: Understanding interactions, drug design
          Power grid: Disruptions, conservation
          Simulation: Discrete events, cracking meshes

          • Graph clustering is common in all application areas.




These are not easy graphs.
          [Image: Yifan Hu’s (AT&T) visualization of the LiveJournal data set]




But no shortage of structure...




          [Image: Protein interactions, Giot et al., “A Protein Interaction Map of
          Drosophila melanogaster”, Science 302, 1722-1736, 2003.]
          [Image: Jason’s network via LinkedIn Labs]



          • Locally, there are clusters or communities.
          • First pass over a massive social graph:
               • Find smaller communities of interest.
               • Analyze / visualize top-ranked communities.
          • Our part: First massively parallel community detection method.

Outline


      Motivation

      Defining community detection and metrics

      Our parallel method

      Results
         Implementation and platform details
         Performance

      Conclusions and plans



Community detection

     What do we mean?
         • Partition a graph’s vertices into disjoint communities.
         • A community locally maximizes some metric.
                • Modularity, conductance, ...
         • Trying to capture that vertices are more similar within one
            community than between communities.

         [Image: Jason’s network via LinkedIn Labs]




Community detection


     Assumptions
         • Disjoint partitioning of vertices.
         • There is no one unique answer.
                • Some metrics are NP-complete to optimize [Brandes, et al.].
                • A graph is a lossy representation.
         • Want an adaptable detection method.

         [Image: Jason’s network via LinkedIn Labs]




Common community metric: Modularity

          • Modularity: Deviation of connectivity in the community
             induced by a vertex set S from some expected background
             model of connectivity.
          • We take [Newman]’s basic uniform model.
          • Let m count all edges in graph G, m_S count the edges with
             both endpoints in S, and x_S count the edges with any
             endpoint in S. Modularity Q_S:

                    $Q_S = \left( m_S - \frac{x_S^2}{4m} \right) / m$

          • Total modularity: sum of the modularities of the partition's subsets.
          • A sufficiently positive modularity implies some structure.
          • Known issues: resolution limit, NP-complete optimization problem.
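
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name and input layout are hypothetical, the graph is taken to be unweighted, and x_S is read as the number of edge endpoints inside S (an internal edge counts twice). That reading is an assumption; it is the one under which the slide's formula coincides with Newman's standard modularity.

```python
from collections import defaultdict

def modularity_by_community(edges, community):
    """Return {label: Q_S} for an undirected, unweighted edge list.

    edges: list of (i, j) pairs; community: dict mapping vertex -> label.
    """
    m = len(edges)
    m_s = defaultdict(int)  # edges with both endpoints in S
    x_s = defaultdict(int)  # edge endpoints in S (assumed reading of x_S)
    for i, j in edges:
        ci, cj = community[i], community[j]
        x_s[ci] += 1
        x_s[cj] += 1
        if ci == cj:
            m_s[ci] += 1
    # Q_S = (m_S - x_S^2 / 4m) / m for every community S
    return {c: (m_s[c] - x_s[c] ** 2 / (4 * m)) / m for c in x_s}

# Two triangles joined by a single edge, one community per triangle.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
total = sum(modularity_by_community(edges, community).values())
```

Total modularity is the sum of the per-community values, as on the slide.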



Sequential agglomerative method


          [Figure: example graph with vertices A through G]

          • A common method (e.g. [Clauset, et al.]) agglomerates
             vertices into communities.
          • Each vertex begins in its own community.
          • An edge is chosen to contract.
               • Merging maximally increases modularity.
               • Priority queue.
          • Known often to fall into an O(n^2) performance trap with
             modularity [Wakita & Tsurumi].
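
The "merging maximally increases modularity" step needs a score per edge. Starting from Q_S = (m_S - x_S^2/4m)/m, merging communities a and b joined by w_ab edges changes total modularity by Q_{a∪b} - Q_a - Q_b = (w_ab - x_a x_b / 2m) / m. The sketch below computes that gain. The function name is hypothetical, and it assumes x_S is the degree sum of S, so that x_{a∪b} = x_a + x_b.

```python
def merge_gain(w_ab, x_a, x_b, m):
    """Change in total modularity from merging communities a and b.

    w_ab: number of edges between a and b
    x_a, x_b: degree sums of a and b (assumed reading of x_S)
    m: number of edges in the whole graph
    """
    return (w_ab - x_a * x_b / (2 * m)) / m

# Two triangles joined by one edge: each triangle has x = 7 and m = 7,
# so merging the two triangles lowers modularity.
gain = merge_gain(w_ab=1, x_a=7, x_b=7, m=7)
```

A sequential method would keep these gains in a priority queue and contract the best edge first.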



New parallel method

          [Figure: example graph with vertices A through G]

          • We use a matching to avoid the queue.
          • Compute a heavy weight, large matching.
               • Greedy algorithm.
               • Maximal matching.
               • Within factor of 2 in weight.
          • Merge all communities at once.
          • Maintains some balance.
          • Produces different results.
          • Agnostic to weighting, matching...
               • Can maximize modularity, minimize conductance.
               • Modifying matching permits easy exploration.
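
As a rough illustration of the matching step, here is a sequential greedy matching in Python: visit edges from heaviest to lightest and keep an edge when both endpoints are still free. This is only a sketch of the general technique. The talk's version runs in parallel over the edge array, and the function name and the (score, i, j) tuple layout are assumptions made for this example.

```python
def greedy_matching(scored_edges):
    """Return a maximal matching as a list of (i, j) pairs.

    scored_edges: list of (score, i, j) tuples with integer vertex ids.
    Heaviest-first greedy selection yields a matching whose total score
    is at least half that of a maximum-weight matching.
    """
    matched = set()
    matching = []
    for score, i, j in sorted(scored_edges, reverse=True):
        if i != j and i not in matched and j not in matched:
            matched.update((i, j))
            matching.append((i, j))
    return matching

# Path 0-1-2-3 with a heavy middle edge: greedy takes (1, 2) for a score
# of 3, while the best matching {(0, 1), (2, 3)} scores 4.
pairs = greedy_matching([(2.0, 0, 1), (3.0, 1, 2), (2.0, 2, 3)])
```

Every matched pair can then be merged in the same pass, which is what removes the priority queue.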


Implementation: Cray XMT
      Tolerates latency by massive multithreading.
          • Hardware: 128 threads per processor
               • Every-cycle context switch
               • Many outstanding memory requests (180/proc)
          • Flexibly supports dynamic load balancing
               • Globally hashed address space, no data cache
          • Support for fine-grained, word-level synchronization
               • Full/empty bit with every memory word


            • 128 processor XMT at
               Pacific Northwest Nat’l Lab
            • 500 MHz processors, 16384
               threads, 1 TB of shared
               memory
                                                  Image: cray.com


Implementation: Data structures

      Extremely basic for graph G = (V, E)
          • An array of (i, j; w) weighted edge pairs, i < j, |E|
          • An array to store self-edges, d(i) = w, |V |
          • A temporary floating-point array for scores, |E|
          • Two additional temporary arrays (|V |, |E|) to store degrees,
             matching choices, linked list for compression, ...

          • Weights count number of agglomerated vertices or edges.
          • Scoring methods need only vertex-local counts.
          • Greedy matching parallelizes trivially over the edge array.
          • Contraction compacts the edge list in place through a
             temporary, hashed linked list of duplicates.
                 • Full-empty bits or compare&swap keep correctness.
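
A sequential sketch of the contraction step on these structures may help. It uses a Python dict where the slide uses a temporary, hashed linked list of duplicates, and it builds a new edge list rather than compacting in place. Both are simplifications, and the function and variable names are hypothetical.

```python
def contract(edges, self_weight, matching):
    """Merge every matched pair and combine duplicate edges.

    edges: list of (i, j, w) with i < j
    self_weight: dict vertex -> weight of edges inside that community
    matching: list of (i, j) pairs, each vertex in at most one pair
    """
    # Map the second vertex of every matched pair onto the first.
    merged_into = {j: i for i, j in matching}

    def label(v):
        return merged_into.get(v, v)

    new_self = {}
    for v, w in self_weight.items():
        new_self[label(v)] = new_self.get(label(v), 0) + w

    combined = {}
    for i, j, w in edges:
        a, b = label(i), label(j)
        if a == b:
            # The edge now lies inside one community: fold it into d(a).
            new_self[a] = new_self.get(a, 0) + w
        else:
            key = (min(a, b), max(a, b))
            combined[key] = combined.get(key, 0) + w

    new_edges = [(i, j, w) for (i, j), w in sorted(combined.items())]
    return new_edges, new_self

edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 1)]
new_edges, new_self = contract(edges, {0: 0, 1: 0, 2: 0, 3: 0}, [(0, 1), (2, 3)])
```

After one round the four vertices collapse to two communities joined by a single edge of weight 2.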


Data and experiment


          • Generated R-MAT graphs [Chakrabarti, et al.] (a = 0.55,
             b = c = 0.1, and d = 0.25) and extracted the largest
             component.
               Scale   Edge factor      |V|            |E|       Avg. degree   Edge group
                 18         8         236 605      2 009 752         8.5           2M
                 18        16         252 427      3 936 239        15.6           4M
                 18        32         259 372      7 605 572        29.3           8M
                 19         8         467 993      3 480 977         7.4           4M
                 19        16         502 152      7 369 885        14.7           8M
                 19        32         517 452     14 853 837        28.7          16M
          • Three runs with each parameter setting.
          • Tested both modularity (CNM for Clauset-Newman-Moore)
             and McCloskey-Bader’s filtered modularity (MB).
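
For readers unfamiliar with R-MAT, the generator can be sketched as follows: each edge picks one quadrant of the adjacency matrix per bit of the vertex id, with probabilities a, b, c, d. This is a minimal sketch under assumptions. The function name and the seed parameter are hypothetical, edge factor is taken to mean edges per vertex (edge factor × 2^scale edges in total), and the largest-component extraction used in the experiments is not shown.

```python
import random

def rmat_edges(scale, edge_factor, a=0.55, b=0.10, c=0.10, d=0.25, seed=0):
    """Generate edge_factor * 2**scale R-MAT edges as (i, j) pairs."""
    if abs(a + b + c + d - 1.0) > 1e-9:
        raise ValueError("quadrant probabilities must sum to 1")
    rng = random.Random(seed)
    n = 1 << scale
    edges = []
    for _ in range(edge_factor * n):
        i = j = 0
        for _ in range(scale):
            r = rng.random()
            i <<= 1
            j <<= 1
            if r < a:
                pass            # top-left quadrant: both bits stay 0
            elif r < a + b:
                j |= 1          # top-right quadrant
            elif r < a + b + c:
                i |= 1          # bottom-left quadrant
            else:
                i |= 1          # bottom-right quadrant
                j |= 1
        edges.append((i, j))
    return edges

sample = rmat_edges(scale=4, edge_factor=2, seed=1)
```

Duplicate edges and self-loops can occur in this output; how the experiments handled them is not stated on the slide.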



Performance: Time

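          [Figure: execution time versus number of processors, one panel per
          scoring method (CNM, MB). Axes: Time (s), logarithmic from 2^2 to
          2^10; Number of processors, 1 to 128. One series per edge group
          (2M, 4M, 8M, 16M).
          Times labeled at the left of each panel: CNM 1321.1s, 464.4s,
          144.6s, 51.4s; MB 1316.6s, 460.8s, 144.7s, 51.9s.
          Times labeled at the right of each panel: CNM 19.8s, 9.8s, 4.3s,
          2.7s; MB 19.7s, 10.0s, 4.5s, 2.7s.]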




Performance: Scaling

          [Figure: speedup versus number of processors, one panel per scoring
          method (CNM, MB). Axes: Speedup, 1 to 128; Number of processors,
          1 to 128. One series per edge group (2M, 4M, 8M, 16M).
          Labeled speedups, CNM: 16M 66.8x at 128P, 8M 48.6x at 96P,
          4M 33.4x at 48P, 2M 19.2x at 32P.
          Labeled speedups, MB: 16M 66.7x at 128P, 8M 48.0x at 96P,
          4M 32.5x at 48P, 2M 19.4x at 48P.]
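
The speedup figures are ratios of single-processor time to parallel time, and dividing by the processor count gives parallel efficiency. The small sketch below shows the arithmetic. It assumes that the 1321.1 s and 19.8 s labels on the CNM time plot are the 16M group's times at 1 and 128 processors, which the extracted plot does not state outright; the function names are hypothetical.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-processor time to parallel time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup(t_serial, t_parallel) / processors

# Assumed pairing of labels from the time plot (CNM, 16M edge group).
s = speedup(1321.1, 19.8)
e = efficiency(1321.1, 19.8, 128)
```

With these rounded inputs the ratio comes out close to the 66.8x labeled on the plot, or a little over half of ideal scaling on 128 processors.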




Performance: Modularity


          [Figure: modularity by scoring method (CNM, MB) and implementation
          (Parallel, SNAP), points labeled by (scale, edge factor).]

             Graph      CNM Parallel   CNM SNAP   MB Parallel   MB SNAP
             (18, 8)       0.240         0.423       0.319        0.023
             (18,16)       0.171         0.425       0.187        0.013
             (18,32)       0.084         0.418       0.104        0.009

          • SNAP: GT’s sequential implementation.
          • Plain modularity (CNM) known for too-large communities
             (resolution limit).
          • MB filters out statistically negligible merges, smaller
             communities & modularity.
          • Our more balanced, parallel algorithm falls between the two?



Performance: Large and real-world


      Large R-MAT: Scale 27, edge factor 16 generated a component with
                 122 million vertices and 1.99 billion edges.
                 7258 seconds ≈ 2 hours on 128P
                 R-MAT known not to have much community structure...
      LiveJournal: 4.8 million vertices, 68 million edges
                 2779 seconds ≈ 47 minutes on 128P
                 Long tail of around 1 000 matched pairs per iteration.
                 Tail is important to modularity and community count,
                 but to users?




Conclusions and plans

          • First massively parallel algorithm based on matching for
             agglomerative community detection.
                 • Non-deterministic algorithm, still studying impact.
          • Now can easily experiment with agglomerative community
             detection on real-world graphs.
                 • How volatile are modularity and conductance to perturbations?
                 • What matching schemes work well?
                 • How do different metrics compare in applications?
          • Experimenting on OpenMP platforms.
              • CSR format is more cache-friendly, but uses more memory.
          • Extending to streaming graph data!
          • Code available on request, to be integrated into GT packages
             like GraphCT.


And if you can do better...
                                        Then please do!
      10th DIMACS Implementation Challenge — Graph Partitioning
      and Graph Clustering
          • 12-13 Feb, 2012 in Atlanta,
             http://www.cc.gatech.edu/dimacs10/
          • Paper deadline: 21 October, 2011
             http://www.cc.gatech.edu/dimacs10/data/call-for-papers.pdf
          • Challenge problem statement and evaluation rules:
             http://www.cc.gatech.edu/dimacs10/data/dimacs10-rules.pdf
          • Co-sponsored by DIMACS, by the Command, Control, and Interoperability
             Center for Advanced Data Analysis (CCICADA); Pacific Northwest
             National Lab.; Sandia National Lab.; and Deutsche
             Forschungsgemeinschaft (DFG).



Acknowledgment of support




Bibliography I

             D. Bader and J. McCloskey.
             Modularity and graph algorithms.
             Presented at UMBC, Sept. 2009.
             J. Berry, B. Hendrickson, R. LaViolette, and C. Phillips.
             Tolerating the community detection resolution limit with edge
             weighting.
             CoRR, abs/0903.1072, 2009.
             U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer,
             Z. Nikoloski, and D. Wagner.
             On modularity clustering.
             IEEE Trans. Knowledge and Data Engineering, 20(2):172–188,
             2008.



Bibliography II

             D. Chakrabarti, Y. Zhan, and C. Faloutsos.
             R-MAT: A recursive model for graph mining.
             In Proc. 4th SIAM Intl. Conf. on Data Mining (SDM), Orlando,
             FL, Apr. 2004. SIAM.
             A. Clauset, M. Newman, and C. Moore.
             Finding community structure in very large networks.
             Physical Review E, 70(6):66111, 2004.
             M. Newman.
             Modularity and community structure in networks.
             Proc. of the National Academy of Sciences, 103(23):8577–8582,
             2006.
             K. Wakita and T. Tsurumi.
             Finding community structure in mega-scale social networks.
             CoRR, abs/cs/0702048, 2007.

PPAM11—Parallel Community Detection—Jason Riedy                              28/25

Weitere ähnliche Inhalte

Mehr von Jason Riedy

Mehr von Jason Riedy (20)

Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoF
 
LAGraph 2021-10-13
LAGraph 2021-10-13LAGraph 2021-10-13
LAGraph 2021-10-13
 
Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoF
 
Graph analysis and novel architectures
Graph analysis and novel architecturesGraph analysis and novel architectures
Graph analysis and novel architectures
 
GraphBLAS and Emus
GraphBLAS and EmusGraphBLAS and Emus
GraphBLAS and Emus
 
Reproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureReproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to Architecture
 
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
 
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture


Parallel Community Detection for Massive Graphs

  • 1. Parallel Community Detection for Massive Graphs E. Jason Riedy, Henning Meyerhenke, David Ediger, and David A. Bader 13 September 2011
  • 2. Main contributions • First massively parallel algorithm for community detection. • Very general algorithm using a weighted matching and applying to many optimization criteria. • Partition a 122 million vertex, 1.99 billion edge graph into communities in 2 hours. PPAM11—Parallel Community Detection—Jason Riedy 2/25
  • 3. Exascale data analysis
      Health care: Finding outbreaks, population epidemiology
      Social networks: Advertising, searching, grouping
      Intelligence: Decisions at scale, regulating algorithms
      Systems biology: Understanding interactions, drug design
      Power grid: Disruptions, conservation
      Simulation: Discrete events, cracking meshes
      • Graph clustering is common in all application areas.
  • 4. These are not easy graphs.
      [Image: Yifan Hu’s (AT&T) visualization of the LiveJournal data set]
  • 5. But no shortage of structure...
      [Image: Protein interactions, Giot et al., “A Protein Interaction Map of Drosophila melanogaster”, Science 302, 1722-1736, 2003.]
      [Image: Jason’s network via LinkedIn Labs]
      • Locally, there are clusters or communities.
      • First pass over a massive social graph:
          • Find smaller communities of interest.
          • Analyze / visualize top-ranked communities.
      • Our part: First massively parallel community detection method.
  • 6. Outline
      • Motivation
      • Defining community detection and metrics
      • Our parallel method
      • Results
      • Implementation and platform details
      • Performance
      • Conclusions and plans
  • 7. Community detection: What do we mean?
      • Partition a graph’s vertices into disjoint communities.
      • A community locally maximizes some metric.
          • Modularity, conductance, ...
      • Trying to capture that vertices are more similar within one community than between communities.
      [Image: Jason’s network via LinkedIn Labs]
  • 8. Community detection: Assumptions
      • Disjoint partitioning of vertices.
      • There is no one unique answer.
          • Some metrics are NP-complete to optimize [Brandes, et al.].
          • Graph is lossy representation.
      • Want an adaptable detection method.
      [Image: Jason’s network via LinkedIn Labs]
  • 9. Common community metric: Modularity
      • Modularity: Deviation of connectivity in the community induced by a vertex set $S$ from some expected background model of connectivity.
      • We take [Newman]’s basic uniform model.
      • Let $m$ count all edges in graph $G$, $m_S$ count the edges with both endpoints in $S$, and $x_S$ count the edges with any endpoint in $S$. Modularity $Q_S$:
          $Q_S = \left(m_S - x_S^2 / (4m)\right) / m$
      • Total modularity: sum of the modularities of the partition’s subsets.
      • A sufficiently positive modularity implies some structure.
      • Known issues: Resolution limit, NP-complete optimization problem.
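The formula on slide 9 can be checked on a toy graph. The following is a minimal sketch, not the authors' code: the function name `modularity` and its inputs are hypothetical, and it assumes that $x_S$ is the sum of the degrees of the vertices in $S$ (so an internal edge contributes twice), which is the reading under which $Q_S$ matches Newman's usual definition.

```python
from collections import defaultdict

def modularity(edges, community):
    """Total modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, one per undirected edge.
    community: dict mapping each vertex to a community label.
    """
    m = len(edges)
    m_S = defaultdict(int)  # edges with both endpoints in S
    x_S = defaultdict(int)  # assumption: degree sum of the vertices in S
    for u, v in edges:
        cu, cv = community[u], community[v]
        x_S[cu] += 1
        x_S[cv] += 1
        if cu == cv:
            m_S[cu] += 1
    # Q_S = (m_S - x_S^2 / (4m)) / m, summed over all communities.
    return sum((m_S[s] - x_S[s] ** 2 / (4.0 * m)) / m for s in x_S)

# Two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, split))  # 5/14, about 0.357
```

Placing every vertex in one community gives a total of 0 under the same formula, which is why only a sufficiently positive value suggests structure.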
  • 10. Sequential agglomerative method
      • A common method (e.g. [Clauset, et al.]) agglomerates vertices into communities.
      • Each vertex begins in its own community.
      • An edge is chosen to contract.
          • Merging maximally increases modularity.
          • Priority queue.
      • Known often to fall into an O(n^2) performance trap with modularity [Wakita & Tsurumi].
      [Figure: example graph on vertices A to G]
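The sequential scheme can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation of [Clauset, et al.]: the function name `agglomerate` is hypothetical, the graph is assumed simple (no self-loops), and the priority queue is replaced by a plain scan over all community pairs to keep the code short. The gain of merging communities a and b joined by w edges is w/m - x_a x_b / (2 m^2), which follows from the modularity formula on slide 9 when x is the degree sum.

```python
def agglomerate(edges):
    """Greedy sequential agglomeration; returns a list of vertex sets."""
    m = len(edges)
    vol = {}       # degree sum per community
    members = {}   # vertices per community, keyed by a representative
    between = {}   # edge count between pairs of communities
    for u, v in edges:
        for x in (u, v):
            vol[x] = vol.get(x, 0) + 1
            members.setdefault(x, {x})
        key = (min(u, v), max(u, v))
        between[key] = between.get(key, 0) + 1

    while True:
        # Stand-in for the priority queue: scan for the best positive gain.
        best, best_gain = None, 0.0
        for (a, b), w in between.items():
            gain = w / m - vol[a] * vol[b] / (2.0 * m * m)
            if gain > best_gain:
                best, best_gain = (a, b), gain
        if best is None:
            break
        a, b = best
        # Contract the chosen edge: fold community b into community a.
        members[a] |= members.pop(b)
        vol[a] += vol.pop(b)
        del between[(a, b)]
        for (c, d), w in list(between.items()):
            if b in (c, d):
                other = d if c == b else c
                del between[(c, d)]
                key = (min(a, other), max(a, other))
                between[key] = between.get(key, 0) + w
    return list(members.values())

edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
communities = agglomerate(edges)
```

Only one edge is contracted per iteration, which is the serial bottleneck that motivates the matching-based method on the following slides.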
  • 14. New parallel method
      • We use a matching to avoid the queue.
      • Compute a heavy-weight, large matching.
          • Greedy algorithm.
          • Maximal matching.
          • Within factor of 2 in weight.
      • Merge all communities at once.
      • Maintains some balance.
      • Produces different results.
      • Agnostic to weighting, matching...
          • Can maximize modularity, minimize conductance.
          • Modifying matching permits easy exploration.
      [Figure: example graph on vertices A to G]
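As an illustration of the matching step, here is the textbook sequential greedy matching, which is maximal and has at least half the weight of an optimal matching. It is a sketch only: the slides do not spell out the parallel variant, the name `greedy_matching` is hypothetical, and the scores are assumed to be precomputed per edge (for example, the modularity gain of merging the two endpoints).

```python
def greedy_matching(scored_edges):
    """scored_edges: iterable of (score, i, j). Returns matched (i, j) pairs."""
    matched = set()
    matching = []
    # Visit edges from heaviest to lightest; keep an edge only if both
    # endpoints are still unmatched.
    for score, i, j in sorted(scored_edges, reverse=True):
        if score <= 0:
            break  # merges that do not improve the metric are skipped
        if i != j and i not in matched and j not in matched:
            matched.update((i, j))
            matching.append((i, j))
    return matching

# Path 0-1-2-3: greedy takes the middle edge (weight 4) while the optimum
# takes both outer edges (weight 6), which is within the factor of 2.
print(greedy_matching([(3.0, 0, 1), (4.0, 1, 2), (3.0, 2, 3)]))  # [(1, 2)]
```

Every matched pair can then be merged in the same pass, since no vertex appears in two matched edges.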
  • 17. Implementation: Cray XMT
      Tolerates latency by massive multithreading.
      • Hardware: 128 threads per processor
          • Every-cycle context switch
          • Many outstanding memory requests (180/proc)
      • Flexibly supports dynamic load balancing
          • Globally hashed address space, no data cache
      • Support for fine-grained, word-level synchronization
          • Full/empty bit on every memory word
      • 128 processor XMT at Pacific Northwest Nat’l Lab
          • 500 MHz processors, 16384 threads, 1 TB of shared memory
      [Image: cray.com]
  • 18. Implementation: Data structures
      Extremely basic for graph G = (V, E)
      • An array of (i, j; w) weighted edge pairs, i < j, length |E|
      • An array to store self-edges, d(i) = w, length |V|
      • A temporary floating-point array for scores, length |E|
      • Two additional temporary arrays (|V|, |E|) to store degrees, matching choices, linked list for compression, ...
      • Weights count number of agglomerated vertices or edges.
      • Scoring methods need only vertex-local counts.
      • Greedy matching parallelizes trivially over the edge array.
      • Contraction compacts the edge list in place through a temporary, hashed linked list of duplicates.
      • Full-empty bits or compare&swap keep correctness.
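The contraction step over these arrays might look like the following sequential sketch. It is not the XMT implementation: the name `contract` is hypothetical, a Python dict stands in for the temporary hashed linked list of duplicates, and the edge list is rebuilt out of place rather than compacted in place.

```python
def contract(edges, self_w, matching):
    """Merge matched pairs and rebuild the edge and self-edge arrays.

    edges: list of (i, j, w) with i < j.
    self_w: list of self-edge weights d(i), one per vertex.
    matching: list of disjoint (i, j) pairs to merge.
    Returns (new_edges, new_self_w, label), where label[v] is v's new id.
    """
    n = len(self_w)
    rep = list(range(n))
    for i, j in matching:
        rep[j] = i  # j is folded into i
    # Renumber the surviving representatives 0, 1, 2, ...
    new_id = {}
    for v in range(n):
        new_id.setdefault(rep[v], len(new_id))
    label = [new_id[rep[v]] for v in range(n)]

    new_self = [0] * len(new_id)
    for v in range(n):
        new_self[label[v]] += self_w[v]

    merged = {}  # stand-in for the hashed list of duplicate edges
    for i, j, w in edges:
        a, b = label[i], label[j]
        if a == b:
            new_self[a] += w  # edge now lies inside one community
        else:
            key = (min(a, b), max(a, b))
            merged[key] = merged.get(key, 0) + w
    new_edges = [(a, b, w) for (a, b), w in sorted(merged.items())]
    return new_edges, new_self, label

# Triangle 0-1-2 plus edge 2-3; merge (0, 1) and (2, 3).
edges = [(0, 1, 1), (0, 2, 1), (1, 2, 1), (2, 3, 1)]
result = contract(edges, [0, 0, 0, 0], [(0, 1), (2, 3)])
```

After contraction the self-edge weights hold the number of edges inside each community, and the edge weights hold the number of edges between communities, so the next round of scoring needs only these local counts.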
  • 19. Data and experiment
      • Generated R-MAT graphs [Chakrabarti, et al.] (a = 0.55, b = c = 0.1, and d = 0.25) and extracted the largest component.

          Scale  Fact.      |V|          |E|  Avg. degree  Edge group
          18       8    236 605    2 009 752      8.5      2M
          18      16    252 427    3 936 239     15.6      4M
          18      32    259 372    7 605 572     29.3      8M
          19       8    467 993    3 480 977      7.4      4M
          19      16    502 152    7 369 885     14.7      8M
          19      32    517 452   14 853 837     28.7      16M

      • Three runs with each parameter setting.
      • Tested both modularity (CNM for Clauset-Newman-Moore) and McCloskey-Bader’s filtered modularity (MB).
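For readers unfamiliar with R-MAT, a minimal generator with the parameters from slide 19 could look like this. It is a sketch under assumptions: the function name `rmat_edges` and the seed handling are hypothetical, no per-level noise is applied, duplicates and self-loops are kept, and the largest-component extraction used in the experiments is not shown. The assignment of b and c to the two off-diagonal quadrants is a convention; with b = c it makes no difference.

```python
import random

def rmat_edges(scale, edge_factor, a=0.55, b=0.10, c=0.10, d=0.25, seed=0):
    """Generate edge_factor * 2**scale R-MAT edges on 2**scale vertices."""
    rng = random.Random(seed)
    n = 1 << scale
    edges = []
    for _ in range(edge_factor * n):
        i = j = 0
        # Pick one quadrant of the adjacency matrix per bit of the vertex id.
        for _ in range(scale):
            r = rng.random()
            i <<= 1
            j <<= 1
            if r < a:
                pass            # upper-left quadrant
            elif r < a + b:
                j |= 1          # upper-right quadrant
            elif r < a + b + c:
                i |= 1          # lower-left quadrant
            else:
                i |= 1          # lower-right quadrant (probability d)
                j |= 1
        edges.append((i, j))
    return edges

sample = rmat_edges(scale=8, edge_factor=8)
```

The vertex counts in the table fall below 2^18 and 2^19 because only the largest connected component is kept.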
  • 20. Performance: Time
      [Figure: time in seconds (log scale) versus number of processors (1 to 128), one panel each for CNM and MB, one series per edge group (2M, 4M, 8M, 16M). Labelled times run from 1321.1 s (CNM) and 1316.6 s (MB) down to 2.7 s.]
  • 21. Performance: Scaling
      Best speedup per edge group:

          Edge group  CNM              MB
          16M         66.8x at 128P    66.7x at 128P
          8M          48.6x at 96P     48.0x at 96P
          4M          33.4x at 48P     32.5x at 48P
          2M          19.2x at 32P     19.4x at 48P

      [Figure: speedup versus number of processors (1 to 128), one panel each for CNM and MB, one series per edge group.]
  • 22. Performance: Modularity
      • SNAP: GT’s sequential implementation.
      • Plain modularity (CNM) known for too-large communities (resolution limit).
      • MB filters out statistically negligible merges, smaller communities & modularity.
      • Our more balanced, parallel algorithm falls between the two?
      [Figure: modularity by scoring method (CNM, MB) and implementation (Parallel, SNAP) for the scale-18 graphs with edge factors 8, 16, and 32. Labelled modularities range from 0.009 to 0.425.]
  • 23. Performance: Large and real-world
      Large R-MAT
      • Scale 27, edge factor 16 generated a component with 122 million vertices and 1.99 billion edges.
      • 7258 seconds ≈ 2 hours on 128P
      • R-MAT known not to have much community structure...
      LiveJournal
      • 4.8 million vertices, 68 million edges
      • 2779 seconds ≈ 47 minutes on 128P
      • Long tail of around 1 000 matched pairs per iteration.
      • Tail is important to modularity and community count, but to users?
  • 24. Conclusions and plans
      • First massively parallel algorithm based on matching for agglomerative community detection.
      • Non-deterministic algorithm, still studying impact.
      • Now can easily experiment with agglomerative community detection on real-world graphs.
          • How volatile are modularity and conductance to perturbations?
          • What matching schemes work well?
          • How do different metrics compare in applications?
      • Experimenting on OpenMP platforms.
          • CSR format is more cache-friendly, but uses more memory.
      • Extending to streaming graph data!
      • Code available on request, to be integrated into GT packages like GraphCT.
  • 25. And if you can do better... Then please do!
      10th DIMACS Implementation Challenge: Graph Partitioning and Graph Clustering
      • 12-13 Feb, 2012 in Atlanta, http://www.cc.gatech.edu/dimacs10/
      • Paper deadline: 21 October, 2011 http://www.cc.gatech.edu/dimacs10/data/call-for-papers.pdf
      • Challenge problem statement and evaluation rules: http://www.cc.gatech.edu/dimacs10/data/dimacs10-rules.pdf
      • Co-sponsored by DIMACS, by the Command, Control, and Interoperability Center for Advanced Data Analysis (CCICADA); Pacific Northwest National Lab.; Sandia National Lab.; and Deutsche Forschungsgemeinschaft (DFG).
  • 26. Acknowledgment of support
  • 27. Bibliography I
      D. Bader and J. McCloskey. Modularity and graph algorithms. Presented at UMBC, Sept. 2009.
      J. Berry, B. Hendrickson, R. LaViolette, and C. Phillips. Tolerating the community detection resolution limit with edge weighting. CoRR, abs/0903.1072, 2009.
      U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer, Z. Nikoloski, and D. Wagner. On modularity clustering. IEEE Trans. Knowledge and Data Engineering, 20(2):172–188, 2008.
  • 28. Bibliography II
      D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-MAT: A recursive model for graph mining. In Proc. 4th SIAM Intl. Conf. on Data Mining (SDM), Orlando, FL, Apr. 2004. SIAM.
      A. Clauset, M. Newman, and C. Moore. Finding community structure in very large networks. Physical Review E, 70(6):66111, 2004.
      M. Newman. Modularity and community structure in networks. Proc. of the National Academy of Sciences, 103(23):8577–8582, 2006.
      K. Wakita and T. Tsurumi. Finding community structure in mega-scale social networks. CoRR, abs/cs/0702048, 2007.