This paper presents two algorithms for efficiently computing PageRank on dynamically updating graphs in a batched manner: DynamicLevelwisePR and DynamicMonolithicPR. DynamicLevelwisePR processes vertices level-by-level based on strongly connected components and avoids recomputing converged vertices on the CPU. DynamicMonolithicPR uses a full power iteration approach on the GPU that partitions vertices by in-degree and skips unaffected vertices. Evaluation on real-world graphs shows the batched algorithms provide speedups of up to 4000x over single-edge updates and outperform other state-of-the-art dynamic PageRank algorithms.
Dynamic Batch Parallel Algorithms for Updating PageRank : POSTER
Subhajit Sahu†, Kishore Kothapalli† and Dip Sankar Banerjee‡
†International Institute of Information Technology Hyderabad, India.
‡Indian Institute of Technology Jodhpur, India.
subhajit.sahu@research.,kkishore@iiit.ac.in, dipsankarb@iitj.ac.in
Acknowledgement
This work is partially supported by a grant from the Department of Science and Technology (DST), India, under the National
Supercomputing Mission (NSM) R&D in Exascale initiative vide Ref. No: DST/NSM/R&D Exascale/2021/16.
References
[1] P. Garg and K. Kothapalli, “STIC-D: Algorithmic Techniques For Efficient Parallel Pagerank Computation on Real-World Graphs,” in
Proceedings of the 17th International Conference on Distributed Computing and Networking - ICDCN ’16. ACM Press, Jan. 2016, pp. 1–10.
[2] H. K. Giri, M. Haque, and D. S. Banerjee, “HyPR: Hybrid Page Ranking on Evolving Graphs,” in Proc. IEEE 27th International Conference on
High Performance Computing, Data, and Analytics (HiPC), 2020, pp. 62–71.
Results
Batched vs Cumulative update
- CPU: 4066×, 2998× speedup for a batch of 5000
edges wrt cumulative single-edge updates.
- GPU: 1712×, 2324× speedup for a batch of 5000
edges wrt cumulative single-edge updates.
Comparison with state-of-the-art
- CPU: 6.1×, 8.6× wrt static plain STIC-D PR [1].
- GPU: 9.8×, 9.3× wrt naive dynamic nvGraph PR.
- CPU: 4.2×, 5.8× wrt Pure CPU HyPR [2].
- GPU: 1.9×, 1.8× wrt Pure GPU HyPR.
Figure 2: Comparison with pure-CPU HyPR and plain STIC-D PR on
the CPU; speedup of DynamicLevelwisePR on the respective bars
(top). Comparison with pure-GPU HyPR and naive dynamic
nvGraph PR on the GPU; speedup of DynamicMonolithicPR on the
respective bars (bottom). Averaged over batch sizes of 500, 1000,
2000, 5000, and 10000.
Figure 3: Speedup of batched DynamicLevelwisePR with respect
to cumulative single-edge updates (same approach) on the CPU
is shown on the top. Speedup of batched DynamicMonolithicPR
with respect to cumulative single-edge updates with the same
approach on the GPU is shown on the bottom. Batch sizes of 500,
1000, 5000, and 10000 are shown.
Dataset
- From the SuiteSparse Matrix Collection.
- Add self-loops to dead ends in all graphs.
- Number of vertices varies from 75k to 41M.
- Number of edges varies from 524k to 1.1B.
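The dead-end preprocessing step listed above can be illustrated with a minimal Python sketch. It assumes the graph is held as a list of out-neighbor lists (the poster itself uses CSR), and the function name is hypothetical.

```python
def add_self_loops_to_dead_ends(out_neighbors):
    """Return a copy of the adjacency lists in which every dead end
    (a vertex with no out-edges) gets a single self-loop."""
    return [list(nbrs) if nbrs else [v] for v, nbrs in enumerate(out_neighbors)]


# Hypothetical 3-vertex graph 0 -> 1 -> 2, where vertex 2 is a dead end.
graph = [[1], [2], []]
fixed = add_self_loops_to_dead_ends(graph)
```

After this step every vertex has an out-degree of at least 1, so no rank is lost through dead ends during power-iteration.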
Batch generation
- Batch sizes vary from 500 to 10,000 edges.
- Edge insertions, deletions in equal mix.
- High degree vertices have higher chance
of selection (mimic real-world graphs).
- No new vertices are added or removed.
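One possible way to generate such a batch is sketched below in Python. The poster does not state the exact sampling scheme, so the weighting (out-degree plus one), the function name, and the attempt limit are all assumptions made for illustration.

```python
import random


def generate_batch(out_neighbors, batch_size, seed=0):
    """Sample batch_size // 2 edge insertions and as many deletions.
    Endpoints are drawn with probability proportional to out-degree + 1
    (an assumed weighting), so high-degree vertices are picked more often."""
    rng = random.Random(seed)
    n = len(out_neighbors)
    vertices = list(range(n))
    weights = [len(nbrs) + 1 for nbrs in out_neighbors]
    existing = {(u, v) for u, nbrs in enumerate(out_neighbors) for v in nbrs}
    half = batch_size // 2
    insertions, deletions = set(), set()
    attempts = 0
    # The attempt limit guards against graphs too dense or too sparse to sample from.
    while (len(insertions) < half or len(deletions) < half) and attempts < 100 * batch_size:
        attempts += 1
        u = rng.choices(vertices, weights=weights)[0]
        if len(insertions) < half:
            v = rng.choices(vertices, weights=weights)[0]
            if (u, v) not in existing:
                insertions.add((u, v))  # only edges that do not exist yet
        if len(deletions) < half and out_neighbors[u]:
            deletions.add((u, rng.choice(out_neighbors[u])))  # only existing edges
    return sorted(insertions), sorted(deletions)


# Hypothetical 6-vertex ring 0 -> 1 -> ... -> 5 -> 0.
ring = [[(v + 1) % 6] for v in range(6)]
insertions, deletions = generate_batch(ring, batch_size=4)
```

The vertex set is never modified, which matches the constraint that no vertices are added or removed.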
Performance measurement
- 32-bit integers for CSR representation.
- 32-bit floats for rank vector.
- L∞-norm for error measurement,
(L2-norm for nvGraph PageRank).
- Measured time covers only rank computation.
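The two error measures named above differ only in how per-vertex differences are aggregated. A minimal Python sketch, with hypothetical function names and example vectors:

```python
import math


def linf_error(a, b):
    """L-infinity norm of the difference: the largest per-vertex change."""
    return max(abs(x - y) for x, y in zip(a, b))


def l2_error(a, b):
    """L2 norm of the difference: square root of the summed squared changes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical rank vectors from two successive iterations.
previous = [0.25, 0.25, 0.50]
current = [0.20, 0.30, 0.50]
```

The L∞-norm bounds the change of every single vertex, while the L2-norm aggregates changes over all vertices, so the same tolerance value means different things under the two measures.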
Platform
- Intel(R) Xeon(R) Silver 4116 CPU (12 cores) x 2
Cache L1: 768KB, L2: 12MB, L3: 16MB (shared).
- NVIDIA Tesla V100 GPU (16GB PCIe),
14 TFLOPs SP (84 SMs x 64 FP/INT cores).
- CentOS 7.9, OpenMP 5.0, CUDA 11.3, GCC 9.3.
Our Approaches
DynamicLevelwisePR
- In contrast to full power-iteration.
- Process vertices in levels of SCCs.
- Avoid converged/unstable vertices.
- No per-iteration sharing of ranks.
- Faster on CPU with OpenMP.
- Slightly higher error.
- Requires graph to be dead-end free.
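The level structure this approach relies on can be sketched in Python as follows. This is only an illustration of grouping vertices into levels of SCCs, not the authors' implementation: it uses Kosaraju's algorithm (an assumed choice), adjacency lists instead of CSR, and hypothetical function names.

```python
def strongly_connected_components(out_neighbors):
    """Kosaraju's algorithm (iterative). Returns (comp, count), where comp[v]
    is the component id of v and ids follow a topological order of the SCC DAG."""
    n = len(out_neighbors)
    visited, order = [False] * n, []
    for s in range(n):  # pass 1: record vertices by DFS finish time
        if visited[s]:
            continue
        visited[s] = True
        stack = [(s, iter(out_neighbors[s]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if not visited[w]:
                    visited[w] = True
                    stack.append((w, iter(out_neighbors[w])))
                    break
            else:  # all out-neighbors explored: v is finished
                order.append(v)
                stack.pop()
    in_neighbors = [[] for _ in range(n)]
    for u, nbrs in enumerate(out_neighbors):
        for v in nbrs:
            in_neighbors[v].append(u)
    comp, count = [-1] * n, 0
    for s in reversed(order):  # pass 2: DFS on the transposed graph
        if comp[s] != -1:
            continue
        comp[s] = count
        stack = [s]
        while stack:
            v = stack.pop()
            for u in in_neighbors[v]:
                if comp[u] == -1:
                    comp[u] = count
                    stack.append(u)
        count += 1
    return comp, count


def scc_levels(out_neighbors):
    """Group vertices into levels: an SCC sits one level below the deepest
    SCC that has an edge into it."""
    comp, count = strongly_connected_components(out_neighbors)
    members = [[] for _ in range(count)]
    for v, c in enumerate(comp):
        members[c].append(v)
    level = [0] * count
    for c in range(count):  # ids are topologically ordered, so one pass suffices
        for u in members[c]:
            for v in out_neighbors[u]:
                if comp[v] != c:
                    level[comp[v]] = max(level[comp[v]], level[c] + 1)
    levels = [[] for _ in range(max(level, default=-1) + 1)]
    for v, c in enumerate(comp):
        levels[level[c]].append(v)
    return levels


# Hypothetical graph: SCC {0, 1} -> SCC {2, 3} -> SCC {4} (self-loop on 4).
levels = scc_levels([[1], [0, 2], [3], [2, 4], [4]])
```

Ranks in one level depend only on vertices in the same or earlier levels, so each level can be iterated to convergence before the next one is started.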
DynamicMonolithicPR
- Full power-iteration, process all vertices.
- Group vertices by SCC for better access.
- Partition vertices by in-degree on GPU.
- Use old ranks, skip unaffected vertices.
- Affected vertices found with DFS.
- Faster on GPU with CUDA.
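The idea of reusing old ranks and skipping unaffected vertices can be sketched sequentially in Python. This is not the CUDA implementation and omits SCC grouping and in-degree partitioning. It assumes that the DFS starts from both endpoints of every updated edge in the updated graph, that the damping factor is 0.85, and that the graph has no dead ends; all names are hypothetical.

```python
def affected_vertices(out_neighbors, batch):
    """Mark every vertex reachable (iterative DFS) from an endpoint of an
    updated edge in the updated graph; the starting set is an assumption."""
    affected = [False] * len(out_neighbors)
    stack = [x for edge in batch for x in edge]
    while stack:
        v = stack.pop()
        if affected[v]:
            continue
        affected[v] = True
        stack.extend(w for w in out_neighbors[v] if not affected[w])
    return affected


def power_iteration(out_neighbors, ranks, active, damping=0.85, tol=1e-10, max_iter=500):
    """Pull-based power-iteration that recomputes only vertices flagged active;
    all other vertices keep the rank they were given."""
    n = len(out_neighbors)
    in_neighbors = [[] for _ in range(n)]
    for u, nbrs in enumerate(out_neighbors):
        for v in nbrs:
            in_neighbors[v].append(u)
    ranks = list(ranks)
    for _ in range(max_iter):
        new_ranks = list(ranks)
        for v in range(n):
            if active[v]:
                pulled = sum(ranks[u] / len(out_neighbors[u]) for u in in_neighbors[v])
                new_ranks[v] = (1 - damping) / n + damping * pulled
        error = max(abs(a - b) for a, b in zip(new_ranks, ranks))  # L-infinity norm
        ranks = new_ranks
        if error < tol:
            break
    return ranks


# Hypothetical example: insert edge 1 -> 2 into a 5-vertex graph.
old_graph = [[1], [3], [3], [3], [0]]
new_graph = [[1], [3, 2], [3], [3], [0]]
old_ranks = power_iteration(old_graph, [0.2] * 5, [True] * 5)
affected = affected_vertices(new_graph, [(1, 2)])
dynamic_ranks = power_iteration(new_graph, old_ranks, affected)
static_ranks = power_iteration(new_graph, [0.2] * 5, [True] * 5)
```

In this example vertices 0 and 4 are not reachable from the updated edge, so their old ranks are kept as they are, and the dynamic result agrees with a from-scratch computation on the updated graph up to the tolerance.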
Introduction
Types of Dynamic graph algorithms
- Incremental: handles 1 edge/vertex insertion.
- Decremental: handles 1 edge/vertex deletion.
- Fully dynamic: handles 1 insertion or deletion.
- Batched fully dynamic: handles n insertions
and/or deletions.
Benefits of Dynamic graph algorithms
- Reduces time needed for performing analytics.
- Enables interactivity with dataset.
- Parallel fully dynamic algorithms accept a
batch of updates, which minimizes the computation
needed compared to fully dynamic ones.
PageRank computation approaches
- Matrix multiplication.
- Power-iteration (push vs pull).
- Random walk (approximate).
Challenges & Limitations
- Graphs are massive and constantly updated.
- Existing dynamic algorithms do not utilize
reducibility of graphs.
- Vertices which are dependent upon other
vertices to converge are still processed.
- Locality benefits of SCCs are not explored.
PageRank has applications in:
- Ranking of websites.
- Measuring scientific impact of researchers.
- Finding the best teams and athletes.
- Ranking companies by talent concentration.
- Predicting road/foot traffic in urban spaces.
- Analysing protein networks.
- Finding the most authoritative news sources.
- Identifying parts of brain that change jointly.
- Toxic waste management.
- PageRank is a link-analysis algorithm.
- Developed by Larry Page and Sergey Brin in 1996.
- For ordering information on the web.
- Represented with a random-surfer model.
- Rank of a page is defined recursively.
- Calculate iteratively with power-iteration.
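The recursive definition and its iterative solution can be illustrated with a minimal pull-based power-iteration in Python. The damping factor of 0.85, the tolerance, and the function and parameter names are assumptions for this sketch. It also assumes every vertex has at least one out-edge.

```python
def pagerank(in_neighbors, out_degree, damping=0.85, tol=1e-10, max_iter=500):
    """Pull-based power-iteration: each vertex gathers rank from its
    in-neighbors. Assumes every vertex has out-degree >= 1."""
    n = len(in_neighbors)
    ranks = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(max_iter):
        new_ranks = [
            (1.0 - damping) / n
            + damping * sum(ranks[u] / out_degree[u] for u in in_neighbors[v])
            for v in range(n)
        ]
        error = max(abs(a - b) for a, b in zip(new_ranks, ranks))  # L-infinity norm
        ranks = new_ranks
        if error < tol:
            break
    return ranks


# Hypothetical 3-vertex cycle 0 -> 1 -> 2 -> 0.
in_neighbors = [[2], [0], [1]]
out_degree = [1, 1, 1]
ranks = pagerank(in_neighbors, out_degree)
```

On a symmetric cycle like this one every vertex ends up with the same rank, and the ranks sum to 1.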
Fighting Fake news
- Click-Gap: When is Facebook driving
disproportionate amounts of traffic to
websites?
- Effort to rid Facebook’s services of
fake news.
- Is a website relying on Facebook to drive
significant traffic, but not well ranked by the
rest of the web?
Debugging complex software systems
- MonitorRank: a version of PageRank designed
to analyze complex, engineered systems.
- Returns a ranked list of systems based on the
likelihood that they contributed to, or
participated in, an anomalous situation.
Finding the most original writers
- BookRank: using a network of 19th century
authors to find quantitative evidence that
Jane Austen and Walter Scott were the most
original authors of the 19th century.
Finding topical authorities
- TwitterRank: using the teleportation vector
and topic-specific transition probabilities to
localize the PageRank vector.