Dynamic Batch Parallel Algorithms for Updating PageRank
Subhajit Sahu†, Kishore Kothapalli† and Dip Sankar Banerjee‡
†International Institute of Information Technology Hyderabad, India.
‡Indian Institute of Technology Jodhpur, India.
subhajit.sahu@research.,kkishore@iiit.ac.in, dipsankarb@iitj.ac.in
Acknowledgement
This work is partially supported by a grant from the Department of Science and Technology (DST), India, under the National
Supercomputing Mission (NSM) R&D in Exascale initiative vide Ref. No: DST/NSM/R&D Exascale/2021/16.
References
[1] P. Garg and K. Kothapalli, “STIC-D: Algorithmic Techniques For Efficient Parallel Pagerank Computation on Real-World Graphs,” in
Proceedings of the 17th International Conference on Distributed Computing and Networking - ICDCN ’16. ACM Press, Jan. 2016, pp. 1–10.
[2] H. K. Giri, M. Haque, and D. S. Banerjee, “HyPR: Hybrid Page Ranking on Evolving Graphs,” in Proc. IEEE 27th International Conference on
High Performance Computing, Data, and Analytics (HiPC), 2020, pp. 62–71.
Results
Batched vs Cumulative update
- CPU: 4066×, 2998× speedup for a batch of 5000
edges wrt cumulative single-edge updates.
- GPU: 1712×, 2324× speedup for a batch of 5000
edges wrt cumulative single-edge updates.
Comparison with state-of-the-art
- CPU: 6.1×, 8.6× wrt static plain STIC-D PR [1].
- GPU: 9.8×, 9.3× wrt naive dynamic nvGraph PR.
- CPU: 4.2×, 5.8× wrt Pure CPU HyPR [2].
- GPU: 1.9×, 1.8× wrt Pure GPU HyPR.
Figure 2: Comparison with pure-CPU HyPR and plain STIC-D PR on
the CPU; speedup of DynamicLevelwisePR on the respective bars
(top). Comparison with pure-GPU HyPR and naive dynamic
nvGraph PR on the GPU; speedup of DynamicMonolithicPR on the
respective bars (bottom). Averaged over batch sizes of 500, 1000,
2000, 5000, and 10000.
Figure 3: Speedup of batched DynamicLevelwisePR with respect
to cumulative single-edge updates (same approach) on the CPU
is shown on the top. Speedup of batched DynamicMonolithicPR
with respect to cumulative single-edge updates with the same
approach on the GPU is shown on the bottom. Batch sizes of 500,
1000, 5000, and 10000 are shown.
Dataset
- From the SuiteSparse Matrix Collection.
- Add self-loops to dead ends in all graphs.
- Number of vertices varies from 75k to 41M.
- Number of edges varies from 524k to 1.1B.
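The self-loop preprocessing step can be sketched as follows. This is a minimal illustration, assuming the graph is given as a vertex count plus a list of directed `(source, target)` pairs; the function name `add_self_loops_to_dead_ends` is hypothetical and not taken from the poster.

```python
def add_self_loops_to_dead_ends(num_vertices, edges):
    """Return a new edge list where every vertex without an out-edge gets a self-loop."""
    has_out_edge = [False] * num_vertices
    for source, _ in edges:
        has_out_edge[source] = True
    patched = list(edges)
    for vertex in range(num_vertices):
        if not has_out_edge[vertex]:
            patched.append((vertex, vertex))  # dead end: add a self-loop
    return patched


edges = [(0, 1), (1, 2)]  # vertex 2 has no out-edge, so it is a dead end
print(add_self_loops_to_dead_ends(3, edges))  # prints [(0, 1), (1, 2), (2, 2)]
```

After this step every vertex has an out-degree of at least one, so no rank is lost at dead ends during iteration.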
Batch generation
- Batch sizes vary from 500 to 10,000 edges.
- Edge insertions, deletions in equal mix.
- High degree vertices have higher chance
of selection (mimic real-world graphs).
- No new vertices are added or removed.
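One possible way to generate such a batch is sketched below. The poster does not give the exact sampling procedure, so the details here are assumptions: endpoints of inserted edges are drawn with probability proportional to degree plus one, deleted edges are drawn uniformly from the existing edges (which already favours high-degree vertices), and the name `generate_batch` is hypothetical.

```python
import random


def generate_batch(num_vertices, edges, batch_size, seed=0):
    """Return (insertions, deletions), roughly half of the batch each."""
    rng = random.Random(seed)
    edge_set = set(edges)
    # Assumption: weight = degree + 1, so isolated vertices can still be chosen.
    degree = [1] * num_vertices
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1

    # Deletions: existing edges only, so no vertex is ever removed.
    num_deletions = min(batch_size // 2, len(edge_set))
    deletions = rng.sample(sorted(edge_set), num_deletions)

    # Insertions: new edges between existing vertices, biased towards high degree.
    num_insertions = batch_size - num_deletions
    insertions = set()
    vertices = list(range(num_vertices))
    attempts = 0
    while len(insertions) < num_insertions and attempts < 100 * batch_size:
        attempts += 1  # guard against dense graphs with few free vertex pairs
        source, target = rng.choices(vertices, weights=degree, k=2)
        if source != target and (source, target) not in edge_set:
            insertions.add((source, target))
    return sorted(insertions), deletions


graph = [(0, 1), (1, 2), (2, 0), (3, 0), (4, 0), (5, 0)]
insertions, deletions = generate_batch(6, graph, batch_size=4)
print(len(insertions), len(deletions))
```

Note that a deletion may turn a vertex into a dead end; this sketch does not re-add self-loops after applying the batch.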
Performance measurement
- 32-bit integers for CSR representation.
- 32-bit floats for rank vector.
- L∞-norm for error measurement,
(L2-norm for nvGraph PageRank).
- Measured time includes only rank computation.
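The two error norms mentioned above can be written as follows. This is a small sketch, assuming both rank vectors are plain Python lists of equal length (for example, two successive iterations, or a computed vector and a reference vector); the function names are illustrative.

```python
def linf_error(ranks, reference):
    """L-infinity norm: the largest absolute per-vertex difference."""
    return max(abs(a - b) for a, b in zip(ranks, reference))


def l2_error(ranks, reference):
    """L2 norm: square root of the sum of squared per-vertex differences."""
    return sum((a - b) ** 2 for a, b in zip(ranks, reference)) ** 0.5


current = [0.20, 0.30, 0.50]
reference = [0.25, 0.30, 0.45]
print(linf_error(current, reference), l2_error(current, reference))
```

The L∞-norm bounds the error of every single vertex, whereas the L2-norm aggregates the error over all vertices, so tolerances stated in the two norms are not directly comparable.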
Platform
- Intel(R) Xeon(R) Silver 4116 CPU (12 cores) × 2,
cache L1: 768KB, L2: 12MB, L3: 16MB (shared).
- NVIDIA Tesla V100 GPU (16GB PCIe),
14 TFLOPs SP (84 SMs × 64 FP/INT cores).
- CentOS 7.9, OpenMP 5.0, CUDA 11.3, GCC 9.3.
Our Approaches
DynamicLevelwisePR
- In contrast to full power-iteration.
- Process vertices in levels of SCCs.
- Avoid converged/unstable vertices.
- No per-iteration sharing of ranks.
- Faster on CPU with OpenMP.
- Slightly higher error.
- Requires graph to be dead-end free.
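The idea of processing vertices in levels of SCCs can be illustrated with the sketch below. It only shows how vertices might be grouped into levels, not the rank computation itself, and it is not the poster's implementation: it assumes Kosaraju's algorithm for finding strongly connected components, and the function names are hypothetical. A component is placed one level after the deepest component that has an edge into it, so all components within a level are independent of each other.

```python
def strongly_connected_components(num_vertices, edges):
    """Iterative Kosaraju. Returns (comp, count); ids follow topological order."""
    out_adj = [[] for _ in range(num_vertices)]
    in_adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        out_adj[u].append(v)
        in_adj[v].append(u)

    # Pass 1: record vertices by DFS finish time on the forward graph.
    visited = [False] * num_vertices
    finish_order = []
    for root in range(num_vertices):
        if visited[root]:
            continue
        visited[root] = True
        stack = [(root, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(out_adj[u]):
                stack.append((u, i + 1))
                v = out_adj[u][i]
                if not visited[v]:
                    visited[v] = True
                    stack.append((v, 0))
            else:
                finish_order.append(u)

    # Pass 2: DFS on the reversed graph in decreasing finish time.
    comp = [-1] * num_vertices
    count = 0
    for root in reversed(finish_order):
        if comp[root] != -1:
            continue
        comp[root] = count
        stack = [root]
        while stack:
            u = stack.pop()
            for v in in_adj[u]:
                if comp[v] == -1:
                    comp[v] = count
                    stack.append(v)
        count += 1
    return comp, count


def scc_levels(num_vertices, edges):
    """Group vertices into levels; level k depends only on levels below k."""
    comp, count = strongly_connected_components(num_vertices, edges)
    level = [0] * count
    # Sorting by source component visits components in topological order,
    # so level[cu] is final by the time its outgoing edges are relaxed.
    cross_edges = sorted((comp[u], comp[v]) for u, v in edges if comp[u] != comp[v])
    for cu, cv in cross_edges:
        level[cv] = max(level[cv], level[cu] + 1)
    levels = [[] for _ in range(max(level, default=-1) + 1)]
    for v in range(num_vertices):
        levels[level[comp[v]]].append(v)
    return levels


edges = [(0, 1), (1, 0), (1, 2), (2, 3), (3, 2), (3, 4), (4, 4), (0, 4)]
print(scc_levels(5, edges))  # prints [[0, 1], [2, 3], [4]]
```

Because ranks only flow from lower levels to higher ones, each level can be iterated to convergence before the next level starts, which is one way to read "no per-iteration sharing of ranks".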
DynamicMonolithicPR
- Full power-iteration, process all vertices.
- Group vertices by SCC for better access.
- Partition vertices by in-degree on GPU.
- Use old ranks, skip unaffected vertices.
- Affected vertices found with DFS.
- Faster on GPU with CUDA.
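Finding affected vertices with DFS might look like the sketch below. The exact definition of "affected" is not given on the poster, so this assumes that a vertex is affected if it is an endpoint of an inserted or deleted edge, or reachable from such an endpoint in the updated graph. The name `affected_vertices` is hypothetical.

```python
def affected_vertices(num_vertices, edges_after_update, batch_edges):
    """Mark vertices reachable from any endpoint of an updated edge (assumption)."""
    out_adj = [[] for _ in range(num_vertices)]
    for u, v in edges_after_update:
        out_adj[u].append(v)

    affected = [False] * num_vertices
    stack = []
    # Seed the search with both endpoints of every inserted or deleted edge.
    for u, v in batch_edges:
        for seed in (u, v):
            if not affected[seed]:
                affected[seed] = True
                stack.append(seed)

    # Iterative DFS over out-edges of the updated graph.
    while stack:
        u = stack.pop()
        for v in out_adj[u]:
            if not affected[v]:
                affected[v] = True
                stack.append(v)
    return affected


edges_after = [(0, 1), (1, 2), (2, 2), (3, 4), (4, 3)]
batch = [(1, 2)]  # edge 1 -> 2 was inserted
print(affected_vertices(5, edges_after, batch))
# prints [False, True, True, False, False]
```

Vertices left unmarked keep their old ranks and can be skipped during iteration, while marked vertices start from their old ranks and are recomputed.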
Introduction
Types of Dynamic graph algorithms
- Incremental: handles 1 edge/vertex insertion.
- Decremental: handles 1 edge/vertex deletion.
- Fully dynamic: handles 1 insertion or deletion.
- Batched fully dynamic: handles n insertions
and/or deletions.
Benefits of Dynamic graph algorithms
- Reduce the time needed for performing analytics.
- Enable interactivity with the dataset.
- Parallel fully dynamic algorithms accept a
batch of updates, minimizing the computation
needed compared with fully dynamic ones.
PageRank computation approaches
- Matrix multiplication.
- Power-iteration (push vs pull).
- Random walk (approximate).
Challenges & Limitations
- Graphs are massive and constantly updated.
- Existing dynamic algorithms do not utilize
reducibility of graphs.
- Vertices which are dependent upon other
vertices to converge are still processed.
- Locality benefits of SCCs are not explored.
PageRank has applications in:
- Ranking of websites.
- Measuring scientific impact of researchers.
- Finding the best teams and athletes.
- Ranking companies by talent concentration.
- Predicting road/foot traffic in urban spaces.
- Analysing protein networks.
- Finding the most authoritative news sources.
- Identifying parts of brain that change jointly.
- Toxic waste management.
- PageRank is a link-analysis algorithm.
- Developed by Larry Page and Sergey Brin in 1996.
- For ordering information on the web.
- Represented with a random-surfer model.
- Rank of a page is defined recursively.
- Calculated iteratively with power-iteration.
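The recursive definition and its iterative solution can be sketched as a pull-based power iteration. This is a minimal illustration rather than the poster's implementation: the damping factor of 0.85, the tolerance, and the function name `pagerank` are assumptions, and the graph is assumed to be free of dead ends. The optional `initial_ranks` argument is included only to show where previously computed ranks could be reused.

```python
def pagerank(num_vertices, edges, damping=0.85, tolerance=1e-10,
             max_iterations=500, initial_ranks=None):
    """Pull-based power iteration on a dead-end-free directed graph."""
    out_degree = [0] * num_vertices
    in_adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        out_degree[u] += 1
        in_adj[v].append(u)
    if any(d == 0 for d in out_degree):
        raise ValueError("graph has dead ends; add self-loops first")

    if initial_ranks is not None:
        ranks = list(initial_ranks)
    else:
        ranks = [1.0 / num_vertices] * num_vertices
    teleport = (1.0 - damping) / num_vertices

    for _ in range(max_iterations):
        # r(v) = (1 - d) / N + d * sum over in-neighbours u of r(u) / outdeg(u)
        new_ranks = [
            teleport + damping * sum(ranks[u] / out_degree[u] for u in in_adj[v])
            for v in range(num_vertices)
        ]
        error = max(abs(a - b) for a, b in zip(new_ranks, ranks))  # L-infinity norm
        ranks = new_ranks
        if error < tolerance:
            break
    return ranks


print(pagerank(3, [(0, 1), (1, 0), (2, 0)]))
```

In this small example vertex 0 receives links from both other vertices and ends up with the highest rank, while vertex 2 has no in-links and keeps only the teleport share.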
Fighting Fake news
- Click-Gap: detects when Facebook is driving
disproportionate amounts of traffic to
websites.
- Part of an effort to rid Facebook’s services
of fake news.
- Is a website relying on Facebook to drive
significant traffic, but not well ranked by the
rest of the web?
Debugging complex software systems
- MonitorRank: a version of PageRank designed
to analyze complex, engineered systems.
- Returns a ranked list of systems based on the
likelihood that they contributed to, or
participated in, an anomalous situation.
Finding the most original writers
- BookRank: using a network of 19th century
authors to find quantitative evidence that
Jane Austen and Walter Scott were
the most original authors of the 19th century.
Finding topical authorities
- TwitterRank: using the teleportation vector
and topic-specific transition probabilities to
localize the PageRank vector.
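Localizing the PageRank vector with a teleportation vector can be sketched as a personalized PageRank. This is not TwitterRank itself, only an illustration of the mechanism under stated assumptions: edges carry a hypothetical topic-specific weight, the `teleport` vector sums to one, every vertex has at least one out-edge with positive weight, and the function name is made up for this example.

```python
def personalized_pagerank(num_vertices, weighted_edges, teleport,
                          damping=0.85, iterations=100):
    """Power iteration with a non-uniform teleport vector and weighted edges."""
    out_weight = [0.0] * num_vertices
    in_adj = [[] for _ in range(num_vertices)]
    for u, v, w in weighted_edges:
        out_weight[u] += w
        in_adj[v].append((u, w))

    ranks = list(teleport)
    for _ in range(iterations):
        # Transition probability u -> v is w(u, v) / total out-weight of u.
        ranks = [
            (1.0 - damping) * teleport[v]
            + damping * sum(ranks[u] * w / out_weight[u] for u, w in in_adj[v])
            for v in range(num_vertices)
        ]
    return ranks


# Two separate 2-cycles; all teleportation goes to vertex 0.
edges = [(0, 1, 1.0), (1, 0, 1.0), (2, 3, 1.0), (3, 2, 1.0)]
print(personalized_pagerank(4, edges, teleport=[1.0, 0.0, 0.0, 0.0]))
```

Because the random surfer always restarts at vertex 0, rank stays inside the cycle containing it and vertices 2 and 3 receive none, which is the localization effect described above.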