Benchmarking Tool for Graph Algorithms
IIIT-H Cloud Computing - Major Project
By:
Abhinaba Sarkar 201405616
Malavika Reddy 201201193
Yash Khandelwal 201302164
Nikita Kad 201330030
Description
● In computer science and mathematics, graphs are abstract data structures that model
structural relationships among objects. They are widely used for data modeling in
application domains where identifying relationship patterns, rules, and anomalies is useful.
● These domains include the web graph, social networks, etc. The ever-increasing size of
graph-structured data in these applications creates a critical need for scalable systems that
can process large amounts of it efficiently.
● The project aims to build a benchmarking tool that tests the performance of graph
algorithms such as BFS and PageRank on MapReduce, Giraph, GraphLab, and Neo4j, and
determines which approach works better on which kind of graph.
Motivation
● Analyze the runtime of different graph algorithms on different
distributed systems.
● Performing computation on a graph data structure requires processing at
each node.
● Each node contains node-specific data as well as links (edges) to other
nodes, so a computation must traverse the graph, which takes a large
amount of time on large graphs.
Approach
The BFS/SSSP algorithm is split into two tasks:
● Map task: In each map task, we discover all neighbors of the nodes currently in the queue
(nodes in the queue are color-coded GRAY) and add them to the graph.
● Reduce task: In each reduce task, we set the correct level of each node and update the graph.
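The two BFS tasks above can be sketched in plain Python, with no Hadoop involved. This is a minimal illustration rather than the project's code: the record layout `(neighbours, level, color)`, the WHITE and BLACK colors, and all function names are assumptions introduced here; only the GRAY "in queue" marker comes from the slides.

```python
from collections import defaultdict

INF = float("inf")
# Assumed ordering: a node only ever moves WHITE -> GRAY -> BLACK.
COLOR_RANK = {"WHITE": 0, "GRAY": 1, "BLACK": 2}

def bfs_map(node_id, node):
    """Map task: expand GRAY nodes by emitting a GRAY record per neighbour."""
    neighbours, level, color = node
    if color == "GRAY":
        for n in neighbours:
            yield n, ([], level + 1, "GRAY")
        color = "BLACK"  # this node has now been processed
    # Always re-emit the node itself so the graph structure survives.
    yield node_id, (neighbours, level, color)

def bfs_reduce(node_id, records):
    """Reduce task: keep the adjacency list, the smallest level, the darkest color."""
    neighbours, level, color = [], INF, "WHITE"
    for nbrs, lvl, col in records:
        if nbrs:
            neighbours = nbrs
        level = min(level, lvl)
        if COLOR_RANK[col] > COLOR_RANK[color]:
            color = col
    return node_id, (neighbours, level, color)

def bfs(adjacency, source):
    """Run map/reduce rounds until no GRAY node is left; return (levels, rounds)."""
    graph = {
        n: (list(nbrs), 0 if n == source else INF, "GRAY" if n == source else "WHITE")
        for n, nbrs in adjacency.items()
    }
    rounds = 0
    while any(color == "GRAY" for _, _, color in graph.values()):
        grouped = defaultdict(list)  # stands in for the shuffle phase
        for node_id, node in graph.items():
            for key, value in bfs_map(node_id, node):
                grouped[key].append(value)
        graph = dict(bfs_reduce(k, v) for k, v in grouped.items())
        rounds += 1
    return {n: level for n, (_, level, _) in graph.items()}, rounds
```

Every node must appear as a key in `adjacency`, even if its neighbour list is empty. Each loop iteration corresponds to one MapReduce job, which is why the number of rounds grows with the depth of the graph.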
The PageRank algorithm is also split into two tasks:
● Map task: Each page emits its neighbors and its current PageRank.
● Reduce task: For each key (page), the new PageRank is calculated from the PageRank values
emitted in the map task.
○ PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where T1, ..., Tn are the pages
that link to A, C(P) is the cardinality (out-degree) of page P, and d is the damping (“random URL”) factor.
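The PageRank formula above maps onto the two tasks as in the following Python sketch. It is illustrative only: the function names, the tagged `("links", ...)` and `("rank", ...)` records, the starting rank of 1.0, and the default `d = 0.85` are assumptions made here, not values taken from the project.

```python
from collections import defaultdict

def pagerank_map(page, rank, out_links):
    """Map task: pass the link list along and send each neighbour PR(page)/C(page)."""
    yield page, ("links", out_links)
    for target in out_links:
        yield target, ("rank", rank / len(out_links))

def pagerank_reduce(page, values, d):
    """Reduce task: PR(A) = (1 - d) + d * sum(PR(Ti)/C(Ti))."""
    out_links, total = [], 0.0
    for kind, payload in values:
        if kind == "links":
            out_links = payload
        else:
            total += payload
    return page, ((1 - d) + d * total, out_links)

def pagerank(adjacency, iterations=20, d=0.85):
    """Run a fixed number of map/reduce rounds; d=0.85 is an assumed default."""
    state = {page: (1.0, list(links)) for page, links in adjacency.items()}
    for _ in range(iterations):
        grouped = defaultdict(list)  # stands in for the shuffle phase
        for page, (rank, links) in state.items():
            for key, value in pagerank_map(page, rank, links):
                grouped[key].append(value)
        state = dict(pagerank_reduce(p, v, d) for p, v in grouped.items())
    return {page: rank for page, (rank, _) in state.items()}
```

Each round touches every node once and every edge once, which matches the (nodes + relations) * iterations cost stated for PageRank in this deck. Pages without out-links simply emit no rank in this sketch; a production implementation would need to decide how to redistribute their rank.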
Dijkstra:
● Map task: In each map task, neighbors are discovered and put into
the queue, color-coded GRAY.
● Reduce task: In each reduce task, we select nodes according to
their shortest distance from the current node.
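A weighted version of the same pattern might look like the sketch below. Note that iterating map and reduce this way behaves like a parallel edge relaxation (closer to Bellman-Ford than to the priority-queue form of Dijkstra); that reading, the record layout, the function names, and the rule that a node turns GRAY again whenever its distance improves are all assumptions made for illustration. Edge weights are assumed to be non-negative.

```python
from collections import defaultdict

INF = float("inf")

def sssp_map(node_id, node):
    """Map task: GRAY nodes offer each neighbour a candidate distance."""
    edges, dist, color = node
    if color == "GRAY":
        for neighbour, weight in edges.items():
            yield neighbour, (None, dist + weight)
    # Re-emit the node's own record (edges is a dict, never None).
    yield node_id, (edges, dist)

def sssp_reduce(node_id, records):
    """Reduce task: keep the shortest distance; re-queue the node if it improved."""
    edges, old_dist, best = {}, INF, INF
    for rec_edges, d in records:
        if rec_edges is not None:  # the node's own record
            edges, old_dist = rec_edges, d
        best = min(best, d)
    if best < old_dist:
        color = "GRAY"   # distance improved, so neighbours must be revisited
    elif best == INF:
        color = "WHITE"  # not reached yet
    else:
        color = "BLACK"
    return node_id, (edges, best, color)

def sssp(weighted_adjacency, source):
    """Run rounds until no GRAY node is left; return (distances, rounds)."""
    graph = {
        n: (dict(e), 0 if n == source else INF, "GRAY" if n == source else "WHITE")
        for n, e in weighted_adjacency.items()
    }
    rounds = 0
    while any(color == "GRAY" for _, _, color in graph.values()):
        grouped = defaultdict(list)  # stands in for the shuffle phase
        for node_id, node in graph.items():
            for key, value in sssp_map(node_id, node):
                grouped[key].append(value)
        graph = dict(sssp_reduce(k, v) for k, v in grouped.items())
        rounds += 1
    return {n: dist for n, (_, dist, _) in graph.items()}, rounds
```

On the graph `{"a": {"b": 5, "c": 1}, "b": {"d": 1}, "c": {"b": 1}, "d": {}}` with source `"a"`, node `b` is first reached at distance 5 and later improved to 2, so it is expanded twice. A node can be re-queued like this in the weighted case, which is one way the iteration count can exceed that of BFS on the same graph.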
Applications
Today’s dynamic social graphs (such as LinkedIn,
Twitter, and Facebook) cannot feasibly be processed
on a single node. We therefore need to
benchmark the runtime of different graph
algorithms on distributed systems.
Example graph: LinkedIn’s social graph
Complexity
● BFS: The complexity of the standard BFS algorithm is O(V+E), but because of
the read/write overhead in distributed computing, it becomes
O(E*Depth).
● The same holds for Dijkstra’s algorithm, but the number of iterations is
higher than for BFS.
● PageRank: The complexity of PageRank in a distributed system is
(No. of Nodes + No. of Relations) * Iterations.
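To make the O(E*Depth) figure concrete, the sketch below counts rounds and edge records read for a level-synchronous BFS. It rests on one modelling assumption, in line with the read/write overhead mentioned above: every round re-reads the entire edge list, as an iterated MapReduce job over a file would. The function name and the cost model are illustrative, not measurements from the project.

```python
def mapreduce_bfs_cost(adjacency, source):
    """Return (rounds, edge_records_read) under the re-read-everything model."""
    total_edges = sum(len(nbrs) for nbrs in adjacency.values())
    visited = {source}
    frontier = {source}
    rounds = 0
    records_read = 0
    while frontier:
        rounds += 1
        records_read += total_edges  # assumed: the whole graph is read each round
        next_frontier = {
            n for f in frontier for n in adjacency[f] if n not in visited
        }
        visited |= next_frontier
        frontier = next_frontier
    return rounds, records_read

# Two graphs with the same number of edges but different depth.
path = {0: [1], 1: [2], 2: [3], 3: [4], 4: []}
star = {0: [1, 2, 3, 4], 1: [], 2: [], 3: [], 4: []}
print(mapreduce_bfs_cost(path, 0))  # (5, 20)
print(mapreduce_bfs_cost(star, 0))  # (2, 8)
```

Both graphs have four edges, yet the deeper path graph costs 20 edge reads against 8 for the star, while a single-machine BFS would scan each edge once in either case.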
Benchmarking
GraphLab:

PageRank
Nodes        Time
1,000        6.029 sec
10,000       20.154 sec
1 million    1 min 11.124 sec

Dijkstra
Nodes        Time
1,000        4.852 sec
10,000       13.029 sec
1 million    1 min 10.576 sec
Conclusion and Future Work
The experimental results show that the time taken by the PageRank algorithm is directly
proportional to the number of relations in the graph when the number of nodes and iterations is
constant. This explains the huge difference in time.
The runtime of BFS is directly proportional to the depth of the graph: the greater the depth, the more
iterations are needed and hence the more time is taken.
Future Work:
Reading the input graph from a file adds a large read/write overhead in each
iteration. If the graph and its properties were stored in a database instead, this read/write
overhead would be avoided and the query time reduced. We therefore plan to add a database to the tool.
