MapReduce Programming Model
To Solve Graph Problems
Presented By:
Nishant Gandhi
M.Tech. - CSE 1st Year
1311CS05
Guided By:
Dr. Rajiv Misra
Seminar Overview
• Introduction to MapReduce
• MapReduce Programming Model
– Word Count problem
• Graph Problems & MapReduce
– Breadth-First Search
– Augmenting Edges with Degree
– Enumerating Triangles from Graph
Introduction to MapReduce
• History of Computing
– Moore’s Law
• Has not held for the last few years
• Memory is still the bottleneck for high-GHz processors
– Distributed Problems
• Indexing The Web, Simulating Internet Sized Network, Speeding Up
Content Delivery, Rendering Multiple Frames
– Parallel Computing (1975-1985)
• Synchronization Problems
• Very Costly Super Computers
– Distributed Computing (1995-Today)
• Cost Effective Solution
• Use Commodity Hardware
• Google has no Super Computer
Introduction to MapReduce
• History of MapReduce at Google
– Problem at Google
• Computing over Large Amounts of Data on Distributed Systems (DS)
• Parallelize Computing, Distribute Data, Handle Failure
– One Solution
• New Abstraction that allows simple computations & hides
all the messy details
• Automatic Parallelization, Distribution, Fault Handling
• MapReduce Paper 2004
MapReduce Programming Model
• Motivation
– Automatic Parallelization & Distribution
– Fault tolerant
– Provides Status & Monitoring Tool
– Clean Abstraction For the Programmer
MapReduce Programming Model
• Programming Model
– Borrows From Functional Programming
– The user implements an interface of two functions
• Map & Reduce
• map (in_key, in_value) --> (out_key, intermediate_value) list
• reduce (out_key, intermediate_value list) --> out_value list
MapReduce Programming Model
map: (K1,V1) → list (K2,V2)
reduce: (K2,list(V2)) → list (K3,V3)
1. Map function is applied to every input key-value pair
2. Map function generates intermediate key-value pairs
3. Intermediate key-values are sorted and grouped by key
4. Reduce is applied to sorted and grouped intermediate
key-values
5. Reduce emits result key-values
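The five steps above can be sketched as a tiny single-process driver. This is only an illustration of the data flow, not how a real MapReduce runtime distributes work; the names run_mapreduce, parity_mapper and sum_reducer are hypothetical and chosen for this example.

```python
from collections import defaultdict

def run_mapreduce(records, mapper, reducer):
    """Run one in-memory MapReduce pass over (key, value) records."""
    # Steps 1-2: apply map to every input pair and collect the
    # intermediate (K2, V2) pairs it generates.
    groups = defaultdict(list)
    for key, value in records:
        for k2, v2 in mapper(key, value):
            groups[k2].append(v2)
    # Step 3: sort the intermediate keys; values are already grouped by key.
    # Steps 4-5: apply reduce to each group and gather the result pairs.
    output = []
    for k2 in sorted(groups):
        output.extend(reducer(k2, groups[k2]))
    return output

def parity_mapper(_position, number):
    # Emit one intermediate pair keyed by the parity of the number.
    yield ("even" if number % 2 == 0 else "odd"), number

def sum_reducer(parity, numbers):
    # Emit one result pair per intermediate key.
    yield parity, sum(numbers)

records = [(0, 3), (1, 4), (2, 5), (3, 10)]
result = run_mapreduce(records, parity_mapper, sum_reducer)
print(result)
```

Keeping map and reduce as plain generator functions mirrors the signatures map: (K1,V1) → list (K2,V2) and reduce: (K2,list(V2)) → list (K3,V3).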
MapReduce Programming Model
Example: WordCount
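A minimal WordCount sketch in Python, assuming documents arrive as a dictionary from a document id to its text. The function names wc_map, wc_reduce and word_count are illustrative, and the sort-then-group step stands in for the shuffle that a real framework performs between map and reduce.

```python
from itertools import groupby
from operator import itemgetter

def wc_map(_doc_id, text):
    # Emit (word, 1) for every word in the document.
    for word in text.lower().split():
        yield word, 1

def wc_reduce(word, counts):
    # Sum all the 1s emitted for this word.
    yield word, sum(counts)

def word_count(documents):
    # Map phase: collect every intermediate (word, 1) pair.
    intermediate = [
        pair
        for doc_id, text in documents.items()
        for pair in wc_map(doc_id, text)
    ]
    # Shuffle phase: sort by word so equal keys become adjacent.
    intermediate.sort(key=itemgetter(0))
    # Reduce phase: one reduce call per distinct word.
    result = {}
    for word, group in groupby(intermediate, key=itemgetter(0)):
        for out_word, total in wc_reduce(word, (count for _, count in group)):
            result[out_word] = total
    return result

counts = word_count({"d1": "the quick brown fox", "d2": "the lazy dog the end"})
print(counts)
```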
Graph Problems
Graphs are ubiquitous in modern society. Some
examples:
• The hyperlink structure of the web
• Social networks: social networking sites like
Facebook, IMDB, email, text messages and tweet
flows (like Twitter)
• Transportation networks (roads, trains, flights etc.)
• Human body can be seen as a graph of genes,
proteins, cells etc.
Graph Problems & MapReduce
• Performing Computation on a graph data
structure requires processing at each node
• Each node contains node-specific data as well
as links (edges) to other nodes
• Computation must traverse the graph and
perform the computation step
• How do we traverse a graph in MapReduce?
How do we represent the graph for this?
Breadth-First Search & MapReduce
Problem:
BFS proceeds level by level, so it does not fit
into a single MapReduce pass
Solution:
Iterated passes through MapReduce: each pass
maps some nodes, and the result includes
additional nodes, which are fed into successive
MapReduce passes
Breadth-First Search & MapReduce
Example
Representation as an adjacency list
ID EDGES|DISTANCE_FROM_SOURCE|COLOR|
• Input to MAP
1 2,5|0|GRAY|
2 1,3,4,5|Integer.MAX_VALUE|WHITE|
3 2,4|Integer.MAX_VALUE|WHITE|
4 2,3,5|Integer.MAX_VALUE|WHITE|
5 1,2,4|Integer.MAX_VALUE|WHITE|
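The record format above can be read into a small data structure as follows. This is a sketch that assumes the ID is separated from the rest by whitespace and that the fields are pipe-delimited exactly as shown; the Node class and the parse_node and format_node helpers are hypothetical names, and INFINITY stands in for Java's Integer.MAX_VALUE.

```python
from dataclasses import dataclass
from typing import List, Optional

INFINITY = 2**31 - 1  # value of Java's Integer.MAX_VALUE

@dataclass
class Node:
    node_id: int
    edges: Optional[List[int]]  # None models the NULL edge list
    distance: int
    color: str  # WHITE, GRAY or BLACK

def parse_node(line):
    # "1 2,5|0|GRAY|" -> id "1" and payload "2,5|0|GRAY|"
    node_id, payload = line.split(None, 1)
    edges_field, distance_field, color = payload.split("|")[:3]
    if edges_field == "NULL":
        edges = None
    else:
        edges = [int(e) for e in edges_field.split(",") if e]
    if distance_field == "Integer.MAX_VALUE":
        distance = INFINITY
    else:
        distance = int(distance_field)
    return Node(int(node_id), edges, distance, color)

def format_node(node):
    # Inverse of parse_node, producing the ID EDGES|DISTANCE|COLOR| layout.
    edges = "NULL" if node.edges is None else ",".join(map(str, node.edges))
    distance = "Integer.MAX_VALUE" if node.distance == INFINITY else str(node.distance)
    return f"{node.node_id} {edges}|{distance}|{node.color}|"

source = parse_node("1 2,5|0|GRAY|")
print(source)
```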
Breadth-First Search & MapReduce
Example
• 1st iteration of Map
1 2,5|0|BLACK|
2 NULL|1|GRAY|
5 NULL|1|GRAY|
2 1,3,4,5|Integer.MAX_VALUE|WHITE|
3 2,4|Integer.MAX_VALUE|WHITE|
4 2,3,5|Integer.MAX_VALUE|WHITE|
5 1,2,4|Integer.MAX_VALUE|WHITE|
• 1st iteration of Reduce (input shown only for node 2)
2 NULL|1|GRAY|
2 1,3,4,5|Integer.MAX_VALUE|WHITE|
The reducer's job is to take all this data and
construct a new node using:
• the non-null list of edges
• the minimum distance
• the darkest color
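The merge rule for the reducer can be sketched as below, assuming each record is an (edges, distance, color) tuple with None for a NULL edge list. The name bfs_reduce and the COLOR_RANK ordering table are illustrative choices, not taken from the slides.

```python
INFINITY = 2**31 - 1  # value of Java's Integer.MAX_VALUE
COLOR_RANK = {"WHITE": 0, "GRAY": 1, "BLACK": 2}  # higher means darker

def bfs_reduce(node_id, records):
    """Merge all records emitted for one node into a single node."""
    edges = None
    distance = INFINITY
    color = "WHITE"
    for rec_edges, rec_distance, rec_color in records:
        if rec_edges is not None:
            edges = rec_edges                       # the non-null list of edges
        distance = min(distance, rec_distance)      # the minimum distance
        if COLOR_RANK[rec_color] > COLOR_RANK[color]:
            color = rec_color                       # the darkest color
    return node_id, (edges, distance, color)

# The two records that reach the reducer for node 2 in the first iteration.
merged = bfs_reduce(2, [(None, 1, "GRAY"), ([1, 3, 4, 5], INFINITY, "WHITE")])
print(merged)
```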
Breadth-First Search & MapReduce
Example
•Output of 1st iteration
1 2,5,|0|BLACK
2 1,3,4,5,|1|GRAY
3 2,4,|Integer.MAX_VALUE|WHITE
4 2,3,5,|Integer.MAX_VALUE|WHITE
5 1,2,4,|1|GRAY
• Output of 2nd iteration
1 2,5,|0|BLACK
2 1,3,4,5,|1|BLACK
3 2,4,|2|GRAY
4 2,3,5,|2|GRAY
5 1,2,4,|1|BLACK
Breadth-First Search & MapReduce
Example
• Output of 3rd iteration
1 2,5,|0|BLACK
2 1,3,4,5,|1|BLACK
3 2,4,|2|BLACK
4 2,3,5,|2|BLACK
5 1,2,4,|1|BLACK
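The whole example can be replayed with a small in-memory simulation. This is a sketch under stated assumptions: nodes are stored as (edges, distance, color) tuples, the loop stops once no GRAY node remains (the slides do not state a termination test), and bfs_map, bfs_reduce, bfs_iteration and bfs are hypothetical names.

```python
from collections import defaultdict

INFINITY = 2**31 - 1  # value of Java's Integer.MAX_VALUE
COLOR_RANK = {"WHITE": 0, "GRAY": 1, "BLACK": 2}

def bfs_map(node_id, node):
    edges, distance, color = node
    if color == "GRAY":
        # Expand the frontier: each neighbor gets a GRAY record without edges.
        for neighbor in edges:
            yield neighbor, (None, distance + 1, "GRAY")
        color = "BLACK"  # this node is now fully processed
    yield node_id, (edges, distance, color)

def bfs_reduce(records):
    edges, distance, color = None, INFINITY, "WHITE"
    for rec_edges, rec_distance, rec_color in records:
        if rec_edges is not None:
            edges = rec_edges
        distance = min(distance, rec_distance)
        if COLOR_RANK[rec_color] > COLOR_RANK[color]:
            color = rec_color
    return edges, distance, color

def bfs_iteration(graph):
    # One MapReduce pass: map every node, group by node id, then reduce.
    groups = defaultdict(list)
    for node_id, node in graph.items():
        for key, record in bfs_map(node_id, node):
            groups[key].append(record)
    return {node_id: bfs_reduce(groups[node_id]) for node_id in sorted(groups)}

def bfs(graph):
    # Assumed stopping rule: iterate until no GRAY (frontier) node is left.
    iterations = 0
    while any(color == "GRAY" for _, _, color in graph.values()):
        graph = bfs_iteration(graph)
        iterations += 1
    return graph, iterations

graph = {
    1: ([2, 5], 0, "GRAY"),
    2: ([1, 3, 4, 5], INFINITY, "WHITE"),
    3: ([2, 4], INFINITY, "WHITE"),
    4: ([2, 3, 5], INFINITY, "WHITE"),
    5: ([1, 2, 4], INFINITY, "WHITE"),
}
final, iterations = bfs(graph)
print(iterations, final)
```

Each call to bfs_iteration corresponds to one full MapReduce job, so the number of jobs grows with the depth of the search from the source node.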
Augmenting Edges with Degrees &
MapReduce
Problem:
Attaching both endpoint degrees to every edge
does not fit into a single MapReduce job
Solution:
Requires two MapReduce jobs, that is, two map
steps and two reduce steps; one of the map
steps is the identity map.
Augmenting Edges with Degrees &
MapReduce Example
Mapper:
For each input record (an edge), the map creates
two output records, one keyed under each
vertex in the edge.
Reducer:
The reduce takes all edges mapped to a
single vertex (“Fred” here), counts them to
obtain the degree, and emits a record for
each input record, each keyed under the
edge it represents.
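The first job can be sketched in memory as follows, assuming each edge is a (u, v) tuple of vertex names. Apart from "Fred", the vertex names are made up for this example, and degree_map, degree_reduce and partial_degrees are hypothetical names.

```python
from collections import defaultdict

def degree_map(edge):
    # Emit the edge twice, once keyed under each of its vertices.
    u, v = edge
    yield u, edge
    yield v, edge

def degree_reduce(vertex, edges):
    # All edges in this bin touch `vertex`, so their count is its degree.
    edges = list(edges)
    degree = len(edges)
    for edge in edges:
        # Re-key each record under the edge it represents.
        yield edge, (vertex, degree)

def partial_degrees(edges):
    bins = defaultdict(list)
    for edge in edges:
        for vertex, record in degree_map(edge):
            bins[vertex].append(record)
    output = []
    for vertex in sorted(bins):
        output.extend(degree_reduce(vertex, bins[vertex]))
    return output

edges = [("Ann", "Fred"), ("Bob", "Fred"), ("Fred", "Kim")]
records = partial_degrees(edges)
for record in records:
    print(record)
```

After this job every edge appears in two records, each carrying the degree of only one of its endpoints.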
Augmenting Edges with Degrees &
MapReduce Example
Mapper:
The identity mapper preserves the records
unchanged, so the records are binned by
the edges they represent.
Reducer:
The reducer combines the partial-degree
information to produce a complete record,
which it exports.
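The second job can be sketched as below, assuming its input is a list of (edge, (vertex, degree)) records, one per endpoint of each edge. The sample records and the names identity_map, combine_reduce and augment_edges are illustrative.

```python
from collections import defaultdict

# Assumed input: two partial-degree records per edge.
partial_records = [
    (("Ann", "Fred"), ("Ann", 1)),
    (("Bob", "Fred"), ("Bob", 1)),
    (("Ann", "Fred"), ("Fred", 3)),
    (("Bob", "Fred"), ("Fred", 3)),
    (("Fred", "Kim"), ("Fred", 3)),
    (("Fred", "Kim"), ("Kim", 1)),
]

def identity_map(key, value):
    # Pass records through unchanged; grouping by edge happens in the shuffle.
    yield key, value

def combine_reduce(edge, partials):
    # Each bin holds the two partial-degree records of one edge.
    degrees = dict(partials)
    u, v = edge
    yield edge, (degrees[u], degrees[v])

def augment_edges(records):
    bins = defaultdict(list)
    for key, value in records:
        for out_key, out_value in identity_map(key, value):
            bins[out_key].append(out_value)
    return dict(
        pair
        for edge in sorted(bins)
        for pair in combine_reduce(edge, bins[edge])
    )

augmented = augment_edges(partial_records)
print(augmented)
```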
Enumerating Triangles & MapReduce
Example
• Problem:
Enumerate the 3-cycle subgraphs (triangles)
of a given graph
• Solution:
– Augment the edge records with vertex
valence (degree)
– Two MapReduce jobs
Enumerating Triangles & MapReduce
Example
• In the first map operation for enumerating triangles, the
mapper records each edge under the vertex with the lowest
degree.
• The incoming records’ key doesn’t matter.
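A sketch of the first job, assuming the input is the degree-augmented edge list in the form ((u, v), (deg_u, deg_v)). Two parts are assumptions not spelled out in the slides: degree ties are broken by vertex name, and the first reduce emits every pair of edges in a bin as an open triad keyed under the two vertices that would close it. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def low_degree_map(_key, record):
    # The incoming key is ignored; bin the edge under its lower-degree vertex.
    (u, v), (deg_u, deg_v) = record
    low = u if (deg_u, u) <= (deg_v, v) else v  # assumed tie-break by name
    yield low, (u, v)

def other_end(edge, apex):
    return edge[0] if edge[1] == apex else edge[1]

def open_triad_reduce(apex, edges):
    # Assumed first reduce: every pair of edges meeting at `apex` forms an
    # open triad, keyed under the vertex pair that would close it.
    for edge_a, edge_b in combinations(sorted(edges), 2):
        ends = tuple(sorted((other_end(edge_a, apex), other_end(edge_b, apex))))
        yield ends, apex

def first_job(augmented_edges):
    bins = defaultdict(list)
    for key, record in augmented_edges:
        for vertex, edge in low_degree_map(key, record):
            bins[vertex].append(edge)
    return [
        triad
        for vertex in sorted(bins)
        for triad in open_triad_reduce(vertex, bins[vertex])
    ]

# Triangle A-B-C with a pendant edge C-D; keys are None because they are unused.
augmented_edges = [
    (None, (("A", "B"), (2, 2))),
    (None, (("A", "C"), (2, 3))),
    (None, (("B", "C"), (2, 3))),
    (None, (("C", "D"), (3, 1))),
]
open_triads = first_job(augmented_edges)
print(open_triads)
```

Binning under the lowest-degree vertex keeps the bins of high-degree vertices small, which limits the number of edge pairs each reduce call has to generate.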
Enumerating Triangles & MapReduce
Example
• The second map for enumerating triangles brings together
the edge and open triad records.
• In the process, it rekeys the edge records so that both record
types are binned under the vertices they connect.
Enumerating Triangles & MapReduce
Example
• In the second reduce, each bin contains at most one edge record
and some number of triad records (perhaps none).
• For every combination of edge record and triad record in a bin, the
reduce emits a triangle record. The output key isn’t significant.
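The second job can be sketched as follows, assuming edge records are (u, v) tuples and open triad records are ((end_1, end_2), apex) tuples, where the two ends are the vertices a closing edge would connect. The record tags "edge" and "triad" and the function names are illustrative.

```python
from collections import defaultdict

def second_map(record):
    kind, payload = record
    if kind == "edge":
        # Re-key the edge under the (sorted) pair of vertices it connects.
        yield tuple(sorted(payload)), ("edge", None)
    else:
        # Triads are keyed under the vertex pair that would close them.
        ends, apex = payload
        yield tuple(sorted(ends)), ("triad", apex)

def second_reduce(ends, records):
    records = list(records)
    # A bin holds at most one edge record; without it no triad is closed.
    if any(kind == "edge" for kind, _ in records):
        for kind, apex in records:
            if kind == "triad":
                # The output key is not significant, so None is used.
                yield None, tuple(sorted((apex, *ends)))

def enumerate_triangles(edges, triads):
    records = [("edge", e) for e in edges] + [("triad", t) for t in triads]
    bins = defaultdict(list)
    for record in records:
        for key, value in second_map(record):
            bins[key].append(value)
    return [
        triangle
        for key in sorted(bins)
        for _, triangle in second_reduce(key, bins[key])
    ]

edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
triads = [(("B", "C"), "A")]  # open triad B-A-C, closed by edge B-C
triangles = enumerate_triangles(edges, triads)
print(triangles)
```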
Bibliography
1. J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on
Large Clusters,” Comm. ACM, vol. 51, no. 1, 2008, pp. 107–112.
2. GoogleDevelopers, “Lecture 5: Parallel Graph Algorithms with
MapReduce,” 28 Aug. 2007; http://youtube.com/watch?v=BT-piFBP4fE.
3. J. Cohen, “Graph Twiddling in a MapReduce World,” Computing in
Science & Engineering, July/Aug. 2009, pp. 29–41.
Thank You
