Seminar On
MAPREDUCE MODEL
Presented By:
Kalyani Wankhede (Roll No. 606008)
Guided by:
Prof. Himangi Pande
Outline
• What is MapReduce?
• What is MapReduce used for?
• MapReduce Runtime
• MapReduce Programming Model
• Example: Word Count
• Fault Tolerance in MapReduce
6-Nov-14 
What is MapReduce?
• A simple data-parallel programming model designed for scalability and fault tolerance
• Pioneered by Google, where it processed more than 20 petabytes of data per day (as reported in 2008)
• Popularized by the open-source Hadoop project
What is MapReduce used for?
• At Google:
  • Index construction for Google Search
  • Statistical machine translation
• At Facebook:
  • Data mining
  • Ad optimization
  • Spam detection
• In research:
  • Bioinformatics
  • Natural language processing
MapReduce “Runtime”
Handles:
• Scheduling
• Data distribution
• Synchronization
• Errors and faults
• Speculative execution
MapReduce Programming Model
• Consists of two components:
  • JobTracker (master node):
    • Accepts job requests
    • Splits the input data
    • Assigns tasks to be executed in parallel
    • Monitors task progress and handles failures
  • Many TaskTrackers (slave nodes):
    • Execute tasks
    • Each task is either a map or a reduce; tasks run in parallel
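The master/worker split above can be sketched in a single Python process. This is only an illustration under stated assumptions: `split_input`, `map_task`, and `job_tracker` are hypothetical names, and a thread pool stands in for the TaskTracker nodes. It is not how Hadoop implements scheduling.

```python
from concurrent.futures import ThreadPoolExecutor


def split_input(lines, num_splits):
    """Divide the input lines into contiguous, roughly equal splits."""
    size = max(1, -(-len(lines) // num_splits))  # ceiling division, at least 1
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def map_task(split):
    """A map task: emit (word, 1) for every word in its split."""
    return [(word, 1) for line in split for word in line.split()]


def job_tracker(lines, num_workers):
    """Split the input and hand one map task per split to the worker pool."""
    splits = split_input(lines, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as workers:
        # map() runs the tasks in parallel and returns results in split order.
        return list(workers.map(map_task, splits))


task_outputs = job_tracker(["a b", "c", "d e f"], num_workers=3)
```

Each inner list in `task_outputs` is the output of one map task; failure handling and monitoring are left out of this sketch.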
Cont’d…
• Data type: key-value records
• Map function:
  (K_in, V_in) → list(K_inter, V_inter)
• Reduce function:
  (K_inter, list(V_inter)) → list(K_out, V_out)
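The two signatures above can be written down as Python types, together with a tiny single-process driver. This is a minimal sketch, not a real framework API: `MapFn`, `ReduceFn`, `map_reduce`, `identity_map`, and `max_reduce` are names assumed for illustration, and an in-memory dictionary stands in for the grouping that a real runtime performs across machines.

```python
from collections import defaultdict
from typing import Callable, Iterable, List, Tuple, TypeVar

K_in = TypeVar("K_in")
V_in = TypeVar("V_in")
K_inter = TypeVar("K_inter")
V_inter = TypeVar("V_inter")
K_out = TypeVar("K_out")
V_out = TypeVar("V_out")

# Hypothetical aliases mirroring the map and reduce signatures.
MapFn = Callable[[K_in, V_in], Iterable[Tuple[K_inter, V_inter]]]
ReduceFn = Callable[[K_inter, List[V_inter]], Iterable[Tuple[K_out, V_out]]]


def map_reduce(records, map_fn, reduce_fn):
    """Apply map_fn to every record, group by intermediate key, then reduce."""
    groups = defaultdict(list)
    for k_in, v_in in records:
        for k_inter, v_inter in map_fn(k_in, v_in):
            groups[k_inter].append(v_inter)
    results = []
    for k_inter in sorted(groups):  # keys are visited in sorted order
        results.extend(reduce_fn(k_inter, groups[k_inter]))
    return results


def identity_map(key, value):
    yield (key, value)


def max_reduce(key, values):
    yield (key, max(values))


readings = [("a", 3), ("b", 5), ("a", 7)]
result = map_reduce(readings, identity_map, max_reduce)
print(result)  # [('a', 7), ('b', 5)]
```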
Example: Word Count
def mapper(line):
    # Emit (word, 1) for every word in the input line.
    for word in line.split():
        yield (word, 1)

def reducer(key, values):
    # Sum all counts collected for the same word.
    yield (key, sum(values))
Word Count Execution
Input → Map → Shuffle & Sort → Reduce → Output

Input (one split per Map task):
  Map 1: the quick brown fox
  Map 2: the fox ate the mouse
  Map 3: how now brown cow

Map output:
  Map 1: the, 1; brown, 1; fox, 1; quick, 1
  Map 2: the, 1; fox, 1; the, 1; ate, 1; mouse, 1
  Map 3: how, 1; now, 1; brown, 1; cow, 1

Reduce output:
  Reduce 1: brown, 2; fox, 2; how, 1; now, 1; the, 3
  Reduce 2: ate, 1; cow, 1; mouse, 1; quick, 1
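The execution above can be traced in a single Python process. This is a minimal sketch under stated assumptions: the generator-based `mapper` and `reducer` and the variable names are illustrative, one dictionary stands in for the shuffle and sort step, and all keys go to a single reduce step instead of being partitioned across two reduce tasks.

```python
from collections import defaultdict


def mapper(line):
    for word in line.split():
        yield (word, 1)


def reducer(key, values):
    yield (key, sum(values))


splits = ["the quick brown fox", "the fox ate the mouse", "how now brown cow"]

# Map phase: one map task per input split.
map_outputs = [list(mapper(line)) for line in splits]

# Shuffle & sort: collect all values that share the same key.
grouped = defaultdict(list)
for pairs in map_outputs:
    for word, count in pairs:
        grouped[word].append(count)

# Reduce phase: one reducer call per key, visited in sorted key order.
counts = {}
for word in sorted(grouped):
    for key, total in reducer(word, grouped[word]):
        counts[key] = total
```

After the run, `counts` holds the same totals as the Output column, for example 3 for "the" and 2 for "fox".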
An optimization: Combiner
• Works for associative and commutative functions such as sum, count, and max
• Decreases the size of the intermediate data
• Example: map-side aggregation for Word Count:
def combiner(key, values):
    # Pre-sum the counts on the map side, before the shuffle.
    yield (key, sum(values))
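The effect of the combiner on intermediate data can be seen on a single map task. This is a minimal sketch: `map_with_combiner` is a hypothetical helper that groups one mapper's output locally and applies the combiner before anything would be sent over the network.

```python
from collections import defaultdict


def mapper(line):
    for word in line.split():
        yield (word, 1)


def combiner(key, values):
    yield (key, sum(values))


def map_with_combiner(line):
    """Run the mapper, then combine its output locally, key by key."""
    local = defaultdict(list)
    for word, count in mapper(line):
        local[word].append(count)
    combined = []
    for word, counts in local.items():
        combined.extend(combiner(word, counts))
    return combined


line = "the fox ate the mouse"
without_combiner = list(mapper(line))    # 5 intermediate pairs
with_combiner = map_with_combiner(line)  # 4 pairs: "the" is pre-summed to 2
```

Because sum is associative and commutative, the reducer produces the same totals whether or not the combiner ran.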
Word Count with Combiner
Input → Map & Combine → Shuffle & Sort → Reduce → Output

Input (one split per Map task):
  Map 1: the quick brown fox
  Map 2: the fox ate the mouse
  Map 3: how now brown cow

Map & Combine output:
  Map 1: the, 1; brown, 1; fox, 1; quick, 1
  Map 2: the, 2; fox, 1; ate, 1; mouse, 1
  Map 3: how, 1; now, 1; brown, 1; cow, 1

Reduce output:
  Reduce 1: brown, 2; fox, 2; how, 1; now, 1; the, 3
  Reduce 2: ate, 1; cow, 1; mouse, 1; quick, 1
Fault Tolerance in MapReduce
1. If a task crashes:
   • Retry it on another node
2. If a node crashes:
   • Re-launch its current tasks on other nodes
3. If a task is running slowly:
   • Launch a second copy of the task on another node
   • Take the output of whichever copy finishes first, and kill the other
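The retry and speculative-execution rules above can be sketched with Python threads. This is an illustration under stated assumptions, not the scheduler of any real framework: `run_with_retry`, `run_speculatively`, the node names, and the two toy tasks are all hypothetical. Threads cannot be killed in Python, so a `threading.Event` stands in for killing the slower copy.

```python
import concurrent.futures as cf
import threading


def run_with_retry(task, nodes):
    """Try the task on each node in turn until one attempt succeeds."""
    last_error = None
    for node in nodes:
        try:
            return task(node)
        except Exception as err:  # treat any exception as a crashed attempt
            last_error = err
    raise RuntimeError("task failed on every node") from last_error


def run_speculatively(task, nodes):
    """Run one copy per node, keep the first result, tell the rest to stop."""
    stop = threading.Event()
    with cf.ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        futures = [pool.submit(task, node, stop) for node in nodes]
        done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        stop.set()  # stands in for "kill the other copy"
        return next(iter(done)).result()


def flaky_task(node):
    if node == "node-1":
        raise RuntimeError("task crashed on " + node)
    return "done on " + node


def uneven_task(node, stop):
    if node == "slow-node":
        stop.wait(timeout=5)  # straggler: blocks until told to stop
    return "done on " + node


retried = run_with_retry(flaky_task, ["node-1", "node-2"])
fastest = run_speculatively(uneven_task, ["slow-node", "fast-node"])
```

Here `retried` comes from the second node after the first attempt crashes, and `fastest` comes from the copy that did not straggle.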
