Map reduce (from Google)

Sri Prasanna
Distributed Computing Seminar
Lecture 2: MapReduce Theory and Implementation
Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet
Summer 2007
Except as otherwise noted, the contents of this presentation are © Copyright 2007 University of Washington and licensed under the Creative Commons Attribution 2.5 License.
Outline
Functional Programming Review
Functional Programming Review (continued)
Functional Updates Do Not Modify Structures
    The append() function reverses a list, adds a new element to the front, and returns the result reversed again, which appends the item. But it never modifies lst!
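The slide's own append() code did not survive extraction. As a minimal Python sketch of the behaviour the text describes, assuming append takes the list and the new element as its two arguments (that signature is an assumption), it could look like this:

```python
def append(lst, x):
    """Return a new list with x at the end; lst itself is left untouched."""
    reversed_copy = lst[::-1]             # slicing builds a new, reversed list
    with_new_front = [x] + reversed_copy  # put x at the front of the reversed copy
    return with_new_front[::-1]           # reverse again so x ends up last


original = [1, 2, 3]
extended = append(original, 4)
```

After this runs, extended holds the four-element list while original still holds only its three elements, which is the point of a functional update.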
Functions Can Be Used As Arguments
    It does not matter what f does to its argument; DoDouble() will do it twice. What is the type of this function?
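The DoDouble() definition itself is missing from the extracted slide. A hedged Python sketch, assuming it takes a function f and a value x and applies f twice (the name do_double and the parameter names are illustrative), suggests an answer to the type question: because the output of f is fed straight back into f, the argument and result types of f have to agree.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def do_double(f: Callable[[T], T], x: T) -> T:
    """Apply f to x, then apply f again to the result."""
    return f(f(x))


def add_three(n: int) -> int:
    """Example argument: any T -> T function will do."""
    return n + 3


seven = do_double(add_three, 1)
```

Under that assumption the SML-style type would be ('a -> 'a) * 'a -> 'a, which the Python annotations mirror with a single type variable T.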
Map
Fold
fold left vs. fold right
    SML Implementation:
        fun foldl f a []      = a
          | foldl f a (x::xs) = foldl f (f(x, a)) xs

        fun foldr f a []      = a
          | foldr f a (x::xs) = f(x, (foldr f a xs))
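To see how the two SML definitions differ in practice, here is a small Python transliteration. It is a sketch, not the lecture's code: the loops replace the recursion, and the helper cons is a made-up name used only to make the traversal order visible.

```python
def foldl(f, a, xs):
    """Left fold: computes f(xn, ... f(x2, f(x1, a)))."""
    for x in xs:
        a = f(x, a)
    return a


def foldr(f, a, xs):
    """Right fold: computes f(x1, f(x2, ... f(xn, a)))."""
    for x in reversed(xs):
        a = f(x, a)
    return a


def cons(x, acc):
    """Put x at the front of acc without modifying acc."""
    return [x] + acc


left = foldl(cons, [], [1, 2, 3])
right = foldr(cons, [], [1, 2, 3])
```

Folding cons from the left reverses the list, while folding from the right rebuilds it in the original order. For an operation such as addition, where order does not matter, both folds give the same result.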
Example
Example (Solved)
A More Complicated Fold Problem
A More Complicated Map Problem
map Implementation
        fun map f []      = []
          | map f (x::xs) = (f x) :: (map f xs)
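For readers less familiar with SML pattern matching, the same two clauses can be sketched in Python. The name my_map is illustrative, chosen to avoid shadowing Python's built-in map.

```python
def my_map(f, xs):
    """Recursive map mirroring the two SML clauses."""
    if not xs:                      # clause 1: map f [] = []
        return []
    head, tail = xs[0], xs[1:]      # clause 2: split the list into x :: xs
    return [f(head)] + my_map(f, tail)


squares = my_map(lambda n: n * n, [1, 2, 3])
```

Each call to f depends only on its own element, never on the result for another element. In everyday Python the built-in map or a list comprehension would be the usual choice, since deep recursion can hit the interpreter's recursion limit.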
Implicit Parallelism In map
MapReduce
Motivation: Large Scale Data Processing
MapReduce
Programming Model
map
reduce
Parallelism
Example: Count word occurrences
    map(String input_key, String input_value):
      // input_key: document name
      // input_value: document contents
      for each word w in input_value:
        EmitIntermediate(w, "1");

    reduce(String output_key, Iterator intermediate_values):
      // output_key: a word
      // intermediate_values: a list of counts
      int result = 0;
      for each v in intermediate_values:
        result += ParseInt(v);
      Emit(AsString(result));
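The pseudocode above can be exercised on one machine with a short Python sketch. The map_fn and reduce_fn functions follow the slide. The run_job driver, which groups intermediate values by key, is a hypothetical stand-in for the framework's shuffle step and not part of any real MapReduce API. Splitting on whitespace is also an assumption about what counts as a word.

```python
from collections import defaultdict


def map_fn(input_key, input_value):
    """input_key: document name; input_value: document contents."""
    for word in input_value.split():
        yield (word, "1")            # mirrors EmitIntermediate(w, "1")


def reduce_fn(output_key, intermediate_values):
    """output_key: a word; intermediate_values: an iterable of count strings."""
    result = 0
    for v in intermediate_values:
        result += int(v)             # mirrors ParseInt(v)
    return str(result)               # mirrors Emit(AsString(result))


def run_job(documents):
    """Hypothetical single-process driver: map, group by key, then reduce."""
    grouped = defaultdict(list)
    for name, contents in documents.items():
        for key, value in map_fn(name, contents):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


counts = run_job({"a.txt": "the quick fox", "b.txt": "the lazy dog the end"})
```

Here counts maps each distinct word to its total as a string, matching the string-typed values in the pseudocode.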
Example vs. Actual Source Code
Locality
Fault Tolerance
Optimizations
    Why is it safe to redundantly execute map tasks? Wouldn’t this mess up the total computation?
Optimizations
    Under what conditions is it sound to use a combiner?
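The slide leaves the combiner question open. One commonly given answer is that a combiner is sound when the reduce operation is associative and commutative, as addition is, so partial results can be merged in any grouping. The Python sketch below illustrates that for word counting. The function names and sample data are made up for the illustration.

```python
from collections import Counter


def combine(pairs):
    """Hypothetical combiner: pre-sum the counts emitted by one map task."""
    partial = Counter()
    for word, count in pairs:
        partial[word] += count
    return list(partial.items())


def reduce_counts(pairs):
    """Sum all counts per word, as the word-count reduce does."""
    total = Counter()
    for word, count in pairs:
        total[word] += count
    return dict(total)


# Output of two separate map tasks (one (word, 1) pair per occurrence).
task_a = [("the", 1), ("fox", 1), ("the", 1)]
task_b = [("the", 1), ("dog", 1)]

without_combiner = reduce_counts(task_a + task_b)
with_combiner = reduce_counts(combine(task_a) + combine(task_b))
```

Both paths produce the same totals, but the combined path sends fewer pairs to the reducer. An operation such as the arithmetic mean would not pass this check: averaging the per-task means of [1, 2] and [3] gives 2.25, while the mean of all three values is 2.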
MapReduce Conclusions
Next Time...