Map Reduce introduction
Muralidharan Deenathayalan
An introduction to Map-Reduce programs
1.
Confidential, Copyright © Quanticate
Introduction to Map-Reduce
Muralidharan Deenathayalan, Technical Lead
Muralidharan.deenathayalan@quanticate.com
The Apache logo is a trademark of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.
2.
Agenda
• What is Map-Reduce?
• Map-Reduce architecture
• Advantages of Map-Reduce
• Frameworks available for writing Map-Reduce programs
• WordCount – Map-Reduce program explained
• How to compile a Map-Reduce program using Eclipse
• How to deploy a Map-Reduce program
• How to run a Map-Reduce program
• Q & A
3.
Who am I?
• 7+ years of experience in Microsoft technologies such as ASP.NET, C#, SQL Server and SharePoint
• 2+ years of experience in open source technologies such as Java, Alfresco and Apache Cassandra
• Author of the Apache Cassandra Cookbook (in progress)
• Csharpcorner MVP
• Frequent blogger
4.
What is Map-Reduce?
• Often called a Map-R program
• MapReduce = Map() + Reduce()
• MapReduce is a programming model for processing large datasets in parallel, distributed across a cluster (divide and conquer).
5.
What is Map-Reduce?
• Map:
– Receives an input key/value pair
– Outputs intermediate key/value pairs
• Reduce:
– Receives an intermediate key together with all the values emitted for that key
– Outputs key/value pairs
[Diagram: Input Data → Map tasks → Reduce tasks]
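The two phases above can be sketched in plain Java without any Hadoop dependency. This is only an illustration of the data flow, not how Hadoop is implemented: the class name `MapReduceSketch` and the `map`, `shuffle`, `reduce` and `run` helpers are hypothetical, and everything runs in a single JVM.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapReduceSketch {

    // Map phase: one input record (offset, line) -> intermediate (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(new SimpleEntry<>(token, 1));
            }
        }
        return out;
    }

    // Shuffle: group intermediate values by key (a real framework does this between the phases).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce phase: one key plus all of its values -> one output (key, total) pair.
    static Map.Entry<String, Integer> reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return new SimpleEntry<>(key, sum);
    }

    // Runs map over every line, shuffles, then reduces each key.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> intermediate = new ArrayList<>();
        long offset = 0;
        for (String line : lines) {
            intermediate.addAll(map(offset, line));
            offset += line.length() + 1; // key of the next record: its character offset
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(intermediate).entrySet()) {
            Map.Entry<String, Integer> r = reduce(e.getKey(), e.getValue());
            result.put(r.getKey(), r.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {be=2, not=1, or=1, to=2}
        System.out.println(run(Arrays.asList("to be or not", "to be")));
    }
}
```

In a cluster, each call to `map` could run on a different machine, and the shuffle step would move data over the network so that all values for one key reach the same reducer.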
6.
Map-Reduce architecture overview
• Master node: runs the job tracker, which receives jobs from the user
• Slave nodes 1 to N: each runs a task tracker, which manages the workers on that node
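As a rough analogy for the master/worker split, the sketch below uses a thread pool in one JVM: the calling code plays the master that breaks a job into tasks, and the pool threads play the workers. The class `MasterWorkerSketch` and the method `countWords` are hypothetical names, and the sketch leaves out what a real job tracker and task tracker also deal with, such as scheduling across machines and recovering from failed nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MasterWorkerSketch {

    // Stands in for the master: creates one task per input split and collects the results.
    static int countWords(List<String> splits, int workerCount) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(workerCount);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (String split : splits) {
                // One task per split; whichever worker thread is free picks it up.
                Callable<Integer> task = () -> {
                    String trimmed = split.trim();
                    return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
                };
                pending.add(workers.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : pending) {
                total += f.get(); // blocks until that task has finished
            }
            return total;
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // prints 6
        System.out.println(countWords(Arrays.asList("a b c", "d e", "f"), 3));
    }
}
```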
7.
Advantages of Map-Reduce
Map-Reduce is well suited to workloads such as:
• Distributed pattern-based searching
• Distributed sorting
• Web access log analysis
• Machine learning
8.
Frameworks available for writing Map-Reduce
Courtesy & ©: http://blog.matthewrathbone.com/2013/01/05/a-quick-guide-to-hadoop-map-reduce-frameworks.html
• Java: Cascading, Crunch
• Clojure: Cascalog
• Scala: Scrunch, Scalding, Scoobi
• R: RHadoop
• Microsoft: .NET (C# / VB.NET)
• Special (high-level): Apache Hive, Apache Pig
• Ruby: Wukong, Cascading JRuby
• Python: MRJob, Dumbo, Hadoopy, Pydoop, Luigi
9.
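WordCount – Map-Reduce Program
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

// Mapper, nested inside the WordCount class: emits (word, 1) for every token of a line.
public static class Map extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    // Reused across calls to avoid allocating a new object per token.
    private static final IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    // key: byte offset of the line in the input file; value: the line itself.
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
        }
    }
}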
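One detail of the mapper worth seeing in isolation is how `StringTokenizer` splits a line. With its default delimiters it breaks only on whitespace, so punctuation and letter case stay part of the token and produce distinct keys. The small stand-alone class below (`TokenizeSketch` is a hypothetical name) runs the same loop without Hadoop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizeSketch {

    // Returns the tokens the mapper loop would emit as keys for one input line.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [Hello,, hello, world]
        System.out.println(tokens("Hello, hello\tworld"));
    }
}
```

If "Hello," and "hello" should count as the same word, the line would need to be normalised (for example lower-cased and stripped of punctuation) before tokenizing; the slide's mapper does not do this.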
10.
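WordCount – Map-Reduce Program
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

// Reducer, nested inside the WordCount class: sums the counts emitted for each word.
public static class Reduce extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    // key: a word; values: every count emitted for that word.
    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output,
                       Reporter reporter) throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
}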
11.
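WordCount – Map-Reduce Program
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

// Driver, inside the WordCount class: args[0] is the input path, args[1] the output path.
public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");

    // Types of the output key and value.
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(Map.class);
    // Summing is associative and commutative, so Reduce can also act as the combiner.
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    // Submit the job and wait for it to complete.
    JobClient.runJob(conf);
}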
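The driver registers `Reduce` as both combiner and reducer. A combiner pre-aggregates the output of a single map task before it is sent to the reducers, which is safe here because adding partial sums gives the same total as adding the individual 1s. The plain-Java sketch below illustrates that idea only; `CombinerSketch`, `combine` and `reduce` are hypothetical names, and nothing in it is a Hadoop API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CombinerSketch {

    // Local aggregation of one map task's output: word -> count within that task only.
    static Map<String, Integer> combine(List<String> wordsFromOneMapper) {
        Map<String, Integer> local = new TreeMap<>();
        for (String w : wordsFromOneMapper) {
            local.merge(w, 1, Integer::sum);
        }
        return local;
    }

    // Final aggregation: adds up the partial counts from all map tasks.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            for (Map.Entry<String, Integer> e : partial.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> mapper1 = Arrays.asList("to", "be", "to");
        List<String> mapper2 = Arrays.asList("be", "to");
        // mapper1 now ships 2 pairs (be=1, to=2) instead of 3 single-count pairs.
        Map<String, Integer> totals =
                reduce(Arrays.asList(combine(mapper1), combine(mapper2)));
        // prints {be=2, to=3}
        System.out.println(totals);
    }
}
```

Hadoop treats the combiner as an optimisation and may run it zero or more times, so a job has to produce the same result with or without it. That holds for sums and counts but not, for example, for a plain average.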
12.
How to compile a Map-Reduce program using Eclipse?
• Reference the Hadoop JAR files from your local disk (Maven is a simple alternative for managing them)
• In Eclipse, choose Project > Build Project
• Check that the Eclipse console shows no errors
13.
How to deploy a Map-Reduce program?
14.
How to run a Map-Reduce program?
15.
Summary
• What is Map-Reduce?
• Architecture of Map-Reduce
• Advantages of Map-Reduce
• Frameworks available for Map-Reduce
• WordCount – Map-Reduce program explained
• Compiling the WordCount Map-Reduce program using Eclipse
• Deploying a Map-Reduce program
• Executing a Map-Reduce program
16.
Q & A
17.
References
• http://en.wikipedia.org/wiki/MapReduce
• http://hortonworks.com
• http://hadoop.apache.org
18.
• Coding-Freaks.Net: www.codingfreaks.net
• Quanticate OPDev on Twitter: https://twitter.com/quanticateopdev
• Twitter: www.Twitter.com/muralidharand