Newyorksys.com
An Overview of Hadoop
Hadoop is an open-source tool that can be used
effectively to process huge volumes of data. It
works in a distributed computing environment. Hadoop is
one of the best solutions for addressing the challenges of big
data.
Newyorksys has the best trainers, who provide the best
online training for Hadoop using state-of-the-art
training methodologies.
Agenda
 What is Hadoop.
 Why do we need Hadoop.
 How Hadoop works.
 HDFS Architecture.
 What is Map – Reduce.
 Hadoop Cluster.
 Hadoop Processes.
 Topology of a Hadoop Cluster.
 Distinction of the Hadoop Framework.
 Prerequisites to learn Hadoop.
What is Hadoop
 Hadoop is an open-source framework.
 Developed by Apache Software Foundation.
 Used for distributed processing of large data sets.
 It works across clusters of computers using a simple
programming model (MapReduce).
Why do we need Hadoop
 Data is growing faster.
 We need to process multiple petabytes of data.
 Traditional applications slow down as data volumes
grow.
 The number of machines in a cluster is not constant.
 Failure is expected, rather than exceptional.
How Hadoop Works
 The Hadoop core consists of two modules:
 Hadoop Distributed File System (HDFS) [Storage].
 MapReduce [Processing].
Mapper
Reducer
HDFS Architecture
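The storage side can be illustrated with a small sketch. It assumes that HDFS stores a file as fixed-size blocks and keeps several copies of each block on different DataNodes. The block size, the replication factor of 3, the node names, and the round-robin placement are illustrative assumptions only; real HDFS uses its own placement policy.

```python
def split_into_blocks(file_size, block_size):
    """Split a file of `file_size` units into (index, length) blocks."""
    blocks = []
    offset = 0
    index = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((index, length))
        offset += length
        index += 1
    return blocks


def place_replicas(blocks, datanodes, replication=3):
    """Assign each block to `replication` distinct DataNodes.

    Round-robin placement is a simplification chosen for this sketch;
    it is not the placement policy HDFS actually uses.
    """
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication")
    placement = {}
    for index, _length in blocks:
        placement[index] = [
            datanodes[(index + r) % len(datanodes)] for r in range(replication)
        ]
    return placement


# Hypothetical example: a 300 MB file, 128 MB blocks, four DataNodes.
blocks = split_into_blocks(300, 128)
placement = place_replicas(blocks, ["dn1", "dn2", "dn3", "dn4"])
```

Here the file becomes three blocks (128, 128, and 44 MB), and each block ends up on three of the four hypothetical nodes.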
What is Map – Reduce
 MapReduce plays a key role in the Hadoop framework.
 MapReduce is a programming model for writing
applications that rapidly process large amounts of data.
 Mapper – a function that processes input data to
generate intermediate output data.
 Reducer – merges the intermediate data from all
mappers and generates the final output data.
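The mapper and reducer roles described above can be sketched as a word count that runs in a single Python process. This is only an illustration of the programming model, not Hadoop's API: the function names `mapper`, `shuffle`, `reducer`, and `run_job` are hypothetical, and in a real cluster the framework performs the grouping step between the two phases and runs the functions on many machines.

```python
from collections import defaultdict


def mapper(line):
    """Process one input line and emit intermediate (word, 1) pairs."""
    for word in line.lower().split():
        yield (word, 1)


def shuffle(pairs):
    """Group intermediate values by key (done by the framework in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Merge all intermediate values for one key into a final result."""
    return (key, sum(values))


def run_job(lines):
    """Run the map, shuffle, and reduce phases in sequence."""
    intermediate = [pair for line in lines for pair in mapper(line)]
    grouped = shuffle(intermediate)
    return dict(reducer(key, values) for key, values in grouped.items())


counts = run_job(["big data", "big cluster"])
# counts == {"big": 2, "data": 1, "cluster": 1}
```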
Hadoop Cluster
 A Hadoop cluster consists of multiple machines, which
can be classified into 3 types:
 NameNode
 Secondary NameNode
 DataNode
Hadoop Processes
 The following daemons (processes) run in a
cluster:
NameNode (runs on a master machine)
JobTracker (runs on a master machine)
DataNode (runs on slave machines)
TaskTracker (runs on slave machines)
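The daemon placement listed above can be written down as a small lookup table. This is a sketch for illustration only: the hostnames are hypothetical, and the function does not start or inspect any real Hadoop process.

```python
# Which daemons run on which kind of machine, as listed on the slide.
DAEMONS_BY_ROLE = {
    "master": ["NameNode", "JobTracker"],
    "slave": ["DataNode", "TaskTracker"],
}


def daemons_for_cluster(machines):
    """Map each hostname to the daemons expected for its role.

    `machines` is a dict of hostname -> role ("master" or "slave").
    """
    return {host: list(DAEMONS_BY_ROLE[role]) for host, role in machines.items()}


# Hypothetical three-machine cluster.
layout = daemons_for_cluster(
    {"node-a": "master", "node-b": "slave", "node-c": "slave"}
)
```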
Topology of a Hadoop Cluster
Distinction
 Simple – Hadoop allows users to quickly write efficient
parallel code.
 Reliable – Because Hadoop runs on commodity
hardware, it can face frequent failures. Hadoop handles
such failures automatically.
 Scalable – We can increase or decrease the number of
nodes (machines) in a Hadoop cluster.
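The reliability point can be illustrated with a sketch of what happens to stored blocks when one node fails. It assumes that every block has copies on several nodes and that lost copies are re-created on surviving nodes. The function name, node names, and the "first available node" choice are assumptions made for this example, not Hadoop's actual recovery logic.

```python
def handle_datanode_failure(placement, failed, live_nodes, replication=3):
    """Return a new block placement after the node `failed` is lost.

    `placement` maps block id -> list of nodes holding a copy.
    Lost copies are re-created on live nodes that do not hold the block yet.
    """
    new_placement = {}
    for block, nodes in placement.items():
        survivors = [node for node in nodes if node != failed]
        candidates = [
            node for node in live_nodes if node != failed and node not in survivors
        ]
        # Top the block back up to the target number of copies if possible.
        while len(survivors) < replication and candidates:
            survivors.append(candidates.pop(0))
        new_placement[block] = survivors
    return new_placement


# Hypothetical cluster of four nodes; "dn2" fails.
before = {0: ["dn1", "dn2", "dn3"], 1: ["dn2", "dn3", "dn4"]}
after = handle_datanode_failure(before, "dn2", ["dn1", "dn3", "dn4"])
# after == {0: ["dn1", "dn3", "dn4"], 1: ["dn3", "dn4", "dn1"]}
```

No data is lost in this example because every block still had two copies on the surviving nodes when the failure occurred.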
Prerequisites
 Linux or another Unix-like operating system (Mac
OS, Red Hat, Ubuntu)
 Java 1.6 or higher
 Disk space (to hold HDFS data and its replicas)
 RAM (2 GB recommended)
 A cluster of computers.
 You can even install Hadoop on a single machine.
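The disk space requirement can be estimated with simple arithmetic: because HDFS keeps several copies of the data, the raw disk needed is roughly the data size multiplied by the number of copies. The sketch below assumes a replication factor of 3 and adds an optional `headroom` parameter for temporary and intermediate files; both values are assumptions to adjust for your own cluster.

```python
def raw_disk_needed_gb(data_gb, replication=3, headroom=0.0):
    """Estimate the raw disk (in GB) needed to hold `data_gb` of HDFS data.

    `replication` is the number of copies kept of each block.
    `headroom` is an extra fraction reserved for temporary files
    (a hypothetical planning parameter, e.g. 0.25 for 25%).
    """
    return data_gb * replication * (1.0 + headroom)


# 100 GB of data with three copies needs about 300 GB of raw disk.
estimate = raw_disk_needed_gb(100)
```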
Newyorksys.com
 NewyorkSys is one of the leading training and
consulting companies in the US. We have certified trainers.
We provide online training and fast-track online
training with job assistance. We offer
excellent training in all courses. We also help you with
resume preparation and provide job assistance until you
get a job.
For more details, visit: http://www.newyorksys.com
15 Roaring Brook Rd, Chappaqua, NY 10514.
USA: +1-718-313-0499 & 718-305-1757
Email: enquiry@newyorksys.us
Newyorksys.com
The End
