SlideShare ist ein Scribd-Unternehmen logo
1 von 16
Downloaden Sie, um offline zu lesen
Introduction of Apache
Hadoop
Presenter: Prem Chand Mali, Mindfire Solutions
Date: 30/01/2014
About Me
SCJP/OCJP - Oracle Certified Java Programmer
MCP:70-480 - Specialist certification in HTML5
with JavaScript and CSS3 Exam
Skills : Java, Swings, Springs,
Hibernate, JavaFX, Jquery,
prototypeJS, ExtJS.
Connect Me :
https://www.facebook.com/prem.c.mali
http://www.linkedin.com/in/premmali
https://twitter.com/prem_mali
https://plus.google.com/106150245941317924019/about/p/pub
Contact Me :
premchandm@mindfiresolutions.com / prem.c.mali@gmail.com
mfsi_premchandm
Presenter: Prem Chand Mali, Mindfire Solutions
Agenda
History
What is Apache Hadoop
Why Apache Hadoop
HDFS
MapReduce
Q&A

Presenter: Prem Chand Mali, Mindfire Solutions
History
• Nutch Crawler based search
• GFS and Map Reduce paper published.
• Yahoo! hired Doug Cutting and given dedicated team.

Presenter: Prem Chand Mali, Mindfire Solutions
What is Apache Hadoop ?
• Apache Hadoop is an open-source software framework that supports dataintensive distributed applications licensed under the Apache v2 license. It supports
running applications on large clusters of commodity hardware.
• Hadoop are designed with a fundamental assumption that hardware failures (of
individual machines, or racks of machines) are common and thus should be
automatically handled in software by the framework.
• Apache Hadoop's MapReduce and HDFS components originally derived
respectively from Google's MapReduce and Google File System (GFS) papers.

Presenter: Prem Chand Mali, Mindfire Solutions
What is Apache Hadoop ?
• The Apache Hadoop framework is composed of the following modules :
– Hadoop Distributed File System (HDFS) - a distributed file-system that stores
data on the commodity machines, providing very high aggregate bandwidth
across the cluster.
– Hadoop MapReduce - a programming model for large scale data processing.
– Hadoop Common - contains libraries and utilities needed by other Hadoop
modules
– Hadoop YARN - a resource-management platform responsible for managing
compute resources in clusters and using them for scheduling of users'
applications.

Presenter: Prem Chand Mali, Mindfire Solutions
Why Apache Hadoop ?
• State of Data
– 90% of data in past three years.
– Type of data
• Unstructured
• Semi-structured
• Relational
– Relation world can handle GB of data.
• Distributed
• Scalable
• Flexible
• Fault tolerant
• Intelligent

Presenter: Prem Chand Mali, Mindfire Solutions
HDFS
• HDFS is the primary distributed storage used by Hadoop applications. It consist of
following two type of components.
– NameNode
– DataNode
• HDFS, is well suited for distributed storage and distributed processing using
commodity hardware.
• Hadoop supports shell-like commands to interact with HDFS directly.

Presenter: Prem Chand Mali, Mindfire Solutions
HDFS

Presenter: Prem Chand Mali, Mindfire Solutions
MapReduce
• MapReduce if combination of following three things.
– Map
– Shuffle
– Reduce
• It done it's job through Job Tracker and Task Tracker

Presenter: Prem Chand Mali, Mindfire Solutions
MapReduce

Presenter: Prem Chand Mali, Mindfire Solutions
MapReduce

Presenter: Prem Chand Mali, Mindfire Solutions
MapReduce

Presenter: Prem Chand Mali, Mindfire Solutions
Question and
Answer

Presenter: Prem Chand Mali, Mindfire Solutions
Thank you

Presenter: Prem Chand Mali, Mindfire Solutions
www.mindfiresolutions.com
https://www.facebook.com/MindfireSolutions
http://www.linkedin.com/company/mindfire-solutions
http://twitter.com/mindfires

Presenter: Prem Chand Mali, Mindfire Solutions

Weitere ähnliche Inhalte

Was ist angesagt?

Cloudera hadoop developer training
Cloudera hadoop developer trainingCloudera hadoop developer training
Cloudera hadoop developer trainingMagnific Trainings
 
Cloudera hadoop developer training
Cloudera hadoop developer trainingCloudera hadoop developer training
Cloudera hadoop developer trainingMagnific Trainings
 
SparkSpark in the Big Data dark by Sergey Levandovskiy
SparkSpark in the Big Data dark by Sergey Levandovskiy  SparkSpark in the Big Data dark by Sergey Levandovskiy
SparkSpark in the Big Data dark by Sergey Levandovskiy Lohika_Odessa_TechTalks
 
Hadoop_RealTime_Processing_eVenkat
Hadoop_RealTime_Processing_eVenkatHadoop_RealTime_Processing_eVenkat
Hadoop_RealTime_Processing_eVenkatVenkat Krishnan
 
Cloudera administrator training
Cloudera administrator trainingCloudera administrator training
Cloudera administrator trainingMagnific Trainings
 
Hadoop training and certification
Hadoop training and certificationHadoop training and certification
Hadoop training and certificationMagnific Trainings
 
Big data and hadoop training - Session 2
Big data and hadoop training  - Session 2Big data and hadoop training  - Session 2
Big data and hadoop training - Session 2hkbhadraa
 
Hadoop big data online training
Hadoop big data online trainingHadoop big data online training
Hadoop big data online trainingMagnific Trainings
 
Hadoop training and certification
Hadoop training and certificationHadoop training and certification
Hadoop training and certificationMagnific Trainings
 

Was ist angesagt? (17)

Big data analytics training
Big data analytics trainingBig data analytics training
Big data analytics training
 
Cloudera hadoop developer training
Cloudera hadoop developer trainingCloudera hadoop developer training
Cloudera hadoop developer training
 
Cloudera hadoop developer training
Cloudera hadoop developer trainingCloudera hadoop developer training
Cloudera hadoop developer training
 
Spark in the BigData dark
Spark in the BigData darkSpark in the BigData dark
Spark in the BigData dark
 
SparkSpark in the Big Data dark by Sergey Levandovskiy
SparkSpark in the Big Data dark by Sergey Levandovskiy  SparkSpark in the Big Data dark by Sergey Levandovskiy
SparkSpark in the Big Data dark by Sergey Levandovskiy
 
Hadoop cassandra training
Hadoop cassandra trainingHadoop cassandra training
Hadoop cassandra training
 
Hadoop_RealTime_Processing_eVenkat
Hadoop_RealTime_Processing_eVenkatHadoop_RealTime_Processing_eVenkat
Hadoop_RealTime_Processing_eVenkat
 
Hadoop online training usa
Hadoop online training usaHadoop online training usa
Hadoop online training usa
 
Hadppo training
Hadppo trainingHadppo training
Hadppo training
 
Bigdata slide
Bigdata slideBigdata slide
Bigdata slide
 
Cloudera administrator training
Cloudera administrator trainingCloudera administrator training
Cloudera administrator training
 
Hadoop training and certification
Hadoop training and certificationHadoop training and certification
Hadoop training and certification
 
Big data and hadoop training - Session 2
Big data and hadoop training  - Session 2Big data and hadoop training  - Session 2
Big data and hadoop training - Session 2
 
Hadoop big data online training
Hadoop big data online trainingHadoop big data online training
Hadoop big data online training
 
Hadoop training and certification
Hadoop training and certificationHadoop training and certification
Hadoop training and certification
 
Big data developer training
Big data developer trainingBig data developer training
Big data developer training
 
Hadoop online training usa
Hadoop online training usaHadoop online training usa
Hadoop online training usa
 

Andere mochten auch

Getting Started with Apache Jmeter
Getting Started with Apache JmeterGetting Started with Apache Jmeter
Getting Started with Apache JmeterMindfire Solutions
 
Filemaker design concept_handout
Filemaker design concept_handoutFilemaker design concept_handout
Filemaker design concept_handout정권 김
 
File_Organization_112014
File_Organization_112014File_Organization_112014
File_Organization_112014eshuppy
 
Filemaker design concept
Filemaker design conceptFilemaker design concept
Filemaker design concept정권 김
 

Andere mochten auch (6)

YSlow For QA
YSlow For QAYSlow For QA
YSlow For QA
 
Getting Started with Apache Jmeter
Getting Started with Apache JmeterGetting Started with Apache Jmeter
Getting Started with Apache Jmeter
 
Filemaker design concept_handout
Filemaker design concept_handoutFilemaker design concept_handout
Filemaker design concept_handout
 
File_Organization_112014
File_Organization_112014File_Organization_112014
File_Organization_112014
 
Ruby Metaprogramming
Ruby MetaprogrammingRuby Metaprogramming
Ruby Metaprogramming
 
Filemaker design concept
Filemaker design conceptFilemaker design concept
Filemaker design concept
 

Ähnlich wie Introduction to Apache Hadoop

The Nuts and Bolts of Hadoop and it's Ever-changing Ecosystem, Presented by J...
The Nuts and Bolts of Hadoop and it's Ever-changing Ecosystem, Presented by J...The Nuts and Bolts of Hadoop and it's Ever-changing Ecosystem, Presented by J...
The Nuts and Bolts of Hadoop and it's Ever-changing Ecosystem, Presented by J...NashvilleTechCouncil
 
Bringing Deep Learning into production
Bringing Deep Learning into production Bringing Deep Learning into production
Bringing Deep Learning into production Paolo Platter
 
Hadoop training kit from lcc infotech
Hadoop   training kit from lcc infotechHadoop   training kit from lcc infotech
Hadoop training kit from lcc infotechlccinfotech
 
Big data analytics_using_hadoop
Big data analytics_using_hadoopBig data analytics_using_hadoop
Big data analytics_using_hadoopKnowledgehut
 
Introduction to Data Analyst Training
Introduction to Data Analyst TrainingIntroduction to Data Analyst Training
Introduction to Data Analyst TrainingCloudera, Inc.
 
The Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On TimeThe Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On TimeInside Analysis
 
Hadoop 2.0-development
Hadoop 2.0-developmentHadoop 2.0-development
Hadoop 2.0-developmentKnowledgehut
 
Hadoop online training
Hadoop online trainingHadoop online training
Hadoop online trainingsrikanthhadoop
 
Webinar: Is Spark Hadoop's Friend or Foe?
Webinar: Is Spark Hadoop's Friend or Foe? Webinar: Is Spark Hadoop's Friend or Foe?
Webinar: Is Spark Hadoop's Friend or Foe? Zaloni
 
Best hadoop-online-training
Best hadoop-online-trainingBest hadoop-online-training
Best hadoop-online-trainingGeohedrick
 
Hadoop training-and-placement
Hadoop training-and-placementHadoop training-and-placement
Hadoop training-and-placementsofia taylor
 
Hadoop training-and-placement
Hadoop training-and-placementHadoop training-and-placement
Hadoop training-and-placementIqbal Patel
 
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC ResourcesmyHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC ResourcesSriram Krishnan
 
First cadd big data-hadoop course
First cadd big data-hadoop courseFirst cadd big data-hadoop course
First cadd big data-hadoop courseFirstCADD2014
 
Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014
Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014
Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014gmalouf678
 
9/2017 STL HUG - Back to School
9/2017 STL HUG - Back to School9/2017 STL HUG - Back to School
9/2017 STL HUG - Back to SchoolAdam Doyle
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3tcloudcomputing-tw
 

Ähnlich wie Introduction to Apache Hadoop (20)

The Nuts and Bolts of Hadoop and it's Ever-changing Ecosystem, Presented by J...
The Nuts and Bolts of Hadoop and it's Ever-changing Ecosystem, Presented by J...The Nuts and Bolts of Hadoop and it's Ever-changing Ecosystem, Presented by J...
The Nuts and Bolts of Hadoop and it's Ever-changing Ecosystem, Presented by J...
 
Bringing Deep Learning into production
Bringing Deep Learning into production Bringing Deep Learning into production
Bringing Deep Learning into production
 
Hadoop training kit from lcc infotech
Hadoop   training kit from lcc infotechHadoop   training kit from lcc infotech
Hadoop training kit from lcc infotech
 
Big data analytics_using_hadoop
Big data analytics_using_hadoopBig data analytics_using_hadoop
Big data analytics_using_hadoop
 
Introduction to Data Analyst Training
Introduction to Data Analyst TrainingIntroduction to Data Analyst Training
Introduction to Data Analyst Training
 
Hadoop 80hr v1.0
Hadoop 80hr v1.0Hadoop 80hr v1.0
Hadoop 80hr v1.0
 
The Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On TimeThe Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On Time
 
Hadoop 2.0-development
Hadoop 2.0-developmentHadoop 2.0-development
Hadoop 2.0-development
 
Hadoop online training
Hadoop online trainingHadoop online training
Hadoop online training
 
Webinar: Is Spark Hadoop's Friend or Foe?
Webinar: Is Spark Hadoop's Friend or Foe? Webinar: Is Spark Hadoop's Friend or Foe?
Webinar: Is Spark Hadoop's Friend or Foe?
 
Best hadoop-online-training
Best hadoop-online-trainingBest hadoop-online-training
Best hadoop-online-training
 
Hadoop training-and-placement
Hadoop training-and-placementHadoop training-and-placement
Hadoop training-and-placement
 
Hadoop training-and-placement
Hadoop training-and-placementHadoop training-and-placement
Hadoop training-and-placement
 
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC ResourcesmyHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
 
First cadd big data-hadoop course
First cadd big data-hadoop courseFirst cadd big data-hadoop course
First cadd big data-hadoop course
 
Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014
Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014
Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014
 
Resume_VipinKP
Resume_VipinKPResume_VipinKP
Resume_VipinKP
 
9/2017 STL HUG - Back to School
9/2017 STL HUG - Back to School9/2017 STL HUG - Back to School
9/2017 STL HUG - Back to School
 
MahoutNew
MahoutNewMahoutNew
MahoutNew
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
 

Mehr von Mindfire Solutions (20)

Physician Search and Review
Physician Search and ReviewPhysician Search and Review
Physician Search and Review
 
diet management app
diet management appdiet management app
diet management app
 
Business Technology Solution
Business Technology SolutionBusiness Technology Solution
Business Technology Solution
 
Remote Health Monitoring
Remote Health MonitoringRemote Health Monitoring
Remote Health Monitoring
 
Influencer Marketing Solution
Influencer Marketing SolutionInfluencer Marketing Solution
Influencer Marketing Solution
 
ELMAH
ELMAHELMAH
ELMAH
 
High Availability of Azure Applications
High Availability of Azure ApplicationsHigh Availability of Azure Applications
High Availability of Azure Applications
 
IOT Hands On
IOT Hands OnIOT Hands On
IOT Hands On
 
Glimpse of Loops Vs Set
Glimpse of Loops Vs SetGlimpse of Loops Vs Set
Glimpse of Loops Vs Set
 
Oracle Sql Developer-Getting Started
Oracle Sql Developer-Getting StartedOracle Sql Developer-Getting Started
Oracle Sql Developer-Getting Started
 
Adaptive Layout In iOS 8
Adaptive Layout In iOS 8Adaptive Layout In iOS 8
Adaptive Layout In iOS 8
 
Introduction to Auto-layout : iOS/Mac
Introduction to Auto-layout : iOS/MacIntroduction to Auto-layout : iOS/Mac
Introduction to Auto-layout : iOS/Mac
 
LINQPad - utility Tool
LINQPad - utility ToolLINQPad - utility Tool
LINQPad - utility Tool
 
Get started with watch kit development
Get started with watch kit developmentGet started with watch kit development
Get started with watch kit development
 
Swift vs Objective-C
Swift vs Objective-CSwift vs Objective-C
Swift vs Objective-C
 
Material Design in Android
Material Design in AndroidMaterial Design in Android
Material Design in Android
 
Introduction to OData
Introduction to ODataIntroduction to OData
Introduction to OData
 
Ext js Part 2- MVC
Ext js Part 2- MVCExt js Part 2- MVC
Ext js Part 2- MVC
 
ExtJs Basic Part-1
ExtJs Basic Part-1ExtJs Basic Part-1
ExtJs Basic Part-1
 
Spring Security Introduction
Spring Security IntroductionSpring Security Introduction
Spring Security Introduction
 

Kürzlich hochgeladen

[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 

Kürzlich hochgeladen (20)

[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 

Introduction to Apache Hadoop

  • 1. Introduction of Apache Hadoop Presenter: Prem Chand Mali, Mindfire Solutions Date: 30/01/2014
  • 2. About Me SCJP/OCJP - Oracle Certified Java Programmer MCP:70-480 - Specialist certification in HTML5 with JavaScript and CSS3 Exam Skills : Java, Swings, Springs, Hibernate, JavaFX, Jquery, prototypeJS, ExtJS. Connect Me : https://www.facebook.com/prem.c.mali http://www.linkedin.com/in/premmali https://twitter.com/prem_mali https://plus.google.com/106150245941317924019/about/p/pub Contact Me : premchandm@mindfiresolutions.com / prem.c.mali@gmail.com mfsi_premchandm Presenter: Prem Chand Mali, Mindfire Solutions
  • 3. Agenda History What is Apache Hadoop Why Apache Hadoop HDFS MapReduce Q&A Presenter: Prem Chand Mali, Mindfire Solutions
  • 4. History • Nutch Crawler based search • GFS and Map Reduce paper published. • Yahoo! hired Doug Cutting and given dedicated team. Presenter: Prem Chand Mali, Mindfire Solutions
  • 5. What is Apache Hadoop ? • Apache Hadoop is an open-source software framework that supports dataintensive distributed applications licensed under the Apache v2 license. It supports running applications on large clusters of commodity hardware. • Hadoop are designed with a fundamental assumption that hardware failures (of individual machines, or racks of machines) are common and thus should be automatically handled in software by the framework. • Apache Hadoop's MapReduce and HDFS components originally derived respectively from Google's MapReduce and Google File System (GFS) papers. Presenter: Prem Chand Mali, Mindfire Solutions
  • 6. What is Apache Hadoop ? • The Apache Hadoop framework is composed of the following modules : – Hadoop Distributed File System (HDFS) - a distributed file-system that stores data on the commodity machines, providing very high aggregate bandwidth across the cluster. – Hadoop MapReduce - a programming model for large scale data processing. – Hadoop Common - contains libraries and utilities needed by other Hadoop modules – Hadoop YARN - a resource-management platform responsible for managing compute resources in clusters and using them for scheduling of users' applications. Presenter: Prem Chand Mali, Mindfire Solutions
  • 7. Why Apache Hadoop ? • State of Data – 90% of data in past three years. – Type of data • Unstructured • Semi-structured • Relational – Relation world can handle GB of data. • Distributed • Scalable • Flexible • Fault tolerant • Intelligent Presenter: Prem Chand Mali, Mindfire Solutions
  • 8. HDFS • HDFS is the primary distributed storage used by Hadoop applications. It consist of following two type of components. – NameNode – DataNode • HDFS, is well suited for distributed storage and distributed processing using commodity hardware. • Hadoop supports shell-like commands to interact with HDFS directly. Presenter: Prem Chand Mali, Mindfire Solutions
  • 9. HDFS Presenter: Prem Chand Mali, Mindfire Solutions
  • 10. MapReduce • MapReduce if combination of following three things. – Map – Shuffle – Reduce • It done it's job through Job Tracker and Task Tracker Presenter: Prem Chand Mali, Mindfire Solutions
  • 11. MapReduce Presenter: Prem Chand Mali, Mindfire Solutions
  • 12. MapReduce Presenter: Prem Chand Mali, Mindfire Solutions
  • 13. MapReduce Presenter: Prem Chand Mali, Mindfire Solutions
  • 14. Question and Answer Presenter: Prem Chand Mali, Mindfire Solutions
  • 15. Thank you Presenter: Prem Chand Mali, Mindfire Solutions