Hadoop white papers
Msquare Systems Inc.
What is Hadoop?

Apache Hadoop is an open source project governed by the Apache Software Foundation (ASF) that allows you to gain
insight from massive amounts of structured and unstructured data quickly and without significant investment.

Hadoop is designed to run on commodity hardware and can scale up or down without system interruption. It consists
of three main functions: storage, processing and resource management.
Core services on Hadoop

MapReduce: MapReduce is a framework for writing applications that process large amounts of structured and unstructured data in
parallel across a cluster of machines in a reliable, fault-tolerant manner.
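
To make the map, shuffle, and reduce steps concrete, here is a minimal single-process sketch of a word count in Python. It only imitates the programming model; it is not Hadoop's API, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical. On a real cluster the framework would run many map and reduce tasks in parallel on different machines and rerun failed ones.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

Because each line is mapped independently and each key is reduced independently, both phases can be split across machines without changing the result.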

HDFS: The Hadoop Distributed File System is a Java-based file system that provides scalable and reliable data storage across large
clusters.
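
One way to picture how HDFS gets scalability and reliability is that a file is cut into large blocks and each block is stored on several machines. The sketch below assumes the common defaults of a 128 MB block size and a replication factor of 3. The plan_blocks function and the round-robin placement are hypothetical simplifications; real HDFS uses a rack-aware placement policy.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # assumed default block size (128 MB)
REPLICATION = 3                 # assumed default replication factor


def plan_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return a list of (block_index, [nodes holding a replica]).

    Round-robin placement is a simplification for illustration only.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    n_blocks = math.ceil(file_size / block_size)
    plan = []
    for i in range(n_blocks):
        # Pick `replication` distinct nodes, starting at a rotating offset.
        replicas = [nodes[(i + r) % len(nodes)] for r in range(replication)]
        plan.append((i, replicas))
    return plan


if __name__ == "__main__":
    for block, replicas in plan_blocks(300 * 1024 * 1024, ["n1", "n2", "n3", "n4"]):
        print(block, replicas)
```

With every block held on three different nodes, losing a single machine does not lose data, and adding machines adds capacity.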

Hadoop YARN: YARN is a next-generation framework for Hadoop data processing that extends MapReduce capabilities by supporting non-MapReduce workloads associated with other programming models.

Apache Tez: Tez generalizes the MapReduce paradigm to a more powerful framework for executing a complex DAG (directed acyclic
graph) of tasks for near-real-time big data processing.
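
The idea of running a DAG of tasks can be sketched with Python's standard graphlib module (Python 3.9 or later). This is not the Tez API; the task names and the execution_order helper are hypothetical. The sketch only shows how dependencies between tasks determine a valid execution order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job: each task maps to the set of tasks it depends on.
dag = {
    "load_logs": set(),
    "load_users": set(),
    "join": {"load_logs", "load_users"},
    "aggregate": {"join"},
    "write_report": {"aggregate"},
}


def execution_order(graph):
    """Return one valid order in which the tasks can run."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(dag))
```

Tasks with no dependency on each other, such as the two load steps here, could run at the same time, which is where a DAG engine has more freedom than a fixed map-then-reduce pipeline.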
Hadoop Data Services

Apache Pig: A platform for processing and analyzing large data sets.

Apache HBase: A column-oriented NoSQL data storage system that provides random, real-time read/write access to big data for user
applications.
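
A toy in-memory table can illustrate what random read/write access by row key to column-oriented data looks like. The TinyTable class below is hypothetical and is not the HBase client API. It assumes columns are named in a "family:qualifier" style and keeps everything in a Python dictionary.

```python
from collections import defaultdict


class TinyTable:
    """Toy table: row key -> "family:qualifier" column name -> value."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        """Write (or overwrite) a single cell."""
        self._rows[row_key][column] = value

    def get(self, row_key, column=None):
        """Read one cell, or the whole row when no column is given."""
        row = self._rows.get(row_key, {})
        return dict(row) if column is None else row.get(column)


if __name__ == "__main__":
    table = TinyTable()
    table.put("user1", "info:name", "Ada")
    table.put("user1", "stats:visits", 3)
    print(table.get("user1", "info:name"))
    print(table.get("user1"))
```

Rows do not need to share the same columns, so sparse data costs nothing for the cells that are absent.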

Apache Hive: Built on the MapReduce framework, Hive is a data warehouse that enables easy data summarization and ad hoc queries via a
SQL-like interface for large datasets stored in HDFS.

Apache Flume: Efficiently aggregates and moves large amounts of log data from many different sources into Hadoop.

Apache Mahout: Mahout provides scalable machine learning algorithms for Hadoop, supporting data science tasks such as clustering, classification,
and batch-based collaborative filtering.

Apache Accumulo: Accumulo is a high-performance data storage and retrieval system with cell-level access control. It is a scalable
implementation of Google’s Bigtable design that works on top of Apache Hadoop and Apache ZooKeeper.
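
Cell-level access control means that each stored value carries its own security label and a reader sees only the cells the reader is authorized for. The sketch below is a simplification and not the Accumulo API. The Cell class and scan function are hypothetical, and a cell here lists labels that must all be held, whereas Accumulo's visibility expressions can also combine labels with OR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: str
    column: str
    value: str
    required_labels: frozenset  # labels a reader must ALL hold (simplified)


def scan(cells, authorizations):
    """Return only the cells whose required labels the reader holds."""
    auths = frozenset(authorizations)
    return [c for c in cells if c.required_labels <= auths]


if __name__ == "__main__":
    cells = [
        Cell("p1", "name", "Ada", frozenset()),
        Cell("p1", "diagnosis", "flu", frozenset({"medical"})),
        Cell("p1", "invoice", "paid", frozenset({"medical", "billing"})),
    ]
    print([c.column for c in scan(cells, {"medical"})])
```

The filter is applied per cell rather than per table or per row, so two readers scanning the same row can get different results.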

Apache Storm: Storm is a distributed real-time computation system for processing fast, large streams of data, adding reliable real-time data
processing capabilities to Apache Hadoop 2.x.

Apache HCatalog: A table and metadata management service that provides a centralized way for data processing systems to understand the
structure and location of the data stored within Apache Hadoop.

Apache Sqoop: Sqoop is a tool that speeds and eases the movement of data into and out of Hadoop. It provides reliable parallel loading for
various popular enterprise data sources.
Hadoop Operational Services

Apache ZooKeeper: A highly available system for coordinating distributed processes.

Apache Falcon: Falcon is a data management framework for simplifying data lifecycle management and processing pipelines on Apache
Hadoop.

Apache Ambari: An open source system for installation, lifecycle management, administration, and monitoring of Apache Hadoop clusters.

Apache Knox: The Knox gateway is a system that provides a single point of authentication and access for Apache Hadoop services in a cluster.

Apache Oozie: Oozie is a Java web application used to schedule Apache Hadoop jobs. It combines multiple jobs sequentially into one logical
unit of work.
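
Combining several jobs sequentially into one logical unit of work can be sketched as a small runner that executes steps in order and stops at the first failure. This is only an illustration of the idea, not Oozie's configuration format (real Oozie workflows are defined in XML). The run_workflow function and the job names are hypothetical.

```python
def run_workflow(actions):
    """Run (name, callable) steps in order; stop at the first failure.

    Each callable returns True on success and False on failure.
    Returns (names of completed steps, name of the failed step or None).
    """
    completed = []
    for name, action in actions:
        if not action():
            return completed, name
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    workflow = [
        ("ingest", lambda: True),
        ("transform", lambda: True),
        ("export", lambda: True),
    ]
    print(run_workflow(workflow))
```

Treating the chain as one unit means a later job never runs on the output of an earlier job that failed.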
What Hadoop can and can't do

What Hadoop can't do
You can't use Hadoop for
• Structured data that is better served by a relational database
• Transactional data

What Hadoop can do
You can use Hadoop for
• Big Data
Support & Partner
Getting Started or Support:

Muthu Natarajan

muthu.n@msquaresystems.com

www.msquaresystems.com

Phone: 212-941-6000
