Analytics on your data in place
Steve Watt, Red Hat

CC flickr Barta IV

@wattsteve
Hadoop at Red Hat

But tonight I have my community hat on

CC flickr wcdumonts

Hadoop in 2007
Platform Layers and Technologies
- Computational Runtimes: MapReduce, HBase
- FileSystems: HDFS or Amazon S3
- Infrastructures: x86 or Amazon EC2

CC flickr wwarby

Hadoop in 2013
Platform Layers and Technologies
- Computational Runtimes: YARN, Giraph, MapReduce, HBase, Phoenix, Spark/BDAS, Drill, Impala, Stinger
- FileSystems: HDFS + 13 other Hadoop FileSystems
- Infrastructures: System on a Chip, x86, Virtualization and Cloud

CC flickr lowfatbrains

Observation #1: The Hadoop FileSystem Interface is
the keystone of the entire Ecosystem

CC flickr grufnik

Observation #2: Moving data around just to analyze it
is slow and expensive, especially if it requires a redundant
repository.

CC flickr traftery

So how does this work?
By leveraging Hadoop’s pluggable FileSystem architecture
- Hadoop FS Clients: MapReduce, HBase, YARN, Any Application
- Hadoop FileSystem Interface
- Hadoop FileSystem Plugin
- Hadoop FileSystem
- FileSystem Implementation
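
The pluggable arrangement on this slide can be illustrated with a small Python model. This is only a sketch of the idea, not Hadoop's Java API: the names `FileSystem`, `InMemoryFS`, `register`, `get_filesystem` and `word_count` are hypothetical, and the in-memory stores stand in for real storage backends. It shows a client written against one interface, with the backing implementation chosen by the URI scheme.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class FileSystem(ABC):
    """Hypothetical stand-in for the Hadoop FileSystem interface."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        """Return the full contents of the file at `path`."""


class InMemoryFS(FileSystem):
    """Toy plugin: serves files from a dict instead of a real store."""

    def __init__(self, files):
        self._files = dict(files)

    def read(self, path):
        return self._files[path]


# Maps a URI scheme to a factory that builds the matching plugin.
REGISTRY = {}


def register(scheme, factory):
    REGISTRY[scheme] = factory


def get_filesystem(uri):
    """Pick the plugin for a URI based on its scheme."""
    scheme = urlparse(uri).scheme
    if scheme not in REGISTRY:
        raise ValueError(f"no filesystem plugin registered for {scheme!r}")
    return REGISTRY[scheme]()


def word_count(uri):
    """A client: it only uses the interface, never a concrete plugin."""
    fs = get_filesystem(uri)
    return len(fs.read(urlparse(uri).path).split())


# Two stores behind the same interface (contents are made-up sample data).
register("hdfs", lambda: InMemoryFS({"/logs/a.txt": b"one two three"}))
register("glusterfs", lambda: InMemoryFS({"/logs/a.txt": b"four five"}))
```

The client function `word_count` is unchanged whichever scheme is used, which is the property the slide attributes to MapReduce, HBase, YARN and other applications.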

Hadoop FileSystem Configuration for HDFS

- Hadoop FS Clients: MapReduce, HBase, YARN, Any Application
- Hadoop FileSystem Interface
- HDFS Plugin
- Hadoop FileSystem
- HDFS

What are some examples of where big
data is stored?
- Object Stores
- NoSQL Stores
- Distributed FileSystems
- Network Filers
- Databases

CC flickr birdwatcher63

Network Filer Example
Hadoop FileSystem Configuration for GlusterFS
- Hadoop FS Clients: MapReduce, HBase, YARN, Any Application
- Hadoop FileSystem Interface
- GlusterFS Plugin
- Hadoop FileSystem
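
Switching the backing store is largely a configuration change. The sketch below models that in Python under stated assumptions: `resolve_impl` is a hypothetical helper, the dicts stand in for Hadoop configuration files, and the GlusterFS class name is a placeholder rather than the plugin's real class. The keys follow Hadoop's `fs.defaultFS` and `fs.<scheme>.impl` naming.

```python
from urllib.parse import urlparse


def resolve_impl(conf, path):
    """Return (scheme, implementation name) that would serve `path`.

    A path without a scheme falls back to the scheme of fs.defaultFS.
    """
    scheme = urlparse(path).scheme or urlparse(conf["fs.defaultFS"]).scheme
    key = f"fs.{scheme}.impl"
    if key not in conf:
        raise KeyError(f"no implementation configured under {key}")
    return scheme, conf[key]


# Configuration for the HDFS case (host and port are made-up examples).
hdfs_conf = {
    "fs.defaultFS": "hdfs://namenode:8020",
    "fs.hdfs.impl": "org.apache.hadoop.hdfs.DistributedFileSystem",
}

# Configuration for the GlusterFS case. The class name is a placeholder;
# take the real value from the plugin's own documentation.
gluster_conf = {
    "fs.defaultFS": "glusterfs:///",
    "fs.glusterfs.impl": "example.GlusterFileSystemPlugin",
}
```

With either configuration, a client that asks for `/logs/a.txt` is served by whichever implementation the default filesystem points to, so the application code does not change.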

Network Filer - Apache Hadoop on GlusterFS
[Diagram. Labels: Hadoop Master Services (Resource Manager, Management Server); Hadoop Workers (three Node Managers); a plugin and a FUSE mount on each Hadoop node; SWIFT and NFS; GlusterFS Trusted Peers with DAS Bricks on Server 1, Server 2, ... Server 50]

Object Store Example
Hadoop FileSystem Configuration for SWIFT
- Hadoop FS Clients: MapReduce, HBase, YARN, Any Application
- Hadoop FileSystem Interface
- SWIFT Plugin
- Hadoop FileSystem
- SWIFT

NoSQL Example
Hadoop FileSystem Configuration for CassandraFS
- Hadoop FS Clients: MapReduce, HBase, YARN, Any Application
- Hadoop FileSystem Interface
- CassandraFS Plugin
- Hadoop FileSystem

NoSQL - Apache Hadoop on CassandraFS

We are working on filesystem tests within
Apache Hadoop-Common and Apache Bigtop,
as well as opening up ecosystem tools

CC flickr syume

Closing Remarks
1. The number of Hadoop FileSystems available
to you continues to increase
2. This is good! A vibrant ecosystem gives you
choice
3. Evaluate the option of analyzing your data in
place before deploying new environments

CC flickr zoomboy1


Steve Watt, Chief Architect, Hadoop and Big Data, Red Hat - 21st BDL meetup


Editor's notes

  1. Now let's look at some examples