Hadoop2 new and noteworthy SNIA conf
Sujee Maniyam
New features in Hadoop version 2
1.
Hadoop 2: New and Noteworthy
Sujee Maniyam, ElephantScale
sujee@ElephantScale.com
http://ElephantScale.com
2.
Hadoop 2: New and Noteworthy
© 2013 Storage Networking Industry Association. All Rights Reserved.
SNIA Legal Notice
- The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted.
- Member companies and individual members may use this material in presentations and literature under the following conditions:
  - Any slide or slides used must be reproduced in their entirety without modification
  - The SNIA must be acknowledged as the source of any material used in the body of any document containing material from these presentations.
- This presentation is a project of the SNIA Education Committee.
- Neither the author nor the presenter is an attorney and nothing in this presentation is intended to be, or should be construed as legal advice or an opinion of counsel. If you need legal advice or a legal opinion please contact your attorney.
- The information presented herein represents the author's personal opinion and current understanding of the relevant issues involved. The author, the presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information.
NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.
3.
Abstract
- Hadoop 2: New and Noteworthy Features
- This session will appeal to data center managers, development managers, and anyone looking for an overview of what's new in the Hadoop 2 platform. The session will highlight some of the notable features in Hadoop 2.
4.
Quick Poll
- How many of you are NEW to Hadoop?
- How many of you are USING Hadoop?
5.
Hadoop Timeline
6.
Hadoop Versions :)
7.
Hadoop Versions: Simplified
- Hadoop 1: 1.2.1 (Aug 2013)
- Hadoop 2: 2.2.0 (Oct 2013)
8.
Feature Matrix
Component  | Feature                    | v1 | v2
HDFS       | NameNode High Availability |    | X
HDFS       | NameNode federation        |    | X
HDFS       | Snapshots                  |    | X
HDFS       | NFS v3 access to HDFS      |    | X
HDFS       | Improved IO                |    | X
Processing | MapReduce v1               | X  |
Processing | YARN (MapReduce v2)        |    | X
Other      | Kerberos security          | X  | X
9.
NEXT
- NameNode High Availability
- Federation
- Snapshots
- NFS
- Improved IO
10.
HDFS Architecture (v1)
[Diagram: one Name Node and four Data Nodes]
11.
Name Node High Availability
- HDFS has (had) a ONE NameNode / many DataNodes design
- This makes the NameNode a single point of failure (SPOF)
12.
NameNode Is Very Important In A Cluster
13.
Is Hadoop NN Failure A Big Deal?
- A study at Yahoo
  - 18-month study
  - 22 failures on 25 clusters
  - 0.58 failures per cluster per year
  - Only half of them would have benefited from HA
  - → 0.23 failures / year / cluster
- http://www.slideshare.net/Hadoop_Summit/hdfs-namenode-high-availability
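The rates on this slide can be re-derived from the raw counts. The sketch below is only a back-of-the-envelope check using the figures quoted above; the variable names are made up for illustration. It computes failures per cluster per year (roughly 0.59, which the slide shows as 0.58) and works backwards from the quoted 0.23 rate to the number of failures it implies. Half of the 22 failures would correspond to about 0.29, so the 0.23 figure suggests somewhat fewer than half benefited.

```python
# Figures quoted on the slide (Yahoo study).
FAILURES = 22
CLUSTERS = 25
STUDY_YEARS = 18 / 12  # 18 months expressed in years


def failures_per_cluster_year(failures, clusters, years):
    """Average number of failures one cluster sees in one year."""
    return failures / (clusters * years)


overall = failures_per_cluster_year(FAILURES, CLUSTERS, STUDY_YEARS)
print(f"all NameNode failures: {overall:.2f} per cluster per year")

# Work backwards from the quoted 0.23 rate to the failure count it implies.
quoted_ha_rate = 0.23
implied_failures = quoted_ha_rate * CLUSTERS * STUDY_YEARS
print(f"failures implied by the 0.23 rate: about {implied_failures:.1f} of {FAILURES}")
```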
14.
Still Needs To Be Fixed
- Downtime may be acceptable for batch workloads
- But it is not acceptable for real-time workloads like HBase that depend on HDFS
  - Downtime (even minutes) is not acceptable
- Make Hadoop more enterprise friendly
15.
How Do We Fix A Single NameNode Failure?
- Have two NameNodes!
  - One ACTIVE and another PASSIVE
- When the active NN fails, the passive one takes over
- Failover can be automated
16.
HDFS Architecture (v1)
[Diagram: one Name Node and four Data Nodes]
17.
NameNode HA (v2)
[Diagram: Name Node 1 (active), Name Node 2 (passive), shared storage, and four Data Nodes]
18.
NameNode HA: Shared Storage
[Diagram: Name Node 1 (active), Name Node 2 (passive), a filer, and four Data Nodes]
- Option 1: external filer
- Option 2: Quorum Journal
19.
NameNode HA
- NameNode metadata is written to shared storage (an external filer or the Quorum Journal Manager)
- Only ONE active NN can write to shared storage
- The passive NN reads and replays metadata from shared storage
- When the active NN fails, the passive NN is promoted to active
  - Promotion can be manual or automatic
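The active/passive handoff described on this slide can be pictured with a toy model. This is a sketch for illustration only, not HDFS code: `SharedEditLog`, `NameNode`, and `fail_over` are hypothetical names, and the "shared storage" is just an in-memory list. It shows the three ideas from the bullets: a single writer, a passive node that replays edits, and promotion on failure.

```python
class SharedEditLog:
    """Stand-in for shared storage; only the active NameNode may write."""

    def __init__(self):
        self.edits = []
        self.writer = None  # name of the NameNode currently allowed to write

    def append(self, writer, edit):
        if writer != self.writer:
            raise PermissionError(f"{writer} is not the active NameNode")
        self.edits.append(edit)


class NameNode:
    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.namespace = {}  # path -> metadata
        self.applied = 0     # number of edits already applied locally

    def create(self, path, meta):
        """Active role: record the edit in shared storage, then apply it."""
        self.log.append(self.name, (path, meta))
        self.namespace[path] = meta
        self.applied = len(self.log.edits)

    def replay(self):
        """Passive role: catch up by replaying edits not yet applied."""
        for path, meta in self.log.edits[self.applied:]:
            self.namespace[path] = meta
        self.applied = len(self.log.edits)


def fail_over(log, standby):
    """Promote the standby: replay outstanding edits, then become the writer."""
    standby.replay()
    log.writer = standby.name


log = SharedEditLog()
nn1, nn2 = NameNode("nn1", log), NameNode("nn2", log)
log.writer = "nn1"
nn1.create("/data/a", {"replication": 3})
nn1.create("/data/b", {"replication": 3})

fail_over(log, nn2)  # nn1 is assumed to have failed at this point
nn2.create("/data/c", {"replication": 3})
print(sorted(nn2.namespace))
```

After the failover, a write attempted by `nn1` raises `PermissionError`, which is the toy equivalent of the "only ONE active NN can write" rule.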
20.
NameNode HA Setup
21.
NEXT
- NameNode High Availability
- Federation
- Snapshots
- NFS
- Improved IO
22.
NameNode Federation
- The NameNode stores metadata in memory
- For very large clusters, the NN could exhaust its memory
- Spread metadata over multiple NameNodes
23.
HDFS Federation
24.
HDFS Federation
- Now the namespace is divided
  - /hbase → NN1
  - /user → NN2
  - /hive → NN3
25.
HDFS Federation
- The namespace is partitioned, and each NameNode's portion has its own 'block pool'
- DataNodes are shared across the cluster
  - They store blocks for different pools
- DataNodes send heartbeats to all NNs
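A small model may help connect the two federation slides. The sketch below assumes a hypothetical mount table that mirrors the /hbase, /user, /hive example, and a toy `DataNode` that keeps a separate list of blocks per pool. None of these names come from Hadoop itself; they only illustrate the path-to-NameNode lookup and the sharing of DataNodes.

```python
# Hypothetical mount table mirroring the example on the slide.
MOUNT_TABLE = {
    "/hbase": "NN1",
    "/user": "NN2",
    "/hive": "NN3",
}


def resolve_namenode(path, mount_table=MOUNT_TABLE):
    """Return the NameNode owning `path`, using the longest matching mount point."""
    best = None
    for mount in mount_table:
        if path == mount or path.startswith(mount + "/"):
            if best is None or len(mount) > len(best):
                best = mount
    if best is None:
        raise KeyError(f"no NameNode is responsible for {path}")
    return mount_table[best]


class DataNode:
    """A DataNode shared by all NameNodes; blocks are kept per block pool."""

    def __init__(self):
        self.block_pools = {}  # NameNode name -> list of block ids

    def store(self, path, block_id):
        owner = resolve_namenode(path)
        self.block_pools.setdefault(owner, []).append(block_id)


dn = DataNode()
dn.store("/hbase/table1/region0", "blk_1")
dn.store("/user/alice/report.csv", "blk_2")
print(resolve_namenode("/hive/warehouse/sales"))
print(dn.block_pools)
```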
26.
NEXT
- NameNode High Availability
- Federation
- Snapshots
- NFS
- Improved IO
27.
HDFS Snapshots
- Wait, doesn't HDFS make replicas?
  - Yes
- But that doesn't save you from: hdfs dfs -rm -r /data
- The 'Trash' feature only works for CLI utilities
- You can delete files using the API... poof, gone
28.
HDFS Snapshots
- Recover from user errors and other disasters
- Periodic snapshots
  - E.g. daily backups, kept for 15 days
- Snapshotting is
  - Efficient (no data duplication, copy on write)
  - Fast
- Can snapshot part of the file system (not the whole thing)
- http://cdn.oreillystatic.com/en/assets/1/event/100/HDFS%20Snapshots%20and%20Beyond%20Presentation.pdf
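The "no data duplication" point can be illustrated with a toy directory in which a snapshot records only which blocks each file referred to at that moment. This is a simplified sketch, not how HDFS implements snapshots: `SnapshottableDir` is a hypothetical class, and it copies the whole name-to-blocks map at snapshot time purely for brevity. The block data itself is never copied.

```python
class SnapshottableDir:
    """Toy directory: file name -> tuple of block ids (blocks are immutable)."""

    def __init__(self):
        self.files = {}
        self.snapshots = {}

    def write(self, name, blocks):
        # Writing rebinds the name to a new tuple; existing tuples are never
        # mutated, so a snapshot that references them is unaffected.
        self.files[name] = tuple(blocks)

    def delete(self, name):
        del self.files[name]

    def create_snapshot(self, label):
        # Copies only the name -> blocks mapping (metadata), not block data.
        self.snapshots[label] = dict(self.files)

    def restore(self, label, name):
        self.files[name] = self.snapshots[label][name]


d = SnapshottableDir()
d.write("events.log", ["blk_1", "blk_2"])
d.create_snapshot("daily-1")
d.delete("events.log")              # the accidental delete
d.restore("daily-1", "events.log")  # recovered from the snapshot
print(d.files["events.log"])
```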
29.
NEXT
- NameNode High Availability
- Federation
- Snapshots
- NFS
- Improved IO
30.
NFS Access to HDFS
- HDFS is a userland file system
  - Not a kernel file system
- So most Linux programs cannot read/write data in HDFS directly
- We use the 'hdfs' command-line utilities
31.
NFS Access to HDFS
- Starting with Hadoop v2, HDFS supports access over the NFS protocol (NFS v3)
- NFS access goes through a gateway machine
32.
NEXT
- NameNode High Availability
- Federation
- Snapshots
- NFS
- Improved IO
33.
HDFS Improved IO
- Lots of performance fixes from v1 → v2
- Quick comparison: multi-threaded random read
  - HDFS v1: 264 MB/sec
  - HDFS v2: 1395 MB/sec (5x!)
- Source: http://www.slideshare.net/cloudera/hdfs-update-lipcon-federal-big-data-apache-hadoop-forum
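As a quick check of the "5x" figure, the ratio of the two quoted throughput numbers can be computed directly. The constant names below are illustrative only.

```python
# Multi-threaded random-read throughput quoted on the slide.
V1_MB_PER_SEC = 264
V2_MB_PER_SEC = 1395

speedup = V2_MB_PER_SEC / V1_MB_PER_SEC
print(f"v2 over v1: {speedup:.1f}x")
```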
34.
V2 Features
- HDFS
- Processing
  - YARN
35.
MapReduce V1
- MRV1 proved itself as a reliable batch processing framework!
- One Job Tracker (master) and many Task Trackers (workers)
36.
MapReduce Architecture
[Diagram: one Job Tracker and four Task Trackers]
37.
MRV1 Limitations
- Only supports one programming paradigm
  - Batch processing
- Alternate processing is hard (or not possible) to implement on top of MRV1
  - Real-time processing
  - In-memory data
38.
MRV1 Limitations
- Single Job Tracker (JT) → single point of failure
- JT failure kills all running jobs (and queued jobs)
- JT started to hit scalability limits on very large clusters
  - 4,000 nodes
39.
Looking Ahead
[Diagram: Hadoop v1 is HDFS plus MRV1, which does both (1) processing and (2) resource management; Hadoop v2 is HDFS plus YARN (resource management), with MapReduce and other frameworks on top]
40.
41.
YARN
- MRV1 did
  - Resource management
  - And processing
- Separate the two
  - YARN for resource management
  - MapReduce / other frameworks for processing
- Now MapReduce is 'just another app'
42.
YARN Architecture
43.
YARN Architecture
- Resource manager: manages the resources of the entire cluster
- Node manager: manages the resources of a single node
- Containers: resource buckets (e.g. 2 CPUs + 8 GB RAM)
- Application masters: one for each application
  - Batch MapReduce, Storm, etc.
  - Manages application scheduling and execution
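The division of labour between these components can be sketched as a toy allocator. This is an illustration under simplifying assumptions, not the YARN API: the class names are reused from the slide only as labels, the first-fit placement is an arbitrary choice, and the container size defaults to the 2 CPU + 8 GB example above.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Tracks the free resources of a single node."""
    name: str
    free_cpu: int
    free_ram_gb: int

    def try_allocate(self, cpu, ram_gb):
        if cpu <= self.free_cpu and ram_gb <= self.free_ram_gb:
            self.free_cpu -= cpu
            self.free_ram_gb -= ram_gb
            return True
        return False


@dataclass
class ResourceManager:
    """Cluster-wide view: places a container on the first node with room."""
    nodes: list = field(default_factory=list)

    def allocate_container(self, app, cpu=2, ram_gb=8):
        for node in self.nodes:
            if node.try_allocate(cpu, ram_gb):
                return {"app": app, "node": node.name, "cpu": cpu, "ram_gb": ram_gb}
        return None  # cluster is full; a real scheduler would queue the request


rm = ResourceManager([NodeManager("node1", 4, 16), NodeManager("node2", 2, 8)])

# Each application master would request containers on behalf of its own application.
requests = ["mapreduce-job", "storm-topology", "mapreduce-job", "spark-job"]
containers = [rm.allocate_container(app) for app in requests]
for request, container in zip(requests, containers):
    print(request, "->", container["node"] if container else "no capacity")
```

With these node sizes, the first two requests fit on `node1`, the third lands on `node2`, and the fourth finds no free capacity.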
44.
Adoption of YARN
- Standard on Hadoop v2
- Already running at Yahoo at scale
- Lots of applications are already moving to the YARN architecture
45.
Apps on YARN
[Diagram: HDFS and YARN at the base, with these application types on top]
- Batch (MapReduce)
- Streaming (Storm, S4)
- In-memory (Spark)
- Graph (Giraph)
- Real time (HBase)
46.
Apps on YARN
- Storm: real-time event processing
- Giraph: graph processing (in memory)
- Spark: in-memory, iterative processing
- HBase
47.
MapReduce on YARN
- MapReduce is NOT going anywhere
  - Works very well for batch processing
  - Proven
  - Lots of code out there
- No more single JobTracker
  - Each MapReduce job runs its own Application Master
  - So the failure of one AppMaster only causes that job to fail
  - Other jobs are insulated
- Better performance
  - MR jobs scale / utilize the cluster better on YARN (1.5x to 2x)
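The failure-isolation argument on this slide can be contrasted in a few lines. The sketch below is a toy comparison with hypothetical classes, not Hadoop code: a single shared master takes every job down with it, while a crash of one per-job master affects only its own job.

```python
class JobTrackerV1:
    """MRV1-style single master: every job shares its fate."""

    def __init__(self, job_ids):
        self.jobs = {job_id: "RUNNING" for job_id in job_ids}

    def crash(self):
        for job_id in self.jobs:
            self.jobs[job_id] = "FAILED"


class AppMaster:
    """YARN-style per-job master: tracks only its own job."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.state = "RUNNING"

    def crash(self):
        self.state = "FAILED"


job_ids = ["job_1", "job_2", "job_3"]

job_tracker = JobTrackerV1(job_ids)
job_tracker.crash()
print("single JobTracker:", job_tracker.jobs)

masters = {job_id: AppMaster(job_id) for job_id in job_ids}
masters["job_2"].crash()
yarn_states = {job_id: master.state for job_id, master in masters.items()}
print("per-job AppMasters:", yarn_states)
```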
48.
MapReduce on YARN
49.
Writing A YARN Application
- http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
50.
So Which Hadoop Should I Use?
- If you are starting now...
  - Hadoop 2
- Already using Hadoop 1
  - Worth the upgrade (new features / performance)
- How do I migrate?
  - Recommended: stand up a separate v2 cluster and migrate the data over
  - In-place update? (yeek!)
51.
Hadoop Distributions
Distribution | Hadoop v1         | Hadoop v2
Cloudera     | CDH 3.x / CDH 4.x | CDH 5.x
Hortonworks  | HDP 1.x           | HDP 2.x
Pivotal HD   |                   |
52.
Future...
- HDFS
  - Mirroring across data centers
  - Work well with SSDs (solid state drives / flash drives)
- YARN
  - Better containers (not just JVMs)
  - Performance
  - Make Resource Manager HA
53.
Thanks & Questions?
54.
Attribution & Feedback
Please send any questions or comments regarding this SNIA Tutorial to tracktutorials@snia.org
The SNIA Education Committee thanks the following individuals for their contributions to this Tutorial.
- Authorship History: Sujee Maniyam (Sept 2014)
- Additional Contributors: Joseph White (Review & Feedback)
55.
Backup Slides