HDFS:
Now and Future
Sanjay Radia (sanjay@hortonworks.com)
@Hortonworks.com
Suresh Srinivas (suresh@hortonworks.com)




© Hortonworks Inc. 2011
Outline

•  Hadoop 1 and Hadoop 2 Releases
•  Generalized storage service
  –  Leverage it for further innovation
•  Enterprise Use Cases
•  HDFS Infrastructure Improvements
•  HA in Hadoop 1!


Hadoop 1 and Hadoop 2

Hadoop 1 (GA)                     Hadoop 2 (alpha)
• Security                        • New Append
• Append/Fsync (HBase)            • Federation
• WebHDFS + SPNEGO                • Wire compatibility
• Write pipeline improvements     • Edit logs rewrite
• Local write optimization        • Faster startup
• Performance improvements        • HA NameNode
• Disk-fail-in-place



Testing & Quality – Used for each stable release
Nightly Testing
      –  1200 automated tests on 30 nodes
      –  Live data and applications
QE Certification for Release
      –  Large variety and scale tests on 500 nodes
      –  Performance benchmarking
      –  QE HIT integration testing of whole stack
Release Testing – alpha and beta
•  Sandbox clusters – 3 clusters, each with 400 to 1K nodes
      –  Major releases: 2 months of testing on actual data; all production projects must sign off
•  Research clusters – 6 clusters (non-revenue production jobs) (4K nodes)
      –  Major releases – minimum 2 months before moving to production
      –  0.25 million to 0.5 million jobs per week
      –  If a release clears the research clusters, it is mostly fine in production
Release
•  Production clusters - 11 clusters (4.5K nodes)
      –  Revenue generating, stricter SLAs

Hadoop 1 and Hadoop 2 Timelines

[Figure: release timeline, 2008 to 2012. Hadoop 1.0 track: branches 0.20.1, 0.20.2, 0.20.1xx (Security), 0.20.2xx (Operability, Multi Tenancy) and 1.0 (Old Append), each moving through DEV, QA and beta, ending in Hadoop 1.0 GA. Hadoop 2.0 track: branches 0.21 (New Append), 0.22 (Security Port +), 0.23 (Federation, YARN; Hadoop 2.0 alpha) and 2.0 (HA, Wire Compatibility), moving through DEV, QA, alpha and beta.]
Federation: Generalized Block Storage
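[Figure: in the namespace layer, Namenodes NN-1 … NN-k … NN-n hold namespaces NS1 … NSk … NSn (one labeled Foreign). In the block storage layer, each namespace has its own block pool (Pool 1 … Pool k … Pool n). Datanodes 1 … m form the common storage shared by all block pools.]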

•    Block Storage as generic storage service
      –  Set of blocks for a Namespace Volume is called a Block Pool
      –  DNs store blocks for all the Namespace Volumes – no partitioning
•    Multiple independent Namenodes and Namespace Volumes in a cluster
      –  Namespace Volume = Namespace + Block Pool
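
The relationship between Namespace Volumes, Block Pools and DataNodes can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class names `DataNode` and `NamespaceVolume`, the pool IDs `BP-1`/`BP-2`, and the round-robin placement are all hypothetical and are not taken from the HDFS code base.

```python
from collections import defaultdict


class DataNode:
    """Toy DataNode: stores blocks for every block pool, keyed by pool ID."""

    def __init__(self, name):
        self.name = name
        self.pools = defaultdict(dict)  # pool_id -> {block_id: data}

    def put(self, pool_id, block_id, data):
        self.pools[pool_id][block_id] = data

    def block_report(self, pool_id):
        # A Namenode only ever asks about the blocks of its own pool.
        return sorted(self.pools[pool_id])


class NamespaceVolume:
    """Namespace Volume = Namespace + Block Pool (hypothetical model)."""

    def __init__(self, pool_id, datanodes):
        self.pool_id = pool_id
        self.datanodes = datanodes
        self.namespace = {}  # path -> list of block IDs
        self._next_block = 0

    def create(self, path, chunks):
        block_ids = []
        for chunk in chunks:
            block_id = f"blk_{self._next_block}"
            self._next_block += 1
            # Any DataNode may hold blocks of any pool (no partitioning);
            # round-robin placement is an assumption made for illustration.
            dn = self.datanodes[len(block_ids) % len(self.datanodes)]
            dn.put(self.pool_id, block_id, chunk)
            block_ids.append(block_id)
        self.namespace[path] = block_ids
        return block_ids


datanodes = [DataNode("dn1"), DataNode("dn2")]
ns1 = NamespaceVolume("BP-1", datanodes)
ns2 = NamespaceVolume("BP-2", datanodes)
ns1.create("/data/a", [b"x", b"y"])
ns2.create("/home/b", [b"z"])
```

Both volumes share the same DataNodes, yet block IDs never collide because every block is scoped by its pool ID.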
HDFS’ Generic Storage Service
     Opportunities for Innovation
•  Federation - Distributed (Partitioned) Namespace
   –  Simple and robust due to independent masters
   –  Scalability, Isolation, Availability
•  New Services – Independent Block Pools
   –  New FS - Partial namespace in memory
   –  MR Tmp storage, HBase directly on block storage
   –  Shadow file system – caches HDFS, NFS, S3
•  Future: move Block Management into DataNodes
   –  Simplifies namespace/application implementation
   –  Distributed namenode becomes significantly simpler

[Figure: HDFS Namespace, Alternate NN Implementation, HBase and MR tmp all sit on top of the Storage Service.]
Shadow File System for Another
[Figure: Custom Shadow NameSpaces, backed by DataNodes, shadow S3, HDFS and NFS.]



•  Custom Namespace to shadow namespace of another system
  –  Uses a private block pool
•  Different policies on the data
  –  E.g. Single replica, fetch missing ones from source
       –  Hadoop can serve as a processing engine for source data without putting a lot of load on source
  –  E.g. Reduce replication factor for data duplicated in another cluster
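
The "single replica, fetch missing ones from source" policy behaves like a read-through cache. The sketch below illustrates that idea only; `ShadowNamespace`, its methods, and the in-memory `source` dictionary are hypothetical stand-ins, not part of HDFS.

```python
class ShadowNamespace:
    """Toy shadow namespace: keeps one local replica, refetches on a miss."""

    def __init__(self, fetch_from_source):
        self._fetch = fetch_from_source  # callable(path) -> bytes
        self._local = {}                 # stands in for a private block pool
        self.source_reads = 0            # how often the source was hit

    def read(self, path):
        if path not in self._local:
            # Missing replica: go back to the source system once.
            self.source_reads += 1
            self._local[path] = self._fetch(path)
        return self._local[path]

    def lose_replica(self, path):
        """Simulate losing the only local replica (for example a failed disk)."""
        self._local.pop(path, None)


# The source system (S3, NFS, another HDFS) is modelled as a plain dict.
source = {"/logs/day1": b"records"}
shadow = ShadowNamespace(source.__getitem__)
```

Repeated reads are served locally, so the source only sees load on the first access and after a replica is lost.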

Managing Namespaces
•    Federation has multiple namespaces
•    Don’t you need a single global namespace?
      –  Some tenants want a private namespace
           •    Hadoop as a service – each tenant gets its own namespace
      –  Global? The key is to share the data and the names used to access the data
•    A single global namespace is one way to share
•    A client-side mount table is another way to share
      –  Shared mount-table => “global” shared view
      –  Personalized mount-table => per-application view
           •  Share the data that matter by mounting it
•    Client-side implementation of mount tables
      –  No single point of failure
      –  No hotspot for root and top-level directories

[Figure: client-side mount table with root / and mount points data, project, home and tmp, backed by namespaces NS1, NS2, NS3 and NS4.]
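
A client-side mount table can be pictured as a longest-prefix lookup done entirely in the client. The following is a minimal sketch, not the actual HDFS implementation: the function `resolve` is hypothetical, and the pairing of each directory with a particular namespace (for example `/data` with `NS1`) is an assumption made for illustration.

```python
def resolve(mount_table, path):
    """Map a client path to (namespace, path inside that namespace)."""
    best = None
    for mount_point in mount_table:
        prefix = mount_point.rstrip("/")
        # Match whole path components only, so /data does not cover /database.
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best.rstrip("/")):
                best = mount_point
    if best is None:
        raise KeyError(f"no mount point covers {path}")
    remainder = path[len(best.rstrip("/")):] or "/"
    return mount_table[best], remainder


# A shared mount table gives every client the same "global" view.
shared_view = {"/data": "NS1", "/project": "NS2", "/home": "NS3", "/tmp": "NS4"}

# A personalized table mounts only what one application needs.
app_view = {"/data": "NS1", "/tmp": "NS4"}
```

Because the lookup runs in each client, no server has to handle traffic for the root or the top-level directories.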
Next Steps… first class support for volumes
•  NameServer - Container for namespaces
  ›  Lots of small namespace volumes
      –  Chosen per user/tenant/data feed
      –  Management policies (quota, …)
      –  Mount tables for unified namespace
           •  Can be managed by a central volume server
  ›  Move namespace for balancing
•  WorkingSet of namespace in memory
  ›  Many more namespaces in a server
•  Number of NameServers =
  ›  Sum of (Namespace working set)
  ›  Sum of (Namespace throughput)

[Figure: NameServers as Containers of Namespaces, on top of a Storage Layer of Datanodes.]
Enterprise Use Cases
•  High Availability ✓
•  Standard Interfaces ✓
  –  WebHDFS (REST) ✓, Fuse ✓ and NFS access
•  Snapshots - in progress
•  Disaster Recovery
  –  Distcp does parallel and incremental copies ✓
      –  Enhance using journal interface & Snapshots

•  Data Efficiency/RAID
  –  Productize the tools and experience at Facebook
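
As an illustration of the REST interface mentioned above, WebHDFS exposes file system operations under the `/webhdfs/v1/` URL prefix with an `op` query parameter. The helper below only builds such URLs and makes no network call; the function name `webhdfs_url`, the host `namenode.example.com`, and port 50070 are assumptions for this sketch.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS URL: http://host:port/webhdfs/v1/<path>?op=<OP>&..."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical NameNode address; adjust to the cluster in use.
list_url = webhdfs_url("namenode.example.com", 50070, "/user/alice", "LISTSTATUS")
open_url = webhdfs_url("namenode.example.com", 50070, "/user/alice/a.txt", "OPEN", offset=0)
```

Any HTTP client can then issue a GET against these URLs, which is what makes the interface usable from outside the Java client library.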




Infrastructure Improvements
•  Netty
  –  Better connection and thread management
•  Image/Edits management
  –  HDFS image/edits stored within HDFS
•  Parallel writes
  –  Lower latency
•  Grouping blocks
  –  Scaling number of blocks and block reports
•  Support for Heterogeneous Storage
  –  SSD, archival storage
•  Rolling upgrade improvements
  –  Wire compatibility done

HA in 1.0
            Using Full Stack HA Architecture




Hadoop Full Stack HA Architecture


[Figure: Full Stack HA. Jobs run on the slave nodes of the Hadoop cluster, and some apps run outside it. The master daemons (NN, JT, NN) run on servers in an HA cluster with N+K failover. On a failover, the JT goes into safemode.]
HA in Hadoop 1 with HDP1
•  Full Stack HA Architecture
  –  NameNode
      –  Clients pause automatically
      –  JobTracker pauses automatically
  –  HA for other Hadoop master daemons coming
•  Use industry standard HA frameworks
  –  VMware vSphere-HA, and others soon
      –  Industry proven
           –  Failover, fencing, …
           –  Deals with tricky corner cases and prevents corruption
      –  Additional benefits
           –  N-N & N+K failover
           –  Migration for maintenance

Hadoop NN/JT HA with vSphere




NN HA with Linux-HA

[Figure: two Linux-HA nodes exchange heartbeats. On each node a Resource Manager (watchdog) monitors the health of the NN, OS and hardware and issues commands. One NN is active and the other is cold; both use shared NN state, with the DNs below them.]
Failover Times

•  NameNode failover times with vSphere and Linux-HA
  –  Failure detection and Failover – 0.5 to 2 minutes
  –  OS bootup needed for vSphere – 1 minute
  –  Namenode Startup (exit safemode)
       –  Small/Medium clusters – 1 to 2 minutes
       –  Large cluster – 5 to 15 minutes
•  NameNode startup time measurements
  –  60 Nodes, 60K files, 6 million blocks, 300 TB raw storage – 40 sec
  –  180 Nodes, 200K files, 18 million blocks, 900TB raw storage – 120 sec



   Cold Failover is good enough for small/medium clusters
        Failure Detection and Automatic Failover Dominates
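
The figures above can be combined into an end-to-end estimate. This is a back-of-the-envelope sketch that assumes the phases run strictly one after another, which the slide does not state; the names `total_failover` and `startup_seconds_per_million_blocks` are hypothetical.

```python
# Phase durations in minutes as (low, high), copied from the slide.
DETECT_AND_FAILOVER = (0.5, 2.0)
VSPHERE_OS_BOOT = (1.0, 1.0)
NN_STARTUP = {"small/medium": (1.0, 2.0), "large": (5.0, 15.0)}


def total_failover(cluster_size, vsphere=True):
    """Sum the phases, assuming they happen sequentially (an assumption)."""
    phases = [DETECT_AND_FAILOVER, NN_STARTUP[cluster_size]]
    if vsphere:
        phases.append(VSPHERE_OS_BOOT)  # OS boot-up only applies to vSphere
    low = sum(p[0] for p in phases)
    high = sum(p[1] for p in phases)
    return low, high


def startup_seconds_per_million_blocks(seconds, million_blocks):
    """Normalize a measured NameNode startup time by block count."""
    return seconds / million_blocks


small = total_failover("small/medium")   # vSphere, small/medium cluster
large = total_failover("large")          # vSphere, large cluster
rate_60_nodes = startup_seconds_per_million_blocks(40, 6)
rate_180_nodes = startup_seconds_per_million_blocks(120, 18)
```

Under that assumption a small or medium cluster on vSphere is back in roughly 2.5 to 5 minutes, and a large one in 6.5 to 18 minutes. Both startup measurements work out to about 6.7 seconds per million blocks, which suggests startup time grows roughly linearly with block count over this range.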



Summary
•  Hadoop 1 – The most stable release
  –  Now with Full-Stack HA using industry proven tools


•  Hadoop 2 – in Alpha testing
  –  3 years of development
      –  significant new in alpha/beta testing
  –  Generalized storage layer – opportunities for innovation
      –  Partial namespace in memory, shadow/caching file system, MR tmp, etc.
  –  Hadoop 2 HA
      –  main difference – warm/hot failover
•  Snapshot and DR improvements are coming

Thanks





+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 

Hdfs high availability

  • 1. HDFS: Now and Future Sanjay Radia (sanjay@hortonworks.com) @Hortonworks.com Suresh Srinivas (suresh@hortonworks.com) © Hortonworks Inc. 2011 Page 1
  • 2. Outline •  Hadoop 1 and Hadoop 2 Releases •  Generalized storage service –  Leverage it for further innovation •  Enterprise Use Cases •  HDFS Infrastructure Improvements •  HA in Hadoop 1!
  • 3. Hadoop 1 and Hadoop 2
    Hadoop 1 (GA): Security; Append/Fsync (Hbase); WebHdfs + Spnego; Write pipeline improvements; Local write optimization; Performance improvements; Disk-fail-in-place
    Hadoop 2 (alpha): New Append; Federation; Wire compatibility; Edit logs rewrite; Faster startup; HA NameNode
  • 4. Testing & Quality – Used for each stable release
    Nightly Testing
      – 1200 automated tests on 30 nodes
      – Live data and applications
    QE Certification for Release
      – Large variety and scale tests on 500 nodes
      – Performance benchmarking
      – QE HIT integration testing of whole stack
    Release Testing – alpha and beta
      • Sandbox clusters – 3 clusters, each with 400 to 1K nodes
        – Major releases: 2 months of testing on actual data; all production projects must sign off
      • Research clusters – 6 clusters (non-revenue production jobs) (4K nodes)
        – Major releases: minimum 2 months before moving to production
        – 0.25 million to 0.5 million jobs per week
        – If a release clears research, it is mostly fine in production
    Release
      • Production clusters – 11 clusters (4.5K nodes)
        – Revenue generating, stricter SLAs
  • 5. Hadoop 1 and Hadoop 2 Timelines
    [Timeline figure, 2008 to 2012; each release passes through DEV, QA and alpha/beta phases. Hadoop 1.0 track: releases 0.20.1, 0.20.2, 0.20.1xx, 0.20.2xx and 1.0 (GA), labelled Security; Operability, Multi Tenancy; Old Append. Hadoop 2.0 track: releases 0.21, 0.22, 0.23 and 2.0 (alpha), labelled New Append; Security Port; Federation, YARN; HA, Wire Compatibility.]
  • 6. Outline •  Hadoop 1 and Hadoop 2 Releases •  Generalized storage service –  Leverage it for further innovation •  Enterprise Use Cases •  HDFS Infrastructure Improvements •  HA in Hadoop 1!
  • 7. Federation: Generalized Block Storage
    [Figure: a Namespace layer with Namenodes NN-1 … NN-k … NN-n serving NS1 … NS k … Foreign NS n; a Block Storage layer with Block Pools 1 … k … n; Datanodes 1 … m below as Common Storage.]
    • Block Storage as generic storage service
      – Set of blocks for a Namespace Volume is called a Block Pool
      – DNs store blocks for all the Namespace Volumes – no partitioning
    • Multiple independent Namenodes and Namespace Volumes in a cluster
      – Namespace Volume = Namespace + Block Pool
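The relationship between Namespace Volumes, Block Pools and Datanodes described on slide 7 can be sketched as a small data model. This is an illustrative sketch only, not HDFS code: the class names `DataNode` and `NamespaceVolume`, the pool ids and the block ids are all hypothetical. It shows one Datanode holding blocks for several pools without partitioning, and reporting each pool's blocks separately.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DataNode:
    """Hypothetical Datanode: stores blocks for every block pool, unpartitioned."""
    name: str
    # block pool id -> ids of the blocks this node holds for that pool
    pools: Dict[str, Set[int]] = field(default_factory=lambda: defaultdict(set))

    def store(self, pool_id: str, block_id: int) -> None:
        self.pools[pool_id].add(block_id)

    def block_report(self, pool_id: str) -> Set[int]:
        """Blocks this node would report to the Namenode owning pool_id."""
        return set(self.pools.get(pool_id, set()))


@dataclass
class NamespaceVolume:
    """Namespace Volume = Namespace + Block Pool (one per Namenode)."""
    namenode: str
    block_pool_id: str
    namespace: Dict[str, List[int]] = field(default_factory=dict)  # path -> block ids

    def add_file(self, path: str, block_ids: List[int], dn: DataNode) -> None:
        self.namespace[path] = list(block_ids)
        for block_id in block_ids:
            dn.store(self.block_pool_id, block_id)


# Two independent Namenodes share the same Datanode as common storage.
dn1 = DataNode("dn1")
vol1 = NamespaceVolume("NN-1", "pool-1")
vol2 = NamespaceVolume("NN-2", "pool-2")
vol1.add_file("/logs/a", [1, 2], dn1)
vol2.add_file("/tmp/b", [7], dn1)
print(dn1.block_report("pool-1"), dn1.block_report("pool-2"))
```

Because each report is keyed by pool, a Namenode only ever sees the blocks of its own Namespace Volume, which is what lets the masters stay independent.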
  • 8. HDFS’ Generic Storage Service: Opportunities for Innovation
    [Figure: a Storage Service layer beneath an HDFS Namespace, an Alternate NN Implementation, HBase and MR tmp.]
    • Federation – Distributed (Partitioned) Namespace
      – Simple and robust due to independent masters
      – Scalability, Isolation, Availability
    • New Services – Independent Block Pools
      – New FS – partial namespace in memory
      – MR tmp storage, HBase directly on block storage
      – Shadow file system – caches HDFS, NFS, S3
    • Future: move Block Management into DataNodes
      – Simplifies namespace/application implementation
      – Distributed namenode becomes significantly simpler
  • 9. Shadow File System for Another S3 Custom Shadow NameSpaces HDFS DataNode NFS •  Custom Namespace to shadow namespace of another system –  Uses a private block pool •  Different policies on the data –  E.g. Single replica, fetch missing ones from source –  Hadoop can serve as a processing engine for source data without putting a lot of load on source –  E.g. Reduce replication factor for data duplicated in another cluster
  • 10. Managing Namespaces
    [Figure: a client-side mount table rooted at "/" with mount points data, project, home and tmp, mapped onto namespaces NS1, NS2, NS3 and NS4.]
    • Federation has multiple namespaces
    • Don’t you need a single global namespace?
      – Some tenants want a private namespace
        • Hadoop as a service – each tenant gets its own namespace
      – Global? The key is to share the data and the names used to access the data
    • A single global namespace is one way to share
    • A client-side mount table is another way to share
      – Shared mount-table => “global” shared view
      – Personalized mount-table => per-application view
        • Share the data that matter by mounting it
    • Client-side implementation of mount tables
      – No single point of failure
      – No hotspot for root and top-level directories
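A client-side mount table as described on slide 10 can be sketched as a longest-prefix lookup done entirely in the client. This is a minimal sketch under stated assumptions, not the Hadoop implementation: the function `resolve`, the two tables and the assignment of mount points to NS1 through NS4 are hypothetical (the slide does not say which mount point maps to which namespace).

```python
from typing import Dict, Tuple


def resolve(mount_table: Dict[str, str], path: str) -> Tuple[str, str]:
    """Return (namespace, path) using the longest matching mount point."""
    best = None
    for mount_point in mount_table:
        # Match whole path components only: "/home" must not match "/homework".
        if path == mount_point or path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        raise KeyError(f"no mount point covers {path}")
    return mount_table[best], path


# Shared table: every client that loads it sees the same "global" view.
# The mapping of mount points to namespaces is an assumption.
SHARED = {"/data": "NS1", "/project": "NS2", "/home": "NS3", "/tmp": "NS4"}

# Personalized table: a per-application view that mounts only what matters.
PER_APP = {"/data": "NS1", "/tmp": "NS4"}

print(resolve(SHARED, "/home/alice/report.txt"))
print(resolve(PER_APP, "/data/clicks/2011"))
```

The lookup needs no server round trip, which is why the slide notes there is no single point of failure and no hotspot for root and top-level directories.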
  • 11. Next Steps… first class support for volumes
    [Figure: NameServers as Containers of Namespaces, above a Storage Layer of Datanodes.]
    • NameServer – Container for namespaces
      › Lots of small namespace volumes
        – Chosen per user/tenant/data feed
        – Management policies (quota, …)
        – Mount tables for unified namespace
          • Can be managed by a central volume server
      › Move namespace for balancing
    • WorkingSet of namespace in memory
      › Many more namespaces in a server
    • Number of NameServers =
      › Sum of (Namespace working set)
      › Sum of (Namespace throughput)
  • 12. Outline •  Hadoop 1 and Hadoop 2 Releases •  Generalized storage service –  Leverage it for further innovation •  Enterprise Use Cases •  HDFS Infrastructure Improvements •  HA in Hadoop 1!
  • 13. Enterprise Use Cases
    • High Availability ☑
    • Standard Interfaces ☑
      – WebHdfs (REST) ☑, Fuse ☑ and NFS access
    • Snapshots – in progress
    • Disaster Recovery
      – Distcp does parallel and incremental copies ☑
      – Enhance using journal interface and Snapshots
    • Data Efficiency/RAID
      – Productize the tools and experience at Facebook
  • 14. Outline •  Hadoop 1 and Hadoop 2 Releases •  Generalized storage service –  Leverage it for further innovation •  Enterprise Use Cases •  HDFS Infrastructure Improvements •  HA in Hadoop 1!
  • 15. Infrastructure Improvements
    • Netty
      – Better connection and thread management
    • Image/Edits management
      – HDFS image/edits stored within HDFS
    • Parallel writes
      – Lower latency
    • Grouping blocks
      – Scaling number of blocks and block reports
    • Support for Heterogeneous Storage
      – SSD, archival storage
    • Rolling upgrade improvements
      – Wire compatibility done
  • 16. Outline •  Hadoop 1 and Hadoop 2 Releases •  Generalized storage service –  Leverage it for further innovation •  Enterprise use cases •  HDFS Infrastructure Improvements •  HA in Hadoop 1!
  • 17. HA in 1.0 Using Full Stack HA Architecture
  • 18. Hadoop Full Stack HA Architecture
    [Figure: Slave Nodes of Hadoop Cluster running jobs; Apps Running Outside; on Failover the JT goes into Safemode; NN and JT run on an HA Cluster for Master Daemons made up of several servers, with N+K failover.]
  • 19. HA in Hadoop 1 with HDP1
    • Full Stack HA Architecture
      – NameNode
      – Clients pause automatically
      – JobTracker pauses automatically
      – HA for other Hadoop master daemons coming
    • Use industry standard HA frameworks
      – VMware vSphere-HA, and others soon
      – Industry proven
      – Failover, fencing, …
      – Deals with tricky corner cases and prevents corruption
      – Additional benefits
      – N-N & N+K failover
      – Migration for maintenance
  • 20. Hadoop NN/JT HA with vSphere
  • 21. NN HA with Linux-HA
    [Figure: two hosts, each running a Linux HA Resource Mgr (Watchdog) that monitors the health of NN, OS and HW and issues Cmds; the two exchange a Heartbeat. One NN is Active, the other is Cold, and they use Shared NN state; DNs sit below.]
  • 22. Failover Times
    • NameNode failover times with vSphere and LinuxHA
      – Failure detection and failover – 0.5 to 2 minutes
      – OS bootup needed for vSphere – 1 minute
      – Namenode startup (exit safemode)
        • Small/medium clusters – 1 to 2 minutes
        • Large cluster – 5 to 15 minutes
    • NameNode startup time measurements
      – 60 nodes, 60K files, 6 million blocks, 300 TB raw storage – 40 sec
      – 180 nodes, 200K files, 18 million blocks, 900 TB raw storage – 120 sec
    Cold failover is good enough for small/medium clusters
    Failure detection and automatic failover dominates
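The figures on slide 22 can be combined into rough end-to-end estimates. The sketch below assumes the three phases run one after another, so that their ranges simply add; the slide lists the phases but does not state this explicitly. The names `DETECT_AND_FAILOVER`, `VSPHERE_OS_BOOT`, `NN_STARTUP` and `total` are introduced here for illustration.

```python
from typing import Tuple

Range = Tuple[float, float]  # (low, high) in minutes

# Phase durations taken from the slide.
DETECT_AND_FAILOVER: Range = (0.5, 2.0)
VSPHERE_OS_BOOT: Range = (1.0, 1.0)  # only needed with vSphere
NN_STARTUP = {"small/medium": (1.0, 2.0), "large": (5.0, 15.0)}


def total(*phases: Range) -> Range:
    """Add phase ranges, assuming the phases run sequentially."""
    return (sum(p[0] for p in phases), sum(p[1] for p in phases))


for size, startup in NN_STARTUP.items():
    with_vsphere = total(DETECT_AND_FAILOVER, VSPHERE_OS_BOOT, startup)
    with_linux_ha = total(DETECT_AND_FAILOVER, startup)
    print(size, "vSphere:", with_vsphere, "LinuxHA:", with_linux_ha)

# Startup measurements from the slide: (blocks, seconds).
MEASUREMENTS = [(6_000_000, 40), (18_000_000, 120)]
blocks_per_second = [blocks / seconds for blocks, seconds in MEASUREMENTS]
print(blocks_per_second)
```

Under that assumption a small or medium cluster is back in roughly 2.5 to 5 minutes with vSphere, and the two startup measurements both work out to 150,000 blocks per second, which suggests startup time grows about linearly with block count over this range.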
  • 23. Summary
    • Hadoop 1 – the most stable release
      – Now with Full-Stack HA using industry-proven tools
    • Hadoop 2 – in alpha testing
      – 3 years of development
      – Significant new features in alpha/beta testing
      – Generalized storage layer – opportunities for innovation
      – Partial namespace in memory, shadow/caching file system, MR tmp, etc.
      – Hadoop 2 HA
      – Main difference – warm/hot failover
    • Snapshot and DR improvements are coming
  • 24. Thanks