The HDFS NameNode is a robust and reliable service, as seen in production at Yahoo and at other customers' sites. However, the NameNode does not have automatic failover support. A hot failover solution called HA NameNode is currently under active development (HDFS-1623). This talk will cover its architecture, design, and setup. We will also discuss the future direction for HA NameNode.
2. Outline
• Hadoop 1 and Hadoop 2 Releases
• Generalized storage service
– Leverage it for further innovation
• Enterprise Use Cases
• HDFS Infrastructure Improvements
• HA in Hadoop 1!
4. Testing & Quality – Used for each stable release
Nightly Testing
– 1200 automated tests on 30 nodes
– Live data and applications
QE Certification for Release
– Large variety and scale tests on 500 nodes
– Performance benchmarking
– QE HIT integration testing of whole stack
Release Testing – alpha and beta
• Sandbox clusters – 3 clusters, each with 400 to 1K nodes
– Major releases: 2 months of testing on actual data; all production projects must sign off
• Research clusters – 6 clusters (non-revenue production jobs) (4K nodes)
– Major releases: minimum 2 months before moving to production
– 0.25 million to 0.5 million jobs per week
– If a release clears the research clusters, it is mostly fine in production
Release
• Production clusters - 11 clusters (4.5K nodes)
– Revenue generating, stricter SLAs
7. Federation: Generalized Block Storage
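[Figure: Namespace layer with Namenodes NN-1 ... NN-k ... NN-n, each owning a namespace (NS 1 ... NS k ... NS n, plus a foreign NS) and a block pool (Pool 1 ... Pool k ... Pool n). Block Storage layer with Datanode 1, Datanode 2 ... Datanode m providing common storage for all block pools.]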
• Block Storage as generic storage service
– Set of blocks for a Namespace Volume is called a Block Pool
– DNs store blocks for all the Namespace Volumes – no partitioning
• Multiple independent Namenodes and Namespace Volumes in a cluster
– Namespace Volume = Namespace + Block Pool
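The relationship between namespace volumes, block pools, and datanodes in the bullets above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not HDFS code: the class names `Datanode` and `NamespaceVolume`, the pool ids, and the round-robin placement are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


class Datanode:
    """Toy datanode: stores blocks for every block pool, with no partitioning."""

    def __init__(self, name):
        self.name = name
        # block pool id -> set of block ids held on this datanode
        self.pools = defaultdict(set)

    def store(self, pool_id, block_id):
        self.pools[pool_id].add(block_id)

    def block_report(self, pool_id):
        """Blocks this datanode would report to the namenode owning pool_id."""
        return sorted(self.pools.get(pool_id, ()))


class NamespaceVolume:
    """Namespace volume = namespace + block pool, owned by one namenode."""

    def __init__(self, pool_id):
        self.pool_id = pool_id
        self.namespace = {}  # file path -> list of block ids

    def add_file(self, path, block_ids, datanodes):
        self.namespace[path] = list(block_ids)
        for i, block_id in enumerate(block_ids):
            # Round-robin placement, purely for illustration (an assumption)
            datanodes[i % len(datanodes)].store(self.pool_id, block_id)


# Two independent namespace volumes sharing the same two datanodes
datanodes = [Datanode("dn1"), Datanode("dn2")]
vol1 = NamespaceVolume("pool-1")
vol2 = NamespaceVolume("pool-2")
vol1.add_file("/data/a", ["b1", "b2", "b3"], datanodes)
vol2.add_file("/tmp/x", ["b1"], datanodes)
```

In this sketch each datanode keeps a separate block set per pool, so `datanodes[0].block_report("pool-1")` and `datanodes[0].block_report("pool-2")` are independent views of the same machine. That mirrors the idea that the namenodes do not need to coordinate with one another.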
8. HDFS’ Generic Storage Service
Opportunities for Innovation
[Figure: a shared Storage Service underneath the HDFS Namespace, an Alternate NN Implementation, HBase, and MR tmp]
• Federation - Distributed (Partitioned) Namespace
– Simple and Robust due to independent masters
– Scalability, Isolation, Availability
• New Services – Independent Block Pools
– New FS - Partial namespace in memory
– MR Tmp storage, HBase directly on block storage
– Shadow file system – caches HDFS, NFS, S3
• Future: move Block Management into DataNodes
– Simplifies namespace/application implementation
– Distributed namenode becomes significantly simpler
10. Managing Namespaces
[Figure: a client-side mount table rooted at / with entries data, project, home, and tmp, each linking into one of the namespaces NS1 to NS4]
• Federation has multiple namespaces
• Don’t you need a single global namespace?
– Some tenants want a private namespace
• Hadoop as a service – each tenant has its own namespace
– Global? The key is to share the data and the names used to access the data
• A single global namespace is one way to share
• A client-side mount table is another way to share
– Shared mount-table => “global” shared view
– Personalized mount-table => per-application view
• Share the data that matters by mounting it
• Client-side implementation of mount tables
– No single point of failure
– No hotspot for root and top-level directories
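To make the client-side mount table concrete, here is a minimal Python sketch of how a client might resolve a path to a namespace using the longest matching mount point. It is an illustration only: the `MountTable` class, the `hdfs://nn1` ... `hdfs://nn4` URIs, and the assignment of directories to namenodes are hypothetical and not taken from the talk.

```python
class MountTable:
    """Toy client-side mount table: maps path prefixes to namespace URIs."""

    def __init__(self, links):
        # Normalize mount points and targets by stripping any trailing slash
        self.links = {
            mount.rstrip("/"): target.rstrip("/")
            for mount, target in links.items()
        }

    def resolve(self, path):
        """Return the target URI for path, using the longest matching mount."""
        best = None
        for mount in self.links:
            # Match whole path components only, so /database is not under /data
            if path == mount or path.startswith(mount + "/"):
                if best is None or len(mount) > len(best):
                    best = mount
        if best is None:
            raise KeyError(f"no mount point covers {path}")
        return self.links[best] + path[len(best):]


# Shared mount table: every client sees the same "global" view
shared = MountTable({
    "/data": "hdfs://nn1/data",        # hypothetical namenode addresses
    "/project": "hdfs://nn2/project",
    "/home": "hdfs://nn3/home",
    "/tmp": "hdfs://nn4/tmp",
})

# Personalized mount table: one application mounts only what it needs
personal = MountTable({
    "/data": "hdfs://nn1/data",
    "/tmp": "hdfs://nn4/users/alice/tmp",
})
```

Because resolution happens entirely in the client, no server sits in front of the root directory: `shared.resolve("/data/logs/part-0")` goes straight to the namenode behind `/data`, and two applications can hold different tables without affecting each other.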