2. Outline
• Background
– YARN architecture and need for HA
• RM HA architecture
– Persisting the state
– Active/Standby pair and fencing
– Failover and redirection
• Configuring HA
• Demo
6/30/2014 YARN High Availability, Hadoop Summit 2
3. YARN Architecture
[Diagram: two Clients, a Resource Manager holding Cluster State and Applications State, three Node Managers, an App Master, and a Container.]
4. Fault-tolerance
[Diagram: two Clients, a Resource Manager holding Cluster State and Applications State, three Node Managers, two App Masters, and two Containers.]
5. Naïve RM Restart
[Diagram: two Clients, a Resource Manager holding Cluster State and Applications State, two Node Managers, and an App Master.]
6. The ResourceManager is a YARN cluster’s single point of failure.
Fixing this needs stateful restart and multiple RMs.
7. Highly Available Resource Manager
a.k.a. HARMful YARN
• Currently shipped
– Beta in Apache Hadoop 2.3.0
– Stable in Apache Hadoop 2.4.0
– More stable in Apache Hadoop 2.4.1
8. Stateful RM Restart (Phase 1)
[Diagram: two Clients, a Resource Manager holding Cluster State and Applications State and backed by an RM Store, two Node Managers, two App Masters, and two Containers.]
9. RM Store Implementations
• Memory store – testing purposes
• Filesystem based store
– Any file system: local, HDFS or any other
• ZooKeeper based store (ZKRMStateStore)
– Recommended (for fencing)
– Loading 10,000 applications takes about 8.5 seconds.
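A minimal sketch of the idea behind a filesystem based store, assuming a hypothetical `FileStateStore` class and a toy `ResourceManager`; neither name nor method is the Hadoop API. The RM persists each application before acknowledging it, and a restarted RM reloads whatever it finds.

```python
import json
import os
import tempfile


class FileStateStore:
    """Hypothetical file-backed store: one JSON file per application."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def store_application(self, app_id, state):
        # Write to a temp file, then rename, so a crash never leaves a partial record.
        path = os.path.join(self.root, app_id + ".json")
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, path)

    def load_applications(self):
        apps = {}
        for name in sorted(os.listdir(self.root)):
            if name.endswith(".json"):
                with open(os.path.join(self.root, name)) as f:
                    apps[name[:-len(".json")]] = json.load(f)
        return apps


class ResourceManager:
    """Toy RM that recovers application state from the store on start."""

    def __init__(self, store):
        self.store = store
        self.apps = store.load_applications()

    def submit(self, app_id, user):
        state = {"user": user, "status": "ACCEPTED"}
        self.store.store_application(app_id, state)  # persist before acknowledging
        self.apps[app_id] = state


store_dir = tempfile.mkdtemp()
rm = ResourceManager(FileStateStore(store_dir))
rm.submit("application_0001", "alice")
del rm  # simulate an RM crash

restarted = ResourceManager(FileStateStore(store_dir))
print(sorted(restarted.apps))
```

Swapping the directory for HDFS or ZooKeeper changes where the records live, not the store-then-reload pattern.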
10. Implications for running applications
• In-flight work is lost.
• AMs are restarted.
• AMs could checkpoint completed work.
– The MapReduce AM does.
– Consider a job with 100 map tasks:
• The RM goes down after 90 map tasks finish.
• After restart, only the remaining 10 are run.
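The 100-task example can be sketched as follows. This is an illustration only: `run_job` is a hypothetical function, and the `checkpoint` set stands in for whatever durable record an AM keeps of its completed tasks.

```python
def run_job(total_maps, completed, fail_after=None):
    """Run map tasks not already in `completed`.

    If `fail_after` is set, the attempt is cut short after that many tasks,
    standing in for the AM being lost along with the RM.
    """
    ran = []
    for task in range(total_maps):
        if task in completed:
            continue  # recovered from the checkpoint, not rerun
        if fail_after is not None and len(ran) == fail_after:
            return ran, False
        ran.append(task)
        completed.add(task)  # checkpoint the finished task
    return ran, True


checkpoint = set()
first_attempt, finished = run_job(100, checkpoint, fail_after=90)
second_attempt, finished_after_restart = run_job(100, checkpoint)
print(len(first_attempt), len(second_attempt))
```

Without the checkpoint, the second attempt would start from an empty set and rerun all 100 tasks.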
11. Stateful RM Restart (Phase 2)
• Under development (YARN-556)
– No loss of in-flight work
• Related work
– Work-preserving NodeManager restart (YARN-1336)
– Work-preserving ApplicationMaster restart (YARN-1489)
12. Multiple RMs
• Active / Standby architecture
– Potentially multiple standbys
– Warm standby
• The standby RM process is already running
• It loads state and starts RPC servers on becoming Active
– Manual / automatic failover
– Clients and Web UI failover automatically
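A rough sketch of client-side failover, assuming hypothetical `FakeRM` and `FailoverClient` classes (not the YARN client API): the client walks its configured list of RMs until one answers as Active.

```python
class StandbyError(Exception):
    """Raised by a simulated RM that is not Active."""


class FakeRM:
    def __init__(self, name, active):
        self.name = name
        self.active = active

    def get_cluster_metrics(self):
        if not self.active:
            raise StandbyError(self.name)
        return {"served_by": self.name}


class FailoverClient:
    """Tries each configured RM in turn and remembers the one that answered."""

    def __init__(self, rms):
        self.rms = rms
        self.current = 0

    def call(self, method):
        for _ in range(len(self.rms)):
            rm = self.rms[self.current]
            try:
                return getattr(rm, method)()
            except StandbyError:
                # Move on to the next RM in the configured list and retry.
                self.current = (self.current + 1) % len(self.rms)
        raise RuntimeError("no Active RM found")


rm1, rm2 = FakeRM("rm1", active=True), FakeRM("rm2", active=False)
client = FailoverClient([rm1, rm2])
before = client.call("get_cluster_metrics")

rm1.active, rm2.active = False, True  # failover happens
after = client.call("get_cluster_metrics")
print(before["served_by"], after["served_by"])
```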
13. Active / Standby
[Diagram: two Clients, two Node Managers, an App Master, an Active Resource Manager, a Standby Resource Manager, and the RM Store.]
14. Manual Failover through CLI
[Diagram: two Clients, two Node Managers, an App Master, an Active Resource Manager, a Standby Resource Manager, and the RM Store.]
16. Automatic Failover
[Diagram: two Clients, two Node Managers, an App Master, an Active and a Standby Resource Manager, each with an Elector, plus the RM Store and ZK.]
17. Automatic Failover
[Diagram: two Clients, two Node Managers, an App Master, an Active and a Standby Resource Manager, each with an Elector, plus the RM Store and ZK.]
18. Automatic Failover
• ZooKeeper based
– Uses ActiveStandbyElector for Active election
• No need for a separate FailoverController
– Trade-off: can’t monitor RM process health and recover
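The election can be sketched with an in-memory stand-in for ZooKeeper. `FakeZooKeeper` and `Elector` are hypothetical names, not the ActiveStandbyElector API: whichever RM creates the lock node becomes Active, and the lock disappears when its owner's session ends.

```python
class FakeZooKeeper:
    """In-memory stand-in for a single ephemeral lock node."""

    def __init__(self):
        self.lock_owner = None
        self.watchers = []

    def try_create_lock(self, session):
        if self.lock_owner is None:
            self.lock_owner = session
            return True
        return False

    def watch(self, callback):
        self.watchers.append(callback)

    def expire_session(self, session):
        # The ephemeral node goes away with its session; watchers are notified.
        if self.lock_owner == session:
            self.lock_owner = None
            for callback in list(self.watchers):
                callback()


class Elector:
    def __init__(self, zk, rm_id):
        self.zk = zk
        self.rm_id = rm_id
        self.state = "STANDBY"
        self.alive = True
        zk.watch(self.join_election)

    def join_election(self):
        if self.alive and self.zk.try_create_lock(self.rm_id):
            self.state = "ACTIVE"

    def crash(self):
        self.alive = False
        self.state = "DEAD"
        self.zk.expire_session(self.rm_id)


zk = FakeZooKeeper()
e1, e2 = Elector(zk, "rm1"), Elector(zk, "rm2")
e1.join_election()
e2.join_election()
states_before = (e1.state, e2.state)

e1.crash()  # rm1 dies; rm2's watch fires and it wins the lock
states_after = (e1.state, e2.state)
print(states_before, states_after)
```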
19. Network Hiccup
[Diagram: two Clients, two Node Managers, an App Master, an Active and a Standby Resource Manager, each with an Elector, plus the RM Store and ZK.]
20. Multiple Actives?
[Diagram: two Clients, two Node Managers, an App Master, and two Resource Managers both marked Active, each with an Elector, plus the RM Store and ZK.]
21. Fencing
• The state store gets corrupted when multiple RMs assume the Active role.
• Fix: give exclusive access to a single RM.
– ZKRMStateStore takes care of this.
– All RMs share “admin” access.
– An RM claims exclusive “create-delete” access on transition to Active.
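The guarantee that fencing provides can be illustrated with a small sketch. Note the swap: this uses an epoch counter rather than the ZooKeeper access-control scheme described above, and `FencedStore` is a hypothetical class. The point is only that a stale Active's writes are rejected once another RM has taken over.

```python
class FencedError(Exception):
    """Raised when a writer that has lost the Active role touches the store."""


class FencedStore:
    def __init__(self):
        self.epoch = 0
        self.data = {}

    def become_active(self):
        # Claiming the store bumps the epoch, which invalidates older holders.
        self.epoch += 1
        return self.epoch

    def write(self, epoch, key, value):
        if epoch != self.epoch:
            raise FencedError(f"stale epoch {epoch}, current {self.epoch}")
        self.data[key] = value


store = FencedStore()
old_active = store.become_active()
store.write(old_active, "application_0001", "RUNNING")

# A network hiccup: the standby takes over while the old RM still thinks it is Active.
new_active = store.become_active()
store.write(new_active, "application_0001", "FINISHED")

try:
    store.write(old_active, "application_0001", "RUNNING")
    fenced = False
except FencedError:
    fenced = True
print(fenced, store.data)
```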
22. Network Hiccup
[Diagram: two Clients, two Node Managers, an App Master, an Active and a Standby Resource Manager, each with an Elector, plus the RM Store and ZK.]
23. Active / Standby
[Diagram: two Clients, two Node Managers, an App Master, an Active and a Standby Resource Manager, each with an Elector, plus the RM Store and ZK.]
24. In-flight RPCs
• In-flight RPCs: Retry or not?
– E.g. submit application: we clearly don’t want two applications submitted.
• Whether a retry is safe depends on whether failover happens before, during, or after the RM acts on the call.
• Solution
– Annotate APIs as Idempotent or AtMostOnce
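One way such annotations could drive retries, sketched under assumptions: the decorators, `FakeRM`, and `invoke` are hypothetical, and the request-id deduplication is an illustrative choice rather than a description of how Hadoop implements AtMostOnce. The sketch covers the awkward case where the RM acts on the call but the reply is lost.

```python
import itertools


class ConnectionLost(Exception):
    """The reply was lost; the caller cannot tell whether the RM acted."""


def idempotent(fn):
    fn.retry = "idempotent"
    return fn


def at_most_once(fn):
    fn.retry = "at_most_once"
    return fn


class FakeRM:
    def __init__(self):
        self.apps = []
        self.replies = {}  # request id -> cached reply
        self.drop_next_reply = False

    def _send(self, value):
        if self.drop_next_reply:
            self.drop_next_reply = False
            raise ConnectionLost()
        return value

    @idempotent
    def count_apps(self):
        return self._send(len(self.apps))

    @at_most_once
    def submit(self, name, request_id):
        # A repeated request id returns the cached reply instead of acting again.
        if request_id not in self.replies:
            self.apps.append(name)
            self.replies[request_id] = len(self.apps)
        return self._send(self.replies[request_id])


_request_ids = itertools.count(1)


def invoke(method, *args, attempts=3):
    policy = getattr(method, "retry", None)
    # At-most-once calls reuse one request id across all attempts.
    kwargs = {"request_id": next(_request_ids)} if policy == "at_most_once" else {}
    for _ in range(attempts):
        try:
            return method(*args, **kwargs)
        except ConnectionLost:
            if policy is None:
                raise  # unannotated call: not known to be safe to retry
    raise ConnectionLost()


rm = FakeRM()
rm.drop_next_reply = True  # the RM acts, then the reply is lost
position = invoke(rm.submit, "wordcount")
print(position, rm.apps)
```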
25. Web UI
• The Standby RM has no information, or stale information.
• Users don’t know which RM is Active.
• Redirect Web UI and REST calls to the Active RM.
– Except a few pages that give information about the RM itself.
26. Admin Refresh
• Admin refresh ($ yarn rmadmin -refresh* commands, e.g. -refreshQueues):
– Refreshes only the RM it is sent to, whether Active or Standby
– Uses that RM’s local configuration file
• FileSystemBasedConfigurationProvider
– Upload the configuration files to a (potentially shared) filesystem like HDFS.
27. Setting up HA
Config name                              Value
yarn.resourcemanager.ha.enabled          true
yarn.resourcemanager.ha.rm-ids           rm1,rm2
yarn.resourcemanager.hostname.rm1        <host1>
yarn.resourcemanager.hostname.rm2        <host2>
yarn.resourcemanager.recovery.enabled    true
yarn.resourcemanager.store.class         org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
yarn.resourcemanager.zk-address          <zk-quorum>
yarn.resourcemanager.cluster-id          <cluster-id>
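As a convenience, the settings above can be rendered into the `<configuration>`/`<property>` layout that yarn-site.xml uses. This is a sketch: `to_yarn_site` is a hypothetical helper, and `HOST1`, `HOST2`, `ZK_QUORUM` and `CLUSTER_ID` are placeholders to be replaced with real values.

```python
import xml.etree.ElementTree as ET

# Keys are the config names listed above; upper-case values are placeholders.
HA_PROPERTIES = {
    "yarn.resourcemanager.ha.enabled": "true",
    "yarn.resourcemanager.ha.rm-ids": "rm1,rm2",
    "yarn.resourcemanager.hostname.rm1": "HOST1",
    "yarn.resourcemanager.hostname.rm2": "HOST2",
    "yarn.resourcemanager.recovery.enabled": "true",
    "yarn.resourcemanager.store.class":
        "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore",
    "yarn.resourcemanager.zk-address": "ZK_QUORUM",
    "yarn.resourcemanager.cluster-id": "CLUSTER_ID",
}


def to_yarn_site(props):
    """Render a dict of properties as a Hadoop-style configuration document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_yarn_site(HA_PROPERTIES)
print(xml_text)
```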