YARN High Availability

Karthik Kambatla – Cloudera Inc
Xuan Gong – Hortonworks Inc

Outline
• Background
– YARN architecture and need for HA
• RM HA architecture
– Persisting the state
– Active / Standby pair and fencing
– Failover and redirection
• Configuring HA
• Demo
6/30/2014, YARN High Availability, Hadoop Summit

YARN Architecture
[Diagram: two Clients, a Resource Manager holding the Cluster State and Applications State, and three Node Managers hosting an App Master and a Container.]

Fault-tolerance
[Diagram: two Clients, a Resource Manager holding the Cluster State and Applications State, and three Node Managers; the App Master and Container each appear twice.]

Naïve RM Restart
[Diagram: two Clients, a Resource Manager with its Cluster State and Applications State, two Node Managers, and an App Master.]

ResourceManager is a YARN cluster’s
single point of failure.
Need stateful restart and multiple RMs.

Highly Available Resource Manager
a.k.a. HARMful YARN
• Currently shipped
– Beta in Apache Hadoop 2.3.0
– Stable in Apache Hadoop 2.4.0
– More stable in Apache Hadoop 2.4.1

Stateful RM Restart (Phase 1)
[Diagram: two Clients, a Resource Manager with its Cluster State and Applications State backed by an RM Store, and two Node Managers; the App Master and Container each appear twice.]

RM Store Implementations
• Memory store – testing purposes
• Filesystem based store
– Any file system: local, HDFS or any other
• ZooKeeper-based store (ZKRMStateStore)
– Recommended (for fencing)
– Loading 10,000 applications takes about 8.5 secs.
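
The idea of a pluggable store can be sketched in a few lines. This is a toy model, not YARN code: the class names `MemoryStore`, `FileStore`, and `ToyResourceManager`, and the JSON-file layout, are assumptions made for illustration only.

```python
import json
import os
import tempfile


class MemoryStore:
    """Keeps application state in a dict; everything is lost with the process."""

    def __init__(self):
        self._apps = {}

    def store_app(self, app_id, state):
        self._apps[app_id] = state

    def load_all(self):
        return dict(self._apps)


class FileStore:
    """Persists each application as one JSON file under a directory."""

    def __init__(self, root):
        self._root = root
        os.makedirs(root, exist_ok=True)

    def store_app(self, app_id, state):
        with open(os.path.join(self._root, app_id + ".json"), "w") as f:
            json.dump(state, f)

    def load_all(self):
        apps = {}
        for name in os.listdir(self._root):
            if name.endswith(".json"):
                with open(os.path.join(self._root, name)) as f:
                    apps[name[: -len(".json")]] = json.load(f)
        return apps


class ToyResourceManager:
    """Writes through to the store on submit and reloads from it on start."""

    def __init__(self, store):
        self._store = store
        self.apps = store.load_all()  # stateful restart: recover known apps

    def submit(self, app_id, queue):
        state = {"queue": queue, "status": "ACCEPTED"}
        self._store.store_app(app_id, state)  # persist before acknowledging
        self.apps[app_id] = state


def demo_restart():
    root = tempfile.mkdtemp()
    rm = ToyResourceManager(FileStore(root))
    rm.submit("application_0001", "default")
    # Simulate a crash: start a fresh RM object on the same store directory.
    restarted = ToyResourceManager(FileStore(root))
    return restarted.apps
```

With `FileStore`, the restarted RM sees the application again; with a fresh `MemoryStore` it would start empty, which is why the memory store is only suitable for testing.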

Implications for Running Applications
• In-flight work is lost.
• AMs are restarted.
• AMs could checkpoint completed work.
– MapReduce AM does.
– Consider a job with 100 map tasks:
• Suppose the RM goes down after 90 map tasks finish.
• After restart, only the remaining 10 are run.
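
The 100-task example can be replayed with a small simulation. It is a sketch only: `run_job` and the `checkpoint` set are hypothetical stand-ins for an AM that records finished tasks, not the MapReduce AM's actual recovery code.

```python
def run_job(total_tasks, completed, fail_after=None):
    """Run every task not already in `completed`.

    `completed` plays the role of the AM's checkpoint of finished tasks.
    If `fail_after` is set, the attempt stops after that many tasks to
    simulate the RM going down. Returns the task ids run in this attempt.
    """
    executed = []
    for task_id in range(total_tasks):
        if task_id in completed:
            continue  # recovered from the checkpoint, not re-run
        if fail_after is not None and len(executed) == fail_after:
            break  # RM goes down; nothing further runs in this attempt
        executed.append(task_id)
        completed.add(task_id)  # checkpoint the finished task
    return executed


checkpoint = set()
first_attempt = run_job(100, checkpoint, fail_after=90)
second_attempt = run_job(100, checkpoint)
print(len(first_attempt), len(second_attempt))  # prints "90 10"
```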

Stateful RM Restart (Phase 2)
• Under development (YARN-556)
– No loss of in-flight work
• Related work
– Work-preserving NodeManager restart (YARN-1336)
– Work-preserving ApplicationMaster restart (YARN-1489)

Multiple RMs
• Active / Standby architecture
– Potentially multiple standbys
– Warm standby
• Running
• Loads state and starts RPC servers on becoming Active
– Manual / automatic failover
– Clients and Web UI fail over automatically
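
A warm standby can be modelled as a process that is running but only loads state and opens its RPC endpoint when it becomes Active. The sketch below is a toy under that assumption; `ToyRM` and `manual_failover` are illustrative names, and the shared dict stands in for the RM store.

```python
class ToyRM:
    def __init__(self, rm_id, store):
        self.rm_id = rm_id
        self.store = store        # shared dict standing in for the RM store
        self.state = "STANDBY"    # running, but serving no requests
        self.apps = None
        self.rpc_started = False

    def transition_to_active(self):
        if self.state == "ACTIVE":
            return
        self.apps = dict(self.store)  # load state only on becoming Active
        self.rpc_started = True       # then start serving clients
        self.state = "ACTIVE"

    def transition_to_standby(self):
        self.rpc_started = False
        self.apps = None
        self.state = "STANDBY"

    def submit(self, app_id):
        if self.state != "ACTIVE":
            raise RuntimeError(self.rm_id + " is in standby")
        self.store[app_id] = "ACCEPTED"
        self.apps[app_id] = "ACCEPTED"


def manual_failover(old_active, new_active):
    """Demote the old Active first, then promote the standby."""
    old_active.transition_to_standby()
    new_active.transition_to_active()
```

In a real cluster the manual equivalent is issued through `yarn rmadmin -transitionToStandby` and `yarn rmadmin -transitionToActive`, each followed by an RM id.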

Active / Standby
[Diagram: two Clients, two Node Managers, an App Master, an Active Resource Manager, a Standby Resource Manager, and the RM Store.]

Manual Failover through CLI
[Diagram: two Clients, two Node Managers, an App Master, an Active Resource Manager, a Standby Resource Manager, and the RM Store.]

Client Failover (ConfiguredFailoverProxyProvider)
[Diagram: two Clients, two Node Managers, an Active Resource Manager, a Standby Resource Manager, and the RM Store; the App Master appears twice.]
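
Client-side failover over a configured list of RMs can be sketched as a retry loop that moves to the next RM whenever the current one turns out not to be Active. This is only an illustration of the idea; `FailoverClient`, `StubRM`, and `StandbyError` are hypothetical names, not the ConfiguredFailoverProxyProvider API.

```python
class StandbyError(Exception):
    """Raised by a stub RM that is not Active."""


class StubRM:
    def __init__(self, name, active):
        self.name = name
        self.active = active

    def get_cluster_metrics(self):
        if not self.active:
            raise StandbyError(self.name)
        return {"served_by": self.name}


class FailoverClient:
    """Tries the configured RMs in order, remembering the last one that worked."""

    def __init__(self, rms):
        self._rms = list(rms)
        self._index = 0

    def call(self, method, *args):
        # Give every RM two chances before giving up.
        for _ in range(2 * len(self._rms)):
            rm = self._rms[self._index]
            try:
                return getattr(rm, method)(*args)
            except (StandbyError, ConnectionError):
                # Fail over to the next configured RM and retry.
                self._index = (self._index + 1) % len(self._rms)
        raise ConnectionError("no Active RM found")
```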

Automatic Failover
[Diagram: two Clients, two Node Managers, an App Master, an Active Resource Manager and a Standby Resource Manager, each with an Elector, plus the RM Store and ZK.]

Automatic Failover
• ZooKeeper-based
– Uses ActiveStandbyElector for Active election
• No need for a separate FailoverController; the elector runs inside the RM
– Trade-off: nothing monitors the RM process health and recovers it
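
Lock-based election of this kind can be sketched with an in-memory stand-in for ZooKeeper: whoever creates the lock becomes Active, and the others watch the lock and try again when it disappears. `ToyZooKeeper` and `ToyElector` are hypothetical classes written for this illustration; they are not the ActiveStandbyElector API.

```python
class ToyZooKeeper:
    """In-memory stand-in for a single ephemeral lock node."""

    def __init__(self):
        self._lock_owner = None
        self._watchers = []

    def try_create_lock(self, owner):
        if self._lock_owner is None:
            self._lock_owner = owner
            return True
        return False

    def watch_lock(self, callback):
        self._watchers.append(callback)

    def expire_session(self, owner):
        """The owner's session ends, so its ephemeral lock is deleted."""
        if self._lock_owner == owner:
            self._lock_owner = None
            watchers, self._watchers = self._watchers, []
            for callback in watchers:
                callback()  # notify everyone waiting on the lock


class ToyElector:
    def __init__(self, rm_id, zk):
        self.rm_id = rm_id
        self.zk = zk
        self.state = "STANDBY"

    def join_election(self):
        if self.zk.try_create_lock(self.rm_id):
            self.state = "ACTIVE"
        else:
            self.state = "STANDBY"
            self.zk.watch_lock(self.join_election)  # retry when lock goes away


zk = ToyZooKeeper()
e1, e2 = ToyElector("rm1", zk), ToyElector("rm2", zk)
e1.join_election()
e2.join_election()
zk.expire_session("rm1")  # e.g. rm1 is cut off from ZooKeeper
```

After the session expiry, `e2` is Active, but nothing in this sketch tells `e1` that it lost the lock, so both believe they are Active. That is the situation the fencing slide addresses.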

Network Hiccup
[Diagram: two Clients, two Node Managers, an App Master, an Active Resource Manager and a Standby Resource Manager, each with an Elector, plus the RM Store and ZK.]

Multiple Actives?
[Diagram: two Clients, two Node Managers, an App Master, two Resource Managers both marked Active, each with an Elector, plus the RM Store and ZK.]

Fencing
• The state store gets corrupted when multiple
RMs assume the Active role.
• Fix: give a single RM exclusive access to the store.
– ZKRMStateStore takes care of this.
– All RMs share “admin” access.
– An RM takes exclusive “create-delete” access on
transition to Active.
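
The access scheme above can be sketched as a store that tracks a shared set of admins and a single writer. This is a simplified model, not ZKRMStateStore itself: `ToyFencedStore`, `FencedError`, and the method names are assumptions made for illustration.

```python
class FencedError(Exception):
    """Raised when an RM without write access tries to modify the store."""


class ToyFencedStore:
    def __init__(self, admins):
        self._admins = set(admins)  # shared "admin" access
        self._writer = None         # exclusive "create-delete" access
        self.data = {}

    def claim_write_access(self, rm_id):
        """Called on transition to Active: use admin rights to become the sole writer."""
        if rm_id not in self._admins:
            raise PermissionError(rm_id + " has no admin access")
        self._writer = rm_id

    def write(self, rm_id, key, value):
        if rm_id != self._writer:
            raise FencedError(rm_id + " has been fenced")
        self.data[key] = value


store = ToyFencedStore(["rm1", "rm2"])
store.claim_write_access("rm1")
store.write("rm1", "app_1", "RUNNING")

# rm2 becomes Active while rm1 still believes it is Active.
store.claim_write_access("rm2")
try:
    store.write("rm1", "app_1", "FINISHED")  # stale Active is rejected
    stale_write_rejected = False
except FencedError:
    stale_write_rejected = True
```

Because taking write access revokes it from the previous holder, a stale Active finds out it was fenced on its next write instead of corrupting the store.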

Network Hiccup
[Diagram: two Clients, two Node Managers, an App Master, an Active Resource Manager and a Standby Resource Manager, each with an Elector, plus the RM Store and ZK.]

Active / Standby
[Diagram: two Clients, two Node Managers, an App Master, an Active Resource Manager and a Standby Resource Manager, each with an Elector, plus the RM Store and ZK.]

In-flight RPCs
• In-flight RPCs: Retry or not?
– E.g. Submit application – we clearly don’t want
two applications submitted.
• Depends on whether failover happens before, during,
or after the RM acts on the call.
• Solution
– Annotate APIs as Idempotent or AtMostOnce
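
One way to read the two annotations: an idempotent call can simply be re-executed on retry, while an at-most-once call needs the server to recognise the retry and replay its earlier reply. The sketch below models that with Python decorators and a per-call id. `ToyServer`, `call_with_retry`, and the retry cache are assumptions for illustration, and the model uses a single server, so it does not show how such a cache would survive a failover.

```python
import itertools


def idempotent(fn):
    fn.retry_semantics = "idempotent"
    return fn


def at_most_once(fn):
    fn.retry_semantics = "at_most_once"
    return fn


class ToyServer:
    def __init__(self):
        self.apps = []
        self.queues = {"default": 10}
        self._retry_cache = {}        # call id -> reply already produced
        self.drop_next_reply = False  # test hook: lose one reply on the wire

    @idempotent
    def get_queue_capacity(self, queue):
        return self.queues[queue]

    @at_most_once
    def submit_application(self, name):
        self.apps.append(name)
        return len(self.apps)

    def handle(self, call_id, method, *args):
        fn = getattr(self, method)
        once = fn.retry_semantics == "at_most_once"
        if once and call_id in self._retry_cache:
            return self._retry_cache[call_id]  # replay, do not re-execute
        result = fn(*args)
        if once:
            self._retry_cache[call_id] = result
        if self.drop_next_reply:
            self.drop_next_reply = False
            raise ConnectionError("reply lost after the call was executed")
        return result


_call_ids = itertools.count(1)


def call_with_retry(server, method, *args, attempts=3):
    call_id = next(_call_ids)  # one id per logical call, reused on retries
    for _ in range(attempts):
        try:
            return server.handle(call_id, method, *args)
        except ConnectionError:
            continue
    raise ConnectionError("gave up after retries")
```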

Web UI
• Standby RM has no/stale information.
• Users don’t know which RM is Active.
• Redirect Web UI and REST calls to Active RM.
– Except a few pages that give information about
the RM.
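
The redirect rule can be sketched as a single routing decision. The paths in `LOCAL_PATHS` and the function name `route` are hypothetical examples; the slide does not list which pages are served locally.

```python
# Hypothetical examples of pages that describe the RM itself.
LOCAL_PATHS = {"/cluster/cluster", "/conf"}


def route(path, this_rm_is_active, active_rm_url):
    """Decide whether this RM serves a request or redirects it to the Active RM."""
    if this_rm_is_active or path in LOCAL_PATHS:
        return ("serve", path)
    return ("redirect", active_rm_url + path)
```

For example, a standby asked for `/cluster/apps` would answer with a redirect to the Active RM's address, while a request for one of the local pages is served in place.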

Admin Refresh
• Admin refresh commands (e.g. $ yarn rmadmin -refreshQueues):
– Refreshes that particular RM – Active/Standby
– Uses local configuration file
• FileSystemBasedConfigurationProvider
– Upload the configuration files to a (potentially shared)
filesystem such as HDFS.

Setting up HA
Config name                              Value
yarn.resourcemanager.ha.enabled          true
yarn.resourcemanager.ha.rm-ids           rm1,rm2
yarn.resourcemanager.hostname.rm1        <host1>
yarn.resourcemanager.hostname.rm2        <host2>
yarn.resourcemanager.recovery.enabled    true
yarn.resourcemanager.store.class         org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
yarn.resourcemanager.zk-address          <zk-quorum>
yarn.resourcemanager.cluster-id          <cluster-id>
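
These settings normally live in yarn-site.xml as `<property>` entries. As a sketch, the helper below renders the table into that form; the function names `ha_properties` and `render_yarn_site` are hypothetical, and the host, quorum, and cluster id passed in are placeholders to be replaced with real values.

```python
from xml.sax.saxutils import escape

STORE_CLASS = "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore"


def ha_properties(hosts, zk_quorum, cluster_id):
    """Build the HA settings from the table for the given RM hosts."""
    rm_ids = ["rm%d" % i for i in range(1, len(hosts) + 1)]
    props = {
        "yarn.resourcemanager.ha.enabled": "true",
        "yarn.resourcemanager.ha.rm-ids": ",".join(rm_ids),
    }
    for rm_id, host in zip(rm_ids, hosts):
        props["yarn.resourcemanager.hostname." + rm_id] = host
    props["yarn.resourcemanager.recovery.enabled"] = "true"
    props["yarn.resourcemanager.store.class"] = STORE_CLASS
    props["yarn.resourcemanager.zk-address"] = zk_quorum
    props["yarn.resourcemanager.cluster-id"] = cluster_id
    return props


def render_yarn_site(props):
    """Render name/value pairs as a Hadoop-style <configuration> document."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append("    <name>%s</name>" % escape(name))
        lines.append("    <value>%s</value>" % escape(value))
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values; substitute the real hosts and ZooKeeper quorum.
    print(render_yarn_site(ha_properties(
        ["host1.example.com", "host2.example.com"],
        "zk1.example.com:2181",
        "yarn-cluster",
    )))
```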

Demo!

Questions?
