Consistent Reads from Standby Node
Konstantin V Shvachko
Sr. Staff Software Engineer
@LinkedIn
Chen Liang
Senior Software Engineer
@LinkedIn
Chao Sun
Software Engineer
@Uber
Agenda
HDFS CONSISTENT READ FROM STANDBY
• Motivation
• Consistent Reads from Standby
• Challenges
• Design and Implementation
• Next steps
The Team
• Konstantin Shvachko (LinkedIn)
• Chen Liang (LinkedIn)
• Erik Krogen (LinkedIn)
• Chao Sun (Uber)
• Plamen Jeliazkov (Paypal)
Consistent Reads From
Standby Nodes
Motivation
• 2x Growth/Year in Workloads and Size
• Rapidly approaching active NameNode performance limits
• We need a scalability solution
• Key Insights:
• Reads comprise 95% of all metadata operations in our practice
• Another source of truth for reads: Standby Nodes
• Standby Nodes Serving Read Requests
• Can substantially decrease active NameNode workload
• Allowing the cluster to scale further!
Architecture
ROLE OF STANDBY NODES
[Diagram: DataNodes, Active NameNode, Standby NameNodes, and JournalNodes; clients write to the Active NameNode and read from the Active or Standby NameNodes, while edits flow from the Active through the JournalNodes to the Standbys]
• Standby nodes have the same copy of all metadata (with some delay)
• Standby Node syncs edits from the Active NameNode
• Standby nodes can potentially serve read requests
• All reads can go to Standby nodes
• OR, time-critical applications can still choose to read from the Active only
Challenges
[Diagram: same topology as above; writes go to the Active NameNode, reads to the Active or Standby NameNodes, edits through the JournalNodes]
• Standby Node delay
• ANN writes edits to the JNs, then SbNN applies the edits from the JNs
• The delay is on the order of minutes
• Consistency
• If a client performs a read after a write, the client expects to see the state change
Fast Journaling
DELAY REDUCTION
• Fast Edit Tailing HDFS-13150
• Current JN is slow: it serves whole segments of edits from disk
• Optimizations on JN and SbNN
o JN caches recent edits in memory; only applied edits are served
o SbNN requests only recent edits through RPC calls
o Falls back to the existing mechanism on error
• Significantly reduces SbNN delay
o From about 1 minute down to 2–50 milliseconds
• Standby node delay is no more than a few ms in most cases
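Below is a minimal sketch of the tailing loop this design implies, assuming illustrative class and method names rather than the actual HDFS-13150 APIs: poll a JournalNode over RPC for recently cached edits, and fall back to segment-based tailing from disk when the cache cannot serve the request.

```java
import java.util.List;

// Hedged sketch of fast edit tailing (the HDFS-13150 idea). All types and
// method names below are illustrative stand-ins, not real HDFS classes.
class EditTailerSketch {
    interface JournalRpc { List<EditOp> getRecentEdits(long sinceTxId) throws CacheMissException; }
    interface EditOp { long getTxId(); }
    static class CacheMissException extends Exception {}

    private final JournalRpc journalRpc;
    private long lastAppliedTxId;
    private volatile boolean running = true;

    EditTailerSketch(JournalRpc rpc, long startTxId) {
        this.journalRpc = rpc;
        this.lastAppliedTxId = startTxId;
    }

    void run(long tailIntervalMs) throws InterruptedException {
        while (running) {
            try {
                // Ask the JournalNode for edits newer than what we applied.
                for (EditOp op : journalRpc.getRecentEdits(lastAppliedTxId + 1)) {
                    applyToNamespace(op);             // apply in txid order
                    lastAppliedTxId = op.getTxId();
                }
            } catch (CacheMissException e) {
                // Edits fell out of the JN in-memory cache: fall back to the
                // pre-existing, slower segment tailing from disk.
                lastAppliedTxId = tailEditSegmentsFromDisk();
            }
            Thread.sleep(tailIntervalMs);             // e.g. a few ms
        }
    }

    private void applyToNamespace(EditOp op) { /* apply edit to namespace */ }
    private long tailEditSegmentsFromDisk() { return lastAppliedTxId; }
}
```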
Consistency Model
• Consistency Principle:
• If client c1 modifies an object state at id1 at time t1, then at any future time t2 > t1, c1 will see the state of that object at id2 >= id1
• Read-Your-Own-Write
• Client writes to the Active NameNode
• Then reads from the Standby Node
• The read should reflect the write
[Diagram: a client whose lastSeenStateId = 100 after a write committed at txnid = 100 on the Active NameNode; the Standby NameNode, still at txnid = 99, catches up to 100 via the JournalNodes before answering]
Consistency Model
• Consistency Principle:
• If client c1 modifies an object state at id1 at time t1, then at any future time t2 > t1, c1 will see the state of that object at id2 >= id1
• LastSeenStateID
• Monotonically increasing id of the ANN namespace state (txnid)
• Kept on the client side; the client's most recently known ANN state
• Sent to the SbNN; the SbNN only replies after it has caught up to this state
[Diagram: the client's lastSeenStateId = 100 travels with the request; the Standby NameNode replies once it has applied edits up to txnid = 100 from the JournalNodes]
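A minimal sketch of this handshake, assuming simplified names; in HDFS the state id is actually piggybacked on RPC headers (an alignment context) rather than passed as an explicit argument.

```java
// Hedged sketch of the LastSeenStateID handshake; all names here are
// illustrative, not the actual HDFS classes.
class ObserverReadSketch {
    private long lastAppliedStateId;     // this node's applied ANN txnid

    interface ReadOp { Object execute(); }

    // Serve a read carrying the client's last seen state id: hold the
    // request until this node has caught up, so the client never observes
    // a state older than one it has already seen.
    synchronized Object serveRead(long clientSeenStateId, ReadOp op)
            throws InterruptedException {
        while (lastAppliedStateId < clientSeenStateId) {
            wait(10);                    // re-check as edits are applied
        }
        return op.execute();
    }

    // Called by the edit-tailing thread after applying a batch of edits.
    synchronized void onEditsApplied(long newStateId) {
        lastAppliedStateId = newStateId;
        notifyAll();                     // wake any waiting readers
    }
}
```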
Corner Case: Stale Reads
• Stale Read Cases
• Case 1: Multiple client instances
• DFSClient#1 writes to the ANN, DFSClient#2 reads from the SbNN
• DFSClient#2's state is older than DFSClient#1's, so its read is out of sync
• Case 2: Out-of-band communication
• Client#1 writes to the ANN, then informs Client#2
• Client#2 reads from the SbNN and does not see the write
[Diagrams: in both "Read your own writes" and "Third-party communication", DFSClient#1 writes to the Active NameNode while DFSClient#2 reads from the Standby NameNode]
msync API
• Dealing with Stale Reads: FileSystem.msync()
• Syncs between existing client instances
• Forces the DFSClient to sync up to the most recent state of the ANN
• Multiple client instances: call msync on DFSClient#2
• Out-of-band communication: Client#2 calls msync before reading
• “Always msync” mode HDFS-14211
[Diagrams: the same "Read your own writes" and "Third-party communication" scenarios, with msync called before the read from the Standby NameNode]
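A minimal usage sketch for the out-of-band case; the cluster URI and path are illustrative assumptions.

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch: Client#2 was told out of band that a file exists, so it
// calls msync() to advance its view to the Active NameNode's latest state
// before reading from an Observer. URI and path are illustrative.
public class MsyncExample {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(
            URI.create("hdfs://mycluster"), new Configuration());
        fs.msync();   // catch up to the ANN state before the read
        FileStatus st = fs.getFileStatus(new Path("/data/file-from-client1"));
        System.out.println(st.getLen());
    }
}
```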
Robustness Optimization: Standby Node Back-off
REDIRECT WHEN TOO FAR BEHIND
• If a Standby node's state is too far behind, the client may retry another node
• e.g. the Standby node's machine is running slow
• Standby Node Back-off
• 1: Upon receiving a request, if the Standby node finds itself too far behind the requested state, it rejects the request by throwing a retry exception
• 2: If a request has been queued for too long and the Standby has still not caught up, the Standby rejects the request by throwing a retry exception
• Client Retry
• Upon a retry exception, the client tries a different Standby node, or simply falls back to the ANN
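A minimal sketch of the two back-off checks, with illustrative thresholds and names rather than the actual HDFS implementation.

```java
// Hedged sketch of observer-side back-off; thresholds, names, and the
// exception type layout are illustrative, not the real HDFS code.
class BackoffSketch {
    static class RetriableException extends Exception {
        RetriableException(String msg) { super(msg); }
    }

    private final long maxLagTxns = 1000;       // illustrative threshold
    private final long maxQueueWaitMs = 2000;   // illustrative threshold

    void checkBackoff(long clientStateId, long enqueuedAtMs, long serverStateId)
            throws RetriableException {
        // Check 1: reject immediately if this node is too far behind the
        // state the client has already seen.
        if (clientStateId - serverStateId > maxLagTxns) {
            throw new RetriableException("Observer too far behind: at "
                + serverStateId + ", client expects " + clientStateId);
        }
        // Check 2: reject if the request has waited too long in the queue
        // without this node catching up; the client will retry elsewhere.
        if (System.currentTimeMillis() - enqueuedAtMs > maxQueueWaitMs
                && serverStateId < clientStateId) {
            throw new RetriableException("Timed out waiting to catch up to "
                + clientStateId);
        }
    }
}
```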
Configuration and Startup Process
• Configuring NameNodes
• Configure NameNodes via haadmin
• Observer mode is similar to Standby, but serves reads and does not perform checkpointing
• All NameNodes start as checkpointing Standby; a Standby can be transitioned to Active or Observer
• Configuring the Client
• Configure it to use ObserverReadProxyProvider
• If not, the client still works but only talks to the ANN
• ObserverReadProxyProvider will discover the state of all NNs
[State transition diagram: a checkpointing Standby can transition to Active or to a read-serving Observer, and back]
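A minimal client-side sketch; the nameservice name "mycluster" is an assumption, and in practice the property is usually set in hdfs-site.xml rather than in code.

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch: pointing the client's failover proxy provider at
// ObserverReadProxyProvider so reads can be served by Observers.
public class ObserverClientSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("dfs.client.failover.proxy.provider.mycluster",
            "org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider");
        FileSystem fs = FileSystem.get(URI.create("hdfs://mycluster"), conf);
        System.out.println(fs.exists(new Path("/")));  // read via Observer
    }
}
```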
Current Status
• Test and benchmark
• With YARN applications, e.g. TeraSort
• With HDFS benchmarks, e.g. DFSIO
• Run on a cluster with >100 nodes, with Kerberos and delegation tokens enabled
• Merged to trunk (3.3.0)
• Being backported to branch-2
• Active work on further improvements and optimizations
• Has been running in production at Uber
Background
● Back in 2017, Uber’s HDFS clusters were in bad shape
○ Rapid growth in the number of jobs accessing HDFS
○ Ingestion and ad-hoc jobs co-located on the same cluster
○ Lots of listing calls on very large directories (esp. Hudi)
● HDFS traffic composition: 96% reads, 4% writes
● Presto is very sensitive to HDFS latency
○ Occupies ~20% of HDFS traffic
○ Only reads from HDFS, no writes
Implementation & Timeline
● Implementation (compared to the open-source version)
○ No msync or fast edit log tailing
■ Only eventual consistency, with max staleness of 10s
○ Observer was NOT eligible for NN failover
○ Batched edits loading to reduce NN lock time when tailing edits
● Timeline
○ 08/2017 - finished the POC and basic testing in dev clusters
○ 12/2017 - started collaborating with the HDFS open source community (e.g., LinkedIn, Paypal)
○ 12/2018 - fully rolled out to Presto in production
○ Took multiple retries along the way
■ Disable access time (dfs.namenode.accesstime.precision)
■ HDFS-13898, HDFS-13924
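On the access-time point: setting the precision to 0 disables atime updates, so list/read paths stay read-only. A minimal sketch, assuming programmatic configuration for illustration (in practice this goes in hdfs-site.xml on the NameNodes):

```java
import org.apache.hadoop.conf.Configuration;

// Hedged sketch: disabling access-time (atime) updates so read paths do
// not trigger namespace writes; this is the setting the slide refers to.
public class DisableAccessTime {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.setLong("dfs.namenode.accesstime.precision", 0); // 0 = disabled
        System.out.println(conf.get("dfs.namenode.accesstime.precision"));
    }
}
```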
Impact
Compared to traffic going to the active NameNode, the Observer NameNode improves overall throughput by ~20% (with roughly the same throughput from Presto), while RPC queue time has dropped ~30x.
Impact (cont.)
Presto listing status call latency has dropped 8–10x after migrating to the Observer.
Next Steps
Three-Stage Scalability Plan
2X GROWTH / YEAR IN WORKLOADS AND SIZE
• Stage I. Consistent reads from standby
• Optimize for reads: 95% of all operations
• Consistent reading is a coordination problem
• Stage II. In-memory Partitioned Namespace
• Optimize write operations
• Eliminate NameNode’s global lock – fine-grained locking
• Stage III. Dynamically Distributed Namespace Service
• Linear scaling to accommodate increases in RPC load and metadata growth
HDFS-12943
NameNode Current State
NAMENODE’S GLOBAL LOCK – PERFORMANCE BOTTLENECK
• Three main data structures
• INodeMap: id -> INode
• BlocksMap: key -> BlockInfo
• DatanodeMap: don’t split
• GSet – an efficient HashMap
implementation
• Hash(key) -> Value
• Global lock to update INodes and
blocks
[Diagram: NameNode (FSNamesystem) containing the INodeMap / directory tree (GSet: id -> INode), the BlocksMap / Block Manager (GSet: Block -> BlockInfo), and the DataNode Manager]
Stage II. In-memory Partitioned Namespace
ELIMINATE NAMENODE’S GLOBAL LOCK
• PartitionedGSet:
• two-level mapping
1. RangeMap: keyRange -> GSet
2. RangeGSet: key -> INode
• Fine-grained locking
• Individual locks per range
• Different ranges are accessed in parallel
[Diagram: NameNode with the INodeMap and the BlocksMap each built as a Partitioned GSet (GSet-1, GSet-2, …, GSet-n), alongside the DataNode Manager]
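A minimal sketch of the two-level idea: a first-level range map whose entries each hold their own lock and their own GSet, so different ranges are locked independently. Names are illustrative; the actual HDFS-12943 implementation (with LatchLock) differs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantLock;

// Hedged sketch of a PartitionedGSet-like structure; assumes every key is
// covered by some partition (floorEntry never returns null here).
class PartitionedGSetSketch<K extends Comparable<K>, V> {
    private static class Partition<K, V> {
        final ReentrantLock lock = new ReentrantLock();
        final Map<K, V> gset = new HashMap<>();  // stand-in for an HDFS GSet
    }

    // Level 1: start key of a range -> partition holding that range.
    private final ConcurrentNavigableMap<K, Partition<K, V>> ranges =
        new ConcurrentSkipListMap<>();

    void addPartition(K startKey) {
        ranges.put(startKey, new Partition<>());
    }

    // Level 2: lock only the partition that owns the key, not the whole map,
    // so lookups in different ranges proceed in parallel.
    V get(K key) {
        Partition<K, V> p = ranges.floorEntry(key).getValue();
        p.lock.lock();
        try {
            return p.gset.get(key);
        } finally {
            p.lock.unlock();
        }
    }

    void put(K key, V value) {
        Partition<K, V> p = ranges.floorEntry(key).getValue();
        p.lock.lock();
        try {
            p.gset.put(key, value);
        } finally {
            p.lock.unlock();
        }
    }
}
```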
Stage II. In-memory Partitioned Namespace
EARLY POC RESULTS
• PartitionedGSet: two-level mapping
• LatchLock: swap the RangeMap lock for the GSet locks corresponding to the inode keys
• Ran NNThroughputBenchmark creating 10 million directories
• 30% throughput gain
• Large batches of edits
• Why not 100%?
• Key is inodeId, an incremental number generator
• Contention on the last partition
• Expect MORE
Stage III. Dynamically Distributed Namespace
SCALABLE DATA AND METADATA
• Split NameNode state into multiple servers based on key ranges
• Each NameNode
• Serves a designated range of INode keys
• Keeps metadata in a PartitionedGSet
• Can reassign certain subranges to adjacent nodes
• Coordination Service (Ratis)
• Changes ranges served by NNs
• Renames / moves, Quotas
[Diagram: NameNode 1 through NameNode n, each with its own INodeMap Part-GSet, BlocksMap Part-GSet, and DataNode Manager]
Thank You!
Konstantin V Shvachko, Sr. Staff Software Engineer @LinkedIn
Chen Liang, Senior Software Engineer @LinkedIn
Chao Sun, Software Engineer @Uber
Consistent Reads from Standby Node
Editor’s notes
  1. State transition diagram
  2. Winter is coming!
  3. See Appendix. The SlideShare version will have more details about the Satellite Cluster configuration and operational solutions