SlideShare ist ein Scribd-Unternehmen logo
1 von 41
Downloaden Sie, um offline zu lesen
RAFT algorithm & Copycat
2015-08-17
Mobile Convergence LAB,
Department of Computer Engineering,
Kyung Hee University.
Consensus Algorithm
Mobile Convergence Laboratory 1 /
분산 컴퓨팅 분야에서
하나의 클러스터링 시스템에서 몇 개의 인스턴스가 오류가
발생하더라도 계속해서 서비스가 제공되도록 해주는 알고리즘
각 노드의 상태에 의존하는 어떤 값이 서로 일치하게 됨
각 노드가 네트워크 상의 이웃 노드들과 관련 정보를 공유하는 상호 규칙
In Consensus…
• Minority of servers fail = No problem
• 정상 작동하는 서버가 과반수 이상이면 시스템은 정상 작동
• Key : Consistent storage system
Mobile Convergence Laboratory 2 /
Paxos
• 1989년도에 발표
• Consensus Algorithm을 구현한 대표적인 프로토콜
• 이해하기 너무 어렵고, 실제 구현하기 어렵다라는 단점이 존재
3 /Mobile Convergence Laboratory
Why Raft?
• 너무 어려운 Paxos의 대안으로 주목받은 알고리즘
• 세분화되어 이해하기 쉽고, 구현하기 쉽게 개발(연구)됨
Mobile Convergence Laboratory 4 /
Mobile Convergence Laboratory 5 /
Raft
• “In Search of an Understandable Consensus Algorithm”
• Diego Ongaro & John Ousterhout in Stanford
• Best Paper Award at the 2014 USENIX Annual Technial
Conference.
• 이후 박사 학위 논문에서 확장하여 발표
(Consensus: Bridging Theory and Practice)
Mobile Convergence Laboratory 6 /
USENIX : Unix Users Group / 컴퓨터시스템 분야 세계
최고의 권위의 학술단체 첫 논문 : 18pages / 박사 논문 : 258pages
Replicated State Machines (1)
x3 y2 x1
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Servers
Clients
x 1
y 2
x3 y2 x1
x 1
y 2
x3 y2 x1
x 1
y 2
Consensus Module Manage & Replicate the logs
Log collection of commands
State Machine Execute the commands & Produce result
Replicated State Machines (2)
x3 y2 x1
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Servers
Clients
x 1
y 2
x3 y2 x1
x 1
y 2
x3 y2 x1 z6
x 1
y 2
z6
Record local machine
Consensus Module Manage & Replicate the logs
Log collection of commands
State Machine Execute the commands & Produce result
Replicated State Machines (3)
x3 y2 x1
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Servers
Clients
x 1
y 2
x3 y2 x1
x 1
y 2
x3 y2 x1 z6
x 1
y 2
Pass
to other machines
Consensus Module Manage & Replicate the logs
Log collection of commands
State Machine Execute the commands & Produce result
Replicated State Machines (4)
x3 y2 x1 z6
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Servers
Clients
x 1
y 2
x3 y2 x1 z6
x 1
y 2
x3 y2 x1 z6
x 1
y 2
Replicate log
Consensus Module Manage & Replicate the logs
Log collection of commands
State Machine Execute the commands & Produce result
Replicated State Machines (5)
x3 y2 x1 z6
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Servers
Clients
x 1
y 2
z 6
x3 y2 x1 z6
x 1
y 2
z 6
x3 y2 x1 z6
x 1
y 2
z 6
Safely replicate log
execute the command
Consensus Module Manage & Replicate the logs
Log collection of commands
State Machine Execute the commands & Produce result
Replicated State Machines (6)
x3 y2 x1 z6
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Log
Consensus
Module
State
Machine
Servers
Clients
x 1
y 2
z 6
x3 y2 x1 z6
x 1
y 2
z 6
x3 y2 x1 z6
x 1
y 2
z 6
z6
Consensus Module Manage & Replicate the logs
Log collection of commands
State Machine Execute the commands & Produce result
Raft 알고리즘의 핵심
• Leader Election
• leader의 failure 발생 시, 새 leader 가 반드시 선출되야 한다.
• Log Replication
• client로부터 log entry를 받으면 클러스터의 노드에 복사해준다.
• Safety
• consistency(일관성), leader election 관련 안전성
Mobile Convergence Laboratory 13 /
RPC for Raft
AppendEntries RPC
• Arguments
• term
• leaderID
• prevLogIndex
• prevLogTerm
• entries[]
RequestVote RPC
• Arguments
• term
• candidateID
• lastLogIndex
• lastLogTerm
Mobile Convergence Laboratory 14 /RPC : Remote Procedure Call
entries값(data값)을 empty이면
Heartbeat
Raft에서의 세 가지 상태
• Follower state
• Candidate state
• Leader state
Mobile Convergence Laboratory 15 /
Raft에서의 세 가지 상태 (cont)
• Follower state
• passive node; 모든 노드의 요청에 대한 응답만 수행 (issue no RPC)
• 클라이언트가 follower에 접속하면 leader에게 리다이렉트
Mobile Convergence Laboratory 16 /
Leader
Follower
Follower
Follower
Client
Raft에서의 세 가지 상태 (cont)
• Candidate state
• follower가 timeout된 상태
• leader에게 heartbeat를 받지 못 했을 때
• candidate 상태로 전이
Mobile Convergence Laboratory 17 /
Raft에서의 세 가지 상태 (cont)
• Leader state
• 모든 클라이언트의 요청을 처리
• 다른 노드(서버)의 Log replication 담당
• 반드시 가용한 leader 가 존재
Mobile Convergence Laboratory 18 /
Leader Election
• 기본적으로 노드들은 follower 로 시작
• leader는 heartbeat 이용
• heartbeat == Empty AppendEntries RPC
• 150ms < timeout < 300ms
• when timeout, follower -> candidate
• candidate는 과반수의 표를 획득하면 leader 로 상태 전이
Mobile Convergence Laboratory 19 /
Mobile Convergence Laboratory 20 /
Heart
beat
Leader FollowerEmpty
AppendEntries RPC
Timeout
150~300ms
Leader Election (cont)
Leader Election (cont)
• 일반적인 heartbeat
• leader가 failure 되었을 경우 or leader 의 응답이 늦을 경우
• http://raftconsensus.github.io/
Mobile Convergence Laboratory 21 /
Log replication
• leader 가 AppendEntries RPC로 수행
• 다른 노드로 복사(동기화)
• https://youtu.be/4OZZv80WrNk
Mobile Convergence Laboratory 22 /
Mobile Convergence Laboratory 23 /
Copycat
Mobile Convergence Laboratory 24 /
Copycat! Why?
• we chose to use Copycat because:
• 순수 자바 기반의 구현물
• 라이선스가 부합한 오픈소스
• 확장성과 커스텀 가능성(customizability)
• 최근까지 커밋, 지속적인 발전
• Well documentation
Mobile Convergence Laboratory 25
By Madan Jampani
copycat
• distributed coordination framework
• 수많은 Raft consensus protocol 구현물 중 하나
• Raft implement + α
26 /Mobile Convergence Laboratory
Mobile Convergence Laboratory 27 / 22
• Active member
• 리더가 될 수 있는 멤버
• Raft protocol
• Synchronous log replication
• Passive member
• 리더 선출에 참여하지 않는
follower
• Gossip protocol
• Asynchronous log replication
Copycat System Architecture
Mobile Convergence Laboratory 28 / 22
Server
Active
Leader
Follower
Candidate
Passive Follower
Raft Protocol
Gossip Protocol
Gossip protocol
• =epidemic protocol
• messages broadcast
• 주기적으로 랜덤한 타겟을 골라 gossip message 전송, 이것을
받아 infected 상태가 된 노드도 똑같이 행동
Mobile Convergence Laboratory 29 / 22
Gossip protocol (1)
• Gossiping = Probabilistic flooding
• Nodes forward with probability, p
Source
Gossip protocol (2)
• Gossip based broadcast
• Nodes forward with probability, p
Source
Gossip protocol (3)
• Gossip based broadcast
• Nodes forward with probability, p
Source
Gossip protocol (4)
• Gossip based broadcast
• Nodes forward with probability, p
Source
Gossip protocol (5)
• Gossip based broadcast
• Nodes forward with probability, p
Source
Gossip protocol (6)
• Gossip based broadcast
• Nodes forward with probability, p
Source
Gossip protocol (7)
• Gossip based broadcast
• Nodes forward with probability, p
Source
Gossip protocol (8)
• Gossip based broadcast
• Nodes forward with probability, p
Source
Gossip protocol (9)
• Gossip based broadcast
• Nodes forward with probability, p
Source
Gossip protocol (10)
• Gossip based broadcast
• Nodes forward with probability, p
Source
Gossip protocol (11)
• Gossip based broadcast
• Nodes forward with probability, p
Source
1. Simple,
2. Fault tolerant
3. Load-balanced

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Kafka Summit SF 2017 - Best Practices for Running Kafka on Docker Containers
Kafka Summit SF 2017 - Best Practices for Running Kafka on Docker ContainersKafka Summit SF 2017 - Best Practices for Running Kafka on Docker Containers
Kafka Summit SF 2017 - Best Practices for Running Kafka on Docker Containers
 
Data Pipelines with Apache Kafka
Data Pipelines with Apache KafkaData Pipelines with Apache Kafka
Data Pipelines with Apache Kafka
 
Timelines at Scale (Raffi Krikorian - VP of Engineering at Twitter)
Timelines at Scale (Raffi Krikorian - VP of Engineering at Twitter)Timelines at Scale (Raffi Krikorian - VP of Engineering at Twitter)
Timelines at Scale (Raffi Krikorian - VP of Engineering at Twitter)
 
High Availability for OpenStack
High Availability for OpenStackHigh Availability for OpenStack
High Availability for OpenStack
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL Database
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
 
Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and Zookeeper
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
 
ksqlDB로 실시간 데이터 변환 및 스트림 처리
ksqlDB로 실시간 데이터 변환 및 스트림 처리ksqlDB로 실시간 데이터 변환 및 스트림 처리
ksqlDB로 실시간 데이터 변환 및 스트림 처리
 
Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
 
Untangling Cluster Management with Helix
Untangling Cluster Management with HelixUntangling Cluster Management with Helix
Untangling Cluster Management with Helix
 
OpenStack High Availability
OpenStack High AvailabilityOpenStack High Availability
OpenStack High Availability
 
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron SchildkroutKafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
The Log of All Logs: Raft-based Consensus Inside Kafka | Guozhang Wang, Confl...
The Log of All Logs: Raft-based Consensus Inside Kafka | Guozhang Wang, Confl...The Log of All Logs: Raft-based Consensus Inside Kafka | Guozhang Wang, Confl...
The Log of All Logs: Raft-based Consensus Inside Kafka | Guozhang Wang, Confl...
 

Andere mochten auch

Algoritmos de Consenso: Paxos vs RAFT
Algoritmos de Consenso: Paxos vs RAFTAlgoritmos de Consenso: Paxos vs RAFT
Algoritmos de Consenso: Paxos vs RAFT
Maycon Viana Bordin
 

Andere mochten auch (19)

Introduction to Raft algorithm
Introduction to Raft algorithmIntroduction to Raft algorithm
Introduction to Raft algorithm
 
Raft in details
Raft in detailsRaft in details
Raft in details
 
Algorithms for Cloud Computing
Algorithms for Cloud ComputingAlgorithms for Cloud Computing
Algorithms for Cloud Computing
 
Algoritmos de Consenso: Paxos vs RAFT
Algoritmos de Consenso: Paxos vs RAFTAlgoritmos de Consenso: Paxos vs RAFT
Algoritmos de Consenso: Paxos vs RAFT
 
SDN컨트롤러 오벨_아토리서치
SDN컨트롤러 오벨_아토리서치SDN컨트롤러 오벨_아토리서치
SDN컨트롤러 오벨_아토리서치
 
Copycat presentation
Copycat presentationCopycat presentation
Copycat presentation
 
C++17 - the upcoming revolution (Code::Dive 2015)/
C++17 - the upcoming revolution (Code::Dive 2015)/C++17 - the upcoming revolution (Code::Dive 2015)/
C++17 - the upcoming revolution (Code::Dive 2015)/
 
OpenDaylight MD-SAL Clustering Explained
OpenDaylight MD-SAL Clustering ExplainedOpenDaylight MD-SAL Clustering Explained
OpenDaylight MD-SAL Clustering Explained
 
Sales presentations
Sales presentationsSales presentations
Sales presentations
 
Raft
Raft Raft
Raft
 
Voice Recognition Car
Voice Recognition CarVoice Recognition Car
Voice Recognition Car
 
OpenDaylight OpenFlow clustering
OpenDaylight OpenFlow clusteringOpenDaylight OpenFlow clustering
OpenDaylight OpenFlow clustering
 
OpenDaylight의 High Availability 기능 분석
OpenDaylight의 High Availability 기능 분석OpenDaylight의 High Availability 기능 분석
OpenDaylight의 High Availability 기능 분석
 
Performance analysis of adaptive noise canceller for an ecg signal
Performance analysis of adaptive noise canceller for an ecg signalPerformance analysis of adaptive noise canceller for an ecg signal
Performance analysis of adaptive noise canceller for an ecg signal
 
From Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsFrom Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed Systems
 
Differentiated Instruction Strategy Raft
Differentiated Instruction Strategy RaftDifferentiated Instruction Strategy Raft
Differentiated Instruction Strategy Raft
 
reveal.js 3.0.0
reveal.js 3.0.0reveal.js 3.0.0
reveal.js 3.0.0
 
The Etsy Shard Architecture: Starts With S and Ends With Hard
The Etsy Shard Architecture: Starts With S and Ends With HardThe Etsy Shard Architecture: Starts With S and Ends With Hard
The Etsy Shard Architecture: Starts With S and Ends With Hard
 
RAFT
RAFT RAFT
RAFT
 

Ähnlich wie RAFT Consensus Algorithm

HES2011 - Sebastien Tricaud - Capture me if you can
HES2011 - Sebastien Tricaud - Capture me if you canHES2011 - Sebastien Tricaud - Capture me if you can
HES2011 - Sebastien Tricaud - Capture me if you can
Hackito Ergo Sum
 

Ähnlich wie RAFT Consensus Algorithm (20)

Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit Log
 
BigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache ApexBigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache Apex
 
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - HackacIntro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
 
Debunking Common Myths in Stream Processing
Debunking Common Myths in Stream ProcessingDebunking Common Myths in Stream Processing
Debunking Common Myths in Stream Processing
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and ApplicationsApache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
 
Scalable Web Apps
Scalable Web AppsScalable Web Apps
Scalable Web Apps
 
HES2011 - Sebastien Tricaud - Capture me if you can
HES2011 - Sebastien Tricaud - Capture me if you canHES2011 - Sebastien Tricaud - Capture me if you can
HES2011 - Sebastien Tricaud - Capture me if you can
 
Hackito Ergo Sum 2011: Capture me if you can!
Hackito Ergo Sum 2011: Capture me if you can!Hackito Ergo Sum 2011: Capture me if you can!
Hackito Ergo Sum 2011: Capture me if you can!
 
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
 
[발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community)
[발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community) [발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community)
[발표자료] 오픈소스 Pacemaker 활용한 zabbix 이중화 방안(w/ Zabbix Korea Community)
 
The DEBS Grand Challenge 2017
The DEBS Grand Challenge 2017The DEBS Grand Challenge 2017
The DEBS Grand Challenge 2017
 
SAND: A Fault-Tolerant Streaming Architecture for Network Traffic Analytics
SAND: A Fault-Tolerant Streaming Architecture for Network Traffic AnalyticsSAND: A Fault-Tolerant Streaming Architecture for Network Traffic Analytics
SAND: A Fault-Tolerant Streaming Architecture for Network Traffic Analytics
 
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data EU 2016: Next Gen Big Data Analytics with Apache Apex
 
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
 
Ch3-2
Ch3-2Ch3-2
Ch3-2
 
Docker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsDocker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platforms
 
Introduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas WeiseIntroduction to Apache Apex by Thomas Weise
Introduction to Apache Apex by Thomas Weise
 

Mehr von sangyun han

Mehr von sangyun han (16)

SDN, ONOS, and Network Virtualization
SDN, ONOS, and Network VirtualizationSDN, ONOS, and Network Virtualization
SDN, ONOS, and Network Virtualization
 
Introduce to OpenVirteX
Introduce to OpenVirteXIntroduce to OpenVirteX
Introduce to OpenVirteX
 
XOS in open CORD project
XOS in open CORD projectXOS in open CORD project
XOS in open CORD project
 
Introduction to CORD project
Introduction to CORD projectIntroduction to CORD project
Introduction to CORD project
 
OpenWRT/Hostapd with ONOS
OpenWRT/Hostapd with ONOSOpenWRT/Hostapd with ONOS
OpenWRT/Hostapd with ONOS
 
KhuHub student guideline
KhuHub student guidelineKhuHub student guideline
KhuHub student guideline
 
KhuHub professor guideline
KhuHub professor guidelineKhuHub professor guideline
KhuHub professor guideline
 
ONOS - multiple instance setting(Distributed SDN Controller)
ONOS - multiple instance setting(Distributed SDN Controller)ONOS - multiple instance setting(Distributed SDN Controller)
ONOS - multiple instance setting(Distributed SDN Controller)
 
ONOS - setting, configuration, installation, and test
ONOS - setting, configuration, installation, and testONOS - setting, configuration, installation, and test
ONOS - setting, configuration, installation, and test
 
Introduction of ONOS and core technology
Introduction of ONOS and core technologyIntroduction of ONOS and core technology
Introduction of ONOS and core technology
 
ONOS와 Raspberry Pi 기반 가상물리 SDN 실증 환경 구축과 응용 개발
ONOS와 Raspberry Pi 기반 가상물리 SDN 실증 환경 구축과 응용 개발ONOS와 Raspberry Pi 기반 가상물리 SDN 실증 환경 구축과 응용 개발
ONOS와 Raspberry Pi 기반 가상물리 SDN 실증 환경 구축과 응용 개발
 
[SoftCon]SDN/IoT 그리고 Testbed
[SoftCon]SDN/IoT 그리고 Testbed[SoftCon]SDN/IoT 그리고 Testbed
[SoftCon]SDN/IoT 그리고 Testbed
 
Hazelcast 소개
Hazelcast 소개Hazelcast 소개
Hazelcast 소개
 
Implementing SDN Testbed(ONOS & OpenVirteX)
Implementing SDN Testbed(ONOS & OpenVirteX)Implementing SDN Testbed(ONOS & OpenVirteX)
Implementing SDN Testbed(ONOS & OpenVirteX)
 
Git & Github Seminar-1
Git & Github Seminar-1Git & Github Seminar-1
Git & Github Seminar-1
 
Git & Github Seminar-2
Git & Github Seminar-2Git & Github Seminar-2
Git & Github Seminar-2
 

Kürzlich hochgeladen

AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
VishalKumarJha10
 

Kürzlich hochgeladen (20)

AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 

RAFT Consensus Algorithm

  • 1. RAFT algorithm & Copycat 2015-08-17 Mobile Convergence LAB, Department of Computer Engineering, Kyung Hee University.
  • 2. Consensus Algorithm Mobile Convergence Laboratory 1 / 분산 컴퓨팅 분야에서 하나의 클러스터링 시스템에서 몇 개의 인스턴스가 오류가 발생하더라도 계속해서 서비스가 제공되도록 해주는 알고리즘 각 노드의 상태에 의존하는 어떤 값이 서로 일치하게 됨 각 노드가 네트워크 상의 이웃 노드들과 관련 정보를 공유하는 상호 규칙
  • 3. In Consensus… • Minority of servers fail = No problem • 정상 작동하는 서버가 과반수 이상이면 시스템은 정상 작동 • Key : Consistent storage system Mobile Convergence Laboratory 2 /
  • 4. Paxos • 1989년도에 발표 • Consensus Algorithm을 구현한 대표적인 프로토콜 • 이해하기 너무 어렵고, 실제 구현하기 어렵다라는 단점이 존재 3 /Mobile Convergence Laboratory
  • 5. Why Raft? • 너무 어려운 Paxos의 대안으로 주목받은 알고리즘 • 세분화되어 이해하기 쉽고, 구현하기 쉽게 개발(연구)됨 Mobile Convergence Laboratory 4 /
  • 7. Raft • “In Search of an Understandable Consensus Algorithm” • Diego Ongaro & John Ousterhout in Stanford • Best Paper Award at the 2014 USENIX Annual Technial Conference. • 이후 박사 학위 논문에서 확장하여 발표 (Consensus: Bridging Theory and Practice) Mobile Convergence Laboratory 6 / USENIX : Unix Users Group / 컴퓨터시스템 분야 세계 최고의 권위의 학술단체 첫 논문 : 18pages / 박사 논문 : 258pages
  • 8. Replicated State Machines (1) x3 y2 x1 Log Consensus Module State Machine Log Consensus Module State Machine Log Consensus Module State Machine Servers Clients x 1 y 2 x3 y2 x1 x 1 y 2 x3 y2 x1 x 1 y 2 Consensus Module Manage & Replicate the logs Log collection of commands State Machine Execute the commands & Produce result
  • 9. Replicated State Machines (2) x3 y2 x1 Log Consensus Module State Machine Log Consensus Module State Machine Log Consensus Module State Machine Servers Clients x 1 y 2 x3 y2 x1 x 1 y 2 x3 y2 x1 z6 x 1 y 2 z6 Record local machine Consensus Module Manage & Replicate the logs Log collection of commands State Machine Execute the commands & Produce result
  • 10. Replicated State Machines (3) x3 y2 x1 Log Consensus Module State Machine Log Consensus Module State Machine Log Consensus Module State Machine Servers Clients x 1 y 2 x3 y2 x1 x 1 y 2 x3 y2 x1 z6 x 1 y 2 Pass to other machines Consensus Module Manage & Replicate the logs Log collection of commands State Machine Execute the commands & Produce result
  • 11. Replicated State Machines (4) x3 y2 x1 z6 Log Consensus Module State Machine Log Consensus Module State Machine Log Consensus Module State Machine Servers Clients x 1 y 2 x3 y2 x1 z6 x 1 y 2 x3 y2 x1 z6 x 1 y 2 Replicate log Consensus Module Manage & Replicate the logs Log collection of commands State Machine Execute the commands & Produce result
  • 12. Replicated State Machines (5) x3 y2 x1 z6 Log Consensus Module State Machine Log Consensus Module State Machine Log Consensus Module State Machine Servers Clients x 1 y 2 z 6 x3 y2 x1 z6 x 1 y 2 z 6 x3 y2 x1 z6 x 1 y 2 z 6 Safely replicate log execute the command Consensus Module Manage & Replicate the logs Log collection of commands State Machine Execute the commands & Produce result
  • 13. Replicated State Machines (6) x3 y2 x1 z6 Log Consensus Module State Machine Log Consensus Module State Machine Log Consensus Module State Machine Servers Clients x 1 y 2 z 6 x3 y2 x1 z6 x 1 y 2 z 6 x3 y2 x1 z6 x 1 y 2 z 6 z6 Consensus Module Manage & Replicate the logs Log collection of commands State Machine Execute the commands & Produce result
  • 14. Raft 알고리즘의 핵심 • Leader Election • leader의 failure 발생 시, 새 leader 가 반드시 선출되야 한다. • Log Replication • client로부터 log entry를 받으면 클러스터의 노드에 복사해준다. • Safety • consistency(일관성), leader election 관련 안전성 Mobile Convergence Laboratory 13 /
  • 15. RPC for Raft AppendEntries RPC • Arguments • term • leaderID • prevLogIndex • prevLogTerm • entries[] RequestVote RPC • Arguments • term • candidateID • lastLogIndex • lastLogTerm Mobile Convergence Laboratory 14 /RPC : Remote Procedure Call entries값(data값)을 empty이면 Heartbeat
  • 16. Raft에서의 세 가지 상태 • Follower state • Candidate state • Leader state Mobile Convergence Laboratory 15 /
  • 17. Raft에서의 세 가지 상태 (cont) • Follower state • passive node; 모든 노드의 요청에 대한 응답만 수행 (issue no RPC) • 클라이언트가 follower에 접속하면 leader에게 리다이렉트 Mobile Convergence Laboratory 16 / Leader Follower Follower Follower Client
  • 18. Raft에서의 세 가지 상태 (cont) • Candidate state • follower가 timeout된 상태 • leader에게 heartbeat를 받지 못 했을 때 • candidate 상태로 전이 Mobile Convergence Laboratory 17 /
  • 19. Raft에서의 세 가지 상태 (cont) • Leader state • 모든 클라이언트의 요청을 처리 • 다른 노드(서버)의 Log replication 담당 • 반드시 가용한 leader 가 존재 Mobile Convergence Laboratory 18 /
  • 20. Leader Election • 기본적으로 노드들은 follower 로 시작 • leader는 heartbeat 이용 • heartbeat == Empty AppendEntries RPC • 150ms < timeout < 300ms • when timeout, follower -> candidate • candidate는 과반수의 표를 획득하면 leader 로 상태 전이 Mobile Convergence Laboratory 19 /
  • 21. Mobile Convergence Laboratory 20 / Heart beat Leader FollowerEmpty AppendEntries RPC Timeout 150~300ms Leader Election (cont)
  • 22. Leader Election (cont) • 일반적인 heartbeat • leader가 failure 되었을 경우 or leader 의 응답이 늦을 경우 • http://raftconsensus.github.io/ Mobile Convergence Laboratory 21 /
  • 23. Log replication • leader 가 AppendEntries RPC로 수행 • 다른 노드로 복사(동기화) • https://youtu.be/4OZZv80WrNk Mobile Convergence Laboratory 22 /
  • 26. Copycat! Why? • we chose to use Copycat because: • 순수 자바 기반의 구현물 • 라이선스가 부합한 오픈소스 • 확장성과 커스텀 가능성(customizability) • 최근까지 커밋, 지속적인 발전 • Well documentation Mobile Convergence Laboratory 25 By Madan Jampani
  • 27. copycat • distributed coordination framework • 수많은 Raft consensus protocol 구현물 중 하나 • Raft implement + α 26 /Mobile Convergence Laboratory
  • 28. Mobile Convergence Laboratory 27 / 22 • Active member • 리더가 될 수 있는 멤버 • Raft protocol • Synchronous log replication • Passive member • 리더 선출에 참여하지 않는 follower • Gossip protocol • Asynchronous log replication Copycat System Architecture
  • 29. Mobile Convergence Laboratory 28 / 22 Server Active Leader Follower Candidate Passive Follower Raft Protocol Gossip Protocol
  • 30. Gossip protocol • =epidemic protocol • messages broadcast • 주기적으로 랜덤한 타겟을 골라 gossip message 전송, 이것을 받아 infected 상태가 된 노드도 똑같이 행동 Mobile Convergence Laboratory 29 / 22
  • 31. Gossip protocol (1) • Gossiping = Probabilistic flooding • Nodes forward with probability, p Source
  • 32. Gossip protocol (2) • Gossip based broadcast • Nodes forward with probability, p Source
  • 33. Gossip protocol (3) • Gossip based broadcast • Nodes forward with probability, p Source
  • 34. Gossip protocol (4) • Gossip based broadcast • Nodes forward with probability, p Source
  • 35. Gossip protocol (5) • Gossip based broadcast • Nodes forward with probability, p Source
  • 36. Gossip protocol (6) • Gossip based broadcast • Nodes forward with probability, p Source
  • 37. Gossip protocol (7) • Gossip based broadcast • Nodes forward with probability, p Source
  • 38. Gossip protocol (8) • Gossip based broadcast • Nodes forward with probability, p Source
  • 39. Gossip protocol (9) • Gossip based broadcast • Nodes forward with probability, p Source
  • 40. Gossip protocol (10) • Gossip based broadcast • Nodes forward with probability, p Source
  • 41. Gossip protocol (11) • Gossip based broadcast • Nodes forward with probability, p Source 1. Simple, 2. Fault tolerant 3. Load-balanced