Clock-RSM: Low-Latency Inter-Datacenter
State Machine Replication Using Loosely
Synchronized Physical Clocks
Jiaqing Du, Daniele Sciascia, Sameh Elnikety
Willy Zwaenepoel, Fernando Pedone
EPFL, University of Lugano, Microsoft Research
Replicated State Machines (RSM)
• Strong consistency
– Execute same commands in same order
– Reach same state from same initial state
• Fault tolerance
– Store data at multiple replicas
– Failure masking / fast failover
Geo-Replication
[Figure: five data centers]
• High latency among replicas
• Messaging dominates replication latency
Leader-Based Protocols
• Order commands by a leader replica
• Require extra ordering messages at follower
[Figure: message timeline between Leader and Follower, from client request to client reply, with steps labeled Ordering, Replication, Ordering]
High latency for geo-replication
Clock-RSM
• Orders commands using physical clocks
• Overlaps ordering and replication
[Figure: message timeline from client request to client reply, with a single step labeled Ordering + Replication]
Low latency for geo-replication
Outline
• Clock-RSM
• Comparison with Paxos
• Evaluation
• Conclusion
Properties and Assumptions
• Provides linearizability
• Tolerates failure of a minority of replicas
• Assumptions
– Asynchronous FIFO channels
– Non-Byzantine faults
– Loosely synchronized physical clocks
Protocol Overview
[Figure: two client requests reach Clock-RSM and are timestamped with cmd1.ts = Clock() and cmd2.ts = Clock(); PrepOK messages are exchanged, and each of the five replica logs shows cmd1, cmd2 before the client replies]
Major Message Steps
• Prep: Ask everyone to log a command
• PrepOK: Tell everyone after logging a command
[Figure: replicas R0 to R4. One client request yields cmd1.ts = 24 and another yields cmd2.ts = 23; Prep is sent and replicas answer with PrepOK. Question posed: cmd1 committed?]
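The two message steps can be sketched as a small single-process simulation. This is a minimal illustration, not the authors' implementation: the names `Replica`, `on_client_request`, `on_prep`, and `on_prep_ok` are hypothetical, messages are delivered by direct method calls instead of a network, and timestamps are the plain integers from the slide example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    rid: int
    peers: list = field(default_factory=list)  # all replicas, including self
    log: dict = field(default_factory=dict)    # timestamp -> command
    acks: dict = field(default_factory=dict)   # timestamp -> ids that logged it

    def on_client_request(self, cmd, ts):
        # Prep: ask everyone (including this replica) to log the command.
        for peer in self.peers:
            peer.on_prep(cmd, ts)

    def on_prep(self, cmd, ts):
        self.log[ts] = cmd
        # PrepOK: tell everyone that this replica has logged the command.
        for peer in self.peers:
            peer.on_prep_ok(ts, self.rid)

    def on_prep_ok(self, ts, sender):
        self.acks.setdefault(ts, set()).add(sender)


# Five replicas, as in the slides (in-memory delivery is an assumption).
replicas = [Replica(i) for i in range(5)]
for r in replicas:
    r.peers = replicas

replicas[0].on_client_request("cmd1", 24)
replicas[4].on_client_request("cmd2", 23)

# Every replica ends up with the same log, ordered by timestamp.
for r in replicas:
    print(r.rid, sorted(r.log.items()))
```

Because every replica broadcasts PrepOK, each replica can decide locally whether a command is replicated by a majority, without waiting for a separate commit message.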
Commit Conditions
• A command is committed if
– Replicated by a majority
– All commands ordered before are committed
• Wait until three conditions hold
C1: Majority replication
C2: Stable order
C3: Prefix replication
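The three conditions can be written as one predicate over a replica's local state. The sketch below is a simplified reading of the slide, with hypothetical names (`is_committed`, `acks`, `latest_ts`) and integer timestamps; it assumes `latest_ts` holds an entry for every other replica.

```python
def majority(n):
    """Smallest number of replicas that is more than half of n."""
    return n // 2 + 1


def is_committed(ts, acks, latest_ts, n):
    """Check C1-C3 for the command with timestamp ts.

    acks:      timestamp -> set of replica ids known to have logged that command
    latest_ts: replica id -> latest timestamp received from that other replica
    n:         total number of replicas
    """
    # C1: majority replication of the command itself.
    c1 = len(acks.get(ts, ())) >= majority(n)
    # C2: stable order, i.e. a greater timestamp seen from every other replica,
    # so no command ordered before ts can still be unknown.
    c2 = all(t > ts for t in latest_ts.values())
    # C3: prefix replication, i.e. every known earlier command has a majority.
    c3 = all(len(who) >= majority(n) for t, who in acks.items() if t < ts)
    return c1 and c2 and c3


# State at R0 for the slide example: cmd1.ts = 24, cmd2.ts = 23.
acks = {24: {0, 1, 2}, 23: {1, 2, 3}}
latest = {1: 25, 2: 25, 3: 25, 4: 25}
print(is_committed(24, acks, latest, n=5))
```

Checking C3 only over known commands is sound here because C2 already guarantees that every command with a smaller timestamp has been received.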
C1: Majority Replication
• More than half of the replicas log cmd1
[Figure: replicas R0 to R4. A client request at R0 yields cmd1.ts = 24; R0 sends Prep and receives PrepOK]
Replicated by R0, R1, R2
1 RTT: between R0 and majority
C2: Stable Order
• Replica knows all commands ordered before cmd1
– Receives a greater timestamp from every other replica
[Figure: replicas R0 to R4. A client request at R0 yields cmd1.ts = 24; R0 then receives timestamps from the other replicas (values shown: 23, 24, 25). Legend: Prep / PrepOK / ClockTime]
0.5 RTT: between R0 and farthest peer
cmd1 is stable at R0
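Stable-order tracking can be sketched with one counter per peer. The class below is a hypothetical illustration (`StableOrderTracker` and its methods are not names from the talk). It relies on the FIFO-channel assumption: if each replica sends increasing timestamps, then once a greater timestamp has arrived from a peer, no command with a smaller timestamp can still arrive from it. The message sequence in the usage lines is invented for the example.

```python
class StableOrderTracker:
    """Tracks the latest timestamp received from every other replica."""

    def __init__(self, my_id, replica_ids):
        self.latest = {r: 0 for r in replica_ids if r != my_id}

    def on_timestamp(self, sender, ts):
        # Called for any message that carries a timestamp
        # (Prep, PrepOK, or ClockTime in the slides).
        self.latest[sender] = max(self.latest[sender], ts)

    def is_stable(self, cmd_ts):
        # Stable once every other replica has sent a greater timestamp.
        return all(ts > cmd_ts for ts in self.latest.values())


tracker = StableOrderTracker(my_id=0, replica_ids=range(5))
tracker.on_timestamp(4, 23)              # e.g. the Prep for cmd2
for sender in (1, 2, 3):
    tracker.on_timestamp(sender, 25)
stable_before = tracker.is_stable(24)    # R4 has only reached 23
tracker.on_timestamp(4, 25)
stable_after = tracker.is_stable(24)
print(stable_before, stable_after)
```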
C3: Prefix Replication
• All commands ordered before cmd1 are replicated by a majority
[Figure: replicas R0 to R4. A client request at R0 yields cmd1.ts = 24 and a client request at R4 yields cmd2.ts = 23; Prep and PrepOK messages for cmd2 are shown]
cmd2 is replicated by R1, R2, R3
1 RTT: R4 to majority + majority to R0
Overlapping Steps
[Figure: replicas R0 to R4. For cmd1 (cmd1.ts = 24, client request at R0), the majority replication, stable order, and prefix replication steps overlap between the client request and the client reply]
Latency of cmd1: about 1 RTT to majority
Commit Latency
Step                  Latency
Majority replication  1 RTT (majority1)
Stable order          0.5 RTT (farthest)
Prefix replication    1 RTT (majority2)

Overall latency =
MAX{ 1 RTT (majority1), 0.5 RTT (farthest), 1 RTT (majority2) }
If 0.5 RTT (farthest) < 1 RTT (majority),
then overall latency ≈ 1 RTT (majority).
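The formula can be evaluated for a concrete topology. The sketch below is one reading of the three terms, assuming symmetric links (one-way delay = RTT / 2), perfectly synchronized clocks, zero processing time, and peers that send timestamps continuously. The function name `latency_terms` and the RTT matrix (five replicas on a line, 20 ms per hop) are hypothetical, not measurements from the talk.

```python
def latency_terms(rtt, origin):
    """Return (majority replication, stable order, prefix replication) delays.

    rtt[i][j] is the round-trip time between replicas i and j (rtt[i][i] == 0).
    """
    n = len(rtt)
    m = n // 2 + 1  # majority size, counting the origin itself
    one_way = [[x / 2 for x in row] for row in rtt]

    # C1: the origin needs PrepOK from a majority; its own entry costs 0.
    c1 = sorted(rtt[origin])[m - 1]

    # C2: a greater timestamp must arrive from every other replica.
    c2 = max(one_way[j][origin] for j in range(n) if j != origin)

    # C3: worst case, replica k issued a command just before ours; a majority
    # of PrepOKs for it must reach the origin (path k -> j -> origin).
    c3 = max(
        sorted(one_way[k][j] + one_way[j][origin] for j in range(n))[m - 1]
        for k in range(n) if k != origin
    )
    return c1, c2, c3


# Hypothetical RTTs in milliseconds.
RTT_MS = [
    [0, 20, 40, 60, 80],
    [20, 0, 20, 40, 60],
    [40, 20, 0, 20, 40],
    [60, 40, 20, 0, 20],
    [80, 60, 40, 20, 0],
]

for origin in range(len(RTT_MS)):
    terms = latency_terms(RTT_MS, origin)
    print(f"R{origin}: terms={terms}, overall={max(terms)} ms")
```

In this made-up topology the prefix-replication term is the largest one at the middle replica, which is why the formula takes the maximum of all three terms instead of only the first.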
Topology Examples
[Figure: two example topologies of replicas R0 to R4, each with a client request and with the Majority1 and Farthest replicas marked]
Paxos 1: Multi-Paxos
• Single leader orders commands
– Logical clock: 0, 1, 2, 3, ...
[Figure: replicas R0 to R4 with R2 as leader. Message steps Forward, Prep, PrepOK, and Commit between the client request and the client reply]
Latency at followers: 2 RTTs (leader & majority)
Paxos 2: Paxos-bcast
• Every replica broadcasts PrepOK
– Trades off message complexity for latency
[Figure: replicas R0 to R4 with R2 as leader. Message steps Forward, Prep, and PrepOK between the client request and the client reply]
Latency at followers: 1.5 RTTs (leader & majority)
Clock-RSM vs. Paxos
• With realistic topologies, Clock-RSM has
– Lower latency at Paxos follower replicas
– Similar / slightly higher latency at Paxos leader
Protocol     Latency
Clock-RSM    All replicas: 1 RTT (majority),
             if 0.5 RTT (farthest) < 1 RTT (majority)
Paxos-bcast  Leader: 1 RTT (majority)
             Follower: 1.5 RTTs (leader & majority)
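The Paxos-bcast entries can be checked numerically in the same spirit. This sketch models a request as Forward (origin to leader), Prep (leader to all), and PrepOK (all to the origin), committing once a majority of PrepOKs has arrived. It assumes symmetric links and zero processing time; `paxos_bcast_latency` and the RTT matrix are hypothetical.

```python
def paxos_bcast_latency(rtt, leader, origin):
    """Commit latency seen at `origin` when `leader` orders commands.

    rtt[i][j] is the round-trip time between replicas i and j (rtt[i][i] == 0).
    """
    n = len(rtt)
    m = n // 2 + 1  # majority size
    one_way = [[x / 2 for x in row] for row in rtt]

    # Step 1: forward the request to the leader (0 if origin is the leader).
    forward = one_way[origin][leader]

    # Steps 2 and 3: Prep from the leader to replica j, then PrepOK from j
    # to the origin; the m-th fastest path completes the majority.
    prep_and_ok = sorted(
        one_way[leader][j] + one_way[j][origin] for j in range(n)
    )[m - 1]
    return forward + prep_and_ok


# Hypothetical RTTs in milliseconds: five replicas on a line, 20 ms per hop.
RTT_MS = [
    [0, 20, 40, 60, 80],
    [20, 0, 20, 40, 60],
    [40, 20, 0, 20, 40],
    [60, 40, 20, 0, 20],
    [80, 60, 40, 20, 0],
]

for origin in range(len(RTT_MS)):
    print(f"R{origin}: {paxos_bcast_latency(RTT_MS, leader=2, origin=origin)} ms")
```

With the origin equal to the leader the forward step costs nothing and the result reduces to one RTT to a majority, matching the leader row of the table.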
Experiment Setup
• Replicated key-value store
• Deployed on Amazon EC2
California (CA)
Virginia (VA)
Ireland (IR)
Singapore (SG)
Japan (JP)
Latency (1/2)
• All replicas serve client requests
Overlapping vs. Separate Steps
[Figure: two message diagrams over the replicas CA, VA, IR, SG, and JP, each starting from a client request; in the second diagram VA is marked (L)]
Clock-RSM latency: max of three
Paxos-bcast latency: sum of three
Latency (2/2)
• Paxos leader is changed to CA
Throughput
• Five replicas on a local cluster
• Message batching is key
Also in the Paper
• A reconfiguration protocol
• Comparison with Mencius
• Latency analysis of protocols
Conclusion
• Clock-RSM: low latency geo-replication
– Uses loosely synchronized physical clocks
– Overlaps ordering and replication
• Leader-based protocols can incur high latency