SlideShare ist ein Scribd-Unternehmen logo
1 von 43
Downloaden Sie, um offline zu lesen
CRDTs and Redis
From sequential to concurrent executions
Carlos Baquero
Universidade do Minho & INESC TEC
2/31
The speed of communication in the 19th century
W. H. Harrison’s death
“At 12:30 am on April 4th, 1841 President
William Henry Harrison died of pneumonia
just a month after taking o ce. The Rich-
mond Enquirer published the news of his
death two days later on April 6th. The North-
Carolina standard newspaper published it on
April 14th. His death wasn’t known of in Los
Angeles until July 23rd, 110 days after it had
occurred.”
Text by Zack Bloom, A Quick History of Digital Communication Before the
Internet. https://eager.io/blog/communication-pre-internet/
Picture by By Albert Sands Southworth and Josiah Johnson Hawes
3/31
The speed of communication in the 19th century
Francis Galton Isochronic Map
4/31
The speed of communication in the 21st century
RTT data gathered via http://www.azurespeed.com
5/31
The speed of communication in the 21st century
If you really like high latencies . . .
Time delay between Mars and Earth
blogs.esa.int/mex/2012/08/05/time-delay-between-mars-and-earth/
Delay/Disruption Tolerant Networking
www.nasa.gov/content/dtn
6/31
Latency magnitudes
Geo-replication
, up to 50ms (local region DC)
⇤, between 100ms and 300ms (inter-continental)
No inter-DC replication
Client writes observe latency
Planet-wide geo-replication
Replication techniques versus client side write latency ranges
Consensus/Paxos [⇤, 2⇤] (with no divergence)
Primary-Backup [ , ⇤] (asynchronous/lazy)
Multi-Master (allowing divergence)
7/31
EC and CAP for Geo-Replication
Eventually Consistent. CACM 2009, Werner Vogels
In an ideal world there would be only one consistency model:
when an update is made all observers would see that update.
Building reliable distributed systems at a worldwide scale
demands trade-o↵s between consistency and availability.
CAP theorem. PODC 2000, Eric Brewer
Of three properties of shared-data systems – data consistency,
system availability, and tolerance to network partition – only two
can be achieved at any given time.
CRDTs provide support for partition-tolerant high availability
8/31
From sequential to concurrent executions
Consensus provides illusion of a single replica
This also preserves (slow) sequential behaviour
Sequential execution
Ops O o // p // q
Time //
We have an ordered set (O, <). O = {o, p, q} and o < p < q
8/31
From sequential to concurrent executions
Consensus provides illusion of a single replica
This also preserves (slow) sequential behaviour
Sequential execution
Ops O o // p // q
Time //
We have an ordered set (O, <). O = {o, p, q} and o < p < q
9/31
From sequential to concurrent executions
EC Multi-master (or active-active) can expose concurrency
Concurrent execution
p // q
Ops O o
??
r
s
77
Time //
Partially ordered set (O, ). o p q r and o s r
Some ops in O are concurrent: p k s and q k s
10/31
Design of Conflict-Free Replicated Data Types
A partially ordered log (polog) of operations implements any CRDT
Replicas keep increasing local views of an evolving distributed polog
Any query, at replica i, can be expressed from local polog Oi
Example: Counter at i is |{inc | inc 2 Oi }| |{dec | dec 2 Oi }|
CRDTs are e cient representations that follow some general rules
10/31
Design of Conflict-Free Replicated Data Types
A partially ordered log (polog) of operations implements any CRDT
Replicas keep increasing local views of an evolving distributed polog
Any query, at replica i, can be expressed from local polog Oi
Example: Counter at i is |{inc | inc 2 Oi }| |{dec | dec 2 Oi }|
CRDTs are e cient representations that follow some general rules
10/31
Design of Conflict-Free Replicated Data Types
A partially ordered log (polog) of operations implements any CRDT
Replicas keep increasing local views of an evolving distributed polog
Any query, at replica i, can be expressed from local polog Oi
Example: Counter at i is |{inc | inc 2 Oi }| |{dec | dec 2 Oi }|
CRDTs are e cient representations that follow some general rules
11/31
Principle of permutation equivalence
If operations in sequence can commute, preserving a given result,
then under concurrency they should preserve the same result
Sequential
inc(10) // inc(35) // dec(5) // inc(2)
dec(5) // inc(2) // inc(10) // inc(35)
Concurrent
inc(35)
&&
inc(10)
88
&&
inc(2)
dec(5)
88
You guessed: Result is 42
12/31
Implementing Counters
Example: CRDT PNCounters
A inc(35)
&&
B inc(10)
88
&&
inc(2)
C dec(5)
88
Lets track total number of incs and decs done at each replica
{A(incs, decs), . . . , C(. . . , . . .)}
13/31
Implementing Counters
Example: CRDT PNCounters
Separate positive and negative counts are kept per replica
A {A(35, 0), B(10, 0)}
++
B {B(10, 0)}
66
((
{A(35, 0), B(12, 0), C(0, 5)}
C {B(10, 0), C(0, 5)}
33
Joining does point-wise maximums among entries (semilattice)
At any time, counter value is sum of incs minus sum of decs
13/31
Implementing Counters
Example: CRDT PNCounters
Separate positive and negative counts are kept per replica
A {A(35, 0), B(10, 0)}
++
B {B(10, 0)}
66
((
{A(35, 0), B(12, 0), C(0, 5)}
C {B(10, 0), C(0, 5)}
33
Joining does point-wise maximums among entries (semilattice)
At any time, counter value is sum of incs minus sum of decs
14/31
Implementing Counters
Redis CRDT Counters
There are multiple ways to implement CRDT counters
Redis has a distinct implementation that favours garbage collection
Redis CRDT counters are 59 bits (not 64) to avoid overflows
15/31
Registers
Registers are an ordered set of write operations
Sequential execution
A wr(x) // wr(j) // wr(k) // wr(x)
Sequential execution under distribution
A wr(x)
%%
wr(x)
B wr(j) // wr(k)
99
Register value is x, the last written value
16/31
Implementing Registers
Naive Last-Writer-Wins
CRDT register implemented by attaching local wall-clock times
Sequential execution under distribution
A (11:00)x
''
(11:30)?
##
B (12:02)j // (12:05)k
77
?
Problem: Wall-clock on B is one hour ahead of A
Value x might not be writeable again at A since 12:05 > 11:30
17/31
Registers
Sequential Semantics
Register shows value v at replica i i↵
wr(v) 2 Oi
and
@wr(v0
) 2 Oi · wr(v) < wr(v0
)
18/31
Preservation of sequential semantics
Concurrent semantics should preserve the sequential semantics
This also ensures correct sequential execution under distribution
19/31
Multi-value Registers
Concurrency semantics shows all concurrent values
{v | wr(v) 2 Oi ^ @wr(v0
) 2 Oi · wr(v) wr(v0
)}
Concurrent execution
A wr(x)
%%
// wr(y) // {y, k} // wr(m) // {m}
B wr(j) // wr(k)
99
Dynamo shopping carts are multi-value registers with payload sets
The m value could be an application level merge of values y and k
20/31
Implementing Multi-value Registers
Concurrency can be preciselly tracked with version vectors
Concurrent execution (version vectors)
A [1, 0]x
%%
// [2, 0]y // [2, 0]y, [1, 2]k // [3, 2]m
B [1, 1]j // [1, 2]k
66
Metadata can be compressed with a common causal context and a
single scalar per value (dotted version vectors)
21/31
Registers in Redis
LWW arbitration
Multi-value registers allows executions leading to concurrent values
Presenting concurrent values is at odds with the sequential API
Redis both tracks causality and registers wall-clock times
Querying uses Last-Writer-Wins selection among concurrent values
This preserves correctness of sequential semantics
A value with clock 12:05 can still be causally overwritten at 11:30
22/31
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) ! add(c) we observe that a, c 2 X
X = {. . .}, add(c) ! rmv(c) we observe that c 62 X
In general, given Oi , the set has elements
{e | add(e) 2 Oi ^ @rmv(e) 2 Oi · add(e) < rmv(e)}
22/31
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) ! add(c) we observe that a, c 2 X
X = {. . .}, add(c) ! rmv(c) we observe that c 62 X
In general, given Oi , the set has elements
{e | add(e) 2 Oi ^ @rmv(e) 2 Oi · add(e) < rmv(e)}
22/31
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) ! add(c) we observe that a, c 2 X
X = {. . .}, add(c) ! rmv(c) we observe that c 62 X
In general, given Oi , the set has elements
{e | add(e) 2 Oi ^ @rmv(e) 2 Oi · add(e) < rmv(e)}
22/31
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) ! add(c) we observe that a, c 2 X
X = {. . .}, add(c) ! rmv(c) we observe that c 62 X
In general, given Oi , the set has elements
{e | add(e) 2 Oi ^ @rmv(e) 2 Oi · add(e) < rmv(e)}
22/31
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) ! add(c) we observe that a, c 2 X
X = {. . .}, add(c) ! rmv(c) we observe that c 62 X
In general, given Oi , the set has elements
{e | add(e) 2 Oi ^ @rmv(e) 2 Oi · add(e) < rmv(e)}
23/31
Sets
Concurrency Semantics
Problem: Concurrently adding and removing the same element
Concurrent execution
A add(x)
%%
// rmv(x) // {?} // add(x) // {x}
B rmv(x) // add(x)
::
24/31
Concurrency Semantics
Add-Wins Sets
Let’s choose Add-Wins
Consider a set of known operations Oi , at node i, that is ordered
by an happens-before partial order . Set has elements
{e | add(e) 2 Oi ^ @ rmv(e) 2 Oi · add(e) rmv(e)}
Is this familiar?
The sequential semantics applies identical rules on a total order
Redis CRDT sets are Add-Wins Sets
24/31
Concurrency Semantics
Add-Wins Sets
Let’s choose Add-Wins
Consider a set of known operations Oi , at node i, that is ordered
by an happens-before partial order . Set has elements
{e | add(e) 2 Oi ^ @ rmv(e) 2 Oi · add(e) rmv(e)}
Is this familiar?
The sequential semantics applies identical rules on a total order
Redis CRDT sets are Add-Wins Sets
24/31
Concurrency Semantics
Add-Wins Sets
Let’s choose Add-Wins
Consider a set of known operations Oi , at node i, that is ordered
by an happens-before partial order . Set has elements
{e | add(e) 2 Oi ^ @ rmv(e) 2 Oi · add(e) rmv(e)}
Is this familiar?
The sequential semantics applies identical rules on a total order
Redis CRDT sets are Add-Wins Sets
24/31
Concurrency Semantics
Add-Wins Sets
Let’s choose Add-Wins
Consider a set of known operations Oi , at node i, that is ordered
by an happens-before partial order . Set has elements
{e | add(e) 2 Oi ^ @ rmv(e) 2 Oi · add(e) rmv(e)}
Is this familiar?
The sequential semantics applies identical rules on a total order
Redis CRDT sets are Add-Wins Sets
25/31
Equivalence to a sequential execution?
Add-Wins Sets
Can we always explain a concurrent execution by a sequential one?
Concurrent execution
A {x, y} // add(y) // rmv(x) // {y} //
##
{x, y}
B {x, y} // add(x) // rmv(y) // {x} //
;;
{x, y}
Two (failed) sequential explanations
H1 {x, y} // . . . // rmv(x) // {6 x, y}
H2 {x, y} // . . . // rmv(y) // {x, 6 y}
Concurrent executions can have richer outcomes
26/31
Concurrency Semantics
Remove-Wins Sets
Alternative: Let’s choose Remove-Wins
Xi
.
= {e | add(e) 2 Oi ^ 8 rmv(e) 2 Oi · rmv(e) add(e)}
Remove-Wins requires more metadata than Add-Wins
Both Add and Remove-Wins have same semantics in a total order
They are di↵erent but both preserve sequential semantics
26/31
Concurrency Semantics
Remove-Wins Sets
Alternative: Let’s choose Remove-Wins
Xi
.
= {e | add(e) 2 Oi ^ 8 rmv(e) 2 Oi · rmv(e) add(e)}
Remove-Wins requires more metadata than Add-Wins
Both Add and Remove-Wins have same semantics in a total order
They are di↵erent but both preserve sequential semantics
27/31
Take home message
Concurrent executions are needed to deal with latency
Behaviour changes when moving from sequential to concurrent
Road to accommodate transition:
Permutation equivalence
Preserving sequential semantics
Concurrent executions lead to richer outcomes
CRDTs provide sound guidelines and encode policies
Thank you!
Carlos Baquero
Email: cbm@di.uminho.pt, Twitter: @xmal
29/31
Sequence/List
Weak/Strong Specification [Attiya et al, PODC 16]
Element x is kept
rpush(b) // hxbi
$$
hi // lpush(x) // hxi
99
%%
haxbi
lpush(a) // haxi
::
Element x is removed (Redis enforces Strong Specification)
rpush(b) // hxbi
##
hi // lpush(x) // hxi
::
$$
// rem(x) // hi // habi ¬hbai
lpush(a) // haxi
;;
30/31
Causal Consistency
Redis CRDTs provide per-key causal consistency
Source FIFO (TCP)
A apple
&&
%%
B apple pie
##
hot
##
C pie hot peach? apple
Causal consistency
A apple
%%
B apple
%%
pie
$$
hot
""
C apple pie hot tasty
Strongest highly available consistency model
31/31
Causal Consistency
Redis CRDTs provide per-key causal consistency
Source FIFO (TCP)
A add(a)
''
&&
B add(a) rmv(a)
''
add(b)
''
C rmv(a) add(b) add(a)
Causal consistency
A add(a)
%%
B add(a)
%%
rmv(a)
&&
add(b)
%%
C add(a) rmv(a) add(b)
Strongest highly available consistency model

Weitere ähnliche Inhalte

Was ist angesagt?

Reduction of multiple subsystem [compatibility mode]
Reduction of multiple subsystem [compatibility mode]Reduction of multiple subsystem [compatibility mode]
Reduction of multiple subsystem [compatibility mode]
azroyyazid
 

Was ist angesagt? (20)

Reduction of multiple subsystem [compatibility mode]
Reduction of multiple subsystem [compatibility mode]Reduction of multiple subsystem [compatibility mode]
Reduction of multiple subsystem [compatibility mode]
 
Algorithm.ppt
Algorithm.pptAlgorithm.ppt
Algorithm.ppt
 
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
 
Control chap3
Control chap3Control chap3
Control chap3
 
Aa sort-v4
Aa sort-v4Aa sort-v4
Aa sort-v4
 
discrete-hmm
discrete-hmmdiscrete-hmm
discrete-hmm
 
Ae11 sol
Ae11 solAe11 sol
Ae11 sol
 
control engineering revision
control engineering revisioncontrol engineering revision
control engineering revision
 
parallel Merging
parallel Mergingparallel Merging
parallel Merging
 
Parallel searching
Parallel searchingParallel searching
Parallel searching
 
Exploring Petri Net State Spaces
Exploring Petri Net State SpacesExploring Petri Net State Spaces
Exploring Petri Net State Spaces
 
Analysis of Algorithm (Bubblesort and Quicksort)
Analysis of Algorithm (Bubblesort and Quicksort)Analysis of Algorithm (Bubblesort and Quicksort)
Analysis of Algorithm (Bubblesort and Quicksort)
 
DSP System Assignment Help
DSP System Assignment HelpDSP System Assignment Help
DSP System Assignment Help
 
Algorithm: Quick-Sort
Algorithm: Quick-SortAlgorithm: Quick-Sort
Algorithm: Quick-Sort
 
Parallel Algorithms
Parallel AlgorithmsParallel Algorithms
Parallel Algorithms
 
Block diagrams and signal flow graphs
Block diagrams and signal flow graphsBlock diagrams and signal flow graphs
Block diagrams and signal flow graphs
 
Block diagram
Block diagramBlock diagram
Block diagram
 
Ke3617561763
Ke3617561763Ke3617561763
Ke3617561763
 
A petri-net
A petri-netA petri-net
A petri-net
 
Block diagram reduction techniques
Block diagram reduction techniquesBlock diagram reduction techniques
Block diagram reduction techniques
 

Ähnlich wie RedisDay London 2018 - CRDTs and Redis From sequential to concurrent executions

Ähnlich wie RedisDay London 2018 - CRDTs and Redis From sequential to concurrent executions (20)

Parallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationParallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel application
 
R Language Introduction
R Language IntroductionR Language Introduction
R Language Introduction
 
General pipeline concepts
General pipeline conceptsGeneral pipeline concepts
General pipeline concepts
 
Parametrized Model Checking of Fault Tolerant Distributed Algorithms by Abstr...
Parametrized Model Checking of Fault Tolerant Distributed Algorithms by Abstr...Parametrized Model Checking of Fault Tolerant Distributed Algorithms by Abstr...
Parametrized Model Checking of Fault Tolerant Distributed Algorithms by Abstr...
 
I04124052057
I04124052057I04124052057
I04124052057
 
Area efficient parallel LFSR for cyclic redundancy check
Area efficient parallel LFSR for cyclic redundancy check  Area efficient parallel LFSR for cyclic redundancy check
Area efficient parallel LFSR for cyclic redundancy check
 
Iterative methods with special structures
Iterative methods with special structuresIterative methods with special structures
Iterative methods with special structures
 
DSP_2018_FOEHU - Lec 03 - Discrete-Time Signals and Systems
DSP_2018_FOEHU - Lec 03 - Discrete-Time Signals and SystemsDSP_2018_FOEHU - Lec 03 - Discrete-Time Signals and Systems
DSP_2018_FOEHU - Lec 03 - Discrete-Time Signals and Systems
 
Real Time Filtering on Embedded ARM
Real Time Filtering on Embedded ARMReal Time Filtering on Embedded ARM
Real Time Filtering on Embedded ARM
 
D0372027037
D0372027037D0372027037
D0372027037
 
slides.07.pptx
slides.07.pptxslides.07.pptx
slides.07.pptx
 
Planning Under Uncertainty With Markov Decision Processes
Planning Under Uncertainty With Markov Decision ProcessesPlanning Under Uncertainty With Markov Decision Processes
Planning Under Uncertainty With Markov Decision Processes
 
Constraint Programming in Haskell
Constraint Programming in HaskellConstraint Programming in Haskell
Constraint Programming in Haskell
 
My Postdoctoral Research
My Postdoctoral ResearchMy Postdoctoral Research
My Postdoctoral Research
 
A calculus of mobile Real-Time processes
A calculus of mobile Real-Time processesA calculus of mobile Real-Time processes
A calculus of mobile Real-Time processes
 
Project Management Techniques
Project Management TechniquesProject Management Techniques
Project Management Techniques
 
Analysis of Algorithum
Analysis of AlgorithumAnalysis of Algorithum
Analysis of Algorithum
 
Availability of a Redundant System with Two Parallel Active Components
Availability of a Redundant System with Two Parallel Active ComponentsAvailability of a Redundant System with Two Parallel Active Components
Availability of a Redundant System with Two Parallel Active Components
 
A Strategic Model For Dynamic Traffic Assignment
A Strategic Model For Dynamic Traffic AssignmentA Strategic Model For Dynamic Traffic Assignment
A Strategic Model For Dynamic Traffic Assignment
 
Digital Signal Processing[ECEG-3171]-Ch1_L03
Digital Signal Processing[ECEG-3171]-Ch1_L03Digital Signal Processing[ECEG-3171]-Ch1_L03
Digital Signal Processing[ECEG-3171]-Ch1_L03
 

Mehr von Redis Labs

SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
Redis Labs
 
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Redis Labs
 
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
Redis Labs
 
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
Redis Labs
 

Mehr von Redis Labs (20)

Redis Day Bangalore 2020 - Session state caching with redis
Redis Day Bangalore 2020 - Session state caching with redisRedis Day Bangalore 2020 - Session state caching with redis
Redis Day Bangalore 2020 - Session state caching with redis
 
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
 
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
 
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
 
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
 
Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis for Data Science and Engineering by Dmitry Polyakovsky of OracleRedis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
 
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
 
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
 
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
 
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
 
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
 
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
 
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
 
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
 
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
 
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
 
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
 

Kürzlich hochgeladen

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Kürzlich hochgeladen (20)

Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

RedisDay London 2018 - CRDTs and Redis From sequential to concurrent executions

  • 1. CRDTs and Redis From sequential to concurrent executions Carlos Baquero Universidade do Minho & INESC TEC
  • 2. 2/31 The speed of communication in the 19th century W. H. Harrison’s death “At 12:30 am on April 4th, 1841 President William Henry Harrison died of pneumonia just a month after taking o ce. The Rich- mond Enquirer published the news of his death two days later on April 6th. The North- Carolina standard newspaper published it on April 14th. His death wasn’t known of in Los Angeles until July 23rd, 110 days after it had occurred.” Text by Zack Bloom, A Quick History of Digital Communication Before the Internet. https://eager.io/blog/communication-pre-internet/ Picture by By Albert Sands Southworth and Josiah Johnson Hawes
  • 3. 3/31 The speed of communication in the 19th century Francis Galton Isochronic Map
  • 4. 4/31 The speed of communication in the 21st century RTT data gathered via http://www.azurespeed.com
  • 5. 5/31 The speed of communication in the 21st century If you really like high latencies . . . Time delay between Mars and Earth blogs.esa.int/mex/2012/08/05/time-delay-between-mars-and-earth/ Delay/Disruption Tolerant Networking www.nasa.gov/content/dtn
  • 6. 6/31 Latency magnitudes Geo-replication , up to 50ms (local region DC) ⇤, between 100ms and 300ms (inter-continental) No inter-DC replication Client writes observe latency Planet-wide geo-replication Replication techniques versus client side write latency ranges Consensus/Paxos [⇤, 2⇤] (with no divergence) Primary-Backup [ , ⇤] (asynchronous/lazy) Multi-Master (allowing divergence)
  • 7. 7/31 EC and CAP for Geo-Replication Eventually Consistent. CACM 2009, Werner Vogels In an ideal world there would be only one consistency model: when an update is made all observers would see that update. Building reliable distributed systems at a worldwide scale demands trade-o↵s between consistency and availability. CAP theorem. PODC 2000, Eric Brewer Of three properties of shared-data systems – data consistency, system availability, and tolerance to network partition – only two can be achieved at any given time. CRDTs provide support for partition-tolerant high availability
  • 8. 8/31 From sequential to concurrent executions Consensus provides illusion of a single replica This also preserves (slow) sequential behaviour Sequential execution Ops O o // p // q Time // We have an ordered set (O, <). O = {o, p, q} and o < p < q
  • 9. 8/31 From sequential to concurrent executions Consensus provides illusion of a single replica This also preserves (slow) sequential behaviour Sequential execution Ops O o // p // q Time // We have an ordered set (O, <). O = {o, p, q} and o < p < q
  • 10. 9/31 From sequential to concurrent executions EC Multi-master (or active-active) can expose concurrency Concurrent execution p // q Ops O o ?? r s 77 Time // Partially ordered set (O, ). o p q r and o s r Some ops in O are concurrent: p k s and q k s
  • 11. 10/31 Design of Conflict-Free Replicated Data Types A partially ordered log (polog) of operations implements any CRDT Replicas keep increasing local views of an evolving distributed polog Any query, at replica i, can be expressed from local polog Oi Example: Counter at i is |{inc | inc 2 Oi }| |{dec | dec 2 Oi }| CRDTs are e cient representations that follow some general rules
  • 12. 10/31 Design of Conflict-Free Replicated Data Types A partially ordered log (polog) of operations implements any CRDT Replicas keep increasing local views of an evolving distributed polog Any query, at replica i, can be expressed from local polog Oi Example: Counter at i is |{inc | inc 2 Oi }| |{dec | dec 2 Oi }| CRDTs are e cient representations that follow some general rules
  • 13. 10/31 Design of Conflict-Free Replicated Data Types A partially ordered log (polog) of operations implements any CRDT Replicas keep increasing local views of an evolving distributed polog Any query, at replica i, can be expressed from local polog Oi Example: Counter at i is |{inc | inc 2 Oi }| |{dec | dec 2 Oi }| CRDTs are e cient representations that follow some general rules
  • 14. 11/31 Principle of permutation equivalence If operations in sequence can commute, preserving a given result, then under concurrency they should preserve the same result Sequential inc(10) // inc(35) // dec(5) // inc(2) dec(5) // inc(2) // inc(10) // inc(35) Concurrent inc(35) && inc(10) 88 && inc(2) dec(5) 88 You guessed: Result is 42
  • 15. 12/31 Implementing Counters Example: CRDT PNCounters A inc(35) && B inc(10) 88 && inc(2) C dec(5) 88 Lets track total number of incs and decs done at each replica {A(incs, decs), . . . , C(. . . , . . .)}
  • 16. 13/31 Implementing Counters Example: CRDT PNCounters Separate positive and negative counts are kept per replica A {A(35, 0), B(10, 0)} ++ B {B(10, 0)} 66 (( {A(35, 0), B(12, 0), C(0, 5)} C {B(10, 0), C(0, 5)} 33 Joining does point-wise maximums among entries (semilattice) At any time, counter value is sum of incs minus sum of decs
  • 17. 13/31 Implementing Counters Example: CRDT PNCounters Separate positive and negative counts are kept per replica A {A(35, 0), B(10, 0)} ++ B {B(10, 0)} 66 (( {A(35, 0), B(12, 0), C(0, 5)} C {B(10, 0), C(0, 5)} 33 Joining does point-wise maximums among entries (semilattice) At any time, counter value is sum of incs minus sum of decs
  • 18. 14/31 Implementing Counters Redis CRDT Counters There are multiple ways to implement CRDT counters Redis has a distinct implementation that favours garbage collection Redis CRDT counters are 59 bits (not 64) to avoid overflows
  • 19. 15/31 Registers Registers are an ordered set of write operations Sequential execution A wr(x) // wr(j) // wr(k) // wr(x) Sequential execution under distribution A wr(x) %% wr(x) B wr(j) // wr(k) 99 Register value is x, the last written value
  • 20. 16/31 Implementing Registers Naive Last-Writer-Wins CRDT register implemented by attaching local wall-clock times Sequential execution under distribution A (11:00)x '' (11:30)? ## B (12:02)j // (12:05)k 77 ? Problem: Wall-clock on B is one hour ahead of A Value x might not be writeable again at A since 12:05 > 11:30
  • 21. 17/31 Registers Sequential Semantics Register shows value v at replica i i↵ wr(v) 2 Oi and @wr(v0 ) 2 Oi · wr(v) < wr(v0 )
  • 22. 18/31 Preservation of sequential semantics Concurrent semantics should preserve the sequential semantics This also ensures correct sequential execution under distribution
  • 23. 19/31 Multi-value Registers Concurrency semantics shows all concurrent values {v | wr(v) 2 Oi ^ @wr(v0 ) 2 Oi · wr(v) wr(v0 )} Concurrent execution A wr(x) %% // wr(y) // {y, k} // wr(m) // {m} B wr(j) // wr(k) 99 Dynamo shopping carts are multi-value registers with payload sets The m value could be an application level merge of values y and k
  • 24. 20/31 Implementing Multi-value Registers Concurrency can be preciselly tracked with version vectors Concurrent execution (version vectors) A [1, 0]x %% // [2, 0]y // [2, 0]y, [1, 2]k // [3, 2]m B [1, 1]j // [1, 2]k 66 Metadata can be compressed with a common causal context and a single scalar per value (dotted version vectors)
  • 25. 21/31 Registers in Redis LWW arbitration Multi-value registers allows executions leading to concurrent values Presenting concurrent values is at odds with the sequential API Redis both tracks causality and registers wall-clock times Querying uses Last-Writer-Wins selection among concurrent values This preserves correctness of sequential semantics A value with clock 12:05 can still be causally overwritten at 11:30
  • 26. 22/31 Sets Sequential Semantics Consider add and rmv operations X = {. . .}, add(a) ! add(c) we observe that a, c 2 X X = {. . .}, add(c) ! rmv(c) we observe that c 62 X In general, given Oi , the set has elements {e | add(e) 2 Oi ^ @rmv(e) 2 Oi · add(e) < rmv(e)}
  • 27. 22/31 Sets Sequential Semantics Consider add and rmv operations X = {. . .}, add(a) ! add(c) we observe that a, c 2 X X = {. . .}, add(c) ! rmv(c) we observe that c 62 X In general, given Oi , the set has elements {e | add(e) 2 Oi ^ @rmv(e) 2 Oi · add(e) < rmv(e)}
  • 28. 22/31 Sets Sequential Semantics Consider add and rmv operations X = {. . .}, add(a) ! add(c) we observe that a, c 2 X X = {. . .}, add(c) ! rmv(c) we observe that c 62 X In general, given Oi , the set has elements {e | add(e) 2 Oi ^ @rmv(e) 2 Oi · add(e) < rmv(e)}
  • 29. 22/31 Sets Sequential Semantics Consider add and rmv operations X = {. . .}, add(a) ! add(c) we observe that a, c 2 X X = {. . .}, add(c) ! rmv(c) we observe that c 62 X In general, given Oi , the set has elements {e | add(e) 2 Oi ^ @rmv(e) 2 Oi · add(e) < rmv(e)}
  • 30. 22/31 Sets Sequential Semantics Consider add and rmv operations X = {. . .}, add(a) ! add(c) we observe that a, c 2 X X = {. . .}, add(c) ! rmv(c) we observe that c 62 X In general, given Oi , the set has elements {e | add(e) 2 Oi ^ @rmv(e) 2 Oi · add(e) < rmv(e)}
  • 31. 23/31 Sets Concurrency Semantics Problem: Concurrently adding and removing the same element Concurrent execution A add(x) %% // rmv(x) // {?} // add(x) // {x} B rmv(x) // add(x) ::
  • 32. 24/31 Concurrency Semantics Add-Wins Sets Let’s choose Add-Wins Consider a set of known operations Oi , at node i, that is ordered by an happens-before partial order . Set has elements {e | add(e) 2 Oi ^ @ rmv(e) 2 Oi · add(e) rmv(e)} Is this familiar? The sequential semantics applies identical rules on a total order Redis CRDT sets are Add-Wins Sets
  • 33. 24/31 Concurrency Semantics Add-Wins Sets Let’s choose Add-Wins Consider a set of known operations Oi , at node i, that is ordered by an happens-before partial order . Set has elements {e | add(e) 2 Oi ^ @ rmv(e) 2 Oi · add(e) rmv(e)} Is this familiar? The sequential semantics applies identical rules on a total order Redis CRDT sets are Add-Wins Sets
  • 34. 24/31 Concurrency Semantics Add-Wins Sets Let’s choose Add-Wins Consider a set of known operations Oi , at node i, that is ordered by an happens-before partial order . Set has elements {e | add(e) 2 Oi ^ @ rmv(e) 2 Oi · add(e) rmv(e)} Is this familiar? The sequential semantics applies identical rules on a total order Redis CRDT sets are Add-Wins Sets
  • 35. 24/31 Concurrency Semantics Add-Wins Sets Let’s choose Add-Wins Consider a set of known operations Oi , at node i, that is ordered by an happens-before partial order . Set has elements {e | add(e) 2 Oi ^ @ rmv(e) 2 Oi · add(e) rmv(e)} Is this familiar? The sequential semantics applies identical rules on a total order Redis CRDT sets are Add-Wins Sets
  • 36. 25/31 Equivalence to a sequential execution? Add-Wins Sets Can we always explain a concurrent execution by a sequential one? Concurrent execution A {x, y} // add(y) // rmv(x) // {y} // ## {x, y} B {x, y} // add(x) // rmv(y) // {x} // ;; {x, y} Two (failed) sequential explanations H1 {x, y} // . . . // rmv(x) // {6 x, y} H2 {x, y} // . . . // rmv(y) // {x, 6 y} Concurrent executions can have richer outcomes
  • 37. 26/31 Concurrency Semantics Remove-Wins Sets Alternative: Let’s choose Remove-Wins Xi . = {e | add(e) 2 Oi ^ 8 rmv(e) 2 Oi · rmv(e) add(e)} Remove-Wins requires more metadata than Add-Wins Both Add and Remove-Wins have same semantics in a total order They are di↵erent but both preserve sequential semantics
  • 38. 26/31 Concurrency Semantics Remove-Wins Sets Alternative: Let’s choose Remove-Wins Xi . = {e | add(e) 2 Oi ^ 8 rmv(e) 2 Oi · rmv(e) add(e)} Remove-Wins requires more metadata than Add-Wins Both Add and Remove-Wins have same semantics in a total order They are di↵erent but both preserve sequential semantics
  • 39. 27/31 Take home message Concurrent executions are needed to deal with latency Behaviour changes when moving from sequential to concurrent Road to accommodate transition: Permutation equivalence Preserving sequential semantics Concurrent executions lead to richer outcomes CRDTs provide sound guidelines and encode policies
  • 40. Thank you! Carlos Baquero Email: cbm@di.uminho.pt, Twitter: @xmal
  • 41. 29/31 Sequence/List Weak/Strong Specification [Attiya et al, PODC 16] Element x is kept rpush(b) // hxbi $$ hi // lpush(x) // hxi 99 %% haxbi lpush(a) // haxi :: Element x is removed (Redis enforces Strong Specification) rpush(b) // hxbi ## hi // lpush(x) // hxi :: $$ // rem(x) // hi // habi ¬hbai lpush(a) // haxi ;;
  • 42. 30/31 Causal Consistency Redis CRDTs provide per-key causal consistency Source FIFO (TCP) A apple && %% B apple pie ## hot ## C pie hot peach? apple Causal consistency A apple %% B apple %% pie $$ hot "" C apple pie hot tasty Strongest highly available consistency model
  • 43. 31/31 Causal Consistency Redis CRDTs provide per-key causal consistency Source FIFO (TCP) A add(a) '' && B add(a) rmv(a) '' add(b) '' C rmv(a) add(b) add(a) Causal consistency A add(a) %% B add(a) %% rmv(a) && add(b) %% C add(a) rmv(a) add(b) Strongest highly available consistency model