Life after CAP
CAP conjecture [reminder]
• Can only have two of:
– Consistency
– Availability
– Partition-tolerance

• Examples
– Databases, 2PC, centralized algo (C & A)
– Distributed databases, majority protocols (C & P)
– DNS, Bayou (A & P)
CAP theorem
• Formalization by Gilbert & Lynch
• What does impossible mean?
– There exists an execution which violates one of CAP
– It is not possible to guarantee that an algorithm has all three at all times
• Shard data with different CAP tradeoffs
• Detect partitions and weaken consistency
Partition-tolerance & availability
• What is partition-tolerance?
– Consistency and Availability are provided by algo
– Partitions are external events (scheduler/oracle)
• Partition-tolerance is really a failure model
• Partition-tolerance equivalent with omissions

• In the CAP theorem
– Proof rests on partitions that never heal
– Datacenters can guarantee recovery of partitions!
• Can guarantee that conflict resolution eventually happens
How do we ensure consistency?
• Main technique to be consistent
– Quorum principle
– Example: Majority quorums
• Always write to and read from a majority of nodes
• At least one node knows most recent value
[Figure: nine nodes, majority(9) = 5; WRITE(v) and READ v each contact a majority of the nodes]
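The quorum rule on this slide can be sketched as a tiny in-memory register. This is only an illustration under stated assumptions: a single client issuing operations one at a time, no crashes, and hypothetical names (`Replica`, `QuorumRegister`, `majority`) that do not come from the talk. Each replica stores a (version, value) pair; a read returns the value with the highest version seen in its quorum.

```python
import random


def majority(n):
    """Smallest number of nodes that is more than half of n."""
    return n // 2 + 1


class Replica:
    """One storage node holding a (version, value) pair."""

    def __init__(self):
        self.version = 0
        self.value = None


class QuorumRegister:
    """Illustrative register; assumes one sequential client and no failures."""

    def __init__(self, n, r, w):
        # R + W > N makes every read quorum overlap every write quorum.
        # W > N/2 makes any two write quorums overlap.
        if not (r + w > n and 2 * w > n):
            raise ValueError("need R + W > N and W > N/2")
        self.replicas = [Replica() for _ in range(n)]
        self.r, self.w = r, w

    def write(self, value):
        # Any W nodes will do: the quorum overlaps the previous write quorum,
        # so its highest version is the latest one.
        quorum = random.sample(self.replicas, self.w)
        version = max(rep.version for rep in quorum) + 1
        for rep in quorum:
            rep.version, rep.value = version, value

    def read(self):
        # At least one of the R nodes holds the most recent write.
        quorum = random.sample(self.replicas, self.r)
        newest = max(quorum, key=lambda rep: rep.version)
        return newest.value


reg = QuorumRegister(n=9, r=majority(9), w=majority(9))
reg.write("v")
print(reg.read())
```

Because the quorums are drawn at random on every call, the read only returns the latest value thanks to the overlap condition, not because it happens to hit the same nodes.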
Quorum Principle
• Majority Quorum
– Pro: tolerate up to ⌈N/2⌉ − 1 crashes
– Con: Have to read/write ⌊N/2⌋ + 1 values

• Read/write quorums (Dynamo, ZooKeeper, Chain Repl)
– Read R nodes, write W nodes, s.t. R + W > N (W > N/2)
– Pro: adjust performance of reads/writes
– Con: availability can suffer

• Maekawa Quorum
– Arrange nodes in an M×M grid
– Write to row+col, read cols (always overlap)
– Pro: Only need to read/write O( sqrt(N) ) nodes
– Con: Tolerate at most O( sqrt(N) ) crashes (reconfiguration)

[Figure: nodes P1 to P9 arranged in a 3×3 grid]
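The grid construction can be checked by brute force. The sketch below assumes a row-major numbering of nodes (the actual layout in the slide's figure is not recoverable) and uses hypothetical helper names. A write quorum is one full row plus one full column; a read quorum is one full column.

```python
from itertools import product


def grid(m):
    """M x M grid of node names, numbered row by row (an assumed layout)."""
    return [[f"P{r * m + c + 1}" for c in range(m)] for r in range(m)]


def write_quorum(g, row, col):
    """All nodes in the given row plus all nodes in the given column."""
    return set(g[row]) | {g[r][col] for r in range(len(g))}


def read_quorum(g, col):
    """All nodes in the given column."""
    return {g[r][col] for r in range(len(g))}


def always_overlap(m):
    """True if every write quorum meets every read and every write quorum."""
    g = grid(m)
    writes = [write_quorum(g, r, c) for r, c in product(range(m), repeat=2)]
    reads = [read_quorum(g, c) for c in range(m)]
    return (all(w & r for w in writes for r in reads)
            and all(a & b for a in writes for b in writes))


print(always_overlap(3), len(write_quorum(grid(3), 0, 0)))
```

A write quorum has 2M − 1 nodes, so it grows with sqrt(N) rather than N. The overlap holds because the row of any write quorum crosses every column.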
Probabilistic Quorums
• Quorum size α√N (α > 1) intersects with probability at least 1 − exp(−α²)
– Example: N=16 nodes, quorum size 7, intersects 95%, tolerates 9 failures
– Maekawa: N=16 nodes, quorum size 7, intersects 100%, tolerates 4 failures

– Pro: Small quorums, high fault-tolerance
– Con: Could fail to intersect, N usually large
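The 95% figure in the example can be reproduced from the bound 1 − exp(−α²) with α = quorum size / sqrt(N). The sketch below also computes the exact intersection probability under an added assumption that is not stated on the slide: both quorums are drawn uniformly at random and independently. Function names are hypothetical.

```python
import math


def intersection_lower_bound(n, q):
    """Bound 1 - exp(-alpha^2), where the quorum size is q = alpha * sqrt(n)."""
    alpha = q / math.sqrt(n)
    return 1 - math.exp(-alpha ** 2)


def exact_intersection_probability(n, q):
    """Chance that two uniformly random size-q subsets of n nodes share a node."""
    if 2 * q > n:
        return 1.0  # two quorums this large cannot be disjoint
    return 1 - math.comb(n - q, q) / math.comb(n, q)


n, q = 16, 7
print(f"bound: {intersection_lower_bound(n, q):.3f}")
print(f"exact: {exact_intersection_probability(n, q):.3f}")
print(f"crashes tolerated: {n - q}")
```

With N = 16 and quorum size 7, any 9 nodes may crash and some quorum of 7 live nodes still exists, which is where the "tolerates 9 failures" figure comes from.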
Quorums and CAP
• With quorums we can get
– C & P: partition can make quorum unavailable
– C & A: no-partition ensures availability and atomicity

• Decision faced when failing to get a quorum [Brewer’11]
– Sacrifice availability by waiting for merger
– Sacrifice atomicity by ignoring the quorum

• Can we get CAP for weaker consistency?
What does atomicity really mean?
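[Figure: timeline for processes P1, P2 and P3 with two reads (R) and two writes, W(5) and W(6); each operation spans an interval from invocation to response]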

• Linearization Points
– Read ops appear as if they happened instantaneously at all nodes at some time between invocation and response

– Write ops appear as if they happened instantaneously at all nodes at some time between invocation and response
Definition of Atomicity

[Figure: P1 reads 6 (R:6), P2 reads 5 (R:5), P3 writes W(5) then W(6): atomic]
Definition of Atomicity
[Figure: P1 reads 6 (R:6), P2 reads 6 (R:6), P3 writes W(5) then W(6): atomic]

[Figure: P1 reads 5 (R:5), P2 reads 6 (R:6), P3 writes W(5) then W(6): not atomic]
Atomicity too strong?
[Figure: P1 reads 5 (R:5), P2 reads 6 (R:6), P3 writes W(5) then W(6): not atomic]

• Linearization points too strong?
– Why not just have R:5 appear atomically right after W(5)?
– Lamport: "If P2's operator phones P1 and tells her 'I just read 6'"
Atomicity too strong?
[Figure: P1 reads 5 (R:5), P2 reads 6 (R:6), P3 writes W(5) then W(6): not atomic, but sequentially consistent]

• Sequential consistency
– Weaker than atomicity
– Sequential consistency removes this "real-time" requirement
– Any global ordering OK as long as it respects local ordering
– Does Gilbert's proof fall apart for sequential consistency?

• Causal memory
– Weaker than sequential
– No need to have global view, each process different view
– Local, read/writes immediately return to caller
– CAP theorem does not apply to causal memory

[Figure: P1 performs W(0) then R:1, P2 performs W(1) then R:0: causally consistent]
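The difference between atomicity and sequential consistency can be made concrete with a brute-force checker for very small histories of a single register. This is a sketch, not an algorithm from the talk: the `Op` type and all function names are hypothetical, and the invocation/response times are assumed values chosen so that P2's read of 6 finishes before P1's read of 5 begins.

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    proc: str
    kind: str     # "W" or "R"
    value: int
    inv: float    # invocation time (assumed for illustration)
    resp: float   # response time (assumed for illustration)


def legal_register(order):
    """Every read returns the value of the latest preceding write."""
    current = None
    for op in order:
        if op.kind == "W":
            current = op.value
        elif op.value != current:
            return False
    return True


def respects_program_order(order):
    """Each process's own ops appear in the order it issued them."""
    last_inv = {}
    for op in order:
        if op.proc in last_inv and op.inv < last_inv[op.proc]:
            return False
        last_inv[op.proc] = op.inv
    return True


def respects_real_time(order):
    """An op that finished before another began is never placed after it."""
    return not any(b.resp < a.inv
                   for i, a in enumerate(order) for b in order[i + 1:])


def sequentially_consistent(history):
    return any(legal_register(p) and respects_program_order(p)
               for p in permutations(history))


def atomic(history):
    return any(legal_register(p) and respects_real_time(p)
               for p in permutations(history))


reads_5_after_6 = [Op("P3", "W", 5, 0, 1), Op("P3", "W", 6, 2, 3),
                   Op("P2", "R", 6, 4, 5), Op("P1", "R", 5, 6, 7)]
cross_reads = [Op("P1", "W", 0, 0, 1), Op("P1", "R", 1, 2, 3),
               Op("P2", "W", 1, 0, 1), Op("P2", "R", 0, 2, 3)]

print(atomic(reads_5_after_6), sequentially_consistent(reads_5_after_6))
print(sequentially_consistent(cross_reads))
```

The first history fails the atomic check only because of the real-time constraint; dropping that constraint lets R:5 be ordered right after W(5). The second history (each process reads the other's write) has no single global order at all, so it can only be explained by giving each process its own view, as causal memory does. Trying all permutations is only feasible for a handful of operations.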
Going really weak
• Eventual consistency
– When the network is not partitioned, all nodes eventually have the same value
– I.e. don't be "consistent" at all times, but only after partitions heal!

• Based on powerful technique: gossiping
– Periodically exchange "logs" with one random node
– Exchange must be constant-sized packets
– Set reconciliation, Merkle trees, etc.
– Use (clock, node_id) to break ties of events in log

• Properties of gossiping
– All nodes will have the same value in O(log N) time
– No positive-feedback cycles that congest the network
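A minimal gossip simulation illustrates the idea. It is a simplification under stated assumptions: each node holds a single (clock, node_id, value) entry instead of a log, an exchange is a symmetric push-pull with one random peer, and all names are hypothetical. Conflicts are resolved by keeping the entry with the larger (clock, node_id) pair, as the tie-breaking bullet suggests.

```python
import random
from typing import List, Optional, Tuple

Entry = Tuple[int, int, str]  # (clock, node_id, value)


def merge(a: Entry, b: Entry) -> Entry:
    """Keep the entry with the larger (clock, node_id) stamp."""
    return a if (a[0], a[1]) >= (b[0], b[1]) else b


def gossip_round(state: List[Entry], rng: random.Random) -> None:
    """Every node exchanges its entry with one random other node."""
    n = len(state)
    for i in range(n):
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1  # skip i itself so the peer is always another node
        state[i] = state[j] = merge(state[i], state[j])


def rounds_to_converge(state: List[Entry], rng: random.Random,
                       max_rounds: int = 1000) -> Optional[int]:
    """Number of rounds until all nodes agree, or None if the cap is hit."""
    for rounds in range(1, max_rounds + 1):
        gossip_round(state, rng)
        if len(set(state)) == 1:
            return rounds
    return None


nodes = [(i % 3, i, f"v{i}") for i in range(16)]
print(rounds_to_converge(nodes, random.Random(1)), nodes[0])
```

Each exchange carries one fixed-size entry, so the per-round traffic per node stays constant regardless of how many nodes disagree.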
BASE
• Catch-all for any consistency model C’ that enables C’-A-P
– Eventual consistency
– PRAM consistency
– Causal consistency

• Main ingredients
– Stale data
– Soft state (regenerable state)
– Approximate answers
Summary
• No need to ensure CAP at all times
– Switch between algorithms or satisfy subset at different times

• Weaken consistency model
– Choose weaker consistency:
• Causal memory (relatively strong) works around CAP

– Only be consistent when network isn’t partitioned:
• Eventual consistency (very weak) works around CAP

• Weaken partition-tolerance
– Some environments never partition, e.g. datacenters
– Tolerate unavailability in small quorums
– Some environments have recovery guarantees (partitions heal within X hours); perform conflict resolution
Related Work (ignored in talk)
• PRAM consistency (Pipelined RAM)
– Weaker than causal and non-blocking

• Eventual Linearizability (PODC’10)
– Becomes atomic after quiescent periods

• Gossiping & set reconciliation
– Lots of related work

CAP theorem by Ali Ghodsi


Editor's notes

  1. Failed ops appear as completed at every node, XOR never occurred at any node