SlideShare ist ein Scribd-Unternehmen logo
1 von 21
Downloaden Sie, um offline zu lesen
Transactions with
Replicated Data
Distributed Software
Systems
One copy serializability
z Replicated transactional service
y Each replica manager provides concurrency control
and recovery of its own data items in the same way
as it would for non-replicated data
z Effects of transactions performed by various
clients on replicated data items are the same as
if they had been performed one at a time on a
single data item
z Additional complications: failures, network
partitions
y Failures should be serialized wrt transactions, i.e. any
failure observed by a transaction must appear to
have happened before a transaction started
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F200
Figure 14.18 Replicated transactional service.
B
A
Client + front end
BB BA A
GetBalance(A)
Client + front end
Replica managers
Replica managers
Deposit(B,3);
UT
Replication Schemes
z Read one –Write All
yCannot handle network partitions
z Schemes that can handle network
partitions
yAvailable copies with validation
yQuorum consensus
yVirtual Partition
Read one/Write All
z One copy serializability
y Each write operation sets a write lock at each replica
manager
y Each read sets a read lock at one replica manager
z Two phase commit
y Two-level nested transaction
xCoordinator -> Workers
xIf either coordinator or worker is a replica manager, it has to
communicate with replica managers
z Primary copy replication
y ALL client requests are directed to a single primary
server
xDifferent from scheme discussed earlier
Available copies replication
z Can handle some replica managers are
unavailable because they have failed or
communication failure
z Reads can be performed by any available replica
manager but writes must be performed by all
available replica managers
z Normal case is like read one/write all
y As long as the set of available replica managers does
not change during a transaction
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F201
Figure 14.19 Available copies.
A
X
Client + front end
P
B
Client + front end
Replica managers
Deposit(A,3);
UT
Deposit(B,3);
GetBalance(B)
GetBalance(A)
Replica managers
Y
M
B
N
A
B
Available copies replication
z Failure case
y One copy serializabilty requires that failures and
recovery be serialized wrt transactions
y This is not achieved when different transactions
make conflicting failure observations
y Example shows local concurrency control not enough
y Additional concurrency control procedure (called local
validation) has to be performed to ensure correctness
z Available copies with local validation assumes no
network partition - i.e. functioning replica
managers can communicate with one another
Local validation - example
z Assume X fails just after T has performed
GetBalance and N fails just after U has
performed GetBalance
z Assume X and N fail before T & U have
performed their Deposit operations
y T’s Deposit will be performed at M & P while U’s
Deposit will be performed at Y
y Concurrency control on A at X does not prevent U
from updating A at Y; similarly concurrency control
on B at N does not prevent Y from updating B at M &
P
y Local concurrency control not enough!
Local validation cont’d
z T has read from an item at X, so X’s
failure must be after T.
z T observes the failure of N, so N’s failure
must be before T
yN fails -> T reads A at X; T writes B at M & P
-> T commits -> X fails
ySimilarly, we can argue:
X fails -> U reads B at N; U writes A at Y ->
U commits -> N fails
Local validation cont’d
z Local validation ensures such incompatible
sequences cannot both occur
z Before a transaction commits it checks for
failures (and recoveries) of replica managers of
data items it has accessed
z In example, if T validates before U, T would
check that N is still unavailable and X,M, P are
available. If so, it can commit
z U’s validation would fail because N has already
failed.
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F202
Figure 14.20 Network partition.
Client + front end
B
Withdraw(B, 4)
Client + front end
Replica managers
Deposit(B,3);
UT
Network
partition
B
B B
Handling Network Partitions
z Network partitions separate replica managers
into two or more subgroups, in such a way that
the members of a subgroup can communicate
with one another but members of different
subgroups cannot communicate
z Optimistic approaches
y Available copies with validation
z Pessimistic approaches
y Quorum consensus
Available Copies With
Validation
z Available copies algorithm applied within each
partition
y Maintains availability for Read operations
z When partition is repaired, possibly conflicting
transactions in separate partitions are validated
y The effects of a committed transaction that is now
aborted on validation will have to be undone
xOnly feasible for applications where such compensating
actions can be taken
Available copies with
validation cont’d
z Validation
y Version vectors (Write-Write conflicts)
y Precedence graphs (each partition maintains a log of
data items affected by the Read and Write operations
of transactions
y Log used to construct precedence graph whose
nodes are transactions and whose edges represent
conflicts between Read and Write operations
xNo cycles in graph corresponding to each partition
y If there are cycles in graph, validation fails
Quorum consensus
z A quorum is a subgroup of replica managers
whose size gives it the right to carry out
operations
z Majority voting one instance of a quorum
consensus scheme
y R + W > total number of votes in group
y W > half the total votes
y Ensures that each read quorum intersects a write
quorum, and two write quora will intersect
z Each replica has a version number that is used
to detect if the replica is up to date.
Virtual Partitions scheme
z Combines available copies and quorum
consensus
z Virtual partition = set of replica managers that
have a read and write quorum
z If a virtual partition can be formed, available
copies is used
y Improves performance of Reads
z If a failure occurs, and virtual partition changes
during a transaction, it is aborted
z Have to ensure virtual partitions do not overlap
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F203
Replica managers
Network partition
VX Y Z
TTransaction
Figure 14.21 Two network partitions.
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F204
Figure 14.22 Virtual partition.
X V Y Z
Replica managers
Virtual partition Network partition
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F205
Figure 14.23 Two overlapping virtual partitions.
Virtual partition V1 Virtual partition V2
Y X V Z
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F206
Figure 14.24 Creating a virtual partition.
Phase 1:
• The initiator sends a Join request to each potential member. The argument of Join
is a proposed logical timestamp for the new virtual partition;
• When a replica manager receives a Join request it compares the proposed logical
timestamp with that of its current virtual partition;
– If the proposed logical timestamp is greater it agrees to join and replies Yes;
– If it is less, it refuses to join and replies No;
Phase 2:
• If the initiator has received sufficient Yes replies to have Read and Write quora, it
may complete the creation of the new virtual partition by sending a Confirmation
message to the sites that agreed to join. The creation timestamp and list of actual
members are sent as arguments;
• Replica managers receiving the Confirmation message join the new virtual
partition and record its creation timestamp and list of actual members.

Weitere ähnliche Inhalte

Ähnlich wie Replicated data

Practical Byzantine Fault Tolernace
Practical Byzantine Fault TolernacePractical Byzantine Fault Tolernace
Practical Byzantine Fault TolernaceYongraeJo
 
19. Distributed Databases in DBMS
19. Distributed Databases in DBMS19. Distributed Databases in DBMS
19. Distributed Databases in DBMSkoolkampus
 
Distributed Transactions(flat and nested) and Atomic Commit Protocols
Distributed Transactions(flat and nested) and Atomic Commit ProtocolsDistributed Transactions(flat and nested) and Atomic Commit Protocols
Distributed Transactions(flat and nested) and Atomic Commit ProtocolsSachin Chauhan
 
M03 2 Behavioral Diagrams
M03 2 Behavioral DiagramsM03 2 Behavioral Diagrams
M03 2 Behavioral DiagramsDang Tuan
 
Chapter 14 replication
Chapter 14 replicationChapter 14 replication
Chapter 14 replicationAbDul ThaYyal
 
3 distributed transactions-cocurrency-query
3 distributed transactions-cocurrency-query3 distributed transactions-cocurrency-query
3 distributed transactions-cocurrency-queryM Rezaur Rahman
 
Distributed datababase Transaction and concurrency control
Distributed datababase Transaction and concurrency controlDistributed datababase Transaction and concurrency control
Distributed datababase Transaction and concurrency controlbalamurugan.k Kalibalamurugan
 
DIY: A distributed database cluster, or: MySQL Cluster
DIY: A distributed database cluster, or: MySQL ClusterDIY: A distributed database cluster, or: MySQL Cluster
DIY: A distributed database cluster, or: MySQL ClusterUlf Wendel
 
Transactions and Concurrency Control
Transactions and Concurrency ControlTransactions and Concurrency Control
Transactions and Concurrency ControlDilum Bandara
 
20101116讨论会ppt
20101116讨论会ppt20101116讨论会ppt
20101116讨论会pptanyshpm chen
 
Distributed systems principles and paradigms.pdf
Distributed systems principles and paradigms.pdfDistributed systems principles and paradigms.pdf
Distributed systems principles and paradigms.pdfBizuayehuDesalegn
 
Two phase commit protocol in dbms
Two phase commit protocol in dbmsTwo phase commit protocol in dbms
Two phase commit protocol in dbmsDilouar Hossain
 
Introduction to transaction processing
Introduction to transaction processingIntroduction to transaction processing
Introduction to transaction processingJafar Nesargi
 
Chapter 9 introduction to transaction processing
Chapter 9 introduction to transaction processingChapter 9 introduction to transaction processing
Chapter 9 introduction to transaction processingJafar Nesargi
 

Ähnlich wie Replicated data (20)

Practical Byzantine Fault Tolernace
Practical Byzantine Fault TolernacePractical Byzantine Fault Tolernace
Practical Byzantine Fault Tolernace
 
Chapter 13
Chapter 13Chapter 13
Chapter 13
 
19. Distributed Databases in DBMS
19. Distributed Databases in DBMS19. Distributed Databases in DBMS
19. Distributed Databases in DBMS
 
Distributed Transactions(flat and nested) and Atomic Commit Protocols
Distributed Transactions(flat and nested) and Atomic Commit ProtocolsDistributed Transactions(flat and nested) and Atomic Commit Protocols
Distributed Transactions(flat and nested) and Atomic Commit Protocols
 
M03 2 Behavioral Diagrams
M03 2 Behavioral DiagramsM03 2 Behavioral Diagrams
M03 2 Behavioral Diagrams
 
Chapter 14 replication
Chapter 14 replicationChapter 14 replication
Chapter 14 replication
 
3 distributed transactions-cocurrency-query
3 distributed transactions-cocurrency-query3 distributed transactions-cocurrency-query
3 distributed transactions-cocurrency-query
 
Distributed datababase Transaction and concurrency control
Distributed datababase Transaction and concurrency controlDistributed datababase Transaction and concurrency control
Distributed datababase Transaction and concurrency control
 
DBMS UNIT V.pptx
DBMS UNIT V.pptxDBMS UNIT V.pptx
DBMS UNIT V.pptx
 
09 workflow
09 workflow09 workflow
09 workflow
 
DIY: A distributed database cluster, or: MySQL Cluster
DIY: A distributed database cluster, or: MySQL ClusterDIY: A distributed database cluster, or: MySQL Cluster
DIY: A distributed database cluster, or: MySQL Cluster
 
Transactions and Concurrency Control
Transactions and Concurrency ControlTransactions and Concurrency Control
Transactions and Concurrency Control
 
20101116讨论会
20101116讨论会20101116讨论会
20101116讨论会
 
20101116讨论会ppt
20101116讨论会ppt20101116讨论会ppt
20101116讨论会ppt
 
Distributed systems principles and paradigms.pdf
Distributed systems principles and paradigms.pdfDistributed systems principles and paradigms.pdf
Distributed systems principles and paradigms.pdf
 
Two phase commit protocol in dbms
Two phase commit protocol in dbmsTwo phase commit protocol in dbms
Two phase commit protocol in dbms
 
slides.07.pptx
slides.07.pptxslides.07.pptx
slides.07.pptx
 
F33 book-depend-pres-pt6
F33 book-depend-pres-pt6F33 book-depend-pres-pt6
F33 book-depend-pres-pt6
 
Introduction to transaction processing
Introduction to transaction processingIntroduction to transaction processing
Introduction to transaction processing
 
Chapter 9 introduction to transaction processing
Chapter 9 introduction to transaction processingChapter 9 introduction to transaction processing
Chapter 9 introduction to transaction processing
 

Kürzlich hochgeladen

A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Kürzlich hochgeladen (20)

A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Replicated data

  • 2. One copy serializability z Replicated transactional service y Each replica manager provides concurrency control and recovery of its own data items in the same way as it would for non-replicated data z Effects of transactions performed by various clients on replicated data items are the same as if they had been performed one at a time on a single data item z Additional complications: failures, network partitions y Failures should be serialized wrt transactions, i.e. any failure observed by a transaction must appear to have happened before a transaction started
  • 3. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F200 Figure 14.18 Replicated transactional service. B A Client + front end BB BA A GetBalance(A) Client + front end Replica managers Replica managers Deposit(B,3); UT
  • 4. Replication Schemes z Read one –Write All yCannot handle network partitions z Schemes that can handle network partitions yAvailable copies with validation yQuorum consensus yVirtual Partition
  • 5. Read one/Write All z One copy serializability y Each write operation sets a write lock at each replica manager y Each read sets a read lock at one replica manager z Two phase commit y Two-level nested transaction xCoordinator -> Workers xIf either coordinator or worker is a replica manager, it has to communicate with replica managers z Primary copy replication y ALL client requests are directed to a single primary server xDifferent from scheme discussed earlier
  • 6. Available copies replication z Can handle some replica managers are unavailable because they have failed or communication failure z Reads can be performed by any available replica manager but writes must be performed by all available replica managers z Normal case is like read one/write all y As long as the set of available replica managers does not change during a transaction
  • 7. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F201 Figure 14.19 Available copies. A X Client + front end P B Client + front end Replica managers Deposit(A,3); UT Deposit(B,3); GetBalance(B) GetBalance(A) Replica managers Y M B N A B
  • 8. Available copies replication z Failure case y One copy serializabilty requires that failures and recovery be serialized wrt transactions y This is not achieved when different transactions make conflicting failure observations y Example shows local concurrency control not enough y Additional concurrency control procedure (called local validation) has to be performed to ensure correctness z Available copies with local validation assumes no network partition - i.e. functioning replica managers can communicate with one another
  • 9. Local validation - example z Assume X fails just after T has performed GetBalance and N fails just after U has performed GetBalance z Assume X and N fail before T & U have performed their Deposit operations y T’s Deposit will be performed at M & P while U’s Deposit will be performed at Y y Concurrency control on A at X does not prevent U from updating A at Y; similarly concurrency control on B at N does not prevent Y from updating B at M & P y Local concurrency control not enough!
  • 10. Local validation cont’d z T has read from an item at X, so X’s failure must be after T. z T observes the failure of N, so N’s failure must be before T yN fails -> T reads A at X; T writes B at M & P -> T commits -> X fails ySimilarly, we can argue: X fails -> U reads B at N; U writes A at Y -> U commits -> N fails
  • 11. Local validation cont’d z Local validation ensures such incompatible sequences cannot both occur z Before a transaction commits it checks for failures (and recoveries) of replica managers of data items it has accessed z In example, if T validates before U, T would check that N is still unavailable and X,M, P are available. If so, it can commit z U’s validation would fail because N has already failed.
  • 12. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F202 Figure 14.20 Network partition. Client + front end B Withdraw(B, 4) Client + front end Replica managers Deposit(B,3); UT Network partition B B B
  • 13. Handling Network Partitions z Network partitions separate replica managers into two or more subgroups, in such a way that the members of a subgroup can communicate with one another but members of different subgroups cannot communicate z Optimistic approaches y Available copies with validation z Pessimistic approaches y Quorum consensus
  • 14. Available Copies With Validation z Available copies algorithm applied within each partition y Maintains availability for Read operations z When partition is repaired, possibly conflicting transactions in separate partitions are validated y The effects of a committed transaction that is now aborted on validation will have to be undone xOnly feasible for applications where such compensating actions can be taken
  • 15. Available copies with validation cont’d z Validation y Version vectors (Write-Write conflicts) y Precedence graphs (each partition maintains a log of data items affected by the Read and Write operations of transactions y Log used to construct precedence graph whose nodes are transactions and whose edges represent conflicts between Read and Write operations xNo cycles in graph corresponding to each partition y If there are cycles in graph, validation fails
  • 16. Quorum consensus z A quorum is a subgroup of replica managers whose size gives it the right to carry out operations z Majority voting one instance of a quorum consensus scheme y R + W > total number of votes in group y W > half the total votes y Ensures that each read quorum intersects a write quorum, and two write quora will intersect z Each replica has a version number that is used to detect if the replica is up to date.
  • 17. Virtual Partitions scheme z Combines available copies and quorum consensus z Virtual partition = set of replica managers that have a read and write quorum z If a virtual partition can be formed, available copies is used y Improves performance of Reads z If a failure occurs, and virtual partition changes during a transaction, it is aborted z Have to ensure virtual partitions do not overlap
  • 18. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F203 Replica managers Network partition VX Y Z TTransaction Figure 14.21 Two network partitions.
  • 19. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F204 Figure 14.22 Virtual partition. X V Y Z Replica managers Virtual partition Network partition
  • 20. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F205 Figure 14.23 Two overlapping virtual partitions. Virtual partition V1 Virtual partition V2 Y X V Z
  • 21. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 2 (2nd impression) © Addison-Wesley Publishers 1994 F206 Figure 14.24 Creating a virtual partition. Phase 1: • The initiator sends a Join request to each potential member. The argument of Join is a proposed logical timestamp for the new virtual partition; • When a replica manager receives a Join request it compares the proposed logical timestamp with that of its current virtual partition; – If the proposed logical timestamp is greater it agrees to join and replies Yes; – If it is less, it refuses to join and replies No; Phase 2: • If the initiator has received sufficient Yes replies to have Read and Write quora, it may complete the creation of the new virtual partition by sending a Confirmation message to the sites that agreed to join. The creation timestamp and list of actual members are sent as arguments; • Replica managers receiving the Confirmation message join the new virtual partition and record its creation timestamp and list of actual members.